Some Lua 5.4 builds (e.g., vcpkg) removed LUA_GCSETPAUSE and
LUA_GCSETSTEPMUL constants, replacing them with LUA_GCPPAUSE and
LUA_GCPSTEPMUL. Add conditional defines to handle both cases.
* `boost::range::sort` conflicts with `boost::sort`. Implementations are moved into a dedicated .cpp file.
* `boost::system` is now a header-only library, so the stub is removed.
This fixes two issues:
- use an unsigned type, so we can use the whole 9 bits and have 512
keys, not 256
- fix the bounds check in AttributeKeyStore to reflect the lower
threshold that was introduced in #618
Hat tip @oobayly for reporting this.
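To illustrate both points, here is a minimal sketch, not the actual AttributeKeyStore code. The class `KeyStoreSketch`, the struct `PackedKey` and the constants are hypothetical names chosen for this example. A signed 9-bit field covers -256..255, so only 256 non-negative indexes fit; an unsigned 9-bit field covers 0..511, and the bounds check has to match that limit.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical constants: key indexes must fit in 9 unsigned bits.
constexpr unsigned kKeyBits = 9;
constexpr unsigned kMaxKeys = 1u << kKeyBits; // 512

// Hypothetical stand-in for a key store; not tilemaker's real class.
class KeyStoreSketch {
public:
    // Returns the index for `key`, assigning the next free index if it is new.
    // Throws once all 512 indexes are in use.
    uint16_t key2index(const std::string& key) {
        auto it = index_.find(key);
        if (it != index_.end()) return it->second;
        if (keys_.size() >= kMaxKeys)
            throw std::out_of_range("more than 512 distinct attribute keys");
        uint16_t next = static_cast<uint16_t>(keys_.size());
        keys_.push_back(key);
        index_.emplace(key, next);
        return next;
    }

private:
    std::vector<std::string> keys_;
    std::map<std::string, uint16_t> index_;
};

// An unsigned 9-bit field can hold every index from 0 to 511.
struct PackedKey {
    uint16_t keyIndex : 9;
};

// Stores an index in the 9-bit field and reads it back.
uint16_t roundtrip(uint16_t index) {
    PackedKey p{};
    p.keyIndex = index;
    return p.keyIndex;
}
```

The point of the sketch is that the limit in the bounds check and the width of the packed field are derived from the same constant, so they cannot drift apart.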
Since we only allocate 4 bits and `char` is signed, the usable space
was -8..7. Larger values (like, say, 12) overflow, and get interpreted
as a negative value, which means they don't act as a filter, since all
zoom values are natural numbers.
The tests didn't actually test that a zoom value could be roundtripped.
I updated them, and verified they failed before the code change, and
passed after the code change.
I've also allocated an extra bit so that we support minzooms up to z31,
vs just up to z15, since I think (?) some people generate up to z16.
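The overflow can be reproduced in isolation. This is a sketch with hypothetical struct and field names, not the real layout. It uses `signed char` explicitly, because whether a plain `char` bit-field is signed is implementation-defined. Storing an out-of-range value in a signed bit-field is also implementation-defined before C++20; on mainstream two's-complement compilers, 12 reads back as -4.

```cpp
// Hypothetical layouts; the names are illustrative only.
struct SignedZoom {
    signed char minZoom : 4;   // representable range: -8..7
};

struct UnsignedZoom {
    unsigned char minZoom : 5; // representable range: 0..31
};

// Stores a zoom in the 4-bit signed field and reads it back.
int roundtripSigned4(int zoom) {
    SignedZoom s{};
    s.minZoom = static_cast<signed char>(zoom);
    return s.minZoom;
}

// Stores a zoom in the 5-bit unsigned field and reads it back.
int roundtripUnsigned5(int zoom) {
    UnsignedZoom u{};
    u.minZoom = static_cast<unsigned char>(zoom);
    return u.minZoom;
}
```

A roundtrip test of this shape (store a zoom, read it back, compare) is the kind of check that catches the bug: `roundtripSigned4(12)` comes back negative, while `roundtripUnsigned5(12)` returns 12.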
CompactNodeStore doesn't know how to compute if it contains a node,
which is a prerequisite for sharding.
The two settings don't make much sense together: sharding will create N
CompactNodeStores, which each will take as much memory as a single one,
since each will likely have a large node ID.
This differs from BinarySearchNodeStore and SortedNodeStore, where each of
the N store instances will take roughly 1/N memory.
Instead:
- fail faster and more clearly by throwing if CompactNodeStore.contains is
called
- don't enable sharding if --compact is passed
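A rough sketch of the two changes, using a simplified and hypothetical interface (`NodeStoreSketch`, `CompactStoreSketch` and `shardCount` are made-up names, not tilemaker's actual classes or functions):

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical, simplified node store interface.
struct NodeStoreSketch {
    virtual ~NodeStoreSketch() = default;
    virtual bool contains(uint64_t id) const = 0;
};

// A compact store cannot answer membership queries, so it fails loudly
// instead of returning a misleading result.
struct CompactStoreSketch : NodeStoreSketch {
    bool contains(uint64_t) const override {
        throw std::runtime_error("CompactNodeStore does not support contains");
    }
};

// Assumed policy: when --compact is passed, always use a single store,
// regardless of how many shards were requested.
unsigned shardCount(bool compact, unsigned requestedShards) {
    return compact ? 1u : requestedShards;
}
```

Throwing from `contains` turns a silent misconfiguration into an immediate, descriptive failure, and the shard-count guard means the throwing path should not be reached in normal use.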
Previously, calling Centroid(...) on an invalid geometry (such as
https://www.openstreetmap.org/relation/9769005, which I think gets
simplified to having 0 rings) would throw, killing the lua process.
Instead, return nil.
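The pattern is to catch the failure at the boundary and hand back an empty value, which the Lua binding can then surface as nil. The sketch below is an assumption-laden illustration: `centroidOrThrow` is a stand-in that averages vertices rather than computing a true polygon centroid, and `safeCentroid` is a hypothetical wrapper name.

```cpp
#include <optional>
#include <stdexcept>
#include <utility>
#include <vector>

using Point = std::pair<double, double>;

// Stand-in for a centroid routine that throws on degenerate input.
// It averages the vertices, which is NOT a true polygon centroid.
Point centroidOrThrow(const std::vector<Point>& ring) {
    if (ring.empty()) throw std::invalid_argument("geometry has no points");
    double x = 0, y = 0;
    for (const auto& p : ring) {
        x += p.first;
        y += p.second;
    }
    return {x / ring.size(), y / ring.size()};
}

// Hypothetical wrapper: an empty optional plays the role of Lua's nil.
std::optional<Point> safeCentroid(const std::vector<Point>& ring) {
    try {
        return centroidOrThrow(ring);
    } catch (const std::exception&) {
        return std::nullopt;
    }
}
```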
This changes the default to lazy geometries for both in-memory and on-disk.
--fast selects materialized geometries when running in memory, and unsharded
stores when running on disk.
Eep, two fixes here as well:
- I had rejigged how the skipping of LayerAsCentroid's algorithm
argument worked; this rejigging ultimately broke it entirely, as `i`
would never get incremented.
- If `way_keys` is provided, we are no longer guaranteed that we'll have
stored the `label` node of the relation
When I replaced #604 with #626, I botched extracting this part of the
code. I had the trait, which taught kaguya how to serialize
`PossiblyKnownTagValue`, but I missed updating the parameter type
of `Attribute` to actually use it, so it was a no-op.
This PR restores the behaviour of avoiding string copies, but now that
we have protozero's data_view class, we can use that rather than
our own weirdo struct.
This PR generalizes the idea of `node_keys`, adds `way_keys`, and fixes #402.
I'm not too sure if this is generally useful - it's useful for one of my
use cases, and I see someone asking about it in https://github.com/systemed/tilemaker/issues/190
and, elsewhere, in https://github.com/onthegomap/planetiler/issues/99
If you feel it complicates the maintainer story too much, please reject.
The goal is to reduce memory usage for users doing thematic extracts by
not indexing nodes that are only used by uninteresting ways.
For example, North America has ~1.8B nodes, needing 9.7GB of RAM for its node
store. By contrast, if your interest is only to build a railway map, you
require only ~8M nodes, needing 70MB of RAM. Or, to build a map of
national/provincial parks, 12M nodes and ~120MB of RAM.
Currently, a user can achieve this by pre-filtering their PBF using
osmium-tool. If you know exactly what you want, this is a good
long-term solution. But if you're me, flailing about in the OSM data
model, it's convenient to be able to tweak something in the Lua script
and observe the results without having to re-filter the PBF and update
your tilemaker command to use the new PBF.
Sample use cases:
```lua
-- Building a map without building polygons. A leading ~ excludes ways
-- whose only tags are matched by the filter.
way_keys = {"~building"}
```
```lua
-- Building a railway map
way_keys = {"railway"}
```
```lua
-- Building a map of major roads
way_keys = {"highway=motorway", "highway=trunk", "highway=primary", "highway=secondary"}
```
Nodes used in ways which are used in relations (as identified by
`relation_scan_function`) will always be indexed, regardless of
`node_keys` and `way_keys` settings that might exclude them.
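The three filter forms used in the samples (`key`, `key=value` and `~key`) could be evaluated along these lines. This is a sketch under assumed semantics, not the PR's implementation: the function names are hypothetical, each filter is evaluated independently, and the relation exception described above is not modelled.

```cpp
#include <map>
#include <string>
#include <vector>

using Tags = std::map<std::string, std::string>;

// True if one tag matches a filter body of the form "key" or "key=value".
bool tagMatches(const std::string& body, const std::string& key,
                const std::string& value) {
    auto eq = body.find('=');
    if (eq == std::string::npos) return key == body;
    return key == body.substr(0, eq) && value == body.substr(eq + 1);
}

// Assumed semantics, for illustration only:
//  - "key" or "key=value": keep the way if any of its tags matches.
//  - "~key": keep the way if it has at least one tag NOT matched by the
//    filter, i.e. drop ways whose only tags are matched.
bool wayPassesFilters(const std::vector<std::string>& filters, const Tags& tags) {
    for (const auto& f : filters) {
        bool negated = !f.empty() && f[0] == '~';
        std::string body = negated ? f.substr(1) : f;
        for (const auto& kv : tags) {
            bool matched = tagMatches(body, kv.first, kv.second);
            if (matched != negated) return true;
        }
    }
    return false;
}
```

Under these assumptions, a way tagged only `building=yes` is dropped by `{"~building"}`, while a way tagged `building=yes` and `name=...` is kept.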
A concrete example, given a Lua script like:
```lua
function way_function()
  if Find("railway") ~= "" then
    Layer("lines", false)
  end
end
```
it takes 13GB of RAM and 100 seconds to process North America.
If you add:
```lua
way_keys = {"railway"}
```
It takes 2GB of RAM and 47 seconds.
Notes:
1. This is based on `lua-interop-3`, as it interacts with files that are
changed by that. I can rebase against master after lua-interop-3 is
merged.
2. The names `node_keys` and `way_keys` are perhaps out of date, as they
can now express conditions on the values of tags in addition to their
keys. Leaving them as-is is nice, as it's not a breaking change.
But if breaking changes are OK, maybe these should be
`node_filters` and `way_filters` ?
3. Maybe the value for `node_keys` in the OMT profile should be
expressed in terms of a negation, e.g. `node_keys = {"~created_by"}`?
This would avoid issues like https://github.com/systemed/tilemaker/issues/337
4. This also adds a SIGUSR1 handler during OSM processing, which prints
the ID of the object currently being processed. This is helpful for
tracking down slow geometries.
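For note 4, a handler of this kind might look like the following POSIX-only sketch. The names `g_currentId`, `formatDecimal`, `onSigusr1` and `installHandler` are hypothetical, and the real handler may work differently. The sketch avoids stdio and heap allocation inside the handler and uses `write(2)`, which is async-signal-safe; it also assumes `std::atomic<uint64_t>` is lock-free on the target platform.

```cpp
#include <atomic>
#include <csignal>
#include <cstddef>
#include <cstdint>
#include <unistd.h>

// ID of the object currently being processed. The (hypothetical)
// processing loop would store into this before handling each object.
std::atomic<uint64_t> g_currentId{0};

// Writes `value` in decimal into `buf` (at least 20 bytes) and returns the
// number of characters written. No terminating NUL is added.
size_t formatDecimal(uint64_t value, char* buf) {
    char tmp[20];
    size_t n = 0;
    do {
        tmp[n++] = static_cast<char>('0' + value % 10);
        value /= 10;
    } while (value != 0);
    for (size_t i = 0; i < n; i++) buf[i] = tmp[n - 1 - i];
    return n;
}

// Signal handler: prints the current object ID followed by a newline to stderr.
extern "C" void onSigusr1(int) {
    char buf[32];
    size_t n = formatDecimal(g_currentId.load(), buf);
    buf[n++] = '\n';
    ssize_t ignored = write(STDERR_FILENO, buf, n);
    (void)ignored;
}

// Registers the handler for SIGUSR1.
void installHandler() {
    std::signal(SIGUSR1, onSigusr1);
}
```

With something like this in place, running `kill -USR1 <pid>` against a long-running process prints the ID of whatever object it is currently working on.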