This changes the default to lazy geometries for both in-memory and on-disk runs.
`--fast` selects materialized geometries when running in memory, and unsharded
stores when running on disk.
This PR generalizes the idea of `node_keys`, adds `way_keys`, and fixes #402.
I'm not too sure if this is generally useful - it's useful for one of my
use cases, and I see someone asking about it in https://github.com/systemed/tilemaker/issues/190
and, elsewhere, in https://github.com/onthegomap/planetiler/issues/99.
If you feel it complicates the maintainer story too much, please reject.
The goal is to reduce memory usage for users doing thematic extracts by
not indexing nodes that are only used by uninteresting ways.
For example, North America has ~1.8B nodes, needing 9.7GB of RAM for its node
store. By contrast, if your interest is only to build a railway map, you
require only ~8M nodes, needing 70MB of RAM. Or, to build a map of
national/provincial parks, 12M nodes and ~120MB of RAM.
Currently, a user can achieve this by pre-filtering their PBF using
osmium-tool. If you know exactly what you want, this is a good
long-term solution. But if you're me, flailing about in the OSM data
model, it's convenient to be able to tweak something in the Lua script
and observe the results without having to re-filter the PBF and update
your tilemaker command to use the new PBF.
Sample use cases:
```lua
-- Building a map without building polygons, ~ excludes ways whose
-- only tags are matched by the filter.
way_keys = {"~building"}
```
```lua
-- Building a railway map
way_keys = {"railway"}
```
```lua
-- Building a map of major roads
way_keys = {"highway=motorway", "highway=trunk", "highway=primary", "highway=secondary"}
```
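As a rough mental model of the three filter forms used above (`key`, `key=value`, and `~key`), here is a minimal pure-Lua sketch. It is not Tilemaker's actual implementation, which lives in C++; the names `parse_filter`, `tag_matches` and `way_is_kept` are hypothetical, and the handling of lists that mix positive and negated filters is an assumption.

```lua
-- Hypothetical model of filter matching; not Tilemaker's real code.
local function parse_filter(s)
  local negated = s:sub(1, 1) == "~"
  if negated then s = s:sub(2) end
  -- Split "key=value"; a bare "key" leaves value as nil (any value matches).
  local key, value = s:match("^([^=]+)=(.*)$")
  return { negated = negated, key = key or s, value = value }
end

local function tag_matches(f, k, v)
  return f.key == k and (f.value == nil or f.value == v)
end

-- filters: list of filter strings; tags: table of key -> value.
function way_is_kept(filters, tags)
  local positive, negative = {}, {}
  for _, s in ipairs(filters) do
    local f = parse_filter(s)
    table.insert(f.negated and negative or positive, f)
  end
  for k, v in pairs(tags) do
    -- Any tag matching a positive filter keeps the way.
    for _, f in ipairs(positive) do
      if tag_matches(f, k, v) then return true end
    end
    -- With negated filters, any tag NOT covered by them keeps the way.
    if #negative > 0 then
      local covered = false
      for _, f in ipairs(negative) do
        if tag_matches(f, k, v) then covered = true end
      end
      if not covered then return true end
    end
  end
  return false
end
```

Under this model, `{"~building"}` drops a way tagged only `building=yes` but keeps one that also carries, say, a `name` tag, which is how I read "ways whose only tags are matched by the filter".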
Nodes used in ways which are used in relations (as identified by
`relation_scan_function`) will always be indexed, regardless of
`node_keys` and `way_keys` settings that might exclude them.
A concrete example, given a Lua script like:
```lua
function way_function()
  if Find("railway") ~= "" then
    Layer("lines", false)
  end
end
```
it takes 13GB of RAM and 100 seconds to process North America.
If you add:
```lua
way_keys = {"railway"}
```
it takes 2GB of RAM and 47 seconds.
Notes:
1. This is based on `lua-interop-3`, as it touches files that branch also
changes. I can rebase against master after `lua-interop-3` is
merged.
2. The names `node_keys` and `way_keys` are perhaps out of date, as they
can now express conditions on the values of tags in addition to their
keys. Leaving them as-is is nice, as it's not a breaking change.
But if breaking changes are OK, maybe these should be
`node_filters` and `way_filters` ?
3. Maybe the value for `node_keys` in the OMT profile should be
expressed in terms of a negation, e.g. `node_keys = {"~created_by"}`?
This would avoid issues like https://github.com/systemed/tilemaker/issues/337
4. This also adds a SIGUSR1 handler during OSM processing, which prints
the ID of the object currently being processed. This is helpful for
tracking down slow geometries.
Currently, Tilemaker uses member functions for interop:
```lua
function node_function(node)
  node:Layer(...)
end
```
This PR changes Tilemaker to use global functions:
```lua
function node_function()
  Layer(...)
end
```
The chief rationale is performance. Every member function call needs to
push an extra pointer onto the stack when crossing the Lua/C++ boundary.
Kaguya serializes this pointer as a Lua userdata. That means every
call into Lua has to malloc some memory, and every call back from Lua
has to dereference through this pointer.
And there are a lot of calls! For OMT on the GB extract, I counted
~1.4B calls from Lua into C++.
A secondary rationale is that a global function is a bit more honest.
A user might believe that this is currently permissible:
```lua
last_node = nil
function node_function(node)
  if last_node ~= nil then
    -- do something with last_node
  end
  -- save the current node for later, for some reason
  last_node = node
end
```
But in reality, the OSM objects we pass into Lua don't behave quite
like Lua objects. They're backed by OsmLuaProcessing, which moves on to
the next object, invalidating whatever the user thinks they hold a
reference to.
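If a profile really does need something from the previous object, a safer pattern is to copy plain Lua values out while that object is current. A minimal sketch using the global-function style: `Find` is the Tilemaker call; `current_tags`, `last_name` and `seen_pairs` are hypothetical names, and the stand-in `Find` exists only so the snippet runs outside Tilemaker.

```lua
-- Stand-in so the sketch runs on its own; inside Tilemaker the real
-- global Find is used instead and current_tags is not needed.
current_tags = {}
Find = Find or function(key) return current_tags[key] or "" end

last_name = nil   -- a plain Lua string, safe to keep between calls
seen_pairs = {}   -- hypothetical: collects "previous -> current" names

function node_function()
  -- Copy the value out now, while this node is the current object.
  local name = Find("name")
  if last_name ~= nil then
    table.insert(seen_pairs, last_name .. " -> " .. name)
  end
  last_name = name
end
```

Strings and numbers copied this way are ordinary Lua values, so they stay valid after processing moves on to the next object.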
This PR noticeably decreases reading time for me, measured
on the OMT profile for GB, on a 16-core computer:
Before:
```
real 1m28.230s
user 19m30.281s
sys 0m29.610s
```
After:
```
real 1m21.728s
user 17m27.150s
sys 0m32.668s
```
The tradeoffs:
- anyone with a custom Lua profile will need to update it, although the
changes are fairly mechanical
- Tilemaker now reserves several functions in the global namespace,
causing the potential for conflicts
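To give a feel for how mechanical the profile update is, here is a rough sketch of the rewrite as a Lua string transformation. `migrate_profile` is a hypothetical helper, not part of Tilemaker. It assumes the object parameters are named `node`, `way` or `relation`, it will miss profiles that use other names, and it can wrongly touch matching text inside strings or comments.

```lua
-- Hypothetical helper: rewrite member-style calls to global-style calls.
function migrate_profile(src)
  -- Drop the object parameter: node_function(node) -> node_function()
  src = src:gsub("(function%s+[%w_]+_function)%s*%(%s*[%w_]+%s*%)", "%1()")
  -- Turn member calls into global calls: way:Layer(...) -> Layer(...)
  -- %f[%w_] is a frontier pattern, so "mynode:" is left untouched.
  for _, obj in ipairs({ "node", "way", "relation" }) do
    src = src:gsub("%f[%w_]" .. obj .. ":", "")
  end
  return src
end

local before = 'function node_function(node)\n  node:Layer("poi", false)\nend'
print(migrate_profile(before))
```

A profile that defines its own global named like one of the reserved functions (for example `Layer`) would still need to be renamed by hand.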
* Rely on packaged rapidjson dependency
* Add rapidjson to cmake
* Add -pthread option to LIB flags variable in Makefile.
* Allow environment setting of CXXFLAGS.
* Avoid git during build
* GitHub CI build using Ubuntu 18.04
Co-authored-by: Felix Delattre <felix@delattre.de>
* Add support for a z_order field and sort output objects by z_order
PostGIS import tools like Osm2pgsql and Imposm can write a z_order field
to the database table in order to allow map styles to render features
ordered by road class and vertical layer. Tilemaker now gets a Lua
callback to set the z_order for an OSM object (default 0) and will sort
the vector tile features by z_order. z_order is also taken into account
when combining features. z_order values are limited to a 1-byte unsigned
integer.
* Document new ZOrder callback function
* Drop unnecessary variable
* Implement z_order sorting for the OpenMapTiles example config
This commit also adapts the minimum zoom levels to the latest version.
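As an illustration of the z_order idea described above, a profile could derive the value from road class and vertical layer and clamp it to what a 1-byte unsigned integer can hold. This is only a sketch: `class_rank`, the offset of 100 and `road_z_order` are hypothetical, and it is not the ranking used by the OpenMapTiles example config.

```lua
-- Hypothetical ranking of road classes; larger means "more important".
local class_rank = { motorway = 9, trunk = 8, primary = 7, secondary = 6 }

-- Combine road class and the OSM layer tag into one value, then clamp
-- it to the 0..255 range of a 1-byte unsigned integer.
function road_z_order(highway, layer)
  -- tonumber returns nil for "" or non-numeric input, so fall back to 0.
  local level = tonumber(layer) or 0
  -- The offset of 100 leaves room for negative layers (tunnels).
  local z = (class_rank[highway] or 0) + 10 * level + 100
  return math.max(0, math.min(255, z))
end
```

The result would then be handed to the ZOrder callback for the current object; the exact calling convention depends on the Tilemaker version in use.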