Mirror of https://github.com/systemed/tilemaker.git, synced 2026-05-06 16:30:00 -04:00 (branch: master, 1 commit)

Commit 52b62dfbd5: some memory and concurrency improvements (#612)
* extract ClipCache to own file
Some housekeeping: extract clip_cache.cpp
* templatize ClipCache, apply to MultiLineStrings
This provides a very small benefit. I think the reason is twofold:
there aren't many multilinestrings (relative to multipolygons), and
clipping them is less expensive.
Still, it did seem to provide a small boost, so I'm leaving it in.
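As a rough illustration of the templatization, here is a minimal sketch of a clip cache parameterized on the geometry type, so one code path can serve both multipolygons and multilinestrings. All names and the exact key shape are my assumptions for illustration; the real implementation lives in clip_cache.cpp:
```cpp
// Hypothetical sketch only; not tilemaker's actual ClipCache API.
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <tuple>

template <typename Geometry>
class ClipCache {
public:
	// Look up the clipped geometry for object `id` in tile (z, x, y);
	// returns nullptr on a cache miss.
	std::shared_ptr<const Geometry> get(uint64_t id, uint16_t z, uint32_t x, uint32_t y) const {
		std::lock_guard<std::mutex> lock(mutex_);
		auto it = cache_.find(Key{id, z, x, y});
		return it == cache_.end() ? nullptr : it->second;
	}

	void add(uint64_t id, uint16_t z, uint32_t x, uint32_t y,
	         std::shared_ptr<const Geometry> clipped) {
		std::lock_guard<std::mutex> lock(mutex_);
		cache_.emplace(Key{id, z, x, y}, std::move(clipped));
	}

private:
	using Key = std::tuple<uint64_t, uint16_t, uint32_t, uint32_t>;
	mutable std::mutex mutex_;
	std::map<Key, std::shared_ptr<const Geometry>> cache_;
};
```
With the template in place, `ClipCache<MultiPolygon>` and `ClipCache<MultiLinestring>` can share one implementation, which is presumably why applying it to multilinestrings was cheap to try.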
* housekeeping: move test, minunit
* --log-tile-timings: verbose timing logs
This isn't super useful to end users, but is useful for developers.
If it's not OK to leave it in, let me know & I'll revert it.
You can then process the log:
```bash
$ for x in {0..14}; do echo -n "z$x "; cat log-write-node-attributes.txt | grep ' took ' | sort -nk3 | grep z$x/ | awk 'BEGIN { min = 999999; max = 0; }; { n += 1; t += $3; if ($3 > max) { max = $3; max_id = $1; } } END { print n, t, t/n, max " (" max_id ")" }'; done
z0 1 7.04769 7.04769 7.047685 (z0/0/0)
z1 1 9.76067 9.76067 9.760671 (z1/0/0)
z2 1 9.98514 9.98514 9.985141 (z2/1/1)
z3 1 9.98514 9.98514 9.985141 (z3/2/2)
z4 2 14.4699 7.23493 8.610035 (z4/5/5)
z5 2 20.828 10.414 13.956526 (z5/10/11)
z6 5 6464.05 1292.81 3206.252711 (z6/20/23)
z7 13 11306.4 869.727 3275.475707 (z7/40/46)
z8 35 15787.1 451.061 2857.506681 (z8/81/92)
z9 86 20723.8 240.974 1605.788985 (z9/162/186)
z10 277 25456.8 91.9018 778.311785 (z10/331/369)
z11 960 28851.3 30.0534 627.351078 (z11/657/735)
z12 3477 24031.6 6.91158 451.122972 (z12/1315/1471)
z13 13005 13763.7 1.05834 156.074701 (z13/2631/2943)
z14 50512 24214.7 0.479385 106.358450 (z14/5297/5916)
```
This shows, for each zoom: the number of tiles, total time, average time
per tile, and the worst-case time (with the tile that caused it).
In general, lower zooms are slower than higher zooms. This seems
intuitively reasonable, as a lower-zoom tile often contains all of
the objects in the corresponding higher-zoom tiles.
I would have guessed that a lower zoom would cost 4x the next higher
zoom on a per-tile basis, since each tile at zoom `z` covers the same
area as four tiles at zoom `z + 1`. That's sort of the case for `z12->z13`,
`z11->z12`, `z10->z11`, and `z9->z10`. But not so for the other zooms,
where it's more like a 2x cost.
Looking at `z5->z6`, we see a big jump from 10ms/tile to 1,292ms/tile.
This is probably because `water` has a minzoom of 6.
This all makes me think that the next big gain will be from re-using
simplifications.
This is sort of the mirror image of the clip cache:
- the clip cache avoids redundant clipping, and needs to be computed
from lower zooms to higher zooms
- a simplification cache could make simplifying cheaper, but needs to
be computed from higher zooms to lower zooms
The simplification cache also has two other wrinkles:
1. Is it even valid? e.g. is `simplify(object, 4)` the same as
`simplify(simplify(object, 2), 2)`? Maybe it doesn't have to be the
same, because users are already accepting that we're losing accuracy
when we simplify. (See the sketch after this list.)
2. Rendering an object at `z - 1` needs to (potentially) stitch together
that object from 4 tiles at `z`. If those have each been simplified,
we may introduce odd seams where the terminal points don't line up.
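Wrinkle 1 is easy to probe empirically. Here is a minimal sketch using Boost.Geometry (which tilemaker builds on); the linestring and tolerances are invented for illustration, and the `0.5`/`1.0` pairing just mirrors the `simplify(object, 2)` vs `simplify(object, 4)` question above:
```cpp
#include <iostream>
#include <boost/geometry.hpp>
#include <boost/geometry/geometries/linestring.hpp>
#include <boost/geometry/geometries/point_xy.hpp>

namespace bg = boost::geometry;
using Point = bg::model::d2::point_xy<double>;
using Line = bg::model::linestring<Point>;

int main() {
	Line original;
	bg::read_wkt("LINESTRING(0 0,1 0.5,2 0,3 1.5,4 0,5 0.5,6 0)", original);

	Line onePass, intermediate, twoPasses;
	bg::simplify(original, onePass, 1.0);        // one pass, larger tolerance
	bg::simplify(original, intermediate, 0.5);   // first of two smaller passes...
	bg::simplify(intermediate, twoPasses, 0.5);  // ...then simplify the result again

	std::cout << "one pass:   " << bg::wkt(onePass) << "\n";
	std::cout << "two passes: " << bg::wkt(twoPasses) << "\n";
}
```
In general the two outputs can differ, since Douglas-Peucker simplification (Boost's default) doesn't compose; so a simplification cache really would be trading some accuracy, as noted above.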
* more, smaller caches; run destructors outside lock
* use explicit types
* don't populate unnecessary vectors
* reserve vectors appropriately
* don't eagerly call way:IsClosed()
This saves a very small amount of time but, more importantly, it tees
up lazily evaluating a way's nodes.
* remove locks from geometry stores
Rather than locking on each store call, threads lease a range of the
ID space for points/lines/multilines/polygons. When the thread ends,
it returns the lease (see the sketch below).
This has some implications:
- the IDs are no longer guaranteed to be contiguous
- shapefiles are a bit weird, as they're loaded on the main
thread -- so their lease won't be returned until program
end. This is fine, just pointing it out.
This didn't actually seem to affect runtime that much on my 16-core
machine, but I think it'd help on machines with more cores.
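A sketch of the leasing idea, with hypothetical names (the real stores and their lease bookkeeping differ): a shared atomic counter hands out blocks of IDs, and each thread's hot path then allocates with plain increments, no locks:
```cpp
// Illustrative sketch only; not tilemaker's actual geometry store API.
#include <atomic>
#include <cstdint>

class IdLeaser {
public:
	static constexpr uint64_t BlockSize = 10000;

	// The only cross-thread operation: grab the next contiguous block.
	uint64_t leaseBlock() { return next_.fetch_add(BlockSize); }

private:
	std::atomic<uint64_t> next_{0};
};

class ThreadLease {
public:
	explicit ThreadLease(IdLeaser& leaser) : leaser_(leaser) {}

	// Hot path: no locks, no atomics, just local increments.
	uint64_t nextId() {
		if (remaining_ == 0) {
			current_ = leaser_.leaseBlock();
			remaining_ = IdLeaser::BlockSize;
		}
		remaining_--;
		return current_++;
	}

private:
	IdLeaser& leaser_;
	uint64_t current_ = 0;
	uint64_t remaining_ = 0;
};
```
IDs left unused in a thread's final block are handed back when the lease is returned, which is one way to see why the IDs are no longer guaranteed to be contiguous.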
* increase attributestore shards
When scaling to 32+ cores, contention here shows up as an issue. Try a
really blunt-hammer fix: increase the number of shards.
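For context, the sharding pattern looks roughly like this (a hypothetical sketch, not the real AttributeStore): keys hash to one of N independently locked maps, so raising N directly lowers the chance that two threads collide:
```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_map>

template <size_t Shards>   // raising Shards is the "blunt hammer"
class ShardedStore {
public:
	// Interns `key`, returning a stable ID. Two threads only contend
	// when their keys hash to the same shard.
	uint32_t add(const std::string& key) {
		size_t shard = std::hash<std::string>{}(key) % Shards;
		std::lock_guard<std::mutex> lock(mutexes_[shard]);
		auto& map = maps_[shard];
		auto it = map.find(key);
		if (it != map.end()) return it->second;
		// Interleave IDs by shard so they stay unique without coordination.
		uint32_t id = static_cast<uint32_t>(shard + Shards * map.size());
		map.emplace(key, id);
		return id;
	}

private:
	std::array<std::mutex, Shards> mutexes_;
	std::array<std::unordered_map<std::string, uint32_t>, Shards> maps_;
};
```
Usage is just `ShardedStore<64> store;` with a bigger number; the commit effectively turns that knob up.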
* read_pbf: less lock contention on status
`std::cout` has some internal locks -- instead, let's synchronize
explicitly outside of it so we control the contention.
If a worker fails to get the lock, just skip that worker's update.
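The pattern is just `try_lock` instead of `lock`; a sketch with invented names (the same idea reappears in the tilemaker status update below):
```cpp
#include <cstdint>
#include <iostream>
#include <mutex>

std::mutex statusMutex;  // guards the status line explicitly, outside std::cout

void reportProgress(uint64_t processed, uint64_t total) {
	// If another worker is printing, skip this update entirely; a later
	// block will report soon enough, and we never block the hot path.
	if (!statusMutex.try_lock()) return;
	std::cout << "\rBlocks " << processed << "/" << total << std::flush;
	statusMutex.unlock();
}
```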
* tile_worker: do syscall 1x/thread, not 1x/tile
* tilemaker: avoid lock contention on status update
If a worker can't get the lock, just skip their update.
* Revert "don't eagerly call way:IsClosed()"
This reverts commit