Colin Dellow d62c480deb be able to render the planet with 32gb of RAM (#618)
* move OutputObjects to mmap store

For the planet, we need 1.3B output objects, 12 bytes per, so ~15GB
of RAM.

* treat objects at low zoom specially

For GB, ~0.3% of objects are visible at low zooms.

I noticed in previous planet runs that fetching the objects for tiles in
the low zooms was quite slow - I think it's because we're scanning 1.3B
objects each time, only to discard most of them. Now we'll only be
scanning ~4M objects per tile, which is still an absurd number, but
should mitigate most of the speed issue without having to properly
index things.

This will also help us maintain performance for memory-constrained
users, as we won't be scanning all 15GB of data on disk, just a smaller
~45MB chunk.

* make more explicit that this is unexpected

* extend --materialize-geometries to nodes

For Points stored via Layer(...) calls, store the node ID in the
OSM store, unless `--materialize-geometries` is present.

This saves ~200MB of RAM for North America, so perhaps 1 GB for the
planet, if NA has similar characteristics to the rest of the planet.

Also fix the OSM_ID(...) macro - it was lopping off many more bits
than needed, due to some previous experiments. Now that we want to track
nodes, we need at least 34 bits.

This may pose a problem down the road when we try to address thrashing.
The mechanism I hoped to use was to divide the OSM stores into multiple
stores covering different low zoom tiles. Ideally, we'd be able to
recall which store to look in -- but we only have 36 bits, we need 34
to store the Node ID, so that leaves us with 1.5 bits => can divide into
3 stores.

Since the node store for the planet is 44GB, dividing into 3 stores
doesn't give us very much headroom on a 32 GB box. Ah well, we can
sort this out later.

* rejig AttributePair layout

On g++, this reduces the size from 48 bytes to 34 bytes.

There aren't _that_ many attribute pairs, even at the planet scale, but
this plus a better encoding of string attributes might save us ~2GB at
the planet level, which is meaningful for a 32GB box.

* fix initialization order warning

* add PooledString

Not used by anything yet. Given Tilemaker's limited needs, we can get
away with a stripped-down string class that is less flexible than
std::string, in exchange for memory savings.

The key benefits: 16 bytes, not 32 bytes (g++) or 24 bytes (clang).

When it does allocate (for strings longer than 15 bytes), it allocates
from a pool so there's less per-allocation overhead.

* add tests for attribute store

...I'm going to replace the string implementation, so let's have some
backstop to make sure I don't break things

* rejig isHot

Break dependency on AttributePair, just work on std::string

* teach PooledString to work with std::string

...this will be useful for doing map lookups when testing if an
AttributePair has already been created with the given value.

* use PooledString in AttributePair

AttributePair has now been trimmed from 48 bytes to 18 bytes. There are
40M AttributeSets for the planet. That suggests there's probably ~30M AttributePairs,
so hopefully this is a savings of ~900MB at the planet level.

Runtime doesn't seem affected.

There's a further opportunity for savings if we can make more strings
qualify for the short string optimization. Only about 40% of strings
fit in the 15 byte short string optimization.

Of the remaining 60%, many are Latin-alphabet title cased strings like
`Wellington Avenue` -- this could be encoded using 5 bits per letter,
saving us an allocation.

Even in the most optimistic case where:

- there are 30M AttributePairs
- of these, 90% are strings (= 27M)
- of these, 60% don't fit in SSO (= 16M)
- of these, we can make 100% fit in SSO

...we only save about 256MB at the planet level, but at some significant
complexity cost. So probably not worth pursuing at the moment.

* log timings

When doing the planet, especially on a box with limited memory, there
are long periods with no output. Show some output so the user doesn't
think things are hung.

This also might be useful in detecting perf regressions more granularly.

* AppendVector: an append-only chunked vector

When using --store, deque is nice because growing doesn't require
invalidating the old storage and copying it to a new location.

However, it's also bad, because deque allocates in 512-byte chunks,
which causes each 4KB OS page to have data from different z6 tiles.

Instead, use our own container that tries to get the best of both worlds.

Writing a random access iterator is new for me, so I don't trust this
code that much. The saving grace is that the container is very limited,
so errors in the iterator implementation may not get exercised in
practice.

* fix progress when --store present

* mutex on RelationScan progress output

* make NodeStore/WayStore shardable

This adds three methods to the stores:

- `shard()` returns which shard you are
- `shards()` returns how many shards total
- `contains(shard, id)` returns whether or not shard N has an item with
  id X

SortedNodeStore/SortedWayStore are not implemented yet, that'll come in
a future commit.

This will allow us to create a `ShardedNodeStore` and `ShardedWayStore`
that contain N stores. We will try to ensure that each store has data
that is geographically close to each other.

Then, when reading, we'll do multiple passes of the PBF to populate each store.
This should let us reduce the working set used to populate the stores,
at the cost of additional linear scans of the PBF. Linear scans of disk
are much less painful than random scans, so that should be a good trade.

* add minimal SortedNodeStore test

I'm going to rejig the innards of this class, so let's have some tests.

* stop using internal linkage for atomics

In order to shard the stores, we need to have multiple instances
of the class.

Two things block this currently: atomics at file-level, and
thread-locals.

Moving the atomics to the class is easy.

Making the thread-locals per-class will require an approach similar
to that adopted in
https://github.com/systemed/tilemaker/blob/52b62dfbd5b6f8e4feb6cad4e3de86ba27874b3a/include/leased_store.h#L48,
where we have a container that tracks the per-class data.

* SortedNodeStore: abstract TLS behind storage()

Still only supports 1 class, but this is a step along the path.

* SortedWayStore: abstract TLS behind storage()

* SortedNodeStore: support multiple instances

* SortedWayStorage: support multiple instances

* actually fix the low zoom object collection

D'oh, this "worked" due to two bugs cancelling each other:

(a) the code to find things in the low zoom list never found anything,
    because it assumed a base z6 tile of 0/0

(b) we weren't returning early, so the normal code still ran

Rejigged to actually do what I was intending

* AppendVector tweaks

* more low zoom fixes

* implement SortedNodeStore::contains

* implement SortedWayStore::contains

* use TileCoordinatesSet

* faster covered tile enumeration

Do a single pass, rather than one pass per zoom.

* add ShardedNodeStore

This distributes nodes into one of 8 shards, trying to roughly group
parts of the globe by complexity.

This should help with locality when writing tiles.

A future commit will add a ShardedWayStore and teach read_pbf to read in
a locality-aware manner, which should help when reading ways.

* add ShardedWayStore

Add `--shard-stores` flag.

It's not clear yet this'll be a win, will need to benchmark.

The cost of reading the PBF blocks repeatedly is a bit higher than I was
expecting. It might be worth seeing if we can index the blocks to skip
fruitless reads.

* fewer, more balanced shards

* skip ReadPhase::Ways passes if node store is empty

* support multiple passes for ReadPhase::Relations

* fix check for first way

* adjust shards

With this distribution, no node shard is more than ~8.5GB.

* Relations: fix effectiveShards > 1 check

Oops, a bug that modestly affected performance in the non
`--shard-stores` case.

* extend --materialize-geometries to LayerAsCentroid

It turns out that about 20% of LayerAsCentroid calls are for nodes,
which this branch could already do.

The remaining calls are predominantly ways, e.g. housenumbers.

We always materialize relation centroids, as they're expensive to
compute.

In GB, this saves about 6.4M points, ~102MB. Scaled to the planet, it's
perhaps a 4.5GB savings, which should let us use a more aggressive shard
strategy.

It seems to add 3-4 seconds to the time to process GB.

* add `DequeMap`, change AttributeStore to use it

This implements the idea in https://github.com/systemed/tilemaker/issues/622#issuecomment-1866813888

Rather than storing a `deque<T>` and a `flat_map<T*, uint32_t>`,
store a `deque<T>` and `vector<uint32_t>`, to save 8 bytes per
AttributePair and AttributeSet.

* capture s(this)

Seems to save ~1.5 seconds on GB

* fix warning

* fix warning, really

* fewer shards

Shard 1 (North America) is ~4.8GB of nodes, shard 4 (some of Europe) is
3.7GB. Even ignoring the memory savings in the recent commits, these
could be merged.

* extract option parsing to own file

We'd like to have different defaults based on whether `--store` is
present. Now that option parsing will have some more complex logic,
let's pull it into its own class so it can be more easily tested.

* use sensible defaults based on presence of --store

* improve test coverage

* fixes

* update number of shards to 6

This has no performance impact as we never put anything in the 7th
shard, and so we skip the 7th pass in the ReadPhase::Ways and
ReadPhase::Relations phases.

The benefit is only to avoid emitting a noisy log about how the 7th store
has 0 entries in it.

Timings with 6 shards on Vultr's 16-core machine here: https://gist.github.com/cldellow/77991eb4074f6a0f31766cf901659efb

The new peak memory is ~12.2GB.

I am a little perplexed -- the runtime on a 16-core server was
previously:

```
$ time tilemaker --store /tmp/store --input planet-latest.osm.pbf --output tiles.mbtiles --shard-stores
real	195m7.819s
user	2473m52.322s
sys	73m13.116s
```

But with the most recent commits on this branch, it was:

```
real	118m50.098s
user	1531m13.026s
sys	34m7.252s
```

This is incredibly suspicious. I also tried re-running commit
bbf0957c1e, and got:

```
real	123m15.534s
user	1546m25.196s
sys	38m17.093s
```

...so I can't explain why the earlier runs took 195 min.

Ideas:

- the planet changed between runs, and a horribly broken geometry was
  fixed

- Vultr gives quite different machines for the same class of server

- perhaps most likely: I failed to click "CPU-optimized" when picking
  the earlier server, and got a slow machine the first time, and a fast
  machine the second time. I'm pretty sure I paid the same $, so I'm
  not sure I believe this.

I don't think I really believe that a 33% reduction in runtime is
explained by any of those, though. Anyway, just another thing to
be befuddled by.

* --store uses lazy geometries; permit overriding

I did some experiments on a Hetzner 48-core box with 192GB of RAM:

--store, materialize geometries:
real 65m34.327s
user 2297m50.204s
sys 65m0.901s

The process often failed to use 100% of CPU -- if you naively divide
(user+sys)/real you get ~36, whereas the ideal would be ~48.

Looking at stack traces, it seemed to coincide with calls to Boost's
rbtree_best_fit allocator.

Maybe:

- we're doing disk I/O, and it's just slower than recomputing the geometries
- we're using the Boost mmap library suboptimally -- maybe there's
  some other allocator we could be using. I think we use the mmap
  allocator like a simple bump allocator, so I don't know why we'd need
  a red-black tree

--store, lazy geometries:
real 55m33.979s
user 2386m27.294s
sys 23m58.973s

Faster, but still some overhead ((user+sys)/real => ~43)

no --store, materialize geometries: OOM

no --store, lazy geometries (used 175GB):
real 51m27.779s
user 2306m25.309s
sys 16m34.289s

This was almost 100% CPU ((user+sys)/real => ~45)

From this, I infer:

- `--store` should always default to lazy geometries in order to
  minimize the I/O burden

- `--materialize-geometries` is a good default for non-store usage,
  but it's still useful to be able to override and use lazy geometries,
  if it then means you can fit the data entirely in memory
2023-12-28 15:23:35 +00:00


/*! \file */
#ifndef _TILE_DATA_H
#define _TILE_DATA_H
#include <map>
#include <set>
#include <vector>
#include <memory>
#include <boost/sort/sort.hpp>
#include "output_object.h"
#include "append_vector.h"
#include "clip_cache.h"
#include "mmap_allocator.h"
#define TILE_DATA_ID_SIZE 34
typedef std::vector<class TileDataSource *> SourceList;
class TileBbox;
// We cluster output objects by z6 tile
#define CLUSTER_ZOOM 6
#define CLUSTER_ZOOM_WIDTH (1 << CLUSTER_ZOOM)
#define CLUSTER_ZOOM_AREA (CLUSTER_ZOOM_WIDTH * CLUSTER_ZOOM_WIDTH)
struct OutputObjectXY {
OutputObject oo;
Z6Offset x;
Z6Offset y;
};
class TileCoordinatesSet {
public:
TileCoordinatesSet(uint zoom);
bool test(TileCoordinate x, TileCoordinate y) const;
void set(TileCoordinate x, TileCoordinate y);
size_t size() const;
private:
uint zoom;
std::vector<bool> tiles;
};
struct OutputObjectXYID {
OutputObject oo;
Z6Offset x;
Z6Offset y;
uint64_t id;
};
template<typename OO> void finalizeObjects(
const std::string& name,
const size_t& threadNum,
const unsigned int& baseZoom,
typename std::vector<AppendVectorNS::AppendVector<OO>>::iterator begin,
typename std::vector<AppendVectorNS::AppendVector<OO>>::iterator end,
typename std::vector<std::vector<OO>>& lowZoom
) {
size_t z6OffsetDivisor = baseZoom >= CLUSTER_ZOOM ? (1 << (baseZoom - CLUSTER_ZOOM)) : 1;
#ifdef CLOCK_MONOTONIC
timespec startTs, endTs;
clock_gettime(CLOCK_MONOTONIC, &startTs);
#endif
int i = -1;
for (auto it = begin; it != end; it++) {
i++;
if (it->size() > 0 || i % 10 == 0 || i == 4095) {
std::cout << "\r" << name << ": finalizing z6 tile " << (i + 1) << "/" << CLUSTER_ZOOM_AREA;
#ifdef CLOCK_MONOTONIC
clock_gettime(CLOCK_MONOTONIC, &endTs);
uint64_t elapsedNs = 1e9 * (endTs.tv_sec - startTs.tv_sec) + endTs.tv_nsec - startTs.tv_nsec;
std::cout << " (" << std::to_string((uint32_t)(elapsedNs / 1e6)) << " ms)";
#endif
std::cout << std::flush;
}
if (it->size() == 0)
continue;
// We track a separate copy of low zoom objects to avoid scanning large
// lists of objects that may be on slow disk storage.
for (auto objectIt = it->begin(); objectIt != it->end(); objectIt++)
if (objectIt->oo.minZoom < CLUSTER_ZOOM)
lowZoom[i].push_back(*objectIt);
// If the user is doing a small extract, there are few populated
// entries in `objects`.
//
// e.g. Colorado has ~9 z6 tiles, 1 of which has 95% of its output
// objects.
//
// This optimizes for the small extract case by doing:
// - for each vector in objects
// - do a multi-threaded sort of vector
//
// For small extracts, this ensures that all threads are used even if
// only a handful of entries in `objects` are non-empty.
//
// For a global extract, this will have some overhead of repeatedly
// setting up/tearing down threads. In that case, it would be
// better to assign chunks of `objects` to each thread.
//
// That's a future performance improvement, so deferring for now.
boost::sort::block_indirect_sort(
it->begin(),
it->end(),
[baseZoom](const OO& a, const OO& b) {
// Cluster by parent zoom, so that a subsequent search
// can find a contiguous range of entries for any tile
// at zoom 6 or higher.
const size_t aX = a.x;
const size_t aY = a.y;
const size_t bX = b.x;
const size_t bY = b.y;
for (size_t z = CLUSTER_ZOOM; z <= baseZoom; z++) {
const auto aXz = aX / (1 << (baseZoom - z));
const auto bXz = bX / (1 << (baseZoom - z));
if (aXz != bXz)
return aXz < bXz;
const auto aYz = aY / (1 << (baseZoom - z));
const auto bYz = bY / (1 << (baseZoom - z));
if (aYz != bYz)
return aYz < bYz;
}
return false;
},
threadNum
);
}
std::cout << std::endl;
}
template<typename OO> void collectTilesWithObjectsAtZoomTemplate(
const unsigned int& baseZoom,
const typename std::vector<AppendVectorNS::AppendVector<OO>>::iterator objects,
const size_t size,
std::vector<TileCoordinatesSet>& zooms
) {
size_t maxZoom = zooms.size() - 1;
uint16_t z6OffsetDivisor = baseZoom >= CLUSTER_ZOOM ? (1 << (baseZoom - CLUSTER_ZOOM)) : 1;
int64_t lastX = -1;
int64_t lastY = -1;
for (size_t i = 0; i < size; i++) {
const size_t z6x = i / CLUSTER_ZOOM_WIDTH;
const size_t z6y = i % CLUSTER_ZOOM_WIDTH;
for (size_t j = 0; j < objects[i].size(); j++) {
// Compute the x, y at the base zoom level
TileCoordinate baseX = z6x * z6OffsetDivisor + objects[i][j].x;
TileCoordinate baseY = z6y * z6OffsetDivisor + objects[i][j].y;
// Translate the x, y at the requested zoom level
TileCoordinate x = baseX / (1 << (baseZoom - maxZoom));
TileCoordinate y = baseY / (1 << (baseZoom - maxZoom));
if (lastX != x || lastY != y) {
lastX = x;
lastY = y;
for (int zoom = maxZoom; zoom >= 0; zoom--) {
zooms[zoom].set(x, y);
x /= 2;
y /= 2;
}
}
}
}
}
template<typename OO>
OutputObjectID outputObjectWithId(const OO& input) {
return OutputObjectID({ input.oo, 0 });
}
template<>
inline OutputObjectID outputObjectWithId<OutputObjectXYID>(const OutputObjectXYID& input) {
return OutputObjectID({ input.oo, input.id });
}
template<typename OO> void collectLowZoomObjectsForTile(
const unsigned int& baseZoom,
const typename std::vector<std::vector<OO>>& objects,
unsigned int zoom,
const TileCoordinates& dstIndex,
std::vector<OutputObjectID>& output
) {
if (zoom >= CLUSTER_ZOOM)
throw std::runtime_error("collectLowZoomObjectsForTile should not be called for high zooms");
uint16_t z6OffsetDivisor = baseZoom >= CLUSTER_ZOOM ? (1 << (baseZoom - CLUSTER_ZOOM)) : 1;
for (size_t i = 0; i < objects.size(); i++) {
const size_t z6x = i / CLUSTER_ZOOM_WIDTH;
const size_t z6y = i % CLUSTER_ZOOM_WIDTH;
for (size_t j = 0; j < objects[i].size(); j++) {
// Compute the x, y at the base zoom level
TileCoordinate baseX = z6x * z6OffsetDivisor + objects[i][j].x;
TileCoordinate baseY = z6y * z6OffsetDivisor + objects[i][j].y;
// Translate the x, y at the requested zoom level
TileCoordinate x = baseX / (1 << (baseZoom - zoom));
TileCoordinate y = baseY / (1 << (baseZoom - zoom));
if (dstIndex.x == x && dstIndex.y == y) {
if (objects[i][j].oo.minZoom <= zoom) {
output.push_back(outputObjectWithId(objects[i][j]));
}
}
}
}
}
template<typename OO> void collectObjectsForTileTemplate(
const unsigned int& baseZoom,
typename std::vector<AppendVectorNS::AppendVector<OO>>::iterator objects,
size_t iStart,
size_t iEnd,
unsigned int zoom,
const TileCoordinates& dstIndex,
std::vector<OutputObjectID>& output
) {
if (zoom < CLUSTER_ZOOM)
throw std::runtime_error("collectObjectsForTileTemplate should not be called for low zooms");
uint16_t z6OffsetDivisor = baseZoom >= CLUSTER_ZOOM ? (1 << (baseZoom - CLUSTER_ZOOM)) : 1;
for (size_t i = iStart; i < iEnd; i++) {
// If z >= 6, we can compute the exact bounds within the objects array.
// Translate to the base zoom, then do a binary search to find
// the starting point.
TileCoordinate z6x = dstIndex.x / (1 << (zoom - CLUSTER_ZOOM));
TileCoordinate z6y = dstIndex.y / (1 << (zoom - CLUSTER_ZOOM));
TileCoordinate baseX = dstIndex.x * (1 << (baseZoom - zoom));
TileCoordinate baseY = dstIndex.y * (1 << (baseZoom - zoom));
Z6Offset needleX = baseX - z6x * z6OffsetDivisor;
Z6Offset needleY = baseY - z6y * z6OffsetDivisor;
// Kind of gross that we have to do this. Might be better if we split
// into two arrays, one of x/y and one of OOs. Would have better locality for
// searching, too.
OutputObject dummyOo(POINT_, 0, 0, 0, 0);
const size_t bz = baseZoom;
const OO targetXY = {dummyOo, needleX, needleY };
auto iter = std::lower_bound(
objects[i].begin(),
objects[i].end(),
targetXY,
[bz](const OO& a, const OO& b) {
// Cluster by parent zoom, so that a subsequent search
// can find a contiguous range of entries for any tile
// at zoom 6 or higher.
const size_t aX = a.x;
const size_t aY = a.y;
const size_t bX = b.x;
const size_t bY = b.y;
for (size_t z = CLUSTER_ZOOM; z <= bz; z++) {
const auto aXz = aX / (1 << (bz - z));
const auto aYz = aY / (1 << (bz - z));
const auto bXz = bX / (1 << (bz - z));
const auto bYz = bY / (1 << (bz - z));
if (aXz != bXz)
return aXz < bXz;
if (aYz != bYz)
return aYz < bYz;
}
return false;
}
);
for (; iter != objects[i].end(); iter++) {
// Compute the x, y at the base zoom level
TileCoordinate baseX = z6x * z6OffsetDivisor + iter->x;
TileCoordinate baseY = z6y * z6OffsetDivisor + iter->y;
// Translate the x, y at the requested zoom level
TileCoordinate x = baseX / (1 << (baseZoom - zoom));
TileCoordinate y = baseY / (1 << (baseZoom - zoom));
if (dstIndex.x == x && dstIndex.y == y) {
if (iter->oo.minZoom <= zoom) {
output.push_back(outputObjectWithId(*iter));
}
} else {
// Short-circuit when we're confident we'd no longer see relevant matches.
// We've ordered the entries in `objects` such that all objects that
// share the same tile at any zoom are in contiguous runs.
//
// Thus, as soon as we fail to find a match, we can stop looking.
break;
}
}
}
}
class TileDataSource {
public:
// Store for generated geometries
using point_store_t = std::vector<Point>;
using linestring_t = boost::geometry::model::linestring<Point, std::vector, mmap_allocator>;
using linestring_store_t = std::vector<linestring_t>;
using multi_linestring_t = boost::geometry::model::multi_linestring<linestring_t, std::vector, mmap_allocator>;
using multi_linestring_store_t = std::vector<multi_linestring_t>;
using polygon_t = boost::geometry::model::polygon<Point, true, true, std::vector, std::vector, mmap_allocator, mmap_allocator>;
using multi_polygon_t = boost::geometry::model::multi_polygon<polygon_t, std::vector, mmap_allocator>;
using multi_polygon_store_t = std::vector<multi_polygon_t>;
std::mutex storeMutex;
// Threads can grab one of the stores and work on them in a thread local.
std::vector<std::pair<size_t, point_store_t*>> availablePointStoreLeases;
std::vector<std::pair<size_t, linestring_store_t*>> availableLinestringStoreLeases;
std::vector<std::pair<size_t, multi_linestring_store_t*>> availableMultiLinestringStoreLeases;
std::vector<std::pair<size_t, multi_polygon_store_t*>> availableMultiPolygonStoreLeases;
virtual std::string name() const = 0;
protected:
size_t numShards;
uint8_t shardBits;
std::mutex mutex;
bool includeID;
uint16_t z6OffsetDivisor;
// Guards objects, objectsWithIds.
std::vector<std::mutex> objectsMutex;
// The top-level vector has 1 entry per z6 tile, indexed by x*64 + y
// The inner vector contains the output objects that are contained in that z6 tile
//
// In general, only one of these vectors will be populated.
//
// If config.include_ids is true, objectsWithIds will be populated.
// Otherwise, objects.
std::vector<AppendVectorNS::AppendVector<OutputObjectXY>> objects;
std::vector<std::vector<OutputObjectXY>> lowZoomObjects;
std::vector<AppendVectorNS::AppendVector<OutputObjectXYID>> objectsWithIds;
std::vector<std::vector<OutputObjectXYID>> lowZoomObjectsWithIds;
// rtree index of large objects
using oo_rtree_param_type = boost::geometry::index::quadratic<128>;
boost::geometry::index::rtree< std::pair<Box,OutputObject>, oo_rtree_param_type> boxRtree;
boost::geometry::index::rtree< std::pair<Box,OutputObjectID>, oo_rtree_param_type> boxRtreeWithIds;
unsigned int baseZoom;
std::vector<point_store_t> pointStores;
std::vector<linestring_store_t> linestringStores;
std::vector<multi_linestring_store_t> multilinestringStores;
std::vector<multi_polygon_store_t> multipolygonStores;
ClipCache<MultiPolygon> multiPolygonClipCache;
ClipCache<MultiLinestring> multiLinestringClipCache;
public:
TileDataSource(size_t threadNum, unsigned int baseZoom, bool includeID);
void collectTilesWithObjectsAtZoom(std::vector<TileCoordinatesSet>& zooms);
void collectTilesWithLargeObjectsAtZoom(std::vector<TileCoordinatesSet>& zooms);
void collectObjectsForTile(uint zoom, TileCoordinates dstIndex, std::vector<OutputObjectID>& output);
void finalize(size_t threadNum);
void addGeometryToIndex(
const Linestring& geom,
const std::vector<OutputObject>& outputs,
const uint64_t id
);
void addGeometryToIndex(
const MultiLinestring& geom,
const std::vector<OutputObject>& outputs,
const uint64_t id
);
void addGeometryToIndex(
const MultiPolygon& geom,
std::vector<OutputObject>& outputs, // so we can mutate objectID to skip clip cache
const uint64_t id
);
void addObjectToSmallIndex(const TileCoordinates& index, const OutputObject& oo, uint64_t id);
void addObjectToLargeIndex(const Box& envelope, const OutputObject& oo, uint64_t id) {
std::lock_guard<std::mutex> lock(mutex);
if (id == 0 || !includeID)
boxRtree.insert(std::make_pair(envelope, oo));
else
boxRtreeWithIds.insert(std::make_pair(envelope, OutputObjectID({oo, id})));
}
void collectLargeObjectsForTile(uint zoom, TileCoordinates dstIndex, std::vector<OutputObjectID>& output);
std::vector<OutputObjectID> getObjectsForTile(
const std::vector<bool>& sortOrders,
unsigned int zoom,
TileCoordinates coordinates
);
virtual Geometry buildWayGeometry(OutputGeometryType const geomType, NodeID const objectID, const TileBbox &bbox);
virtual LatpLon buildNodeGeometry(NodeID const objectID, const TileBbox &bbox) const;
void open() {
// Put something at index 0 of all stores so that 0 can be used
// as a sentinel.
pointStores[0].push_back(Point(0,0));
linestringStores[0].push_back(linestring_t());
multipolygonStores[0].push_back(multi_polygon_t());
multilinestringStores[0].push_back(multi_linestring_t());
}
void reportSize() const;
// Accessors for generated geometries
using handle_t = void *;
NodeID storePoint(Point const &input);
inline size_t getShard(NodeID id) const {
// Note: we only allocate 34 bits for the IDs. This allows us to
// use bits 35 and 36 for TileDataSource-specific handling (e.g.,
// OsmMemTiles may want to generate points/ways on the fly by
// referring to the WayStore).
return id >> (TILE_DATA_ID_SIZE - shardBits);
}
virtual void populateMultiPolygon(MultiPolygon& dst, NodeID objectID);
inline size_t getId(NodeID id) const {
return id & (~(~0ull << (TILE_DATA_ID_SIZE - shardBits)));
}
const Point& retrievePoint(NodeID id) const {
const auto& shardId = getShard(id);
const auto& shard = pointStores[shardId];
const auto offset = getId(id);
if (offset >= shard.size()) throw std::out_of_range("Could not find generated node with id " + std::to_string(id) + ", shard " + std::to_string(shardId) + ", offset=" + std::to_string(offset));
return shard.at(offset);
}
NodeID storeLinestring(const Linestring& src);
const linestring_t& retrieveLinestring(NodeID id) const {
const auto& shardId = getShard(id);
const auto& shard = linestringStores[shardId];
const auto offset = getId(id);
if (offset >= shard.size()) throw std::out_of_range("Could not find generated linestring with id " + std::to_string(id) + ", shard " + std::to_string(shardId) + ", offset=" + std::to_string(offset));
return shard.at(offset);
}
NodeID storeMultiLinestring(const MultiLinestring& src);
multi_linestring_t const &retrieveMultiLinestring(NodeID id) const {
const auto& shardId = getShard(id);
const auto& shard = multilinestringStores[shardId];
const auto offset = getId(id);
if (offset >= shard.size()) throw std::out_of_range("Could not find generated multi-linestring with id " + std::to_string(id) + ", shard " + std::to_string(shardId) + ", offset=" + std::to_string(offset));
return shard.at(offset);
}
NodeID storeMultiPolygon(const MultiPolygon& src);
multi_polygon_t const &retrieveMultiPolygon(NodeID id) const {
const auto& shardId = getShard(id);
const auto& shard = multipolygonStores[shardId];
const auto offset = getId(id);
if (offset >= shard.size()) throw std::out_of_range("Could not find generated multi-polygon with id " + std::to_string(id) + ", shard " + std::to_string(shardId) + ", offset=" + std::to_string(offset));
return shard.at(offset);
}
};
void populateTilesAtZoom(
const std::vector<class TileDataSource *>& sources,
std::vector<TileCoordinatesSet>& zooms
);
#endif //_TILE_DATA_H