faster tag map, faster Find()/Holds(), avoid mallocs

Cherry-picked from
https://github.com/systemed/tilemaker/pull/604/commits/b3221667a9d2366410dbfdc7f25f3062d7a135ef,
https://github.com/systemed/tilemaker/pull/604/commits/5c807a9841b866c6dc403141effd4c9d14459034,
https://github.com/systemed/tilemaker/pull/604/commits/13b3465f1c80052aa2d622e3915af08b8c5eae9a
and fixed up to work with protozero's data_view structure.

Original commit messages below, the timings will vary but the idea is
the same:

Faster tagmap
=====

Building a std::map for tags is somewhat expensive, especially when
we know that the number of tags is usually quite small.

Instead, use a custom structure that does a crappy-but-fast hash
to put the keys/values in one of 16 buckets, then linear search
the bucket.

For GB, before:
```
real 1m11.507s
user 16m49.604s
sys 0m17.381s
```

After:
```
real	1m9.557s
user	16m28.826s
sys	0m17.937s
```

Saving 2 seconds of wall clock and 20 seconds of user time doesn't
seem like much, but (a) it's not nothing and (b) having the tags
in this format will enable us to thwart some of Lua's defensive
copies in a subsequent commit.

A note about the hash function: hashing each letter of the string
using boost::hash_combine eliminated the time savings.

Faster Find()/Holds()
=====

We (ab?)use kaguya's parameter serialization machinery. Rather than
take a `std::string`, we take a `KnownTagKey` and teach Lua how to
convert a Lua string into a `KnownTagKey`.

This avoids the need to do a defensive copy of the string when coming
from Lua.

It provides a modest boost:

```
real 1m8.859s
user 16m13.292s
sys 0m18.104s
```

Most keys are short enough to fit in the small-string optimization, so
this doesn't help us avoid mallocs. An exception is `addr:housenumber`,
which, at 16 bytes, exceeds g++'s limit of 15 bytes.

It should be possible to also apply a similar trick to the `Attribute(...)`
functions, to avoid defensive copies of strings that we've seen as keys
or values.

avoid malloc for Attribute with long strings
=====

After:

```
real	1m8.124s
user	16m6.620s
sys	0m16.808s
```

Looks like we're solidly into diminishing returns at this point.
This commit is contained in:
Colin Dellow
2023-12-03 17:16:32 -05:00
parent 53b48e24d0
commit c4518f3cca
8 changed files with 331 additions and 51 deletions
+1
View File
@@ -96,6 +96,7 @@ file(GLOB tilemaker_src_files
src/shp_mem_tiles.cpp
src/sorted_node_store.cpp
src/sorted_way_store.cpp
src/tag_map.cpp
src/tile_data.cpp
src/tilemaker.cpp
src/tile_worker.cpp
+1
View File
@@ -120,6 +120,7 @@ tilemaker: \
src/shp_mem_tiles.o \
src/sorted_node_store.o \
src/sorted_way_store.o \
src/tag_map.o \
src/tile_data.o \
src/tilemaker.o \
src/tile_worker.o \
+20 -11
View File
@@ -17,6 +17,8 @@
#include <boost/container/flat_map.hpp>
class TagMap;
// Lua
extern "C" {
#include "lua.h"
@@ -32,6 +34,19 @@ extern bool verbose;
class AttributeStore;
class AttributeSet;
// A string, which might be in `currentTags` as a value. If Lua
// code refers to an absent value, it'll fallback to passing
// it as a std::string.
//
// The intent is that Attribute("name", Find("name")) is a common
// pattern, and we ought to avoid marshalling a string back and
// forth from C++ to Lua when possible.
struct PossiblyKnownTagValue {
bool found;
uint32_t index;
std::string fallback;
};
/**
\brief OsmLuaProcessing - converts OSM objects into OutputObjects.
@@ -75,31 +90,25 @@ public:
using tag_map_t = boost::container::flat_map<protozero::data_view, protozero::data_view, DataViewLessThan>;
// Scan non-MP relation
bool scanRelation(WayID id, const tag_map_t &tags);
bool scanRelation(WayID id, const TagMap& tags);
/// \brief We are now processing a significant node
void setNode(NodeID id, LatpLon node, const tag_map_t &tags);
void setNode(NodeID id, LatpLon node, const TagMap& tags);
/// \brief We are now processing a way
bool setWay(WayID wayId, LatpLonVec const &llVec, const tag_map_t &tags);
bool setWay(WayID wayId, LatpLonVec const &llVec, const TagMap& tags);
/** \brief We are now processing a relation
* (note that we store relations as ways with artificial IDs, and that
* we use decrementing positive IDs to give a bit more space for way IDs)
*/
void setRelation(int64_t relationId, WayVec const &outerWayVec, WayVec const &innerWayVec, const tag_map_t &tags, bool isNativeMP, bool isInnerOuter);
void setRelation(int64_t relationId, WayVec const &outerWayVec, WayVec const &innerWayVec, const TagMap& tags, bool isNativeMP, bool isInnerOuter);
// ---- Metadata queries called from Lua
// Get the ID of the current object
std::string Id() const;
// Check if there's a value for a given key
bool Holds(const std::string& key) const;
// Get an OSM tag for a given key (or return empty string if none)
const std::string Find(const std::string& key) const;
// ---- Spatial queries called from Lua
// Find intersecting shapefile layer
@@ -200,6 +209,7 @@ public:
inline AttributeStore &getAttributeStore() { return attributeStore; }
struct luaProcessingException :std::exception {};
const TagMap* currentTags;
private:
/// Internal: clear current cached state
@@ -259,7 +269,6 @@ private:
class LayerDefinition &layers;
std::vector<std::pair<OutputObject, AttributeSet>> outputs; // All output objects that have been created
const boost::container::flat_map<protozero::data_view, protozero::data_view, DataViewLessThan>* currentTags;
std::vector<OutputObject> finalizeOutputs();
+3 -6
View File
@@ -9,6 +9,7 @@
#include <map>
#include "osm_store.h"
#include "pbf_reader.h"
#include "tag_map.h"
#include <protozero/data_view.hpp>
class OsmLuaProcessing;
@@ -62,16 +63,12 @@ public:
);
// Read tags into a map from a way/node/relation
using tag_map_t = boost::container::flat_map<protozero::data_view, protozero::data_view, DataViewLessThan>;
template<typename T>
void readTags(T& pbfObject, const PbfReader::PrimitiveBlock& pb, tag_map_t& tags) {
tags.reserve(pbfObject.keys.size());
void readTags(T &pbfObject, PbfReader::PrimitiveBlock const &pb, TagMap& tags) {
for (uint n=0; n < pbfObject.keys.size(); n++) {
auto keyIndex = pbfObject.keys[n];
auto valueIndex = pbfObject.vals[n];
protozero::data_view key = pb.stringTable[keyIndex];
protozero::data_view value = pb.stringTable[valueIndex];
tags[key] = value;
tags.addTag(pb.stringTable[keyIndex], pb.stringTable[valueIndex]);
}
}
+56
View File
@@ -0,0 +1,56 @@
#ifndef _TAG_MAP_H
#define _TAG_MAP_H
#include <vector>
#include <string>
#include <boost/container/flat_map.hpp>
#include <protozero/data_view.hpp>
// We track tags in a special structure, which enables some tricks when
// doing Lua interop.
//
// The alternative is a std::map - but often, our map is quite small.
// It's preferable to have a small set of vectors and do linear search.
//
// Further, we can avoid passing std::string from Lua -> C++ in some cases
// by first checking to see if the string we would have passed is already
// stored in our tag map, and passing a reference to its location.
// Assumptions:
// 1. Not thread-safe
// This is OK because we have 1 instance of OsmLuaProcessing per thread.
// 2. Lifetime of map is less than lifetime of keys/values that are passed
// This is true since the strings are owned by the protobuf block reader
// 3. Max number of tag values will fit in a short
// OSM limit is 5,000 tags per object
class TagMap {
public:
TagMap();
void reset();
void addTag(const protozero::data_view& key, const protozero::data_view& value);
// Return -1 if key not found, else return its keyLoc.
int64_t getKey(const char* key, size_t size) const;
// Return -1 if value not found, else return its keyLoc.
int64_t getValue(const char* key, size_t size) const;
const protozero::data_view* getValueFromKey(uint32_t keyLoc) const;
const protozero::data_view* getValue(uint32_t valueLoc) const;
boost::container::flat_map<std::string, std::string> exportToBoostMap() const;
private:
uint32_t ensureString(
std::vector<std::vector<const protozero::data_view*>>& vector,
const protozero::data_view& value
);
std::vector<std::vector<const protozero::data_view*>> keys;
std::vector<std::vector<uint32_t>> key2value;
std::vector<std::vector<const protozero::data_view*>> values;
};
#endif _TAG_MAP_H
+102 -25
View File
@@ -5,16 +5,108 @@
#include "helpers.h"
#include "coordinates_geom.h"
#include "osm_mem_tiles.h"
#include "tag_map.h"
using namespace std;
const std::string EMPTY_STRING = "";
thread_local kaguya::State *g_luaState = nullptr;
thread_local OsmLuaProcessing* osmLuaProcessing = nullptr;
// A key in `currentTags`. If Lua code refers to an absent key,
// found will be false.
struct KnownTagKey {
bool found;
uint32_t index;
};
template<> struct kaguya::lua_type_traits<KnownTagKey> {
typedef KnownTagKey get_type;
typedef const KnownTagKey& push_type;
static bool strictCheckType(lua_State* l, int index)
{
return lua_type(l, index) == LUA_TSTRING;
}
static bool checkType(lua_State* l, int index)
{
return lua_isstring(l, index) != 0;
}
static get_type get(lua_State* l, int index)
{
KnownTagKey rv = { false, 0 };
size_t size = 0;
const char* buffer = lua_tolstring(l, index, &size);
int64_t tagLoc = osmLuaProcessing->currentTags->getKey(buffer, size);
if (tagLoc >= 0) {
rv.found = true;
rv.index = tagLoc;
}
// std::string key(buffer, size);
// std::cout << "for key " << key << ": rv.found=" << rv.found << ", rv.index=" << rv.index << std::endl;
return rv;
}
static int push(lua_State* l, push_type s)
{
throw std::runtime_error("Lua code doesn't know how to use KnownTagKey");
}
};
template<> struct kaguya::lua_type_traits<PossiblyKnownTagValue> {
typedef PossiblyKnownTagValue get_type;
typedef const PossiblyKnownTagValue& push_type;
static bool strictCheckType(lua_State* l, int index)
{
return lua_type(l, index) == LUA_TSTRING;
}
static bool checkType(lua_State* l, int index)
{
return lua_isstring(l, index) != 0;
}
static get_type get(lua_State* l, int index)
{
PossiblyKnownTagValue rv = { false, 0 };
size_t size = 0;
const char* buffer = lua_tolstring(l, index, &size);
// For long strings where we might need to do a malloc, see if we
// can instead pass a pointer to a value from this object's tag
// map.
//
// 15 is the threshold where gcc no longer applies the small string
// optimization.
if (size > 15) {
int64_t tagLoc = osmLuaProcessing->currentTags->getValue(buffer, size);
if (tagLoc >= 0) {
rv.found = true;
rv.index = tagLoc;
return rv;
}
}
rv.fallback = std::string(buffer, size);
return rv;
}
static int push(lua_State* l, push_type s)
{
throw std::runtime_error("Lua code doesn't know how to use PossiblyKnownTagValue");
}
};
std::string rawId() { return osmLuaProcessing->Id(); }
bool rawHolds(const std::string& key) { return osmLuaProcessing->Holds(key); }
const std::string& rawFind(const std::string& key) { return osmLuaProcessing->Find(key); }
bool rawHolds(const KnownTagKey& key) { return key.found; }
const std::string rawFind(const KnownTagKey& key) {
if (key.found) {
auto value = *(osmLuaProcessing->currentTags->getValueFromKey(key.index));
return std::string(value.data(), value.size());
}
return EMPTY_STRING;
}
std::vector<std::string> rawFindIntersecting(const std::string &layerName) { return osmLuaProcessing->FindIntersecting(layerName); }
bool rawIntersects(const std::string& layerName) { return osmLuaProcessing->Intersects(layerName); }
std::vector<std::string> rawFindCovering(const std::string& layerName) { return osmLuaProcessing->FindCovering(layerName); }
@@ -36,7 +128,6 @@ std::vector<double> rawCentroid() { return osmLuaProcessing->Centroid(); }
bool supportsRemappingShapefiles = false;
const std::string EMPTY_STRING = "";
int lua_error_handler(int errCode, const char *errMessage)
{
@@ -156,18 +247,6 @@ string OsmLuaProcessing::Id() const {
return to_string(originalOsmID);
}
// Check if there's a value for a given key
bool OsmLuaProcessing::Holds(const string& key) const {
return currentTags->find(key) != currentTags->end();
}
// Get an OSM tag for a given key (or return empty string if none)
const string OsmLuaProcessing::Find(const string& key) const {
auto it = currentTags->find(key);
if(it == currentTags->end()) return EMPTY_STRING;
return std::string(it->second.data(), it->second.size());
}
// ---- Spatial queries called from Lua
vector<string> OsmLuaProcessing::FindIntersecting(const string &layerName) {
@@ -606,7 +685,7 @@ void OsmLuaProcessing::setVectorLayerMetadata(const uint_least8_t layer, const s
// Scan relation (but don't write geometry)
// return true if we want it, false if we don't
bool OsmLuaProcessing::scanRelation(WayID id, const tag_map_t &tags) {
bool OsmLuaProcessing::scanRelation(WayID id, const TagMap& tags) {
reset();
originalOsmID = id;
isWay = false;
@@ -620,15 +699,13 @@ bool OsmLuaProcessing::scanRelation(WayID id, const tag_map_t &tags) {
}
if (!relationAccepted) return false;
boost::container::flat_map<std::string, std::string> m;
for (const auto& i : tags) {
m[std::string(i.first.data(), i.first.size())] = std::string(i.second.data(), i.second.size());
}
osmStore.store_relation_tags(id, m);
// If we're persisting, we need to make a real map that owns its
// own keys and values.
osmStore.store_relation_tags(id, tags.exportToBoostMap());
return true;
}
void OsmLuaProcessing::setNode(NodeID id, LatpLon node, const tag_map_t &tags) {
void OsmLuaProcessing::setNode(NodeID id, LatpLon node, const TagMap& tags) {
reset();
originalOsmID = id;
@@ -656,7 +733,7 @@ void OsmLuaProcessing::setNode(NodeID id, LatpLon node, const tag_map_t &tags) {
}
// We are now processing a way
bool OsmLuaProcessing::setWay(WayID wayId, LatpLonVec const &llVec, const tag_map_t &tags) {
bool OsmLuaProcessing::setWay(WayID wayId, LatpLonVec const &llVec, const TagMap& tags) {
reset();
wayEmitted = false;
originalOsmID = wayId;
@@ -706,7 +783,7 @@ bool OsmLuaProcessing::setWay(WayID wayId, LatpLonVec const &llVec, const tag_ma
}
// We are now processing a relation
void OsmLuaProcessing::setRelation(int64_t relationId, WayVec const &outerWayVec, WayVec const &innerWayVec, const tag_map_t &tags,
void OsmLuaProcessing::setRelation(int64_t relationId, WayVec const &outerWayVec, WayVec const &innerWayVec, const TagMap& tags,
bool isNativeMP, // only OSM type=multipolygon
bool isInnerOuter) { // any OSM relation with "inner" and "outer" roles (e.g. type=multipolygon|boundary)
reset();
+13 -9
View File
@@ -28,6 +28,8 @@ bool PbfProcessor::ReadNodes(OsmLuaProcessing& output, PbfReader::PrimitiveGroup
{
// ---- Read nodes
std::vector<NodeStore::element_t> nodes;
TagMap tags;
for (auto& node : pg.nodes()) {
NodeID nodeId = node.id;
@@ -45,17 +47,15 @@ bool PbfProcessor::ReadNodes(OsmLuaProcessing& output, PbfReader::PrimitiveGroup
nodes.push_back(std::make_pair(static_cast<NodeID>(nodeId), latplon));
if (significant) {
tags.reset();
// For tagged nodes, call Lua, then save the OutputObject
boost::container::flat_map<protozero::data_view, protozero::data_view, DataViewLessThan> tags;
tags.reserve((node.tagEnd - node.tagStart) / 2);
for (int n = node.tagStart; n < node.tagEnd; n += 2) {
auto keyIndex = pg.translateNodeKeyValue(n);
auto valueIndex = pg.translateNodeKeyValue(n + 1);
protozero::data_view key{pb.stringTable[keyIndex].data(), pb.stringTable[keyIndex].size()};
protozero::data_view value{pb.stringTable[valueIndex].data(), pb.stringTable[valueIndex].size()};
tags[key] = value;
const protozero::data_view& key = pb.stringTable[keyIndex];
const protozero::data_view& value = pb.stringTable[valueIndex];
tags.addTag(key, value);
}
output.setNode(static_cast<NodeID>(nodeId), latplon, tags);
}
@@ -84,6 +84,7 @@ bool PbfProcessor::ReadWays(
std::vector<WayStore::ll_element_t> llWays;
std::vector<std::pair<WayID, std::vector<NodeID>>> nodeWays;
TagMap tags;
LatpLonVec llVec;
std::vector<NodeID> nodeVec;
@@ -131,7 +132,7 @@ bool PbfProcessor::ReadWays(
if (llVec.empty()) continue;
try {
tag_map_t tags;
tags.reset();
readTags(pbfWay, pb, tags);
bool emitted = output.setWay(static_cast<WayID>(pbfWay.id), llVec, tags);
@@ -164,6 +165,8 @@ bool PbfProcessor::ScanRelations(OsmLuaProcessing& output, PbfReader::PrimitiveG
if (pg.relations().empty())
return false;
TagMap tags;
int typeKey = findStringPosition(pb, "type");
int mpKey = findStringPosition(pb, "multipolygon");
@@ -173,7 +176,7 @@ bool PbfProcessor::ScanRelations(OsmLuaProcessing& output, PbfReader::PrimitiveG
WayID relid = static_cast<WayID>(pbfRelation.id);
if (!isMultiPolygon) {
if (output.canReadRelations()) {
tag_map_t tags;
tags.reset();
readTags(pbfRelation, pb, tags);
isAccepted = output.scanRelation(relid, tags);
}
@@ -202,6 +205,7 @@ bool PbfProcessor::ReadRelations(
if (pg.relations().empty())
return false;
TagMap tags;
std::vector<RelationStore::element_t> relations;
int typeKey = findStringPosition(pb, "type");
@@ -245,7 +249,7 @@ bool PbfProcessor::ReadRelations(
continue;
try {
tag_map_t tags;
tags.reset();
readTags(pbfRelation, pb, tags);
output.setRelation(pbfRelation.id, outerWayVec, innerWayVec, tags, isMultiPolygon, isInnerOuter);
+135
View File
@@ -0,0 +1,135 @@
#include "tag_map.h"
#include <boost/functional/hash.hpp>
#include <iostream>
TagMap::TagMap() {
keys.resize(16);
key2value.resize(16);
values.resize(16);
}
void TagMap::reset() {
for (int i = 0; i < 16; i++) {
keys[i].clear();
key2value[i].clear();
values[i].clear();
}
}
const std::size_t hashString(const std::string& str) {
// This is a pretty crappy hash function in terms of bit
// avalanching and distribution of output values.
//
// But it's very good in terms of speed, which turns out
// to be the important measure.
std::size_t hash = str.size();
if (hash >= 4)
hash ^= *(uint32_t*)str.data();
return hash;
}
const std::size_t hashString(const char* str, size_t size) {
// This is a pretty crappy hash function in terms of bit
// avalanching and distribution of output values.
//
// But it's very good in terms of speed, which turns out
// to be the important measure.
std::size_t hash = size;
if (hash >= 4)
hash ^= *(uint32_t*)str;
return hash;
}
uint32_t TagMap::ensureString(
std::vector<std::vector<const protozero::data_view*>>& vector,
const protozero::data_view& value
) {
std::size_t hash = hashString(value.data(), value.size());
const uint16_t shard = hash % vector.size();
for (int i = 0; i < vector[shard].size(); i++)
if (*(vector[shard][i]) == value)
return shard << 16 | i;
vector[shard].push_back(&value);
return shard << 16 | (vector[shard].size() - 1);
}
void TagMap::addTag(const protozero::data_view& key, const protozero::data_view& value) {
uint32_t valueLoc = ensureString(values, value);
// std::cout << "valueLoc = " << valueLoc << std::endl;
uint32_t keyLoc = ensureString(keys, key);
// std::cout << "keyLoc = " << keyLoc << std::endl;
const uint16_t shard = keyLoc >> 16;
const uint16_t pos = keyLoc;
// std::cout << "shard=" << shard << ", pos=" << pos << std::endl;
if (key2value[shard].size() <= pos) {
// std::cout << "growing shard" << std::endl;
key2value[shard].resize(pos + 1);
}
key2value[shard][pos] = valueLoc;
}
int64_t TagMap::getKey(const char* key, size_t size) const {
// Return -1 if key not found, else return its keyLoc.
std::size_t hash = hashString(key, size);
const uint16_t shard = hash % keys.size();
for (int i = 0; i < keys[shard].size(); i++) {
const protozero::data_view& candidate = *keys[shard][i];
if (candidate.size() != size)
continue;
if (memcmp(candidate.data(), key, size) == 0)
return shard << 16 | i;
}
return -1;
}
int64_t TagMap::getValue(const char* value, size_t size) const {
// Return -1 if value not found, else return its valueLoc.
std::size_t hash = hashString(value, size);
const uint16_t shard = hash % values.size();
for (int i = 0; i < values[shard].size(); i++) {
const protozero::data_view& candidate = *values[shard][i];
if (candidate.size() != size)
continue;
if (memcmp(candidate.data(), value, size) == 0)
return shard << 16 | i;
}
return -1;
}
const protozero::data_view* TagMap::getValueFromKey(uint32_t keyLoc) const {
const uint32_t valueLoc = key2value[keyLoc >> 16][keyLoc & 0xFFFF];
return values[valueLoc >> 16][valueLoc & 0xFFFF];
}
const protozero::data_view* TagMap::getValue(uint32_t valueLoc) const {
return values[valueLoc >> 16][valueLoc & 0xFFFF];
}
boost::container::flat_map<std::string, std::string> TagMap::exportToBoostMap() const {
boost::container::flat_map<std::string, std::string> rv;
for (int i = 0; i < keys.size(); i++) {
for (int j = 0; j < keys[i].size(); j++) {
uint32_t valueLoc = key2value[i][j];
auto key = *keys[i][j];
auto value = *values[valueLoc >> 16][valueLoc & 0xFFFF];
rv[std::string(key.data(), key.size())] = std::string(value.data(), value.size());
}
}
return rv;
}