Best way to merge external data into an OSM pbf

What are your recommendations on how to merge external data into an OSM pbf file?

My use case:

  • I have some data like a CSV with osm_id, osm_type, additional_tags
  • I want to add those additional_tags to an osm pbf
  • I would use this file with osm2pgsql to make decisions in Lua based on the OSM tags and the additional tags

I found that BRouter has a process that adds “pseudo tags” before the pbf is imported into BRouter: Die Berechung der Welt, Generierung von Pseudo-Tags für den BRouter - media.ccc.de (Is that what they do? How?)

I found I could run a full osm2pgsql run, then add tags in SQL, then export this into a PBF again from Postgres following How can we export data from PostgreSQL database into *.osm.pbf? - #2 by Richard. I was hoping for a process that is a bit more direct and maybe faster?

I could probably create artificial changesets and apply those with osmium: Osmium manual pages – osmium-apply-changes (1). That would probably work. (Any examples of this in action?) I assume those artificial changesets would then become the last change for those objects (changing the last edit user, last edit timestamp and version), which would not be ideal (but maybe OK…)

Creating artificial changes will probably not work (easily) because you presumably want to keep the existing tags, and a change replaces the whole object, so each change would have to carry all existing tags as well. But there are easier options anyway: One is to use PyOsmium, write a small script that reads the CSV file into memory and then goes through all the OSM data and adds the relevant tags. Another would be to hack something with the OPL file format, which all osmium tools can read and write. But you have to make sure you don’t get any duplicate keys that way.
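To illustrate the OPL idea, here is a minimal sketch in plain Python that merges extra tags into the `T` field of a single OPL line. The function name `add_tags_to_opl_line` is made up for this example. It assumes that keys and values contain no characters that OPL would escape (spaces, commas, `=`, `%`, `@`), and it lets the extra tags overwrite existing keys so that no key appears twice. Lines without a `T` field are returned unchanged.

```python
def add_tags_to_opl_line(line, extra_tags):
    """Return the OPL line with extra_tags merged into its T (tags) field.

    Assumption: no key or value needs OPL escaping.
    """
    fields = line.rstrip("\n").split(" ")
    for i, field in enumerate(fields):
        if field.startswith("T"):
            # Parse "Tk1=v1,k2=v2" into a dict; an empty T field has no tags.
            existing = dict(kv.split("=", 1) for kv in field[1:].split(",") if kv)
            # Extra tags overwrite existing ones, so no key is duplicated.
            merged = {**existing, **extra_tags}
            fields[i] = "T" + ",".join(f"{k}={v}" for k, v in merged.items())
            break
    return " ".join(fields)
```

You would convert the pbf to OPL with osmium, run each line through a function like this (looking up the extra tags by the type and ID at the start of the line), and convert the result back to pbf.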


You can do this with pyosmium. It works similarly to what is described in the cookbook Adding Relation Information to Member Ways - Pyosmium 4.0.2 :

Load your CSV into three dictionaries, one for each OSM type, with osm_id as key and your additional tags as value. Then follow the second part of the cookbook: create a writer for your output file, open a reader for your input file, add three ID filters (one for each OSM type) and a `handler_for_filtered(writer)`. Inside the loop, merge the tags and write the new object.

Without having tried it, the code should roughly look like this:

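import csv
import osmium

# Extra tags per OSM type, keyed by the letter that type_str() returns.
extra = {'n': {}, 'w': {}, 'r': {}}

with open('extra.csv', newline='') as fd:
    for row in csv.DictReader(fd):
        # Assumptions: osm_type starts with n/w/r (e.g. "node" or "N") and
        # additional_tags looks like "key=value;key2=value2". Adapt to your CSV.
        tags = dict(kv.split('=', 1) for kv in row['additional_tags'].split(';') if kv)
        # IDs must be integers to match the object IDs in the OSM file.
        extra[row['osm_type'][0].lower()][int(row['osm_id'])] = tags

with osmium.SimpleWriter('myoutput.pbf', overwrite=True) as writer:
    # Objects that match no ID filter go straight to the writer;
    # only objects with extra tags reach the loop below.
    fp = osmium.FileProcessor('myinput.pbf')\
               .with_filter(osmium.filter.IdFilter(extra['n'].keys()).enable_for(osmium.osm.NODE))\
               .with_filter(osmium.filter.IdFilter(extra['w'].keys()).enable_for(osmium.osm.WAY))\
               .with_filter(osmium.filter.IdFilter(extra['r'].keys()).enable_for(osmium.osm.RELATION))\
               .handler_for_filtered(writer)

    for obj in fp:
        # Existing tags first, so the extra tags win on duplicate keys.
        tags = dict(obj.tags)
        tags.update(extra[obj.type_str()][obj.id])
        writer.add(obj.replace(tags=tags))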

Or, I forgot a solution that might make more sense in your case: read the CSV in the osm2pgsql Lua file first thing, then merge the data in on the fly while the objects are processed.


Arndt (BRouter) also replied to me and explained how this is solved in BRouter. For the record, I am copying it here:

We don't do this at the level of the PBF file, but one step further along the processing chain, when the tags for a way ID are already available as a key-value map.

The following Java class reads the pseudo tags from a CSV into an in-memory map and then adds them to the existing tags in a key-value map:

brouter/brouter-map-creator/src/main/java/btools/mapcreator/DatabasePseudoTagProvider.java at master · abrensch/brouter · GitHub
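
The idea described above can be sketched in a few lines of Python. This is not a port of BRouter's Java class, only an illustration of the same pattern: the class name `PseudoTagProvider`, the method `add_tags` and the CSV columns `way_id`, `key` and `value` are all made up for this example.

```python
import csv


class PseudoTagProvider:
    """Hypothetical helper: keeps extra tags per way ID in memory."""

    def __init__(self, csv_file):
        # Assumed CSV layout: one row per tag with columns way_id, key, value.
        self.tags_by_way = {}
        for row in csv.DictReader(csv_file):
            way_id = int(row["way_id"])
            self.tags_by_way.setdefault(way_id, {})[row["key"]] = row["value"]

    def add_tags(self, way_id, tags):
        """Add the pseudo tags for way_id to the given tag map and return it."""
        tags.update(self.tags_by_way.get(way_id, {}))
        return tags
```

The lookup happens once per way at the point where its tags are already a key-value map, so the PBF file itself is never rewritten.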