I’m working on a solution to acquire, and keep updated, OSM data in a Postgres (PostGIS) database.
I’m currently using osm2pgsql to import data from Geofabrik.
To keep this data up to date, I am using Pyosmium (a Python wrapper around Osmium) to download change files in the OSC format (which is just XML).
Osm2pgsql can then read the change files and perform updates.
The challenge I’m facing is that I also need to raise events based on which ways have changed. Basically, I need to figure out exactly which roads/tracks have just been altered so that I can notify downstream systems, which then need to reprocess the data for our internal business needs.
The problem is that I’m not sure how to determine exactly what has changed. From reading around, the OSC/XML OsmChange files are not actually a good choice for this purpose. The reason, as I understand it, is that a node that is part of a way can change without the way itself changing. For example, if the shape of an off-road track changes because the lat/lon of one of its nodes has changed, then the change file will include the node, but not the way. This means that I can’t raise an event about the way based on the OSC file alone.
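To make the gap concrete, here is a minimal sketch of what I mean, using only the Python standard library. The sample OsmChange document, the IDs in it, and the `way_nodes` mapping are all made up for illustration; in practice the way-to-node membership would have to come from somewhere other than the change file (for example, data that has already been imported), and that is exactly the piece I’m missing.

```python
import xml.etree.ElementTree as ET

# Hypothetical OsmChange content: node 101 is moved, way 900 is newly created.
SAMPLE_OSC = """<osmChange version="0.6" generator="example">
  <modify>
    <node id="101" version="2" lat="51.5001" lon="-0.1002"/>
  </modify>
  <create>
    <way id="900" version="1">
      <nd ref="201"/>
      <nd ref="202"/>
      <tag k="highway" v="track"/>
    </way>
  </create>
</osmChange>
"""


def changed_ids(osc_xml):
    """Collect the IDs of nodes and ways listed explicitly in the change file."""
    root = ET.fromstring(osc_xml)
    changed = {"node": set(), "way": set()}
    for action in root:      # <create>, <modify>, <delete>
        for obj in action:   # <node>, <way>, <relation>
            if obj.tag in changed:
                changed[obj.tag].add(int(obj.get("id")))
    return changed


def ways_touched_by_nodes(changed_node_ids, way_nodes):
    """Find ways whose geometry is affected because one of their nodes changed.

    way_nodes is an assumed mapping {way_id: [node_id, ...]} that is NOT
    available from the change file itself.
    """
    return {
        way_id
        for way_id, node_ids in way_nodes.items()
        if changed_node_ids & set(node_ids)
    }


changes = changed_ids(SAMPLE_OSC)

# Hypothetical existing data: way 500 uses node 101 but is not in the change file.
way_nodes = {500: [100, 101, 102], 501: [300, 301]}
indirect = ways_touched_by_nodes(changes["node"], way_nodes)
```

Here `changes["way"]` contains only way 900, while `indirect` contains way 500, whose shape has changed even though the change file never mentions it.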
Both this answer and this answer tend to agree that the OSC files don’t include enough context about what has really changed.
Then I wondered whether I could use osm2pgsql’s processing callbacks. The problem is that, according to the documentation:
“These functions are called for each new or modified OSM object in the input file.”
Well, we’ve already established that the input file (the OSC OsmChange file) doesn’t have enough information or context.
So my question is: Based on an OSC OsmChange file that is about to be given to osm2pgsql, how can I determine exactly which roads/tracks are about to be updated, so that I can raise events/notifications for downstream systems?