Hi. We at Delhivery wish to contribute road data to OSM.
We have consolidated trace data that represents road geometries missing from OSM. (The trace data belongs to us and has not been scraped.)
We are facing a few challenges and need help:
What format should be used for a bulk contribution? (.osm or something else?)
How do we transform data containing road geometry and direction into a format that connects the new data to the existing graph, instead of creating a new floating graph that is unreachable from the existing one even though the roads are actually connected?
This is the main challenge, and as far as I understand it is probably close to the concept of ‘conflation’.
We lack go-to knowledge on large-scale OSM contribution. Some organisation must have done this in the past with help from the OSM community.
With whom should we coordinate for a programmatic contribution? (Both operational and technical guidance might be needed.)
My recommendation is to avoid any attempt at automatically contributing this data. Automatically contributing data would have to surmount the following obstacles:
connectivity with existing road network, as you have already mentioned;
potential duplication of already-existing roads;
potential overlap with other already-existing features (what if one of your roads goes through a building in OSM?);
you will not know whether your traces are private driveways, service roads, residential roads, or even larger roads, therefore the contribution will be of limited quality.
I would recommend making your data available to the volunteer community as some form of editor backdrop imagery and/or a web tool that highlights missing bits, so that the data can be manually incorporated by the community. This will take longer but it will lead to a more satisfying result.
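As a rough illustration of the backdrop idea, traces could be exported as GeoJSON line features, a format that OSM editors can generally load as an overlay. This is only a sketch: the input structure (a list of dicts with `id` and `coords` keys) and the function name `traces_to_geojson` are assumptions made for the example, not anything Delhivery has described.

```python
import json


def traces_to_geojson(traces):
    """Build a GeoJSON FeatureCollection from trace polylines.

    `traces` is a hypothetical input shape:
    [{"id": str, "coords": [(lat, lon), ...]}, ...]
    """
    features = []
    for trace in traces:
        features.append({
            "type": "Feature",
            "properties": {"trace_id": trace["id"]},
            "geometry": {
                "type": "LineString",
                # GeoJSON orders each coordinate as [longitude, latitude].
                "coordinates": [[lon, lat] for lat, lon in trace["coords"]],
            },
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up sample trace, for illustration only.
sample = [{"id": "t1", "coords": [(28.6139, 77.2090), (28.6145, 77.2101)]}]
geojson_text = json.dumps(traces_to_geojson(sample), indent=2)
```

The resulting text can be saved to a `.geojson` file and shared with mappers, who then trace and tag the roads by hand.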
Hi Frederik, those are fair concerns and here is how we plan to avoid them:
Connectivity: Snap segments to existing nodes, or add new nodes where needed to both the proposed roads and the existing ones.
Duplication: The data is filtered by snapping it onto the existing network, so only genuinely new geometry makes it through.
Overlap: Use a human-validation loop here.
Road tags: The human-validation loop will take care of this too. AI will infer road surface type, while speed and directionality will be inferred from the trace data.
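The connectivity and duplication checks above can be sketched roughly as follows. This is a simplified illustration, not our actual pipeline: it snaps only to existing nodes (real conflation would also need to snap to positions along existing ways), it uses a brute-force nearest-node search, and the 15 m threshold, the `nodes` dictionary and the function names are all assumptions made for the example.

```python
import math

EARTH_RADIUS_M = 6371000.0


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def nearest_node(point, nodes):
    """Return (node_id, distance_m) for the existing node closest to `point`."""
    return min(((node_id, haversine_m(point, pos)) for node_id, pos in nodes.items()),
               key=lambda pair: pair[1])


def classify_trace(trace, nodes, snap_m=15.0):
    """Classify a trace against existing nodes (hypothetical helper).

    'duplicate': every trace point lies within snap_m of an existing node.
    'candidate': at least one endpoint snaps, so the new road would connect.
    'floating':  neither endpoint snaps; needs human review before any use.
    """
    snapped = [nearest_node(p, nodes) for p in trace]
    near = [dist <= snap_m for _, dist in snapped]
    start_node = snapped[0][0] if near[0] else None
    end_node = snapped[-1][0] if near[-1] else None
    if all(near):
        status = "duplicate"
    elif start_node is None and end_node is None:
        status = "floating"
    else:
        status = "candidate"
    return {"status": status, "start_node": start_node, "end_node": end_node}


# Made-up existing nodes and a trace that starts at node 1 and leads away.
nodes = {1: (28.6139, 77.2090), 2: (28.6140, 77.2100)}
result = classify_trace([(28.61391, 77.20901), (28.6160, 77.2090)], nodes)
```

For real data volumes the linear scan would need to be replaced by a spatial index, and every 'candidate' would still go through the human-validation loop.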
Manual addition of data by the community does indeed build confidence, but it has at least the following limitations:
We cannot purchase satellite imagery to share externally.
Having chosen to open-source the data rather than keep an internal version, we still need to make it available as soon as possible for it to add value. Given the existing state of maps in non-metro cities, the manual process might take so long that waiting becomes infeasible.