New algorithm for inferring road surface tags

Hi all! This is my first forum post, so apologies if this is the wrong category.

I wanted to bring attention to an open source project I’ve been working on, which is a road surface-type classifier algorithm for OSM. Currently it is trained on USGS NAIP imagery (it’s the only tiled imagery source that allows offline processing), and so it is limited to the United States. For the limited resolution imagery (stuck with ~2m GSD imagery using the tiled source), it does quite well: >90% accuracy at classifying paved vs unpaved roads, and >80% accuracy when splitting between asphalt, concrete, bricks, and unpaved!

The motivation behind the project is that the TIGER import in the US is largely unverified, so I’m hoping to add surface tags to help open source routers make more-informed decisions. I find that the overall lack of knowledge about which roads are paved / unpaved makes OSM-based routing prohibitive in my area (rural CO).

I’m also currently starting work on a JOSM plugin that can leverage this model to allow quick classification of the road surface types over a broad area, allowing the user to quickly accept / reject the algo’s result and correct the label.

I’m curious about overall feedback on and reception of an idea like this, since I’m happy to expand the concept to tagging other things (number of lanes, maybe?). I’m also interested in accessing better imagery sources, if they are available, in order to get higher-resolution data and/or global coverage.

I’m also interested in any feedback regarding how this project can best support its goal for OSM: adding road surface tags efficiently. Currently I’m working on a JOSM plugin, but I’m curious whether there are any other ideas.

Project link is here:
https://github.com/jdalrym2/road_surface_classifier

Thanks all!

Jon

I’m a little skeptical since I’m of the opinion that no tag is better than the wrong tag. If I’m understanding correctly, 90% accuracy means 10% wrong. I’m very opposed to changing a current tag based on imagery without concurrence from the previous mapper.
Sorry to be so negative, it sounds like a cool project, but for this type of tag I think no tag is better than an incorrect tag.

I wholeheartedly agree! It’s a very valid concern regarding machine learning in general: there’s no way to guarantee 100% accuracy. To that end, I’d like to elaborate on how I think this project could fit in.

  • Yep 90% right means 10% wrong: as a result I do not recommend using this to adjust existing surface tags. This project is more targeted toward adding new surface tags in places mappers won’t be likely to be working (rural US where we still have the TIGER imports).
  • I’m hoping to leverage this project in a way to assist mappers, rather than automatically make decisions. The goal of my JOSM plugin idea would be to use the project to infer surface tags, and then the mapper can accept / reject based on the (often higher resolution) imagery that they have access to. This allows a semi-automated way to add surface tags over broad areas, while still getting confirmation from a mapper such that we are not compromising the integrity of OSM.

Thanks for reaching out!

that is too low accuracy to be imported into OSM

also how was it verified? Based on manual aerial classification (that also has some error rate) or verified by survey?

is there some way to review the accuracy of the published dataset?

Detecting places where the surface claimed by the model and the surface tagged in OSM differ may be interesting for reviewing OSM data
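
As a rough illustration of that last point, here is a minimal Python sketch that lists ways where an existing OSM surface tag and a model prediction disagree at the paved/unpaved level. The way IDs, both dictionaries, and the grouping of surface values into paved and unpaved are assumptions made up for this example; none of it is output from the project.

```python
# Assumption: a coarse grouping of some common surface=* values.
PAVED = {"paved", "asphalt", "concrete", "bricks", "paving_stones"}
UNPAVED = {"unpaved", "gravel", "dirt", "ground", "compacted"}

# Hypothetical inputs: way ID -> surface value.
osm_tags = {101: "asphalt", 102: "gravel", 103: "concrete", 104: "asphalt"}
model_predictions = {101: "asphalt", 102: "asphalt", 103: "concrete", 105: "unpaved"}


def coarse(surface):
    """Collapse a surface value to 'paved', 'unpaved', or None if unknown."""
    if surface in PAVED:
        return "paved"
    if surface in UNPAVED:
        return "unpaved"
    return None


def find_disagreements(osm, predicted):
    """Return (way_id, osm_surface, predicted_surface) for ways present in
    both inputs whose coarse paved/unpaved class differs."""
    rows = []
    for way_id in sorted(osm.keys() & predicted.keys()):
        tagged = coarse(osm[way_id])
        guess = coarse(predicted[way_id])
        if tagged and guess and tagged != guess:
            rows.append((way_id, osm[way_id], predicted[way_id]))
    return rows


disagreements = find_disagreements(osm_tags, model_predictions)
for way_id, tagged, predicted in disagreements:
    print(f"way {way_id}: OSM says {tagged}, model says {predicted}")
```

A list like this says nothing about which side is wrong; it only narrows down where a human reviewer might want to look.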

This is outstandingly good. Thank you for this.

Have you considered publishing a data dump of the output - perhaps just a CSV of way_id,surface? That way, routers could get up and running with the project right away.

Also, it may be useful to detect cases where the road surface does not match its class, which would make the output more valuable for manual verification and mapping (and may reveal badly classified roads).
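
To make the two suggestions above concrete, here is a small Python sketch that reads a CSV in the proposed way_id,surface layout and flags ways whose predicted surface looks odd for their highway class. The sample rows, the HIGHWAY_CLASS lookup, and the choice of which classes count as "usually paved" are all assumptions invented for illustration; the project has not published a file in this format as far as this thread shows.

```python
import csv
import io

# Made-up sample in the suggested way_id,surface layout.
SAMPLE_CSV = """way_id,surface
1001,asphalt
1002,unpaved
1003,unpaved
"""

# Hypothetical highway=* values for the same ways (these would come from OSM).
HIGHWAY_CLASS = {1001: "residential", 1002: "track", 1003: "primary"}

# Assumption: classes on which an unpaved prediction deserves a second look.
USUALLY_PAVED = {"motorway", "trunk", "primary", "secondary"}


def load_predictions(text):
    """Parse CSV text into a dict of way ID -> predicted surface."""
    reader = csv.DictReader(io.StringIO(text))
    return {int(row["way_id"]): row["surface"] for row in reader}


def suspicious_ways(predictions, highway_class):
    """Return sorted way IDs predicted unpaved on a usually-paved class."""
    return sorted(
        way_id
        for way_id, surface in predictions.items()
        if surface == "unpaved" and highway_class.get(way_id) in USUALLY_PAVED
    )


predictions = load_predictions(SAMPLE_CSV)
flagged = suspicious_ways(predictions, HIGHWAY_CLASS)
print(flagged)
```

A flagged way could mean either a wrong prediction or a wrongly classified road, so it is a prompt for manual review rather than a verdict.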

That is quite literally 90% of roads in the flyover states!

In most cases I would concur that a hit rate of 90% accuracy isn’t good enough for an import, but in the case of rural US highway=residential where there’s currently no surface tag, I would say it’s a massive improvement over the current situation and I’d be minded to support it. The TIGER A41 roads are still a massive problem, and making the situation 90% better would not only make OSM much better in itself, it would help us to leapfrog other mapping providers who also have very poor data for these areas.

OK, I see the point that it could be helpful in cases which are 100% wrong right now.

Maybe https://www.maproulette.org/ would be a good idea?

Currently it is trained on USGS NAIP imagery (it’s the only tiled
imagery source that allows offline processing), and so it is limited to
the United States.

You may be interested to know that Mapbox allows their satellite data to be used for offline machine learning stuff.

https://wiki.openstreetmap.org/wiki/Machine_learning#cite_note-1

I see you provide 20 random classification samples, all of which are correctly classified.
It would be more interesting to have a sample of cases in which the model fails to correctly recognize whether the road is paved or unpaved, so that we can understand when this happens and mappers can more easily recognize mistakes when approving these changes.

does it recognize the case when a road is not visible (hidden by trees, in a tunnel, etc.)?

Hi,

I second @Mateusz_Konieczny’s questions. How did you calculate the accuracy? Did you compare it to self-collected ground truth? Did you compare it with existing surface=* tags in OSM?

I think that 90% accuracy is not good enough for an import (unless you can prove that existing surface=* tags in that area are of lower accuracy). Mapped attributes that could otherwise be left unset give the impression that someone saw the feature on the ground. On the other hand, the highway=residential ways imported from TIGER are not helpful for serious routing applications either.

Because mapped data, in contrast to unmapped data, tends to make people more reluctant to edit OSM (see the consequences of the TIGER import in the US and the number of volunteer mappers per inhabitant compared with various European countries), I am against this import.

However, if a dump of way_id,start_node_id,end_node_id,surface is provided under ODbL [1], routing engines and other data consumers can mix it in if it is good enough for them (e.g. @Richard). Over time, mappers will add more and more surface=* tags and one day filling the remaining gaps with your results might be a good idea.

Best regards

Michael

[1] Any other license would be an ODbL violation, but that is off-topic here.

How did you calculate the accuracy? Did you compare it to
self-collected ground truth? Did you compare it with existing surface=*
tags in OSM?

Y’know, StreetComplete has been available for years, and has helped add a lot of surface tags to OSM. Since it’s aimed at “on the ground” surveying, I presume any surface tags added via it would be trustworthy. Comparing that data to your detected data would (IMO) be good.

Processing OSM historical data is hard. I wrote a tool (amandasaurus/osm-tag-csv-history on GitHub: “Create a CSV file of OSM tag changes”) to convert a history file into a CSV of each tag change. You could probably use it to find “all surface tags added with StreetComplete”.

To echo others: providing a raw dump of CSV would be nice & useful for others to t̶e̶a̶r̶ y̶o̶u̶r̶ h̶a̶r̶d̶e̶a̶r̶n̶e̶d̶ w̶o̶r̶k̶ t̶o̶ s̶h̶r̶e̶d̶s̶ provide constructive feedback. :wink:

The main challenge here is that the areas where this data would be most useful (TIGER deserts) are also likely to be devoid of StreetComplete edits, and may have characteristics unique to them, which makes the data hard to compare.

For example, SC data may be concentrated in cities, where this algorithm may perform better or worse.

Looks useful! I’m in CO as well, and have struggled with surface tags. They are valuable for routing, but surveying is next to impossible in these larger states. I think the sheer size of the open spaces in the western US is hard for people to grasp if they haven’t lived there…

I do agree that 90% is probably not good enough for automated classification, but since you are proposing a mapper-assistance tool, then I think 90% is plenty good, and for places that are very hard to survey, it might actually be better than some guesses that people have applied.

I do know Boulder County has been pretty rigorously updated with surface tags due to all the cyclists, so it could be an interesting area to examine for accuracy.

Thanks all for the wonderful insight.

how was it verified? Based on manual aerial classification (that also has some error rate) or verified by survey?

The dataset was generated using a randomized subset of OSM ways that did have surface tags, scattered all throughout the US. So it’s assumed that this is a fair mix of aerial classification and survey.

Have you considered publishing a data dump of the output - perhaps just a CSV of way_id,surface ? That way, routers could get up and running with the project right away.

This is a wonderful idea! This is a great way to make this project helpful right away, while being tolerant of error. I’ll plan on this.

The TIGER A41 roads are still a massive problem, and making the situation 90% better would not only make OSM much better in itself, it would help us to leapfrog other mapping providers who also have very poor data for these areas.

Thanks for the compliment! Polishing the TIGER import is the main motivation behind this project. The data in my area is very poor and makes OSM-based routing very prohibitive. This is what I’m hoping to improve.

OK, I see the point that it could be helpful in cases which are 100% wrong right now.

Right: I agree this shouldn’t be used to import directly to OSM, but regardless this is a powerful tool with tolerable accuracy for a lot of use cases. I think publishing inference results for a wide swath of the US, plus providing a JOSM plugin that lets this assist mappers (relying on them for the ultimate ground-truth decision), is the way to go.

You may be interested to know that Mapbox allows their satellite data to be used for offline machine learning stuff.

Thanks for this info! I will reach out to them. Of note is that the NAIP imagery, as-is, being at 2.3 m / px is probably the main limitation for this algo at the moment. But with this imagery I proved that this is a well-formed classification problem and my ML model is a feasible solution. Getting hold of high-res imagery should drastically improve these results.

I see you provide 20 random classification samples, all of which are correctly classified.
It would be more interesting to have a sample of cases in which the model fails to correctly recognize whether the road is paved or unpaved, so that we can understand when this happens and mappers can more easily recognize mistakes when approving these changes.

This is actually completely by coincidence. Thanks for the input. I’ll provide several examples of correctly categorized and incorrectly categorized results instead.

does it recognize the case when a road is not visible (hidden by trees, in a tunnel, etc.)?

No, this is an edge case. I will note that the model does well even when the road is hidden by trees; I’m guessing even getting a few pixels of road through the canopy is enough for it to make a decent decision. But this is only speculation.

I second @Mateusz_Konieczny’s questions. How did you calculate the accuracy? Did you compare it to self-collected ground truth? Did you compare it with existing surface=* tags in OSM?

See above: yes the dataset used for training this model and validating the accuracy came from a random sampling of existing surface=* tags in OSM spread across the entire US. I detail the data generation process carefully on the project site if you’re curious, under the “data prep notebook”.
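
For readers wondering how a paved vs. unpaved figure can be higher than a four-class figure on the same predictions, here is a minimal Python sketch. The label pairs are made up purely for illustration, and this is not the project's evaluation code; it only assumes the four classes named earlier in the thread (asphalt, concrete, bricks, unpaved).

```python
# Made-up (true_label, predicted_label) pairs, for illustration only.
pairs = [
    ("asphalt", "asphalt"),
    ("asphalt", "concrete"),
    ("concrete", "concrete"),
    ("bricks", "asphalt"),
    ("unpaved", "unpaved"),
    ("unpaved", "asphalt"),
    ("asphalt", "asphalt"),
    ("unpaved", "unpaved"),
]


def to_binary(label):
    """Collapse the four classes to paved / unpaved."""
    return "unpaved" if label == "unpaved" else "paved"


def accuracy(label_pairs):
    """Fraction of pairs whose prediction matches the reference label."""
    correct = sum(1 for truth, pred in label_pairs if truth == pred)
    return correct / len(label_pairs)


four_class = accuracy(pairs)
binary = accuracy([(to_binary(t), to_binary(p)) for t, p in pairs])
print(f"four-class: {four_class:.3f}, paved vs unpaved: {binary:.3f}")
```

Confusions between two paved classes (asphalt vs. concrete, say) count as errors in the four-class score but not in the binary one. Since the reference labels here are existing OSM tags, either number measures agreement with those tags, which may themselves contain errors.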

Because mapped data, in contrast to unmapped data, tends to make people more reluctant to edit OSM (see the consequences of the TIGER import in the US and the number of volunteer mappers per inhabitant compared with various European countries), I am against this import.

Don’t worry, I’m not hoping to use this as-is to import mass amounts of data to OSM. I think a JOSM plugin to allow mappers to access this algo + providing a separate dataset dumping way_id, surface, etc., is the best way to leverage this at the moment.

Y’know, StreetComplete has been available for years, and has helped add a lot of surface tags to OSM.

Agreed that StreetComplete is a good way to get decent “on-the-ground” surveyed ground truth for surface tags. But as others hinted I’m a little worried that focusing solely on these would bias the model to cities & populated areas, which is not what it’s trying to target. Also, thanks for the project link, I’ll check it out!

Looks useful! I’m in CO as well, and have struggled with surface tags. They are valuable for routing, but surveying is next to impossible in these larger states. I think the sheer size of the open spaces in the western US is hard for people to grasp if they haven’t lived there…

We’re on the same page, thanks for the compliment!

I do know Boulder County has been pretty rigorously updated with surface tags due to all the cyclists, so it could be an interesting area to examine for accuracy.

This is a great idea for a local case study and to tell a good story. I’ll look into this!


Either way, thanks all: this gives me a lot going forward. I’m hoping to leverage this project currently in 3 ways:

  • Seeking out higher-res imagery via Mapbox and/or others, to get higher accuracy. Improving the model would always be beneficial.
  • Writing a JOSM plugin (already started this) so other mappers can use the algo as they wish to assist in aerial-assisted mapping. This is selfishly for me: I’ll likely use this along with the nicer ESRI imagery to mark paved vs. unpaved for the TIGER deserts I’m most interested in.
  • Publishing a CSV mapping way_ids to inferred surface=* predictions so routers can use it immediately, and so others can get a feel for the quality of the data this provides.

Jon

BTW, Poland has high-quality imagery available without restrictions (geoportal.gov.pl)

This would also make it possible to list cases where the model and OSM claim different things (and if the model was right, then fixing the data and rerunning the build of the pattern matching will improve accuracy!)

This is great. If not an actual surface tag, maybe an alternative tag with a prediction score could still be relevant. I don’t know OSM well enough to know if creating such custom tags is possible.

Regardless, YES to the CSV idea people shared. Put it all in a CSV, put it on the Github and then anyone can plug into it and do cool stuff.

@jdalrym2 : Do you think you’re going to expand the type of surfaces detected? This could be very useful for hiking trails. People with young kids often need to know if they can use their push chair, or bikers need to know if the surface is appropriate for their bike type.

If @Richard is in favour of these approaches then I’m sure they’re a good idea.

A few of my own idle thoughts:

  • Sampling. My only concern is that surface tags in OSM are quite likely to be collected in a non-random way, so a random selection for training may bias the algorithm a bit (although, if anything, I’d expect in a negative fashion).

For instance, many more highway=residential ways have a surface mapped in La Plata County (CO) than in Moffat County. Even in Summit County, surface is much more intensively mapped in obvious tourist areas (e.g., roads in Montezuma lack surface tags, though they actually look to be mainly unpaved; even in residential districts N of I-70 there are a lot of roads lacking surface tags).

I wonder if stratification of the sampling might improve predictions. A simple approach might be to assign counties to buckets based on ratio of roads with/without surface tags.

  • Bing Streetside for additional verification. In some places Bing Streetside imagery is adequate for checking status of roads. For the most part Bing vehicles will have stuck to paved roads, but it’s often possible to view the surface of side roads (not infrequently they have a short paved section just short of the yield sign). This might be a way of crowd-sourcing more data in places with low incidence of surface tags (e.g., through MapRoulette & similar). (Note there is no Bing coverage of Montezuma, CO at all.)

  • Lidar. A very wild idea was whether USGS Lidar might be of any use. It has 1 m horizontal accuracy. I did find a couple of papers that suggest aerial (as opposed to moving-vehicle) Lidar can be used to calculate a rugosity index. The technique which looked easiest to apply unfortunately used differences between Lidar with different horizontal resolutions; most of the rest gather data from a vehicle. USGS doesn’t appear to offer a DSM but, in principle, you might be able to apply the same masking techniques to point clouds.
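
Going back to the sampling point in the list above, the county-bucket idea might be sketched in Python roughly as follows. The county names, counts, way IDs, bucket edges, and sample size are all hypothetical values chosen for the example, and the project's data prep may work quite differently.

```python
import random
from collections import defaultdict

# Hypothetical per-county stats: name -> (ways with a surface tag, total ways).
county_stats = {
    "county_a": (5, 100),
    "county_b": (30, 100),
    "county_c": (80, 100),
}

# Hypothetical IDs of the surface-tagged ways in each county.
ways_by_county = {
    "county_a": list(range(1, 6)),
    "county_b": list(range(10, 40)),
    "county_c": list(range(100, 180)),
}


def bucket_for(tagged, total, edges=(0.1, 0.5)):
    """Assign a county to a bucket by its ratio of tagged to total ways.
    The edge values are arbitrary assumptions."""
    ratio = tagged / total if total else 0.0
    if ratio < edges[0]:
        return "low"
    if ratio < edges[1]:
        return "medium"
    return "high"


def stratified_sample(ways, stats, per_bucket, seed=0):
    """Draw up to per_bucket training ways from each coverage bucket."""
    rng = random.Random(seed)
    pools = defaultdict(list)
    for county, (tagged, total) in stats.items():
        pools[bucket_for(tagged, total)].extend(ways.get(county, []))
    return {
        bucket: rng.sample(pool, min(per_bucket, len(pool)))
        for bucket, pool in pools.items()
    }


sample = stratified_sample(ways_by_county, county_stats, per_bucket=4)
for bucket, way_ids in sorted(sample.items()):
    print(bucket, way_ids)
```

Drawing the same number of ways from each bucket keeps heavily mapped counties from dominating the training set, at the cost of leaning harder on the few tagged ways that exist in sparsely mapped ones.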

I attempt to be terse and helpful as I write this. In going-on-14-years of OSM contributions, I find that “if you don’t know it, don’t tag it” is an excellent place from which to begin. Some of the most successful mapping I’ve both seen and participated in follows the pattern “enter 90%, because we know it, let others take it to 95%, then 98%, calling that number ‘both completion and correctness’ together.” The last two percent, from 98% to 99% and finally 100%, is where “more eyes” (more than one person, certainly, and in many cases, more than two) look at the data. These “eyes” (mappers who “see routes as complete,” who “have a good eye for landuse,” or who have “lots of experience seeing the bigger picture of things”) all tend to see that 98% or 99% zone as “time for me to come here and offer my perspective.” That really is the ‘best’ way OSM gets to 100% complete and correct. (At least in my experience.)

Sure, this evolves, as do we all, as do our perspectives, experiences and ability to both judge quality and become better in our abilities to judge quality.

This sort of “organic growth” (and allowing it to emerge and grow in an open, uncontrolled, but vigilant way) is what encourages it to flourish. Let’s keep that up.
