New algorithm for inferring road surface tags

Minh_Nguyen · December 18, 2022, 5:05pm

By the way, your analysis indicates a correlation (based on very few data points), so it’s inaccurate to describe it as causation. One can also observe, with just as many data points, that there were more contributors immediately after the TIGER import than before it. There are many other potential factors at play, which would make a decent separate topic.

The editor-layer-index project has collected a number of high-resolution imagery sources at the state and county level that you could use to supplement NAIP. ELI is what powers the background imagery selector in iD; there’s a similar index for JOSM.

Many of the local sources are leaf-off imagery, which is great for avoiding the edge case mentioned earlier. However, this also means the ground will be damper in some places, which changes the appearance of some surfaces.

Some of the agencies that have published these layers also publish high-resolution DEMs, though few have been added directly to ELI.

Quaternion · December 29, 2022, 4:06pm

@jdalrym2, it would certainly be interesting to see what the results would look like if you a) used a much smaller training set of ways whose surface tag is known to be correct, or b) trained only on ways that have both a surface and a smoothness tag.

I am pretty sure you are currently severely limited by the quality of the training data.
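To make option (b) concrete, here is a rough sketch of that filter. It assumes each way has already been loaded as a dict with a `tags` mapping; the record layout, the sample ways, and the function names are hypothetical and not taken from the actual pipeline.

```python
def has_surface_and_smoothness(tags):
    """True if the way carries non-empty surface and smoothness tags."""
    return bool(tags.get("surface")) and bool(tags.get("smoothness"))


def select_training_ways(ways):
    """Keep only the ways that have both tags set."""
    return [w for w in ways if has_surface_and_smoothness(w["tags"])]


# Hypothetical sample records, for illustration only.
sample_ways = [
    {"id": 1, "tags": {"highway": "residential", "surface": "asphalt", "smoothness": "good"}},
    {"id": 2, "tags": {"highway": "track", "surface": "gravel"}},
    {"id": 3, "tags": {"highway": "service", "smoothness": "bad"}},
    {"id": 4, "tags": {"highway": "track", "surface": "dirt", "smoothness": "very_bad"}},
]

selected_ids = [w["id"] for w in select_training_ways(sample_ways)]
print(selected_ids)
```

Whether a smoothness tag really signals a more carefully surveyed surface tag is the hypothesis being tested here, not something the filter guarantees.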

SomeoneElse · December 29, 2022, 4:33pm

300k of the 45M highways in the USA have a smoothness tag, so while that’s not a big percentage (under 1%), it might be enough actual data, if it’s in the right places.

Quaternion · December 29, 2022, 5:12pm

Still plenty, even if we take into account that about 84% of these ways are paved and such a bias should be avoided in the training set.
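As a back-of-the-envelope check, here is a sketch of what balancing would leave, assuming the 84% applies to the roughly 300k ways with a smoothness tag and that the bias is removed by simply downsampling the paved class. The helper name, the toy data, and the choice of downsampling (rather than, say, class weighting) are assumptions for illustration.

```python
import random
from collections import defaultdict


def downsample_to_balance(samples, label_of, seed=0):
    """Randomly downsample every class to the size of the smallest one."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample in samples:
        by_label[label_of(sample)].append(sample)
    if not by_label:
        return []
    smallest = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    return balanced


# Rough figures from the thread: 300k ways, about 84% of them paved.
total_ways = 300_000
paved = total_ways * 84 // 100
unpaved = total_ways - paved
balanced_size = 2 * min(paved, unpaved)
print(paved, unpaved, balanced_size)

# Toy data with the same 84/16 split (hypothetical records).
toy_ways = [{"id": i, "paved": i % 100 < 84} for i in range(1000)]
balanced = downsample_to_balance(toy_ways, lambda w: w["paved"])
print(len(balanced))
```

Under those assumptions there would be about 48k unpaved ways, so a balanced paved/unpaved set would still hold roughly 96k ways.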