Counting street name occurrences

What is the easiest way to count the occurrences of street names in the US?

What I want to do is count how often each of the 50 state names is used in road names throughout the country, and compile a list showing the total usage of each.



If you know a bit of Python, try Geodesk: GeoDesk for Python | GeoDesk Documentation

Download an .osm.pbf file of the United States: Geofabrik Download Server
Convert it into a .gol file using GeoDesk’s gol-tool
Open the .gol file and use a GOQL query to select the subset of objects you are interested in (in this case, roads), similar to the safe_for_cycling query here: Sets of Features | GeoDesk Documentation
Then, for each state, count how many roads contain its name
You can use regular expressions to do the string matching: Query Language | GeoDesk Documentation
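
The counting step above could be sketched roughly like this. It is only a minimal sketch: the function name count_state_names is made up for illustration, and the GeoDesk hookup in the trailing comments (the file name usa.gol, the query string, and reading road.name) is an assumption that should be checked against the GeoDesk documentation. Longer names are tried first so that "West Virginia Avenue" is counted for West Virginia and not also for Virginia, and word boundaries keep "Arkansas" from matching Kansas.

```python
import re
from collections import Counter

STATES = [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado",
    "Connecticut", "Delaware", "Florida", "Georgia", "Hawaii", "Idaho",
    "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana",
    "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota",
    "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada",
    "New Hampshire", "New Jersey", "New Mexico", "New York",
    "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon",
    "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota",
    "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington",
    "West Virginia", "Wisconsin", "Wyoming",
]

# One alternation, longest names first, so "West Virginia" wins over "Virginia".
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(s) for s in sorted(STATES, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)
_CANONICAL = {s.lower(): s for s in STATES}


def count_state_names(street_names):
    """Count, per state, how many of the given names contain that state's name."""
    counts = Counter()
    for name in street_names:
        # Use a set so a single road name counts at most once per state.
        found = {_CANONICAL[m.lower()] for m in _PATTERN.findall(name)}
        counts.update(found)
    return counts


# Assumed GeoDesk hookup (not verified here; adjust to the actual API):
#   from geodesk import Features
#   roads = Features("usa.gol")("w[highway][name]")
#   counts = count_state_names(road.name for road in roads)
```

Note that this counts OSM ways, not streets: a street split into several ways is counted several times, and names such as "Washington" will also match streets named after the person or the city.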


Then for each state count how many roads contain its name

Yes, but you should probably combine the OpenStreetMap ways that share the same name and are connected to each other before you do the counting.
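
One way to do that combining, as a rough sketch: treat two ways as part of the same street if they have the same name and share at least one node, then count the connected groups per name with a union-find. The input format (pairs of name and node ID list) and the function name count_connected_streets are assumptions made for illustration; extracting those pairs from the OSM data is left out.

```python
from collections import defaultdict


def count_connected_streets(ways):
    """ways: iterable of (name, node_ids) pairs, one per OSM way.

    Returns {name: number of connected groups of ways with that name}.
    """
    parent = {}

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_way_at = {}  # (name, node_id) -> index of the first way seen there
    names = []
    for idx, (name, node_ids) in enumerate(ways):
        names.append(name)
        parent[idx] = idx
        for node in node_ids:
            key = (name, node)
            if key in first_way_at:
                # Same name and a shared node: merge into one street.
                union(first_way_at[key], idx)
            else:
                first_way_at[key] = idx

    groups = defaultdict(set)
    for idx, name in enumerate(names):
        groups[name].add(find(idx))
    return {name: len(roots) for name, roots in groups.items()}
```

For example, three ways named "Ohio St" of which two share a node would give a count of 2 for that name.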


While that would probably solve 99% of cases, there is still a sizable set of “discontinuous” named streets, i.e. those interrupted by an unnamed bridge, by a pedestrian street, or by an odd-shaped junction.

A more bulletproof method would be to just count occurrences of a street name within a jurisdiction, but it’s tricky to find out what the right jurisdiction level is where street names are guaranteed to be unique. (And I’d bet that it varies from state to state, as most things in the US).

If those odd cases are not of major interest for your application, ignore them, but keep in mind that they exist.
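
The per-jurisdiction approach could look roughly like this, assuming you have already assigned each way to some jurisdiction (how to do that, and which admin level to use, is exactly the open question above). The function name and the record format are made up for illustration.

```python
from collections import Counter


def count_names_per_jurisdiction(records):
    """records: iterable of (jurisdiction, street_name) pairs, one per way.

    Returns a Counter mapping each street name to the number of distinct
    jurisdictions in which it appears.
    """
    # Deduplicate first, so a street split into many ways (or interrupted by
    # a bridge or junction) counts only once within a jurisdiction.
    unique_pairs = {(jurisdiction, name) for jurisdiction, name in records}
    return Counter(name for _, name in unique_pairs)
```

This sidesteps the connectivity problem entirely, at the cost of merging genuinely different streets that happen to share a name inside one jurisdiction.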


Luckily, I wrote a tool to do just that: osm-lump-ways. :) It can connect OSM ways that have highway and name tags, grouped by name and geometric connection.

Here’s how you get the 1000 longest streets in an OSM data file:

osm-lump-ways -i ~/osm-data/baden-wuerttemberg.osm.pbf -f highway -f name -g name -o bawu-names.geojsons --only-longest-n-per-file 1000

Then convert that GeoJSON to CSV:

jq <bawu-names.geojsons "[.properties.tag_group_0, (.properties.length_m|round/10), (.geometry.coordinates[0][0][1]|tostring)+\",\"+(.geometry.coordinates[0][0][0]|tostring), (.geometry.coordinates[-1][-1][1]|tostring)+\",\"+(.geometry.coordinates[-1][-1][0]|tostring)]|@csv" > bawu-streets.csv