Overpass - count popular street names

kubahahaha · November 21, 2022, 5:20pm

Hello. I’d like to count popular street names in my part of the world.
Street names are unique within each city, but because of the OSM data model a single street can be split across multiple ways within the city boundaries.

I was thinking about the following approach:

1. Define the area to query;
2. Download all admin_level=8 areas (cities) within the area from step 1;
3. For each area from step 2, download all named streets in it.

I’d be happy with output in CSV format with the columns city name and street name, covering all named streets and cities in the area from step 1, as I can count them later in some other tool (the perfect output would have the columns street_name and the number of cities with such a street).

Here is the code I have written so far (overpass-turbo):

[out:csv(city, street)];
({{geocodeArea: powiat lubelski}};) -> .searchArea;

rel[admin_level=8](area.searchArea)->.cities;

foreach.cities (
  map_to_area -> .city;
  way[highway][name](area.city);
  make stat street=set(t["name"]);
  out;
);
out body;

But I am apparently missing something. Can someone help?

IlBano · November 22, 2022, 11:22am

Hi,

You need to loop over the ways you extract inside the foreach.
You can use the for statement, grouping on the name tag, to avoid printing duplicates (assuming that a street is usually made up of several ways):

[out:csv(city, street)];
({{geocodeArea: powiat lubelski}};) -> .searchArea;

rel[admin_level=8](area.searchArea)->.cities;

foreach.cities (
  map_to_area -> .city;
  way[highway][name](area.city);
  for (t["name"]) {
    make stat city=city.set(t["name"]),street=set(t["name"]);
    out;
  }
);
out body;
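
The city/street rows from a query like this can be reduced to the "number of cities per street name" asked for in the question. Below is a minimal Python sketch, assuming the result was saved as tab-separated text with a header line naming the `city` and `street` columns (the Overpass CSV default). The function name `cities_per_street` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter

def cities_per_street(csv_text, delimiter="\t"):
    """Count in how many distinct cities each street name occurs."""
    reader = csv.DictReader(
        io.StringIO(csv_text), delimiter=delimiter, quoting=csv.QUOTE_NONE
    )
    # Deduplicate (city, street) pairs so a repeated row counts only once.
    pairs = {
        (row["city"], row["street"]) for row in reader if row.get("street")
    }
    return Counter(street for _city, street in pairs)

# Illustrative sample only, not real query output.
sample = (
    "city\tstreet\n"
    "Niemce\tLubelska\n"
    "Dys\tLubelska\n"
    "Dys\tLubelska\n"
    "Dys\tLipowa\n"
)

for street, n in cities_per_street(sample).most_common():
    print(street, n)
```

Two cities that share the same name would be merged by this sketch. Adding a column that identifies the city uniquely would avoid that.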

mmd · November 22, 2022, 11:48am

I usually include the number of ways and the total length in meters in the result (I use a similar query for my tests):

[out:csv(city, street, num, len)];
({{geocodeArea: powiat lubelski}};) -> .searchArea;

rel[admin_level=8](area.searchArea)->.cities;

foreach.cities (
  map_to_area -> .city;
  way[highway][name](area.city);
  for (t["name"]) {
    make stat city=city.set(t["name"]),
              street=_.val,
              num=count(ways),
              len=sum(length());
    out;
  }
);

By the way, I also simplified the value for “street” in the make statement a bit…

kubahahaha · November 23, 2022, 3:55pm

Thank you for the help.

Can you please tell me why the first query (@IlBano’s) finds fewer results than the second one?

E.g. the street Lubelska appears 7 times in the first result and 14 times in the second one. The missing rows are:

Niemce	Lubelska	34	3411.326
Jakubowice Konińskie	Lubelska	20	3590.405
Dys	Lubelska	3	725.337
Jakubowice Konińskie-Kolonia	Lubelska	3	2001.214
Zalesie	Lubelska	13	1473.184
Łosień	Lubelska	4	358.65
Krzczonów	Lubelska	4	1022.176
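
A list of missing rows like the one above can be produced mechanically by comparing the two saved results. This is a rough Python sketch, assuming both files are tab-separated with a header line containing at least `city` and `street`. The helper names and the `CityA` rows are hypothetical. Only the Niemce row is taken from this thread.

```python
import csv
import io

def read_pairs(csv_text):
    """Return the set of (city, street) pairs found in tab-separated output."""
    reader = csv.DictReader(
        io.StringIO(csv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return {(row["city"], row["street"]) for row in reader}

def missing_rows(shorter_text, longer_text):
    """List pairs present in the longer result but absent from the shorter one."""
    return sorted(read_pairs(longer_text) - read_pairs(shorter_text))

# Illustrative inputs; extra columns (num, len) are simply ignored.
first = "city\tstreet\nCityA\tLubelska\n"
second = (
    "city\tstreet\tnum\tlen\n"
    "CityA\tLubelska\t5\t100.0\n"
    "Niemce\tLubelska\t34\t3411.326\n"
)

print(missing_rows(first, second))
```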

IlBano · November 23, 2022, 6:00pm

I believe we found a bug…

My query with the default timeout (180s) extracts 1014 rows.
mmd’s query with the same timeout gets 1187 rows.
My query with a timeout of 240s gets 1275 rows.

My idea is that:

mmd’s query obtains more data because it uses _.val, which I suppose is faster than set(t["name"]);
the timeout exception is not handled correctly inside the for loop.

As a workaround, simply set a higher timeout.
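
In Overpass QL the timeout is part of the settings statement on the first line of the query, next to the output format. If the query text is assembled in a script, that line could be generated as in the following Python sketch. The function name `with_settings` is hypothetical, and the query body passed in is assumed not to carry a settings line of its own.

```python
def with_settings(query_body, columns, timeout_s=240):
    """Prepend an Overpass QL settings line with CSV output and a timeout."""
    header = f"[out:csv({', '.join(columns)})][timeout:{timeout_s}];"
    return header + "\n" + query_body

# Placeholder body; the real query statements would go here.
body = "rel[admin_level=8](area.searchArea)->.cities;"
print(with_settings(body, ["city", "street"]))
```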

mmd · November 23, 2022, 6:12pm

The small difference is probably just coincidence, and might be related to variations in server load. The issue with [out:csv] is that it doesn’t print any error messages (!), in particular you would never see a timeout error message.

The OpenStreetMap Wiki page “Overpass API/Overpass QL”, section “Checking CSV output for complete data”, describes possible approaches to detect incomplete data.

As for setting a higher timeout: yes, that’s probably needed in this case. I tried the query on another instance where the timeout issue did not occur.
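
One possible way to notice truncated CSV on the client side is sketched below in Python. It assumes the query has been extended so that its very last statement prints a marker row. The marker value `__END__` and the function name `looks_complete` are hypothetical, and this is not necessarily the approach the wiki section describes. If the server stops early, the marker row never arrives.

```python
def looks_complete(csv_text, marker="__END__"):
    """Return True if the last non-empty line starts with the marker field."""
    lines = [ln for ln in csv_text.splitlines() if ln.strip()]
    return bool(lines) and lines[-1].split("\t")[0] == marker

# Illustrative inputs: the first ends with the marker row, the second does not.
complete = "city\tstreet\nDys\tLubelska\n__END__\t\n"
truncated = "city\tstreet\nDys\tLubelska\n"

print(looks_complete(complete), looks_complete(truncated))
```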

kubahahaha · November 23, 2022, 7:48pm

Thank you again.

“The issue with [out:csv] is that it doesn’t print any error messages”

Wow, that’s strange