Overpass best practice: complex query vs. binary operator

I’m building an app that, while not itself a router, needs to know about potentially pedestrian-routable infrastructure within a user-supplied center and radius. The app currently gets this data from Overpass; the current query is defined here. For now, the app considers “potential pedestrian-routable infrastructure” to be ways with tags like highway=footway, highway=residential, railway=platform, etc., excluding ways with tags like access=private, foot=no, sidewalk=separate, etc. See the query for a full list. The tag list may need adjusting, but which tags should count as “routable” is not the question here.

One way to construct an Overpass query for this is to list every possible combination of allowed/disallowed tags, like this. However, with 18 allowed tags combined with 22 disallowed tags, that query is quite unwieldy. Another method, which is what I currently use, is to write two subqueries, one for the allowed tags and one for the disallowed tags, then subtract the second from the first, like this.
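As a rough illustration of the subtraction method described above, here is a minimal Python sketch that assembles such a query from two tag lists. The function name `build_difference_query`, the set names `.allowed` and `.disallowed`, and the `[out:json]` / `out geom;` choices are assumptions made for the example, not taken from the app's actual query.

```python
def build_difference_query(allowed, disallowed, lat, lon, radius_m):
    """Build an Overpass QL query: (allowed ways) minus (disallowed ways).

    `allowed` and `disallowed` are lists of (key, value) tuples.
    All names and output settings here are illustrative assumptions.
    """
    around = f"(around:{radius_m},{lat},{lon})"
    # One statement per tag; each union is stored in a named set.
    allowed_part = "".join(f'  way["{k}"="{v}"]{around};\n' for k, v in allowed)
    disallowed_part = "".join(f'  way["{k}"="{v}"]{around};\n' for k, v in disallowed)
    return (
        "[out:json];\n"
        f"(\n{allowed_part})->.allowed;\n"
        f"(\n{disallowed_part})->.disallowed;\n"
        # Overpass QL difference: everything in .allowed not in .disallowed.
        "(.allowed; - .disallowed;);\n"
        "out geom;\n"
    )


query = build_difference_query(
    allowed=[("highway", "footway"), ("railway", "platform")],
    disallowed=[("access", "private"), ("foot", "no")],
    lat=51.5, lon=-0.1, radius_m=500,
)
print(query)
```

Note that the second union matches every way carrying a disallowed tag in the area, whether or not it is also in the allowed set.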

I’ve been working on optimizing the query, in particular on finding formulas for upper bounds on [maxsize:...] and [timeout:...] given the radius. I would guess that the “subtraction” method requires the server to load all allowed features, then load all disallowed features, before doing the subtraction. I thought this wouldn’t be an issue, since there would generally be far fewer disallowed objects than allowed ones, but that isn’t always the case. In particular, this parking lot has over 22 thousand ways tagged amenity=parking_space+access=private. They aren’t relevant to pedestrian routing, but they still get caught up in the query because of the access tag.

What do people think the best solution is? Some ideas:

  1. Limit the query’s memory usage so that queries in this particular parking lot don’t work.
  2. Raise the memory limit so that queries in the lot do work, but queries in other areas are slower.
  3. Use the harder-to-read, but more memory-efficient, “combinatorial” query.
  4. Refine the “disallowed” part of the query to only match highway ways, like (way[highway][access=private]; way[highway][foot=no]; way[highway][sidewalk=separate]; ...), at the cost of allowing routing along otherwise disallowed non-highway ways, like railway=platform+access=private.
  5. Same as above, but exclude non-highway tags from the allowed tags, resulting in disallowing routing along e.g. railway=platform.
  6. Use a query service other than Overpass.
  7. Remove the “disallowed” part of the query and filter down on the client side. I’d like to avoid this because I’d like the query to be user-configurable, and because it would require downloading the tags of all objects, while currently I only need the geometry.

Side note: I wish the Overpass API returned the amount of memory and time a query actually needed, so that I could use those numbers to adjust [maxsize:...] and [timeout:...] directly instead of guessing and checking.


There are quite a few questions, so let’s start with a few other ones:

  1. Why are you trying to define maxsize instead of leaving it as default?
  2. Is it important to specify (around:{{maximum_distance}},{{center}}) if you have already limited the bbox globally?

You could write something like:

...
(
    way[highway=footway];
    way[highway=living_street];
    way[highway=path];
    way[highway=pedestrian];
    way[highway=platform];
    way[highway=primary];
    way[highway=primary_link];
    way[highway=residential];
    way[highway=secondary];
    way[highway=secondary_link];
    way[highway=service];
    way[highway=steps];
    way[highway=tertiary];
    way[highway=tertiary_link];
    way[highway=track];
    way[highway=unclassified];
    way[leisure=track];
    way[man_made=pier];
    way[railway=platform];
)->.w;

way.w
[access!=agricultural]
[access!=customers]
[access!=delivery]
[access!=destination]
[access!=discouraged]
[access!=forestry]
[access!=no]
[access!=permit]
[access!=private]
[access!=unknown]
[access!=use_sidepath]
[foot!=agricultural]
[foot!=customers]
[foot!=delivery]
[foot!=destination]
[foot!=discouraged]
[foot!=forestry]
[foot!=no]
[foot!=permit]
[foot!=private]
[foot!=unknown]
[foot!=use_sidepath]
[sidewalk!=separate];
...

Perhaps Overpass can optimize this query.
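Since the original poster wants the tag lists to be user-configurable, the filter-chain form above could be generated from two lists. The following is a minimal Python sketch under that assumption; `build_filter_query` and the `[out:json]` / `out geom;` settings are hypothetical choices for the example, while the `.w` set name follows the query above.

```python
def build_filter_query(allowed, disallowed, lat, lon, radius_m):
    """Build a query that collects allowed ways, then filters them in place.

    `allowed` and `disallowed` are lists of (key, value) tuples.
    Function name and output settings are illustrative assumptions.
    """
    around = f"(around:{radius_m},{lat},{lon})"
    # Union of all allowed tags, stored in the named set .w
    union = "".join(f'  way["{k}"="{v}"]{around};\n' for k, v in allowed)
    # One negated tag filter per disallowed (key, value) pair,
    # all chained onto a single statement reading from .w
    negations = "".join(f'\n  ["{k}"!="{v}"]' for k, v in disallowed)
    return (
        "[out:json];\n"
        f"(\n{union})->.w;\n"
        f"way.w{negations};\n"
        "out geom;\n"
    )


print(build_filter_query(
    allowed=[("highway", "footway"), ("highway", "residential")],
    disallowed=[("access", "private"), ("foot", "no")],
    lat=51.5, lon=-0.1, radius_m=500,
))
```

Because the negated filters only ever look at ways already in .w, objects such as private parking spaces that match none of the allowed tags are never loaded by this part of the query.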

For optimization: the smaller the maxsize and timeout, the higher the priority the Overpass server gives the request, and the sooner it sends a response.

It’s not strictly necessary, but since my app doesn’t need any data outside the radius, it would save client-side memory to not download it.

That works great! I was looking for methods to remove things from the current set before, and the first suggestion I got was the (q1) - (q2) method. I didn’t know you could filter down the existing set like this. Thank you!

Have you considered working from planet dumps/extracts, prefiltering and preprocessing the data exactly according to your needs, and providing your own query service for the game? See also Overpass API performance issues - #74 by lonvia


I figured it wouldn’t be necessary to host my own backend, since I expect very low volume. Overpass results are cached on the client side, so a user should only need to make one query every few days, and I don’t think there are more than 10 total users currently. I also allow the user to change which Overpass server to use (it defaults to private.coffee), so they have the option of choosing a regional mirror or hosting their own.

edit: Also because I want users to be able to customize the query, so pre-filtering on the backend isn’t an option

One query every few days is indeed okay and not worth the effort of a separate backend.

Have you tried this format:
way[highway~"^(footway|living_street|path|etc…)$"];
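One way to produce regex filters of this shape from a configurable tag list is to group values by key. The Python sketch below is only an illustration: `build_regex_filters` is a hypothetical helper, and it assumes tag keys and values consist solely of letters, digits, underscores, and colons, so that no regex or string escaping is needed.

```python
import re
from collections import defaultdict

# Assumption for this sketch: keys and values are plain identifiers,
# so they can be placed in a regex and a quoted string unescaped.
SAFE = re.compile(r"[A-Za-z0-9_:]+")


def build_regex_filters(tags):
    """Collapse (key, value) pairs into one regex statement per key."""
    by_key = defaultdict(list)
    for key, value in tags:
        if not (SAFE.fullmatch(key) and SAFE.fullmatch(value)):
            raise ValueError(f"unsupported characters in {key}={value}")
        by_key[key].append(value)
    statements = []
    for key, values in by_key.items():
        alternation = "|".join(values)
        # ^...$ anchors make the regex match whole values only.
        statements.append(f'way["{key}"~"^({alternation})$"];')
    return statements


for line in build_regex_filters(
    [("highway", "footway"), ("highway", "path"), ("railway", "platform")]
):
    print(line)
# way["highway"~"^(footway|path)$"];
# way["railway"~"^(platform)$"];
```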

Is a circular search area essential? Rectangular bboxes are more efficient.
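If a rectangle is acceptable, the box can be derived from the same center and radius. The Python sketch below uses a spherical-Earth approximation that should be adequate for small radii away from the poles and the antimeridian; `bbox_around` and `EARTH_RADIUS_M` are names made up for this example. The result is ordered south, west, north, east, which is the order Overpass expects for bounding boxes.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation


def bbox_around(lat, lon, radius_m):
    """Return (south, west, north, east) approximately enclosing the circle.

    Assumes a small radius and a center away from the poles and the
    antimeridian; no wrapping or clamping is done in this sketch.
    """
    # Degrees of latitude covered by radius_m along a meridian.
    dlat = math.degrees(radius_m / EARTH_RADIUS_M)
    # Longitude degrees shrink with cos(latitude), so widen accordingly.
    dlon = math.degrees(
        radius_m / (EARTH_RADIUS_M * math.cos(math.radians(lat)))
    )
    return (lat - dlat, lon - dlon, lat + dlat, lon + dlon)


south, west, north, east = bbox_around(51.5, -0.1, 500)
print(f"[bbox:{south:.6f},{west:.6f},{north:.6f},{east:.6f}]")
```

The corners of the box lie outside the circle, so the client would still need to discard those results if it strictly wants nothing beyond the radius.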

There are so many foot!=*. Would [foot~"^(yes|designated)$"] cover all bases? Similar for ‘access’.