How to identify whether a (lat,lng) is within a National Park, body of water, Theme Park, etc

There is a well known XKCD comic that describes building a look-up API to determine whether a (lat, lng) point is within a National Park. Nominatim reverse lookup helps get administrative boundaries, but does not provide additional layers such as National Parks, bodies of water (say you are out on a boat), Theme Parks, and so on. I was unable to find a solution to this problem in Google Search results, despite this being a very useful and very well known problem category. Could someone from the community help me set up an OSMPythonTools or OverPassAPI call to perform these equivalent queries?

Problem: use OSM to determine whether a (lat,lng) point is inside a National Park, Theme Park, etc.


Thank you!

Cross-post to Stackoverflow:

According to the Overpass QL docs:

The standalone query is_in returns the areas and closed ways that cover

  • the given coordinates (when specified) or
  • one or more nodes from the input set (when no coordinates are specified).

There is usage information in the linked page.
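For anyone who wants to try this from Python, here is a minimal sketch using only the standard library. The function names (build_is_in_query, run_overpass) are my own, not part of any library, and the endpoint is the public interpreter URL that appears in the links later in this thread. I am assuming a plain `out;` is enough for a first look at the result.

```python
import json
import urllib.parse
import urllib.request

# Public Overpass interpreter; replace with your own instance if you have one.
OVERPASS_URL = "http://overpass-api.de/api/interpreter"


def build_is_in_query(lat, lon):
    """Return an Overpass QL query for everything that encloses (lat, lon)."""
    # Overpass expects latitude first, then longitude.
    return f"[out:json];is_in({lat},{lon});out;"


def run_overpass(query, url=OVERPASS_URL, timeout=60):
    """POST the query to the interpreter and return the decoded JSON reply."""
    body = urllib.parse.urlencode({"data": query}).encode("utf-8")
    with urllib.request.urlopen(url, data=body, timeout=timeout) as resp:
        return json.load(resp)
```

Calling run_overpass(build_is_in_query(lat, lon)) should return a dict whose "elements" list holds the enclosing areas. Please be gentle with the public server if you loop over many points.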

This answer is incomplete. How does one first query for all such layers, i.e., National Parks, buildings, etc, within a GeoJson or bounding box; then, the is_in function can be applied as a map (the parallel operation) across the array of polygons in the layer. Do you have a complete answer?

This ought to be about four lines of Overpass API code, very likely in one contained query.
The query presumably is so common that the tutorials would include it; yet they seem not to.

Thank you.

Not so. The is_in query returns all polygons where the point is in the polygon. The second part of the linked item states

" is_in cannot be directly used with any of the Overpass QL filters. For filtering the is_in result, a further query is needed (see below): the [square brackets] shown below, indicate optional parts and are not part of the syntax to type."

A concrete example is provided at the end of the section using [admin_level=2].
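To make the "further query" part concrete, here is a small sketch of a helper that appends a filter to an is_in statement. The helper name and its tag_filter parameter are hypothetical; the `area._[...]` pattern is the one the documentation describes for narrowing an is_in result.

```python
def build_filtered_is_in_query(lat, lon, tag_filter):
    """Return an is_in query followed by a tag filter on the resulting areas.

    tag_filter is an Overpass tag filter without the surrounding brackets,
    for example "admin_level=2" or "boundary=national_park".
    """
    # is_in writes its result into the default set "_";
    # "area._[...]" re-reads that set and keeps only the matching areas.
    return f"[out:json];is_in({lat},{lon});area._[{tag_filter}];out;"
```

For example, build_filtered_is_in_query(lat, lon, "admin_level=2") asks only for the country-level area around the point.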

This Overpass query shows the output for a single place in Ireland (Clonfert) in a CSV format for convenience. Ireland is rich in a whole range of polygons, mainly concerned with boundaries, so I was sure I could show a range of different types of objects.


I will read the documentation. How can your assertion be true? There are millions of polygons across hundreds of thousands of layers for which the (lat,lng) point is_in. There must be a mechanism for specifying, and first gathering, which layers the function is applied to. Happy to admit my error after reading the documentation if I am incorrect. Thank you. PSA: I wish these forums and technical experiences were more social: all of these will be replaced by AI in the next few years and so nobody should think posting technical questions is not a pro-social behavior. More news for another forum I suppose. Thank you.

You are misunderstanding something, but I am not quite sure what it is.

The OSM database contains, by planetary standards, a fairly limited amount of data. For any given point on planet Earth, there will be at most, I would guess, a few dozen polygons that cover that point. Finding those polygons is well within the capacity of a somewhat fancy personal computer. There is no need to specify which layers to search (and there are no layers in the OSM database, anyway!). It is quite feasible to apply the question to the entire database.

For example, the query is_in(67.4723,16.4850) will tell you that this point is in the Vásstenjávvre lake, Padjelanta national park, Jokkmokk municipality, Norrbotten county, Sweden, and that’s it.


I often misunderstand something. Overpass and OSM are generic frameworks for geospatial storage and can contain as many layers as the database wants. OSM contains everything from individual mailboxes and fountains to small scale pathways within facilities themselves declared as layers. The coverage is spotty, but intentionally generic. These are called ways when they are one-dimensional geospatial polygons. What am I misunderstanding? (PSA: I did write a somewhat extensive API query system a few years ago in code but with GitHub search limitations what they are I will never be able to find that original code.) Thanks.

there are no layers in a classic GIS sense in OSM data. The query I gave returns absolutely every polygon in the database which contains the place node “Clonfert”


No, OSM does not contain multiple layers in the GIS sense, and cannot, at least not without extensive modification AFAIK.

OSM does have a layer tag (https://wiki.openstreetmap.org/wiki/Key:layer) but that expresses an entirely different concept, that of the vertical relative position of features, for example a bridge is higher level than the river that it crosses.

There are no “declared” layers in OSM there are only nodes, ways, and relations, each of which may have tags. However, if you are using an extract from OSM, such as the shapefiles from GeoFabrik, you will see that the process that created the extract has translated the OSM data into GIS layers (but not thousands of them). However, that is just so the data can be used in a traditional GIS, such as QGIS. A different provider of extracts from OSM may choose to do the translation differently.

Highly unlikely in any spatial database, let alone OSM. I can only think of a couple of dozen polygons that a given point may be in. Nation, state/province, county/parish, city, some political subdivision of a city maybe, one or two land use polygons, a building, a body of water (but probably not at the same time as being in a building), park district, fire protection district, National Forest boundary, military base, watershed, a special taxing district or two. I am sure I have omitted some, but there are that many, and some of those may not be mapped in OSM.


The OSM.org website lets you query a point to find nearby points and polygons the point is within. The latter search does what you want, and is implemented as an Overpass query, similar to the examples you’ve been given.

As an example, see “enclosing features” on Query Features | OpenStreetMap


Thank you pnorman tekim and turepalsson. I went digging through my old software engineering from several years ago, and this is the more complete answer I was searching for:

  1. While OSM does not have the concept of layers, the analogous approach is simply to return the nodes, ways, and relations (nwr) that carry a given tag within a prescribed bounding box.

Here are the National Parks in a large region of the western US.
http://overpass-api.de/api/interpreter?data=[out:json];nwr[boundary=national_park](23.740542,-130.804067,63.743083,-90.800719);out%20geom;

Here are the buildings in and around Bern, Switzerland (narrowed to a random suburb to limit its length for demonstration purposes).
http://overpass-api.de/api/interpreter?data=[out:json];nwr["building"](46.979329,7.5066788,47.001101,7.531752);out geom;

Here are the fire stations in and around Bern, Switzerland.
http://overpass-api.de/api/interpreter?data=[out:json];nwr["amenity"="fire_station"](46.879329,7.366788,47.001101,7.531752);out geom;

We can see that in each of the OverpassAPI calls, the nwr query is applying a selector filter ["amenity"="fire_station"] and subsequently querying only within the bounding box (46.879329,7.366788,47.001101,7.531752). The out geom simply refers to the format for the output.
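A rough Python sketch of how such a bounding-box query string could be assembled follows. The function name and parameter names are my own invention; the only facts relied on are that Overpass orders a bounding box as (south, west, north, east) and that nwr selects nodes, ways and relations.

```python
def build_bbox_query(tag_filter, south, west, north, east):
    """Return an nwr query for one tag filter inside a bounding box.

    tag_filter is given without brackets, e.g. '"amenity"="fire_station"'.
    """
    # Overpass orders the box as (south, west, north, east), so latitude
    # comes first; swapping lat and lon is the most common mistake here.
    if south > north:
        raise ValueError("south latitude must not exceed north latitude")
    return (
        f"[out:json];nwr[{tag_filter}]"
        f"({south},{west},{north},{east});out geom;"
    )
```

Passing the fire station filter and the Bern coordinates from above should reproduce the query part of that link.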

  2. is_in does exactly what was specified by you three. There are a few caveats that I feel are worth mentioning. The is_in key (a tag, which actually comes up first in web searches and produces about a dozen sub-sets) is a mostly deprecated, transitory field of manually added strings associated with a node. This is not what any of us were referring to, nor the solution to this question. The is_in query, by contrast, searches for every way and area that encloses the given lat,lng. (And ways and areas, when categorized by an attribute, are equivalent concepts to layers in other languages.) This is the base of the correct solution.

Here are all the ways and areas that a specific point near the summit of Grand Teton is in, including administrative layers (names of counties), national parks, time zones, and all other layers in the database (one for a nearby point simply specifies ‘rock’ referring to an amateur inclusion of the geologic surface layer as added to the database).
http://overpass-api.de/api/interpreter?data=[out:json];is_in(43.740542,-110.804067);out%20geom;

Here is the same is_in query applying the filter returning results of national parks:
http://overpass-api.de/api/interpreter?data=[out:json];is_in(43.740542,-110.804067);area._[boundary=national_park];out%20geom;
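If you build these links in code rather than by hand, it is probably safer to let the standard library do the escaping (spaces, brackets, semicolons) instead of typing %20 yourself. A small sketch, with helper names that are my own:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

OVERPASS_URL = "http://overpass-api.de/api/interpreter"


def overpass_get_url(query, base=OVERPASS_URL):
    """Return a GET URL with the query safely percent-encoded."""
    return f"{base}?{urlencode({'data': query})}"


def query_from_url(url):
    """Recover the Overpass QL text from a URL built above (for debugging)."""
    return parse_qs(urlsplit(url).query)["data"][0]
```

The encoded link looks noisier than the hand-written ones, but it decodes to the same query on the server side.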

A key insight for wayward searchers who arrive here is that is_in is very fast to filter and I was delighted by this. Because the is_in provides results for every potential boundary the return set can be huge and extensive; filtering after the is_in query is, however, the expected way to perform these OverpassAPI searches.

Here is a point known to be in a building near Bern, Switzerland, and how to identify whether the point is inside a building, as per whether the query returns any results. In this case we discover the building is a barn.
http://overpass-api.de/api/interpreter?data=[out:json];is_in(46.9942212,7.5283544);way._[building];out geom;

And, if you are a wayward traveller searching out for a barn on their quest for a hobo paradise to spend the night you can filter for barns in this way:
http://overpass-api.de/api/interpreter?data=[out:json];is_in(46.9942212,7.5283544);way._[building=barn];out geom;
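Turning "does the query return anything" into a yes/no answer in Python could look like the sketch below. The function names are hypothetical. Note that I have written the query with the `pivot` filter, which is the documented way to get back the ways that areas were derived from; that is a swap on my part, not the exact query in the links above, and buildings mapped as multipolygon relations would need `rel(pivot.a)` as well.

```python
def build_building_query(lat, lon, building_value=None):
    """Return a query for building ways that enclose (lat, lon).

    building_value=None matches any building; "barn" matches only barns.
    """
    tag = "building" if building_value is None else f"building={building_value}"
    # is_in stores the enclosing areas in set "a"; way(pivot.a) fetches the
    # closed ways those areas were generated from, then the tag filter applies.
    return f"[out:json];is_in({lat},{lon})->.a;way(pivot.a)[{tag}];out geom;"


def has_results(response):
    """True if a decoded Overpass JSON reply contains at least one element."""
    return len(response.get("elements", [])) > 0
```

An empty "elements" list then simply means the point is not inside any mapped building of that kind.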

As I said, the filtering is very fast and the expected design of the OverpassAPI. The alternative would be to first query all barns in a region and filter by one specific lat,lng, and could be faster were the API and database constructed differently.
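For completeness, the alternative (download the polygons for a region first, then test the point locally) can be sketched with a plain ray-casting test. This is my own illustration, not something Overpass provides. It assumes the ring is a list of {"lat": ..., "lon": ...} dicts as returned for a way by `out geom`, and it treats coordinates as planar, which is only reasonable for small polygons that do not cross the antimeridian.

```python
def point_in_ring(lat, lon, ring):
    """Ray-casting test: is (lat, lon) inside the closed ring of vertices?"""
    inside = False
    n = len(ring)
    for i in range(n):
        a = ring[i]
        b = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        # by a ray running east-west through the point.
        if (a["lat"] > lat) != (b["lat"] > lat):
            # Longitude at which this edge crosses the point's latitude.
            cross_lon = a["lon"] + (lat - a["lat"]) * (b["lon"] - a["lon"]) / (
                b["lat"] - a["lat"]
            )
            if lon < cross_lon:
                inside = not inside
    return inside
```

You would run this over every polygon from the bounding-box download, so for a single point the is_in query is usually far less work.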

Thank you so much for everyone who chipped in to help, and I hope this solves what future web searchers need when they find this page. Please feel free to reach out to me by email now or in the future for any additional questions, future web searchers.

Hi David. Glad you found our responses helpful. Here are a few minor clarifications.

I am not seeing how the results can be “huge and extensive.” I have tested some latitude and longitude coordinate pairs around Colorado, US, and generally I get about 10 results or less per lat, lon, for example, for 39.1037780,-106.3941188 I get:

  • United States
  • Colorado (State of)
  • San Isabel National Forest
  • Lake County
  • America/Denver Timezone
  • Contiguous United States
  • Four Corners States
  • UTC−07:00 standard time

But yes, if you only want National Parks, then an additional term in the filter is needed as you have specified.

Now, it may look like a lot of data because, for each result, you get all of its tags and, for the ways, the coordinates of all of its vertices (nodes), and in JSON that all takes up a lot of screen space.
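If the JSON is too noisy to read, a few lines of Python can boil it down to one row per result. This is only a sketch: the function name is made up, and it assumes the usual Overpass JSON shape of an "elements" list whose items carry "type", "id" and an optional "tags" dict.

```python
def summarize_elements(response):
    """Return (type, id, name) tuples, ignoring geometry and other tags."""
    rows = []
    for element in response.get("elements", []):
        tags = element.get("tags", {})
        # Not every polygon has a name tag, so fall back to a placeholder.
        rows.append(
            (element.get("type"), element.get("id"), tags.get("name", "(unnamed)"))
        )
    return rows
```

Printing those tuples gives a short list much like the bullet list above, rather than pages of vertices.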

It is deprecated because the information as to what a given object is in, is inherent in any spatial database, including OSM.

Sort of, but not exactly. One might use OSM tags along with the characteristics of the OSM element (node, way, or relation, whether the way closes), to translate the OSM data into various layers in a GIS format, such as a shapefile. However, your translation of OSM data into layers may differ from mine. For example you may include all linear transportation features in a single “layer”, and I may choose to put railways in a separate “layer”. OSM doesn’t specify how this translation is to be done. Also, this applies not just to “ways and areas”, but to nodes and relations as well (in OSM, “areas”/polygons result from closed ways with the appropriate tags, multipolygon relations, or boundary relations; “areas” are not a fundamental OSM element like nodes, ways and relations).

It is quite common in OSM to specify surface=*, including bare_rock. In a sense most things in OSM are “amateur” (most of us are volunteers and therefore “amateurs”), and this is no more or less amateur than most other things in OSM. A visit to a location, along with satellite/aerial imagery is usually sufficient to determine surface=bare_rock.

I am not here to argue over the definition of words, and I choose my words accurately and carefully. That said, if we want to have a discourse on the introduction to OSM we can. I broadly agree with your description, though we need not: I was only doing so for the larger community. One can easily imagine the ‘standard’ OSM being expanded with more ‘attributes’ and thus the OSM equivalent of layers. This is an open source database framework. If you search for any NASA derived data you can imagine thousands of new attributes assigned to each node or way with relative ease. Your description of is_in is not quite correct as per its terminology page, only in spirit, as the is_in attribute has an intention in its modern form. As for those administrative results for the point in Colorado you chose, I noticed that you are the third or fourth member who has thought of OSM as a much more administrative database than the cohorts I had known several years ago, and perhaps that is simply a difference in the long term perspective of the community. Thank you again for your commitment to this question.

People absolutely do extract specific data to create GIS-style layers for GIS tools that expect layers. Many OSM uses (renderers being an obvious example) select certain pieces of OSM data and load them into a database that permits spatial queries. OSM data is just data. You can, subject to the license, do what you like with it.

That said, it isn’t immediately clear if your original question is answered (or actually, what your original question even was)?

Thank you for your input. Any future search engine arrivals who read my contributions and want additional information on how I approached this problem may reach out to me directly through email. At the time of writing my email is available in numerous places.