US States Boundaries Only, despite Overlap with Canadian provinces

I have the query below, which grabs admin_level 4 boundaries within the US, but British Columbia and Yukon apparently overlap the US border slightly somewhere, so they get selected too.

Thanks for any suggestions.

[out:json];
area
  ['admin_level'='2']
['name'='United States'];
(relation['admin_level'='4'](area););
out geom;

This is because Overpass’s area filter is more like an intersection test than a containment test:

Nodes are found if they are properly inside or on the border of the area. Ways are found if at least one point (also points on the segment) is properly inside the area. A way ending on the border and not otherwise crossing the area is not found. Relations are found if one of its members is properly inside the area.
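To make the difference concrete, here is a toy Python sketch. It is not how Overpass is implemented: it treats the area as an axis-aligned box and a relation as a plain list of member points, then compares an "any member properly inside" test with an "all members inside" test. The box, the coordinates, and the function names are all made up for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def properly_inside(p: Point, box: Box) -> bool:
    """True if the point lies strictly inside the box (not on its border)."""
    x, y = p
    min_x, min_y, max_x, max_y = box
    return min_x < x < max_x and min_y < y < max_y


def inside_or_on_border(p: Point, box: Box) -> bool:
    """True if the point lies inside the box or exactly on its border."""
    x, y = p
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y


def intersection_style_match(members: List[Point], box: Box) -> bool:
    """Toy version of the quoted rule: one member properly inside is enough."""
    return any(properly_inside(p, box) for p in members)


def containment_style_match(members: List[Point], box: Box) -> bool:
    """What the question actually wants: every member inside the area."""
    return all(inside_or_on_border(p, box) for p in members)


# Hypothetical geometry: a "country" box and three "relations".
country = (0.0, 0.0, 10.0, 10.0)
state = [(2.0, 2.0), (8.0, 2.0), (8.0, 8.0), (2.0, 8.0)]
neighbour = [(9.5, 9.9), (9.5, 15.0), (15.0, 15.0), (15.0, 9.9)]  # one corner pokes in
touching = [(10.0, 5.0), (15.0, 5.0)]  # only touches the border

state_hit = intersection_style_match(state, country)
neighbour_hit = intersection_style_match(neighbour, country)
neighbour_contained = containment_style_match(neighbour, country)
touching_hit = intersection_style_match(touching, country)
```

In this toy model the neighbour is matched by the intersection-style test even though it is not contained, which mirrors how a province with one stray member inside the U.S. area ends up in the result. The relation that merely touches the border is not matched.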

I assume the minor overlap is due to a boundary dispute. In that case, one of the member ways of a boundary relation typically represents a claim that lies inside the territory claimed by the other side.

There are a few workarounds.

Each U.S. state and territory is tagged with ISO3166-2=US-*, so you could filter relations by that tag without even consulting the United States relation:

[out:json][timeout:25];
relation["admin_level"="4"]["ISO3166-2"~"^US-"];
out geom;
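If you would rather keep your original query and clean up the result afterwards, the same tag test can be applied client-side. A minimal Python sketch, assuming the response has the usual Overpass JSON shape (a top-level elements list whose entries carry type and tags). The sample data, including the element ids, and the function name us_state_relations are made up for illustration:

```python
from typing import Any, Dict, List


def us_state_relations(overpass_json: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Keep only admin_level=4 relations whose ISO3166-2 tag starts with 'US-'."""
    kept = []
    for element in overpass_json.get("elements", []):
        tags = element.get("tags", {})
        if (
            element.get("type") == "relation"
            and tags.get("admin_level") == "4"
            and tags.get("ISO3166-2", "").startswith("US-")
        ):
            kept.append(element)
    return kept


# Hypothetical, heavily trimmed response; the ids are placeholders, not real OSM ids.
sample = {
    "elements": [
        {"type": "relation", "id": 1,
         "tags": {"admin_level": "4", "name": "Washington", "ISO3166-2": "US-WA"}},
        {"type": "relation", "id": 2,
         "tags": {"admin_level": "4", "name": "British Columbia", "ISO3166-2": "CA-BC"}},
        {"type": "relation", "id": 3,
         "tags": {"admin_level": "4", "name": "Yukon", "ISO3166-2": "CA-YT"}},
        {"type": "relation", "id": 4,
         "tags": {"admin_level": "4", "name": "Alaska", "ISO3166-2": "US-AK"}},
    ]
}

names = [el["tags"]["name"] for el in us_state_relations(sample)]
```

Filtering in the query itself is still preferable, since it avoids downloading the Canadian geometries in the first place.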

Each boundary relation has a label member corresponding to its centroid and/or an admin_centre member corresponding to its capital, so you can recurse down to that member, filter it down to the centroids or capitals within the U.S., and recurse back up to the level 4 boundary relation. The query below goes through the label role; to go through capitals instead, use the admin_centre role in the node statement:

[out:json][timeout:25];
area["admin_level"="2"]["name"="United States"]->.usa;
relation["admin_level"="4"](area.usa);
node(r:"label")(area.usa);
relation(bn)["admin_level"="4"];
out geom;

The U.S. boundary relation’s subarea members are the state boundary relations (but for some reason not the territorial boundary relations). In general, you can’t rely on the subarea role, because it’s kind of a hack, but it happens to mostly work in this case. You can directly recurse down to the subarea members:

[out:json][timeout:25];
relation["admin_level"="2"]["name"="United States"];
relation(r:"subarea")["admin_level"="4"];
out geom;

If you’re open to trying a tool very different from Overpass, QLever supports these workarounds, along with a wider variety of spatial relationships like sfContains and sfTouches. But what may be more interesting is its ability to join OSM to Wikidata, which explicitly knows which boundaries represent U.S. states per se, regardless of geometry:

PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX ogc: <http://www.opengis.net/rdf#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX osmkey: <https://www.openstreetmap.org/wiki/Key:>
PREFIX osm: <https://www.openstreetmap.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX osm2rdfkey: <https://osm2rdf.cs.uni-freiburg.de/rdf/key#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT * WHERE {
  ?state rdf:type osm:relation .
  ?state osmkey:name ?name .
  ?state osm2rdfkey:wikidata ?item .
  SERVICE <https://qlever.cs.uni-freiburg.de/api/wikidata> {
    ?item wdt:P31 wd:Q35657 .
  }
  ?state geo:hasGeometry/geo:asWKT ?geometry .
}

Another SPARQL engine, Sophox, supports a similar query, though it can only return the centroid of each boundary:

SELECT * WHERE {
  ?state osmt:admin_level "4";
         osmm:type "r";
         osmt:name ?name;
         osmm:loc ?coordinates;
         osmt:wikidata ?item.
  SERVICE <https://query.wikidata.org/sparql> {
    ?item wdt:P31 wd:Q35657.
  }
}