This is because Overpass’s area filter is more like an intersection test than a containment test:
Nodes are found if they are properly inside or on the border of the area. Ways are found if at least one point (also points on the segment) is properly inside the area. A way ending on the border and not otherwise crossing the area is not found. Relations are found if one of its members is properly inside the area.
I assume the minor overlap is due to a boundary dispute, which would normally result in one of the boundary's member ways tracing a claim that falls inside the other territory’s claimed area.
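To make the difference concrete, here is a toy sketch in Python. It is not how Overpass is actually implemented: it assumes a rectangular area and only checks vertices, whereas the real filter also considers points along each segment. The names `Rect`, `intersects_style`, and `contains_style` are made up for illustration.

```python
from dataclasses import dataclass

Point = tuple[float, float]
Way = list[Point]


@dataclass(frozen=True)
class Rect:
    """Hypothetical stand-in for an area: an axis-aligned rectangle."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def strictly_inside(self, p: Point) -> bool:
        x, y = p
        return self.min_x < x < self.max_x and self.min_y < y < self.max_y

    def inside_or_on_border(self, p: Point) -> bool:
        x, y = p
        return self.min_x <= x <= self.max_x and self.min_y <= y <= self.max_y


def intersects_style(members: list[Way], area: Rect) -> bool:
    # Roughly how the area filter treats a relation: a single point of a
    # single member lying properly inside the area is enough to match.
    return any(area.strictly_inside(p) for way in members for p in way)


def contains_style(members: list[Way], area: Rect) -> bool:
    # What a containment test would demand instead: every point of every
    # member lies inside the area or on its border.
    return all(area.inside_or_on_border(p) for way in members for p in way)


country = Rect(0, 0, 10, 10)
# A neighbor's boundary: mostly outside, but one member way dips slightly inside.
neighbor = [[(10, 0), (10, 5)], [(10, 5), (9.9, 6), (10, 7)], [(10, 7), (20, 7)]]
# A state that lies entirely within the country.
state = [[(1, 1), (4, 1)], [(4, 1), (4, 4), (1, 1)]]

print(intersects_style(neighbor, country), contains_style(neighbor, country))
print(intersects_style(state, country), contains_style(state, country))
```

In this toy model the neighbor matches the intersection-style test because of the one way that pokes across the border, which is the same kind of stray result the area filter gives you.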
There are a few workarounds.
Each U.S. state and territory is tagged with ISO3166-2=US-*, so you could filter relations by that tag without even consulting the United States relation:
[out:json][timeout:25];
relation["admin_level"="4"]["ISO3166-2"~"^US-"];
out geom;
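If you want to run this query from a script rather than Overpass Turbo, a minimal Python sketch might look like the following. It assumes the public instance at https://overpass-api.de/api/interpreter and uses only the standard library; `build_query` and `fetch_states` are illustrative names, not part of any Overpass client.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Overpass instance; swap in whichever one you use.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_query(iso_prefix: str = "US-", timeout: int = 25) -> str:
    """Build the tag-based query for level 4 boundaries with an ISO 3166-2 prefix."""
    return (
        f"[out:json][timeout:{timeout}];\n"
        f'relation["admin_level"="4"]["ISO3166-2"~"^{iso_prefix}"];\n'
        "out geom;"
    )


def fetch_states(query: str) -> list[dict]:
    """POST the query to the Overpass API and return the matched elements."""
    data = urllib.parse.urlencode({"data": query}).encode("utf-8")
    with urllib.request.urlopen(OVERPASS_URL, data=data, timeout=60) as response:
        return json.load(response)["elements"]


# Printing the query needs no network access; fetch_states() does.
print(build_query())
```

Calling `fetch_states(build_query())` should give you one element per matching relation, with its tags under `element["tags"]`. Because of `out geom`, the response includes full geometry, so expect a large download.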
Each boundary relation has a label member corresponding to its centroid and/or an admin_centre member corresponding to its capital, so you can recurse down to that member, filter it down to the centroids or capitals within the U.S., and recurse back up to the level 4 boundary relation:
[out:json][timeout:25];
area["admin_level"="2"]["name"="United States"]->.usa;
relation["admin_level"="4"](area.usa);
node(r:"label")(area.usa);
relation(bn)["admin_level"="4"];
out geom;
The U.S. boundary relation’s subarea members are the state boundary relations (but for some reason not the territorial boundary relations). In general, you can’t rely on the subarea role, because it’s kind of a hack, but it happens to mostly work in this case. You can directly recurse down to the subarea members:
[out:json][timeout:25];
relation["admin_level"="2"]["name"="United States"];
relation(r:"subarea")["admin_level"="4"];
out geom;
If you’re open to trying a tool very different than Overpass, QLever supports these workarounds, along with a wider variety of spatial relationships like sfContains and sfTouches. But what may be more interesting is its ability to join OSM to Wikidata, which explicitly knows which boundaries represent U.S. states per se, regardless of geometry:
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX ogc: <http://www.opengis.net/rdf#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX osmkey: <https://www.openstreetmap.org/wiki/Key:>
PREFIX osm: <https://www.openstreetmap.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX osm2rdfkey: <https://osm2rdf.cs.uni-freiburg.de/rdf/key#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT * WHERE {
  ?state rdf:type osm:relation .
  ?state osmkey:name ?name .
  ?state osm2rdfkey:wikidata ?item .
  SERVICE <https://qlever.cs.uni-freiburg.de/api/wikidata> {
    ?item wdt:P31 wd:Q35657 .
  }
  ?state geo:hasGeometry/geo:asWKT ?geometry .
}
Another SPARQL engine, Sophox, supports a similar query, though it can only return the centroid of each boundary:
SELECT * WHERE {
  ?state osmt:admin_level "4";
         osmm:type "r";
         osmt:name ?name;
         osmm:loc ?coordinates;
         osmt:wikidata ?item.
  SERVICE <https://query.wikidata.org/sparql> {
    ?item wdt:P31 wd:Q35657.
  }
}