Proposed double-entry of Consolidated City-Counties

Incidentally, this article listed New Orleans under the “Consolidated since their creation” heading, alongside San Francisco. It also claimed, without evidence, that the city has always served as the parish government since establishment. This is incorrect. In fact, unlike any other parish, it had not one but two parish governments prior to consolidation. I’ve corrected the article, moving New Orleans down to the “Merged” section and adding a fuller explanation with sources. There was a suggestion to correct this and other entries back in 2013, but apparently no one noticed it.

Apologies if you were taking Wikipedia at its word and I’ve pulled the rug out from under you.

The former was just a way of explaining what you can plainly see on the ground as you enter the city. In OSM, the quintessential mappable boundary is one that is marked on the ground. If we didn’t place such an importance on such real-world artifacts, the project wouldn’t have even allowed us to map boundaries in the first place.

The latter is the very raison d’être of OSM: local knowledge. Yes, sometimes we have to temper our local knowledge of obscure quirks for the public good. But the public is also served by us bringing something unique to the table. OSM is well-known for its detail-oriented coverage of geography, even sometimes at the cost of uniformity. In the event of a conflict between the Census Bureau and reality, we proudly and unapologetically choose reality. By trotting out a variety of mass-manufactured experiences that put San Francisco in San Francisco County, you’re reminding me of how unfortunate it was when OSM used to cause lots more data consumers to replicate that bug.

And yes, it is a bug. The Supreme Court has affirmed on multiple occasions that the City and County of San Francisco is a direct subdivision of the state, exclusive of any county. The official status is not charter city, not charter county, but rather charter city and county. It is as distinct from a city and from a county as peanut butter and jelly is yummier than peanut butter or jelly alone. How else would we get seemingly redundant phrases like this all over the Constitution and California Codes?

…any town, city, county, city and county, municipal corporations, private persons, partnerships or corporations…

In other words, the sign on the Golden Gate Bridge isn’t merely combining two names to save space; it’s stating the name of a single combined jurisdiction. This is not the case in Orleans Parish, despite it having shared a government with New Orleans for so many years.

This seems to be optimizing for naïve users who aren’t using Wikidata effectively. As you point out, each county is an instance of a statewide subclass, such as county of California (Q13212489), reflecting the fact that counties are a matter of state law. For better or worse, it’s normal and necessary to query recursively for subclasses of whatever class you’re looking for:

?county wdt:P31/wdt:P279* wd:Q47168.

On the other hand, if you only query Wikidata for direct instances of county of the United States (Q47168), you’ll get only 36 results, all of them defunct – including St. Vrain’s County (Q2323678) and the other 11 counties of the extralegal Jefferson Territory, which not even OpenHistoricalMap currently covers. To exclude the historic counties, you need to filter out anything that’s an instance of former administrative territorial entity (Q19953632) or that has a dissolved, abolished or demolished date (P576) statement. This too is normal. In fact, every data consumer that relies on Wikidata statements ends up having to do something similar to weed out surprises.

As far as I know, each Wikidata item representing a consolidated city-county has an instance of (P31) consolidated city–county (Q3301053) statement. In turn, the consolidated city–county (Q3301053) item is a subclass of county of the United States (Q47168). Therefore, if the user queries recursively for U.S. counties and subclasses thereof, they will get San Francisco (Q62). That’s fine, except they will also get San Francisco County (Q13188841), an entity that legally does not exist but whose name still pops up in miscellaneous contexts. Maybe the user will notice that the two items are coextensive with (P3403) and said to be the same as (P460) each other and filter out the one they don’t want?

But let’s suppose the user only cares about California, and instead of searching for U.S. counties that are located in the administrative territorial entity (P131) of California (Q99), they search for instances of county of California (Q13212489). They get only San Francisco County, not San Francisco, because the latter isn’t an instance of a California county per se. Yet the former doesn’t have an OpenStreetMap relation ID (P402) to point to. You got me!
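
To make the difference between these query shapes concrete, here is a minimal Python sketch over a toy in-memory graph. The statements are hand-copied from the description above rather than fetched from Wikidata, the 36 defunct counties are left out, and the names `INSTANCE_OF`, `SUBCLASS_OF`, `superclasses` and `instances_of` are invented for illustration.

```python
# Toy model of the "instance of" (P31) and "subclass of" (P279) statements
# described above. Hand-written data, not a live Wikidata query.
INSTANCE_OF = {
    "Q62": {"Q3301053"},         # San Francisco -> consolidated city-county
    "Q13188841": {"Q13212489"},  # San Francisco County -> county of California
}
SUBCLASS_OF = {
    "Q3301053": {"Q47168"},      # consolidated city-county -> county of the United States
    "Q13212489": {"Q47168"},     # county of California -> county of the United States
}


def superclasses(cls):
    """Return cls plus every class reachable via subclass-of (like wdt:P279*)."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def instances_of(target, recursive=True):
    """Items that are instances of target, optionally following subclass-of upward."""
    found = set()
    for item, classes in INSTANCE_OF.items():
        for cls in classes:
            reachable = superclasses(cls) if recursive else {cls}
            if target in reachable:
                found.add(item)
    return found


us_counties_direct = instances_of("Q47168", recursive=False)
us_counties_recursive = instances_of("Q47168")
california_counties = instances_of("Q13212489")
```

In this toy data, the recursive query for Q47168 returns both Q62 and Q13188841, while the query for Q13212489 returns only Q13188841, which is the mismatch described above.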

There are a couple solutions to this problem that automatically get the naïve user the data they need without having to think about it:

Yes, Overpass has a dedicated lrs_in() function for testing whether a value appears in a semicolon-delimited value list. For example, if you merely search for cuisine=french in New Orleans:

nwr["cuisine"="french"](area.searchArea);

you’ll miss Café du Monde, which is tagged cuisine=coffee_shop;french. (No donut?) Fortunately, you will get your beignets if you treat the tag as a list:

nwr["cuisine"](area.searchArea)(if: lrs_in("french", t["cuisine"]));

Granted, this is not very discoverable or memorable, and it might run a bit slower. Most people use regular expressions instead:

nwr["cuisine"~"french"](area.searchArea);
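
For what it’s worth, here is a rough Python sketch of how the three approaches differ on a few tag values. It only approximates Overpass’s behavior: the whitespace stripping in `list_match` is my assumption, the helper names are made up, and `french_fries` is an invented value that is there only to show how an unanchored regular expression can overmatch.

```python
import re

values = [
    "french",
    "coffee_shop;french",  # the Café du Monde tagging mentioned above
    "french_fries",        # invented value, only to show a substring false positive
]


def exact_match(value, wanted):
    """Like ["cuisine"="french"]: the whole value must be equal."""
    return value == wanted


def list_match(value, wanted):
    """Treat the value as a semicolon-delimited list, roughly like lrs_in()."""
    # Stripping whitespace around entries is an assumption of this sketch.
    return wanted in [part.strip() for part in value.split(";")]


def regex_match(value, pattern):
    """Like ["cuisine"~"..."]: the pattern may match anywhere in the value."""
    return re.search(pattern, value) is not None


exact = [v for v in values if exact_match(v, "french")]
as_list = [v for v in values if list_match(v, "french")]
loose_regex = [v for v in values if regex_match(v, r"french")]
anchored_regex = [v for v in values if regex_match(v, r"(^|;)french($|;)")]
```

The plain `french` pattern picks up all three values, whereas anchoring it to the start, the end, or a semicolon behaves like the list test here. I would expect a similarly anchored pattern to work in an Overpass regular expression filter, but I haven’t benchmarked it against lrs_in().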