I haven’t been overly engaged for the last couple of days because I’ve been visiting (of all places) New Orleans. Or Orleans Parish? Or both?
Anyway, I’ve added another column to the table showing how each CCC emerged (since creation or merged): Talk:United States/Boundaries - OpenStreetMap Wiki
One point that keeps coming up is that the CCCs of San Francisco / San Francisco County, Denver / Denver County, etc., have never been separate entities because they were created as one entity (although it could be argued that two entities were consolidated upon creation; I’ll leave it to others more familiar with these topics to say whether that’s a reasonable rebuttal).
This is true of the six Alaska “City and Borough of XXX” entities and also true of New Orleans / Orleans Parish.
So from the table I’m building, I can see a difference between San Francisco and Philadelphia, since Philadelphia / Philadelphia County were once separate entities that merged (1952) versus San Francisco existing only ever as a CCC.
But I’m not seeing any difference between New Orleans / Orleans Parish and San Francisco / San Francisco County, other than that New Orleans and Orleans Parish were not always coextensive. However, they have been since 1870, so I’m not sure that would be relevant for a boundary discussion in 2024.
In other words, if San Francisco and Denver are each a single entity, what would the rationale be for New Orleans to be a double entry? Or if New Orleans is double, why would the others be single?
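To make the comparison concrete, here is a minimal sketch of the "how each CCC emerged" column for just the four examples named above. The class name, field names, and the "created" / "merged" labels are my own placeholders, not the wiki table's actual wording, and the values are only what this post states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CCC:
    name: str
    emerged: str  # "created" = city and county created as one; "merged" = once separate


# Only the rows discussed in this post; values are as stated above, not independently checked.
ROWS = [
    CCC("San Francisco / San Francisco County", "created"),
    CCC("Denver / Denver County", "created"),
    CCC("New Orleans / Orleans Parish", "created"),
    CCC("Philadelphia / Philadelphia County", "merged"),
]


def group_by_emergence(rows):
    """Group CCC names by how they emerged."""
    groups = {}
    for row in rows:
        groups.setdefault(row.emerged, []).append(row.name)
    return groups


groups = group_by_emergence(ROWS)
for emerged, names in groups.items():
    print(emerged, names)
```

On this column alone, New Orleans lands in the same group as San Francisco and Denver, which is exactly why I can't see what separates them.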
@Minh_Nguyen’s position, “strongly held”, is that New Orleans and Orleans Parish are two separate entities.
@ZeLonewolf’s position, also seemingly strongly held, is that San Francisco City and County is one entity.
But what’s the difference? Is it really that over 150 years ago they were not coextensive? Or is it really that if you lived there then you’d just know?
In defense of usability for the naïve user (like myself)
I very much like the implicit criterion @stevea mentions:
Imagine “somebody not from around here” trying to figure this out.
In other words, how does a naïve user (like myself) walk from the most likely sources a naïve user would start with and make their way to and through OSM data, such as Wikipedia → Wikidata → OSM, with full traceability as to what is happening to each entity along the way so that they are not surprised when San Francisco County and Denver County do not have entries in OSM?
If the solution isn’t to add the boundaries to OSM, then would the most helpful solution for future naïve users be to start further upstream, perhaps at Wikidata?
Should Denver County (Q15906757), Broomfield County (Q16088503), Nantucket County (Q2991355), and San Francisco County (Q13188841) (and possibly Orleans Parish (Q486231) based on the above) be merged with their city entry in Wikidata?
Should those Qids only be an “instance of” (P31) a “Wikimedia redirect” (Q21528878) instead of a county of California (Q13212489), a county of Colorado (Q13410403), a county of Massachusetts (Q13410485), or a parish of Louisiana (Q13410524)?
Assuming that doesn’t disrupt data consumers of Wikidata, this would at least not lead the naïve user to expect those boundaries in OSM if they are taking that particular naïve path through Wikidata.
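For anyone who wants to check what those items currently say, here is a rough sketch that only builds a SPARQL query string for the Wikidata Query Service; it makes no network call. The Qids are the ones listed above. P31 is "instance of", and P402 is Wikidata's "OpenStreetMap relation ID" property. The function name is my own, and I have not verified whether any of these items actually carry a P402 value.

```python
# Qids copied from this post: Denver County, Broomfield County, Nantucket County,
# San Francisco County, Orleans Parish.
COUNTY_QIDS = ["Q15906757", "Q16088503", "Q2991355", "Q13188841", "Q486231"]


def build_query(qids):
    """Return a SPARQL query listing each item's P31 class and, if present, its P402 value."""
    values = " ".join(f"wd:{qid}" for qid in qids)
    return (
        "SELECT ?item ?class ?osmRelation WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        "  ?item wdt:P31 ?class .\n"
        "  OPTIONAL { ?item wdt:P402 ?osmRelation . }\n"
        "}"
    )


query = build_query(COUNTY_QIDS)
print(query)
```

Pasting the printed query into the Wikidata Query Service should show, per item, which class it is an instance of and whether it links to an OSM relation at all. An empty `?osmRelation` would be the point where the naïve path from Wikidata to OSM breaks.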
But Wikidata is only one naïve path.
Imagine a naïve user only interested in counties who starts with census data and sees San Francisco County (Explore Census Data) and Denver County (Explore Census Data), but cannot find those in OSM and is not aware that each is coextensive with its city and was created as one consolidated city and county.
Or a list of counties in California from Zillow (https://www.zillow.com/browse/homes/ca/), or Ballotpedia (Counties in California - Ballotpedia), or UC Berkeley (Table 1: List of Counties in California & Zoning Percentage | Othering & Belonging Institute), or even the government of California itself (https://notary.cdn.sos.ca.gov/forms/notary-county-codes.pdf), or any other place a naïve (but reasonable) user might start.
With a little research they can probably find out those are CCCs, but I’m not sure the naïve user wouldn’t be even more confused when they see some CCCs have both county and city in OSM but some do not, with no obvious rhyme or reason to it.
Of course there are only 44 of them, and with a few days of study they can figure out that whether the county and city of a CCC are coextensive is an important clue as to whether OSM lists them as two separate entities or one.
But that doesn’t quite explain all the discrepancies, so with some additional research they might realize that whether the city and county were consolidated simultaneously with their creation is another important clue as to whether OSM has two separate boundaries.
And with some more work they might compare the official names of the 44 CCCs to realize that is yet another clue as to whether OSM has two separate boundaries, along with examining the signs entering the city, and the organization of the sheriff and police departments, and the …, etc., etc., etc.
But at what point can the naïve user just get back to their actual application, which doesn’t overly concern itself with whether San Francisco County is a separate entity from San Francisco as much as it just needs a list of counties and cities in California, consolidated or not?
I’m not suggesting a specific solution, but whatever it is, the end result should be that it is relatively easy for the naïve user to travel a reasonable path from the most likely sources a naïve user would start with to OSM. Otherwise, OSM becomes too difficult to use unless that naïve user invests the time and effort to become an expert in details that would elevate him from being a naïve user.
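As an illustration of the workaround such a user ends up writing today, here is a minimal sketch of a hand-maintained alias table. Everything in it is hypothetical: the dictionary, the function name, and the assumption that the consolidated entity would be found under the plain city name.

```python
# Hypothetical alias table: county-style names from census or state lists mapped to
# the single name the user would then search for. The target names are assumptions.
CONSOLIDATED_ALIASES = {
    "San Francisco County": "San Francisco",
    "Denver County": "Denver",
}


def resolve(name, aliases=CONSOLIDATED_ALIASES):
    """Return the name to look up, falling back to the input when no alias exists."""
    return aliases.get(name, name)


for county in ["San Francisco County", "Denver County", "Alameda County"]:
    print(county, "->", resolve(county))
```

The code is trivial. What costs the naïve user time is finding out that the table is needed and which entries belong in it.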
But the naïve user loves being naïve. He wants to be naïve. He needs to be naïve – not for the sake of being naïve, but so he can focus his efforts on his true passion, his application!
I don’t think anything proposed here would necessarily make the OSM data “unusable”, as long as the naïve user could easily figure out why San Francisco County and Denver County are present in Wikidata and other sources but not OSM.
However, I think the defense of accuracy at the expense of usability is not as straightforward as it’s sometimes presented.
No doubt accuracy is important and should be defended, but accuracy isn’t categorically better than usability.
At the extreme, if it’s accurate but unusable, what’s the point?