Proposed double-entry of Consolidated City-Counties

All – I hope all enjoyed their Thanksgiving (where applicable) and their weekend!

@stevea has made some recent inputs on the wiki talk page that I interpret as being in favor of making the exception for the double entries of CCCs (which I don’t personally interpret as a “double entry” as much as two entities incidentally having coextensive boundaries, but I concede there’s plenty of room for viewing it differently than me):

My preference is to say “there are a few dozen ‘double-things’ one tagged 6, one 8.” We’re most of the way there, the way to get there (call it reality-based tagging) is to agree “a set of a few dozen things which we call CCCs are tagged with a double-method that we agree to, with no exceptions.” . . . I’m listening to downstream user concerns and I see no (or few) contradictions, so continue down the crisp path described here. With more discussion, I believe we’ll achieve this consensus. And we will have “rules” for what CCCs are (co-terminous or mostly-so polygons, one tagged 6, one tagged 8), as we do now . . .


Also, I have been focused more on the United States admin level page and less so on the United States/Boundaries page, but happen to notice that the double-entry approach is actually already prescribed for CCCs on the United States/Boundaries page (although it does note some instances are not coded that way):

A consolidated city-county is mapped as two separate, coterminous boundaries, one for the city (admin_level=8) and the other for the county (admin_level=6), though some are only tagged with admin_level=6 perhaps also border_type=county;city.

This has been the guidance for the entire seven+ years the page has existed: 3 July 2017 changes

The deletions of Denver County, Nantucket County, and San Francisco County that occurred roughly three years ago seem to have actually departed from the guidance at that time (at least the guidance on that page). Broomfield County in Colorado was deleted before 2017.

At any rate, I’m highlighting that

  1. the prescription for double entry has been in place for seven years (I assume without contention or even great notice)
  2. at one point all 17 CCCs with coextensive boundaries (aside from Alaska) were mapped that way (the remaining CCCs do not have coextensive boundaries and thus require two entries anyway)
  3. it is the current status for 13 of those 17 CCCs
  4. three of the four that do not currently have double entries were deleted after the guidance was established (i.e., in violation of it)

So if we can simply agree to use the existing guidance on the United States/Boundaries page for the 17 CCCs with coextensive boundaries (plus those in Alaska and perhaps others in the future), then I think it’s just an issue of how to tag them. I would suggest the current practices for tagging county (admin_level=6; border_type=county) and city (admin_level=8; border_type=city|town|…|…) are sufficient, perhaps with a note indicating they are part of a CCC pair or an additional tag (or tags) that would allow easy querying for identification/verification.

I’m strongly opposed to duplicating boundaries for three states:

  • Virginia
  • Maryland
  • Hawaii

Each of these states is divided only to one level of hierarchy. Virginia’s counties and independent cities are collectively space-filling entities that do not overlap. The same is true for Maryland. Therefore, it would be wrong to introduce a new level of hierarchy.

In Hawaii, there are likewise no political entities below the county level. We should not duplicate the Honolulu County boundary to create a fictional entity.

Other states we can have a debate about, but those three states should be a hard no because of the way they’re organized.

For the other locations, I would offer the following considerations:

  • It doesn’t really matter too much how the wiki evolved to where it is and what editing disputes have happened in the past. Picking out statements from the wiki as if they’re some kind of legal foundation won’t get you very far. Anyone can edit the wiki and there’s all sorts of things that are wrong or don’t reflect community consensus. The only thing that matters is correctly documenting what the most current community zeitgeist is.
  • For consolidated municipal / county entities – are these really two separate entities with co-extensive boundaries, or are they a single entity that performs the role of both municipality and county? Are separate municipal offices, organizations, and political or popular identities maintained?
  • How does this impact data consumers that currently expect these cases to be one entity? Will the change cause data consumers currently using the admin_level=6 boundaries to encounter duplication when new, overlapping admin_level=8 boundaries are introduced to the map? Have you explored the impact on known data consumers?
  • Does the duplication support all use cases well or is the proposed change tailored to support one specific use case? If a data consumer wishes to treat consolidated entities as consolidated objects, is there sufficient tagging so that they can de-duplicate these boundaries?
  • Which entity should maintain the history of the original boundary? OSM IDs are semi-stable, so we should not cause unexpected behavior by changing what a boundary relation represents.

At @ZeLonewolf’s request I’ve split off the data modeling discussion of whether or not to have both a city and county boundary as this is slightly distinct from the question of tagging.

The principle to follow consistently is that we map territories, not governments. The rule of thumb I use is: where there’s a welcome sign, there’s a boundary. A local government only serves to justify the boundary as an administrative boundary as opposed to some other kind of boundary. Conversely, we may decline to map some administrative boundaries that only exist on paper. For example, epidemiological reports are uniformly organized according to county-equivalents throughout the country, but such statistical usage is less tangible than most of the boundaries we map.

Consolidated city–counties are tricky because they exist on a spectrum; many retain some identity at both levels. The 1870 act consolidating New Orleans with Orleans Parish defined coextensive parish and city boundaries. Signs mark the parish limits along the highway, sometimes without a corresponding city limit sign (though I think this is just a mistake). State-authorized agencies continue to be aligned to the parish rather than the city, such as the Orleans Parish School Board. Local residents rarely have much reason to refer to the parish, but they do acknowledge its existence. I think it would be appropriate to continue to model Orleans Parish as a coextensive boundary relation.

Legally, there is only one boundary for the City and County of San Francisco. This is reflected in the city-and-county limit sign along the highway and on the Golden Gate Bridge. Certainly saving costs on metal might have factored into the use of a single sign, but it also reflects how local residents hardly ever mention “San Francisco County” as a separate notion.

To me, a separate San Francisco County boundary would be more pedantic than a separate Orleans Parish boundary and not as legally justified. However, I remember being a bit annoyed when it was deleted a couple years ago. I could get used to seeing that boundary again.

One more data point: as a rule, we aren’t mapping Ohio’s paper townships, so cities and villages that have withdrawn from their surrounding townships exist inside a “hole” in our admin_level=7 coverage. In other words, the withdrawn cities and villages are treated as exceptions. This works because they don’t actually function as township equivalents in real life. They don’t have the dual nature that we’re grappling with for CCCs. All we’re doing is omitting the paper townships because we think no one will care but everyone would be surprised.

Offtopic: I love when hilly cities have a single elevation precise to the one foot level. (edit: excuse me, city counties)

Thanks, Brian. Answers below:

None of these are part of the CCC discussion with the exception of the “city” of Honolulu, which is currently represented as a node in OSM. I personally am indifferent as to the fate of the node itself, but this discussion would not result in a “city” boundary for Honolulu when a municipality does not even nominally exist.

Virginia’s 38 independent cities; Baltimore, Maryland; St. Louis, Missouri; and Carson City, Nevada, are all already coded as admin_level=6 without any corresponding county, as they should be.

It doesn’t really matter too much how the wiki evolved to where it is and what editing disputes have happened in the past. Picking out statements from the wiki as if they’re some kind of legal foundation won’t get you very far. Anyone can edit the wiki and there’s all sorts of things that are wrong or don’t reflect community consensus. The only thing that matters is correctly documenting what the most current community zeitgeist is.

That’s a fair point about the wiki – and perhaps I’ve overstated its importance for this particular discussion. However, I’m hopeful the wiki will get to (and be maintained at) a point where it reflects current community consensus. I’m assuming the statement currently there did reflect consensus at the time it was made, or at least indicates that there hasn’t been much consideration of community consensus on the particular topic of CCCs, since it has gone unchallenged for so long.

For consolidated municipal / county entities – are these really two separate entities with co-extensive boundaries, or are they a single entity that performs the role of both municipality and county? Are separate municipal offices, organizations, and political or popular identities maintained?

I’m not an expert in CCCs so I can’t answer everything here, but as for “are these really two separate entities with co-extensive boundaries”, I can say – it depends . . .

At least 15 CCCs do not have coextensive boundaries. For example, Athens / Clarke County in Georgia is a CCC, but the boundary of Athens is smaller and within Clarke County, and Clarke County has another municipality, Winterville, located inside it.

Those 15 without coextensive boundaries would need (and currently do have) separate entities in OSM regardless of whether their governmental functions are combined.

And at least 17 CCCs are coextensive, of which 13 currently have the county and the city in OSM, and all of which have previously had both the county and city in OSM.

I might also add that of those 13, some are non-trivial, such as the five counties / boroughs of New York City, Philadelphia / Philadelphia County, New Orleans / Orleans Parish, Indianapolis / Marion County, etc.

Note that the 15 non-coextensive CCCs and the 17 coextensive CCCs do not include Alaska and four that I cannot verify whether they are coextensive or not.

Also, my list of CCCs comes from the Consolidated city-county - Wikipedia list and Wikidata entities. I maintain there is a Philadelphia County, I’m sure recognized by the state of Pennsylvania as a county, as distinct from the city of Philadelphia, even though their government functions are consolidated and boundaries coextensive.

How does this impact data consumers that currently expect these cases to be one entity? Will the change cause data consumers currently using the admin_level=6 boundaries to encounter duplication when new, overlapping admin_level=8 boundaries are introduced to the map? Have you explored the impact on known data consumers?

I admit I haven’t explored that and wasn’t aware such a wonderful resource even existed. I’d have to take time to look or rely on someone more familiar with it to help me. However, @stevea states “I’m listening to downstream user concerns and I see no (or few) contradictions”, which I’m assuming answers this portion.

Does the duplication support all use cases well or is the proposed change tailored to support one specific use case?

This is not an approach tailored specifically to me. Getting a list of counties and also getting a list of cities, even if their administrative functions are combined and even if their boundaries are coextensive would be a common exercise for any data consumer interested in that level of administrative boundary.

I actually believe it supports all use cases even better than what the current status does, the current status being that four counties are missing from OSM’s data. Of course I’m speaking for other data consumers, but currently they have to address either the 4 missing county entries or the 13 with entries that also have coextensive cities.

Here’s an example using San Francisco and Philadelphia, both similar sized, major CCCs that have coextensive boundaries, one of which has both the county and city in OSM and the other of which does not.

This Overpass query should result in a list of the 58 counties in California: List of counties in California - Wikipedia

[out:json];
area[wikidata=Q99]->.a; // Q99 = California
relation[admin_level=6](area.a);
out tags;

It does return 58 rows, one being the CCC of San Francisco.

A user might use this query to return all cities in California:

[out:json];
area[wikidata=Q99]->.a; // Q99 = California
relation[admin_level=8](area.a);
out tags;

It returns 594 rows but San Francisco is missing, although someone might reasonably think San Francisco should show up on the list of admin_level=8 cities in California.

Turning to Pennsylvania, there are 67 counties: List of counties in Pennsylvania - Wikipedia

[out:json];
area[wikidata=Q1400]->.a; // Q1400 = Pennsylvania
relation[admin_level=6](area.a);
out tags;

It does return 67 rows, one being the CCC of Philadelphia County.

But a user might also reasonably think this query will return all cities:

[out:json];
area[wikidata=Q1400]->.a; // Q1400 = Pennsylvania
relation[admin_level=8](area.a);
out tags;

It returns 2262 rows and the CCC of Philadelphia is present as someone might expect from a list of admin_level=8 cities in Pennsylvania.
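For anyone who wants to repeat these comparisons without eyeballing the rows, here is a minimal Python sketch. It assumes the Overpass JSON response has already been fetched and parsed into a dict (an `elements` list with a `tags` dict per relation, which is what `out tags` produces). The function names and the two-element sample are made up for illustration and are not real query results.

```python
def relation_names(response):
    """Collect the name tag of every relation in a parsed Overpass JSON response."""
    return {
        element["tags"]["name"]
        for element in response.get("elements", [])
        if element.get("type") == "relation" and "name" in element.get("tags", {})
    }


def appears_in(response, name):
    """True if a relation with exactly this name is in the response."""
    return name in relation_names(response)


# Illustrative stand-in for a parsed admin_level=8 response, not real data.
sample = {
    "elements": [
        {"type": "relation", "id": 1, "tags": {"name": "Oakland", "admin_level": "8"}},
        {"type": "relation", "id": 2, "tags": {"name": "Berkeley", "admin_level": "8"}},
    ]
}

print(len(sample["elements"]))              # row count, like the counts quoted above
print(appears_in(sample, "San Francisco"))  # False for this made-up sample
```

Matching on the exact name is a simplification; a county and its city often carry different names (for example "Philadelphia County" versus "Philadelphia"), so a real check would more likely compare Wikidata IDs or relation IDs.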

There is no difference between San Francisco / San Francisco County and Philadelphia / Philadelphia County other than that three years ago someone seemingly arbitrarily deleted San Francisco County. I’m just asking for the four missing counties to be put back into OSM as they were previously and the tagging updated to mirror the other 13 CCCs with coextensive boundaries.

If a data consumer wishes to treat consolidated entities as consolidated objects, is there sufficient tagging so that they can de-duplicate these boundaries?

Not if tagged as normal counties and cities; however, we could possibly use a tag that highlights not only that they are CCCs, but coextensive CCCs if applicable (since not all CCCs are coextensive). If a data consumer wanted to treat them as consolidated objects, then they could.

Alternatively, I suppose they could check the boundaries themselves, but I suspect the tag would be much more efficient and welcomed by the data consumer.
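To give a sense of what "checking the boundaries themselves" might involve, here is a rough Python sketch. It assumes relations were fetched with their member lists (each member a dict with `type`, `ref`, and `role`, as in Overpass JSON `out body` output) and that a coextensive city and county reuse exactly the same outer ways. That second assumption will not always hold, in which case a true geometric comparison would be needed. All IDs below are invented.

```python
from collections import defaultdict


def outer_way_ids(relation):
    """Return the set of way IDs that form the relation's outer ring(s)."""
    return frozenset(
        member["ref"]
        for member in relation.get("members", [])
        if member.get("type") == "way" and member.get("role") == "outer"
    )


def coextensive_groups(relations):
    """Group relation IDs whose outer rings are built from identical ways."""
    groups = defaultdict(list)
    for relation in relations:
        key = outer_way_ids(relation)
        if key:  # skip relations with no outer ways at all
            groups[key].append(relation["id"])
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Invented example: relations 1 and 2 share the same two outer ways.
relations = [
    {"id": 1, "members": [{"type": "way", "ref": 10, "role": "outer"},
                          {"type": "way", "ref": 11, "role": "outer"}]},
    {"id": 2, "members": [{"type": "way", "ref": 11, "role": "outer"},
                          {"type": "way", "ref": 10, "role": "outer"}]},
    {"id": 3, "members": [{"type": "way", "ref": 12, "role": "outer"}]},
]

print(coextensive_groups(relations))  # [[1, 2]]
```

Even in this simplified form the consumer has to download and compare member lists for every candidate pair, which is the extra work a dedicated tag would spare them.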

Which entity should maintain the history of the original boundary? OSM IDs are semi-stable, so we should not cause unexpected behavior by changing what a boundary relation represents.

The relations for San Francisco County, Nantucket County, Denver County, and Broomfield County are already in the OSM database, just deleted. They just need to be reverted and the tags cleaned up.

The relations for San Francisco County and Nantucket County are the ones from years ago. I couldn’t revert the deletions of Denver County and Broomfield County so I created new ones.


Bottom line – there are 17 CCCs with coextensive county and city boundaries, and it would be beneficial to all data consumers, current and future, for them to be entered into OSM consistently, but they are not.

At present four have only the city at admin_level=6, the same as an independent city (which they are not), while 13 have both the city and the county entry at admin_level=8 and admin_level=6 respectively, the same as the other 15 CCCs that do not have coextensive boundaries.

To make the 17 CCCs with coextensive boundaries the same, we should either:

  1. reinstate the four counties that are not present, or
  2. delete the 13 counties that are present

Option 1 would also have the added benefit that all CCCs would be treated the same regardless of having coextensive boundaries or not and will match Wikidata and Wikipedia.

Either way, from there data consumers can adjust since they know San Francisco / San Francisco County will be treated the same as Philadelphia / Philadelphia County.

Once we figure this portion out perhaps the discussion on tagging can follow.

Honolulu is currently represented by this boundary relation, which I see now you’ve changed (incorrectly) to a boundary=administrative. It was previously a boundary=place. In the local understanding, Honolulu stretches from Halawa to Makapuʻu Point. It has customary, but not administrative boundaries, and isn’t represented by a single CDP either.

Thanks, Brian – I’ve reverted it back.

I did that when I was just a little kid in this discussion :grin:

I was referring to this node and had forgotten the relation: Node: Honolulu (21442033) | OpenStreetMap

@Minh_Nguyen

So should San Francisco be considered more of an independent city?

Is there a difference from Carson City, Nevada?

And what about a solution for the coextensive CCCs being that a node be used to represent the city admin_level=8 rather than a relation?

It’s that way for a handful of CCCs already.

Ultimately I go back to where someone should be able to do a search for counties and get the correct list of counties and do a search for cities and get the correct list as well.

IMO (as a former City and County of San Francisco resident, for whatever that’s worth), it feels reasonable to me to model San Francisco as both an admin_level=6 and admin_level=8 boundary. The unified government maintains institutions that are typically the responsibility of both California counties (a Board of Supervisors, a sheriff department, a County Transportation Authority) and California municipalities (a mayor, a police department, a transportation agency).

I find this point persuasive: I think you’d expect San Francisco to appear both in a list of counties in California (in OSM, all admin_level=6 relations) and a list of municipalities in California (in OSM, all admin_level=8 relations), which would support tagging both concepts. I think it also contrasts with the situation in e.g. Virginia, where I don’t think you’d expect independent cities to appear in a list of municipalities. But someone more familiar with that area might be able to speak more definitively to that.

One question I would have for the CCCs, if we are doing double entries, which one (level 6 or 8) gets the tag:

official_name=City and County of (Name of City)?

Also, with regard to the overpass discussion, it would entirely be possible to do something like the following if the problem is simply “making all cities query-able”:

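[out:json];
area[wikidata=Q99]->.a; // Q99 = California, as in the queries above
(
  relation[boundary=administrative][admin_level=8](area.a);
  relation[boundary=administrative][admin_level=6][border_type=consolidated_city_county](area.a);
);
out tags;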

After all, it is already the case that you have to make state-specific use-case-dependent queries to answer the question of “all cities/towns in state X”.
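As a rough illustration of what "state-specific" queries could look like in practice, here is a small Python sketch that assembles the Overpass QL text per state. The function name and its parameters are hypothetical, and `consolidated_city_county` is only the example `border_type` value floated above, not an established tag value.

```python
def city_query(state_qid, extra_level6_border_types=()):
    """Build Overpass QL for admin_level=8 relations in a state, plus any
    admin_level=6 relations whose border_type should also count as a city."""
    lines = [
        "[out:json];",
        f'area["wikidata"="{state_qid}"]->.a;',
        "(",
        '  relation["boundary"="administrative"]["admin_level"="8"](area.a);',
    ]
    for border_type in extra_level6_border_types:
        # Values are quoted so that ones containing ";" stay valid Overpass QL.
        lines.append(
            '  relation["boundary"="administrative"]["admin_level"="6"]'
            f'["border_type"="{border_type}"](area.a);'
        )
    lines += [");", "out tags;"]
    return "\n".join(lines)


# Hypothetical per-state configuration: California gets one extra clause.
print(city_query("Q99", ["consolidated_city_county"]))
```

A consumer could keep a small table of Wikidata IDs and extra `border_type` values per state and generate every query from it, which keeps the exceptions in one place.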

At the least, there needs to be tagging that allows for data consumers that don’t want repeated administrative entities to de-duplicate them if such duplication is added to the database.

I’m all for clear and consistent tagging.

For ones who don’t want them listed twice, perhaps they’d need some awareness that some are being added but they probably already have a solution for the CCCs that currently have both city and county entries.

I’m having trouble picturing the use case you have in mind here. Do you happen to have an example of a data consumer for whom this duplication would present an unwanted/unintuitive side effect? One bad enough that they’d really want to de-duplicate the entries?

One thing that’s coming to mind is something like Nominatim listing an address as San Francisco, San Francisco, California, USA, which doesn’t really seem so wrong. But I’m sure there could be other issues that I’m not thinking of.

Any application where a data consumer uses OSM boundaries to walk up and down the hierarchy. E.g., a data consumer shows data for “Denver”. If there’s breadcrumbs leading up the admin hierarchy, an application should have the ability to walk the user straight up to “Colorado” without having a pointless “Denver County” layer in-between that shows the user the exact same data.
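Here is a minimal Python sketch of that breadcrumb case, assuming some structured CCC indicator exists in the tags. The key `consolidated_city_county` with value `coextensive` is purely hypothetical (no such tag has been agreed on), and the chain is assumed to be ordered from the smallest boundary to the largest.

```python
CCC_KEY = "consolidated_city_county"  # hypothetical tag key, for illustration only


def breadcrumbs(chain, collapse=True):
    """Return breadcrumb names from a chain of tag dicts (smallest area first).

    With collapse=True, an admin_level=6 entry flagged as a coextensive CCC
    is skipped whenever a smaller entry has already been emitted.
    """
    names = []
    for tags in chain:
        is_ccc_county = (
            tags.get("admin_level") == "6" and tags.get(CCC_KEY) == "coextensive"
        )
        if collapse and is_ccc_county and names:
            continue
        names.append(tags["name"])
    return names


chain = [
    {"name": "Denver", "admin_level": "8", CCC_KEY: "coextensive"},
    {"name": "Denver County", "admin_level": "6", CCC_KEY: "coextensive"},
    {"name": "Colorado", "admin_level": "4"},
]

print(breadcrumbs(chain))                  # ['Denver', 'Colorado']
print(breadcrumbs(chain, collapse=False))  # ['Denver', 'Denver County', 'Colorado']
```

The `collapse` switch is the point: with a tag to key on, showing or hiding the county layer becomes the consumer's choice instead of something forced by the data.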

This part (“bad enough”) is entirely subjective. But a data consumer should have the freedom to control de-duplication, or not, according to their preference. It’s completely reasonable to have a way to tag a city as a CCC (in a structured way), and it provides information not otherwise obtainable short of a spatial query to discern it.

…or they already have duplication in their database and simply haven’t noticed it yet. I have over 200,000 boundaries in mine, and I really only become aware of issues when a user reports it. These kinds of things are not always the easiest to pick out at scale.

The other issue is for use cases in which admin_level=6 CCC entities have been added by exception to an end-user’s application while admin_level=8 entities are added by rule. If the level 8 boundary gets added later, it could get swept in without anything to indicate the duplication. If there is at least a CCC indicator in tagging, the end user can query for the CCC IDs and handle the duplicates. That would impact me directly, along with any data consumer querying for 7/8 boundaries by rule and others by exception for city-level boundaries.

Unfortunately, I’m not as familiar with how Nevada, Virginia, etc. structure their county equivalents. I purposely chose the two cities that I’m personally familiar with: my birth certificate says Orleans Parish, and I worked for a time in the City (and County of San Francisco). Just from my two cities, it looks like the CCCs fall along a spectrum, so I wonder if we should handle them on a case-by-case basis. It sure would be easier if San Francisco had kept its original name of Yerba Buena, because then we could simply observe the degree to which the name San Francisco remains in use, as I’ve done with Orleans.

One might see the differences among the CCCs as pedantic and prefer consistency if it ever comes to it. As an engineer, I would completely understand this tradeoff. But with my geographer’s hat on, I’m less inclined to apply a common solution to various states that don’t themselves seem to care for any interstate consistency. After all, “consolidated city–county” is census terminology, a generalization for statistical convenience.

This is one reason I lean towards keeping San Francisco’s boundary consolidated. Whenever you see the authorities use “San Francisco County”, what they really mean is the City and County of San Francisco or, more specifically, the City and County of San Francisco in the capacity of a county. The 1856 act abolished the County of San Francisco, renamed the surviving city to a “City and County”, and gave it the former county’s boundaries and responsibilities.

This consolidation wouldn’t set a precedent for other edge cases across the country, only for the (zero) other city–counties in California that were established in the same manner. Along these lines, I would expect to handle Louisiana’s city–parishes consistently with each other. New Orleans and Orleans Parish have the same official statuses as Baton Rouge and East Baton Rouge Parish, respectively, which translate into real-world effects like boundary signs. The only difference is that Baton Rouge is not coextensive with East Baton Rouge Parish, so the need for a distinct boundary is more obvious at a glance.

I’m not sure we should guarantee that querying for a single admin_level=* value will correctly return every jurisdiction of a given type. It seems overly simplistic, since the admin_level system is fundamentally nothing more than an indication that one boundary nests inside another – a topological attribute, not a semantic one. Sometimes one can make further inferences based on a subset of the data, but this doesn’t mean the entire dataset must also satisfy the same expectations. If one already has to make exceptions to include New York City and Washington, D.C., in a nationwide list of cities, what’s a few more? :grin:

border_type=* is a better tool for coming up with such lists. I think it would be reasonable for San Francisco to have border_type=city;county. Tools like Overpass have built-in support for semicolon-delimited value lists, or you can match on a regular expression instead.
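For consumers working on already-downloaded tags, a semicolon-delimited value is easy to handle locally. This is a minimal Python sketch, assuming each boundary's tags arrive as a plain dict; the function names are made up, and the regular-expression variant is included only to mirror the "match on a regular expression" option.

```python
import re


def border_types(tags):
    """Split a semicolon-delimited border_type value into a set of parts."""
    raw = tags.get("border_type", "")
    return {part.strip() for part in raw.split(";") if part.strip()}


def has_border_type(tags, wanted):
    """True if `wanted` is one of the listed border types."""
    return wanted in border_types(tags)


def has_border_type_regex(tags, wanted):
    """Same check with a regex anchored on the list delimiters."""
    pattern = rf"(^|;)\s*{re.escape(wanted)}\s*(;|$)"
    return re.search(pattern, tags.get("border_type", "")) is not None


sf = {"name": "San Francisco", "admin_level": "6", "border_type": "city;county"}

print(has_border_type(sf, "city"))          # True
print(has_border_type(sf, "county"))        # True
print(has_border_type_regex(sf, "town"))    # False
```

Anchoring on the delimiters matters: a plain substring test for "town" would also match "township".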

We probably shouldn’t put too much emphasis on Nominatim’s behavior, since in reality, it doesn’t know how to form a proper U.S. address. That said, it’s fair to assume that any implementation of a breadcrumb-style reverse geocoding result would show los dos San Franciscos if we have separate boundary relations for both. Back when this was the case, I tended to ignore it as a rough edge, not ideal – but also not as wrong as the national capital being listed as “Washington, D.C., Washington, D.C.” in some languages and “Washington (city), Washington (city)” in others due to some overzealous data reconciliation.

I had asked earlier about using a node to represent the admin_level=8 city portion but didn’t see any responses to that.

Any thoughts there? Consolidated city-counties with relations for the county and nodes for the city are:

  • Anaconda / Deer Lodge County in Montana
  • Butte / Silver Bow County in Montana
  • Camden / Camden County in North Carolina
  • Hartsville / Trousdale County in Tennessee

This doesn’t answer every question posed here (e.g., which one should get the official name tag), but I’m curious whether a solution might emerge from considering that.

I’m curious – has this ever come up with the others, like Philadelphia / Philadelphia County or Indianapolis / Marion County?

Those have always been that way from what I can see.

Has that caused any particular hardship?

That would be counterproductive. As far as I know, Anaconda has a well-defined boundary, whether it’s coextensive with or distinct from Deer Lodge County’s boundary. Reducing Anaconda to a mere place node would assert that it has no well-defined boundary, similar to a neighborhood in a city or a village inside a New England town.

Ok, just curious.

It’s currently only a node, btw, but I didn’t do that one! :rofl: