Question: should CDP relations have a label node?

Pretty simple question. I was updating CDPs in Utah, and I was wondering if CDPs should have a label node.

My current understanding of CDPs is that, even though they're statistical boundaries, their footprint gives data consumers a good idea of where populated places are. Also, since these boundaries are released by the federal government, they serve as an authoritative source on the matter.

This leads to the reason for the question: since we're using CDPs in OSM, would it help to also add a label?

Yes, the U.S. community generally prefers to add unincorporated place points as label members of CDP boundaries. However, some of us are starting to doubt that practice:

The label role normally refers to an alternative representation of more or less the same feature. A CDP is so often such a crude representation of an unincorporated community that using it as an area representation of the community verges on misinformation. Apparently we’ve been misleading data consumers to think that a CDP is equivalent to an unincorporated place. Even from the Census Bureau’s standpoint, an urban area would better represent the community’s shape than a CDP, especially if it’s adjacent to a larger built-up area.

An area representation of the community is very useful for rendering and search use cases. In principle, we could import the urban area as a boundary=place relation and make the place point its label, but I’m unsure if we have the appetite to maintain those much more intricate boundaries.

And urban areas are kind of subjective anyway to be honest.

The official Census urban areas are mostly data-driven and algorithmic. The main source of subjectivity, distinguishing between urban clusters and urbanized areas, was eliminated in 2020. Of course, the choices that led to that algorithm are somewhat subjective and so is our choice of CDPs versus urban areas. I just hope data consumers don’t get the wrong idea about this, treating CDPs as more than they are.

I should have elaborated a bit more: I personally don’t agree with some of the urban areas the Census Bureau has come up with for cities that I know well haha

If we’re mapping CDPs (which is a broader issue), then I think adding label nodes makes sense in most cases, but not all cases.

Typically, a CDP will have a direct correlation with a named unincorporated community and there’s a pretty clear association between the CDP records from the Census Bureau and the Populated Place records from GNIS. When things match up, there’s a good case for using the Populated Place node as the label for the CDP.

However, there are some rare cases where the two don’t match up at all, where the CDP encompasses more than one unincorporated community and doesn’t share a name with any of the communities within the boundary. In that case, it’s hard to argue that a label node for the CDP would be meaningful.
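
To make the "most cases, but not all" rule concrete, here is a rough sketch in Python. The function name `pick_label` and the dictionary layout are made up for illustration, and it assumes you already have a list of the populated place nodes that fall inside the CDP boundary. It only compares names, so treat it as a starting point rather than a real conflation routine.

```python
def pick_label(cdp_name, places_inside):
    """Return the populated place that could serve as the CDP's label, or None.

    cdp_name: name of the CDP boundary (hypothetical input).
    places_inside: list of dicts with a "name" key, one per populated
    place node located inside the boundary (hypothetical input).
    """
    matches = [
        place for place in places_inside
        if place["name"].casefold() == cdp_name.casefold()
    ]
    # Only a single, unambiguous name match is treated as a label candidate.
    # A CDP that spans several communities and shares a name with none of
    # them gets no label at all.
    return matches[0] if len(matches) == 1 else None


# A CDP that lines up with one community of the same name.
simple_case = pick_label("Exampleville", [{"name": "Exampleville"}])

# A CDP covering two communities, named after neither of them.
combined_case = pick_label("North Valley", [{"name": "Foo"}, {"name": "Bar"}])
```

In the first call the matching node comes back as the label candidate; in the second the result is `None`, which is the "hard to argue for a label" situation.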

GNIS includes CDPs separately. For example, Amelia Census Designated Place (class “Census”) is distinct from Amelia (class “Populated Place”). Previously there was a feature in class “Civil” too, representing the incorporated village before it dissolved. We never imported the Census or Civil classes from GNIS, preferring TIGER boundaries instead.

So there isn’t a one-to-one correspondence as far as any of these datasets is concerned. The label memberships in our database represent a conflation on our part – and Wikidata’s and Wikipedia’s, though Wikidata wants to split them apart.

The GNIS Feature ID on “* Census Designated Place” records matches the PLACENS ID in the TIGER boundary data.

Right, this means that the boundary and place point have different gnis:feature_id=* tags.
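
As a sketch of what that looks like in the data, here are the two elements written out as Python dictionaries. The name, the element IDs, and both GNIS IDs are placeholders invented for this example, and `place=hamlet` is just an assumed value; only the `label` role, `boundary=census`, and the `gnis:feature_id` key reflect the tagging discussed in this thread.

```python
# Placeholder values throughout: these are not real GNIS feature IDs.
cdp_boundary = {
    "type": "relation",
    "tags": {
        "type": "boundary",
        "boundary": "census",
        "name": "Exampleville",
        "gnis:feature_id": "1111111",  # would come from the "Census" class record
    },
    "members": [
        {"type": "node", "ref": -1, "role": "label"},
    ],
}

place_point = {
    "type": "node",
    "id": -1,
    "tags": {
        "place": "hamlet",  # assumed value for illustration
        "name": "Exampleville",
        "gnis:feature_id": "2222222",  # would come from the "Populated Place" record
    },
}


def label_feature_ids(relation, nodes_by_id):
    """Pair the boundary's GNIS ID with the GNIS ID of each label member."""
    boundary_id = relation["tags"].get("gnis:feature_id")
    pairs = []
    for member in relation["members"]:
        if member["type"] == "node" and member["role"] == "label":
            node = nodes_by_id[member["ref"]]
            pairs.append((boundary_id, node["tags"].get("gnis:feature_id")))
    return pairs


pairs = label_feature_ids(cdp_boundary, {place_point["id"]: place_point})
```

A data consumer that joins on `gnis:feature_id` would see two different keys here, which is one hint that the relation and its label are not the same GNIS feature.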

Wikidata usually has both feature IDs on the same item with different subject has role (P2868) qualifiers. This is a placeholder until editors get around to splitting up the item and figuring out which statements pertain to which entity. (Otherwise Wikidata will flag it as a constraint violation.)

OpenHistoricalMap would keep the elements unrelated because they would typically have different start_date=* values. Plenty of unincorporated communities predate the system of CDPs. Similarly, on Wikidata, the inception (P571) date would typically differ between the two items.

All of this would point in the direction of keeping the elements separate in OSM, except that geocoder developers like being able to associate the place with an area, even if the area is somewhat or mostly wrong. It would still be more accurate than any guessed radius around the place point, which is what they would fall back to.
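
Roughly, the geocoder logic I have in mind looks like the sketch below. This is not taken from any particular geocoder; the function names, the `boundary_ring` and `guess_radius_km` keys, and the 2 km default are all assumptions made up for the example.

```python
import math


def point_in_polygon(lon, lat, ring):
    """Ray-casting test; ring is a list of (lon, lat) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge cross the horizontal ray heading east from the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def within_radius(lon, lat, center, radius_km):
    """Rough equirectangular distance check, adequate for small radii."""
    clon, clat = center
    dx = math.radians(lon - clon) * math.cos(math.radians((lat + clat) / 2))
    dy = math.radians(lat - clat)
    return 6371.0 * math.hypot(dx, dy) <= radius_km


def belongs_to_place(lon, lat, place):
    """Use the associated area if there is one, else a guessed radius."""
    ring = place.get("boundary_ring")
    if ring:
        return point_in_polygon(lon, lat, ring)
    return within_radius(lon, lat, place["center"], place.get("guess_radius_km", 2.0))


# Place with an associated boundary (a small square, purely illustrative).
with_area = {
    "center": (0.05, 0.05),
    "boundary_ring": [(0.0, 0.0), (0.1, 0.0), (0.1, 0.1), (0.0, 0.1)],
}
# Place point with no associated area, so the radius guess applies.
point_only = {"center": (0.0, 0.0)}
```

With the label membership in place, an address is tested against the actual boundary shape; without it, everything hinges on a radius that somebody had to guess.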

One way to emphasize that this is overconflation would be to omit place=* from the boundary relation. That’s what the previous thread explored in detail.

Yes. Agreed on all points.