Place classification - town vs suburb, remote rural areas, how does a CDP relate to place

willkmis · April 5, 2024, 7:02pm

I can think of quite a few examples in Central and Southern California.

Santa Maria and Santa Barbara are 60 miles from each other, separated by farmland and a mountain range, but are both in one MSA because they’re at either end of Santa Barbara County. If anything, Santa Maria is more associated with San Luis Obispo 25 miles to its north, which has its own MSA.
Ridgecrest is a 25,000 population city in the Mojave desert. But it happens to be on the eastern border of Kern County, so it’s in the Bakersfield-Delano MSA, even though Bakersfield is 100 miles and across the Sierra Nevada from it.
A lot of the Mojave desert cities are in a similar boat: Lancaster/Palmdale is in the Los Angeles MSA, while Victorville and Barstow are in the Riverside MSA. All are at least 50 miles and across undeveloped terrain from the named MSA cities. I’d guess they’d at least be their own divisions if they were in a separate county, but they happen not to be. I think even Needles, a three and a half hour drive from Riverside across the Mojave (though definitely just a town), is technically in the Riverside MSA because it’s still in San Bernardino County.
South Lake Tahoe is probably not a city, but it is also two hours and across the Sierra Nevada from Sacramento. But it’s in the Sacramento MSA because it’s in El Dorado County, which also includes some Sacramento suburbs at its western end. Truckee, which is slightly smaller than South Lake Tahoe but certainly thought of as being in the same Tahoe region, is its own Micropolitan Statistical Area because it’s in a different county. South Lake Tahoe’s closest city is Carson City, NV, 25 miles away and its own MSA.
I speculate that if the counties were smaller, then the Coachella Valley (Palm Springs/Indio), the Temecula Valley (Temecula/Murrieta), and Escondido would at least be their own divisions, since they’re not or barely connected by continuous urbanization to their main MSAs (Riverside for the first two, and San Diego for Escondido). So to me they seem to be their own nuclei. The bar seems to be pretty low if San Rafael/Marin County gets its own division in Northern California. Even Los Angeles and Long Beach probably constitute separate nuclei worthy of their own divisions if New York and Newark do, though they’re at least both named in the one Division.

I wouldn’t be surprised if the “top few in the MSA” works pretty well in some areas, but hopefully this illustrates why I’m a bit skeptical of it.

Minh_Nguyen · April 5, 2024, 7:46pm

So if Santa Barbara County were partitioned between Santa Maria and Santa Barbara, would Santa Maria be its own MSA independent of either Santa Barbara or San Luis Obispo? That’s what I’m asking for examples of. Polycentric MSAs quite often have cities separated by this distance or more; an MSA isn’t strictly a contiguous expanse of urbanization, as your examples show.

Anyways, if the CBSAs are such a bad basis for place classification because of the awkward county boundaries, then we could use the more granular Urban Areas, which are also named after cities but never follow administrative boundaries.

The difficulty with UAs is that sometimes they’re even more granular than the city boundaries that laypeople are familiar with. Twentynine Palms would stand apart from Riverside–San Bernardino, but so would Twentynine Palms North, which we don’t recognize as a separate place within the city limits. After all, not every city is a bedroom community; most cities are also defined by economic activity and cultural and historical factors.

The Census Bureau used to distinguish bigger Urbanized Areas from smaller Urban Clusters but did away with that distinction a few years ago, so we can’t determine city solely based on the existence of a UA. Santa Maria would stand apart from San Luis Obispo, but so would Nipomo and Arroyo Grande–Grover Beach–Pismo Beach in between. At the end of the day, we’d still need population thresholds, but at least the population figures would correspond more closely to our place points in states like California.

ZeLonewolf · April 5, 2024, 7:54pm

From this thread, I see a couple themes.

Classifying place= by population alone is non-useful. If population is tagged and data consumers want to use population as the sole discriminator between places, we gain zero additional benefit from determining place= based on fixed population thresholds. In fact, we might as well tag every place place=city and then ask data consumers to just look at population as the discriminator.

Therefore, the only utility in the specific place value is in discriminating population centers in a way that cannot be done from population alone.
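To make that argument concrete, here is a minimal Python sketch. The thresholds, the sample places, and the function name `place_from_population` are all assumptions made up for illustration, not any agreed OSM rule. It shows that if place= is a pure function of population, a data consumer can recompute it and the tag carries no extra information.

```python
# Hypothetical thresholds, for illustration only; not an agreed OSM rule.
THRESHOLDS = [(100_000, "city"), (10_000, "town"), (1_000, "village")]


def place_from_population(population: int) -> str:
    """Derive a place= value from population alone."""
    for minimum, value in THRESHOLDS:
        if population >= minimum:
            return value
    return "hamlet"


# Made-up places, tagged strictly by the thresholds above.
places = [
    {"name": "A", "population": 250_000},
    {"name": "B", "population": 25_000},
    {"name": "C", "population": 300},
]
for p in places:
    p["place"] = place_from_population(p["population"])

# A consumer holding only the population can rebuild every place= value,
# so the tag adds nothing that population did not already say.
redundant = all(
    p["place"] == place_from_population(p["population"]) for p in places
)
print(redundant)
```

The place= value only becomes informative once it encodes something a consumer cannot derive this way.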

I think it’s worth digging deeper into the MSA and μSA categories. If we could use Census Bureau categories to classify cities (or at least use them in a way that allows us to tag =city and =town in a consistent way on the large end of the scale), then it’s simply a ruleset that we can all live by, even if our favorite city doesn’t make the cut.

It does reveal some interesting corner cases, though. Would we be okay with place=town + capital=4 + name=Montpelier? Because that’s probably appropriate tagging, and if a data consumer wanted to separately promote state capitals, they could do so by consuming the capital= tag separately.
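As a rough sketch of how a data consumer could do that, assuming a hypothetical ranking table and a hypothetical one-step promotion rule for state capitals (neither comes from any real renderer), something like the following would let Montpelier label on par with a place=city:

```python
# Tags as proposed in the post. The ranking table and the promotion rule
# below are hypothetical consumer choices, not an existing renderer's logic.
montpelier = {"name": "Montpelier", "place": "town", "capital": "4"}
burlington = {"name": "Burlington", "place": "city"}
ordinary_town = {"name": "Example Town", "place": "town"}

PLACE_RANK = {"city": 3, "town": 2, "village": 1}


def label_priority(tags: dict) -> int:
    """Rank a place for labeling, reading capital= separately from place=."""
    rank = PLACE_RANK.get(tags.get("place"), 0)
    # capital=4 marks a state capital; promote it by one step (assumption).
    if tags.get("capital") == "4":
        rank += 1
    return rank


for tags in (montpelier, burlington, ordinary_town):
    print(tags["name"], label_priority(tags))
```

The place= value stays honest about the settlement itself, and the boost is entirely the consumer's decision.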

ezekielf · April 5, 2024, 9:40pm

This seems fine to me. Vermont could easily have Burlington as its only place=city. We only have one MSA in the state, and Burlington is the core city. Whether there are any other place=city’s in VT depends on what we decide about μSA core cities. If they also qualify for place=city status, then that would also include Bennington, Rutland, and Barre. Either way, Montpelier is probably fine relying on its capital=4 tag to provide a boost in data consumers that support it.

Joseph_R_P · April 5, 2024, 10:59pm

I do understand that the GNIS listing geographical features such as populated places is federal recognition, but it’s not a complete list of them. Like you mentioned, things as minuscule as trailer parks or railroad sidings and the like are included in the system as Populated Places, but as of the last time I looked, my residential subdivision with a common name was not, so it’s not exactly 1:1 with OSM classification standards. Opposed to that, however, is how every single CDP or municipality is formed by some kind of government organization in the US. While a municipality and a census-designated place aren’t 1:1 in every regard, in the 48 states, no CDP and municipality overlap as far as I know.

That aside, in your opinion, what should constitute a populated place in the US in OSM? To my knowledge, there’s no definitive answer, but, in my opinion, if we went with ‘GNIS Populated Places + other named neighborhoods/suburbs that can be proven of their existence based on another source - CDPs not listed in the system as Populated Places’ being the criteria for when a populated place should be given a node on the map, that seems closer to a compromise though still probably flawed in various ways.

ZeLonewolf · April 5, 2024, 11:04pm

Every single CDP in Rhode Island overlaps with a municipality. Every square inch of Rhode Island is part of a municipality. There are no unincorporated areas.

Joseph_R_P · April 6, 2024, 2:47am

New England towns, just like townships in the Midwest or New Jersey, politically function like typical towns or cities but are structurally more equivalent to a ‘sub-county’. Often, depending on the state, they contain incorporated villages, CDPs, or the like with their own identities and address cities.

Minh_Nguyen · April 6, 2024, 4:05am

I wasn’t suggesting that we use GNIS as the sole arbiter of whether something gets mapped as a place. GNIS is extensive but by no means comprehensive when it comes to minor places. Its coverage of neighborhoods is particularly lacking.

A CDP is a statistical convenience, equivalent to a municipal corporation boundary for demographic purposes. If you’re looking for an equivalent to a city or town center, which is what something like place=village represents, then a CDP is technically the wrong tool for the job. You want a point feature with meaningful coordinates, not an area feature or its centroid. You want something defined by not only a bunch of people living somewhere but also some measure of present or past commercial activity.

In other words, you want whatever Populated Place feature or features the population has grown around, which is likely to have been imported from GNIS. In the rural West, maybe it seems like every CDP is centered around the only stoplight or post office for miles around, but this is far from the case everywhere.

stevea · April 6, 2024, 10:27am

I don’t mean to sound harsh here, but saying “just like townships in the Midwest or New Jersey” is not only much too bold, but actually much too broad to be effectively true.

Saying “like typical towns or cities” is pretty hand-wavy. I wish to offer constructive criticism, so, if you mean “incorporated” by that then please use that specific word to denote that. Otherwise, I (we) don’t know how you mean that. And “structurally more equivalent to a ‘sub-county’” sounds very much like the concept of township (7-subordinates-to-6 using USA admin_level values), although “township” is a well-abused word, which usually means the sort of “sub-county thing” you mention. But, a township is not a “typical town or city,” they are quite deliberately distinct.

See, there are legal definitions of things (organized, incorporation…the first is quintessentially defined by each state, the second seems pretty widely well-agreed-upon) and there are colloquial understandings of things. These are not the same senses of understanding. We must be careful (perhaps not exact), at least as precise as we can be. We cannot be loose and “well, you know what I mean,” because even though we are from the same country and have similar culture as to law and conurbation tradition (largely, not exactly) we truly don’t know what we mean. Exactly. Unless we rather precisely denote these things. So, let’s carefully do that as best we can, rather than expecting that others see things “our” (my, her, their…) way.

An incorporated village and a CDP are wholly, almost wildly different entities. To rather simplistically “or them together” seems a bit reckless to me. They are distinct. Let’s keep them that way, in our thoughts, our words and our map (data).

It’s an easy slope to slip down into, but as we are careful, we can avoid losing our balance.

Joseph_R_P · April 6, 2024, 5:37pm

I do recognize that having a sort of “downtown” strengthens a higher classification or one that implies its independence from a nearby place, but I don’t think it’s necessary for a community to have a central location like a commercial area or main crossroads for it to be tagged as a hamlet, village, or town as opposed to a neighborhood or such. CDPs like Sandy Valley and Amargosa Valley in Nevada have their own distinct identities relative to where they’re located and aren’t exactly adjacent to a larger community like your typical US-colloquial suburb would be, but they have no area reminiscent of any sort of a “downtown”, rather schools, parks, and general stores scattered near their main roads here and there. Then there are also various unincorporated communities, such as Cold Creek, Corn Creek, and Trout Canyon in Clark County as well as others throughout the state, that have no centralized location (other than the geographic centroids themselves) or commercial development whatsoever but are more standalone than typical residential subdivisions or neighborhoods in urban scenarios.

Joseph_R_P · April 6, 2024, 5:44pm

I’m aware that legal definitions and colloquial definitions are not the same. By OSM standards, these features are very similar, however. We already avoid tagging places like villages, towns, and cities based off their incorporation status. But what these places are legally isn’t exactly my point. What I’m saying is that these various administrative divisions within counties often have their own distinct identities and can’t necessarily be equated to villages/towns/cities on OSM.

stevea · April 6, 2024, 6:12pm

Joseph (if I may so address you) I appreciate your (two, to both me and Minh) clarifications, as such. When you say “by OSM standards, these features are very similar” again, we must be careful. I’m OK (and the community seems OK) with having an expanded conversation (as we do here now, very important to continue it!) that explores whether a downtown “central business district” might have some component (in an OSM context) to whether we choose to tag city. Similarly as to whether a hardware / grocery store or a hospital / college or university might “tip the scale” towards a hamlet becoming a village or a village becoming a town. These are good conversation points, but what is actually happening is “OSM standards are actually hammering out the details” (rather than these values being “very similar”). We don’t want these values to be similar, we want them to be distinct. Sometimes they aren’t clearly so, we continue to strive to find what makes them tick — regionally and amongst ourselves in a wider (national, likely) context. There appears to be “regional lumpiness” in our data, this seems directly because we “smear” the gray zones of where the boundaries are. Making this sausage is not a perfect science, but the more we find (and agree upon!) the distinctions, rather than how much overlap and similarity there is, the better we’ll be able to apply the differences to what are actually different from each other.

Minh_Nguyen · April 6, 2024, 6:46pm

If a community is very isolated, then its point location might certainly be more arbitrary. Traditionally it would be at a crossroads, the post office, a schoolhouse, or the railroad station. These days there are other options if a community doesn’t have a recognizable anchor institution. Maybe a centroid is a decent last resort in some cases, but not as a first stop in general.

You’re highlighting that rural America isn’t homogenous, either. If I take all your points together, what I think you’re really vouching for is the Urban Area designation, not CDPs, even if there is some overlap. Most UAs and CDPs contain or are associated with a place that GNIS acknowledges as something. I don’t think this alone would be a rubric for deciding between place=town and place=city, but it would bolster the case for promoting something beyond a mere place=locality or place=hamlet.

stevea · April 6, 2024, 7:50pm

To be clear about what we are specifically addressing, guidelines at United States/Tags - OpenStreetMap Wiki regarding how we use the place=* tag in the USA remain a bit loose, but they are better than nothing. There are plenty of CDPs in the USA which are also tagged with place=*; that is acceptable to our community.

When Minh says (directly above) that “a crossroads…” can be used to define a point location for a “very isolated community,” he may be paraphrasing that wiki. Indeed, it suggests for us “to identify population centers, put the place=* POI at the location that is generally considered the center of the place, for example, a town square, a county courthouse, or the intersection that serves as the origin of a city’s street grid.”

These shouldn’t actually be arbitrary, but they may seem that way to those unfamiliar with the specifics of the place.

Let’s agree a place=locality is indeed a place, but it has an association about it that it is (currently) unpopulated. It may be a historical vestige: for example, many “railroad places” once signified an old halt on a rail line that no longer exists (neither the “place” nor the rail line) and this “place” might be something that current locals know as a distinct name for those immediately surrounding environs, even as “nobody lives there, there are no services there, but it IS a distinct ‘place’ known by that name.” On the other hand, such a “locality” name might be historically accurate, but doesn’t truly reflect how people know that area today, and so we might better consider that node (and its name=* and place=* tags) so named to exit OSM proper, perhaps finding its way into OpenHistoricalMap.

Minh_Nguyen · July 9, 2024, 2:39am

I’m kind of surprised that Nevada’s unincorporated towns didn’t come up in this conversation. Moapa Valley came up in Slack last year, but I don’t think there was any consensus about it. For posterity, I think we should basically classify this kind of place without regard for its (lack of) incorporation. This is pretty important, because not every unincorporated town is as remote as Moapa. The most well-known one is perhaps Paradise, home of the Las Vegas Strip – hardly anyone’s idea of a remote desert oasis, though it was in the past.

The boundaries of these places are at least boundary=census, since the Census Bureau considers them to be CDPs. But they’re a bit more than run-of-the-mill CDPs. The county directs extra funding for services at the territory covered by an unincorporated town. On the other hand, its governing board is merely advisory, which puts it on the same plane as a humble New York City community district:

Manhattan Community Boards as administrative boundaries

After reading through the attached link, what I understand is that the community boards are simply an advisory group formed for the purpose of liaison and civic engagement and have no actual powers or duties of their own.

The purpose of each board is to encourage and facilitate civic engagement within their communities, and to work with City agencies that deliver municipal services

I was fishing in the report for anything in the form of powers and duties that these boards have. I see a lot of language like…

Community boards have also increased collaboration with City agencies to ensure food delivery, communication of vital information, and access to healthcare services for our constituents.

This still reads as “advisory functions and civic engagement”.

Even the planning and service delivery facets that have been mentioned, all appear to be advisory on the part of the boards.

ZeLonewolf · July 9, 2024, 2:48am

Just like in Maine, I would consider towns in Nevada – or NYC community boards – or New England counties – as a collective class of boundaries that should be characterized on the whole. I don’t think we need to examine each individual entity’s degree of real-ness, but rather consider the characteristics of each whole group.

stevea · July 9, 2024, 3:17am

Brian, if you mean “as a whole within a single state,” that’s one thing. If you mean “as a whole across states,” no. It (sometimes) makes sense to examine things within a state: when we compare them, we’re likely to find enough similarity that they sensibly coalesce together (they should at a state level or so). It almost never makes sense to do this across states (territories, district…), as that’s asking for the sort of trouble that often happens when you discover that you’re comparing apples to oranges (interstate), not apples to apples (more likely to happen intrastate).

You can categorize these things as “sub-6 entities” (below county) where semantics get hairy and quite locally-specific. But you can’t treat “all 9s together,” if that’s what you mean by “each whole group.” A 9 in one state is there because its 8 (likely different than any other particular 8) needs it to be there. The reasons are different for another state’s 8s and 9s (if it even has 9s).

If I’m misunderstanding what you mean by “each whole group,” I apologize and ask for clarification.

ZeLonewolf · July 9, 2024, 10:46am

What I’m describing is a general concept that is descriptive of how we assign tagging. There are two forces in play – which I’ll call collective tagging and equivalency tests.

The Maine case is a great example of both. When I mapped the boundaries in Maine the first time, I made all the incorporated towns and cities admin_level=8 and the unincorporated ones admin_level=9. I did this simply out of expediency because the local mapper I was working with came up with that scheme and I figured he knew better what the reality was there. I also figured that mapping them was the hard part and if we needed to change one tag, it was a trivial edit to do so with JOSM and Overpass.

What we decided in Maine was to tag all of them admin_level=8 based on this principle of collective tagging. In other words, we decided that the whole group of these boundaries represented a class of features, and that we tag all features in that class collectively. We reserved the tagging of those differences for a lower-level tag (in this case, border_type), for any data consumer that cares to know about this distinction. And in that case, we decided that all of the unincorporated townships also represented a (narrower) class of feature, and we tagged them with a common border_type=township.

Lastly, once we decided that all of these should be the same admin_level, what remains is which value to assign. There was not even a debate on that point – these boundaries were equivalent to the cities and towns in other states which are tagged admin_level=8. Even though every place is different, we still try for equivalency tests to the maximum extent possible, because it makes things easier for data consumers to have the least possible variability between places. The further afield we go, the less these equivalency tests work, but we make the best-possible effort nonetheless.
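A small Python sketch of what that looks like from the data consumer side. The admin_level=8 and border_type=township values follow the Maine description above; the feature names and the other border_type values are placeholders I made up for illustration.

```python
# Hypothetical Maine boundary records. admin_level=8 and border_type=township
# follow the description above; names and other border_type values are
# placeholders, not real data.
boundaries = [
    {"name": "Example City", "admin_level": "8", "border_type": "city"},
    {"name": "Example Town", "admin_level": "8", "border_type": "town"},
    {"name": "Example Township", "admin_level": "8", "border_type": "township"},
]

# Collective tagging: a consumer that wants the whole class filters on
# admin_level and gets every feature, incorporated or not.
municipal_layer = [b["name"] for b in boundaries if b["admin_level"] == "8"]

# A consumer that cares about the narrower distinction reads border_type.
townships = [b["name"] for b in boundaries if b["border_type"] == "township"]

print(municipal_layer)
print(townships)
```

The point of the design is that the common case needs only one tag, and the finer distinction is still available to anyone who asks for it.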

For example – why didn’t we tag New York City admin_level=6 and then each of New York’s counties admin_level=7 and then other cities and towns admin_level=8? It’s because we put high value in the equivalency between counties, and county-equivalents, across the US.

This is not just a thing for boundaries, it applies to any OSM feature that we’re considering.

Now anyways to the matter at hand. Why does the logical reasoning that we used in Maine not simply apply in Nevada the same way?

Minh_Nguyen · July 9, 2024, 2:22pm

I relegated this to a footnote in another thread, but if we were especially concerned with equivalency, we would’ve modeled New York City very differently:

Five counties at admin_level=6, like all the other counties in the state
One city spanning all five counties at admin_level=7, analogous to upstate towns
Five boroughs at admin_level=8, analogous to upstate villages
Community districts at admin_level=10 if at all

But we didn’t, because “Manhattan Community District 1, Manhattan, New York, New York County, New York” would be a bit much in geocoding, and New York County is a paper county. Rationalization is a tradeoff.
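For illustration, here is a minimal Python sketch of how a naive geocoder would assemble that label from the alternative model in the list above. The level for the state (admin_level=4) is standard US tagging; the function name `geocode_label` and the simple "most local first" join are assumptions, not how any particular geocoder works.

```python
# The alternative model from the list above, as (admin_level, name) pairs.
# The state at level 4 is standard US tagging; the rest follows the list.
hypothetical_hierarchy = [
    (4, "New York"),                         # state
    (6, "New York County"),                  # county
    (7, "New York"),                         # city, analogous to upstate towns
    (8, "Manhattan"),                        # borough, analogous to villages
    (10, "Manhattan Community District 1"),  # community district
]


def geocode_label(hierarchy):
    """Join names from the most local level to the least local."""
    ordered = sorted(hierarchy, key=lambda item: item[0], reverse=True)
    return ", ".join(name for _, name in ordered)


print(geocode_label(hypothetical_hierarchy))
```

Every extra level in the strictly "equivalent" model becomes another component in the output, which is the cost being traded against consistency.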

I’m still not sure I understand the point you’re trying to make regarding unincorporated towns – how would you approach place classification, and separately how would you tag their boundaries? Note that in Nevada, there are legally “cities” and “unincorporated towns”, but “town” can informally refer to small cities.

ezekielf · July 9, 2024, 5:38pm

Maybe fodder for another thread, but for a while now it’s been bothering me that in New York State, Town/Township boundaries are admin_level=7 while in every surrounding state they are admin_level=8 (overpass visualization). If we put such a high value on equivalency it seems to me that NY Town boundaries should be changed to admin_level=8.