New England place name inflation

I think the description at United States/Tags – Places is pretty good for northern New England:

  • place=city
    Cities at the center of metropolitan areas, always incorporated, with a metropolitan population above 50,000 (a rather small city — most cities have much larger populations — up to many millions of people). Some boomburbs are also tagged as cities, given their size (>100,000) and continuing growth.

  • place=town
    Smaller-sized towns or cities (which may or may not be incorporated), including suburbs (as that word is known in US English to mean “smaller incorporated city near a different, large center city”), generally with a population between 10,000 and 50,000 within incorporation limits. In sparsely populated rural areas, cities with a population less than 10,000 are also tagged place=town if they are state capitals (capital=4 — rare, though Montpelier, Vermont qualifies), county seats (capital=6) or otherwise especially important centers of civic activity with more-major amenities such as hospitals, universities, courts, dozens of commercial and/or industrial businesses, etc. Some place=town POIs represent population centers within unincorporated areas. These POIs may lie within census-designated places (CDPs) but do not necessarily correspond to them.

  • place=village
    Small cities and villages with a population generally less than 10,000. In sparsely populated rural areas place=village will have fewer amenities than place=town with only a bare minimum of commercial and civic amenities. (A consensus is emerging that a “village” has at least a small commercial landuse area: a market, a fuel station/convenience store, a bank, et cetera, this is flexible as of 2023) Some place=village POIs represent population centers within unincorporated areas, such as former villages that have disbanded.

  • place=hamlet
    For isolated settlements with fewer than about 200 residents.
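
As a rough sketch, the numeric part of the guidance quoted above can be read as code. The function and parameter names here are hypothetical, and the wiki text leaves much more room for local judgment than a function like this does:

```python
def suggest_place(population, metro_population=None, is_regional_center=False):
    """Suggest a place=* value from the rough population guidelines only.

    population: residents of the settlement (or within incorporation limits).
    metro_population: population of the metropolitan area the settlement
        anchors, or None if it is not the center of one (assumed input).
    is_regional_center: True for state capitals, county seats, or other
        important civic centers in sparsely populated areas.
    """
    # Center of a metropolitan area with more than 50,000 people.
    if metro_population is not None and metro_population > 50_000:
        return "city"
    # Boomburbs are sometimes tagged as cities because of their size.
    if population > 100_000:
        return "city"
    # 10,000+ residents, or a small but important rural center.
    # Assumption: non-central places between 50,000 and 100,000 stay "town".
    if population >= 10_000 or is_regional_center:
        return "town"
    # "Fewer than about 200 residents" is treated as a hard cutoff here.
    if population >= 200:
        return "village"
    return "hamlet"


print(suggest_place(44_000, metro_population=108_000))  # city
print(suggest_place(16_000))                            # town
print(suggest_place(8_000, is_regional_center=True))    # town
```

The amenity-based criteria (hospitals, courts, a small commercial area) are deliberately reduced to a single flag, which is the part a mapper would actually have to decide by hand.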

As an example, Burlington, Vermont has a population of 44k, but is the central municipality of a 108k-person urban area, so it would just barely make the cut for place=city. In contrast, Rutland, VT is the second-largest municipality outside of the Burlington urban area at only 16k population and may not make the cut for place=city. I’d be fine with Burlington being the only place=city in Vermont.

It’s likely that different guidance would be needed for southern New England where the higher population gives suburban neighborhoods larger populations than stand-alone Vermont towns.

1 Like

I’ve been sharpening up that Tags wiki, based on input from topics like this one as well as other communication I’ve enjoyed with several Contributors. It does, quite deliberately, attempt to “thread the needle” on the one hand between absolute population numbers (including them as a rough guideline) and on the other hand with the importance of cultural centers and other “big” amenities (hospitals, courthouses…) that might be found in a small population area that could boost it into a higher-up place=* category.

While I believe that the Census Bureau can offer arms-length-away guidelines for OSM to consult (not denote), OSM’s long-term trend of deliberately avoiding (or uniquely tagging, as with boundary=census) Census Bureau categorizations is good for OSM to continue. As our United States admin level - OpenStreetMap Wiki says about this,

Note that the Census Bureau tends to “make everything a county” (equivalent), whereas OSM strives to more strictly denote things “as they actually are.”

Let’s keep doing this. And again, as this is an excellent dialog, let’s continue and refine our understandings of how we choose place=* values. Minh substantially wrote “our other” admin_level-oriented wiki (United States/Boundaries - OpenStreetMap Wiki), as “there is room for more than one book in the library.” It may be that we choose a state-by-state approach as we do this. That would be a longer-term goal to achieve, but we can do that.

I have reverted all of these changesets by t2editme. Based on my searches in OSMCha it looks like they only affected Vermont and New Hampshire.

Revert changesets:

3 Likes

Totally agree on that point. If someone wants to rate places by their population, we have another tag for this which can be used :wink:

2 Likes

Even so, if you ask someone in Rhode Island to name the important cities there, they’re going to come up with a much denser set of cities than someone from South Carolina or Utah who looks at the same state and the same facts, not because of a lack of local knowledge but because of a different perspective. This perspective is informed by population density. We really need to establish a consensus about the degree to which we care about consistent density across regions. Only then can we come up with guidelines that don’t sound arbitrary.

4 Likes

I’m not sure those numbers work out west. For example, these places are currently tagged as cities, but have fewer than 100k people and are a long way from anywhere over 100k:
Durango 19k
Alamosa 10k
Sterling 14k
I guess it depends what we mean by ‘territory’

My examples are in CO, but I think you’ll find similar trends in any of the western states (except on the coast).
I don’t have a strong opinion about this, but we could end up with all of the CO cities in the Denver metro area and all of the UT cities in the SLC metro area.

The only reason these guidelines have concrete population thresholds is that the global definitions used to suggest some thresholds. They’re mainly in there just to disabuse mappers of the notion that place=* values correspond to the colloquial or official terms for places in their particular region. Novice mappers frequently act on that assumption, resulting in less usable data than even a simple “just keep whatever the GNIS import did” rule would produce.

Underlying these thresholds has always been the idea of a hierarchy. A suburb subordinates itself to a central city. Even a boomburb does despite sometimes being larger than the central city. A former Apple cartographer, Justin O’Beirne, once observed that Google Maps intentionally omits many place labels a certain distance away from a metropolitan area, “clearing its neighborhood” to enhance the city’s discoverability on the map and reduce clutter. It’s weird but it works!

Other hierarchies are readily available in the data without having to fudge anything. Midwestern states like Indiana and Ohio historically developed according to a hub and spoke system: beyond the metropolitan areas, there was a network of at least one town in each county that was connected by mainline railroads, canals, or intercounty highways. These towns, usually county seats, make for obvious place=towns, even if they’ve since been bypassed by Interstates and no longer matter as much economically.

This development pattern doesn’t apply further east, where railroads had less influence, or further west, where the terrain and climate didn’t allow for such systematic infrastructure. But maybe we can think of other hierarchies that yield the desired density of towns, whatever that density is.

2 Likes

Thanks for saying this, I agree that it’s the crux of the problem.

At the most fundamental level, I think the classification system should do the same thing that we did with the highway classification: better normalize the density of similar-importance places across the country without completely washing away the fact that some parts of the country are denser than others. So the East Coast should be denser than the interior, but not proportionally denser. I think that’s what would be the most sane for cartography.

I would want Cheyenne, WY (pop. 65K) to appear on national-scale maps and have the highest classification because it’s the most important city in its area.

Similarly, I don’t want Jersey City, NJ (pop. 280K), a suburb of New York City, to appear on a map until I’m looking at a map of metropolitan-area scale.

However, I would want truly twin cities to both show up at lower zooms, for example Minneapolis/St. Paul or Dallas/Fort Worth.

So I agree with others that solely using population thresholds is the wrong way to tackle this problem. Ultimately the place= values ought to be based on a population center’s relative importance in proportion to the importance of other nearby population centers.

And yes, I am totally onboard with completely overhauling the essentially arbitrary classification decisions I made in Rhode Island.

4 Likes

At zoom 8 and below osm-carto uses population to selectively display and to size place=city labels. However, at zoom 9 and above it appears to use only the city, town, village classification. All place=city and place=town are rendered at zoom 9 and up. place=village is introduced at zoom 12.
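
To make those zoom cutoffs easier to scan, here is a small sketch that encodes only what is described above. It is a paraphrase of this post, not the actual osm-carto stylesheet, and the function name and return strings are made up for illustration:

```python
def label_visibility(place, zoom):
    """Summarize whether a place label is drawn at a given zoom level.

    Encodes only the behavior described in the post; anything the post
    does not cover is reported as "unspecified" rather than guessed.
    """
    if place == "city":
        # Zoom 8 and below: selection and sizing depend on population=*.
        return "shown" if zoom >= 9 else "depends on population"
    if place == "town":
        # Low-zoom behavior for towns is not described in the post.
        return "shown" if zoom >= 9 else "unspecified"
    if place == "village":
        # Villages are introduced at zoom 12.
        return "shown" if zoom >= 12 else "hidden"
    return "unspecified"


for place, zoom in [("city", 7), ("town", 9), ("village", 11), ("village", 12)]:
    print(place, zoom, label_visibility(place, zoom))
```

The practical consequence is that from zoom 9 upward the place=* value alone decides what appears, so an inflated place=town shows up three zoom levels earlier than it would as place=village.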

1 Like

In OSM Americana, the place=* value is one of two factors in determining whether to filter out a place at a given zoom level. (The place=* value also affects the icon.) The other factor is an opaque numeric “rank” provided by OpenMapTiles. The rank is an apparent attempt to reduce the density of labels algorithmically based on some combination of the place=* value, population=*, and the length of the name=* (not the localized name, go figure). It’s not a bad idea for renderers to declutter place labels in postprocessing, taking some of the pressure off mappers to do the same job manually, though this particular implementation leaves much to be desired.

2 Likes

An idea that seems somewhat reasonable to me is that maybe it makes sense to say that a Micropolitan Statistical Area usually has a (small) anchor place=city and a Metropolitan Statistical Area has at least one large anchor place=city if not multiple. The anchor “cities” of the Micropolitan areas would certainly include some rather small towns by comparison to the Metropolitan areas. However, this might be a good way to account for variations in development patterns and population density across the country.

That feels intuitively correct to me, looking at it. I think for New England the Micro/Metropolitan NECTAs (available in TIGERweb) provide better results and map more closely to the MSAs for the rest of the country.

1 Like

NECTAs and MSAs (really, any data published by the Census Bureau) can certainly guide us, but shouldn’t “rule” us. Let’s decide for ourselves what the right (statewide, regional-area…) “balances” are. If the Census Bureau’s data somewhat or actually bolsters those, that’s OK, and some confirmation that OSM is on the right track. But let’s not be wed to always agreeing with the Census Bureau, or to faithfully representing its data. It and OSM have different goals and different definitions for what each of us says exists.

I continue to think this is an awesome discussion. I very much agree (and it’s good that primary authors, movers and shakers of Americana as a renderer are here) that “how place names render” really IS an important consideration in all this, despite OSM’s admonishment not to tag for any given renderer. “Better” (with consensus) decisions should come earlier, and the correctly-rendered renderings should come later. We can do this.

1 Like

Setting aside the thresholds, I think it might be wise to rewrite the United States/Tags – Places preamble to more strongly disambiguate the concept of municipalities (generally modeled as boundary relations) and place POIs.

In the global context boundary relations are a high bar to map and a single place=* node is a good first pass, but where boundary relations do exist for municipalities, we should be clearer about the usage of the place=* node. As coverage of municipal boundaries expands across the US we are more able to treat these concepts of settlements/localities and municipalities independently.

I think the situations can be distilled into a few buckets:

  • (1) A municipality is mapped as a boundary relation and the main settlement it covers fills (or mostly fills) its extent.

    Examples:

  • (2) A municipality is mapped as a boundary relation and the settlement[s] it covers are only a small portion of the municipal extent.

    Examples of New England “Town” municipalities:

    • Middlebury, VT (boundary) – The main settlement called “Middlebury” is in the northwest corner of the municipality. A separate settlement of East Middlebury also exists within the municipality, as well as a named hamlet of Farmingdale. Much farmland in between makes for distinct settlements.
    • Royalton, VT (boundary) – The municipality is “Royalton”, but the largest and primary settlement is “South Royalton”. There are smaller settlements of “Royalton” (in the center) and “North Royalton” as well.
  • (3) A municipality is mapped as a boundary relation, but there are either no settlements or no distinctive settlements
    Examples:

    • Lewis, VT (boundary) – Lewis is actually unincorporated and managed by the state as it has no population, but is still a municipal boundary exclusive of the neighboring Towns.
    • Lower Frankfort Township, PA (boundary) – The 1,757 residents are spread out across rolling farmland with no historical center or particularly dense cluster of settlement. There is no village to speak of and the town garage is on a random rural road.
  • (4) A municipality exists, but it hasn’t had its boundary mapped yet.

  • (5) There is a settlement without a municipality strongly associated

    • smaller settlements within a municipality
    • settlements in unincorporated areas.

While in all cases the POI place node represents the settlement rather than the municipality, I do think that in bucket (1) there is a very strong association between them and disambiguating what tags go on the boundary relation versus the POI place node isn’t necessarily obvious.

The second bucket (2) feels much more clear to me – I don’t think the POI place node should have the tags related to the municipality (total municipal population, municipal Wikidata id, etc). Those municipal values should be on the boundary relation and the POI place node should have only those related to the settlement itself (if known).

For bucket (3) I don’t think it makes sense to have any POI place node as there really isn’t any single point that one can show up to and say “I’ve arrived at ____”. Having a POI at the geographic center is a falsehood for labeling which can be done automatically from the boundary extent anyway.

Bucket (4) is probably a compromise where the POI just holds a conglomeration until disambiguation is possible.

Bucket (5), like bucket (2), would have the POI node carry only tags related to the settlement itself.
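
The buckets and the tagging suggestion for each can be sketched as a small lookup. The function name, parameter names, and the three coverage labels are hypothetical and only meant to make the decision order explicit:

```python
# Suggested handling of the place POI node for each bucket described above.
NODE_GUIDANCE = {
    1: "node and boundary closely tied; which tags go where is not obvious",
    2: "settlement tags only; municipal tags go on the boundary relation",
    3: "no place node; label from the boundary extent instead",
    4: "node holds combined tags until the boundary is mapped",
    5: "settlement tags only",
}


def bucket_for(municipality_associated, boundary_mapped, settlement):
    """Pick a bucket number (1-5) for a settlement/municipality pairing.

    municipality_associated: True if a municipality is strongly associated
        with the settlement.
    boundary_mapped: True if that municipality has a boundary relation.
    settlement: "fills" (settlement mostly fills the boundary), "partial"
        (settlements cover a small portion), or "none" (no distinct
        settlement).
    """
    if not municipality_associated:
        return 5
    if not boundary_mapped:
        return 4
    return {"fills": 1, "partial": 2, "none": 3}[settlement]


# Middlebury, VT: mapped boundary, settlements cover only part of it.
bucket = bucket_for(True, True, "partial")
print(bucket, NODE_GUIDANCE[bucket])
```

Checking association first and boundary mapping second mirrors the order in which a mapper would usually find out the facts on the ground.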

8 Likes

This is a great breakdown of different settlement/municipal boundary patterns, @Adam_Franco. Thank you.

Another bucket would be a settlement that extends beyond the core municipal boundary. This can happen in metropolitan areas where all the land separating a city from surrounding towns is developed to the point that they effectively merge together. Sometimes the core municipality absorbs the surrounding towns. Other times it doesn’t. For example the Boston admin boundary represents the municipality, but the Boston place node represents a more general sense of the urban area which probably includes a number of suburbs that aren’t technically within Boston city limits.

Most of the cases I’ve come across are Bucket 1. For those, most have a relation with a boundary way and a place node. There is usually a place tag on the node and a place tag on the relation. (I found one with a place tag on the boundary way also). Perhaps the place tag should only be on the node?

Here, the admin boundary (if there is one), and the place are tightly coupled.
Bucket 2 & 3 don’t seem common around here, but I’m sure we could find some.

Is it OK that this discussion is broader than New England or would you like to keep it focused?

Good question. I think that yes, this discussion is broader than New England, but it is possible that New England is the only part of the country that is completely out of whack due to its history of development and municipalities being named “Town” rather than “Township” in a way that confuses people trying to assign values to the place hierarchy. As well, many New England states have little or no unincorporated area and no county governments, so “Towns” are large and sometimes sparsely-populated administrative areas here.

In other parts of the country with County-level government Towns/Cities/Boroughs were formed by carving out a densely populated part of the County for self-government, leaving the less populated portions outside of the incorporated territory. This makes those towns much more likely to fit the OSM place=town definition than the “New England Town” as they had a dense enough population to bother incorporating.

Using this Overpass rendering of place=* nodes it doesn’t seem that the rest of the country is as miscategorized as New England:

Mid-Atlantic and Southeast seem like they have a reasonable distribution of cities/towns/villages/hamlets:


I don’t have enough local context to know if the Midwest and plains are over-classified or not:


The west looks much more distributed overall, but Wyoming has a bunch of place=town with populations < 100 which may or may not be appropriate given how little else there is out there:


Long story short, New England is a particular problem but we should try to solve the overall classification with an eye toward a national standard.

5 Likes

Absolutely fantastic work here, @Adam_Franco . Thank you for all you are showing us!

Considering the fairly natural distribution of places throughout states with appropriately sized incorporated municipalities, I feel like we could include one or more tags based on population or some other relative measure of size or importance. Adding the second tag would override the zoom at which the normal place value is rendered. This would prevent confusion by shifting importance to another tag instead of artificially changing the incorporation type.

So in the example of New England states, the designated legal incorporation of “town” would appear as the default value of the place tag, though it would be either fully or partially ignored in favor of the importance= value.

Taking a look on Addison County:
Compared to Middlebury, Vergennes looks less important to me based on what I can see on OSM. So I would rather see Middlebury as a city and Vergennes as a town. Wikipedia follows our Carto map, which seems to follow the boundary relation, not the place node.

The relation for Vergennes is tagged as a city, but the place node for Vergennes is a village. :smiley: