Framework for aligning New England place nodes to census categories

Right, (imo), “curation rather than data gathering.” That really is how this best evolves. (Again, imo). Thanks, Minh.

And look at what Maine’s libraries do: Urban/Rural Designations: Maine E-rate for Libraries as even they have an “appeals process” for Mainer (Maniac?) input on whether any area is urban or rural. And “there we go again” at noticing that it is both wide-area opinion as well as individual opinion which tussle with each other at where the fuzzy line actually gets drawn.

I’ve been to Orono, Maine (university “town” or “city”?) and while I think the population is over 10,000, is that not a city in this context? Or is town correct? (I don’t want to get lost in a single example, and it was a long time ago I was there, so I’m not a local with local perspectives by a long shot).

I would urge folks to consider broadly the criteria by which we divide population centers into categories at a higher level, and to treat the lists I’m generating here as a thought exercise about places on the margin between place=city and place=town and what factors we think are important in distinguishing them. The goal here is separating the most important population centers from lesser cities.

There seems to be a consensus at least that context matters - a population center in a sparse area gets higher weight than the same population center in a more densely settled region. There also seems to be a consensus that the presence of border_type=city administrative boundaries (i.e. places that are called “city” and have a mayor and so forth) is irrelevant in the decision to assign place=city. After all, some very tiny places (like the city of Palmer, MA, population 12K, which I’ve personally never heard of) are legally cities.

The issue, and the original reason why this discussion is ongoing, is that there is a general sense that we have disproportionately many place=city nodes in New England, given the importance level that place=city has relative to place=town in other areas. That means this is not solved unless we re-categorize some subset of the places tagged place=city as place=town.

As such, we have successfully “curated” way too many cities on the map for the reality of what these population centers are, and curation is useless without criteria, whether objective, subjective, or more likely a combination of both.

It seems to me that, over the years, overclassification east of the Mississippi and underclassification west of it has always stemmed from a naïve overreliance on administrative areas or their population counts. Your experiment also relies on population counts of administrative areas, namely, the populations of cities and towns, counties (aggregated into CBSAs), and states. However, it uses a formula for allowing more city nodes per MSA depending on a state’s population density based on some qualitative “bonus factors”.

As a dabbling map designer, I appreciate that you’ve replicated the label density scrubber found in some GIS tools, which I long for renderers like MapLibre to provide. Unfortunately, there’s no way for any of us to judge whether these factors are versatile enough for OSM, other than the one constraint that “the map” “looks right”. This reduces place=* to a purely presentational attribute, rather than one about a populated place’s function in society. If place classification formerly suffered from “garbage in, garbage out”, this greener recycling process will require more transparency the moment anyone disagrees with its results. I’m not entirely sure we’ll ever be able to hone place classification into a science, but any formula we come up with deserves scrutiny.

In the other threads linked at the top, I’ve floated a half-baked idea for how we could classify places nationally with a minimum of fudge factors and magic numbers. We could restrict place=town/city/suburb to places within Urban Areas. Within a UA, place=city would be subject to a simple test of whether any surrounding places would be considered its suburbs or those of another city within the UA, and perhaps the same would extend to choosing a place=town in a smaller UA. This most likely aligns with the Census Bureau’s practice of titling UAs based on their “high-density nuclei”, except that we wouldn’t reclassify any MCD or directional place name as place=town or city. Palmer, Massachusetts, falls within the Springfield UA and would not be its place=city.

Where I get stuck is how to draw the line between a UA that has at least one place=city at its core and a UA that has only a place=town at its core. The same uncertainty recently caused the Census Bureau to drop its distinction between Urban Clusters and Urbanized Areas, which had been set at a population of 50,000 since 1950. They say we’re now free to categorize UAs based on any population threshold we want. Thanks a lot, Census Bureau!

Maybe we could scale up the old threshold to 109,515, based on the country’s population growth since 1950. That’s right around the mean UA population of 101,536 and the formerly documented place=city cutoff of 100,000. Maybe we set a budget for the number of place=city nodes within a UA, based on the UA’s population divided by that threshold. But I think these arbitrary cutoffs are only useful to the extent that they align with real-world differences in how places function.
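
For anyone who wants to check that arithmetic, here is a minimal Python sketch. It assumes the scaling uses the decennial census resident population counts for 1950 and 2020, and it assumes the budget rounds down, which nothing above actually specifies. The function names are purely illustrative.

```python
# Decennial census resident population of the United States.
# Using these two counts for the scaling is an assumption.
US_POP_1950 = 151_325_798
US_POP_2020 = 331_449_281

OLD_THRESHOLD = 50_000  # Urbanized Area cutoff in use since 1950


def scaled_threshold(old=OLD_THRESHOLD, pop_then=US_POP_1950, pop_now=US_POP_2020):
    """Scale the old cutoff by national population growth."""
    return round(old * pop_now / pop_then)


def city_budget(ua_population, threshold):
    """Hypothetical budget of place=city nodes for one Urban Area.

    Rounding down is an assumption; the proposal does not say how to round.
    """
    return ua_population // threshold


threshold = scaled_threshold()
print("scaled threshold:", threshold)

# Populations taken from the Urban Area tables in this thread.
for name, population in [
    ("Worcester, MA–CT", 482_085),
    ("Burlington, VT", 118_032),
    ("Pittsfield, MA", 50_720),
]:
    print(name, city_budget(population, threshold))
```

Under these assumptions a UA below the threshold gets a budget of zero, so the budget and the city-or-town cutoff are really the same knob.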

There have also been concerns that some sparsely populated regions of the country would go blank if we rely on UAs, which like CBSAs require a certain minimum housing density. Nome, Alaska, would be relegated to a village unless we come up with some exception for it. In general, though, I don’t think our goal should really be to pad out the map artificially. A stylesheet should pull in place=village and rely on symbol collision if it needs to maintain an even label density everywhere.

Perhaps we could also consider how Natural Earth classifies places at its three scales. I would rather leave the subjective curation to them and focus on value that we can add independently as a data-driven project, but many renderers mash up Natural Earth at low zoom levels and OSM at high zoom levels, so some degree of alignment may benefit the broader ecosystem. As well, some consistency between regions would benefit our users. Whether it’s our methodology or the resulting density that is consistent, predictability will encourage more data consumers to make more thorough use of our data.

Mmmm, sausage! Seriously, I like how we certainly get right to it!

In the USA, for about 15 years, we’ve been tweaking these. They work best when they run as Brian says with a relative magnitude sense: (each of which is suffixed with “around here” which is deliberately squishy as to how far out, but really means something well- and widely-understood in a more-local context) large, medium, small, tiny. This is their place.

Then, there is their admin_level and boundary and so on, which while I know is related in some sense we have as mappers to map correctly, has serious overlap and blur with the concept and actual key of place in many minds. I keep saying that, we keep saying that, these are choppy waters like that around here sometimes.

It seems like we both want to and can and do but fully haven’t in some places all of the above. It gets better with admin_level, and some smoothing of place is underway.

It’s true that how place names render on a map has a fair amount to do with this. That’s why our brains visually trigger with the sense of relative big, medium, small, tiny -ness. We see our sense of place around us and that can be powerful. In some sense, we are hallucinating into existence our own sense of togetherness and community as we do this. It’s freakin’ awesome to watch.

Below is the list of Urban Areas in Massachusetts with population listed. There’s also a few other columns in the raw data like land and water area and population density. I included UAs that have the primary city in another state but include “MA” in the descriptive name.

Name Population
Boston, MA–NH 4,382,009
Providence, RI–MA 1,285,806
Worcester, MA–CT 482,085
Springfield, MA–CT 442,145
Barnstable Town, MA 303,269
Nashua, NH–MA 242,984
New Bedford, MA 155,491
Leominster–Fitchburg, MA 111,790
Amherst Town–Northampton–Easthampton Town, MA 90,570
Pittsfield, MA 50,720
North Adams, MA 25,432
Greenfield, MA 22,294
Southbridge Town, MA 20,789
Vineyard Haven–Edgartown–Oak Bluffs, MA 14,064
Athol, MA 13,557
Nantucket, MA 12,011
Ipswich, MA 9,380
Spencer, MA 8,196
Lee, MA 8,119
Ware, MA 5,662
Provincetown, MA 5,698
Sunderland–South Deerfield, MA 5,048
Winchendon, MA 4,866
Pepperell, MA 6,103

If I were curating this, I would snap the city/town threshold between New Bedford and Leominster/Fitchburg (it’s pronounced Lemminstah in case you’re wondering). I consider Leominster/Fitchburg to be more like two sprawled-out towns that just happen to cover enough territory that they managed to collect a decent number of people. But it’s not an urban center at all, in the way that Fall River or Burlington, VT are (to pick examples with similar population).

I would also exclude Barnstable Town for all the reasons discussed elsewhere and for the fact that it’s basically just a sprawled suburban area on Cape Cod.

I would also include Fall River, MA, which has a population of 94,000 and is included in the Providence, RI-MA UA and has a significant urban center with urban character. Excluding cities in other states, that would leave Massachusetts with five cities - Boston, Worcester, Springfield, New Bedford, and Fall River.

Some of the smallest cities on that list may well even qualify as place=village. I’ve never even heard of Sunderland.

Moving north to Vermont, here’s the table:

Name Population
Burlington, VT 118,032
Lebanon, NH–VT 30,299
Barre–Montpelier, VT 20,014
Rutland, VT 19,550
Bennington, VT 13,759
St. Albans, VT 11,368
Brattleboro, VT 10,285
Milton, VT 6,417
Middlebury, VT 6,154
Springfield, VT 5,140
St. Johnsbury, VT 4,883
Bellows Falls, VT–NH 3,978

On this list, I would include only Burlington as a city, and I would even make Montpelier a place=town despite being the state capital.

Now, moving onto Maine:

Name Population
Portland, ME 205,356
Dover–Rochester, NH–ME 72,391
Portsmouth, NH–ME 95,090
Bangor, ME 61,539
Lewiston, ME 60,743
Brunswick, ME 31,361
Augusta, ME 24,005
Waterville, ME 25,529
Sanford, ME 15,067
North Windham, ME 10,271
Rockland, ME 9,868
Camden, ME 4,660
Skowhegan, ME 4,795
South Paris, ME 4,371
Houlton, ME 4,281
Rumford, ME 5,585
South Berwick, ME–NH 5,584
Presque Isle, ME 5,361
Millinocket, ME 3,812
Belfast, ME 3,754
Boothbay Harbor, ME 3,067

Of this list, I would likely only include Portland, Portsmouth (NH), Bangor and mayyyybe Lewiston. I hesitate on Lewiston based on the utter lack of notability outside of Maine (based on the very hand-wavy concept of, if you asked a New Englander outside of Maine to name cities in Maine, most people won’t come up with it). But with a population of 60K in spacious Maine (the UA includes adjacent Auburn), it’s hard to argue against it.

If we’re excluding Montpelier, there’s a good argument for excluding Augusta on similar grounds (motto: “at least we’ve technically got more people than Montpelier”).

I would also exclude Dover/Rochester (which are both in NH but include a little Maine). It’s high on the list but if you take a look at a map, the census bureau has combined two distinct spread-out areas with a big enough lasso around that they managed to combine a decent population count. But a significant population center it is not. I also assess those places to have very low name recognition outside of New Hampshire.

This leaves Maine with three cities: Portland, Bangor, and Lewiston.

This would validate the threshold of around 100,000 to 110,000 that I floated above. If we want to apply the same framework nationwide, we’d need more gut checks along these lines.
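
A quick sketch of that gut check in Python, using a subset of the Urban Area populations from the tables above. The two cutoffs tried here are just the ends of the range I floated, and the function name is made up for illustration.

```python
# Populations copied from the Urban Area tables in this thread (a subset).
URBAN_AREAS = {
    "Boston, MA–NH": 4_382_009,
    "Providence, RI–MA": 1_285_806,
    "Worcester, MA–CT": 482_085,
    "Springfield, MA–CT": 442_145,
    "Barnstable Town, MA": 303_269,
    "Nashua, NH–MA": 242_984,
    "New Bedford, MA": 155_491,
    "Leominster–Fitchburg, MA": 111_790,
    "Amherst Town–Northampton–Easthampton Town, MA": 90_570,
    "Pittsfield, MA": 50_720,
    "Burlington, VT": 118_032,
    "Lebanon, NH–VT": 30_299,
    "Portland, ME": 205_356,
    "Portsmouth, NH–ME": 95_090,
    "Dover–Rochester, NH–ME": 72_391,
    "Bangor, ME": 61_539,
    "Lewiston, ME": 60_743,
}


def candidates(urban_areas, threshold):
    """Names of UAs at or above the threshold, largest first."""
    ranked = sorted(urban_areas.items(), key=lambda item: item[1], reverse=True)
    return [name for name, population in ranked if population >= threshold]


for threshold in (100_000, 110_000):
    print(threshold, candidates(URBAN_AREAS, threshold))
```

Both cutoffs select the same ten UAs from this subset. Barnstable Town and Leominster–Fitchburg still pass on population alone, so excluding them has to rest on other arguments, while Bangor and Lewiston fall well short of either cutoff.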

In order to choose which titled places within the Leominster–Fitchburg UA would qualify for place=city, we would essentially have to split the UA in two. It certainly wouldn’t make sense to double-count the entire UA’s population by classifying both places as city. Roughly halving the UA’s population would likely cause it to fall well below the threshold for any city within the UA, matching your expectations.

Likewise, if we roughly halve the Sunderland–South Deerfield UA’s population, it would easily fall below the threshold for a place=village within the UA.
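
The “roughly halving” step could be sketched like this in Python. The even split is only a stand-in for a real apportionment, and the parsing assumes UA titles of the form “Place–Place, ST” with an en dash between places. Both helper names are hypothetical.

```python
def titled_places(ua_name):
    """Return the place names in a UA title, dropping the state suffix.

    Assumes the title looks like "Place–Place, ST" or "Place, ST–ST".
    """
    places, _, _states = ua_name.rpartition(", ")
    return places.split("–")  # titled places are joined with an en dash


def even_share(ua_name, ua_population):
    """Split the UA population evenly among its titled places (an assumption)."""
    places = titled_places(ua_name)
    share = ua_population / len(places)
    return {place: share for place in places}


# Populations taken from the Massachusetts table in this thread.
print(even_share("Leominster–Fitchburg, MA", 111_790))
print(even_share("Sunderland–South Deerfield, MA", 5_048))
print(even_share("Boston, MA–NH", 4_382_009))
```

With an even split, Leominster and Fitchburg each land at about 55,900, well under a cutoff of 100,000, while a single-titled UA like Boston keeps its full population.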

Other double- and triple-barreled UAs may be less obvious just judging by title or appearance. Of course, it would be great if we could more precisely divvy up the UA’s population rather than “roughly halving” it. The process for defining a UA involves creating urban area agglomerations that probably correspond to lobes like the ones above, but I don’t think they publish those intermediate geographies.

Did you find my “would have suburbs around it” test useful in identifying Fall River as a place=city within that UA? If not, we could look for a more rigorous standard. Through the 2000 census, the Census Bureau designated central places within Urban Clusters and Urbanized Areas. They dropped these lists in 2010, considering them redundant to the principal cities in CBSAs. Should we reintroduce the previous central place concept or adopt CBSA principal cities to supplement the places in the UA’s title? Previously, I found that principal cities tended to overpopulate some of the larger MSAs with too many place=city contenders, 19 in the Los Angeles–Long Beach–Anaheim MSA alone.

Barnstable Town is an MCD. By the rules for naming UAs, this means the Census Bureau couldn’t find within the UA a specific populated place with at least 2,500 people, or even a CDP. On that basis, we can say that no place within Barnstable Town should be a place=city, even if we set the floor for a city-based UA at 100,000 or so.

Do you feel strongly about also including Portsmouth and Bangor as place=city? The Dover–Rochester UA’s population would be split between the two cities, so there’s no problem denying both that coveted city status. But Bangor would really blow a hole in any threshold around 100,000. Bangor would also frustrate other heuristics that have been proposed in the past, like the presence of an international airport, so maybe there’s just something special about it that shouldn’t sway a regional or national classification standard.

For some of the low hanging fruit in Maine, I think Rockland and Westbrook being towns instead of cities would be pretty close to unanimously supported.

This feels right to me as a Vermont resident.

Metro Burlington is the only one of these that feels like it is big enough that if you ate out at a restaurant once per week you would never be able to visit them all (including fast food establishments) once turnover is considered. The rest of these have a few dozen restaurants, maybe 100 at most.

Around a population of 100,000 or more (for the Urban Area, not just municipal boundary) seems reasonable to me. One place=city in Vermont (Burlington) makes sense. However, I have heard voices arguing that some regionally significant medium-sized places like Concord, NH or Lewiston, ME should qualify for place=city. Seems like we have some consensus building to do on exactly what we are aiming for the place=city tag to represent. Should it be only for the most significant, densely populated, urban centers in a region? Or should it be for any regionally significant urban center including smaller ones in more sparsely populated regions? I’m ok with either definition but if we don’t establish a consensus on this point I think we’ll continue to talk past each other about what the qualifying factors should be.

This is to say: There is some concept of scale of place that a single person can know intimately and keep track of themselves as new buildings are constructed and businesses open and close. I feel like a “town” is at least potentially knowable by someone regularly out in it and exploring it.

In contrast, a person in a large “city” may only be able to intimately know several distinct neighborhoods that they frequent while other neighborhoods ebb and flow out of sight.

There is certainly a fuzzy threshold between a small city and a large town, but thinking of scale in this knowability way feels like the ~100,000 UA population is about right for this kind of distinction between town and city.

First, a disclaimer: I am not a native New Englander, so feel free to take my opinion with a grain of salt, or disregard it. But I have visited quite a few times, so I consider myself moderately familiar with the region.

It seems like this discussion is mostly focusing on the city vs town distinction among distinct urban cores. One thing I haven’t seen mentioned is the status of somewhere like Cambridge, MA, currently tagged as place=city. The administrative city has a population of 120,000, making it the 4th largest municipality in Massachusetts. Cambridge would certainly be described as being “in the Boston area”, but it’s certainly no sleepy bedroom community where all residents commute into Boston: it has its own economic engines, major destinations, business districts, and (from what I can tell) a distinct sense of community identity. When I visited a few months ago, it certainly seemed sufficiently large as to feel ‘cosmopolitan’, like you could be unaware of what was going on on the other side of the town, but that’s hard to say from the outside. I also suspect that many folks living in surrounding smaller places like Arlington or Watertown commute to Cambridge, not Boston. It’s certainly notable in the sense of “people from outside of the region have heard of it”, probably more so than somewhere like New Bedford or even Springfield IMO, although places with major universities are always going to be outliers with that sort of thing. Cambridge is listed second in the MSA and NECTA names after Boston.

I guess I would find it helpful to delineate what the consensus is on places like Cambridge, and why. Is it impossible for it to be a place=city because it is across a river from another place=city thought of as more prominent/that has 5x the municipal population? What would allow the urban area to have multiple place=city places? Is there some other factor that finds Cambridge to have insufficient amenities to be considered a city? To avoid seeming like I’m just playing devil’s advocate, I’d say that I would probably tag Cambridge or somewhere like it as place=city. But my read is that others do not share this opinion, so I’m interested to hear arguments that could change my mind.

Then there are other places, also currently tagged city, that have slightly lower municipal populations (but all >100k) but are also further from central Boston while remaining in the census Urban Area, like Quincy, Lynn, and Lowell. Are all of these clearly place=town, on par in importance with an isolated place with 10,000 people? Or is there some combination of population x distance that allows for one to be a city? Does there have to be undeveloped land as you go out from Boston for it to be one? Some other factor determining them to be center-like in character, rather than fully subordinate to Boston? I’m less personally familiar with these places, so I don’t have a strong stance, but they seem like they could reasonably be place=city to me.

I’ve spent significant time in Cambridge earlier in my life, and I’d put it as an obvious place=suburb. It’s effectively part of Boston, is on the subway lines, and lacks an independent urban core.

If tomorrow, Boston annexed Cambridge, we wouldn’t even be having this discussion – we’d all agree that Cambridge is just a neighborhood in Boston that used to be its own municipality.

Interesting! I think this sort of tagging could make sense, but it’d be a pretty radical change in mapping practice to map independent incorporated areas as place=suburb. As far as I can tell, in the US that tag is only used to map regions within incorporated cities, essentially large neighborhoods. Is there anywhere in the world that maps places outside of the main city’s municipal boundaries as place=suburb? I guess this is the logical endpoint of divorcing the admin_level/municipal classification of a place from its place=* value though!

I think this is still up in the air. If we’re tracking my idea to base classification on Urban Areas, we’ve only gotten as far as possibly determining whether the UA should have a place=city in it, which would obviously be true for the Boston UA. But we’re less clear on how many places within the UA get that distinction, and which ones besides the primary place named in the title.

If we set a budget for the number of cities based on how many times over the UA exceeds the 100,000 threshold, then there would certainly be room for Cambridge. Why is it that Cambridge doesn’t appear in the UA’s title? Apparently it fails both tests for inclusion as a secondary name: it has only 54,000 housing units, less than two-thirds of Boston’s 302,000. After all, the carveout for secondary names is intended for close ties, not also-rans.

I don’t have as much local knowledge as you, but my impression is that it might be possible to relegate Cambridge to the same classification as more obscure places within the same UA. At the scales where Cambridge becomes especially relevant as a city, a data consumer is less likely to make distinctions between city and town or even use place=* points at all. These are scales where place=suburb starts to become relevant.

If this poses a problem, maybe we could consider its primary position in the Cambridge–Newton–Framingham metropolitan division. I’m just wary of bringing in anything from the CBSA hierarchy, because it’s based on administrative area populations that include rural populations. That problem doesn’t show up in this case, because Cambridge is in the middle of such a “thickly settled” area, as those lovely MassDOT signs say.

I’m not sure that I really thought about suburbs, but there’s certainly a string of suburbs between Fall River and surrounding cities (mainly Providence and New Bedford). If your urban core is big enough, suburbs just naturally follow as population decays outward. The suburb test is more viable in the most built-up regions and I’m not sure it’s a good test as a general rule otherwise.

In the context of Maine, Portsmouth is significant, but in the context of New England and the greater Boston area, I have to concede that it’s approximately a Newport, RI in scale. I excluded Newport from my lists because it’s overshadowed by Fall River and Providence, which are of a larger scale to the point where Newport is clearly in a lower tier.

I included Bangor on the basis of name recognition and I have to admit I’ve never actually pulled off the highway to visit the place. My logic in including it is probably giving it a rural bonus, in the same way I’d include Fairbanks or Nome based on relative importance. I could be persuaded either way.

I’ve been avoiding and somewhat dreading this question. In general, we probably shouldn’t apply place=suburb to the American definition of “suburb”, because globally this tag is meant for inner-city divisions, such as La Défense in Paris. Even most inner suburbs in the U.S. couldn’t be characterized as inner-city. But the calculus may be different in New England, where municipal boundaries don’t even attempt to approximate settlement patterns.

The sense I get is that the usage of place=suburb only for named places within the boundary of a larger incorporated city originates from regions where city boundaries are expanded in a logical manner as urban areas grow. In some US metro areas this happens, but others annex areas in a more haphazard fashion. Here you can see how Boston has annexed a number of towns to the south but not so many to the north. To the west, Brighton has been annexed despite its tenuous connection, while Brookline, Cambridge, and Somerville remain independent. To me it seems odd to say that these three can’t be place=suburb because they are independently governed. Functionally I think they are just as much of suburbs as Brighton is.

Some cities like Columbus, OH have even expanded fully around small municipalities that remain independent. I can’t imagine that locals really consider the enclaves significantly more distinct from the main city than other annexed towns right nearby.

Basically I’m in favor of being a bit flexible with the usage of place=suburb and allowing it to be used for cases where it would seem sensible for a municipality to be annexed to the larger city but for whatever reason that hasn’t happened.

Oh interesting it appears that La Défense is technically outside of the Paris admin boundary (edit: or is it this one, or this one :woozy_face:). Maybe the European and North American concept of “suburb” is not as different as we’ve been thinking. Or maybe I’ve misunderstood something here.

Depending on size, some enclaves could be good candidates for suburb and others might not be. In states like Ohio, cities have historically had considerable leeway to annex surrounding territory, often by coercion, even at times by fiat without the consent of its inhabitants. In Columbus, many of the city’s outlying “tentacles” and even some of the well-connected parts have a classically suburban character. In other words, in a European model, these areas would be considered to be on the outskirts or beyond, even within the city limits. But it’s difficult to compare a sprawling American city to older European cities.

Maybe I chose a bad example. It was just the first example of a European new city center that popped into my head.

Distinguishing between inner cities and inner suburbs is complex and requires us to consider history and socioeconomics. It isn’t something you can necessarily draw up rules about. That’s why I was hoping to skirt the issue for a while.

Yes, this gets to the heart of it. Ignoring our American English ideas of what a “suburb” is, the argument that Cambridge should be place=suburb is essentially that its role in the Boston area is similar to that of Brighton, or other suburbs like South Boston or Back Bay: major regions within the urbanized area.

I think this is a reasonable argument from an urban planning perspective! Certainly, as Brian put it above, if Boston annexed Cambridge tomorrow we wouldn’t be having the discussion. But I’m not sure it conforms to the prevailing definitions of the place= tags, or maybe more importantly, what any user/consumer is expecting. Would an area map of Boston or Massachusetts really show Cambridge, South Boston, Quincy, and Roxbury the same way? Maybe it would, but I kind of doubt it? At some point, I think the fact that it’s its own incorporated place might make a difference in people’s conception of the place, and so it might be a difference worth preserving in tagging.