New England place name inflation

I think this is another point that reflects a general issue, one for which this thread & the Alaska one are throwing up a range of nice examples.

Many district councils in Scotland (although you can go back to older admin entities or urban statistical areas): for instance, Argyll & Bute has no real internal boundaries, but has a small number of obviously recognisable towns (Oban, Dunoon, Helensburgh, Campbeltown, Rothesay), smaller places which may be towns (Lochgilphead, Inveraray, Tobermory, Bowmore), and which are mapped as such.

It’s not uncommon for towns in England to only have a recognisable boundary as a negative: Windsor doesn’t exist administratively, but can be defined by the boundaries of adjacent places (overpass turbo); technically it is an “unparished area”.

I wouldn’t disagree! Perhaps this highlights why key/tag names should be chosen carefully.

Looking yesterday at Plains, GA, where President Carter is living, I also found some county admin polygons with the place tag. This query of Columbus, GA and the area to the south-east shows both node- and polygon-defined places.

I have no problem with defining admin areas with polygons and adding the place tag to them. These admin polygons are useful for navigation tools to locate streets and addresses. A place node is useful simply to control where the name is rendered. And while I have no problem defining the Columbus consolidated city-county of 207,000 people as place=city, it seems to me exaggerated to define the adjacent Chattahoochee County (mostly military area landcover) as place=city. Even worse is Plains, which does not seem to have clear boundaries and is defined in OSM with a circle. I don’t think that the presence of a well-known president in a village is enough to define it as a city and assure a better rendering.

This is true of space-filling concepts like place=state and place=county, but some other place=* values like city and town represent the acknowledged center of a settlement. For example, my city’s place=city node is located at the intersection of the city’s main north–south and west–east streets, the point of origin for the entire city’s address system, whereas the boundary’s centroid is in a residential area about 5 miles (8 km) to the southeast. In the past, some mappers misunderstood the purpose of these features and deleted place nodes in favor of boundary relations, forcing other mappers to restore them manually.

The analogy with Barnstable and Hyannis would be if, say, Downtown Cincinnati was called something special like “Fort Washington” and Fort Washington were much more famous than Cincinnati, to the point that few people would care to see “Cincinnati” labeled on a map of Ohio.

Plains has a clear boundary, and it is very nearly the shape of a circle. The U.S. has lots of circle towns, particularly in Georgia. If only every town were so defined, we could map each one as a point, tag it with a radius in an agreed-upon unit, and call it a day.

…unless the radius is specified in a customary unit like “poles” that has changed in meaning over time. :upside_down_face:

Could we say that the place node is a point of collision between various dimensions of mapping? Montréal the city is also Montréal the island, part of an archipelago and surrounded by two other cities. At zoom 9, the Americana style only shows these 3 cities, and it seems to move the Longueuil label to avoid a collision. But the Carto-CSS main OSM style shows more information, including Ile de Montréal (Montréal island), thus adding the possibility of label collisions in such an urban context. Hence the tendency for some to move the place node in order to better position their town/city label. :melting_face:

Americana should ideally label islands but doesn’t get that data from OpenMapTiles.

And yes, island is another example of place representing more than settlements per se.

I’d like to add my vote as a representative from Maine. I carefully and deliberately did all the place tagging for Maine a few years ago as part of a much-needed municipal boundary clean-up.

This thread seems to be mostly a conversation about renderers inappropriately showing too many “unimportant” names? And then an extended argument about what “regionally important” means and how to measure it?

OSM is a place to record verifiable facts about the real world. “Importance” is an opinion and is as incompatible with OSM as restaurant ratings.

Here’s a collection of what I’m seeing suggested as proxies for importance: population, post office, grocery store, hospital, hardware store…

If your renderer cares about population, then it should be reading the population tags (I added these to all Maine place nodes for this reason). If you want to count the hospitals, then the renderer can search the area for hospitals and add them up to decide how big to make the label.
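
A minimal sketch of what that could look like on the consumer side, assuming place features arrive as plain tag dictionaries. The function name `label_size` and the thresholds are made up for illustration and are not taken from any real map style:

```python
def label_size(tags, default=10):
    """Pick a label font size from the population tag, if present.

    The population thresholds and sizes are arbitrary illustration values.
    """
    raw = str(tags.get("population", ""))
    try:
        # Tolerate thousands separators such as "4,500".
        population = int(raw.replace(",", "").strip())
    except ValueError:
        # Missing or non-numeric population: fall back to the default size.
        return default
    if population >= 100_000:
        return 18
    if population >= 10_000:
        return 14
    if population >= 1_000:
        return 12
    return default


# Hypothetical place node tags.
print(label_size({"place": "city", "population": "207000"}))
```

The point is only that the decision lives in the renderer and reads a verifiable tag, so each style can pick its own cut-offs.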

Adam_Franco wrote:

Searching via Overpass it looks like New England also has a particular problem of place=* being placed on boundary relations.

Yeah, I did that and wasn’t sure if it was correct. I wouldn’t oppose fixing it in Maine as long as there aren’t data consumers expecting it.
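
For anyone who wants to check an extract for this pattern, here is a rough sketch that filters elements shaped like Overpass JSON output (a list of dicts with `type`, `id`, and `tags`). The function name and the sample ids are hypothetical:

```python
def boundary_relations_with_place(elements):
    """Return relations tagged both boundary=administrative and place=*."""
    hits = []
    for el in elements:
        tags = el.get("tags", {})
        if (el.get("type") == "relation"
                and tags.get("boundary") == "administrative"
                and "place" in tags):
            hits.append(el)
    return hits


# Made-up sample data; the ids do not refer to real OSM objects.
sample = [
    {"type": "relation", "id": 1,
     "tags": {"boundary": "administrative", "place": "town"}},
    {"type": "relation", "id": 2,
     "tags": {"boundary": "administrative", "border_type": "town"}},
    {"type": "node", "id": 3, "tags": {"place": "village"}},
]
print([el["id"] for el in boundary_relations_with_place(sample)])
```

Only the first relation is reported; the second carries the designation in border_type=* and the node is an ordinary place node.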

Perhaps we’ve been using this word too loosely. I agree that we should steer clear of assigning value judgments to places. On the other hand, what is cartography if not curation? Around here, we try to facilitate data-driven cartography, but some things end up needing a holistic summation of the facts. The amenities you propose as proxies are still just rules of thumb. Rules of thumb are difficult for data consumers to apply satisfactorily.

I dream of the day when we can stop worrying about road classification in favor of expecting renderers to perform road network analysis, traffic analysis, and machine learning classification to classify the roads themselves. With places, we might need even more powerful chatbot technology. :grimacing:

On the other hand, maybe the lesson from all this back and forth is that we need to be less precise in our classifications. Is it enough to say that a city is the kind of place that would have its own suburbs (in an American sense), that a town is the smallest kind of place centered around a commercial/retail core, and that a village is the smallest kind of place that provides basic community amenities?

Not exactly, although that is one rather obvious result. The real issue is that the words “city”, “town”, and “village” each have multiple meanings, but only one applies to the OSM tags place=city, place=town and place=village. Each of these three words may refer to the class (or government type) of an incorporated municipality (the City of Vergennes, the Town of Brattleboro, the Village of Freeport), or they may refer to a settlement’s size, population, and/or importance decreasing from city to town to village. The meaning of the OSM tags place=city/town/village is the latter, not the former. This is often confused since the legal status of a municipality is easy to look up and often uses one of these words.

An incorporated Village can become large and important enough that place=town is the appropriate tag. An incorporated City can be so small and unimportant that place=village is the appropriate tag. In New England, many incorporated Towns contain only small settlements for which place=village or place=hamlet is the appropriate tag.

I’m suggesting we stop looking for a quantitative proxy for qualitative tags and instead help data consumers transition to using the quantitative tags by making sure those tags (like population) are available.

I understand that “system of government” and “regional importance” are different things. But if we want a proxy for “regional importance”, I think “system of government” is a pretty good one (in Maine).

I feel the same about highway tags where it’s mostly unverifiable opinions. It seems much more appropriate to record the size, surface, speed limit, lanes etc and let a data consumer do math. If that math is hard, I’ve got MEDOT data on “how many cars per day was this road designed to support” which is exactly what a routing/rendering engine wants to know, no need for an opinion or proxy.

Any good map incorporates qualitative choices that don’t boil down to numbers. It pains me to say it, but users expect to see San Francisco more prominent than San José, which is the much larger city. You’ll be hard-pressed to find a rigorous reason as to why, other than arbitrarily choosing a proxy like tallest skyscraper, airline flights per day, or Google N-grams mentions, which would inevitably boost some wrong cities too.

A road’s design volume or design speed? I’ve got an expressway to nowhere to sell you. Annual average daily traffic? Those are some fine primary roads crisscrossing the airport parking lot. But a data consumer can use traffic counts and graph connectivity to refine a rough baseline classification, as Apple and Google have done, respectively. It’s the same with place classification: we should think of the place=* value as a first draft, based on rules of thumb that are as simple as possible, and let data consumers bring in other signals to massage it to taste.

But if we want a proxy for “regional importance”, I think “system of government” is a pretty good one (in Maine).

This would be convenient, but I don’t know if it’s true. Taking a look at Adam_Franco’s post above: New England place name inflation - #37 by Adam_Franco, it’s pretty clear just by eye that the place scale in Maine is dramatically different from practices in both neighboring states like New Hampshire and the rest of the country, in that there are way more towns.

The reason “system of government” is not generally a useful proxy for OSM place tagging is that every jurisdiction uses a different system. So what might appear nice and objective locally leads to wholly inconsistent results with the rest of the world, making the map much less useful in my opinion. For instance, in my home state of California, there is no legal difference between cities and towns, and municipalities are free to call themselves whatever they want. Obviously that’s not a useful criterion for tagging, but I doubt Maine’s unique system of government matches OSM’s worldwide definitions either.

More broadly, one thing that always nags me in these sorts of conversations is a conflation of the concepts of “verifiability” and “objectivity”. place and highway are two prominent examples (there are others, like tracktype, smoothness, etc) of OSM tags that try to encapsulate a spectrum of possibilities with discrete values. Doing so is always going to lead to some ambiguities on the edge cases, making them a bit subjective: reasonable mappers might disagree on some classifications. But I don’t think that makes the existence of a gradient in place importance “unverifiable”, even if it’s not easily quantified and fairly subjective. I also think a “squishy” definition in which some places toggle between values from time to time is more desirable and consistent in the long run than a locally “objective” criterion that doesn’t match any other worldwide use and is therefore arbitrary. The same goes for highway classification: I feel like the equivalent is tagging only by whether something is a state route or by traffic counts. While objective, this ignores the significantly different conditions in different areas: a relatively minor street in a big city will have more daily traffic than a trunk road in the country, for example. So I think we should continue discussing and refining which heuristics work the best in practice for classification, but with the understanding that there will never be a single list of criteria that can be universally applied to substitute for “importance”.

I agree with @willkmis about California having “no legal difference between cities and towns.” In fact, the documented evidence I have noted in United States admin level - OpenStreetMap Wiki describes it the way I have seen it described in a number of places, namely that California cities and towns are “synonymous by law.” However, municipalities are not free to call themselves “whatever they want”: they may call themselves a “city” or they may call themselves a “town,” and these are synonymous by law. A very tiny quibble indeed.

I also very much like and agree with the idea that what we (in the USA) have already entered into OSM regarding place=* tagging is “a good first draft.” Because of TIGER data, Census Bureau updates, many talk-us discussions around 2009-11, 2017 and a “more firm consensus achieved” around 2020, I’d say somewhere around fourth or fifth draft is more accurate (a minor quibble we’ll never really accurately denote). But the idea of an “ever-increasingly-accurate” state of our place=* tagging as an ongoing discussion and fine-tuning of it is a good one. Exactly as here (about New England), these might either start out or end up being (somewhat) region-based, but keeping the entirety of the USA in mind as we do this will continue to bode well for our ability to do this well, and well into the future.

Places and even admin_level=* (as Minh’s Diary entry about Ohio points out) values truly can be “squishy.” What seems to me happens from time to time (as here, about New England) is that different and wider sub-communities come to realize this, have some of this history described (about the somewhat-fragile, but “continuing to hold fast” consensus we have) and begin to nod their heads: it’s a learning curve, not only at the individual level, but at the project-wide level. That’s fine, and I think we’ll continue to “draft” even more “accurate consensus” (a weirdly OSM thing) about these topics well into our future. It only makes for a better map and a better mapping community.

5 posts were merged into an existing topic: Slash-separated Native American names

Hmmm… at the risk of nudging a sleeping bear to see if it’s sleeping or dead, are there any summations/conclusions from this thread & associated discussions on Slack?

Like @willkmis, I was psyched to see @Adam_Franco’s New England place name inflation - #37 by Adam_Franco, but worried that there were 51 posts after it. Even with his seemingly clear description, there were plenty of questions.

I’ve driven @Minh_Nguyen nuts with my thoughts on place=*, but even after our discussions and reading this thread, it seems that it is way too overloaded to be resolved. By overloaded, I mean that it’s a single key people expect and are trying to use to collectively, exhaustively, and hierarchically describe populated places (along with other place=* values) and tell the renderer how to map those, without regard for what these places actually call themselves & contrary to what is objectively verifiable in documentation. And, everything else that doesn’t fit in those buckets is thrown in a catchall-tag like place=locality.

I’m sure this list is incomplete, but here are some of the concepts it seems are being compressed into a single value:

  • legal place type
  • OSM place type
  • actual size - population & area, sometimes bucketed
  • perceived size - population & area, sometimes bucketed
  • something called importance
  • has a boundary or not
  • has some administration or not
  • fits neatly into a hierarchy or not
  • is a representative for a larger entity of some sort (e.g. area/region/MSA)

I know that there’s lots of tooling and consumer expectations built around place=* and admin_level=*, but are there any efforts to attempt alternative models with a little more breathing room and support for the fact that not everything is the same everywhere, so we don’t have to have massive decoder tables like @stevea’s Herculean US Admin Levels and objectively inaccurate place=* values?

Not sure if I should :duck: or :popcorn: here…

The admin_level wiki has always been a consensus of hallucination, tending towards prescriptive (more so than Minh’s Boundaries). Simple things like whether a way can be tagged or must be part of a relation have yet to see full, wide agreement. We’re “full adults” at 19+ years (and growing), yet we squabble with seemingly fundamental disagreements. “Massive decoder table” is a consensus of hallucination. It’s something we’ve hammered together as a loose, effective agreement so far.

Make sense of the data as you will. That these things only approximately chase one another back and forth seems obvious.

That’s a sensible list of confusion, I mean that as a compliment. Drawing boundaries around these is somewhere around “this is where and how people map.” (And have for 19+ years). We do the best we can, and lots think we’re a pretty good map. We have our quirks, yes.

I agree with the sentiment that “we must all be able to make sense of these together.” So, it seems we are alike there. I can make sense of our data, as they are now. I have a sense of how these data came to be this way, as I’ve been along this journey for almost 15 years. So if you were to put these (bulleted items) into some kind of semantic sense together that all works for us while it works for you, I am listening. Right now, I see a pretty good list of things that could be tightened up, that’s a good start.

By all means, if there is something wrong or in need of correction in the wiki (admin_level, Boundaries…) feel free to correct it. Especially in those three New England states, as I state in my update notes to that wiki edit. I do advise this is a minefield, sir. It could be seen as “coming in way too hot.” This is oddly diplomacy. Something like a Talk sprout (new section) in a wiki stating that you’d like to make (a) change(s) is a good, already suggested start. It really must be a sturdy suggestion that might lead to a proposal to change things (depends on sentiment, consensus) and like I say, you have a good start.

Another way to be convincing is to offer evidence of vandalism or outright error that hasn’t been noticed or that nobody has been able to cope with. Do you seem to have a little confusion? Yes, and you make good points. Then, there is the “OSM method” of how we tag. That’s what we have.

@jeffmeyer Having been through this with highway classification (which I consider very successful but long-running) – these discussions are important but insufficient. Somebody has to choose to take ownership by doing things like creating demos and test examples and that kind of thing for people to probe at and form consensus over time. With highway classification, I was actually producing off-database renderings to demonstrate what the proposed new classification would look like to get everyone on the same page. A bunch of the other key players in the American map community were on it, @Adam_Franco hosted a session at State of the Map US in Tucson, etc., etc.

These kinds of harmony efforts are hard, complex, and take a lot of energy. With the highway effort, we were motivated because we were launching Americana and our emphasis on highway distinctions was so prominent that we couldn’t ignore the underlying data issues. To an extent, place has the same issue, but on the rendering side, the problem is much less pronounced since we don’t have the problem of literally disjoint rendering like we did with roads.

So if you see stagnation, it’s because nobody’s grabbed the bull by the horns yet to drag it through to the finish line.

Wranglers welcome.

I’ve run a number of laps here and am gesturing outward with the baton I’ve carried while doing so. Others are indeed quite welcome to spell me and the others who run this marathon, as it is an ongoing effort to “achieve better consensus.” Grabbing of batons and further sprints ahead appreciated.

Edit: Brian’s suggestion as to a kind of “admin_level harmony” (wiki with data with data with wiki) is exactly what the doctor ordered, as it really DID make a difference for highway classifications, though it took a renderer to do that. We have an admin_level renderer that is useful (it’s the French “QA or test here” link in the wiki). Noticeably harder is getting agreement among contributors. We can do it, we have so far (with little rancor) and I think we can continue. It keeps getting polished, it keeps shining, but there is (seemingly) always more polishing to do. This can be exhausting, though I find the rewards of results like we’re beginning to see in New England (because we’ve got appropriate OT queries to check our work, like above) worth the efforts by many. This is most certainly not a one-person (even as a leader, consultant, technologist, expert mapper, political scientist in American Studies…) affair. “It takes a village,” heh.

My read on this thread is that we (continue to) have a consensus that the official designations of places in New England should go in border_type=* rather than place=*, out of a desire for some degree of harmonization across regions. This is not to say that towns aren’t places, just that they don’t fit into the same definition of “place” as the other local-scale place=* values. The discussion quickly turned to alternative criteria for place=* classification. I think most participants here are on board with place=* being based on some fuzzy, holistic criteria, just as in highway classification. That’s the easy part. :wink:

This is a great list, showing the challenge of adhering to a bespoke place classification system. These are all factors that could influence a single holistic place classification scale, but that isn’t to say we must limit ourselves to place=*.

In general, whenever official designations routinely differ from an OSM-centric classification system, the official designation can go in designation or a more specific key. In the U.S., we’ve tended to tag official designations as border_type=* on the boundary relation, because virtually every officially designated place has a boundary. This is more predictable and thus more reusable than conflating, say, New England towns with place=town and expecting data consumers to know the difference between the OSM and New England definitions.

Unfortunately, this requires the boundary to be mapped first, and some countries have official designations for places without definite boundaries. Instead, China pairs place=* with place:CN=* and the Philippines pairs place=* with place:PH=*. These local subkeys also extend to other classification systems: France pairs school=* with school:FR=*. By analogy, we could tag place=municipality place:US-VT=town on a Vermont town node, but a proliferation of hyperlocal subkeys like place:US-CA-Orange-Irvine=village (for Irvine’s “villages”) would be very difficult for editors to support, versus something more unified like place:official=US-CA-Orange-Irvine:village, or simply designation=village.
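
To make the pairing concrete, here is a rough sketch of how a data consumer might look up the official designation separately from the OSM-centric place=* value. It assumes features are plain tag dictionaries; the helper name `official_designation` and the lookup order are my own choices, and place:US-VT=* is the hypothetical key from the paragraph above, not an established one:

```python
def official_designation(tags):
    """Return (key, value) of the first official-designation tag, else None."""
    # Generic keys mentioned in this thread are checked first.
    for key in ("designation", "border_type"):
        if key in tags:
            return key, tags[key]
    # Then local subkeys such as place:CN=* or place:PH=*.
    for key in sorted(tags):
        if key.startswith("place:"):
            return key, tags[key]
    return None


# Hypothetical Vermont town node using the by-analogy tagging.
vermont_town = {"place": "municipality", "place:US-VT": "town"}
print(vermont_town["place"], official_designation(vermont_town))
```

Either way, place=* stays free to carry the holistic classification while the legal status lives in its own key.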

Regardless, in New England, the boundary of a town is much more meaningful than its abstract centroid, hence the hemming and hawing about border_type=*.

In OSM, we have two options for subordinating one administrative area to another, and both have drawbacks:

  • One boundary=administrative relation lies within another and has a numerically higher admin_level=* value. This forces us to adhere to a single linear hierarchy of places, which breaks down in many respects. For example, we have no good answer for how to indicate that a CDP boundary is simultaneously the third-level division of the Navajo Nation and equivalent in rank to a third-level division of the State of Arizona, yet the Navajo Nation outranks Arizona and the CDP is only considered administrative by the Navajo Nation. Maybe is_in:*=* tags make a comeback?

  • One boundary relation is a member of another with the role subarea. This enables us to model multiple inheritance, and it conveniently solves the problem that some countries allow an administrative area to subordinate an administrative area that lies outside its boundary. Unfortunately, since our data model requires a relation to list its members, rather than vice versa, we can quickly end up with monstrous, fragile relations, not to mention reference loops. Another disadvantage is the utter lack of software support.
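
As an illustration of the reference-loop risk in the second option, here is a small sketch that looks for a cycle in subarea membership. It assumes the memberships have already been extracted into a dictionary mapping a relation id to the ids of its subarea members; the function name and the ids are hypothetical:

```python
def find_subarea_loop(subareas):
    """Return one cycle of relation ids as a list, or None if there is none.

    `subareas` maps a relation id to the ids of its `subarea` members.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state = {}
    path = []

    def visit(rel):
        state[rel] = GREY
        path.append(rel)
        for child in subareas.get(rel, ()):
            if state.get(child, WHITE) == GREY:
                # Back edge: child is already on the current path.
                return path[path.index(child):] + [child]
            if state.get(child, WHITE) == WHITE:
                found = visit(child)
                if found:
                    return found
        path.pop()
        state[rel] = BLACK
        return None

    for rel in list(subareas):
        if state.get(rel, WHITE) == WHITE:
            loop = visit(rel)
            if loop:
                return loop
    return None


# Made-up ids: 1 contains 2, 2 contains 3, and 3 wrongly contains 1.
print(find_subarea_loop({1: [2], 2: [3], 3: [1]}))
```

A relation that is a subarea of two different parents (multiple inheritance) is not reported, since that is a legitimate use of the role rather than a loop.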

Settlements (place=isolated_dwelling/locality/hamlet/village/town and place=block/neighbourhood/quarter/suburb/city nodes) can form a sensible, usable hierarchy or two if we decouple them from their boundaries. But there’s no official nationwide classification system, other than maybe something tangentially related to core-based statistical areas, so we run against the limits of trying to be maximally data-driven and objective. For example, we haven’t reached a consensus about the degree to which amenities and services should factor into promoting a place from one classification level to another. But I think this is because we’re too focused on edge cases and not focused enough on the 90% case.

For a bit of background knowledge, in addition to border_type=* and place:CN=* (in China) for “official designation” of a place, there are at least two other keys used in various parts of the world for this same purpose:

  • official_status=* has a draft proposal from 2010 but never had an RFC. Nonetheless it has over 100,000 uses, almost all in Russia or Belarus.

  • admin_title=* had a proposal in 2015 but was rejected. Reasons cited were preference towards designation=*, name:prefix=*, or stating that the administrative title was just part of the official_name=*. Currently this key has about 900 uses, almost all in western Germany.