Multiple delimited names in the name tag

I agree, it’s a fine suggestion, although I would offer a slightly different take. Instead, I would make name the “formatted for display” name (basically what it tends to be today), and introduce a new key called simply names which would be “a semicolon-delimited list of names used locally”.


In this case, they are also tagged with name:left and name:right. But which of these equally prominent names would you propose go in name? Or would you leave it empty and have name:left and name:right only?


But how does that solve the situation where a certain community has one name for a place and another community has another name? Which should be the display_name? Both but formatted nicely? If so, what does that actually solve over the current name tag?

Maybe I’ve misunderstood…

The current name tag is apparently not intended to be machine-parsable. So if you want your map to display names with a consistent delimiter (say, “always a dash” or “always a newline”), rather than whatever arbitrary delimiter the original mapper chose, you can’t. Or if you want to pick just one of the commonly used names, nope, you can’t do that either.

The suggestion being batted around towards the bottom of this thread is that there’s one machine-parsable tag (consistently semicolon-separated), and another non-machine-parsable tag containing the current freeform tagging for those renderers which aren’t capable of parsing a machine-parsable tag.
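To make the "machine-parsable" part concrete, here is a minimal sketch of what a renderer could do with a consistently semicolon-separated value. The function names split_names and render_label are made up for illustration and are not taken from any existing renderer, and the sketch ignores escaping entirely.

```python
def split_names(value: str) -> list[str]:
    """Split a semicolon-delimited tag value into trimmed, non-empty names."""
    return [part.strip() for part in value.split(";") if part.strip()]


def render_label(value: str, delimiter: str = " - ") -> str:
    """Join the individual names with whatever delimiter the map style prefers."""
    return delimiter.join(split_names(value))


# Hypothetical tag value, following the Derry/Londonderry example in this thread.
value = "Derry;Londonderry"

print(render_label(value))          # "always a dash"
print(render_label(value, "\n"))    # "always a newline"
print(split_names(value)[0])        # pick just one of the names
```

With a freeform value such as "Derry / Londonderry", the same style would first have to guess which characters the mapper meant as a separator.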


OK, thanks for clarifying.

Personally, that doesn’t seem like a great idea. It seems like it’s just tag duplication that “kicks the can down the road” a bit. I imagine sometime down the line, map editors will put whatever they want to be displayed in either of those tags - regardless of the intentions behind the two tags.

I don’t think so - currently in OSM we have nothing that explicitly says:

  • What languages are actually used in a particular place.
  • What languages are on the road signs if I’m trying to follow directions there.
  • If there is a locally-accepted names compromise in use (“Derry/Londonderry” or similar)

For example, about the first place name I learnt in Finland was Eteläinen Rautatiekatu**, because I was staying in a hotel there. The street name was signed in both Finnish and Swedish, even though most people there weren’t fluent in Swedish (FWIW the main language used in other parts of Finland is different).

Currently in Helsinki places have “name” matching “name:fi”, which is correct for “asking for directions” but not for “reading the street signs”.

** South Railway Street. When I first went this was still a railway line.

Absolutely. The main reason I favor using the standard semicolon separator for a delimited list of names is that composite names using slashes, dashes, or other concatenating characters can be distinguished from a list of co-equal but distinct names. Whether NameA/NameB is considered a name in its own right or two co-equal names depends on the location, history, and cultural context. In my opinion, a distinct way of tagging each of these situations would be a good thing.

If the name tag had been strictly used for a single primary name up until now and some people wanted to start putting multiple names into the tag using the semicolon separator, I’d agree with you. However, this is not the case. Multiple names in the name tag are a common occurrence. Perhaps your stance is that this is a bad thing and we should be working to reduce this practice? Or perhaps you are saying that it’s a fine practice but it’s better to have a variety of different separator characters than a standard one? I can’t say I follow your reasoning for why a structured name tag is a bad idea.

Currently everybody who uses OSM sees odd slashes, dashes, and semicolons separating lists of names in many places. Perhaps a semicolon looks slightly more awkward, but I’d argue that inconsistently formatted name lists look plenty bad already. Any map style that wants to (as Richard says) not look like ass, is going to try to do something sensible with these lists.

If the argument is that the name tag should only ever contain one name, that seems reasonable to me, but given how long the practice has been tolerated I’m not hopeful about that path forward. Standardizing on the semicolon delimiter for lists of names, and leaving slashes, dashes, etc for truly composite names (those that truly are a distinct name of their own) seems far more achievable to me.


Lennon asked us to imagine a world without countries. I’m settling for decidedly less. :wink:


Hmm. So I agree with all of that. But I’m still missing where exactly display_name comes in. How does that tell us about any of your points more than name does?

Let’s pick a specific example, say Derry/Londonderry, how would you see that tagged differently using display_name compared to just name?

iD does maintain a territory–language mapping based on industry-standard CLDR data. iD only uses it to decide which name:* fields to show first in a very long list of languages, so the stakes are very low. It doesn’t matter much that, for example, Portuguese is listed as a language for Switzerland, because all it means is that Portuguese is easier to find in the list.

I wonder how good of a starting point default_language would actually be for building a more nuanced dataset. I think it would require more research than punctuation-fiddling. For example, are all the labels in Switzerland really only in German? Are all the labels of Morocco really only in Arabic? It’s nice that this key can occur on subnational boundaries such as South Tyrol, where signs are apparently posted only in German and never combined with Italian. But it isn’t nuanced enough for places where the streets are in German but everything else is in English. How is a data consumer to determine that a Chinatown anywhere outside Québec mostly uses Chinese when the Chinatown’s boundary may not be well-defined, let alone verifiable enough to add to OSM where it can be used as a source for this dataset?

These are all rhetorical questions, of course. Everyone knows Switzerland speaks four languages separated by slashes. Far be it from me to question that tagging.

If OSM insists that data consumers take a bring-your-own-languages approach to native name labeling, ignoring name entirely, then I see three possibilities:

  1. Each data consumer builds a slightly different meta-dataset of regional defaults, some even taking advantage of a backdoor for making OSM dependent on proprietary data. Some alternative distributions of OSM data might bring in other names that never work their way back to OSM.
  2. Data consumers who still care about open data turn to Wikidata. Some renderers have been quietly giving priority to Wikidata’s public-domain labels over OSM name:* tags for years, so this is not without precedent. But name remained the one name key that was entirely OSM’s own. I think some data consumers have appreciated the hyperlocal knowledge in name. A hard stance about delimiters could convince some data consumers to take another look at Wikidata’s native label (P1705) property, which is structured as a list.
  3. Data consumers ignore the entreaties in this thread. Nominatim continues to split name on semicolons.

Regardless, mappers may not particularly care where renderers get their labels from, except when their hand-curated choice of languages doesn’t seem to have an effect.


The idea (AIUI) is that it would either be tagged as:

display_name=Derry / Londonderry
name=Derry;Londonderry

Or an alternative proposal:

name=Derry / Londonderry
names=Derry;Londonderry

Amusingly the local vogue appears to be for a tilde, as in Derry~Londonderry. My regex is getting more arcane by the hour.


Where I think display_name could help is with the third of my “in OSM we have nothing that explicitly says” examples, as per Richard’s reply above. A different “known-separator”-separated field could help with either “names on signs” or “names spoken round here”, depending on how we define “names used locally”, per https://community.openstreetmap.org/t/multiple-delimited-names-in-the-name-tag/6803/60 by Brian above.

I don’t see how this solves your concerns about edit wars. Once display_name is approved, mappers will copy the “beautiful” name to display_name and change name to an “ugly” machine-readable value. That means that as long as no renderer supports either the list-style name or display_name, the edit war begins, and whether renderers take any action is not up to us mappers.
If so, the machine-readable tag needs to be a new key.

Then maybe you can explain why using ; as a delimiter worries you so much, while using - or / or ~ is totally fine for you. You haven’t complained about those delimiters in name, which have been commonly used for years.

The only difference here is the substitution of _/_ by ;. The semicolon really seems to be the main issue for many, which I can somewhat understand. Would it be so horrible to make an exception for name tags to use the slash with spaces as separator? Seems like a good consensus to me.

Renderers can then choose to display the full name tag, which is no problem as the slash is already widely used in scenarios like these and humans (generally) understand how to interpret these. Alternatively they can interpret the separator and only display what they think is relevant based on name:lang=* tags.

Some examples:
name=Bruxelles - Brussel → name=Bruxelles / Brussel
(to avoid confusion with names that actually contain a dash, like 's-Hertogenbosch)

name=Biel/Bienne → name=Biel / Bienne + official_name=Biel/Bienne
(add spaces to indicate the separation of the two names, but of course the official name should be saved in its right tag)

name=België / Belgique / Belgien :white_check_mark: :smiley:
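The renderer behaviour described above (interpret the separator, then show only what is relevant based on name:lang=* tags) might look roughly like the sketch below. It assumes the space-padded slash as separator and a plain dict of tags; pick_name and the preferred-language list are hypothetical, not part of any existing renderer.

```python
def pick_name(tags: dict[str, str], preferred: list[str], delimiter: str = " / ") -> str:
    """Return the first part of name that matches name:<lang> for a preferred
    language, falling back to the full name value when nothing matches."""
    full = tags.get("name", "")
    parts = [p.strip() for p in full.split(delimiter) if p.strip()]
    for lang in preferred:
        localized = tags.get(f"name:{lang}")
        if localized in parts:
            return localized
    return full


# Hypothetical tags for the Brussels example.
tags = {
    "name": "Bruxelles / Brussel",
    "name:fr": "Bruxelles",
    "name:nl": "Brussel",
}

print(pick_name(tags, ["nl", "fr"]))  # Dutch part of the name
print(pick_name(tags, ["de"]))        # no match, so the full name is shown
```

Because the split is on the slash with spaces, an unpadded value like Biel/Bienne stays in one piece and is treated as a single name.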

So Biel/Bienne is not “something invented” in favor of “political correctness”? Do people actually call the city that? I would assume some people call it Biel and others call it Bienne. The same goes for Cottbus\Chóśebuz. No one calls it that. German-speaking people call it Cottbus, Sorbian-speaking people call it Chóśebuz. So based on your argument, both names should be removed from the map, as they are just “something invented”?

Slashes frequently occur in individual names of places, airports, train stations, buildings, businesses, and so on:

In principle, we could use any character as a delimiter – even a space. However, there needs to be an established mechanism to escape the delimiter in case it really does appear inside a name. That escape sequence should never validly occur inside a name, or it too needs to be escapable. If the delimiter is a slash, then all the things that legitimately have a slash in their names now need to be escaped. That’s potentially a lot of features, certainly more than one (the number of features with an escaped semicolon today).
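As a rough illustration of the escaping problem, here is a sketch of a splitter that treats a doubled delimiter as a literal one. The doubling convention (;; for a literal semicolon) is assumed here for the sake of the example, and split_escaped is a made-up helper rather than any data consumer's actual code.

```python
def split_escaped(value: str, delimiter: str = ";") -> list[str]:
    """Split on a single-character delimiter, treating a doubled delimiter
    as an escaped literal occurrence (assumed convention)."""
    names: list[str] = []
    current: list[str] = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == delimiter:
            if i + 1 < len(value) and value[i + 1] == delimiter:
                # Doubled delimiter: keep one literal copy and skip both.
                current.append(delimiter)
                i += 2
                continue
            # Single delimiter: the current name is complete.
            names.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    names.append("".join(current).strip())
    return [name for name in names if name]


print(split_escaped("Derry;Londonderry"))      # two names
print(split_escaped("Biel/Bienne", "/"))       # slash as delimiter: split in two
print(split_escaped("Biel//Bienne", "/"))      # escaped slash: one name
```

The last two lines show the cost of choosing the slash: every name that legitimately contains one would have to be stored with a doubled slash to survive the split.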

While it may make some mappers happy because they get to use the slash as if it’s a delimiter, others will be upset about Switzerland’s four names being rendered on multiple lines. And edit wars would be even more likely between mappers who dutifully escape these very common slashes and those who can’t stand seeing // in openstreetmap-carto and other renderers who haven’t even gotten on board with semicolon-splitting in name, let alone slash-unescaping.

If we’re optimizing for not looking ugly in openstreetmap-carto, then @clay_c and @ezekielf earlier suggested ;_ (a semicolon followed by a space) as a compromise. Americana treats this sequence the same as a semicolon, but some editors like iD currently strip out the space as an extraneous character.

That’s why I specifically wrote:

The lack of spaces indicates that it is to be considered part of one name.

However, I’m not set on the slashes (with spaces), I want to find a consensus. I’m not sure if a semicolon is going to do that (although the ;_ already makes it more readable for humans). If it does, I’m perfectly happy with it.


Ah, I missed that part about space padding. This is slightly less common but still common enough to pose a problem I think. As far as I can tell, none of these name=* tags lists multiple values that can be formatted as if they’re independent values:

In some languages, and according to some English style guides, a slash is surrounded by spaces if it separates a compound word from another word – and it retains these spaces if you then embed that combination inside another name. (The same is true of hyphens, although very proper typography would demand the use of an en dash.)

Escaping a space-padded slash could look pretty messy. The escape sequence for ;_ would be ;;_, based on the escape sequence for ;, but so far it hasn’t proven necessary.

You are short circuiting there.

It is completely possible and legit that the written name of something is different from what is colloquially used, that doesn’t make the written form less “used” or less on the ground than the purely spoken form. Actually if that wasn’t the case you would have to zap essentially every single name tag for places in German-speaking Switzerland.


It should be noted that this is a very special case that boils down to self defense and not that the Swiss community is in love with it.

Because there is no preferred language for the country out of the 4 official plus 1 de facto in use (noting that we could have used the Latin formal name too), for a while we tried simply leaving the name tag empty. However, there was no stopping our dear colleagues to the North from “fixing” that, culminating in it being set to the German “Schweiz” because the country’s centroid happens to be in the German-speaking region.
