Multiple delimited names in the name tag

I’m not strongly against it, but do note that many existing posts intertwine discussion of the disadvantages of “multiple delimited names in the name tag” with suggestions to modify or improve the approach (which should IMHO definitely remain in this thread) and with suggestions for alternative ways to accomplish a similar result (which might indeed benefit from being in a new thread).

Perhaps new replies, at least, should each be split into two messages: one in the new thread commenting on the parts related to default_language-like methods, and one in this thread commenting only on the name-like method. Although I do envision it would be hard to keep such messages usefully cross-referenced if one tries to compare their pros and cons. :slightly_frowning_face:

default_language’s blast radius is too large for any data consumer to use for any use case that involves reuse or caching. Think of all the commotion whenever the coastline breaks and floods the world, or when the Great Lakes dry up, and how long it takes for the Standard layer to recover. Now imagine that multiplied by literally everything in a country at every zoom level. No changeset can cause anywhere near that scale of disruption by modifying individual name tags. Meanwhile, any legitimate change to a default_language tag would require modifying every name tag in the country. This is one of those ideas that sounds great on paper until considering how OSM is produced.

1 Like

I will consider this, but I won’t be able to give this potential creation of a new language default thread any attention for 6 hours or so.

The good news: OSM Americana now supports the semicolon delimiter in name, name:*, and also ref (for things like terminal gate numbers and highway exit numbers).

The bad news: OSM Americana can’t support slashes, dashes, and spaces as delimiters. But just imagine if the places that use these delimiters were to migrate to semicolons.

1 Like

Pompously announcing a really bad idea doesn’t make it less bad. The “other prominent OSM data consumers” are only doing a quick fixup to avoid ugly breakage in the name of being lenient in what they accept, that is not the same as “support”.

As has been pointed out multiple times in this thread, things are not so simple. Often in (proper) multi-lingual regions the -actual- name of the place is composite and is customarily written with a separator.

Please stop trying to rearrange the world according to a naive, CS-driven, concept of normalization.

The idea with semicolons may look neat and clean to a computer programmer but it solves precious little for the name tag. name tags with multiple languages in them are just the very tip of the iceberg when it comes to problematic content. We also have descriptive names, names with extra info, categories in names, names with full route descriptions (any PT route). Each of these has its own particular problems for data users. When you add semicolons to the mix, you just pile yet another format on top of all those that already exist.

If we are looking pragmatically at the situation, then the de facto use of the name tag has for a long time been as the label or display name of the place. That is not written down anywhere because we always strive for a name tag that adheres to the definition in the wiki. However, in reality it is what mappers tend to do (because of the feedback they get from the map) and because it helps to avoid conflicts. Maybe it’s time to just accept that the name tag is one of the ‘human’ tags in OSM, only to be displayed but not interpreted by a computer. As long as we make sure that the necessary data is also available in tags that are machine-readable, that’s a workable compromise.

My personal suggestion here would be to introduce a new tag display_name and get carto to render it in preference to name wherever it now renders name. Then advertise the tag among data users and start slowly moving non-names into the new tag. No mass edits necessary. Just rename tags when you come upon a problematic use. If it takes 10 years to get to a clean name state, that’s fine. No rush.

I certainly don’t think that it is a particularly good idea when a single data user imposes a format for a tag that breaks pretty much everybody else’s map.

3 Likes

That’s fine. If the simulated screenshot would be wrong in any of these places, then by all means the name should stay as is. For the features that are using a semicolon, however, it’s clear that the mapper’s intention was not for the user to see a semicolon. If Americana is avoiding ugly breakage, then you’ve written a better headline than I was able to come up with.

Can you elaborate on what’s broken as a result of Americana or any of these other data consumers (plural) interpreting a semicolon as a value separator? Americana still renders slashes, dashes, and spaces as slashes, dashes, and spaces. If you’re concerned about semicolons getting misinterpreted, so far, I’ve come across only one name in the whole world that properly contains a semicolon in the real world – and it’s escaped as ;;. So if anything, that feature is broken in any data consumer that does not interpret the semicolon as Americana does.
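To make the escaping rule concrete, here is a minimal Python sketch of how a data consumer might split a value on single semicolons while treating ;; as a literal semicolon. The function name split_values is made up for illustration, and this is not Americana’s actual code, just one way the rule could be implemented.

```python
def split_values(value: str) -> list[str]:
    """Split a tag value on single semicolons.

    Hypothetical helper: a doubled semicolon (';;') is treated as an
    escaped literal semicolon rather than as a separator.
    """
    parts = []
    current = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == ";":
            if i + 1 < len(value) and value[i + 1] == ";":
                # Escaped semicolon: keep one literal ';' and skip both characters.
                current.append(";")
                i += 2
                continue
            # Single semicolon: close the current part.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current).strip())
    # Drop empty parts caused by leading, trailing, or repeated separators.
    return [p for p in parts if p]
```

Slashes, dashes, and spaces are left untouched, so a value like Biel/Bienne comes back as a single name.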

This would be grist for a separate topic, about which I’m pretty sure you and I would see eye to eye.

If separating multiple equally primary names with the standard semicolon separator is such a bad idea then it would be helpful to explain why you think that. I don’t particularly want to see further proliferation of multiple names stuffed in one tag in cases where name + alt_name + *_name + name:* would be a better representation. However, with multiple names in the name tag being a common political compromise, I’d much prefer to see the standard semicolon delimiter used in those cases.

1 Like

I’m glad to hear that you recognize the challenge that we face, and I appreciate that you have some concrete ideas to solve the problem. If this thread has shown anything, it’s that a lack of data standardization can cause real problems for real data consumers when alternate tagging schemes are in competition. If the community comes up with a better data modeling solution, I’m confident that the Americana project and the broader US mapping community would adopt it.

In the meantime, with my “maintainer of a community renderer project” hat on, I support the views of my fellow maintainers that supporting semi-colon delimiters is the least bad option available in the face of multiple conflicting methods to solve the same problem. Sitting around and waiting for the community to invent a better solution is inconsistent with the zeitgeist of the community around our renderer. Supporting innovation is an explicit goal of the project, and I expect we will continue to innovate in the future on long-standing challenges in OSM-based cartography. If that philosophy exposes areas where the OSM data model can do better, I consider that a positive outcome.

I recognize the unfortunate situation that rendered names with a semi-colon will look poor when rendered on maps that have not chosen to interpret a semi-colon as a delimiter. Rather than complain about the situation, I hope those on this thread with strong feelings will consider this a call to action to work with the community to solve it properly. I appreciated your response to a question on tagging standards during the recent OSMF election:

The evolution of tagging is a question I consider a core responsibility of the community that should not be decided top-down by the OSMF board. However, it is a topic where the board could give the necessary support to bring the topic forward by organizing a working group. As with the data model, such a working group would need to start with a study that researches the different options of standardization or consolidation of our tagging system, so that the community can have an informed discussion. Only then can we talk about how the OSMF can support a concrete evolution step.

If you were serious about this, and it wasn’t just an offhand statement to mollify the portion of the electorate that feels strongly about tagging standardization, consider this an opportunity to put your suggestion into action.

  • As has been pointed out, places do have composite names (@lonvia touched on other complexities that in the end cause similar issues). While they might be built by concatenating semi-independent strings, the result is still a name in its own right.

  • Turning the previously unstructured name tag into a structured tag is just a tremendously bad idea. It changes the semantics of one of the most used attributes (and @Minh_Nguyen was asking for all punctuation to be converted to semicolons, not just handling the odd misused tag) and will lose information on a big scale.

PS: poster child example Biel/Bienne - Wikipedia

1 Like

The problem right now with non-standardized delimiters is that there is no way to distinguish the case of a “name in its own right” from there being two equally valid but different names used by different local linguistic groups. This is specifically the situation that normalizing on the semicolon separator for equally valid but different names can help distinguish. If the name really is hyphenated, don’t change the hyphen to a semicolon. If the name really isn’t hyphenated in practice but there are two different versions that are equally prominent and valid, then don’t use a hyphen, use a semicolon.

I believe that you are misunderstanding Minh’s comments. If the single name is understood to include hyphens and other punctuation, then they should be kept. The cases where a semicolon should be used are those where local speakers of one language use one name and local speakers of another use a different name and those linguistic groups don’t have a unified understanding that the name should be compounded.

3 Likes

Seems a bit alarmist to say that semicolon-delimited multiple names is breaking anything or that it imposes a tagging scheme on others. Nobody is making anyone change existing ad-hoc delimiters that communities already use.

For those used to locally standardized delimiters in names, consider places where there may not be a standard delimiter. This road in Indiana, USA has two equally important names, both in English. These names are posted on separate signs, so there’s nowhere for a delimiter to go. The US doesn’t really have a precedent for this. Are we supposed to invent a delimiter? If so, why not a semicolon?

3 Likes

Possible alternative if one has reason to preserve the existing non-semicolon separator in the main name tag:

name=Biel/Bienne
name:separated=Biel;Bienne
name:de=Biel
name:fr=Bienne

(name:separated would override name for renderers like OSM Americana.)
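As a rough sketch of how a renderer might consume this, assuming the proposed name:separated key (which is only a suggestion in this thread, not an established tag): use name:separated when present, otherwise fall back to the unmodified name. The function label_lines and the optional preferred_lang reordering are illustrative assumptions, and the ;; escape is ignored for brevity.

```python
from typing import Optional


def label_lines(tags: dict[str, str], preferred_lang: Optional[str] = None) -> list[str]:
    """Return the lines a renderer could draw for a feature's label.

    Hypothetical helper: 'name:separated' is the key proposed in this
    thread, not an established OSM tag.
    """
    separated = tags.get("name:separated")
    if separated is None:
        # No override: show the name exactly as mapped, separator included.
        name = tags.get("name")
        return [name] if name else []

    lines = [part.strip() for part in separated.split(";") if part.strip()]

    # Optionally move the reader's preferred language to the top.
    preferred = tags.get(f"name:{preferred_lang}") if preferred_lang else None
    if preferred in lines:
        lines.remove(preferred)
        lines.insert(0, preferred)
    return lines


tags = {
    "name": "Biel/Bienne",
    "name:separated": "Biel;Bienne",
    "name:de": "Biel",
    "name:fr": "Bienne",
}
print(label_lines(tags, "fr"))
```

A renderer that does not know the key simply keeps showing name=Biel/Bienne.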

So it would do the wrong thing?

1 Like

I’d love it if the Americana team could tone down the rhetoric two
notches from “hey we’ve solved this problem for everybody, now please go
ahead and change all the name tags” to “hey we’ve solved a local issue
we had, we acknowledge our renderer won’t work for everybody because of
that but that’s fine, it doesn’t have to, we’re Americana after all” :wink:

I would also appreciate it if you could recognize that while there might be
renderers who have “not chosen to interpret a semicolon as a delimiter”,
most renderers will have “chosen to not interpret a semicolon as a
delimiter”.

4 Likes

Rendering Biel and Bienne on separate lines, with the user-preferred language on top, is not wrong. (Rendering the official name Biel/Bienne on one line is of course also fine, but the point is to give the renderer more options.)

2 Likes

I asked for no such thing. In this thread, I asked for the community to be aware of the longstanding usage of semicolons as an option, so that data consumers could support it without fear of a backlash. In a way, my request has been denied. :wink: (To clear up any confusion, the word “support” can mean “to know what to do with”, not necessarily “to campaign for”.) Unfortunately, this thread has become so long that folks just now coming into the discussion have probably gotten an overly simplistic view of the situation.

In the other thread I created, I thanked those who have been using a semicolon when appropriate. If expressions of gratitude to mappers are problematic, then let me replace it with a profound apology for being grateful.

You’re right that some may have made this choice. You must know much more about renderer developers’ intentions than I do; I had no idea most have considered and rejected the idea of pretty-printing semicolons.

Incidentally, there’s a new localized renderer on the scene, Tracestrack, which I just found out about from weeklyOSM. Their preferred delimiter? The empty string. It works well enough for Hong Kong, which separates Chinese and English names with a space:

https://twitter.com/tracestrack/status/1592246528152076289

Like Americana, they render the whole world. I can’t get it to show the English name of Milan, but on the bright side, Milano is an English name that happens to be my favorite snack.

The thought process that inevitably leads to this point is:

  1. Users want to see a map in a language they know. OSM has lots of name:* tags for this purpose, so we’ll show those instead of name.
  2. For things like cities, users also want to know the name in the local language, so we’ll get it from name. (And default_language would be a nonstarter, if for no other reason than the potential for massive vandalism.)
  3. Yuck, repeated names all over the place. Deduplicate the names by searching for the preferred-language name in name.

Americana took an additional step in avoiding false positives by requiring the matching duplicate name to be surrounded by semicolons (but not ;;) before removing it from the label. Unfortunately, “Americana avoids false positives” didn’t occur to me as a subject line last night.
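The steps above, plus the semicolon check, could be sketched roughly like this in Python. This is an illustration of the idea, not Americana’s implementation; split_names and build_label are made-up names, and the tag values are examples.

```python
import re


def split_names(value: str) -> list[str]:
    """Split on single semicolons; ';;' is an escaped literal semicolon."""
    # Match a ';' that is neither preceded nor followed by another ';'.
    parts = re.split(r"(?<!;);(?!;)", value)
    return [p.strip().replace(";;", ";") for p in parts if p.strip()]


def build_label(tags: dict[str, str], lang: str) -> list[str]:
    """Preferred-language name first, then the local name(s), deduplicated.

    Hypothetical helper: a local name is dropped only when it is a whole
    semicolon-delimited part equal to the preferred-language name, never
    on a mere substring match.
    """
    preferred = tags.get(f"name:{lang}")
    local_parts = split_names(tags.get("name", ""))
    if not preferred:
        return local_parts
    remaining = [p for p in local_parts if p != preferred]
    return [preferred] + remaining


print(build_label({"name": "Biel;Bienne", "name:fr": "Bienne"}, "fr"))
print(build_label({"name": "Biel/Bienne", "name:fr": "Bienne"}, "fr"))
```

In the second call nothing is removed from Biel/Bienne, because the slash is not treated as a delimiter; that is the false-positive avoidance described above, at the cost of a repeated name.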

That’s all well and good with the semicolons as separators and it may make sense in some cases.

Nevertheless, I find it somewhat selfish from the renderer’s point of view to want more options at all costs and to offer the user a user-specific map with names chosen according to the user’s supposed preference. This ignores the efforts of genuinely bilingual or multilingual regions and their local communities, which consciously use multilingual names in the name tag for very specific reasons. Cities and municipalities in particular often use these multilingual names in a highly official capacity.

If it’s just a matter of making the display of multilingual names on the map “nicer”: well, my computer science studies were 30 years ago, back when I was still programming in Turbo Pascal and C++, and today I don’t know anything about it anymore. But I assume that almost all multilingual names also have the corresponding name:* tags alongside the name tag. This allows the name tag to be broken down into its language-specific components and then reassembled individually, in any desired order and with any desired separator.
I even managed to do this for the example of “Cottbus - Chóśebuz” with a small Excel table (data source: a copy of the place=city node from the editor):
image
The whole thing works without a single semicolon! That is, without a semicolon in the name; the formulas in the Excel sheet do of course need the odd semicolon :wink:
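For anyone who prefers code to spreadsheets, the same idea might look roughly like this in Python. It is only a sketch under the assumption that each component of name appears verbatim as the value of some name:* tag; decompose and reassemble are made-up names, and the tag values below are illustrative rather than copied from the actual node.

```python
def decompose(tags: dict[str, str]) -> list[tuple[str, str]]:
    """Find name:* values that occur verbatim in name, in order of appearance."""
    name = tags.get("name", "")
    found = []
    for key, value in tags.items():
        if key.startswith("name:") and value:
            pos = name.find(value)
            if pos != -1:
                found.append((pos, key[len("name:"):], value))
    # Earliest position first; at the same position prefer the longest value.
    found.sort(key=lambda item: (item[0], -len(item[2]), item[1]))

    components = []
    end = 0
    for pos, lang, value in found:
        if pos >= end:  # skip values that overlap an already accepted component
            components.append((lang, value))
            end = pos + len(value)
    return components


def reassemble(tags: dict[str, str], preferred_lang: str, separator: str = " / ") -> str:
    """Rebuild the label with the preferred language first and a chosen separator."""
    components = decompose(tags)
    # Stable sort: the preferred language moves to the front, the rest keep their order.
    components.sort(key=lambda component: component[0] != preferred_lang)
    return separator.join(value for _, value in components)


tags = {
    "name": "Cottbus - Chóśebuz",
    "name:de": "Cottbus",
    "name:dsb": "Chóśebuz",
}
print(reassemble(tags, "dsb", "\n"))
```

One caveat of this sketch: matching on substrings can go wrong when one language’s name is contained in another’s, or when a name:* value is missing, so it is not a full replacement for an explicit delimiter.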

I have not yet taken into account the French name just added by a colleague. :slight_smile:

Translated with DeepL Translate (free version)

As you didn’t share your Excel file, I would be interested in how you determined the delimiter used in the name tag. How did you account for the possibility that - or / could actually be part of one of the names?

It was mentioned in this discussion that, instead of using a fixed delimiter like ;, you could of course also add a tag defining it, something like name:delimiter=/. So far the conclusion was that ; is already defined as a common delimiter in OSM tags, so it might be better to use it for name as well.
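A minimal sketch of what a consumer would do with such a tag, assuming the hypothetical name:delimiter key from this discussion (it is not an established tag) and falling back to the semicolon when it is absent:

```python
def split_name(tags: dict[str, str], default_delimiter: str = ";") -> list[str]:
    """Split name on the delimiter declared in name:delimiter, if any.

    Hypothetical helper: 'name:delimiter' is only an idea floated in
    this thread, and the ';;' escape is ignored for brevity.
    """
    name = tags.get("name", "")
    delimiter = tags.get("name:delimiter", default_delimiter)
    if not delimiter:
        # An empty delimiter cannot be split on; keep the name whole.
        return [name] if name else []
    return [part.strip() for part in name.split(delimiter) if part.strip()]
```

Either way the consumer needs one extra lookup per feature, which is part of why reusing the existing ; convention looks simpler.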

What do you mean by “wrong thing”? Isn’t it up to the renderer how to display our OSM data? Is it also the wrong thing if I decide to render the ocean red and the buildings blueish?

5 Likes