Multiple delimited names in the name tag

Matija_Nalis · January 3, 2023, 5:37pm

Yes. The most common uses, I think, are either:

a local name, or
a combination of multiple names (often in multiple languages), usually separated by some (semi-random) ASCII separator.

What I would advise against is a “new and improved solution” that caters to the latter. It is a bad idea from several standpoints:

for all the data-centric purists, it is an absolutely horrible idea to intentionally break database normalization. It has been shown time and again, in OSM too, that using separate tags is superior to trying to cram multiple values into one tag separated by ; or whatever (e.g. sidewalk:left=yes is a better solution than sidewalk=left, which becomes highly noticeable as the situation gets more complex, e.g. sidewalk:left=separate). Here, name:hr=* + name:it=* (plus something like “default_language=hr - it” on the polygon of the county where that is recommended) would be a better idea than trying to standardize a specific way in which name=hr_name / it_name would be abused, especially as the situation gets more complicated (e.g. by implementing user preferences).
trying to do it the wrong way around is bound to be much more complex, with a huge number of problematic cases. It is simple to merge name:hr with name:it, but it is hard to reliably extract separate name:hr and name:it from their mix in name (just as it is much easier to mix flour and salt than to extract the salt from the mixture; that is why I use the term “wrong way”).
then there is the separator issue: if one still insists on forcing multiple values into one key (which is frowned upon in most databases), using an ASCII separator like “;” or “/” is probably a bad idea (as it may already be in use elsewhere). Although I would highly discourage trying to stuff multiple values into one key, if one were to go that way, it would probably be better to use dedicated UTF-8 information separator characters for that purpose.
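
To make the “wrong way around” point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names are made up, the “hr - it” format of default_language is just the hypothetical one from above, and the sample values are examples rather than actual OSM data. It shows that composing a label from separate name:* tags is trivial, while splitting a combined name on an ASCII separator breaks as soon as the separator is part of a single name.

```python
def parse_default_language(value):
    """Parse a hypothetical default_language value such as "hr - it"."""
    return [part.strip() for part in value.split(" - ") if part.strip()]


def compose_label(tags, languages, separator=" / "):
    """Merge name:<lang> tags into one label, skipping missing or repeated names."""
    names = []
    for lang in languages:
        value = tags.get(f"name:{lang}")
        if value and value not in names:
            names.append(value)
    # Fall back to the plain name tag when no per-language name exists.
    return separator.join(names) if names else tags.get("name", "")


def naive_split(name, separator="/"):
    """Try to recover individual names from a combined name value."""
    return [part.strip() for part in name.split(separator)]


place = {"name:hr": "Rijeka", "name:it": "Fiume"}
languages = parse_default_language("hr - it")

print(compose_label(place, languages))  # Rijeka / Fiume
print(naive_split("Rijeka / Fiume"))    # ['Rijeka', 'Fiume']
print(naive_split("AC/DC Lane"))        # ['AC', 'DC Lane'] (one name, wrongly split)
```

The merge direction needs no guessing at all, and deduplication falls out for free. The split direction has to guess, and no choice of ASCII separator makes that guess reliable.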

Maybe the renderer can pull in some external comprehensive source of what’s spoken or signposted in every locality

As noted above, it can be specified per-locality polygon.

but a very tempting simpler alternative is to just use name

It is tempting, and that is the problem. It is just as tempting today, when the most popular renderer simply takes name and shows it verbatim because that is simpler to do. The problem is that it is often not what the user wants.

At this point, we don’t even need to know that it’s Italian, just that this seven-character-long name has already appeared earlier in the label.

Oh, I totally agree that this is an unwanted consequence of putting multiple things into one key. It’s just that my suggested solution is not “let’s see how we can better deduplicate the different values put into the name key” but instead “let’s NOT put duplicate information into the name key in the first place”.

Is there an example of an online map (doesn’t have to be OSM-based) that implements such complex fallbacks dynamically based on user preferences alone?

On user preferences alone, no. The map should provide a reasonable default depending on the language. So for my example online desktop map, the first part (a Croatian in Croatia, or a Spaniard in Spain) would remain the same (name / loc_name / official_name / alt_name). The second part (a user in a foreign country) could be approximated by the CLDR matching you mention, and that would be reasonable for a desktop online map (although ideally it would be exposed to the user so they can adjust it to their preferences, as e.g. streetcomplete-mapstyle does for styles).

But that online desktop map behaviour might be quite different from the behaviour the same user wants when on the ground (i.e. I might not speak Chinese at all, so it is unwanted on my desktop, but if I’m in China, I definitely want my mobile app to display Chinese characters too, so I can compare them with traffic signs, for example).

So preferences change even for the same user looking at the same part of the map, depending on what the map is being used for.
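
As a rough illustration of how the same user can need different output depending on context, here is a small Python sketch. It is only a sketch under my own assumptions: pick_names, its parameters and the on_the_ground flag are hypothetical, the local case is simplified to just name (ignoring loc_name / official_name / alt_name), and the sample tag values are examples rather than verified OSM data.

```python
def pick_names(tags, user_langs, local_langs, on_the_ground=False):
    """Return the names to display for one feature, in display order.

    user_langs:    languages the user reads, most preferred first
    local_langs:   languages of the locality (e.g. from its polygon)
    on_the_ground: True when the map is used on site, e.g. in a mobile app
    """
    local_name = tags.get("name")

    # A user who reads a local language simply gets the local name.
    if any(lang in local_langs for lang in user_langs):
        return [local_name] if local_name else []

    # Otherwise use the first name in a language the user reads,
    # falling back to int_name if there is none.
    readable = next(
        (tags[f"name:{lang}"] for lang in user_langs if f"name:{lang}" in tags),
        tags.get("int_name"),
    )
    names = [readable] if readable else []

    # On the ground (or when nothing readable exists) also show the local
    # name, so it can be compared with the signs.
    if local_name and local_name not in names and (on_the_ground or not names):
        names.append(local_name)
    return names


city = {"name": "北京市", "name:en": "Beijing", "name:hr": "Peking"}

print(pick_names(city, ["hr", "en"], ["zh"]))                      # ['Peking']
print(pick_names(city, ["hr", "en"], ["zh"], on_the_ground=True))  # ['Peking', '北京市']
```

Note that this only works cleanly because each name lives in its own tag; with everything mixed into name there would be nothing to select from.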

But you know, OSM Americana is easy to fork – a Croatia-focused style can afford to hard-code some assumptions about its users’ language skills

Sure. But if I’m going to fork and hardcode it, I can go with anything, right? (And I’d probably prefer a mobile offline solution then.) To me, the advantage of OSM Americana seemed to be exactly that it automatically picks up some of the user’s preferences and adapts its display to accommodate them, with reasonable fallbacks.

What I miss in a solution like OSM Americana is more user control over preferences (e.g. in the example above I might choose not to see official_name or loc_name even when in Croatia, just name and alt_name if it exists; or, when using it on the ground, I might want to see int_name as well as the local Chinese name). Yeah, I know it is called “Americana” for a reason and this is not a feature request there; I’m just trying to show what I personally find good and what I would prefer as a direction.

TL;DR: What I thought was interesting in this discussion was the idea of how this catering to user preferences can be improved even further by standardizing on some tags (and maybe one day implemented on osm.org too). It is just that I find standardizing on separator characters in name=* a worse solution (because it seems more prone to misinterpretation, de-normalizes data needlessly, takes a hugely bigger amount of work, etc.) than improving standardization on something like default_language=* or language_format=*.