I apologize in advance for the length, but several blue-sky ideas have been floated so far that I think could benefit from concrete counterexamples.
Any delimiter you like
This is very meta. Pretty soon there will be a need for `default_name_separator=;` because no one would want to blanket a multilingual region in the same `name_separator=;` tag over and over again. Nothing would know how to interpret this key today; maybe in another few years’ time?
Meanwhile, the technology already exists to understand what `;` means when it occurs in a `name` tag. If some local communities prefer not to use it for now, that’s their prerogative. But for those communities that are already quietly using a semicolon, it shouldn’t be necessary for them to explicitly indicate that they want `name` to work just like any other key in OSM, such as `destination` (which already takes multiple names in the local language).
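To make that concrete, here is a minimal sketch of how a data consumer might treat a semicolon-delimited name like any other multi-valued key. The function name `split_names` and the sample tag value are my own illustrations, not taken from any existing tool or from actual OSM data, and the sketch deliberately ignores edge cases such as a literal semicolon inside a name.

```python
def split_names(value: str) -> list[str]:
    """Split a semicolon-delimited tag value into individual names.

    Simplifying assumption: every semicolon is a separator, and
    whitespace around each name is not significant.
    """
    return [part.strip() for part in value.split(";") if part.strip()]


# Hypothetical tagging for one of the Houston streets described below.
tags = {"name": "Turtlewood Drive;Ngụy Văn Thà"}
print(split_names(tags["name"]))
```

A value without a semicolon comes back as a single-item list, so the same code path covers monolingual and multilingual features alike.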
Any order you like
This strikes me as an oversimplification of the reality on the ground. Maybe someday someone will figure out how to use `default_language` in regions that have a strict system of multilingual names, but some places are just more complex.
To show you where I’m coming from, here are some pretty typical examples from places I’ve visited in the U.S. I’m very curious how folks think `default_language` would help us determine either a standard name order or a standard separator other than the semicolon that all data consumers already handle in some fashion and some already handle elegantly.
When the City of Houston turned streets such as Turtlewood Drive and Bellaire Boulevard into “Turtlewood Drive Ngụy Văn Thà” and “Đại Lộ Sàigòn Bellaire Boulevard”, respectively, they just stuck the Vietnamese-language signs wherever there was enough room for an additional sign:
Some of the English signs are so faded that an English-speaking traveler may need to rely on the Vietnamese signs in some cases. A map could show them something like “Turtlewood Dr. / Ngụy Văn Thà” or “Turtlewood Drive — Ngụy Văn Thà” or “Turtlewood Dr. (Ngụy Văn Thà)” or “Turtlewood Dr.” above the street and “Ngụy Văn Thà” below it. The specific delimiter here only matters to the map style designer. The order of the names in `name` doesn’t matter much either, because the preferred-language name will come first in any savvy map style or navigation guidance instruction.
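Here is a rough sketch of what I mean by the renderer, not the mapper, deciding on order and delimiter. The helper names (`display_names`, `render_label`), the `name:en`/`name:vi` values, and the default delimiter are all assumptions for illustration; a real map style would do this in its own expression language and would probably handle abbreviations too.

```python
def display_names(tags: dict[str, str], preferred_lang: str) -> list[str]:
    """Return the names from `name`, with the user's preferred language first."""
    names = [n.strip() for n in tags.get("name", "").split(";") if n.strip()]
    preferred = tags.get(f"name:{preferred_lang}")
    if preferred in names:
        # Move the preferred-language name to the front; keep the rest in tag order.
        names.remove(preferred)
        names.insert(0, preferred)
    return names


def render_label(tags: dict[str, str], preferred_lang: str, delimiter: str = " / ") -> str:
    """Join the ordered names with whatever delimiter the style designer picked."""
    return delimiter.join(display_names(tags, preferred_lang))


# Hypothetical tagging for the street discussed above.
tags = {
    "name": "Turtlewood Drive;Ngụy Văn Thà",
    "name:en": "Turtlewood Drive",
    "name:vi": "Ngụy Văn Thà",
}
print(render_label(tags, "en"))  # English first
print(render_label(tags, "vi"))  # Vietnamese first
```

The stored order only serves as a fallback for users whose preferred language isn't tagged on the feature.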
The city only dual-named the through streets in this neighborhood, but other things are named differently. Turn 90 degrees clockwise and you’ll see a restaurant whose name is signposted in interleaved English, Chinese, and Vietnamese above some shops that are in English only or Vietnamese only:
The San Francisco Bay Area, where I live, happens to be very linguistically diverse. Many places of worship around me offer services in multiple languages and make every effort to unify their congregations despite a language barrier. This Jehovah’s Witness Kingdom Hall serves both English and Spanish speakers on equal terms. The placement of English to the left of Spanish is purely coincidental. As far as I know, they don’t have a preferred delimiter either.
This supermarket has two signs visible from the street. The logo on the sign in the foreground puts Korean on top of English, while the sign on the façade puts English to the left of Korean:
This doctor’s office posts its English name above and to the left of its Vietnamese name, but I think he mostly serves Vietnamese-speaking patients:
And let’s not forget that sometimes a feature can have multiple names regardless of language. Before it moved earlier this year, this flag and costume store was either “Funhouse/Flaghouse” or “Flaghouse/Funhouse”, depending on whether you looked at the sign on the front or the rear. (Customers typically parked in front and entered around back.) If I recall correctly, the receipt had both names printed on it, separated by the delimiter “***”.
Any language you like
I agree that user preferences matter a lot for rendering, but showing only the user-preferred language isn’t a panacea. OSM Americana’s local-name gloss has a lot of precedent in the American map publishing industry. Here’s a page of a small world atlas I used in school. It’s designed for students in geography class, so it’s representative of more serious reference works like those by the National Geographic Society:
Rome is “Rome (Roma)” and Naples is “Naples (Napoli)”. Wherever an anglicized name matches the local name minus diacritical marks, it restores the diacritics, as in “València” for Valencia. The only novel aspect of Americana’s language support is that it automatically chooses the main language based on your individual preference instead of making you buy a separate copy from the bookstore. But otherwise it’s a conservative approach that doesn’t necessarily open the floodgates to the complicated fallback preferences suggested earlier.
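For anyone who wants to see the atlas’s convention spelled out, here is a minimal sketch of the glossing rule as I’ve described it. This is not Americana’s actual implementation; `gloss_label` and `strip_diacritics` are hypothetical helpers, and the diacritic comparison is a simplification that only covers marks Unicode can decompose.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def gloss_label(preferred_name: str, local_name: str) -> str:
    """Label a place in the reader's language, glossed with the local name."""
    if preferred_name == local_name:
        return local_name
    # If the two names differ only by diacritics, show the local spelling once.
    if strip_diacritics(preferred_name).casefold() == strip_diacritics(local_name).casefold():
        return local_name
    return f"{preferred_name} ({local_name})"


print(gloss_label("Rome", "Roma"))
print(gloss_label("Naples", "Napoli"))
print(gloss_label("Valencia", "València"))
```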
There are a couple things this atlas does that Americana can’t currently do based on OSM data. It avoids repeating a name just because English and one of the local languages happen to agree on a name. Americana also can’t automatically transliterate local names into Latin script for readability. However, that’s more of a problem to solve outside of OSM, since for example English and German require different transliteration systems for the same source language.
As long as it’s not Japanese
Anything that concatenates two arbitrary languages’ names will run into situations like this. Perhaps the most complicated example is Japanese. A high-quality Japanese-language map, such as one powered by Mapbox GL JS, will display some text vertically to better fit the allotted space, just like on shop signs.
The punctuation characters for vertical text are very different than the ones for horizontal text:
There are some very nuanced conventions for when to rotate individual characters or keep them upright in vertical text. Acronyms tend to stay upright, and if possible they get crammed horizontally into a single character block in a practice called tate-chu-yoko. This stuff keeps graphics engineers up at night.
Japan is largely monolingual, but if a renderer wants to combine Japanese text with text in some other language, it might need to try a little harder than a slash.
Meanwhile, there are other use cases for names that require data consumers to split the `name` on a delimiter. Colocated offices very often have multiple signposted names that appear in arbitrary order. Presumably you’d want to search for your doctor by name, not by all her associates’ names:
Overall, I think we should apply the same principle as we do with abbreviations: avoid misspelling, causing offense, or violating trademark law, but otherwise aim for structured data that can be readily consumed.