Multi-language/default language naming conventions

Okay, but the way many if not most users of the dataset parse and render it, name=* is the tag that gets used before any other variant, including the myriad name:<lang>=* tags that may be added to a given object. In this case, in St. Boniface, using English exclusively in name=* is incorrect, because the English name is not the one that should be the default. That’s why:

I totally agree that

or at the very least it’s not ideal, but

Are there? Because you really don’t seem to like any of them! You just want to change the name=* tag from the Frenglish compromise that, yes, just unabashedly apes the signage, to the unilingual English. Your premise right from the get-go has been:

That’s not a “proven method for localization”, that’s just making it default to English, and—again—that’s not an acceptable solution because you’re operating from the false premise that that would accurately represent “official usage” and “on-the-street local custom”. I don’t know how you can make an appeal to “official usage” and then turn around, just ignore my reference to the City of Winnipeg Charter, and write:

:man_shrugging:

Your chief complaint about the “follow-the-street-signage” compromise seems to be:

I don’t use multi-lingual routing, but if the software you’ve used is “doing it wrong” that’s the fault and problem of that software not parsing the dataset properly. Don’t fudge the dataset to make it look (or sound) the way you want.
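For what it’s worth, the lookup I’d expect from a data consumer is tiny. Here’s a minimal sketch in Python, assuming the tags arrive as a plain dict; `pick_name` and its parameters are made up for illustration and aren’t taken from any particular router or renderer:

```python
def pick_name(tags, preferred_langs):
    """Return the first name:<lang> value the user asked for, else fall back to name."""
    for lang in preferred_langs:
        value = tags.get(f"name:{lang}")
        if value:
            return value
    # No localized match: name=* is all that is left to go on.
    return tags.get("name")


# Example tags for the street discussed in this thread.
tags = {
    "name": "Avenue Niverville Avenue",
    "name:fr": "Avenue Niverville",
    "name:en": "Niverville Avenue",
}

print(pick_name(tags, ["fr", "en"]))
print(pick_name(tags, ["de"]))
```

Only software that never looks at name:<lang>=* ends up depending on whatever is in name=*.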

I fully agree that it’s better for a user of the dataset to pick one of name:fr=Avenue Niverville or name:en=Niverville Avenue to suit the language preferences of the user, but if the software can’t parse one or the other, it’ll default to name=*, and using only the English or French name in name=* is wrong for the aforementioned reasons. You harp on how name=Avenue Niverville Avenue is gibberish that doesn’t really work because that’s not its actual ‘name’, like so…

Then you question how “name=Avenue Niverville; Niverville Avenue” would show up when rendered, or ought to show up when rendered:

I dunno man, let’s go back and look at that map from Tourism Winnipeg I linked earlier and see what acceptable visual rendering by others is. Here, I’ll take a snapshot for everyone’s benefit:

:man_shrugging: :man_shrugging: :man_shrugging:

At the end of the day you may think it’s a stupid, ugly compromise, but even these sorts of English-language maps go out of their way to present both the English and French names in the exact same fashion as the street signs.

So, speaking of New Brunswick, I noticed that many streets in the Moncton–Dieppe area are still missing the French name. As I went along chemin Shediac Road adding what I could from street-level imagery, I followed the format “boulevard … Boulevard”, which seemed to be the most common format in that neighborhood, though “Boulevard … / … Boulevard” was also common in other neighborhoods.

I did run into some streets that illustrate the pitfalls of literally tagging what’s on hybrid signs. Apparently the signs for Marche Street say “rue du Marché st”. Should there be no name=* just because “Rue du Marché” and “Marche Street” share no words in common? Or should both names appear in full, as in Place Champlain / Champlain Place Mall? (And better yet, with a semicolon?)

Also, should name=* and name:fr=* begin with a capital or lowercase letter? Unlike in Winnipeg, many signs here capitalize the French road type, and I believe this form is appropriate for map labels and address labels. However, lowercase would be more correct in a sentence, such as in turn-by-turn navigation instructions. In either language, one can easily capitalize a lowercase string as needed, but it’s harder to reliably go in the other direction. (This isn’t true of every language.)
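To illustrate the asymmetry, here’s a rough Python sketch. Both helper names are hypothetical, and the second one is deliberately naive to show where the reverse direction goes wrong:

```python
def capitalize_first(name: str) -> str:
    """Uppercase only the first character, leaving the rest untouched."""
    # Note: str.capitalize() would also lowercase the rest of the string,
    # turning "boulevard Babineau" into "Boulevard babineau".
    return name[:1].upper() + name[1:]


def naive_lowercase_first(name: str) -> str:
    """Lowercase the first character without knowing whether it starts a proper noun."""
    return name[:1].lower() + name[1:]


print(capitalize_first("rue du Marché"))           # fine for a map label
print(naive_lowercase_first("Rue du Marché"))      # happens to be right
print(naive_lowercase_first("Champlain Place"))    # wrongly lowercases a proper noun
```

Going from lowercase to capitalized needs no knowledge of the name; going the other way needs to know whether the first word is a generic road type or a proper noun.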

From memory, most (all?) street signs in Brussels (Belgium) sidestep this issue by writing everything in all caps, with smaller font sizes for rue and straat. On the OpenStreetMap side, all the French rue, avenue, and other words in the city are written with a leading capital. This is also consistent with all regions (that I have come into contact with) where French is the only language.

Any French-language name=* and name:fr=* tag should be treated as a title, and thus begin with a majuscule. “Boulevard Babineau”, not “boulevard Babineau”. This applies not just to street names but anything, really. We would talk about the Atlantic Ocean in a sentence as “l’océan Atlantique”, but the name of the ocean on its own is Océan Atlantique.

Likewise a longer name with more than a couple words should follow French capitalization rules: the first letter of the first word is capitalized, but thereafter only the first letter of proper nouns should be capitalized. E.g. Parc national de la Mauricie.

There may be dialects or regional varieties of French that don’t follow these rules, but at the very least in Canada this should be universally accepted by any francophones. That said, it doesn’t take long to find OSM objects where someone has violated these rules, e.g. Réserve Faunique de Papineau-Labelle should be “Réserve faunique”… :confused:

Thanks, are these norms documented on the wiki somewhere? I realize this falls into the category of “stuff every Canadian learns in grades=-5 to -1”, but there’s a lot of inconsistency in New Brunswick, where one would naïvely expect bilingualism to be second nature by now. Anecdotally, most of the French names seem to come from mappers abroad, particularly from France, who probably didn’t know the local customs. Documentation would help these mappers avoid making the situation worse.

I mean… it shouldn’t have to be documented by OSM, in the same way you shouldn’t need to document title case in English.

That said, a cursory Google search for other sources says “Unlike most areas of French grammar, the capitalization of French titles of books, movies, etc. does not follow a clearly defined set of rules. According to ‘Le bon usage’, French title capitalization is inconsistent, with competing systems used by writers, publishers, and other authoritative sources.”

:man_shrugging:

Never mind, I guess.

It’s not a stupid, ugly compromise; it’s a clever way of arranging text to save space, since French uses street prefixes and English uses street suffixes. The resulting text string is a label mixing two names, rather than just a single name. The label “Av Hamel Ave” on the Tourism Winnipeg map combines the two names “Av Hamel” and “Hamel Ave”.

Humans reading the map intuitively know this and use either of the two names when speaking, but a computer reading from an ostensibly structured database does not have intuition[1]. If the computer is using the name tag in a context where this clever space-saving technique is appropriate, this works out OK, but in a context where a single name is needed it ends up feeling awkward and not right. For example, turn-by-turn directions saying “Turn right on Avenue Hamel Avenue”.

This space-saving technique also works best when street prefixes and suffixes are abbreviated. Although Avenue is spelled the same in French and English, the abbreviations differ. So “Av Hamel Ave” looks less redundant than the full “Avenue Hamel Avenue” we see on the OSM Standard layer:

I definitely understand the desire for OSM-based maps to be able to achieve this name merging technique, and I’d hope this could be possible by combining name:en and name:fr in postprocessing. If not, it would seem to me that a different tag than name would be more appropriate, since data consumers expect name to contain one or more full names in a structured way. name:fr-en?
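As a rough idea of what that postprocessing could look like, here’s a minimal Python sketch. It assumes the French name ends with the same words the English name begins with (prefix + core vs. core + suffix); `hybrid_label` is a hypothetical helper, not something an existing renderer provides:

```python
def hybrid_label(name_fr, name_en):
    """Merge 'Avenue Hamel' and 'Hamel Avenue' into 'Avenue Hamel Avenue'.

    Looks for the longest run of words that ends the French name and
    starts the English name, then appends the leftover English words.
    Returns None when the two names share no such run.
    """
    fr = name_fr.split()
    en = name_en.split()
    for k in range(min(len(fr), len(en)), 0, -1):
        fr_tail = [word.casefold() for word in fr[-k:]]
        en_head = [word.casefold() for word in en[:k]]
        if fr_tail == en_head:
            return " ".join(fr + en[k:])
    return None


print(hybrid_label("Avenue Hamel", "Hamel Avenue"))
print(hybrid_label("Av Hamel", "Hamel Ave"))
print(hybrid_label("Rue du Marché", "Marche Street"))  # no shared core as written
```

The last case comes back as None because the comparison here is exact apart from letter case, so names that differ by an accent or a “du” would need extra handling or a fallback to showing both names in full.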


  1. Despite the promise of AI, computers are still pretty dumb :grinning:

I figure that there isn’t currently very much case inconsistency in tagged English names, but there is plenty of case inconsistency in tagged French names, as well as inconsistency about whether name=* should be a hybrid name or two full names separated by some delimiter, and what to do when the names differ only by du or an accent mark. In the absence of explicit guidance in the Canadian or New Brunswick tagging guidelines, mappers are applying their own preferences or assumptions. From the standpoint of a data consumer, the only thing worse than an ambiguous format would be variations in format from neighborhood to neighborhood, or even from street to street.

Tagging hybrid names explicitly in name=* optimizes for unpersonalized map labels at the possible expense of other use cases, such as bilingual map labels, that require some indication of the local-language names. Even if the community accepts this tradeoff, it still needs a solution for names that cannot be hybridized. Adopting a machine-readable delimiter can simplify the situation by making it easier for renderers to synthesize the hybrid name as needed while catering to these other use cases.
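To make that concrete, here’s a small Python sketch of what a consumer could do with a delimited value. It assumes a semicolon delimiter as floated earlier in the thread; `split_names` and `stacked_label` are illustrative names only:

```python
def split_names(value, delimiter=";"):
    """Split a delimited name=* value into its individual full names."""
    return [part.strip() for part in value.split(delimiter) if part.strip()]


def stacked_label(value, separator=" / "):
    """Render every full name found in name=*, joined by a display separator."""
    return separator.join(split_names(value))


names = split_names("Avenue Niverville; Niverville Avenue")
print(names)
print(stacked_label("Avenue Niverville; Niverville Avenue"))
# A hard-coded hybrid has no delimiter, so it comes back as one opaque string.
print(split_names("Avenue Niverville Avenue"))
```

With the full names recoverable, a renderer is free to show either one, both, or a synthesized hybrid; with the hybrid hard-coded, none of the originals can be reliably reconstructed.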

Yes, I think it would be quite feasible for OSM Canadiana to implement something like this based on the existing code for bilingual labels. There’s already code to automatically collapse labels in which the local-language name is identical to the English name plus some diacritics. For example, an English speaker sees “Montréal” and “Québec City” instead of “Montreal (Montréal)” and “Quebec City (Québec)”, even if one holds that the proper English spellings are “Montreal” and “Quebec City”, respectively. This diacritic folding is commonplace on signs and maps alike.
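The simplest form of that collapsing can be sketched in Python with the standard `unicodedata` module. This is not the renderer’s actual code; `fold_diacritics` and `collapse_label` are hypothetical names, and the sketch only handles the case where the whole name matches after folding:

```python
import unicodedata


def fold_diacritics(text):
    """Strip combining marks so that 'Montréal' compares equal to 'Montreal'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def collapse_label(local, english):
    """Show just the local name when it equals the English name plus diacritics."""
    if fold_diacritics(local).casefold() == fold_diacritics(english).casefold():
        return local
    return f"{english} ({local})"


print(collapse_label("Montréal", "Montreal"))
print(collapse_label("Québec", "Quebec City"))
```

A partial match such as “Québec” inside “Quebec City” is not collapsed by this sketch; getting “Québec City” out of it would need a word-by-word comparison on top.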

If we must hard-code these typographical flourishes in the database, ideally there would be some way to distinguish them from “names in the local languages”. en-fr is an invalid language code, but name:en;fr=* would be consistent with Internet standards for lists of language codes. I would be greatly amused if the community desperately avoids semicolons in tag values by putting semicolons in keys instead. :upside_down_face: