OpenStreetMap English (en_OSM) vs localization in regional English dialects

Forking the discussion from Acknowledging provinces in website search results, where some interesting points about localization are raised:

Most of the words in OSM tags are English (en_GB specifically) and I wouldn’t say they just happen to be by chance. It is intentional. That said, the full set of words in our tagging schema does not match any particular regional dialect of English, and it even includes some words borrowed from other languages. Further, the point is well taken that the meanings of some OSM tags have deviated from those of the original British English words. I call the resulting dialect OpenStreetMap English (en_OSM). The portmanteau “Openstreetmaplish” is fun, but it would seem to suggest an entirely separate language from English. I don’t think we have deviated that far yet!

I have mixed feelings about this. Those of us with many years of engagement with the project know OpenStreetMap English well, and it can bolster cross-cultural communication as a lingua franca of sorts. However, for less engaged mappers from English-speaking regions it can easily be a source of confusion and conflict because they do not realize the differences between their regional dialect and OpenStreetMap English (or indeed that en_OSM even exists). Localization into regional English dialects should help here, but it is tricky. On the other hand, such localization can hinder cross-cultural communication if mappers don’t have to contend with the real en_OSM tag values, only localized strings in their particular en_* dialect.

When localizing into non-English languages, it is generally clear if you are looking at English or your local language. “Straße” is clearly German and “Road” is clearly English. Of course this isn’t always the case. “Tag” could be English or German and means a rather different thing in each case. This same thing can happen between English dialects.

The term “camp site” has a different meaning in American English than it does in British English (and in en_OSM). An American English speaker can easily make the mistake of tagging camp_site where the correct tag is camp_pitch because of this subtle difference in meaning. An en_US localization labelling camp_site as “campground” and camp_pitch as “camp site” would help prevent incorrect data entry by unaware American English speakers. However, if such a localization were to completely hide the underlying en_OSM tag values, then communication between those who speak American English and those who speak a different dialect would be confusing:

en_US mapper: “I added a camp site to this campground”
en_GB mapper: “Are you sure? It looks to me like you added a camp pitch”
en_US mapper: “No, I definitely added a camp site”.
… both mappers unaware of the localization and the different meanings of the term in their respective dialects

To me it seems clear that when it comes to English dialects, neither strict localization nor strict adherence to en_OSM is a good option. I think an ideal English dialect localization would be something like this:

  • Stick as close as possible to en_OSM while ensuring terms will be well understood
  • Change a term if it will be misunderstood or confusing in the regional en_* dialect
  • Where a term differs between the regional dialect and en_OSM, always display the en_OSM term as well. For example an en_US UI might display:
    • Campground (en_OSM: camp site)
    • Campsite (en_OSM: camp pitch)

This obviously would require more UI design consideration than a typical localization, but it’s the only way I can think of to work against cross-dialect misunderstanding and incorrect data entry.
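
As a rough sketch of the display rule in the list above, here is how a label lookup might work. Everything in it is an assumption for illustration: the table names, the `display_label` function, and the label strings are hypothetical and are not taken from openstreetmap-website, iD, or any other real codebase. The only things borrowed from the proposal are the camp_site/camp_pitch examples and the "(en_OSM: ...)" annotation format.

```python
# Hypothetical en_OSM labels, keyed by tag value as written in the post.
EN_OSM_LABELS = {
    "camp_site": "camp site",
    "camp_pitch": "camp pitch",
}

# Hypothetical per-dialect overrides: only terms that would be
# misunderstood in that dialect get an entry.
REGIONAL_OVERRIDES = {
    "en_US": {
        "camp_site": "Campground",
        "camp_pitch": "Campsite",
    },
}


def display_label(tag_value: str, locale: str) -> str:
    """Return the UI label for a tag value in a regional English locale.

    If the dialect overrides the term, the en_OSM term is shown as well,
    so mappers in different dialects can still find common vocabulary.
    """
    osm_term = EN_OSM_LABELS[tag_value]
    regional = REGIONAL_OVERRIDES.get(locale, {}).get(tag_value)
    if regional is None or regional.lower() == osm_term.lower():
        # No override (or an identical one): stick to en_OSM.
        return osm_term.capitalize()
    return f"{regional} (en_OSM: {osm_term})"


if __name__ == "__main__":
    for value in ("camp_site", "camp_pitch"):
        print(display_label(value, "en_US"))
    print(display_label("camp_site", "en_GB"))
```

The point of keeping the overrides sparse is that a dialect with no entry for a term simply falls through to en_OSM, which matches the "stick as close as possible to en_OSM" bullet.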

3 Likes

Does this mean I’ll be able to get OSM localised into Yorkshire :grinning:?

5 Likes

Sure, and I’ll take a Western New England localization :grinning:. We might be waiting a while for more detailed locale codes beyond US and GB to become standardized though…

As I see it, openstreetmap-website (and its Nominatim integration) is essentially a special kind of data consumer whose audience, in theory, is mostly mappers rather than laypeople trying to go about their day. For most data consumers in a similar position, like Overpass turbo and taginfo, the stakes are quite low, but openstreetmap-website is the face of OSM, so naturally people take its presentation of OSM data to have more of an air of authority. Even those of us who live and breathe OSM don’t check our lived experiences at the door, which is why the website is available in multiple localizations in the first place.

The base localization’s use of British English terms is causing a world of confusion. For example, shop=alcohol is “Off License”, because establishments in the UK are licensed to sell alcohol for offsite consumption. Translators lacking this context have variously turned it into 政府许可酒类店 in Chinese (“government-licensed liquor store”), Án vínveitingaleyfis in Icelandic (“lacking a wine license”), Kedai Arak Tanpa Lesen in Malaysian (“unlicensed liquor shop”), and Wala sa Lisensiya in Tagalog (“not found in the license”).

The problem with “Off License” is not that it’s spelled according to British English or that it’s British English vocabulary. Rather, the problem is that the term doesn’t make sense in jurisdictions outside the UK, or it only makes sense by analogy to the UK, even though the shop=alcohol tag isn’t tied to UK law in any way. Switching the base localization to American English wouldn’t necessarily solve this problem across the board. After all, a highly colloquial Americanism like “Pony Keg” for shop=convenience could similarly cause translators to mistake it for a beer container of a certain size. So this is not an argument for favoring one dialect over another; it’s an argument for avoiding colloquialisms or legal jargon that don’t fit the tag very well.

Even so, there are unfortunate situations where no neutral, intuitive term exists. Probably no English-speaking mapper under the age of 90 knew of “Filling Station” as a word for shop=fuel amenity=fuel before joining OSM, but we’re using that term because “Gas Station” sounds too American, “Petrol Station” sounds too British, and the Canadian alternative of “Gas Bar” sounds to me like a pub serving ginger ale. The truly neutral option would be simply “Fuel”, but it risks confusion with shop=fuel, where the fuel is only sold for offsite consumption. Would the latter be an “Off License”? :drum:

Switching the website to another language allows us to think more clearly about this issue. There’s no question that a Chinese-speaking user should see shop=alcohol labeled as something that means shop=alcohol in their language. They might not even be able to read Latin text. We only seem to have this controversy when suggesting that American English speakers should be able to identify a shop=alcohol without asking what the heck an “Off License” refers to. The counterargument is that it would be unfair to give American English a localization of its own, versus the status quo that gives British English two different localizations, at least nominally. Part of me wonders how much of this pushback is coming from British English speakers, versus speakers of other languages who are simply accustomed to what the website has been calling it.

I’d support the idea of annotating any localized terms with their raw tagging equivalents. The website already has similar tooltips in other parts of the UI, since icons and color swatches aren’t always especially intuitive. On the other hand, if any label in the UI intentionally eschews the local language in favor of OSM terminology, it could be rendered as monospaced text to make that clear to less experienced users.

3 Likes

I’d absolutely agree that here “Alcohol shop” would make more sense to people speaking OSMese rather than British English (and would be perfectly understood in British English too), but controlled sale of alcohol in some way is pretty common around the world. The rules may be different around the world, but people will be familiar with the concept - a level of state control like Systembolaget in Sweden, Alko in Finland, bottle shops licenced similarly to UK ones in Australia, and I seem to remember some very odd scandi-like restrictions in Massachusetts many years ago? Surely it’s the actual words used rather than the concept that’s the problem, and you could get that with any base translation?

I’d agree - we’ve seen plenty of cases where a tag has “gone rogue” because it got misconstrued in a different language, sometimes by a “false friend” in that language.

3 Likes

Yes. Translators lacking the necessary context is a hard problem in general, regardless of the base language or dialect. Mappers like @maro21 have added a lot of helpful contextual hints in Translatewiki.net, but we’ll always have to be vigilant about issues like landuse=mine being translated as the first-person singular pronoun “mine”.

That said, enough open source projects use American English as a development language that most translators and their tools assume this dialect. For example, in Translatewiki.net and any other modern translation management system, one of the main translation aids is machine translation results from Google, DeepL, or the like.[1] These services are configured to assume American English, and I’m unsure if there’s any way to change that to British English, let alone some concept of OSM English that no translation service supports as a source language. Translators are supposed to double-check that the suggested translation matches the contextual hint or rely on their familiarity with the software, but unfortunately not everyone exercises that level of care.

On the bright side, Translatewiki.net also shows translators the equivalent translation in another language they speak. So ideally getting a translation right in one language will give translators in other languages a heads-up that they need to take the machine translations with a grain of salt.


  1. The absence of this functionality is one of the main complaints about JOSM’s translation setup.

2 Likes

amenity=fuel, no? And where else would one fill up with petroleum distillate and have their tires re-vulcanized? :wink:

Huh. I was today years old when I learned that “gas bar” is a Canadianism. It’s a bit ye olde fashioned, used to distinguish between fuel stations that had no other amenities except the fuel pumps and fuel stations that also had a service garage (“service stations”). Nowadays it tends to be used to mean more specifically the part of a fuel station where the pumps reside. Some retailers still refer to their stations as “gas bars”, e.g. Co-op (Federated Co-operatives Ltd.), but this is more of an anachronistic holdover than anything.

Anyway, personally I think the ideal solution is to use the “least regional” terminology that most people will understand despite their regional dialect, and I think that’s already pretty much the case with most things. For example, I wouldn’t insist on what I call a parkade being tagged as such in OSM, and am perfectly fine deferring to “parking garage” and amenity=parking, parking=multi-storey. Trying to plan for every eventuality in every dialect of English is a bit of a fool’s errand.

4 Likes

Drats, I have a tendency to typo my best arguments, but at least it gave you an opening. :wink: Fixed.

This discussion is not about tagging but rather about how the tags are presented in a UI. My point there was that sometimes there is no such option. Another example off the top of my head is shop=chemist: the Americans think “chemist shop” is a scientific laboratory, while the British probably think of “drugstore” as a place of vice.

1 Like

(Somewhat off-topic)

I’ve been fascinated reading through this thread. I wonder if “OpenStreetMap English” is a living, developing example of koineization.

5 Likes

Right, I shouldn’t have written ‘tagged’, as I did mean ‘how they’re presented in a UI’.

Solution: ‘pharmacy’, like we say in Canada. :grin:

1 Like

A “chemist” is the non-licensed part of a Canadian pharmacy, so stuff like sunscreens and shampoos. In parts of the world they’re separate stores.

We say ‘pharmacy’ in the US too, but that’s amenity=pharmacy, not shop=chemist. :face_with_spiral_eyes:

2 Likes

Chemist is really an old name for a pharmacy in the UK. When I was growing up you went to the chemist for your prescription medicines but at some point, I think in the 90s, they all rebranded as pharmacies. Boots the Chemist became Boots Pharmacy.

I first learned the word pharmacy as pharmacie when I was learning the French names of shops at school.

It is still common to refer to them as chemists despite it saying pharmacy on the sign.

1 Like


That’s, just, like… a “general store”, to me. If they don’t dispense drugs, they’re not a pharmacy. :man_shrugging:

Pharmacie, oh-là-là, pantalons fancie.

This is stirring up fun memories of old mailing list and iD GitHub discussions… Anyways, the tag exists, is common, and needs to be distinguished from amenity=pharmacy, which for better or worse has come to refer to just the pharmacy counter in the back of a shop=chemist or shop=supermarket.

4 Likes

Is there such a thing? Candy is also something I’d expect to find in a “general store” but there are still candy shops. Nowadays you can find mobile phone accessories in nearly every shop and supermarket, but there are still shops that specialize in mobile phones (usually coupled with a repair shop).

But even the most “general” shop has to fall into some category, e.g. a grocery shop if it mostly sells food items.

1 Like

There’s all sorts of strangeness in the world. I’m sure many find the idea of a store that sells only beer to be strange, but here we are.

(My understanding based on some personal experience is that at least in Germany, shop=chemist (Drogerie) is the same retail idea as Canadian “pharmacies” (London Drugs, Shoppers, etc) except they don’t have the pharmacist because by law pharmacists aren’t allowed to operate within store chains. It doesn’t seem entirely unreasonable considering the effects of centralization of Canadian retail market cough Loblaws)

1 Like

Question to British English speakers, is “Off Licence” even a good label for shop=alcohol in British English? Where I am, most shop=alcohol are specialist beer, whisky or wine shops that people generally wouldn’t call “an off licence” even if they technically hold an off licence (same as the supermarket). The actual “offies” (that are called that in real life, by non-OSMers) tend to be tagged shop=convenience.

In the 21st century not really.

When I was growing up off-licences could open on a Sunday whereas the sale of food stuff was prohibited. Some shops had a separate door for the off-licence and the food section was blocked off on a Sunday.

The relaxation of Sunday trading laws meant this was no longer necessary.

Off-licence is really a legal term meaning they can sell alcohol for consumption off the premises whereas pubs, bars and restaurants are licensed to sell alcohol for consumption on the premises (most are also licensed as off-licences).

Since the end of Sunday trading laws most have expanded what they sell and have morphed into convenience shops (although still licensed as off-licences, as are supermarkets).

Most shop=alcohol that remain are now much more specialised and are wine merchants or specialise in things such as whisky. Personally I think shop=wine or shop=whisky would describe these better.

1 Like

(offtopic diversion from site localisation)

There are some genuine non-convenience “offies” left in the UK; from memory this is one, and there were similar ones in similar areas south of Manchester, complete with late-night anti-theft hatch. You’re absolutely right that there are many fewer of them than there were.

Australia has “Bottle Shops” (colloquially “bottle-os”). A quick overpass search suggests that they’re still pretty prevalent - they’re similar in concept to an English offy, major on booze rather than other convenience store stuff and aren’t “posh” wine or whisky shops.