Filling in the `name:en` tag in Ukraine

Hello! I’m a contributor from Ukraine. Recently, I’ve become interested in how names on maps are localized. I’d like to draft some recommendations on how to fill in the name:en field for objects in Ukraine, primarily for OSM contributors from our country.

I’m familiar with the wiki articles and the arguments presented there, but I still have a few questions about certain nuances that might be obvious to native English speakers.

Specifically, I mean names that include a generic or nomenclature term.

Very often, when filling out the name and name:uk fields, OSM contributors in Ukraine also try to fill in the name:en. For simple names, like town names, this is usually straightforward — the official transliteration is used. For example:

| Original name | Transliteration |
|---|---|
| Черкаси | Cherkasy |
| Київ | Kyiv |

However, some names are more complex, such as the names of regions. The first word in a region’s name is usually an adjective derived from the name of the city that serves as its administrative center. For instance:

  • The city name is Cherkasy (a proper noun), and the name of the administrative unit that includes Cherkasy is Cherkaska, an adjective derived from that proper noun.

The second word is a generic or nomenclature term like “oblast”, which is a noun.

Full transliteration approach

| Original name | Transliteration |
|---|---|
| Черкаська область | Cherkaska oblast |

This method involves converting Cyrillic characters to Latin characters based on a set transliteration table.
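
To make the table-based idea concrete, here is a minimal sketch in Python. It assumes that "official transliteration" means the Ukrainian national romanization system adopted in 2010; the function name `romanize` and the way the table is split into two dictionaries are my own choices for illustration, not part of any OSM tool. The only non-trivial parts are the letters that are written differently at the start of a word, the combination "зг", and the soft sign and apostrophe, which are dropped.

```python
import unicodedata

# Letters with a single Latin equivalent (lowercase Cyrillic -> Latin).
BASE = {
    "а": "a", "б": "b", "в": "v", "г": "h", "ґ": "g", "д": "d", "е": "e",
    "ж": "zh", "з": "z", "и": "y", "і": "i", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ь": "",  # the soft sign is not written
}

# Letters written differently at the start of a word: (initial, elsewhere).
POSITIONAL = {
    "є": ("ye", "ie"), "ї": ("yi", "i"), "й": ("y", "i"),
    "ю": ("yu", "iu"), "я": ("ya", "ia"),
}

# Apostrophes are dropped and do not start a new word.
APOSTROPHES = {"'", "\u2019", "\u02bc"}


def romanize(text):
    """Romanize Ukrainian text letter by letter (illustrative sketch)."""
    text = unicodedata.normalize("NFC", text)
    out = []
    prev = ""
    for ch in text:
        low = ch.lower()
        if low in POSITIONAL:
            word_start = not (prev.isalpha() or prev in APOSTROPHES)
            latin = POSITIONAL[low][0 if word_start else 1]
        elif low == "г" and prev.lower() == "з":
            latin = "gh"  # "зг" becomes "zgh" so it is not read as "ж" (zh)
        elif ch in APOSTROPHES:
            latin = ""
        elif low in BASE:
            latin = BASE[low]
        else:
            latin = ch  # digits, spaces, hyphens, Latin letters stay as they are
        if ch.isupper():
            latin = latin.capitalize()
        out.append(latin)
        prev = ch
    return "".join(out)


print(romanize("Черкаська область"))
print(romanize("Київ"))
```

Capitalization simply follows the Cyrillic input, so whether "oblast" is written with a capital letter is still a decision for the mapper and not something the table settles.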

I don’t have anything against this approach, but it has its pros and cons:

  • Pros: An English speaker can try to read the name and more or less pronounce it (though it might be challenging).
  • Cons: The meaning of the name becomes unclear, turning it into just a string of Latin characters.

Mixed approach (used by many mappers in Ukraine)

This approach involves mostly transliterating the name while translating the generic part. For example:

  • Cherkaska Region

The first word is a transliterated adjective, while the second word “oblast” is translated into English as “Region” and capitalized.

Personally, I’m not a fan of this approach because it mixes two methods.

Adaptation/Translation approach

This method is the most complex since it requires a deep understanding of the name’s etymology.

For example, the name “Cherkaska oblast” comes from the toponym (the city name Cherkasy). Many regions in Ukraine are named this way.

Using this method, the name in English would be fully translated as:

  • Cherkasy Region

The problem with this approach is its complexity, and the fact that it cannot be applied uniformly to all names.


I would love to hear your thoughts on these three approaches, including any critiques. I’d also appreciate any English-language sources (books, papers, etc.) on different methods of localizing foreign names into English, especially for maps. If you could recommend any, I’d be very grateful.

I’ve chosen to post in this general category because it’s read by English speakers from around the world. Additionally, similar topics I’ve created in the Ukrainian category haven’t yet received feedback from other contributors in Ukraine.


In OpenStreetMap the guiding principle is that we as mappers don’t translate names ourselves. How things are named is not for us to decide, we just document it. If a name is used in English (that is, it has a proper exonym, like Kyiv), we use that. If there is no name in English, name:en is simply not set.

The reason for this is simple. If someone writes about a small village in France in an English newspaper or blog, then they use name, which will be in French. There won’t be a name:en, and the French name can be used in English.

Now with endonyms (that is, the native name) in another script, things are a little different. If no English name exists (which is common for something like a small village), sometimes a romanization is used for name:en when that name is used in English, because when someone writes about Черкаська область, suddenly that name makes the English text unreadable to most readers. So name:en=Cherkaska oblast may seem fine.

On the other hand, it makes no sense to provide name:* for every known language in cases where that language has no common exonym for that place. The same problem above goes for every language in another script. Someone writing in Dutch would need a romanized name as well. What you can do instead, is provide the romanization (see my note about name:uk-Latn below).

There are downsides to providing your own translations. Sure, it may sound logical that область means region, but translations have a habit of varying depending on a lot of factors. If you do consider using something like name:en=Cherkaska region, I would recommend only doing that if there is some official use of that translation as well. Something like a Ukrainian government website listing all oblasts in English. Don’t make up your own translation, even if it looks straightforward.

Depending on who you ask, it might make perfect sense to simply use oblast in English names. This is something you would need to feel out with other Ukrainians (again, official English language Ukrainian government websites and communiques are often useful here). For ‘oblast’ specifically, I would say that this is a word which many people will already understand as meaning ‘a region in a Slavic country’. I do, and I have no background in any Slavic language or culture.

Whether this is done or not depends on the local mapper community I think. You can have a look at the wiki for how this is done in other countries. For Ukraine no decision seems to have been taken yet.

In addition to thinking about name:en, I would recommend documenting a tag for the official romanization of Ukrainian names. The wiki doesn’t mention this yet for Ukraine. Have a look at how this is done for some other languages. name:uk-Latn=Cherkaska oblast sounds good.


By the way, the above goes for all names, but for oblasts and major cities a case can be made that these are indeed already known and used in English:

Looking at a number of these, a common theme seems to be that something like Черкаська область is often called ‘Cherkaska’ in English. You might tag that like this:

name=Черкаська область
name:uk=Черкаська область
name:uk-Latn=Cherkaska oblast
name:en=Cherkaska
official_name:en=Cherkaska oblast

Or the other way around:

name=Черкаська область
name:uk=Черкаська область
name:uk-Latn=Cherkaska oblast
short_name:en=Cherkaska
name:en=Cherkaska oblast

In fact, for oblasts you can probably safely say that in most languages these already have a common exonym. You can see this reflected in the name tags of the occupied parts of Ukraine which have been in the news worldwide for some time now: Луганська область


Just keep in mind that name:* is usually supposed to be filled with the name people from the *-area are using, not what the local government wants it to be.


You obviously intended this, but for clarity: this is the convention for exonyms (names people from another country give a place in their native language).

For Ukraine a good example is Kyiv. In the Netherlands most professional media have shifted from the Russian-style ‘Kiev’ to the Ukrainian-style ‘Kyiv’. By now this shift is complete, and only a few stragglers and political extremists hang on to the old Dutch name for the Ukrainian capital.

(By the way, name:nl and name:fy for Kyiv are currently outdated (stating Kiev), but I know most mappers are hesitant to edit things in Ukraine at the moment. What is the best way to address this?)

For endonyms (names in the local language(s) of the place itself) ground truth can overrule this. E.g., when a place officially changes its own name, signs and all (thus ground-truth), OpenStreetMap should follow as soon as possible (using old_name to keep the old name).


Exactly, since the topic was about such names, and I believe in most cases it’s best to leave name:en to our English community to fill. They would be the ones who know whether they call that area “Cherkaska” or “Cherkasy”, and with or without the suffix “Region” or rather “Oblast”.

@JeroenHoek, thank you for the detailed messages, especially for the terms “endonym” and “exonym.” I googled them and found a lot of interesting information, particularly regarding geographical names.

I understand the validity of the approach that we, as mappers, should only document names. I don’t want to judge this as positive or negative—Ukrainian mappers add a lot of name:en tags, and it seems that they don’t always follow this principle.

Currently, around 300,000 objects in Ukraine have a name:en tag. Of course, each case is unique, but most of these names are either transliterations, translations, or some combination of both. It also appears that most of these names weren’t taken from any source but were translated or transliterated by the mappers themselves. I assume this is done with good intentions and a genuine desire to add a name to the object so that foreigners can read and understand it.

A large category consists of street names. Often, there are streets with the same name in different cities. For example, Shevchenko Street (vulytsia Shevchenka)—is a street named after the famous Ukrainian poet Taras Shevchenko. Suppose there is a source legally allowed for use in OpenStreetMap, from which name:en=Shevchenko Street for the city of Lviv was filled in. But Lviv is a large and well-known city. In the small village of Vaniv, also in Lviv Oblast, there is also a Shevchenko Street. Does this mean that name:en can be added for Shevchenko Street in Lviv, but not for the village of Vaniv?

You could also imagine a situation where one city uses a full transliteration approach in some English sources, while another city uses a translation with adaptation in other sources.


Ah, good question. That’s hairy. Generally speaking, I don’t think many streets in Ukraine (or just to name another example, my country the Netherlands) should have names in English. It’s rare.

Someone was busy adding German translations of Dutch street names a while back (so a ‘Bank Street’ (‘Bankstraat’) would be tagged with name:de=Bankstraße), but these have been reverted. It’s really only the exceptionally famous places like Amsterdam’s Dam Square which are used in English more often. (For names of cities and larger entities like regions, provinces, etc. this is different, as noted above.)

And yes, you are right. A famous street in Kyiv (something people write about in English books or sing about in songs for example) could have a name:en, but even if the Ukrainian name is the same, I wouldn’t copy that to a small village somewhere else.

But even then mistakes will be made. Take майдан Незалежності, perhaps the most famous Ukrainian street or square. If I look at name:en and name:nl I see direct translations of ‘Majdan Nezaležnosti’ (‘Indepedency square’ and ‘Onafhankelijkheidsplein’). And… both are wrong (the Dutch one is hilariously wrong). Those are the literal translations of the words, but Maidan is much, much bigger than that of course.

As you know, anyone literate worldwide now knows the word Maidan for the historical significance of that place. In Dutch, newspapers write about ‘Maidanplein’ (‘plein’ meaning ‘square’), and while in English people do call it an independence square (describing what it represents), they are often naming it ‘Maidan’ or ‘Maidan square’ (see both of these used in this representative article in a British newspaper).

Currently, OpenStreetMap does not represent those names as it should.


Cherkas -ka is something like Cherkas -ian. For example, Cherkasian Street, Cherkasian Dam, Cherkasian Lane, etc.

Yes, but as soon as an exonym gains common use, the local language rules are irrelevant (annoying as that is). So an adjective in Ukrainian can become a proper name in another language.

Wrong or outdated exonyms are common. Take the Japanese for the Netherlands: name:ja=オランダ. That name (Oranda) comes from the Portuguese word for ‘Holland’ and was borrowed several centuries ago. Most countries call us something derived from the Netherlands nowadays (literally ‘the low lands’), but some use a name which is either — depending on who you ask — a popular moniker or just the name for two of our provinces rather than the whole. That’s hard to change.

Kyiv (that is, the Ukrainian government) actually did really well by getting people worldwide, in a number of languages, to shift from the Russian transliteration (or romanization) of the capital to the Ukrainian one (Kyiv that is). Politics play a part in these things too of course.

Turkey is trying to do the same with their country name in English, preferring Türkiye instead. (Did you see that I used the unofficial name? It’s mostly not something you consciously think about. I suspect I might actually write Türkiye automatically in a few years’ time, but who knows? Language can be vague and weird.)


Actually, even before this discussion, I had a rough idea of the answers to my questions. The process of creating an exonym is often quite random. My goal is to create clearer guidelines for this. After all, it’s unlikely I can just tell all other mappers, “Don’t do this, don’t fill in name:en…” So, there should at least be some kind of wiki page with step-by-step instructions like:

  1. Find a source or several sources for the English name that are compatible with the OpenStreetMap license.
  2. If no such source exists, don’t try to create your own translation, adaptation, or transliteration.
  3. If there’s no source but you really, really want to fill in name:en, at least do it in this and that way…

Is this not a very good idea?

I would figure out what to do with romanized names as well for the guidelines. That way you can redirect mappers who want to put in a romanized name to the correct tag. My guess is this should be name:uk-Latn. Then if someone complains that they want to render a map of the world with English names, they can use those romanized names (name:ja-Latn, name:sr-Latn, name:ko-Latn, name:zh-Latn-pinyin etc.) as a sensible fall-back if no name:en exists.

Note however that this tag has only been used 32 times, so my guess is that the Ukrainian mapper community will have to decide if that is a desirable route.
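
The fall-back idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `label_for_english_map` is a hypothetical helper, not part of any existing renderer, and the pattern for romanized keys is simply modelled on the examples named above (name:ja-Latn, name:zh-Latn-pinyin and so on).

```python
import re

# Matches keys such as name:uk-Latn, name:ja-Latn or name:zh-Latn-pinyin.
LATN_KEY = re.compile(r"^name:[a-z]{2,3}-Latn(-[A-Za-z0-9]+)?$")


def label_for_english_map(tags):
    """Pick a label for an English-language map from an object's tags."""
    # 1. A real English name wins whenever one is tagged.
    english = tags.get("name:en")
    if english:
        return english
    # 2. Otherwise fall back to any romanized name (sorted so the choice is stable).
    for key in sorted(tags):
        if LATN_KEY.match(key) and tags[key]:
            return tags[key]
    # 3. As a last resort, show the local name as it is.
    return tags.get("name")


oblast = {
    "name": "Черкаська область",
    "name:uk": "Черкаська область",
    "name:uk-Latn": "Cherkaska oblast",
}
print(label_for_english_map(oblast))
```

With this order a mapper never needs to put a romanization into name:en just to make the map readable; the renderer can find it in the romanization tag on its own.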


  3. If there’s no source but you really, really want to fill in name:en, at least do it in this and that way…

Often for these kinds of things no reliable sources exist beyond people just having an affinity with their own language. I think for ③ you could recommend that people use names which are at least demonstrably in use. Articles in national newspapers, for example, can be a good reference. In those cases you are not using a source to copy from, but are providing references which demonstrate that name being in use. For that, no licence is required (that is, you’re not copying data, just backing up your claim).


I don’t think that’s a good idea. If English-speaking folks are missing an English name, they will add it.


(By the way, the Dutch ‘Maidanplein’ is weird if you look at it closely, because it literally reads ‘Square Square’. Languages are silly.)

@aighes I see your point. However, some mappers might feel that it’s up to local Ukrainian cartographers to decide how these names should be added or translated. I’m concerned that with this approach, the data in this tag might remain inconsistent.

In general there are two schools of thought for romanisations in OSM. One is name:xx-Latn, where xx is the code for the language in question (e.g. name:ja-Latn). This is used in places like Japan or Korea as mentioned before.

Then there is int_name that is used in places like Bulgaria, North Macedonia, Greece for names using standard national romanisation systems.

Some places, like Serbia, use both, since they have a local Latin script (Gaj’s Latin alphabet) and a more “simplified” romanisation without special characters.

From taginfo int_name seems to have worldwide use, but it is inconsistent. In the places I mention above, I know for a fact almost every street has an int_name, and renderers support it as such (e.g Organic Maps).

I concur with the above that name:en is subpar. Not everyone who can’t read Cyrillic speaks English, and transliterated names are not English; they are Ukrainian written in Latin script. name:en is already populated with English-language exonyms, which are not always the same as the ones in other languages (e.g. English Lviv vs Croatian Lavov vs Spanish Leópolis), while romanisations from Ukrainian are consistent across the board.


So, for now, a preliminary conclusion can be made—name:en in Ukraine should be used for the English exonym if it exists. The correct way to fill in name:en is to base it on a reliable source. If the exonym does not exist, name:en should be left blank.

Romanization of names in int_name or name:uk-Latn is a better approach. This way, the mapper doesn’t have to create an exonym themselves. It’s a bit simpler, but there are still some nuances. I’m already seeing the first problem with romanization, which involves names containing numbers or ordinal numerals:

| Translation (for your convenience) | Numeral with digits | Numeral in words | Romanization with digits | Romanization with numeral in words |
|---|---|---|---|---|
| 1st Brick Lane | 1-й Цегельний провулок | Перший Цегельний провулок | 1-y Tsehelnyi Provulok | Pershyi Tsehelnyi Provulok |
| 3rd Righteous Drive | 3-й проїзд Праведників | Третій проїзд Праведників | 3-y Proizd Pravednykiv | Tretii Proizd Pravednykiv |
| Cherkasy 700th Anniversary Square | площа 700-річчя Черкас | Площа Семисотріччя Черкас | Ploshcha 700-richchia Cherkas | Ploshcha Semysotrichchia Cherkas |
| June 28th Street | вулиця 28 Червня | вулиця Двадцять Восьмого Червня | Vulytsia 28 Chervnia | Vulytsia Dvadtsiat Vosmoho Chervnia |

To me, the goal of romanization is to allow users to read and pronounce the name. Therefore, the numeral should be written in words and this written form should be romanized. As before, I’m not sure if this is the right approach. I’d be interested to know how suitable this idea is.
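
As a rough illustration of the "words first, then romanize" idea, here is a small Python sketch of the first step only. Everything in it is an assumption for the sake of the example: `spell_out_ordinals` is a made-up helper, and the lookup table holds just the two ordinals from the table above. A real table would need every ordinal in every gender and case, which is exactly where this gets complicated.

```python
# Hypothetical lookup table with only the two ordinals from the examples above.
# Ukrainian ordinals change with gender and case, so a complete table would be
# much larger than this.
ORDINALS_IN_WORDS = {
    "1-й": "Перший",
    "3-й": "Третій",
}


def spell_out_ordinals(name):
    """Replace known digit-based ordinals in a name with their written form."""
    words = name.split(" ")
    return " ".join(ORDINALS_IN_WORDS.get(word, word) for word in words)


print(spell_out_ordinals("1-й Цегельний провулок"))
print(spell_out_ordinals("3-й проїзд Праведників"))
```

The result would then be romanized with the usual table. Names that the lookup does not cover are left untouched, so a mapper can still see which ones need manual attention.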


Agreed on that, though I’m wondering why it’s name:uk-Latn and not name:en-Latn. Would there be a different romanisation for en_US?

name:uk represents the name in Ukrainian, and I think name:uk-Latn indicates the Latinized form of the Ukrainian name.

:scream: Stupid me, my bad…