Filling in the `name:en` tag in Ukraine

I also recommend that you specify as a community which romanisation system will be used (presumably the national Ukrainian standard). This way it can be independently conducted/verified even by non-Ukrainian speakers, provided they read the guidelines carefully.

Personally, I meant the system you mentioned. It is officially approved by the Cabinet of Ministers of Ukraine. I believe it will be used, as our community is currently adopting it.

Speaking as a retired U.S. diplomat who served in the USSR and later in two former Soviet countries, I faced this issue when reporting in English to Washington readers about events in locations like this. In the U.S. government, the rules for naming places are established by the Board on Geographic Names, and foreign names are determined by the National Geospatial-Intelligence Agency. In the case of Kyiv, you will find Kyivska Oblast and Kyivskyi Raion (the latter in Donetska Oblast). I hope this info is useful as you ponder how to tag Ukrainian places with name:en.

Thank you for these examples! I did some further research and found that the Permanent Committee on Geographical Names for British Official Use (PCGN) and the United States Board on Geographic Names (BGN), the authorities responsible for the standardization of geographical names, adopted in 2019 the Ukrainian national romanization system that has been in use since 2010.

ROMANIZATION_OF_UKRAINIAN.pdf
ROMANIZATION_UKRAINIAN_Feb22_75_.pdf

On a personal note, I would like to add that in live Ukrainian Cyrillic texts, various types of apostrophes may appear, not just the “’” specified in the table. Therefore, when writing code for automatic Romanization, it’s important to account for different types of apostrophes that should be omitted.
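As a rough illustration of the point about apostrophes, here is a minimal Python sketch. The function name `drop_apostrophes` and the list of apostrophe-like characters are my own assumptions, not part of the agreement; extend the list to whatever actually shows up in your data. An apostrophe is removed only when it sits between two Cyrillic letters, so Latin-script names are left alone.

```python
import re

# Apostrophe-like characters that may appear in Ukrainian text.
# This list is an assumption and is probably not exhaustive:
# ' (U+0027), ’ (U+2019), ʼ (U+02BC), ‘ (U+2018), ` (U+0060), ´ (U+00B4)
APOSTROPHES = "'\u2019\u02bc\u2018`\u00b4"

# Match an apostrophe variant only between two Cyrillic letters.
_INNER_APOSTROPHE = re.compile(
    rf"(?<=[\u0400-\u04ff])[{re.escape(APOSTROPHES)}](?=[\u0400-\u04ff])"
)

def drop_apostrophes(name: str) -> str:
    """Remove apostrophe variants that occur inside Cyrillic words."""
    return _INNER_APOSTROPHE.sub("", name)
```

For example, `drop_apostrophes("Знам’янка")` and `drop_apostrophes("Знам'янка")` both give `Знамянка`, while `McDonald's` is returned unchanged.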

Here is the text of the agreement, should anyone wish to use it independently:

Romanization of Ukrainian

BGN/PCGN 2019 Agreement

The BGN/PCGN system for Ukrainian was designed for use in romanizing names written in the Ukrainian alphabet. It is an adoption of the new Ukrainian national system in use since 2010, and supersedes the BGN/PCGN 1965 System for Ukrainian.[1]

| No. | Ukrainian | Unicode Cyrillic[2] | Romanization |
|---|---|---|---|
| 1 | А, а | 1040, 1072 | a |
| 2 | Б, б | 1041, 1073 | b |
| 3 | В, в | 1042, 1074 | v |
| 4 | Г, г | 1043, 1075 | h[3] |
| 5 | Ґ, ґ | 1027, 1107 | g |
| 6 | Д, д | 1044, 1076 | d |
| 7 | Е, е | 1045, 1077 | e |
| 8 | Є, є | 1028, 1108 | ye initially, ie elsewhere[4] |
| 9 | Ж, ж | 1046, 1078 | zh |
| 10 | З, з | 1047, 1079 | z[3] |
| 11 | И, и | 1048, 1080 | y |
| 12 | І, і | 1030, 1110 | i |
| 13 | Ї, ї | 1031, 1111 | yi initially, i elsewhere[4] |
| 14 | Й, й | 1049, 1081 | y initially, i elsewhere[4] |
| 15 | К, к | 1050, 1082 | k |
| 16 | Л, л | 1051, 1083 | l |
| 17 | М, м | 1052, 1084 | m |
| 18 | Н, н | 1053, 1085 | n |
| 19 | О, о | 1054, 1086 | o |
| 20 | П, п | 1055, 1087 | p |
| 21 | Р, р | 1056, 1088 | r |
| 22 | С, с | 1057, 1089 | s |
| 23 | Т, т | 1058, 1090 | t |
| 24 | У, у | 1059, 1091 | u |
| 25 | Ф, ф | 1060, 1092 | f |
| 26 | Х, х | 1061, 1093 | kh |
| 27 | Ц, ц | 1062, 1094 | ts |
| 28 | Ч, ч | 1063, 1095 | ch |
| 29 | Ш, ш | 1064, 1096 | sh |
| 30 | Щ, щ | 1065, 1097 | shch |
| 31 | Ю, ю | 1070, 1102 | yu initially, iu elsewhere[4] |
| 32 | Я, я | 1071, 1103 | ya initially, ia elsewhere[4] |
| 33 | Ь, ь | 1068, 1100 | not romanized[4] |
| 34 | ’ | 0146 | not romanized[4] |

  1. The 2019 system was adopted by BGN and PCGN after monitoring a good level of implementation of the national system within Ukraine. Note, however, that this system is not recommended for reverse transliteration; take caution when attempting to convert a romanized name back into Ukrainian. This system also lacks the methodology outlined in the 1965 System to provide additional differentiation between digraphs and individual character sequences. For example, unlike the 1965 System, the 2019 System doesn’t differentiate the special character sequences зг, кг, сг, тс, and тсг (previously romanized as z∙h, k∙h, s∙h, t∙s, and ts∙h) from the digraphs zh, kh, sh, ts, and the letter sequence tsh, which are used to render the characters ж, х, ш, ц and the character sequence тш.

  2. To use the keyboard Unicode function, hold ALT and enter the sequence listed in the table.

  3. The character sequence З Г, previously romanized as z∙h, is romanized zgh under the 2019 system.

  4. These characters differ significantly in romanization from the BGN/PCGN 1965 system.
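For anyone who wants to experiment, the table and footnote 3 can be turned into a small routine. This is only a minimal Python sketch under my own assumptions, not an official implementation: the names `romanize`, `BASE`, `POSITIONAL` and `APOSTROPHES` are made up for illustration, "initially" is interpreted as "at the start of a word", the list of apostrophe variants is a guess, and capitalisation is handled naively (an upper-case Щ always becomes `Shch`, even in an all-caps word).

```python
# Letters with a single romanization (lowercase forms only).
BASE = {
    "а": "a", "б": "b", "в": "v", "г": "h", "ґ": "g", "д": "d", "е": "e",
    "ж": "zh", "з": "z", "и": "y", "і": "i", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
}
# Letters that depend on position: (word-initial, elsewhere).
POSITIONAL = {
    "є": ("ye", "ie"), "ї": ("yi", "i"), "й": ("y", "i"),
    "ю": ("yu", "iu"), "я": ("ya", "ia"),
}
LETTERS = set(BASE) | set(POSITIONAL)
# Apostrophe variants to omit (assumed list, extend as needed).
APOSTROPHES = set("'\u2019\u02bc\u2018`\u00b4")

def romanize(text: str) -> str:
    out = []
    prev = ""  # previous Ukrainian letter (lowercase); "" at a word start
    for i, ch in enumerate(text):
        low = ch.lower()
        nxt = text[i + 1].lower() if i + 1 < len(text) else ""
        # The soft sign is always dropped; an apostrophe is dropped only
        # when it sits between two Ukrainian letters.
        if low == "ь" or (low in APOSTROPHES and prev and nxt in LETTERS):
            continue
        if low in POSITIONAL:
            rom = POSITIONAL[low][1 if prev else 0]
        elif low == "г" and prev == "з":
            rom = "gh"  # footnote 3: зг is romanized zgh
        elif low in BASE:
            rom = BASE[low]
        else:
            out.append(ch)  # digits, spaces, Latin letters pass through
            prev = ""
            continue
        out.append(rom.capitalize() if ch.isupper() else rom)
        prev = low
    return "".join(out)
```

With this sketch, `romanize("Київська область")` gives `Kyivska oblast` and `romanize("Згорани")` gives `Zghorany`. As footnote 1 warns, the result should not be expected to convert back to the original Cyrillic.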

Your considerations:

and the English Wikipedia uses:

and this article describes “oblast” not as a region but as a province (in some other articles also as a region)

My opinion: Ukrainian mappers should not translate Ukrainian geographical names into another language, not even name:en. Please leave this to native speakers. They know best how Ukrainian geographical names are used in their own native language.

Isn’t romanisation (or transliteration, as it is often referred to) a solved problem already? I believe some apps have built-in transliteration to show any name in Latin script, rendering int_name and name:xx-Latn redundant.
This is also part of the reason I don’t see much value in romanisation on its own; e.g. int_name was added across the whole of Belarus using a script, and has not really been maintained since.

Yes, it’s possible.

In practice, this is difficult to do. Imagine that I want to build a mapping service. In addition to the main tasks, I would still need to implement about 100 algorithms to romanize local names. And if there are exceptions, I would have to maintain a whole database for them. Why do we need another cartographic database in addition to OSM?

The mechanical approach has significant limitations. Romanisation is truly useful primarily for purely Ukrainian names, but not all objects in Ukraine have names of Ukrainian origin. There are many international brands whose names are derived from other languages, not to mention the mixed use of Cyrillic and Latin characters, as well as numerals in some cases. All of this makes the situation far more complex than a simple algorithmic approach can handle effectively.

I think that the correct approach by data consumers is to display (or search for) int_name / name:en / name:uk-Latn where specified, and otherwise fall back to the default transliteration rule for the given language. With the additional catch that “given language” sometimes has to be determined heuristically – not every name within Ukraine (or Serbia or Greece or…) is necessarily in the Ukrainian language. However, in those cases mappers ought to specify int_name.
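A minimal sketch of that fallback in Python, to make the idea concrete. Everything here is an assumption for illustration: the function name `display_name`, the order in which the tags are tried, and the very crude "contains Cyrillic" check standing in for real language detection. The `transliterate` argument is whatever romanization function the consumer chooses to plug in.

```python
from typing import Callable, Mapping, Optional

# Tags tried before falling back to transliteration.
# The order of preference is an assumption, not an agreed rule.
PREFERRED_KEYS = ("name:en", "int_name", "name:uk-Latn")

def has_cyrillic(text: str) -> bool:
    """Crude heuristic: does the text contain any Cyrillic character?"""
    return any("\u0400" <= ch <= "\u04ff" for ch in text)

def display_name(
    tags: Mapping[str, str],
    transliterate: Callable[[str], str],
) -> Optional[str]:
    """Pick a Latin-script label for an OSM object."""
    # 1. Use a mapper-supplied value where one exists.
    for key in PREFERRED_KEYS:
        value = tags.get(key)
        if value:
            return value
    # 2. Otherwise fall back to transliterating the local name.
    name = tags.get("name:uk") or tags.get("name")
    if not name:
        return None
    # Names with no Cyrillic at all are returned as they are.
    return transliterate(name) if has_cyrillic(name) else name
```

A name that mixes scripts or comes from another language still goes through `transliterate` here, which is exactly the case where an explicit int_name or name:en from a mapper is needed.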

If anyone’s still interested — I wrote down my further thoughts on this topic in my diary and started a discussion in the Ukrainian forum thread about a tag for romanization.