Choosing the right English Name for a POI

After encountering a name change of a well-known public park in downtown Chiang Mai, I became curious.

The English and Thai names on OpenStreetMap are listed as “Buak Had Park” and “Nong Buak Haad Park” respectively, with the Thai name shown as “สวนบวกหาด”.

However, Mapillary imagery shows that the sign at the main park entrance reads “สวนสาธารณะหนองบวกหาด” in Thai and “Buak Hard Public Park” in English.

Considering that “หาด” is often transliterated as “Haad” in Thailand, and other map providers show the park as “Nong Buak Haad Public Park”, there are several options for the English name to be used in the “name:en” field:

  • A) The original main entrance sign English name "Buak Hard Public Park"
  • B) The above option with the “Haad” variation: "Buak Haad Public Park"
  • C) The translation of the main entrance Thai name "Nong Buak Hard Public Park" (Google Translate)
  • D) The above option with the “Haad” variation: "Nong Buak Haad Public Park"

As it is not uncommon to have multiple signs with mismatching English names, option C) would be a reasonable choice. What do you think?

Since there’s no official or well-known consistent transliteration, the RTGS system should be used. So, in this case, it should be "Nong Buak Hat Public Park".

@nitinatsangsit Can you please suggest an online service that utilizes the RTGS system?

I found that someone put this on the Thailand wiki.

http://pioneer.chula.ac.th/~awirote/resources/thai-romanization.html

By the way, it’s not always perfect: the RTGS system transliterates by sound rather than by character, so the result depends on the program’s vocabulary database. But it’s the best we have.

Thanks. There is also an online version for this RTGS software:

Now, this service provides full romanization of Thai words (lowercase).

However, our wiki indicates that certain English terms may need to be retained, such as ‘Phuket Province’ instead of ‘Changwat Phuket’.

But we seem to use ‘Soi’ instead of ‘Street’, ‘Wat’ instead of ‘Temple’, and ‘Ban’ instead of ‘Village’.

So, how do we decide whether it should be ‘Satharana Nong Buak Hat’ or ‘Nong Buak Hat Public Park’?

This is sometimes controversial, and there may be no obvious rules. I think it should be decided by what is more commonly used.

For a public park, I think we should use English words.


Another highly recommended online resource is the robust ChatGPT, available at https://chat.openai.com/. It has demonstrated impressive accuracy thus far:

When I inquired with ChatGPT about the system it employs and the rationale behind its choice of “Hat” rather than “Haad” or “Hard,” here is the response I received:


We can establish rules for commonly used terms, and based on that I can suggest a wiki revision for the Multilingual Names section, e.g.

Generic and common terms for describing places or objects should be translated using the English term, with notable exceptions for "Wat" (Temple), "Soi" (Street), and "Ban" (Village).

Are there any other exceptions that come to mind?


For me, what I often found are Khlong (canal) and Ko (island).
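
The rule proposed above (translate generic terms, keep a few such as Wat, Soi, Ban, Khlong and Ko transliterated) could be sketched roughly like this. It assumes the input is the lowercase RTGS output of a romanizer; the function name `name_en_from_rtgs` and the small term table are hypothetical and only illustrative, not an agreed community dictionary.

```python
# Hypothetical helper: turn lowercase RTGS output into a name:en candidate.
# The table below is an illustrative assumption, not an agreed word list.

# Generic terms that are translated to English and moved behind the proper name.
# Keys are the RTGS romanizations of the Thai generic term.
TRANSLATE = {
    ("suan", "satharana"): "Public Park",
    ("changwat",): "Province",
    ("thanon",): "Road",
    ("rongrian",): "School",
}


def name_en_from_rtgs(rtgs: str) -> str:
    """Build an English name from a lowercase RTGS string."""
    words = rtgs.lower().split()
    # Try the longest generic prefix first so multi-word terms win.
    for prefix in sorted(TRANSLATE, key=len, reverse=True):
        n = len(prefix)
        if tuple(words[:n]) == prefix and len(words) > n:
            proper = " ".join(w.capitalize() for w in words[n:])
            return f"{proper} {TRANSLATE[prefix]}"
    # Exceptions such as "wat", "soi", "ban", "khlong", "ko" are not in the
    # table, so they (and any unknown term) simply stay transliterated.
    return " ".join(w.capitalize() for w in words)


print(name_en_from_rtgs("suan satharana nong buak hat"))
print(name_en_from_rtgs("wat pho"))
```

Keeping the exceptions out of the table, rather than listing them separately, means a new exception needs no code change; only terms that should be translated have to be added.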


Note that while ChatGPT and similar tools may provide consistent answers and be useful in some contexts, they often behave erratically. The sources and rationale they claim are themselves generated and often do not match the actual ones (it will happily hallucinate nonexistent sources or claim to employ a method that does not match the outcome).


I also strongly suggest NOT using ChatGPT for answers requiring factual correctness.
A GPT is supposed to provide plausible answers similar to the input provided in training. It frequently fails on facts, but keeps presenting the answer in a way that looks right. It even gives you plausible-sounding answers that can be completely wrong.

For naming, we agreed long ago that we use RTGS transliteration in name:en tags, and that we apply a translation where applicable. Road/School are translated, Wat is kept as a transliteration.

A good guideline is: how would a native Thai speaker name the POI in English when you ask about it? They would send you to Sukhumvit Road and recommend visiting Wat Pho.

For the software aspect, we have had various threads over the last few years.
I prefer tltk (on PyPI).

Khun Wirote also replied very quickly when we reported problems in the past. I think Sven Geggus used it for the transliteration feature of the German map style. I don’t know whether it is still in use there. The change was done at least 4 years ago on a hack weekend in Karlsruhe.

Regarding the naming: A sort of tiny “dictionary” might be useful. I have some basics in my scripts where I process my photomapping with my “OSM keyboard”. I think I blogged about it in the past.
If you are interested, we could open a dedicated thread for it. While at it I would then also like to review/standardize/update the JOSM presets.

Also interesting for naming are words that were transliterated the other way, from English into Thai. You will frequently find these at shops. For example: บุฟเฟ่ต์, คาร์เซ็นเตอร์, แมนชั่น and similar. You want to use the original English wording here and not RTGS.

To come back to the initial topic, the city park:
RTGS is the standard to follow. Signs frequently differ: either they pre-date the RTGS standard, or whoever was in charge of the signs did not follow it correctly. And as mentioned, there can be slight variations in transliteration. There is no strict 1:1 mapping.
If the wording on a sign is different, store the sign’s bad spelling in an alt_name:en tag. That way search still works. Nominatim (or other software) will pick it up.
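
As a rough sketch of that tagging, the park's tags could look like the dictionary below. The name:th value is taken from the entrance sign quoted earlier in the thread, and `english_search_terms` is a hypothetical toy stand-in for what a search index might collect; it is not how Nominatim actually works internally.

```python
# Illustrative tag set for the park discussed in this thread (assumption:
# name:th follows the entrance sign, name:en follows RTGS).
park_tags = {
    "leisure": "park",
    "name:th": "สวนสาธารณะหนองบวกหาด",
    "name:en": "Nong Buak Hat Public Park",   # RTGS-based spelling
    "alt_name:en": "Buak Hard Public Park",   # spelling on the entrance sign
}


def english_search_terms(tags: dict) -> list:
    """Collect every English spelling a search could match (toy example)."""
    terms = []
    for key in ("name:en", "alt_name:en"):
        # Multiple alternative names are conventionally separated by ';'.
        for value in tags.get(key, "").split(";"):
            value = value.strip()
            if value:
                terms.append(value)
    return terms


print(english_search_terms(park_tags))
```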


One of the things I learned when I started mapping is that when a street name conflicts with what others have tagged, the physical sign at the road reigns supreme. The exception is abbreviations, which are written out in full: here, S. Salvatore becomes San Salvatore. The renderer can decide to abbreviate when space is cramped, as with the classic Via Cavour here, whose full name is about 50 characters long. I apply the same rule to everything else, so for this park I have no doubt: name:en should be what is signed over the main entrance.

@SekeRob this might work in other parts of the world. In Thailand, the English name is simply spelled wrong in many cases and does not follow RTGS.

That’s why the “correct” spelling of the romanized name (translated/transliterated) goes in “name:en”. To not lose the “ground” detail, the “alt_name:en” tag is used.

This keeps badly spelled names out of a prominent location while still allowing data users to search for the misspelled name.

While Thailand has quite a high literacy rate (93.8%, compared to e.g. 86% for the United States), that is about reading and writing Thai script. English is taught at school, but the level is relatively poor. That’s why you can find many examples of “funny” English signs. Similarly, the transliteration quality is often not that great. It also depends on which authority is responsible for the sign.


This is another good example of how RTGS would be used. “Sukhumvit” is commonly used and very consistent on signs and in English documents, so we use it instead of RTGS’ “Sukhumwit”.

very good catch.
I wanted to point to the translation/transliteration aspect and we got a quite good example of inconsistent RTGS usage.

Wikipedia in this case lists both the non-RTGS transliteration and the RTGS one (along with the original Thai script): Sukhumvit Road - Wikipedia

So yes: We have many examples of non-official transliterations on signs, which are sometimes so common, that they are widely used.

Someone with more history/linguistics background might explain how Sukhumwit got to Sukhumvit.
In my native language the pronunciation of “w” is much closer to the intended sound than “v”.

Still: all sources state that RTGS is the official transliteration system.

And I still recommend to put both spellings into the OSM database. But which one should be in name:en and which in alt_name:en gets tricky.

The “Sukhumvit (Road)” spelling is now internationally so established, that it either counts as translation or maybe as int_name, because internationally that road is known by this name and not the RTGS one.
I am not a big fan of int_name, because it was so frequently misused in place of name:en in the past. The city name “Bangkok” instead of “Krung Thep” sounds like a valid use case.

Are there any good examples for a tagging guideline? A common-sense approach would be: if a sign is completely wrong, or only some out of many signs differ, go with RTGS. If the wrong spelling is widely and consistently used, go for that in name:en.
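
One way to write that common-sense rule down is sketched below. Whether a spelling counts as "widely and consistently used" remains a human judgment, so the hypothetical function `choose_english_tags` simply takes it as an optional argument; the function and parameter names are assumptions for illustration only.

```python
def choose_english_tags(rtgs_name, sign_spellings, established_spelling=None):
    """Pick name:en and alt_name:en for one feature.

    rtgs_name: the RTGS-based English name.
    sign_spellings: spellings seen on signs (may differ from RTGS).
    established_spelling: a non-RTGS spelling judged by a mapper to be
        widely and consistently used, or None if there is no such spelling.
    """
    # An established spelling wins; otherwise fall back to RTGS.
    primary = established_spelling or rtgs_name

    # Everything else is kept as an alternative so that search still works.
    alternatives = []
    for spelling in [rtgs_name, *sign_spellings]:
        if spelling != primary and spelling not in alternatives:
            alternatives.append(spelling)

    tags = {"name:en": primary}
    if alternatives:
        tags["alt_name:en"] = ";".join(alternatives)
    return tags


# The park: the sign differs from RTGS and no spelling is clearly established.
print(choose_english_tags("Nong Buak Hat Public Park", ["Buak Hard Public Park"]))

# The road: the non-RTGS spelling is widely and consistently used.
print(choose_english_tags("Sukhumwit Road", ["Sukhumvit Road"],
                          established_spelling="Sukhumvit Road"))
```

Either way both spellings end up in the database; the rule only decides which one is shown prominently.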


Maybe to add: I thought a bit about referring to Wikipedia as a source to decide whether something is internationally known by a different spelling, as this would then be the key to the English wiki page. But that might lead to the “Scots Wikipedia” problem, where bad and made-up words were circularly copied around and used as references.

Aside from the RTGS system, which is transliterated by sound, another commonly used system is transliteration by character, which can be used for a proper noun derived from Pali or Sanskrit words. It is widely used for royal family names as well as some noble titles. This system, for example, is used by Suvarnabhumi Airport rather than RTGS’ Suwannaphum.

Sukhumvit Road is named after พระพิศาลสุขุมวิท (RTGS: Phra Phisan Sukhumwit; by character: (Phra) Bisal Sukhumvid). The odd part is that the v is transliterated by character, while the t follows the RTGS system. The mix of systems is strange, and we have no idea why. However, it is now widely used, so just use it. :smile:

From this thread, I’ve learned that:

  1. English names should be based on how they are commonly used by local and English-speaking communities, and may not always align with the RTGS system.
  2. There are many variations of names, making it challenging to determine the most common one.

Two questions arise:

A. Why are there so many variations?

Could it be due to limited accessibility of RTGS systems?

  • RTGS software is hard to find and requires specific programming skills.
  • The TLTK web version does not retain generic terms and word cases.
  • Everyday web services like Google/Bing Translate do not rely on the RTGS system.

B. How to choose the most common translation?

A quick Google search for the mentioned park leads to various results.

Which ones generate the most hits?

  • The original park sign spelling, “Buak Hard Public Park”, gets by far the most hits, which suggests the spelling on the physical sign is widely used.
  • The second most common variation, “Buak Hat Park”, matches the Wikipedia entry.
  • Note that the correct RTGS transliteration, “Nong Buak Hat Public Park”, is almost not used at all.
"Suan Buak Haad Park": 1380 results
"Buak Haat Public Park": 2970 results
"Buak Haad Park": 9010 results
"Nong Buak Haad Public Park": 1740 results (Google translation)
"Buak Hard Public Park": 41600 results (TripAdvisor, Facebook)
"Buak Hat Park": 21400 results (Wikipedia)
"Nong Buak Hard Public Park": 8740 results
"Nong Buak Hat Public Park": 260 results (RTGS)

The number of Google search results is interesting. By the way, in Thai, สวนบวกหาด returns 757,000 results, while สวนสาธารณะหนองบวกหาด returns 124,000 results. This shows that the number of Google results should not be used to determine the correct spelling, because casual spoken names may be more popular than formal ones.
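
Purely to show how skewed the English-spelling numbers quoted earlier in the thread are, the hit counts can be ranked and turned into shares as in the small sketch below. The counts are copied from the post above; as noted, they say something about popularity, not about which spelling is correct.

```python
# Google hit counts per exact phrase, as quoted earlier in this thread.
hits = {
    "Suan Buak Haad Park": 1380,
    "Buak Haat Public Park": 2970,
    "Buak Haad Park": 9010,
    "Nong Buak Haad Public Park": 1740,
    "Buak Hard Public Park": 41600,
    "Buak Hat Park": 21400,
    "Nong Buak Hard Public Park": 8740,
    "Nong Buak Hat Public Park": 260,
}

total = sum(hits.values())

# Sort from most to least hits.
ranked = sorted(hits.items(), key=lambda item: item[1], reverse=True)

for name, count in ranked:
    # Show the raw count and its share of all hits across the variants.
    print(f"{count:>6}  {count / total:6.1%}  {name}")
```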

In general, according to the OSM “on ground” principle, the sign should be the best evidence of the name spelling, but in Thailand, I want to add one point of worry, which is the “reliability” of the sign.

For example, the shopping mall’s name, which is consistently spelled across their signs, advertisements on Facebook, websites, and so on, is far more reliable than the government’s sign, which may rely solely on the opinion of the people who write it, and sometimes even not the authority but the construction contractor. Imagine that 10 years later, the Chiang Mai City Municipality is renovating that sign, and the spelling has changed slightly.

This leads to a simple answer: 99% of places, especially public ones, should be transliterated using the RTGS system, as it is the Thai government’s official transliteration system. If the RTGS system is not to be used, we should have strong and convincing evidence of the correct spelling.

In my experience, the simple answer is that most Thai people (and certainly some public servants) don’t even know that this system exists on earth. As you can see, 99% of Thai people’s names do not follow the RTGS standard. That’s why หาด can be transliterated as haad, hard, haat, etc., as they like.


Blog post from… “Buak Hard public park” in Chiang Mai

Just wondering: for such a big attraction, maybe kiosks in the neighbourhood sell postcards (yes, I’m from the previous century), and what would be printed on the front and/or back?

PS, DuckDuckGo gives the same hit distribution as Google, without advertisements.