RFC: Proposal for bulk name changing in Azerbaijan prepared with AI assistance

There is a proposal for bulk name changing in Azerbaijan prepared with AI assistance. The spreadsheet report is based on the ready-to-upload changeset file.

@menkaura

I’m in no hurry to upload these changes. Let’s review them carefully and discuss further if needed.

I’ve also sent a report to the DWG regarding this naming practice.

2 Likes

Your AI turned out to be not so intelligent after all. İmperator Domisian is not the name of a settlement, but the name of a Roman emperor, in whose name legionnaires left an inscription on a stone in the Gobustan historical reserve in Azerbaijan…
And I’m sure the AI will make a lot of mistakes, and no one can guarantee against such AI errors. Your AI will simply cripple the map of my country, which I’ve been working on for 13 years…

5 Likes

I am a Turkish person from Turkey, not an Azerbaijani Turk. However, since we speak almost the same language, I can also notice some other tagging errors in this list. From what I can see in the first 1,000 lines, there are many descriptive name=* tags in the list.

The following items on the list can also be corrected:

For example, there are places tagged with name=Avtoservis. “Avtoservis” is a generic name. “Avtoservis” means “car repair shop.” It seems that tags with name=Avtoservis can be removed entirely.
The same applies to:
- shop=car_parts entries with name=Avto Ehtiyyat hissələri and name=Ehtiyyat hissələri
- amenity=car_wash name=Avtoyuma and name=Автомойка
- shop=car name=Maşın Bazar
- amenity=police name=Polis and name=Polis Şöbəsi and name=Polis Məntəqəsi
- leisure=sports_centre name=İdman Məktəbi
- shop=convenience name=Market
- amenity=place_of_worship name=Məscid
- shop=hardware name=Tikinti Materialları
- shop=hairdresser name=Bərbərxana and name=Bərbər
- shop=pet name=Zoo Mağaza
- amenity=pharmacy name=Aptek
- shop=butcher name=Ət Dükanı
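A rough sketch of how such generic names could be flagged for review rather than removed blindly. The lookup table holds only a few pairs from the list above, the function name is hypothetical, and it assumes the part after the backslash has already been stripped from name=*.

```python
# Hypothetical lookup: a few (key, value) -> generic names pairs from this post.
# It is an illustration, not a complete or authoritative list.
GENERIC_NAMES = {
    ("shop", "car_parts"): {"Avto Ehtiyyat hissələri", "Ehtiyyat hissələri"},
    ("amenity", "car_wash"): {"Avtoyuma", "Автомойка"},
    ("shop", "hairdresser"): {"Bərbərxana", "Bərbər"},
    ("amenity", "pharmacy"): {"Aptek"},
}


def has_generic_name(tags):
    """Return True if name=* merely repeats what the feature tag already says."""
    name = tags.get("name")
    if name is None:
        return False
    for (key, value), names in GENERIC_NAMES.items():
        if tags.get(key) == value and name in names:
            return True
    return False


# Candidates should still be reviewed by someone who knows the language.
print(has_generic_name({"amenity": "pharmacy", "name": "Aptek"}))
```

Matching on the feature tag as well as the name keeps a word like “Market” from being flagged on objects where it might be part of a real name.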

For amenity=bank elements whose name=* tag includes the phrase “+ ATM”, the atm=yes tag can be added and the “+ ATM” phrase removed from name=*. For example, see Node: Expressbank + ATM \ Sumqayıt (1809771351) | OpenStreetMap. The phrase name=Expressbank + ATM likely indicates that the bank has an ATM, so adding atm=yes to the object may be more appropriate.
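As a minimal sketch of that idea, assuming the settlement part after the backslash has already been removed and that the marker is always written as a trailing “+ ATM” (spacing and case may vary in the real data). The function name and the regular expression are illustrative assumptions.

```python
import re

# Trailing "+ ATM" marker; tolerant of spacing and case (an assumption).
ATM_SUFFIX = re.compile(r"\s*\+\s*ATM\s*$", re.IGNORECASE)


def split_atm_from_name(tags):
    """Return a copy of tags with a trailing '+ ATM' moved into atm=yes."""
    new_tags = dict(tags)
    name = tags.get("name", "")
    # Only touch banks whose name really ends with the marker.
    if tags.get("amenity") == "bank" and ATM_SUFFIX.search(name):
        new_tags["name"] = ATM_SUFFIX.sub("", name)
        new_tags["atm"] = "yes"
    return new_tags


print(split_atm_from_name({"amenity": "bank", "name": "Expressbank + ATM"}))
```

Objects that are not tagged amenity=bank are returned unchanged, so a standalone ATM would need a separate decision.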

The abbreviation “FHN” appears in multiple items (FHN = Fövqəladə Hallar Nazirliyi = Ministry of Emergency Situations). This is the name of one of the ministries in Azerbaijan, and I’m not sure it belongs in the name tag. In some elements, operator=Fövqəladə Hallar Nazirliyi might be the correct usage, while in others it might be short_name=FHN or operator:short=FHN.
A similar situation applies to DYP (Dövlət Yol Polisi). If the list is examined in more detail, more errors of this kind will likely be found. I haven’t gone through the entire list yet; for now, these are the ones I can see.

3 Likes

Good catch!

I’ve added a column that shows the number of occurrences of each value that is supposed to be deleted. We can leave untouched those values that have fewer than, let’s say, 20 occurrences. The sheet can also be sorted by the number of occurrences for easier validation.
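For anyone who wants to reproduce that column outside the spreadsheet, here is a small sketch. The function names are hypothetical, the threshold of 20 is just the figure floated above, and the input is assumed to be a plain list of name=* values.

```python
from collections import Counter


def suffix_counts(names):
    """Count how often each text after the first backslash occurs."""
    counts = Counter()
    for name in names:
        _, sep, tail = name.partition("\\")
        if sep:  # skip names that contain no backslash at all
            counts[tail.strip()] += 1
    return counts


def rare_suffixes(counts, threshold=20):
    """Suffixes seen fewer than `threshold` times, i.e. candidates to leave alone."""
    return sorted(s for s, n in counts.items() if n < threshold)


# Tiny made-up sample; real input would come from the Overpass export.
sample = ["Aptek \\ Şəki"] * 3 + ["Market \\ Qax", "No backslash here"]
counts = suffix_counts(sample)
print(counts.most_common())
print(rare_suffixes(counts, threshold=2))
```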

Hello @menkaura. First of all, thank you for the contributions you have been making for years.

I am generally not keen on this kind of bulk change either. But unfortunately the example you gave is a weak one. I am not saying we should trust AI tools completely. However, your own rate of incorrect tag usage may even be higher than the AI’s error rate. So far I have only looked briefly at the first 1,000 lines of the list, and I could not find any other error like the one in your example.

Is the proper name of the place in your example really “Roma Legionerlərin yazısı \ İmperator Domisian XII”? From what you wrote, the stone has no such name. You used the name=* tag “to describe the characteristics of the stone”. Do I understand correctly?

1 Like

My friend, see for yourself:

@menkaura
I had thought that the description=* tag would be more appropriate than name=*, in the form description=Roma Legionerlərin yazısı. İmperator Domisian - XII.

The text you sent does not give the stone’s “name”; it describes the stone’s characteristics. But if you say the stone’s name is “Roma Legionerlərin yazısı \ İmperator Domisian XII”, I won’t insist. The decision is yours.

Thank you for your contributions to Azerbaijan. Regards.

1 Like

Thank you, my friend.. It’s just that I feel as if I’ve been left on my own here.. Here they want to use AI where creative work is needed, without taking regional specifics into account, but this is a completely unfair approach..
Thanks again, your support is very important to me..

Procedural note: this import will need a separate forum thread dedicated to it.

Because of the data integrity risk of AI, you should be slicing up these suggestions into groups with similar confidence levels so that common cause issues can be reversed together if an issue is discovered late.

8 Likes

That sounds like a spectacularly bad idea.

If there are consistent “category errors” in the data (and it sounds like there might be, because of lack of community review so far, as flagged at the top of this thread), then please let’s not make things worse by using subhuman guesswork to “correct” them.

22 Likes

I could do the same work with a script instead of AI, because the task was just to remove the part of the name after the first backslash. I didn’t ask the AI to evaluate what this part is, a settlement or not.
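To make that concrete, the core of such a script could be as small as the sketch below. The function name is made up for illustration, and it deliberately makes no attempt to judge what the removed part means.

```python
def strip_after_backslash(name):
    """Drop the first backslash and everything after it; leave other names as is."""
    head, sep, _ = name.partition("\\")
    return head.rstrip() if sep else name


# The rule is purely mechanical, so it treats every suffix the same way:
print(strip_after_backslash("Expressbank + ATM \\ Sumqayıt"))
print(strip_after_backslash("Roma Legionerlərin yazısı \\ İmperator Domisian XII"))
```

The second call shows why review is still needed: the suffix there is not a settlement, yet it is removed just the same.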

I’m not suggesting uploading my changeset as is. It’s just a first iteration for review, to understand what has actually been put into the names.

Hello,

This is a thread to discuss the proposed bulk POI name change in Azerbaijan.

The discussion started in the thread Concerns about Azerbaijan.

To the name= tags of POIs, MenKauRa has been adding, after a backslash, also the name of the town. In the same changeset discussion, he says that he feels it necessary to do this since some people’s apps might not find the POI otherwise. It seems to me that MenKauRa has simply reinvented, in a poorer form, the long-deprecated is_in:city tag. The result is that names of POIs in Azerbaijan look like nowhere else; they look like the result of a bad import. As an example, see the cafe at 40.1731, 49.4699 and the other POIs to the northwest of it: there is absolutely no reason to specify the town in the name= tag (moreover, many of these are generic names, not the POI’s actual name).

First of all, I’d like to express gratitude and respect to @menkaura, who has been drawing the map of Azerbaijan for 13 years, and who has drawn almost all the thousands of the POIs I’m talking about now. I don’t agree with the naming approach that he used, but without his contribution, there wouldn’t be anything to discuss at all.

My proposal is to change such names in bulk by just removing the part after a backslash.

With AI assistance, I prepared a draft changeset to evaluate existing data, possible impact, and so on. Here is an overview spreadsheet: https://docs.google.com/spreadsheets/d/14-kqFjgPMc7wvABBiaqSpgs0j3mwOnWjDbBcqP5843k/edit?usp=sharing

There are three sheets in the spreadsheet:

  1. Changes. List of all tag changes sorted by the number of occurrences of each value of the name’s part after a backslash. It’s supposed to be the name of a settlement or another place where the POI is located.

    How did I get it? Via Overpass, I got all nodes in Azerbaijan that contain a backslash in the name tag. Then I used Claude to generate a changeset and a report, providing the .osm file with raw data from Overpass and an example of a changeset generated by JOSM (I manually changed the name of one POI), and asking it to remove everything after backslashes in name tags.

    To evaluate the changeset, I additionally asked it to generate the spreadsheet that you can see above (I used the same context, so theoretically Claude could have generated the report based not on the actual changeset but on its own previous response).

    The number of occurrences is calculated in the spreadsheet using a formula.

  2. Suspicious values. Via Overpass, I got all place=* objects in Azerbaijan. Then I gave this raw data and the previously generated report to Claude and asked it to find the substrings due to be removed from the name tag that don’t match any place in OSM.

    The majority of the values found are names of places anyway, such as streets, residential complexes, and so on, which just aren’t tagged with a place=* tag.

    Previously, two cases that obviously shouldn’t be changed the same way have been found manually. It was discussed here Concerns about Azerbaijan - #13 by evgenykatyshev

  3. Summary. Claude added this sheet on its own. It contains totals and the most frequent values.
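The check behind the “Suspicious values” sheet can also be sketched without AI, as a set comparison. This is only an illustration: the function name is hypothetical, both inputs are assumed to be plain lists of strings already extracted from the Overpass data, and matching is exact, which will miss spelling variants.

```python
def suspicious_suffixes(poi_names, place_names):
    """Return suffixes after the first backslash that match no known place name."""
    known = set(place_names)
    result = set()
    for name in poi_names:
        _, sep, tail = name.partition("\\")
        suffix = tail.strip()
        if sep and suffix and suffix not in known:
            result.add(suffix)
    return sorted(result)


# Made-up miniature inputs based on names mentioned in this thread.
pois = [
    "Aptek \\ Şəki",
    "Roma Legionerlərin yazısı \\ İmperator Domisian XII",
    "Market",
]
places = ["Şəki", "Qax"]
print(suspicious_suffixes(pois, places))
```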

To avoid possible AI-related mistakes, the same changeset and report might be generated with a plain script.

I’d like to discuss with the community the best way to perform this bulk update, if it should be done at all.

I’ve created the separate topic Proposal: bulk POI name change in Azerbaijan

First of all, I want to thank you for your kind words and acknowledge your persistence in seeking a creative approach to editing OSM maps.
But I don’t understand why it was necessary to open another thread when an identical one already exists, with specific opinions from OSM users:

https://community.openstreetmap.org/t/concerns-about-azerbaijan/113039/12

I remain of the opinion that using AI in creative, individual(!) mapping work will not lead to anything positive, but will only create additional problems and questions. I already gave an example in the previous thread. And given the regional specifics of my country, which I mentioned above, I consider the use of AI unacceptable..

5 Likes

I’ve redone the changeset with a Python script (see azerbaijan-names/fix_backslash_names.py at main · ekatyshev/azerbaijan-names · GitHub), so we can be certain that no AI hallucination is involved. You can see and download the resulting changeset file here: azerbaijan-names/changeset.osm at main · ekatyshev/azerbaijan-names · GitHub

Additionally, I’ve added a column to the report sheet showing the distance in kilometers from each object to the settlement found by matching the names. It doesn’t handle cases where several settlements share the same name, but when the number is less than 5 km, you can be sure there is a settlement with this name nearby.
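One common way to compute such a distance is the haversine formula. The sketch below is not necessarily what the report uses: the function names, the mean Earth radius of 6371 km, and the second coordinate pair are illustrative assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_near(poi, settlement, max_km=5.0):
    """True if the POI lies within max_km of the matched settlement node."""
    return haversine_km(*poi, *settlement) < max_km


# POI coordinates from the cafe example in this thread; the settlement
# coordinates are hypothetical.
print(is_near((40.1731, 49.4699), (40.18, 49.47)))
```

With several settlements of the same name, taking the minimum distance over all of them would be one way to cover the case the column currently does not handle.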

2 Likes

I uploaded a test changeset after manual review: Changeset: 183213088 | OpenStreetMap

I used the changeset generated by my script, but uploaded only 130 POIs located within a single settlement of Şəki.

I have made another 5 changesets for 5 places, each with manual review:

@menkaura please have a look.

Also, I accidentally made two bigger changesets (Changeset: 183297383 | OpenStreetMap and Changeset: 183298920 | OpenStreetMap), but the changes have been reverted within the same changesets. It happened because I mistakenly pressed Upload instead of Upload selected. I apologize for that. I’ve removed the Upload button from my JOSM toolbar, so it shouldn’t happen again.

I found another pattern of using backslashes in the name. It’s putting the name in another language or an alternative name after a backslash:

Just to let you know that I’m aware of such cases. They require moving that part of the name into another tag, like name:en or alt_name.
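A minimal sketch of that kind of move, assuming a human reviewer has already decided which tag the second part belongs in. The function name and the sample name are hypothetical, and nothing here tries to detect the language automatically.

```python
def move_suffix_to_tag(tags, target_key):
    """Return a copy of tags with the text after the first backslash moved to target_key."""
    new_tags = dict(tags)
    head, sep, tail = tags.get("name", "").partition("\\")
    suffix = tail.strip()
    # Never overwrite a value that somebody has already mapped.
    if sep and suffix and target_key not in tags:
        new_tags["name"] = head.rstrip()
        new_tags[target_key] = suffix
    return new_tags


# Hypothetical example; the reviewer chose name:en for this object.
print(move_suffix_to_tag({"name": "Xəzər \\ Caspian"}, "name:en"))
```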

Thank you, Evgeny, for taking action about the low quality of OSM data in Azerbaijan that so shocked me two years ago. With regard to the example I am quoting here, some caution is necessary, because the last element “Binə” means ‘building’ in Azeri. So it should not be moved to a name:xx= tag, but simply deleted. Who knows how many other such cases user Menkaura created in Azerbaijan.

1 Like

“binƏ” is the name of a village in Azerbaijan…

the building in Azerbaijani is “binA”…

That’s why I’m against the experiments they’ve started here… and this desire for confessions is already developing into outright bias towards me and my country… Inadequate advice like yours will do a disservice to the map of Azerbaijan… Dear sir, you simply shouldn’t talk about things you don’t fully understand… I repeat once again for those who still don’t understand - total editing without knowledge of regional and linguistic specifics will lead to undesirable(!) consequences..

And here’s one of many examples where the name of a town is written on an object, in this case a store… “Qax” is the name of the city… I’m standing in front of this store right now… Friends, please don’t do something you’ll regret later…

2 Likes