Stage 1 complete.
On Tuesday 26 May 2026, I’ll confirm stage 2 and provide a sample of the changes, as with stage 1, and if there are no concerns by 24 hours later (Wednesday 27 May 2026), I’ll implement the changes.
For the record, I have observed that many of the changes follow the exact same history pattern and it involves the bus import scripts. Here’s a representative example.
n5210704770 is a bus stop which had name:en=HaGay/Sderot Yitzhak Rabin. This had probably become a stale name sometime after 2020 because I stopped operating the old GTFS script and the upstream source updated the name.
en1, setting name:en1=HaGay/Yits'hak Rabin Blvd in c135784701.
name:en in c156526115. Now en and en1 are identical.
name:en1 as expected from stage 1 in c182941303.
This all seems fine, and I don’t know if there’s any action to take here, but I thought it might interest @NeatNit.
We should make sure that automated workflows do not undo this work.
enX/heX/arX?
imageX?
enX/heX/arX?
Nope. Only en, he and ar.
ImageX and websiteX.
Hi Harel,
Unrelated to this proposal, but I had a look through the imageX links and they seem to be mostly Wikipedia images. Have you considered using the wikimedia_commons key (see Key:wikimedia_commons on the OpenStreetMap Wiki)?
I did, but then it requires an extra call to wikimedia to get the actual image link in order to present it, which I found awkward, and when there’s more than one image it will require wikimedia_commons2 etc, which is even weirder. There’s a possibility to create a category for each osm element that has more than one image, but that seems like a complete overkill. So I decided to simply use imageX.
This is the Stage 2 follow-up to the numbered name tag cleanup.
I plan to begin Stage 2 no sooner than 24 hours after this post, unless there are objections or specific cases that should be excluded.
The working bbox is:
(29.3,34.2,33.5,35.9)
The source query was refreshed on 2026-05-26 and matches:
^.+:[A-Za-z]{2,3}[0-9]+$
^(name|alt_name|old_name|official_name|short_name|loc_name|nat_name|reg_name|int_name|sorting_name)[0-9]+$
This includes numbered language-suffix keys such as name:en1, name:he1, name:ar1, old_name:en1, and alt_name:en1, plus numbered standard name-family keys such as name1, alt_name1, and old_name1.
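For anyone who wants to check a key against the two patterns above, here is a minimal Python sketch. The regular expressions are copied from the post unchanged; the function name `is_numbered_name_key` and the sample keys are my own illustration, not part of the actual query tooling.

```python
import re

# The two patterns quoted in the post, compiled unchanged.
LANG_SUFFIX_RE = re.compile(r"^.+:[A-Za-z]{2,3}[0-9]+$")
NAME_FAMILY_RE = re.compile(
    r"^(name|alt_name|old_name|official_name|short_name|loc_name|"
    r"nat_name|reg_name|int_name|sorting_name)[0-9]+$"
)


def is_numbered_name_key(key: str) -> bool:
    """Return True if the key matches either pattern (hypothetical helper)."""
    return bool(LANG_SUFFIX_RE.match(key) or NAME_FAMILY_RE.match(key))


# Sample keys chosen for illustration only.
for key in ["name:en1", "old_name:en1", "name1", "name:en", "alt_name", "image2"]:
    print(key, is_numbered_name_key(key))
```

Note that the first pattern on its own does not require a name key before the colon; the post does not say whether the real query filters further.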
Stage 2 preserves the numbered values in standard target keys, then removes the numbered keys.
| Existing numbered key | Target key |
|---|---|
| name:<lang>N | alt_name:<lang> |
| nameN | alt_name |
| alt_name:<lang>N | alt_name:<lang> |
| alt_nameN | alt_name |
| old_name:<lang>N | old_name:<lang> |
| old_nameN | old_name |
Multiple values are joined with semicolons, preserving existing target-key values first and then adding numbered values in numeric order.
Example:
name:en=Main name
name:en1=First variant
name:en2=Second variant
becomes:
name:en=Main name
alt_name:en=First variant;Second variant
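The rule can be sketched in Python roughly as follows. This is not the script used for the upload, only an illustration under stated assumptions: it covers only the name, alt_name and old_name families from the mapping table, it skips a value that is already present in the target key, and when several numbered keys feed the same target it orders them by number and then by key name. The names `migrate` and `target_key` are hypothetical.

```python
import re

# Assumption: only the three key families from the mapping table are handled.
NUMBERED_RE = re.compile(
    r"^(?P<base>name|alt_name|old_name)(?P<lang>:[A-Za-z]{2,3})?(?P<n>[0-9]+)$"
)


def target_key(base, lang):
    """name* goes to alt_name*; alt_name* and old_name* keep their family."""
    family = "alt_name" if base == "name" else base
    return family + (lang or "")


def migrate(tags):
    """Return a copy of tags with numbered keys folded into target keys."""
    result = dict(tags)
    numbered = []
    for key, value in tags.items():
        m = NUMBERED_RE.match(key)
        if m:
            numbered.append((target_key(m["base"], m["lang"]), int(m["n"]), key, value))
    # Numeric order within each target key (tie-break by key name is an assumption).
    numbered.sort(key=lambda item: (item[0], item[1], item[2]))
    for target, _n, key, value in numbered:
        # Existing target values come first, then numbered values.
        values = [v for v in result.get(target, "").split(";") if v]
        if value not in values:  # assumption: duplicates are not added twice
            values.append(value)
        result[target] = ";".join(values)
        del result[key]
    return result


print(migrate({
    "name:en": "Main name",
    "name:en1": "First variant",
    "name:en2": "Second variant",
}))
```

Sorting on the integer suffix rather than the key string keeps name:en10 after name:en2.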
| Object | Current numbered key/value | Target result |
|---|---|---|
| node/93920263 | name:en1=Horashim | add alt_name:en=Horashim, remove name:en1 |
| node/278471492 | name:en1=Qasra plus another English numbered value | add alt_name:en=Qasra;Qasr Umm Quseir, remove numbered English keys |
| node/26608859 | name:he1=שדה התעופה ירושלים | add alt_name:he=שדה התעופה ירושלים, remove name:he1 |
| node/392637143 | name1=Isrotel: Laguna Eilat | add alt_name=Isrotel: Laguna Eilat, remove name1 |
| node/1252332570 | alt_name1=עין אורה with existing alt_name=עין אל-בלד | update alt_name=עין אל-בלד;עין אורה, remove alt_name1 |
| way/34071590 | old_name1=אפריד"ר with existing old_name=אפריקה דרומית | update old_name=אפריקה דרומית;אפריד"ר, remove old_name1 |
| way/34693108 | old_name:en1=Ha'Shikma with existing old_name:en=HaShikma | update old_name:en=HaShikma;Ha'Shikma, remove old_name:en1 |
| Metric | Count |
|---|---|
| Objects checked in current bbox snapshot | 45,821 |
| Objects changed automatically by Stage 2 | 44,110 |
| Objects held for manual review | 1,711 |
| Numbered tags removed automatically | 82,652 |
| Numbered tags held for manual review | 3,238 |
| Distinct values added to target keys | 81,648 |
| Target tags created | 48,980 |
| Target tags updated | 301 |
| Proposed target values over 255 characters | 0 |
| Longest proposed target value | 127 characters |
Top automatic target keys:
| Target key | Numbered tags |
|---|---|
| alt_name:en | 74,922 |
| alt_name:he | 7,098 |
| alt_name:ar | 236 |
| alt_name | 148 |
| alt_name:ru | 110 |
| old_name:en | 46 |
| mtb:name:en | 41 |
| old_name | 23 |
| old_name:he | 18 |
| alt_name:cs | 7 |
| old_name:ar | 2 |
| mtb:name:he | 1 |
Most common original keys in the automatic cleanup:
| Original key | Numbered tags |
|---|---|
| name:en1 | 41,216 |
| name:en2 | 18,593 |
| name:en3 | 8,752 |
| name:he1 | 5,712 |
| name:en4 | 4,320 |
| name:en5 | 1,484 |
| name:he2 | 844 |
| name:en6 | 542 |
| name:he3 | 522 |
| name:ar1 | 186 |
| name1 | 121 |
| name:ru1 | 108 |
The automatic upload excludes objects with:
name:<lang> for name:<lang>N keys;
old_name:<lang>, official_name:<lang>, or short_name:<lang>;
Manual review categories in the current snapshot:
| Review reason | Objects |
|---|---|
| gns_source | 1,086 |
| public_transport_or_gtfs | 306 |
| missing_name:en | 210 |
| semantic_en_name_tag_present | 60 |
| missing_name:he | 30 |
| semantic_ar_name_tag_present | 13 |
| missing_name:ar | 11 |
| semantic_he_name_tag_present | 8 |
| semantic_ru_name_tag_present | 5 |
| non_latin_english_numbered_value | 4 |
| missing_name | 4 |
| missing_name:ru | 2 |
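To illustrate how three of these review reasons could be derived from an object's tags, here is a rough Python sketch. It is a guess at the logic, not the actual classifier: it looks only at numbered name and name:<lang> keys, it treats old_name, official_name and short_name as the semantic families, and it calls a value non-Latin if any letter's Unicode name does not start with LATIN. The gns_source and public_transport_or_gtfs reasons are left out because the post does not say which tags trigger them. The function names are hypothetical.

```python
import re
import unicodedata

# Matches name1, name:en2, ... (assumption: only the plain name family is checked).
NUMBERED_NAME_RE = re.compile(r"^name(?::(?P<lang>[A-Za-z]{2,3}))?[0-9]+$")
SEMANTIC_FAMILIES = ("old_name", "official_name", "short_name")


def has_non_latin_letter(value):
    """True if any alphabetic character is outside the Latin script."""
    return any(
        ch.isalpha() and not unicodedata.name(ch, "").startswith("LATIN")
        for ch in value
    )


def review_reasons(tags):
    """Return a sorted list of review reasons; empty means no hold was found."""
    reasons = set()
    for key, value in tags.items():
        m = NUMBERED_NAME_RE.match(key)
        if not m:
            continue
        lang = m["lang"]
        base = f"name:{lang}" if lang else "name"
        if base not in tags:
            reasons.add(f"missing_{base}")
        if lang and any(f"{fam}:{lang}" in tags for fam in SEMANTIC_FAMILIES):
            reasons.add(f"semantic_{lang}_name_tag_present")
        if lang == "en" and has_non_latin_letter(value):
            reasons.add("non_latin_english_numbered_value")
    return sorted(reasons)


print(review_reasons({"name:en1": "Example"}))
print(review_reasons({"name:en": "A", "name:en1": "B", "old_name:en": "C"}))
```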
The same fresh snapshot was checked for Stage 1-style duplicate numbered keys that may have appeared since Stage 1.
name:en2=Tel Qidda duplicates name:en=Tel Qidda.
I refreshed that carry-forward object from the OSM API and it is still current. This can be handled as a tiny conservative Stage 1.1 duplicate removal before Stage 2, or included as a separate non-overlapping chunk in the same cleanup sequence.
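A check for this kind of duplicate might look like the Python sketch below. It assumes that "duplicate" means the numbered key's value is exactly equal to the value of its base key, which matches the Tel Qidda case but may be narrower or wider than the actual Stage 1 rule. The function name `duplicate_numbered_keys` is hypothetical.

```python
import re

# Assumption: only name, alt_name and old_name keys, with an optional language suffix.
NUMBERED_RE = re.compile(
    r"^(?P<base>(?:name|alt_name|old_name)(?::[A-Za-z]{2,3})?)(?P<n>[0-9]+)$"
)


def duplicate_numbered_keys(tags):
    """Return numbered keys whose value equals the value of their base key."""
    dupes = []
    for key, value in tags.items():
        m = NUMBERED_RE.match(key)
        if m and tags.get(m["base"]) == value:
            dupes.append(key)
    return sorted(dupes)


print(duplicate_numbered_keys({
    "name:en": "Tel Qidda",
    "name:en1": "Other",  # illustrative value, not from the real object
    "name:en2": "Tel Qidda",
}))
```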
I have generated:
.osc for only the automatic cases;
Before upload I will refresh the exact changed objects from the OSM API, regenerate the .osc files using current object versions, and split the upload into bbox-safe chunks. The oversized large-bbox objects will stay out of the automatic upload unless handled separately.
Proposed changeset comment:
Migrate numbered name tags to standard semicolon-separated target keys
Please reply within 24 hours if you see a problem with the Stage 2 rules, the scope, or any of the sampled examples.
The automatic Stage 2 cleanup has now been uploaded and verified. It moved the clear automatic cases from numbered keys such as name:en1, name:en2, name:he1, etc. into standard target keys such as alt_name:en / alt_name:he, then removed the numbered keys.
I re-ran the query after upload. There are now 0 remaining automatic Stage 2 cases and 0 remaining Stage 1 duplicate-removal cases.
The remaining numbered name tags are the cases that were intentionally held back for manual review:
| Item | Count |
|---|---|
| Remaining numbered-name tags | 3,237 |
| Objects with remaining numbered-name tags | 1,711 |
| CSV review groups | 1,823 |
I am attaching a CSV for review. It is grouped by object and proposed target key. Each row includes:
This is not an automatic edit proposal yet. I am asking for review of the remaining cases and the proposed handling categories.
gns_source
The object appears to have names from GNS / GEOnet.
These were held back because GNS variants may be genuine alternate names, historical names, exonyms, old import artifacts, or names that are not locally used. I do not think these should be moved automatically without local/source review.
Possible outcomes may include:
alt_name:*;
old_name:* if that is known to be correct;
Feedback requested: how should GNS/GEOnet variants be treated here?
public_transport_or_gtfs
The object has public-transport or GTFS-related context.
These were held back because names on transit objects can represent stop names, route names, feed/import values, platform names, or other transit-specific concepts. A generic move to alt_name:* may be wrong.
Feedback requested: should these be reviewed by transit mappers, preserved as alternate names, handled with transit-specific tagging, or left alone?
missing_name:* / missing_name
The corresponding base name was missing. For example, name:en1 exists but name:en is missing, or name1 exists but name is missing.
These were held back because without the base name it is unclear whether the numbered value is an alternate name, the primary name that should become name:*, or stale source data.
Feedback requested: should the base name be established first, and then only true alternates moved to alt_name:*?
semantic_*_name_tag_present
The object already has another semantic name tag in the same language, such as old_name:*, official_name:*, short_name:*, or a similar name key.
These were held back because the numbered value might belong in a more specific key instead of generic alt_name:*.
Feedback requested: for these rows, should the value go to alt_name:*, an existing semantic name key such as old_name:*, or something else?
non_latin_english_numbered_value
An English-numbered key contains a non-Latin-script value.
These were held back because moving that value to alt_name:en would probably be wrong. It may belong in another language-specific name key, or it may be a source/import artifact.
Feedback requested: please identify the intended language/script and whether these values should be preserved in another standard key.
| Proposed action | Groups | Meaning |
|---|---|---|
| needs_source_review | 1,145 | GNS/GEOnet-sourced values need local/source review before retagging. |
| needs_transit_review | 319 | Public transport / GTFS context makes generic retagging unsafe. |
| needs_base_name_review | 266 | The base name is missing, so the primary name should be checked first. |
| choose_semantic_target | 89 | A solution may exist, but the correct target depends on the meaning of the value. |
| no_automatic_solution | 4 | No safe automatic solution was found; community feedback is needed. |
Please review the attached CSV and comment on:
stage2-review-candidates-with-proposed-solutions.csv (1.1 MB)
I scanned some of the missing_name:en cases.
Indeed, there are many simple mistakes. Mostly the main tag was removed when fixing a street whose English name was unknown, or a wrong street name was removed altogether without also removing the secondary tags.
All this requires manual intervention, I suppose.
I fixed what I could, mostly by removing the tags, but there were also places where the right tag could be added.
Can you regenerate the file to remove the cleaned lines?
Here you go.
stage2-review.csv (1.0 MB)