Proposal: review nonstandard name:en1/name:en2 keys related to this community

Stage 1 complete.

On Tuesday 26 May 2026, I’ll confirm stage 2 and provide a sample of the changes, as with stage 1. If no concerns are raised within 24 hours (by Wednesday 27 May 2026), I’ll implement the changes.

For the record, I have observed that many of the changes follow the exact same history pattern and it involves the bus import scripts. Here’s a representative example.

n5210704770 is a bus stop which had name:en=HaGay/Sderot Yitzhak Rabin. This had probably become a stale name sometime after 2020 because I stopped operating the old GTFS script and the upstream source updated the name.

  1. @davidmguest adds the more up-to-date name to en1, setting name:en1=HaGay/Yits'hak Rabin Blvd in c135784701.
  2. The newer gtfs2osm bot by @NeatNit goes online in 2024. It updates name:en in c156526115. Now en and en1 are identical.
  3. @TaggingReviewer removes name:en1 as expected from stage 1 in c182941303.

This all seems fine and I don’t know if there’s any action to take here, but I thought it might interest @NeatNit.

We should make sure that automated workflows do not undo this work.

  • @NeatNit does gtfs2osm ever touch enX/heX/arX?
  • @Harel_M does IsraelHikingMap and/or Mapeak ever touch any numbered suffixes other than imageX?
  • @davidmguest do any of your automated/semi-automated/bulk edits touch enX/heX/arX?

Nope. Only en, he and ar.

ImageX and websiteX.

Hi Harel,

Unrelated to this proposal, but I had a look through the imageX links and they seem to mostly be Wikipedia images. Have you considered using the wikimedia_commons key (see Key:wikimedia_commons on the OpenStreetMap Wiki)?

I did, but it requires an extra call to Wikimedia to get the actual image link before the image can be shown, which I found awkward. When there is more than one image it would also require wikimedia_commons2 and so on, which is even weirder. It would be possible to create a category for each OSM element that has more than one image, but that seems like complete overkill. So I decided to simply use imageX.

Stage 2 notice: numbered name tag cleanup will begin in 24 hours

This is the Stage 2 follow-up to the numbered name tag cleanup.

I plan to begin Stage 2 no sooner than 24 hours after this post, unless there are objections or specific cases that should be excluded.

Scope

The working bbox is:

(29.3,34.2,33.5,35.9)

The source query was refreshed on 2026-05-26 and matches:

^.+:[A-Za-z]{2,3}[0-9]+$
^(name|alt_name|old_name|official_name|short_name|loc_name|nat_name|reg_name|int_name|sorting_name)[0-9]+$
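
As a rough illustration of what these two patterns select, here is a minimal Python sketch. The patterns are copied from above; the function name `is_numbered_name_key` and the sample key list are made up for the example and are not part of the actual tooling.

```python
import re

# The two patterns quoted above, compiled unchanged.
LANG_SUFFIX_NUMBERED = re.compile(r"^.+:[A-Za-z]{2,3}[0-9]+$")
NAME_FAMILY_NUMBERED = re.compile(
    r"^(name|alt_name|old_name|official_name|short_name|loc_name|"
    r"nat_name|reg_name|int_name|sorting_name)[0-9]+$"
)


def is_numbered_name_key(key: str) -> bool:
    """Return True if the key matches either pattern (hypothetical helper)."""
    return bool(
        LANG_SUFFIX_NUMBERED.match(key) or NAME_FAMILY_NUMBERED.match(key)
    )


# Illustrative keys only; imageX and websiteX have no language suffix
# and are not in the name family, so neither pattern selects them.
sample_keys = [
    "name:en1", "name:he1", "old_name:en1", "name1", "alt_name1",
    "name:en", "name", "image1", "website2",
]
matched = [k for k in sample_keys if is_numbered_name_key(k)]
print(matched)
```

Note that the first pattern accepts any prefix before the language suffix, so a key such as mtb:name:en1 would also match.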

This includes numbered language-suffix keys such as name:en1, name:he1, name:ar1, old_name:en1, and alt_name:en1, plus numbered standard name-family keys such as name1, alt_name1, and old_name1.

What will change

Stage 2 preserves the numbered values in standard target keys, then removes the numbered keys.

| Existing numbered key | Target key |
| --- | --- |
| name:<lang>N | alt_name:<lang> |
| nameN | alt_name |
| alt_name:<lang>N | alt_name:<lang> |
| alt_nameN | alt_name |
| old_name:<lang>N | old_name:<lang> |
| old_nameN | old_name |

Multiple values are joined with semicolons, preserving existing target-key values first and then adding numbered values in numeric order.

Example:

name:en=Main name
name:en1=First variant
name:en2=Second variant

becomes:

name:en=Main name
alt_name:en=First variant;Second variant
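
The merge rule can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the script used for the upload: the names `split_numbered`, `target_for` and `migrate` are hypothetical, values already present in the target key are not added twice (an assumption), and numbered keys outside the mapping table are simply kept under their own un-numbered key (also an assumption).

```python
import re

# name:<lang>N style keys, and the plain numbered families from the table.
LANG_NUMBERED = re.compile(r"^(?P<base>.+:[A-Za-z]{2,3})(?P<n>[0-9]+)$")
FAMILY_NUMBERED = re.compile(r"^(?P<base>name|alt_name|old_name)(?P<n>[0-9]+)$")


def split_numbered(key):
    """Return (base_key, number) for a numbered key, else None."""
    m = LANG_NUMBERED.match(key) or FAMILY_NUMBERED.match(key)
    return (m.group("base"), int(m.group("n"))) if m else None


def target_for(base):
    """name / name:<lang> go to alt_name / alt_name:<lang>; others keep their key."""
    if base == "name" or base.startswith("name:"):
        return "alt_" + base
    return base


def migrate(tags):
    """Return a new tag dict with numbered values merged into target keys."""
    result = dict(tags)
    numbered = {}
    for key, value in tags.items():
        parts = split_numbered(key)
        if parts is None:
            continue
        base, n = parts
        numbered.setdefault(target_for(base), []).append((n, value))
        del result[key]  # the numbered key itself is removed
    for target, items in numbered.items():
        # Existing target values come first, then numbered ones in numeric order.
        values = [v for v in result.get(target, "").split(";") if v]
        for _, value in sorted(items):
            if value not in values:
                values.append(value)
        result[target] = ";".join(values)
    return result


before = {
    "name:en": "Main name",
    "name:en1": "First variant",
    "name:en2": "Second variant",
}
print(migrate(before))
```

Sorting on the integer suffix rather than the key string keeps name:en10 after name:en2.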

Clean-up target examples

| Object | Current numbered key/value | Target result |
| --- | --- | --- |
| node/93920263 | name:en1=Horashim | add alt_name:en=Horashim, remove name:en1 |
| node/278471492 | name:en1=Qasra plus another English numbered value | add alt_name:en=Qasra;Qasr Umm Quseir, remove numbered English keys |
| node/26608859 | name:he1=שדה התעופה ירושלים | add alt_name:he=שדה התעופה ירושלים, remove name:he1 |
| node/392637143 | name1=Isrotel: Laguna Eilat | add alt_name=Isrotel: Laguna Eilat, remove name1 |
| node/1252332570 | alt_name1=עין אורה with existing alt_name=עין אל-בלד | update alt_name=עין אל-בלד;עין אורה, remove alt_name1 |
| way/34071590 | old_name1=אפריד"ר with existing old_name=אפריקה דרומית | update old_name=אפריקה דרומית;אפריד"ר, remove old_name1 |
| way/34693108 | old_name:en1=Ha'Shikma with existing old_name:en=HaShikma | update old_name:en=HaShikma;Ha'Shikma, remove old_name:en1 |

Current numbers

| Metric | Count |
| --- | --- |
| Objects checked in current bbox snapshot | 45,821 |
| Objects changed automatically by Stage 2 | 44,110 |
| Objects held for manual review | 1,711 |
| Numbered tags removed automatically | 82,652 |
| Numbered tags held for manual review | 3,238 |
| Distinct values added to target keys | 81,648 |
| Target tags created | 48,980 |
| Target tags updated | 301 |
| Proposed target values over 255 characters | 0 |
| Longest proposed target value | 127 characters |

Top automatic target keys:

| Target key | Numbered tags |
| --- | --- |
| alt_name:en | 74,922 |
| alt_name:he | 7,098 |
| alt_name:ar | 236 |
| alt_name | 148 |
| alt_name:ru | 110 |
| old_name:en | 46 |
| mtb:name:en | 41 |
| old_name | 23 |
| old_name:he | 18 |
| alt_name:cs | 7 |
| old_name:ar | 2 |
| mtb:name:he | 1 |

Most common original keys in the automatic cleanup:

| Original key | Numbered tags |
| --- | --- |
| name:en1 | 41,216 |
| name:en2 | 18,593 |
| name:en3 | 8,752 |
| name:he1 | 5,712 |
| name:en4 | 4,320 |
| name:en5 | 1,484 |
| name:he2 | 844 |
| name:en6 | 542 |
| name:he3 | 522 |
| name:ar1 | 186 |
| name1 | 121 |
| name:ru1 | 108 |

What will be held for manual review

The automatic upload excludes objects with:

  • missing name:<lang> for name:<lang>N keys;
  • existing semantic name tags in that language, such as old_name:<lang>, official_name:<lang>, or short_name:<lang>;
  • GNS / GEOnet source tags;
  • public transport or GTFS-related tags;
  • non-Latin values in English numbered keys;
  • any proposed target value over 255 characters.
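
To make the exclusion rules more concrete, here is a hedged Python sketch of how an object could be classified. It is not the actual review logic. In particular, the post does not say how GNS or transit context is detected, so `looks_like_gns` and `looks_like_transit` are hypothetical heuristics, `hold_reasons` is a made-up name, and the `target_value_too_long` label is invented for the example. Only name:<lang>N keys are checked, for brevity.

```python
import re
import unicodedata

NAME_LANG_NUMBERED = re.compile(r"^name:(?P<lang>[A-Za-z]{2,3})[0-9]+$")
SEMANTIC_FAMILIES = ("old_name", "official_name", "short_name")
MAX_VALUE_LENGTH = 255  # the length threshold named in the list above


def has_non_latin_letter(value):
    """True if any alphabetic character is not a Latin-script letter."""
    return any(
        ch.isalpha() and "LATIN" not in unicodedata.name(ch, "")
        for ch in value
    )


def looks_like_gns(tags):
    # Assumption: GNS data is marked by a gns* key or a source mention.
    return any(k.lower().startswith("gns") for k in tags) or bool(
        re.search(r"\b(gns|geonet)\b", tags.get("source", ""), re.IGNORECASE)
    )


def looks_like_transit(tags):
    # Assumption: these tags are treated as transit or GTFS context.
    return (
        "public_transport" in tags
        or tags.get("highway") == "bus_stop"
        or any(k.lower().startswith("gtfs") for k in tags)
    )


def hold_reasons(tags, proposed_targets):
    """Return the set of reasons to hold an object for manual review."""
    reasons = set()
    for key, value in tags.items():
        m = NAME_LANG_NUMBERED.match(key)
        if m is None:
            continue
        lang = m.group("lang")
        if f"name:{lang}" not in tags:
            reasons.add(f"missing_name:{lang}")
        if any(f"{family}:{lang}" in tags for family in SEMANTIC_FAMILIES):
            reasons.add(f"semantic_{lang}_name_tag_present")
        if lang == "en" and has_non_latin_letter(value):
            reasons.add("non_latin_english_numbered_value")
    if looks_like_gns(tags):
        reasons.add("gns_source")
    if looks_like_transit(tags):
        reasons.add("public_transport_or_gtfs")
    if any(len(v) > MAX_VALUE_LENGTH for v in proposed_targets.values()):
        reasons.add("target_value_too_long")
    return reasons


print(hold_reasons({"name:en1": "Horashim"}, {"alt_name:en": "Horashim"}))
```

An object with an empty result set would be eligible for the automatic upload; one object can carry several reasons at once.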

Manual review categories in the current snapshot:

| Review reason | Objects |
| --- | --- |
| gns_source | 1,086 |
| public_transport_or_gtfs | 306 |
| missing_name:en | 210 |
| semantic_en_name_tag_present | 60 |
| missing_name:he | 30 |
| semantic_ar_name_tag_present | 13 |
| missing_name:ar | 11 |
| semantic_he_name_tag_present | 8 |
| semantic_ru_name_tag_present | 5 |
| non_latin_english_numbered_value | 4 |
| missing_name | 4 |
| missing_name:ru | 2 |

Stage 1 duplicate check

The same fresh snapshot was checked for Stage 1-style duplicate numbered keys that may have appeared since Stage 1.

  • 1 duplicate numbered tag is on a Stage 2 automatic object and will be removed by Stage 2.
  • 1 duplicate numbered tag is outside the Stage 2 automatic set: node/278477740, where name:en2=Tel Qidda duplicates name:en=Tel Qidda.

I refreshed that carry-forward object from the OSM API and it is still current. This can be handled as a tiny conservative Stage 1.1 duplicate removal before Stage 2, or included as a separate non-overlapping chunk in the same cleanup sequence.
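
For readers who want to reproduce the duplicate check, a minimal Python sketch follows. It assumes that "duplicate" means a numbered key whose value is identical to its un-numbered base key, as in the node/278477740 case; the function name `duplicate_numbered_keys` is made up for the example.

```python
import re

LANG_NUMBERED = re.compile(r"^(?P<base>.+:[A-Za-z]{2,3})[0-9]+$")
FAMILY_NUMBERED = re.compile(r"^(?P<base>name|alt_name|old_name)[0-9]+$")


def duplicate_numbered_keys(tags):
    """Return numbered keys whose value equals the value of their base key."""
    dupes = []
    for key, value in tags.items():
        m = LANG_NUMBERED.match(key) or FAMILY_NUMBERED.match(key)
        if m and tags.get(m.group("base")) == value:
            dupes.append(key)
    return sorted(dupes)


# The carry-forward case described above.
tags = {"name:en": "Tel Qidda", "name:en2": "Tel Qidda"}
print(duplicate_numbered_keys(tags))
```

The comparison is exact, so values that differ only in case or whitespace are not reported.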

Upload and rollback preparation

I have generated:

  • a Stage 2 .osc for only the automatic cases;
  • a CSV listing every planned automatic change;
  • a CSV listing every manual-review case;
  • Stage 1 carry-forward artifacts for duplicate removals outside the Stage 2 automatic set;
  • rollback reference files.

Before upload I will refresh the exact changed objects from the OSM API, regenerate the .osc files using current object versions, and split the upload into bbox-safe chunks. Objects with oversized bounding boxes will stay out of the automatic upload unless handled separately.
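
One possible way to split an upload into geographically compact chunks is sketched below in Python. This is an assumption about what "bbox-safe" could look like, not a description of the actual upload script: the function name `chunk_by_grid`, the 0.5 degree cell size and the per-chunk limit are all hypothetical, and each object is represented by a single representative coordinate.

```python
from collections import defaultdict
from math import floor


def chunk_by_grid(objects, cell_deg=0.5, max_per_chunk=5000):
    """Group (object_id, lat, lon) tuples by grid cell, then cap chunk size.

    Objects in one chunk always share a grid cell, so the bounding box of
    each chunk is at most cell_deg wide and cell_deg tall.
    """
    cells = defaultdict(list)
    for obj_id, lat, lon in objects:
        cells[(floor(lat / cell_deg), floor(lon / cell_deg))].append(obj_id)
    chunks = []
    for cell in sorted(cells):
        ids = cells[cell]
        for start in range(0, len(ids), max_per_chunk):
            chunks.append(ids[start:start + max_per_chunk])
    return chunks


# Illustrative coordinates inside the working bbox, not real objects.
objects = [("n1", 29.31, 34.21), ("n2", 29.32, 34.22), ("n3", 33.4, 35.8)]
print(chunk_by_grid(objects))
```

Ways and relations that span several cells would need separate handling, which matches keeping large-bbox objects out of the automatic upload.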

Proposed changeset comment:

Migrate numbered name tags to standard semicolon-separated target keys

Please reply within 24 hours if you see a problem with the Stage 2 rules, the scope, or any of the sampled examples.

The automatic Stage 2 cleanup has now been uploaded and verified. It moved the clear automatic cases from numbered keys such as name:en1, name:en2, name:he1, etc. into standard target keys such as alt_name:en / alt_name:he, then removed the numbered keys.

I re-ran the query after upload. There are now 0 remaining automatic Stage 2 cases and 0 remaining Stage 1 duplicate-removal cases.

The remaining numbered name tags are the cases that were intentionally held back for manual review:

| Item | Count |
| --- | --- |
| Remaining numbered-name tags | 3,237 |
| Objects with remaining numbered-name tags | 1,711 |
| CSV review groups | 1,823 |

I am attaching a CSV for review. It is grouped by object and proposed target key. Each row includes:

  • the OSM object and link;
  • the numbered key(s) and value(s);
  • the review reason(s);
  • the possible target key if the value should be preserved;
  • a proposed action;
  • a proposed solution or a note that no safe automatic solution was found;
  • a feedback request.

This is not an automatic edit proposal yet. I am asking for review of the remaining cases and the proposed handling categories.

What the review reasons mean

gns_source

The object appears to have names from GNS / GEOnet.

These were held back because GNS variants may be genuine alternate names, historical names, exonyms, old import artifacts, or names that are not locally used. I do not think these should be moved automatically without local/source review.

Possible outcomes may include:

  • preserve genuine alternate names in alt_name:*;
  • move historical names to old_name:* if that is known to be correct;
  • leave or remove source artifacts if they are not useful OSM names;
  • handle specific subsets differently.

Feedback requested: how should GNS/GEOnet variants be treated here?

public_transport_or_gtfs

The object has public-transport or GTFS-related context.

These were held back because names on transit objects can represent stop names, route names, feed/import values, platform names, or other transit-specific concepts. A generic move to alt_name:* may be wrong.

Feedback requested: should these be reviewed by transit mappers, preserved as alternate names, handled with transit-specific tagging, or left alone?

missing_name:* / missing_name

The corresponding base name was missing. For example, name:en1 exists but name:en is missing, or name1 exists but name is missing.

These were held back because without the base name it is unclear whether the numbered value is an alternate name, the primary name that should become name:*, or stale source data.

Feedback requested: should the base name be established first, and then only true alternates moved to alt_name:*?

semantic_*_name_tag_present

The object already has another semantic name tag in the same language, such as old_name:*, official_name:*, short_name:*, or a similar name key.

These were held back because the numbered value might belong in a more specific key instead of generic alt_name:*.

Feedback requested: for these rows, should the value go to alt_name:*, an existing semantic name key such as old_name:*, or something else?

non_latin_english_numbered_value

An English-numbered key contains a non-Latin-script value.

These were held back because moving that value to alt_name:en would probably be wrong. It may belong in another language-specific name key, or it may be a source/import artifact.

Feedback requested: please identify the intended language/script and whether these values should be preserved in another standard key.

Proposed action categories in the CSV

| Proposed action | Groups | Meaning |
| --- | --- | --- |
| needs_source_review | 1,145 | GNS/GEOnet-sourced values need local/source review before retagging. |
| needs_transit_review | 319 | Public transport / GTFS context makes generic retagging unsafe. |
| needs_base_name_review | 266 | The base name is missing, so the primary name should be checked first. |
| choose_semantic_target | 89 | A solution may exist, but the correct target depends on the meaning of the value. |
| no_automatic_solution | 4 | No safe automatic solution was found; community feedback is needed. |

What I am asking for

Please review the attached CSV and comment on:

  1. Whether the proposed action categories make sense.
  2. Whether any category can safely be handled in a later structured cleanup.
  3. Whether any category should be excluded entirely.
  4. Whether specific rows have an obvious correct standard tag.
  5. Whether GNS/GEOnet-sourced variants should generally be preserved, retagged more specifically, or left for case-by-case review.

stage2-review-candidates-with-proposed-solutions.csv (1.1 MB)

I scanned the missing_name:en cases.

Indeed, there are many simple mistakes, mostly cases where the main tag was removed while fixing a street whose English name is unknown, or where a wrong street name was removed altogether without removing the secondary tags.

All of this requires manual intervention, I suppose.

I fixed what I could, mostly by removing the tags, but there were also places where the right tag could be added.

Can you regenerate the file to remove the cleaned lines?

Here you go.

stage2-review.csv (1.0 MB)