Automated edits for name tags

I halted my changes for now, but is IHM down? And is it related?

https://www.facebook.com/groups/994960670559126/permalink/1364330870288769/

Would this algorithm addition be acceptable?

  • If name and name:he mismatch (and they’re both Hebrew), see which one was the most recently updated, and update the other one accordingly.
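A minimal sketch of what that rule could look like, assuming the element history is available as a list of versions ordered oldest to newest, each with a `version` number and a `tags` dict. That data shape, the helper names, and the crude "contains a Hebrew letter" check are assumptions for illustration, not the bot's actual code.

```python
import re

# Assumption: "is Hebrew" means "contains at least one basic Hebrew letter".
HEBREW_LETTER = re.compile(r"[\u05D0-\u05EA]")


def is_hebrew(text):
    return bool(HEBREW_LETTER.search(text))


def last_changed(history, key):
    """Return the version number in which `key` last changed value."""
    changed_at = history[0]["version"]
    for prev, cur in zip(history, history[1:]):
        if prev["tags"].get(key) != cur["tags"].get(key):
            changed_at = cur["version"]
    return changed_at


def propose_sync(history):
    """Return the tag update to apply, or None if nothing safe can be proposed."""
    tags = history[-1]["tags"]
    name, name_he = tags.get("name"), tags.get("name:he")
    if not name or not name_he or name == name_he:
        return None  # nothing to synchronize
    if not (is_hebrew(name) and is_hebrew(name_he)):
        return None  # rule only applies when both values are Hebrew
    v_name = last_changed(history, "name")
    v_he = last_changed(history, "name:he")
    if v_name == v_he:
        return None  # both changed in the same version: ambiguous, leave to a human
    if v_name > v_he:
        return {"name:he": name}
    return {"name": name_he}
```

Returning a proposed change instead of applying it leaves room for a manual review step before anything is uploaded.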

Strongly against it. I found many streets where the last edit was problematic. When you apply an automatic edit on top of a problematic one, it becomes very hard to spot the error and harder to find its source.

I accidentally pushed an experiment. Will revert in a second.

Reverted. Further explanation: The autofix code has been ready for a while, but didn’t work due to a bug in the scripting plugin. Today, that bug was fixed, and I went ahead and tested the code, but I also accidentally uploaded.

Now that it works, we need to decide if it’s desired.

I believe the benefits are greater than the drawbacks, but I am willing to stand corrected. Here is why:

  • Most people make good edits (I hope!). If so, the last edit should usually be correct. We need to check the diff to confirm this: #50233725 (which is now reverted)

  • Name/name:lang mismatches are very hard to detect manually because sometimes it’s the good name that renders or shows up in editor descriptions. Because of that, these mistakes sometimes survive for months or even years. Example here. I argue it’s often easier to see the bad edit when it appears in both tags. Quite often, a name mismatch can only be detected by someone actively looking for mismatches.

  • When I find a mismatch, I often trace the history to figure out which name is newer. That’s mechanical and boring, and the bot can do it for me, faster.

  • I want the bot to “simulate” a single name tag. When you edit one tag, you are forced to edit the other whether it’s a good or a bad edit. I believe this makes life easier.

We could have a middle ground, where we manually review the autofixes (I can post them here whenever they’re made). It’s still much faster than manual fixes, and will also catch the bad edits.

How about a fixme=“Name fixed by bot. Please review.”?

I don’t think the “fixme” tagging is working. There are more than 30,000 nodes and 1,700 ways with a “fixme” tag in Israel, not to mention “fixme” in “note” tags…

Having a table with columns for “name”, “name:he”, “name1”, and “name:he1” would be more efficient.

See this post for how to update OSM tags using a CSV file.

I’d say fixmes pending review indefinitely are better than name mismatches pending review indefinitely.

By the way, almost all node fixmes are from the GTFS bus stop import. It may be wise to bulk-remove those, to shine the light on the more important fixmes.

Edit: Sorry, I was confusing this with something else and this comment is wrong regarding overpass. Please ignore the previous version of this comment.

I’ll not apply auto-fixes for now. I’ll look into manually updating them (via the csv method or some other way).

Nope, I still don’t like this. It’s wasted manpower. Even if I manually fix them all, the mismatches will accumulate over time and the manual fix must be repeated periodically. Most mismatch “errors” will in fact be legitimate edits where someone changed one name tag and forgot the other. It’s mechanical drudgery.

Wouldn’t it be easier to just treat the two tags as a single tag and have the bot auto-synchronize them? We don’t even need a fixme for this. If a user mis-edits a tag, it’s not the synchronization’s fault. Mis-edits are normal in OSM, and name tag mis-edits should be handled like any other mis-edit, through monitoring tools and such.

I can understand the need for a fixme tag for the first edit only (because some autofixes will be grabbed from deeper history), but no need for a fixme when this is synchronized periodically. By the way, it’ll only add an additional 372 fixmes to the 30k already present.

For the record, http://overpass-turbo.eu/s/qmy is a query that finds elements in Israel that have a Hebrew “name” tag and a “name:he” tag that are different.

It has an option to output a CSV file by un-commenting the CSV output definition.

The current element count is: nodes: 62, ways: 430, relations: 11
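For anyone who prefers to post-process the results locally, here is a rough sketch of the same filter and CSV export. It assumes the elements are already loaded as dicts with `type`, `id`, and `tags` keys (loosely modeled on Overpass JSON output). The function names and the Hebrew-letter check are illustrative assumptions, not what the linked query does internally.

```python
import csv
import io
import re
from collections import Counter

# Assumption: a "Hebrew name" is one containing at least one basic Hebrew letter.
HEBREW_LETTER = re.compile(r"[\u05D0-\u05EA]")


def find_mismatches(elements):
    """Keep elements whose Hebrew "name" differs from their "name:he"."""
    result = []
    for el in elements:
        tags = el.get("tags", {})
        name, name_he = tags.get("name"), tags.get("name:he")
        if name and name_he and name != name_he and HEBREW_LETTER.search(name):
            result.append(el)
    return result


def count_by_type(elements):
    """Count elements per OSM type (node, way, relation)."""
    return Counter(el["type"] for el in elements)


def to_csv(elements):
    """Render the mismatches as CSV text with one row per element."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["type", "id", "name", "name:he"])
    for el in elements:
        tags = el["tags"]
        writer.writerow([el["type"], el["id"], tags["name"], tags["name:he"]])
    return buf.getvalue()
```

A table like this could also serve as the review sheet for manual fixes.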

There are 586 in total if we count Arabic, Hebrew, and English, 372 of which are auto-fixable by swiftfast_bot.

Does everyone agree that we should always have Hebrew “name” tags duplicated at “name:he”? I asked prior to running the scripts and I think everyone agreed, but I cannot find the post anymore. (Rationale: The language of the “name” tag varies in Israel. But name:he guarantees Hebrew).

I agree that

The following cases should not be handled automatically:

  • name tags with foreign language characters
  • name tags that are different than the name:he tags
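As a rough sketch, the two exclusions above could be checked like this. It assumes "foreign language characters" means any letter outside the basic Hebrew alphabet (digits and punctuation are allowed). The function names are hypothetical, and this is not the bot's current rule set.

```python
def has_hebrew_letters(text):
    """True if the text contains at least one basic Hebrew letter."""
    return any("\u05D0" <= ch <= "\u05EA" for ch in text)


def has_foreign_letters(text):
    """True if the text contains a letter that is not a basic Hebrew letter."""
    return any(ch.isalpha() and not ("\u05D0" <= ch <= "\u05EA") for ch in text)


def may_autocopy(tags):
    """Decide whether "name" may be copied to "name:he" without human review."""
    name = tags.get("name", "")
    if not has_hebrew_letters(name) or has_foreign_letters(name):
        return False  # mixed-script or non-Hebrew names are left to a human
    if "name:he" in tags:
        return False  # either already in sync, or a mismatch for a human to resolve
    return True
```

Under these assumptions, both “KSP מחשבים” and “Herzl הרצל” are skipped, while a purely Hebrew name with no name:he is copied.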

The current rules are similar, except for line 2. I think they are as follows.

This allows copying things like “KSP מחשבים”. Do you think that’s a bad idea?

I should publish the source code soon. (I wanted to fully automate it first, but that’s not going to happen soon).

It’s not good enough because it has no notion of the English content and would also copy “Herzl הרצל”, which uses a naming scheme we cleaned up in Jerusalem a while ago.

Why is that a bad thing? It wouldn’t introduce a new error, it would just keep an already existing error unfixed. This is similar to Sanniu’s opposition to the autofixes.

I think everything would be much simpler if we treat the bot as a convenience copy-machine. It “binds” name and name:he. If the bot copies a faulty tag, it’s not the bot’s fault, and it doesn’t really make things worse. A human needed to fix that anyways.

In that scenario, there was no error in the name:he tag, and the bot created one.
On the other hand, the human work needed for a fix is doubled by a bot.
As a person who spends significant time manually fixing errors, I would like the bots to “do no evil”, rather than spread it.

The bot does not introduce additional fixing effort. In the case of “Herzl הרצל”, you had to fix both “name” and “name:he” regardless of the bot’s work. There were two errors (missing name:he, wrong name), and there remained two errors.

(In fact, I argue the bot makes this slightly easier, because you don’t have to type in “name:he” in JOSM and you just edit the value)

On the other hand the bot alleviates effort by fixing many cases (like “ksp מחשבים”).