Bot proposal: automatically fix several tags

There are some tag values that are clear typos, duplicates or synonyms of other values.

I want to propose an extension of an already running bot edit, to clean up some obvious cases.

See Mechanical Edits/Mateusz Konieczny - bot account/fix many obvious typos - OpenStreetMap Wiki for the list of proposed changes.

Cleaning them up, for clear cases, makes map data more usable and less confusing for newbies and software alike, at the quite low cost of editing objects.

Yes, the bot edit WILL cause objects to be edited. Nevertheless, map data quality will improve as a result.

Please comment if any of the proposed replacements are dubious and should not be made by an automated edit.

In such a case, please let me know which values are problematic and why.

If someone wants to review but needs more than 2 weeks - please write and I can wait for longer.

Please also comment (or +1) if you checked the values proposed to be edited and you agree with the edit!

This bot edit would be rerun from time to time, from the bot account (Changesets by Mateusz Konieczny - bot account | OpenStreetMap).

I have quite decent experience with bot edits, see Mechanical Edits/Mateusz Konieczny - bot account - OpenStreetMap Wiki

If anyone wants, I can help them find affected objects, present a listing of edits which added these tags, or list people who added these values to currently tagged OSM objects.

I tried to use them as detectors of bogus data, but neither was really useful for this purpose.

We have many better ways to find OSM data requiring human review.

If anyone is looking for more cases where human review is needed - I would be glad to list them

(let me know if you are interested in specific area or specific type of issues - maybe only shop-related? maybe only ones that require survey? maybe only ones fixable remotely?)

But there is no point in manual drudgery here, with values clearly replaceable by better matches.

Also, I have a massive queue (in the thousands and tens of thousands) of automatically detectable issues which are not reported by mainstream validators and which require fixes, where the fix requires review or complete manual cleanup. There, such manual cleanup adds value, unlike for the entries from the tags listed above.

This edit is documented at Mechanical Edits/Mateusz Konieczny - bot account/fix many obvious typos - OpenStreetMap Wiki

There are also many low-use values with one, two or three extra bogus characters, for example:

tomb = tomb22 → tomb = tomb

Would it also be OK to migrate them without listing them for review here, and just add them to the replace list later?

And other similar obvious typos appearing or found in the future?

Only low-use obvious mistakes would be changed without submitting them for review.

If anyone at all protests, I will not do this and will instead post them for review, like the list here, once sufficiently many values are found.
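To make the "extra bogus characters" case concrete, here is a rough sketch of how such candidates could be detected. It is only an illustration and not the bot's actual code: the function name `suggest_trailing_garbage_fix`, the thresholds and the usage counts are all assumptions made up for this example.

```python
def suggest_trailing_garbage_fix(value, counts, max_extra=3, max_uses=10):
    """Return an established value that `value` extends by 1..max_extra
    trailing characters, or None if there is no such candidate.

    `counts` maps each value of a single key to its number of uses.
    All thresholds are illustrative assumptions, not rules of the bot edit.
    """
    uses = counts.get(value, 0)
    if uses > max_uses:
        return None  # not low-use: should go through normal review
    for extra in range(1, max_extra + 1):
        candidate = value[:-extra]
        # only propose targets that are far more common than the suspect value
        if candidate and counts.get(candidate, 0) >= 100 * max(uses, 1):
            return candidate
    return None


# hypothetical usage counts for the key "tomb"
tomb_counts = {"tomb": 5000, "tomb22": 1, "tumulus": 9000, "tombstone": 2}

print(suggest_trailing_garbage_fix("tomb22", tomb_counts))
print(suggest_trailing_garbage_fix("tombstone", tomb_counts))
```

With these made-up counts, `tomb22` is matched to `tomb`, while `tombstone` yields no suggestion because removing up to three characters does not produce an established value. A match from such a check would still only be a candidate, to be confirmed by a human as an obvious mistake before going on the replace list.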

I do not understand why you plan to replace “direction=East” with “direction=E”.

In https://wiki.openstreetmap.org/wiki/Key:direction I found: “The values north/south/east/west are used as synonyms for N/S/E/W and should be treated the same way.”

Why not replace “direction=East” with “direction=east”?

(The 3 other directions might be treated the same way.)

direction=East is wrong and should be corrected, see Key:direction - OpenStreetMap Wiki, highlighting of “lowercase” by me.

Whether it’s “east” or “E” should make no difference …

Edit: @Mateusz_Konieczny , I approve these bot edits, especially the trail_visibility :smiley:

agree

I do not agree - I like the tags to be as human-readable as possible.
But this is not a very important topic to me.

Good hint. Me too (except for this minor topic).

There’s way too much to discuss in there. Please concentrate on one thing at a time. Some of your previous changes have been genuine typos, some have not. Please don’t ask everyone here to go through a ridiculously long list to work out which is which.

This is exactly the problem. You are outsourcing your drudgery to everyone else.

I notice that your page has this in it:

Opt-out
Please write at https://community.openstreetmap.org/ in thread where discussion has taken place.
See #Discussion

Can I therefore request this:

Before changing any tag, please check which projects are listed in taginfo as using that tag, to make sure that no projects will be adversely affected. As an example, the list for trail_visibility=none lists three projects but says “Processed as synonym for ‘no’”, so in that case it would be OK to “just change it” (unless another mapper can think of a compelling reason not to).

That way, we don’t all have to read through pages of guff to see the potential issues among obvious changes like =Bad to =bad.

Years ago (15 or so!) I used to regularly use “E” etc. in changeset comments and I repeatedly got asked what it meant. At the time I thought that it was obvious, but I was wrong. Also note the various values here - in most cases OSM uses the full word. The direction wiki page has more on this.

I do so already. I am using OpenStreetMap_cleanup_scripts/recurrent_bot_edits/shared_detect_suspicious_deprecations.py at master - matkoniecz/OpenStreetMap_cleanup_scripts - Codeberg.org to do it automatically.

This - among other things - handles “Processed as synonym for” (see OpenStreetMap_cleanup_scripts/recurrent_bot_edits/shared_detect_suspicious_deprecations.py at master - matkoniecz/OpenStreetMap_cleanup_scripts - Codeberg.org ) specially and does not yell about such entries.

Values being changed either are not listed at taginfo, or are explicitly listed as deprecated, or have the same descriptions as the new values.
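For readers who do not want to dig through the script, the three conditions above could be sketched like this. This is not the code from the linked repository: the function name `projects_needing_review` and the shape of the entries (dicts with `project` and `description` keys) are assumptions made for illustration, and fetching the listings from taginfo is left out.

```python
def projects_needing_review(old_entries, new_entries):
    """List projects that might be affected by replacing the old value.

    Both arguments are lists of dicts with "project" and "description"
    keys, one entry per project listing the tag. This shape is an
    assumption made for the example.
    """
    new_descriptions = {
        entry["project"]: (entry.get("description") or "").strip().lower()
        for entry in new_entries
    }
    flagged = []
    for entry in old_entries:
        description = (entry.get("description") or "").strip().lower()
        if description.startswith("processed as synonym for"):
            continue  # project already treats the old value as a synonym
        if "deprecated" in description:
            continue  # project itself lists the old value as deprecated
        if description and description == new_descriptions.get(entry["project"]):
            continue  # same description as for the new value
        flagged.append(entry["project"])
    return flagged


# hypothetical listings for an old value and its proposed replacement
old_value_entries = [
    {"project": "renderer_a", "description": "Processed as synonym for 'no'"},
    {"project": "router_b", "description": "Trail not visible"},
    {"project": "editor_c", "description": "Special handling"},
]
new_value_entries = [
    {"project": "router_b", "description": "Trail not visible"},
]

print(projects_needing_review(old_value_entries, new_value_entries))
```

In this made-up example only `editor_c` is reported, as it is the one project whose listing is neither a synonym note, a deprecation note, nor identical to its listing for the new value. A value that no project lists gives an empty result, which corresponds to the "not listed at taginfo" case.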

Also, while some project tagging may have been missed, I looked at it in enough detail to find some problems with the taginfo listings themselves (see, say, OsmAnd taginfo is incorrect · Issue #19427 · osmandapp/OsmAnd · GitHub - this one got fixed).

(code there is not the most beautiful - if someone is interested in reuse or correctness checking, feel free to let me know and I can spend some time on making it more reusable)

good news, this part is done already

How do you propose to split it? Separate thread by transformation type? (mixed case → lowercase, turning " " into “_” etc)
Separate thread for each key?
Separate thread for mechanical transformation (lowercasing, adding _, removing spaces/_, cutting endings etc) and separate for translations and separate for claimed synonyms which were not found automatically?
Separate thread for each changed value? (that would be absurd in my opinion, especially for low use ones but doable with some automation)

Separate threads for obvious changes and dubious changes are not doable: whatever I think is a dubious change was dropped before making this proposal, so this is supposed to contain only obvious ones.

This one does not seem really long - but how long should it be?
I actually merged it because I felt that opening separate thread about 4 values each used 3 times is quite silly, but I can do this

Obviously I would not open all threads at once in either case.
Though I think that splitting is worse than the form used here, I would go with it if people request splitting. Just let me know what kind of splitting would make people happy.

The wiki page itself seems confused: one version in the listed values, another version in the examples.

I will probably drop it from bot edit and change it manually (not sure yet which target I will choose)

the same for West, South, EAST etc.

trail_visibility=none → no seems questionable to me; I’d leave that out. It is also the one with by far the most instances. The rest look fine.

I approve of all cases changing key=Value to key=value. That seems a very straightforward fix that shouldn’t step into other issues. And for any case where key=value isn’t the right tag, well, neither was key=Value so nothing was lost.
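As a rough illustration of this category of change (not the bot's actual code), a case-only fix could look like the sketch below. The function name `lowercase_fix` and the set of known values are assumptions for the example; restricting the fix to values that are already known in lowercase is an extra safety choice of this sketch, not something the thread requires.

```python
def lowercase_fix(value, known_values):
    """Return the lowercase form of `value` if it differs from a known
    value only by letter case; otherwise return None."""
    lowered = value.lower()
    if lowered != value and lowered in known_values:
        return lowered
    return None


# hypothetical set of established values for some key
known = {"bad", "good", "excellent"}

print(lowercase_fix("Bad", known))
print(lowercase_fix("bad", known))
print(lowercase_fix("Awful", known))
```

Here `Bad` is mapped to `bad`, while `bad` (already fine) and `Awful` (no known lowercase counterpart) are left alone.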

It would be better to propose bot edits categorically like that, rather than having us trudge through all the values.

I will definitely do this if I ever have such a group again.

I can also split this list into parts if people want it.

How might the meaning differ between these two values?

I prefer the lists with exact values. (I’m a batch processor by nature.)

“none” is ambiguous, and could have been intended by the person who added it to mean anything from a lack of trail blazes (without discussing if the trail is visible otherwise), or as a way of saying that they didn’t know how the trail was marked, or a way to say: “there really isn’t a trail here”, or something else.

Wouldn’t that be the same for “no”?

I would have been glad, if that change didn’t seem questionable to anyone.

So I fixed my systematic error by hand → only 178 instances now.
I must have overlooked “no” the first time I used it …

The difference is “no” is documented on the wiki as having a specific meaning (so, in theory at least, people who used it could have done so based on that explanation), but “none” isn’t.

In at least one case, that’s a wiki edit that was made without any discussion. From a rendering perspective I personally handle both, (search here for mkgmap), so don’t care which is used. I can see the logic behind choosing one negative value (despite it being less grammatical English), but “someone updated the wiki” does not imply that other mappers, editor authors or data consumers even know that that happened.

But can you invent any, even entirely theoretical, meaning for none that would differ from no, and that would actually be plausible?

I can ask people whether they meant something different, but I feel a bit silly doing it for, say, horse=no2. For =none vs =no I cannot even imagine a potential divergent meaning that anyone would actually use.

This does not seem plausible at all to me, but I will drop this change then and ask the people using it.

Started Talk:Key:direction - OpenStreetMap Wiki at OSM Wiki - please comment there if you have an opinion about direction=E vs direction=east (or start a discussion elsewhere).

Removed

                'EAst': 'E',
                'WEST': 'W',
                'EAST': 'E',
                'East': 'E',
                'North': 'N',
                'West': 'W',
                'South': 'S',

transforms from this bot edit (code updated, wiki page updated)

trail_visibility=none → no dropped from bot edit (code updated, wiki page updated)
