Implementation of new tagging scheme of archaeological_site=

Ah, thanks - the scope is worldwide. I thought this was clear because there was no limitation, but I see it is not :slight_smile:
I will add this to the documentation.

It probably needs to be said explicitly: no, no data consumers and renderers have been informed except for those that just happen to have followed the discussion and no, there is no way that this can be automated.

If you feel that changing existing objects is worth the trouble, then YOU need to go and chase down consumers of the data, work out a campaign to inform ones that you can’t contact directly and so on.


To be clear, there is a way that data consumers can “register an interest” - they can update “taginfo” as a project, like @SimonPoole did for Vespucci here. If a project hasn’t done this then Simon’s absolutely right - there isn’t an easy way to do it other than to chase down each project manually.
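For reference, registering with taginfo means publishing a small JSON file describing which keys your project consumes. A minimal sketch following the taginfo/projects format might look roughly like this (the project name, URLs and contact details are placeholders, not a real project):

```json
{
  "data_format": 1,
  "project": {
    "name": "Example Map",
    "description": "Example renderer that consumes the site_type tag",
    "project_url": "https://example.org/example-map",
    "contact_name": "Jane Mapper",
    "contact_email": "jane@example.org"
  },
  "tags": [
    {
      "key": "site_type",
      "description": "Used to classify archaeological sites",
      "object_types": ["node", "way", "relation"]
    }
  ]
}
```

Once the file’s URL is registered with taginfo, the project shows up on the “Projects” tab of each key it lists, which is exactly what makes consumers discoverable before a retagging campaign.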


If you feel that changing existing objects is worth the trouble, then YOU need to go and chase down consumers of the data, work out a campaign to inform ones that you can’t contact directly and so on.

The thing is that existing objects have already been changed on a global scale, a month ago, and there is no fix so far. Are you suggesting these automated edits should be reverted in the meantime, or what is your proposal?
https://taginfo.openstreetmap.org/keys/site_type#chronology
example https://www.openstreetmap.org/changeset/129838666

Sure, that is a starting point, but I don’t think there is a “staring you in the face” place where you could find out about this, so it is a very insider kind of thing.


You should fix your quoting.

Isn’t that something that the creators of the mess should sit down and think hard about? They were supposedly making the world a better place (not).


I replied via email, should likely be fixed in discourse.

Probably worth just making the distinction that (I think) b-unicycling wasn’t using mechanical/automated re-tagging. They were systematically but, as far as I am aware, manually re-tagging items.

This is, essentially, a terminology change. It doesn’t break the OSM database.

I do agree, however, that this should have been discussed before being undertaken (as is suggested in the Automated Edits code of conduct) as it can (and did) result in broken projects/maps.

I also agree that @ChillyDL’s proposed solution is sensible.


ChillyDL:

As of today, 75,000 of the occurrences have already been mechanically re-tagged

Probably worth just making the distinction that (I think) b-unicycling wasn’t using mechanical/automated re-tagging. They were systematically but, as far as I am aware, manually re-tagging items.

It was mechanical because they were searching for objects with the site_type tag, removing this tag and adding an archaeological_site tag - on every item, until they got too much flak and stopped.

How is this not “mechanical”?

This is, essentially, a terminology change. It doesn’t break the OSM database.

it breaks every single data consumer who expects “site_type” tags

I think @ChillyDL’s plan is a good idea.

Out of curiosity, could you share the script with us here?

The automated edits code of practice implies automated/mechanical edits are:

[changes] made to objects in the database without review individually by the person controlling the edits

I don’t think this is the case here. b-unicycling was manually (individually) editing objects and, one could argue, therefore reviewing them as they did so. Although, I accept, it will depend on one’s definition of what counts as a review…

To be absolutely clear: I still think they should have discussed this change and come up with a transition plan. An accepted proposal isn’t enough. Their edits were systematic but I don’t think they were “mechanical”. Just a semantic distinction.

Again, not disagreeing, see the other part of my post…

But I mentioned “not breaking OSM” because the code of conduct (as written) is about the database not 3rd parties

The purpose of this policy is to avoid the database being damaged.

Again, to be clear. I don’t think this was the right way to go about things but I think it was probably a misunderstanding that an accepted proposal was enough to make widespread changes.


Yes, I am aware of this wording. I agree the guidelines are probably not really helpful in the case of “retagging” from tag A to B with the intention to not change anything, e.g. from phone to contact:phone.

If you take the guideline literally, doing this one by one (implying “individual review”) would not be forbidden. But if the spirit is that a few individuals should not overturn what thousands of mappers have “voted with their feet”, then it doesn’t matter whether you do it one by one or in a bigger batch, and you must get approval for such mass editing.

Regarding the individual review: it is very likely that b-unicycling didn’t know all those thousands of objects, so objects that were wrongly tagged with site_type=* before will be wrong as archaeological_site=* now (i.e. maybe no actual review has taken place).

In the end, I decided not to use the script, since Word and Excel would freeze under the sheer volume of 650,000 lines to process.

I used regular expressions in Visual Studio Code instead, copying the output from level0 there for processing:

  1. run overpass query for POIs with tag archaeological_site NOT with tag site_type, load into level0, copy plain text from there.

  2. duplicate lines containing archaeological_site and replace archaeological_site with site_type in the original lines, using this regex in Visual Studio Code:
    Find: (.*)archaeological_site =(.*)
    Replace with: $1site_type =$2\n$1archaeological_site =$2

  3. run overpass query for POIs with tag site_type NOT with tag archaeological_site, load into level0, copy plain text from there.

  4. duplicate lines containing site_type and replace site_type with archaeological_site in the original lines, using this regex in Visual Studio Code:
    Find: (.*)site_type =(.*)
    Replace with: $1archaeological_site =$2\n$1site_type =$2

  5. combine both texts in level0 and upload to OSM.

Further details in the documentation.
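The duplicate-and-rename passes (steps 2 and 4 above) could also be scripted directly, which avoids the editor freezing on large files. This is a minimal sketch assuming level0-style `key = value` lines; the `duplicate_tag` helper and the sample input are my own illustration, not part of the original workflow:

```python
import re

def duplicate_tag(text: str, old: str, new: str) -> str:
    """For every level0-style line carrying the tag `old`, insert a copy
    of the line with the key renamed to `new` directly before it,
    mirroring the VS Code find/replace in steps 2 and 4."""
    pattern = re.compile(rf"(.*){re.escape(old)} =(.*)")
    out = []
    for line in text.splitlines():
        m = pattern.fullmatch(line)
        if m:
            # Renamed copy first, then the original line (as in the
            # replacement string $1new =$2\n$1old =$2).
            out.append(f"{m.group(1)}{new} ={m.group(2)}")
        out.append(line)
    return "\n".join(out)

# Hypothetical level0-style snippet for illustration:
sample = "  archaeological_site = tumulus\n  historic = archaeological_site"
print(duplicate_tag(sample, "archaeological_site", "site_type"))
```

Lines that do not carry the tag pass through unchanged, so the output can be pasted back into level0 (or loaded into JOSM) as before.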


Just for completeness, you need to search for historic=archaeological_site + site_type=* as some usage of site_type=* is not for historical sites (and so we wouldn’t want to add archaeological_site to them).


And, of course, commercial projects may not want to document their tag-parsing “secret sauce” on taginfo.


(unrelated to archaeological sites, as was the complaint, but)

Experience suggests that that is unlikely to occur. In the meantime it probably makes sense to manually review anything that you have contributed by email (and maybe wait until you can contribute not by email).

(back on topic)

Yes, and that actually applies to pretty much any proposal that suggests “deprecating” existing usage of a particular tag without giving any thought to how the proposal can be implemented. Another recent example was here - good idea (in this case to harmonise tags for “diplomatic” entities), poorly documented proposal and even poorer implementation that was only spotted due to “objects disappearing”.


I am a bit stuck - I am about to upload the POIs with duplicate tags now, but level0 refuses to even load the data (too many?), while in JOSM I keep running into conflicts while uploading so I have to restart every time.
Is there a way I can just upload the non-conflicting POIs in JOSM?

Or would you have any other idea?

I absolutely would not use a “shoot yourself in the foot” tool like level0 for something like this. I’d use JOSM, and the workflow I’d use would be: query the data according to a published overpass query that people have agreed to, change the data in the way people have agreed to, upload the data.

However, before doing anything else, I’d make sure that the discussion was complete, and I’m not sure it is, yet (I don’t think you’ve addressed this comment, and the fact that you’re even thinking about using level0 for this suggests a misunderstanding of the best way to work with OSM data).

Thanks for the point. I take care of this in two ways:

  • there are no POIs tagged site_type= that are not part of the historic= group
  • the regex in the overpass query makes sure only POIs in the historic= group are selected:
nwr[~":*historic$"~".*"]["site_type"]
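As an illustration (my addition, not part of the original workflow), the behaviour of that key regex can be checked with Python’s `re` module, whose `search` has the same unanchored semantics as Overpass regexes (they only anchor where `^`/`$` appear):

```python
import re

# Key regex from the Overpass filter nwr[~":*historic$"~".*"]["site_type"].
# Unanchored search means it accepts any key ENDING in "historic",
# including lifecycle-prefixed keys such as "disused:historic".
key_regex = re.compile(r":*historic$")

for key in ["historic", "disused:historic", "site_type", "historic:civilization"]:
    print(key, "->", bool(key_regex.search(key)))
```

Note that keys like historic:civilization do not match, because the regex requires “historic” at the end of the key.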

That’s what I am doing.
I am at “upload the data” at the moment, and I wonder if there is a way to deal with the conflicts later and get the rest uploaded already. That is my question.