Mass remove `gnis:created` and similar tags? [final version presented]

Sure, I’d appreciate it.

Ideally, Wikidata would also have two separate items, one for the human settlement and the other for the township. But since that’s a lot of work for what most other sources conflate, this error is commonly resolved by adding a subject has role (P2868) qualifier to both statements, as seen in the item for the neighboring township of West Orange (Q932601).

I did notice that in the Wikidata entry for Dedham. But then it still shows up in the constraint violation report.

The qualifiers are well-established as an exception to the constraint, as documented on the property’s talk page. I’m unsure why the report ignores the constraint’s separator (P4155).

This QLever query finds 557 violations of this constraint, taking the separator into account. The set includes a high concentration of towns in Maine and New Hampshire.

Maybe if there are multiple GNIS Feature ID (P590) values then all the values have to have the subject has role (P2868) qualifier?

I noticed that one of the Feature IDs for Dedham did not have the qualifier.
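
To spot cases like the Dedham one, here is a minimal sketch that lists the GNIS Feature ID (P590) statements lacking a subject has role (P2868) qualifier. It assumes the item is available as Wikidata's standard entity JSON (`claims` keyed by property, each statement with an optional `qualifiers` map). The function name and the sample IDs are made up for illustration, and it says nothing about how the constraint report itself is computed.

```python
def statements_missing_role(entity, prop="P590", qualifier="P2868"):
    """Return the values of `prop` statements that have no `qualifier`."""
    missing = []
    for statement in entity.get("claims", {}).get(prop, []):
        if qualifier not in statement.get("qualifiers", {}):
            # Entity JSON keeps the value under mainsnak.datavalue.value
            snak = statement.get("mainsnak", {})
            missing.append(snak.get("datavalue", {}).get("value"))
    return missing


# Hypothetical, hand-written entity fragment (the IDs are placeholders)
example = {
    "claims": {
        "P590": [
            {
                "mainsnak": {"datavalue": {"value": "0000001", "type": "string"}},
                "qualifiers": {"P2868": [{}]},
            },
            {"mainsnak": {"datavalue": {"value": "0000002", "type": "string"}}},
        ]
    }
}

print(statements_missing_role(example))
```

Any value it prints is a statement that would still need the qualifier, if the guess above about the constraint is right.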

+1, makes sense (though as it is not a removal, it will not be handled as part of this bot edit; if this tagging irritates anyone, I would encourage them to fix it)

oh definitely!

I thought I might just fix the tags, but based on the first element I looked at, there may be some more complex issues.

Relation: Glade Creek Reservoir (3881986) | OpenStreetMap was tagged with alt_name=Beckley Water Supply Number One Lake and alt_name:gnis:feature_id=1559275, but that feature is actually 2.5 miles to the west (and apparently no longer present). The correct ID for Glade Creek Reservoir is 1539425.

I’ll take a look at the other features with the alt_name:gnis:feature_id tag later.

The discussion here prompted me to send another set of updates to USGS. I’ve included Middle Fork Salt Creek and Middle Branch Shade River in that list. It usually takes a month or so for USGS to reply, but when they do I’ll go back and update the features with the correct information.

I fixed all the features with alt_name:gnis:feature_id. I think a lot of what happened was that old imported reservoir and dam nodes were merged with the wrong features.

I guess that means that

        "alt_gnis:feature_id", # https://www.openstreetmap.org/relation/7132203 https://www.openstreetmap.org/relation/274921
        "gnis:id_2", "gnis:id_1", # why does https://www.openstreetmap.org/node/150952282 have both? I will create a note if no one investigates this
        "gnis:feature_id_1",
        "gnis:feature_id_2",
        "gnis:feature_id_alt",
        "gnis:feature_id2",

are still available for cleanup? (Unless some of these keys actually make sense?)

No, none of those keys make sense.

If you were to just append the values to the gnis:feature_id key separated with semicolons, that would at least normalize the keys. Then someone could go and clean up the values later.
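
A minimal sketch of that normalization, assuming tags are handled as a plain Python dict. The variant key list is copied from the post above; the function name and the sample tags are hypothetical, and this is not the code used by the bot.

```python
# Variant keys quoted earlier in the thread
VARIANT_KEYS = [
    "alt_gnis:feature_id",
    "gnis:id_1",
    "gnis:id_2",
    "gnis:feature_id_1",
    "gnis:feature_id_2",
    "gnis:feature_id_alt",
    "gnis:feature_id2",
]


def normalize_gnis_ids(tags):
    """Return a copy of `tags` with variant keys folded into gnis:feature_id."""
    result = {k: v for k, v in tags.items() if k not in VARIANT_KEYS}
    values = []
    # Existing gnis:feature_id values come first, then the variants in list order
    for key in ["gnis:feature_id"] + VARIANT_KEYS:
        for part in tags.get(key, "").split(";"):
            part = part.strip()
            if part and part not in values:  # skip empties and duplicates
                values.append(part)
    if values:
        result["gnis:feature_id"] = ";".join(values)
    return result


# Made-up tags for illustration only
sample = {
    "name": "Example",
    "gnis:feature_id": "111",
    "gnis:id_1": "111",
    "gnis:feature_id_2": "222",
}
print(normalize_gnis_ids(sample))
```

This only normalizes the keys. Whether each appended ID actually belongs to the feature still needs a human check, as the Glade Creek Reservoir case shows.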

Honestly, it’s not a high priority. These things are insignificant compared to the vast numbers of GNIS features that are either missing from OSM or out of date in OSM. And if we can get to a point where those problems can be solved, these minor inconsistencies will get cleaned up in the process.

[Edit]

However, I have noted some queries for USGS:

Relation: Loretta Lake (7132203) | OpenStreetMap - Possible duplicate GNIS records that could be merged.

Relation: Great Swamp (274921) | OpenStreetMap - GNIS, NHD, and USGS topo all disagree about the extent and naming of “Black Swamp” or “Great Swamp.” This is a good one for BGN to resolve.

Node: Riverview (150952282) | OpenStreetMap - Likely duplicate GNIS records.

oh definitely not a high priority, though cleaning them would make other GNIS synchronization a tiny bit easier

Mechanical Edits/Mateusz Konieczny - bot account/remove not needed GNIS tags - OpenStreetMap Wiki now has a finished proposal for the bot edit.

Though obviously feedback and changes are welcome; the proposal will remain open waiting for them.

Right now I plan to wait a week (most of the bot edit was presented already), but I can also wait two weeks as usual if anyone requests it.

Looks good, but maybe you don’t want to limit the editing to objects in Poland?

Oops. Changed.

I support removing most of the GNIS cruft, so I think this mass edit is a good idea. I will say that I do actually use gnis:reviewed=no when editing: it’s nice to see at a glance whether, say, a place of worship or post office is likely to ever have been touched, since their locations tend to be a bit off. But obviously it’s not a very reliable indicator, since people can forget to remove it, and I suppose you could get the same information from a node’s history. But it was spared in the previous cleanup attempts up-thread, so I would favor leaving it in the database after this one as well. But I wouldn’t object very strongly if others wanted to get rid of it.

Also, I think a lot of these points have ele imported from GNIS too. I tend to find these values not very accurate, and they lead to a funny problem: when a GNIS node is merged with a building in Los Angeles, where the buildings were also imported with elevations, the building/POI node gets two semicolon-delimited ele values. Did you consider removing imported ele values as well? Maybe only from some classes where they aren’t particularly relevant? I guess it’s hard to tell if it’s been verified.

Like many of the building datasets that have been imported, the GNIS import set the ele value to whatever value an elevation model has at that coordinate. If the coordinate is off, so is the elevation. As far as I know, GNIS contains no surveyed elevations, but ele could contain something surveyed; you’d have to check the history to rule out that possibility.

In the past, whenever I merged GNIS-imported features, I’d pick one of them instead of keeping them both around. Usually it’s a slight difference, again due to the different coordinates associated with each feature. On the other hand, I’ve rarely remembered to remove the ele tag when moving a GNIS node a significant distance away. Such is the challenge of maintaining a variety of secondary attributes on features.

The ele values imported from GNIS are, to use a technical term, crap. In the old data set, the values in GNIS were interpolated using an elevation model that was not particularly accurate. The old GNIS data had no surveyed elevation data.

The new GNIS data set doesn’t even include elevation data. (Thankfully.)

If you happen to find a GNIS feature that has an elevation value and should have an elevation value, it would be good to validate or correct it using either a coincident vertical benchmark in the USGS topo layer or the current 3DEP data. 3DEP is usually better because USGS benchmarks aren’t always where you’d expect them to be. I use a JOSM script that pulls elevation data from 3DEP to update these values and I’ll post that script when I get around to it.

Of course, there are plenty of large areas (such as civil boundaries) that were assigned elevation values in the old GNIS data set and these values were imported into OSM. These ele values are complete nonsense and should be deleted.

Actually, in an older version I proposed keeping it and was asked to also remove it.

I have no very strong preference either way.

For now, per “But I wouldn’t object very strongly if others wanted to get rid of it,” I will keep it on the “to remove” list. But if you repeat the objection, or if anyone else objects, I can drop it or start a poll on whether it should be kept or removed (requiring at least 85% support for removal).

Not really. It seemed weird to put it on, say, a school POI, but it is less confusing than other GNIS tags, and checking its source would require trickier analysis.

And it seemed to be potentially controversial.

I guess that means they should be removed? But I would say that it would require a separate discussion and a more advanced analysis of the objects, including their history. I actually have a tool for that, but I’m not sure whether it would work for so many objects. A different approach may be better: identify the GNIS import edits and remove any ele tags added in them.

Well, not necessarily. At least not for all types of features.

For things like natural=peak the GNIS ele value is probably within a few meters of being correct. That’s better than not having the value at all. And the ele values for mountains in OSM have often been manually corrected using better sources.

For things like waterway=stream or natural=valley, the ele value from GNIS really isn’t useful within OSM and could be removed.

oh, that would be avoided by checking object history and removing only ones not modified (though that would remove ones verified to be correct so may be a bad idea anyway)
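
A rough sketch of that history check, assuming the object's history has already been fetched and reduced to a list of tag dicts, oldest version first, and assuming version 1 is the GNIS import. The input shape and the function name are hypothetical, and this is not the tool mentioned above.

```python
def ele_untouched_since_import(versions):
    """True if ele was set in the first version and never changed or removed.

    `versions` is a list of tag dicts, oldest first (assumed input shape).
    """
    if not versions or "ele" not in versions[0]:
        return False
    imported = versions[0]["ele"]
    # Every later version must still carry exactly the imported value
    return all(v.get("ele") == imported for v in versions)


# Made-up histories for illustration
unchanged = [{"ele": "120", "name": "A"}, {"ele": "120", "name": "A School"}]
corrected = [{"ele": "120"}, {"ele": "123"}]
print(ele_untouched_since_import(unchanged), ele_untouched_since_import(corrected))
```

As noted, an unchanged value may still have been checked and found correct by a mapper, so this can only narrow down candidates.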

A better approach might be to remove the ele tags from things that don’t need to have ele tags based on the primary feature tags (or GNIS Feature Class if that’s available).
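
That could look something like the sketch below. The keep and drop lists only cover the feature types mentioned in this thread; `boundary=administrative` is my assumption for "civil boundaries", and anything not listed is left for manual review. The names are illustrative, not part of any existing tool.

```python
# Primary tags where an imported ele is plausibly useful (from the thread)
KEEP_ELE = {("natural", "peak")}

# Primary tags where an imported ele is not useful (from the thread);
# boundary=administrative is an assumed stand-in for "civil boundaries"
DROP_ELE = {
    ("waterway", "stream"),
    ("natural", "valley"),
    ("boundary", "administrative"),
}


def ele_action(tags):
    """Suggest what to do with the ele tag: none, keep, remove or review."""
    if "ele" not in tags:
        return "none"
    pairs = set(tags.items())
    if pairs & KEEP_ELE:
        return "keep"
    if pairs & DROP_ELE:
        return "remove"
    return "review"  # unknown feature type: leave it for a human


print(ele_action({"natural": "peak", "ele": "1200"}))
print(ele_action({"waterway": "stream", "ele": "300"}))
print(ele_action({"amenity": "school", "ele": "95"}))
```

Checking the keep list first means an object that matches both lists keeps its ele, which errs on the side of not deleting data.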