Mechanical edit proposal to clean up the street=* tag

The only documentation I was able to find about the street=* key is that it is listed as a possible tagging error in the documentation for addr:street=*. So as far as I can tell, it is basically undocumented.

I am not aware of any, but I also do not know how to search for this. Is there a list where I can see which tags are in some way recommended by iD or JOSM?

There is this refugee camp I already mentioned. The data import was in 2014 and is documented here. None of these objects are part of my mechanical edit proposal. I will keep trying to contact the people responsible for this. If that does not work out, I would include it in the MapRoulette-Challenge for others to have a look.

There is this area in London. About 90% of the objects where street=* is equal to addr:street=* seem to be in this area, added by one user. I just reached out to this user in a changeset discussion and am now waiting for feedback.

As long as street=* is undocumented, an editor could display a warning like “This is an unusual tag. Consider using addr:street=* or highway=* instead.” After the MapRoulette-Challenge is finished or close to being finished, the people who participated could probably come up with some ideas for such warnings.

Maybe tags such as housenumber=*, postcode=* or city=* are also worth a look. In total there are 7495 objects using at least one of these tags. But at first glance I can’t see much of a concept. Sometimes it is city=name of city, so city=* should be name=*. Sometimes it is city=yes, probably to indicate that this object is a city. A lot of housenumber=* tags seem to have a numeric value, as you would expect from an addr:housenumber=*. I suggest including some more of these keys that are missing the addr: prefix in the MapRoulette-Challenge.

I just reached out to them via changeset discussion.

I understand that. That is why I limited my initial proposal to cases where addr:street=* carries the same information as street=*.

If it takes you 30 seconds for every correction via MapRoulette, ca. 600 objects still add up to ca. 5 hours of work. And if there is a safe way to automate this without the risk of damaging data, I am sure that this is something that is worth doing.
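
The estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name is made up for illustration, and the 30 seconds and 600 objects are the rough figures from the post, not measured values.

```python
def manual_effort_hours(objects: int, seconds_per_object: float) -> float:
    """Total manual review time in hours for a given number of objects."""
    return objects * seconds_per_object / 3600


# Rough figures from the post: ca. 600 objects at 30 seconds each.
print(manual_effort_hours(600, 30))
```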

There are exactly 8 street=* on relations, so including/excluding them will probably not change much. But I will consider this.

I think the purpose of the MR challenge is to gain insight from random mappers. It might make sense to group suspected tag duplications by their location, so that within each group the tag is probably being used in the same way.

The same tag may be used in different ways as there is no documentation to form a baseline reference.

The instructions should ask for a simple reason why a change was made. The responses should be enough to start forming a picture of how the tag is being understood in different places.

street seems to be safe to remove automatically here

I would like to do the possible mechanical cleanup before creating a MapRoulette-Challenge. The other way around would lead to a MapRoulette-Challenge with lots of already fixed problems.

Instead of a wide-ranging discussion about how to design a MapRoulette-Challenge for further cleanup, I would like to focus the discussion for now on one question: Should street=* be removed by a mechanical edit from objects where addr:street=* stores the same value? Even though this is a consensus-based discussion rather than a democratic vote, a poll is a good orientation for that:

  • I endorse a mechanical edit where street=* is equal to addr:street=*
  • I oppose a mechanical edit where street=* is equal to addr:street=*

I documented the proposed mechanical edit in the wiki and added an example.
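
For readers who want to see the condition spelled out, here is a minimal Python sketch of the rule as described in this thread: street=* is dropped only when addr:street=* holds exactly the same value. The function names are hypothetical, tags are modelled as a plain dict, and the exact string comparison is my assumption; the edit as documented in the wiki may differ in detail.

```python
from typing import Dict


def is_redundant_street(tags: Dict[str, str]) -> bool:
    """True if street=* is present and identical to addr:street=*."""
    street = tags.get("street")
    return street is not None and street == tags.get("addr:street")


def remove_redundant_street(tags: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the tags without street=* if it only duplicates addr:street=*."""
    if not is_redundant_street(tags):
        # Values differ or one key is missing: leave the object untouched.
        return dict(tags)
    return {key: value for key, value in tags.items() if key != "street"}


example = {"building": "yes", "street": "High Street", "addr:street": "High Street"}
print(remove_redundant_street(example))
```

Objects where the two values differ, or where addr:street=* is missing, are returned unchanged and would be left for manual review.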

As the poll was open for more than 24 hours and got 9 votes, all supporting the mechanical edit, I executed the edit with changeset 145806155.

@os-emmer Just out of curiosity, could you link the MR challenges once you have created them?

That was my plan :stuck_out_tongue:


I had a look into the usage of housenumber=*. It is used 306 times at the moment. About half of these objects seem to have been added by a single user. These look a lot like typos for addr:housenumber=*. I contacted the user in the changeset discussion. There is no object where housenumber=* and addr:housenumber=* store the same value.

Another 80 uses of housenumber=* are in combination with emergency=* tags for some kind of fire hydrant or water tank. As this combination is used by several users in different places (all in Russia), I wonder if these were intended uses. Does anyone know anything about this? The examples I checked were created by accounts that have only a few edits and have been inactive for a long time now.

Most of the uses of housenumber=* are probably typos for addr:housenumber=*. Some of them may also be addr:housename=* or ref. I think that this can be included in one MapRoulette-Challenge with the remaining street=*.


postcode=* is undocumented and used exactly 7 times, spread around the globe. As the number is so low, it can be included in the MapRoulette-Challenge.

One such edit in Greece was done by a Kaart member. It was properly tagged addr:housenumber initially, but they changed it to housenumber. (That specific Kaart member has made quite a few mistakes, so it’s not entirely surprising.)

Thank you anyway for this search you are conducting.

Far more interesting: Is there any instance where there are both keys present but they store different values? From the examples I’ve seen, there simply is no addr:housenumber tag present. If not, I would support an automated edit for all objects with building=*.

Maybe others have more intel on that, but I would leave that as is / ask in the changeset comments. If there is a meaning to it, a wiki page would be nice.

+1

Interesting. Did you find multiple of these or just this one?

Yes, exactly 3:

I agree. As there is some kind of system visible, this could be excluded from a MapRoulette-Challenge for now.

In all 3 cases in Greece, that change was done by two Kaart members. (the other two cases 1,2).

3 cases is not much. As you already changed these 3 occurrences, do you know why it was mapped like this?

3 cases is indeed not much, it was just weird that all 3 of them were done by Kaart. Due to a case I had with them involving DWG in 2020, I understood that Kaart doesn’t properly train all of their mappers and they don’t even quality check their edits, which is why I’m cautious when I see an edit from them, especially when it’s a modification and not creation.

Those 3 specific cases though, had mistagged the postal code and place in addr:housenumber anyway, which is why I moved them to the appropriate addr:postcode and addr:city. Although, I think there was a recent discussion about using addr:street for cases where there are no named roads and POI’s actual address is usually just the postal code and place, usually in villages. I’m not entirely sure about that, I have to search for it.

Then leave them out of the automated edit, but I think the rest (of the buildings) are fine to edit automatically.

That is something that would be interesting to know during the MapRoulette-Challenge. Please let us know if you find it.


I think that to clean up housenumber=* we need more filters. There are, for example, housenumber=B1-32, housenumber=1/10 or housenumber=7E1. These could be house numbers, but that should be checked manually. Then there are 4-digit values of housenumber=*. These could be addr:housenumber=* but also addr:postcode=*. That should be checked manually as well.

A mechanical edit like this could work:

  1. Search for everything with housenumber=* and building=*.
  2. Filter out everything that has addr:housenumber=*
  3. Filter out every object where housenumber=* is not numerical.
  4. Filter out every object where housenumber=* has more than 3 digits.
  5. Move housenumber=* to addr:housenumber=*
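
The five steps above can be sketched in Python as follows. This is only an illustration under my own assumptions: tags are modelled as a plain dict, the function name is hypothetical, and "numerical" is read as "ASCII digits only". It is not the script of an actual edit.

```python
from typing import Dict, Optional


def move_housenumber(tags: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Return the corrected tags, or None if the object must be checked manually."""
    value = tags.get("housenumber")
    # Step 1: the object needs both housenumber=* and building=*.
    if value is None or "building" not in tags:
        return None
    # Step 2: skip objects that already have addr:housenumber=*.
    if "addr:housenumber" in tags:
        return None
    # Step 3: skip values that are not purely numeric (e.g. B1-32, 1/10, 7E1).
    if not (value.isascii() and value.isdigit()):
        return None
    # Step 4: skip values with more than 3 digits (they could be postcodes).
    if len(value) > 3:
        return None
    # Step 5: move housenumber=* to addr:housenumber=*.
    new_tags = {key: val for key, val in tags.items() if key != "housenumber"}
    new_tags["addr:housenumber"] = value
    return new_tags


print(move_housenumber({"building": "yes", "housenumber": "12"}))
print(move_housenumber({"building": "yes", "housenumber": "B1-32"}))
```

Returning None for everything that fails a filter keeps the uncertain cases out of the automated change, so they stay available for a manual check.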

I can’t tell exactly right now but I think that would edit 150-200 objects.

What do you think about such a mechanical edit?

I think MR is a safer choice since most of the problems are transposition errors. If it were just a matter of consistent misuse of prefixes, near duplicates or other strange patterns, I would agree with a mechanical edit. In this case, MR would let those with local knowledge recognize what information was out of place. They should also be able to correct most incorrect values. This is perfect for the example of a zip code tagged as a house number.

How do others see this? Clean housenumber=* up as proposed by me or use MapRoulette?

Here are some statistics:

| Search filter (each row adds to the previous ones) | Number of objects |
| --- | --- |
| object has housenumber=* | 303 |
| object additionally does not have addr:housenumber=* | 300 |
| object also has building=* | 207 |
| housenumber=* is also numeric | 175 |
| housenumber=* additionally has no more than 3 digits | 173 |

All of them happen to be ways. Here is an overview:

Independently of my question about the proposed mechanical cleanup of housenumber=*, I have another question. For the MapRoulette Challenge, do you want separate small challenges or one big one?

  1. I could divide them by tag (3 challenges, one each for street=*, housenumber=* and postcode=*)
    Pro: It is easier to concentrate on one tag instead of checking 3.
    Con: An object that has several of these tags may get completely corrected in one edit, but it would still pop up in the other challenges and would need to be marked as “already resolved” manually.
  2. I could divide them by region (one challenge for every continent, country or group of countries)
    Pro: This was already suggested here, and it is easier to focus on an area you feel confident with.
    Con: As you can choose your area on the map anyway, this is not really needed.

Right now I prefer one big challenge, but since I have seen country-specific challenges in the past, I wonder if there is a good reason for that.

Here is a map with every object that has

  1. street=*
  2. housenumber=* but not emergency=* (see here for explanation)
  3. postcode=*
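
Expressed as a Python predicate, the selection above could look like the sketch below. The function name is hypothetical and tags are again modelled as a plain dict; I am assuming the emergency=* exclusion applies only to the housenumber=* criterion, as the list is written.

```python
from typing import Dict


def matches_selection(tags: Dict[str, str]) -> bool:
    """True if the object has one of the three keys listed above."""
    if "street" in tags or "postcode" in tags:
        return True
    # housenumber=* only counts when emergency=* is absent.
    return "housenumber" in tags and "emergency" not in tags


print(matches_selection({"housenumber": "5", "emergency": "fire_hydrant"}))
print(matches_selection({"postcode": "12345"}))
```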

As of now there are 3241 matches.

How should the MapRoulette Challenge be organised?
  • One big Challenge
  • Divide by tag
  • Divide by continent/country/area

There does not seem to be much support for a second mass edit for housenumber=*, so I will skip this.


My question about the design of the MapRoulette Challenge(s) got only 4 voters and no real consensus. I have now created one challenge for all 3 tags, excluding Jordan because of the refugee camp already mentioned, and excluding housenumber=* where emergency=* is set because there seems to be some kind of local concept for this in some places in Russia.
