Cleanup of name= fields describing buildings

Thanks for this effort. I sent them a message both in the changeset discussion and by DM.

I'll be heading to State of the Map US so probably won't get much more editing in until next week. Just for fun, here's an Overpass query that shows the extent of things I think are more or less easily reviewable. It's about 3000 things. Somewhat of a challenge but definitely something achievable.

Thanks for talking me through things so far. I've really been enjoying the change of pace from my usual mapping and getting to see a bit of different scenery!

[out:json][timeout:1500];
{{geocodeArea:Indonesia}}->.searchArea;
(  
  nwr["name"~"^rumah$",i](area.searchArea);
  nwr["name"~"^rumah warga$",i](area.searchArea);
  nwr["name"~"^gudang$",i](area.searchArea);
  nwr["name"~"^bangunan$",i](area.searchArea);
  nwr["name"~"^bandunan$",i](area.searchArea);
  nwr["name"~"^pabrik$",i](area.searchArea);
  nwr["name"~"^sekolah$",i](area.searchArea);
  nwr["name"~"^pos ronda$",i](area.searchArea);
  nwr["name"~"^kios$",i](area.searchArea);
  nwr["name"~"^ruko$",i](area.searchArea);
  nwr["name"~"^bangunan masjid$",i](area.searchArea);
  nwr["name"~"^kebun$",i](area.searchArea);
  nwr["name"~"^sawah$",i](area.searchArea);
  nwr["name"~"^irigasi$",i](area.searchArea);
  nwr["name"~"^irigasi mati$",i](area.searchArea);
  nwr["name"~"^perkebunan$",i](area.searchArea);
  nwr["name"~"^padang rumput$",i](area.searchArea);
  nwr["name"~"^pepohonan$",i](area.searchArea);
  nwr["name"~"^jalan$",i](area.searchArea);
);
// print results
out body;
>;
out skel qt;
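
For anyone who wants to extend the list, here is a minimal Python sketch that generates an equivalent query from a list of words, folding them into a single case-insensitive regex instead of one `nwr` line per word. The function name `build_query` and its parameters are made up for illustration; the word list is copied from the query above, and `{{geocodeArea:...}}` is the Overpass Turbo shortcut already used there.

```python
# Generic names copied from the query above; extend or trim as needed.
GENERIC_NAMES = [
    "rumah", "rumah warga", "gudang", "bangunan", "bandunan", "pabrik",
    "sekolah", "pos ronda", "kios", "ruko", "bangunan masjid", "kebun",
    "sawah", "irigasi", "irigasi mati", "perkebunan", "padang rumput",
    "pepohonan", "jalan",
]


def build_query(names, area="Indonesia", timeout=1500):
    """Return an Overpass Turbo query matching any of `names` exactly, ignoring case.

    `build_query` is a hypothetical helper, not part of any OSM tool.
    """
    # dict.fromkeys drops duplicates while keeping the original order.
    unique = list(dict.fromkeys(n.strip().lower() for n in names))
    if not unique:
        raise ValueError("at least one name is required")
    for n in unique:
        # Only letters and spaces are allowed, so nothing can be read as regex syntax.
        if not n or not all(c.isalpha() or c == " " for c in n):
            raise ValueError(f"unsupported characters in name: {n!r}")
    alternation = "|".join(unique)
    lines = [
        f"[out:json][timeout:{timeout}];",
        "{{geocodeArea:" + area + "}}->.searchArea;",
        # One anchored, case-insensitive regex replaces the per-word lines.
        f'nwr["name"~"^({alternation})$",i](area.searchArea);',
        "out body;",
        ">;",
        "out skel qt;",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_query(GENERIC_NAMES))
```

Pasting the printed text into Overpass Turbo should select the same objects as the hand-written version, since every word is still anchored with `^` and `$`.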

Comment from adiatmad about 1 hour ago
Hello, your edits on OSM are very good, but there is no need to add descriptive names (Names - OpenStreetMap Wiki) such as "name=RUMAH", because the tag building=yes is already sufficient.

I also see that several people have sent you messages (OpenStreetMap (OSM) Changeset Discussions). You are very welcome to discuss things here if there is anything you would like to ask =)

Strictly speaking, "rumah" https://en.m.wiktionary.org/wiki/rumah in a few mutually intelligible languages seems to have the literal meaning of "building=house" on OpenStreetMap, which is more specific than "building=yes". Yes, this distinction is important, and if local mappers can be more specific, it is always better. So

building=house
name=RUMAH

Is something you might even consider suggesting to id-tagging-schema as a generic name, because the name literally repeats the tag. One example of how iD would warn the user is this changeset, which used English: https://www.openstreetmap.org/changeset/135645198 . The warnings:suspicious_name:generic_name=1 check works for English but not for Indonesian (and by extension, this would also apply to other languages which use the same words; in some rare situations with close but not identical languages, such as Portuguese and Spanish, there's some confusion in QA tools, so consider how to reduce false positives, because suggestions to id-tagging-schema could also start to cause a lot of issues).

However,

building=yes
name=RUMAH

Is different: building=yes is more generic than building=house, and if the name is descriptive, it is not repeating the tag.
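
To make the distinction concrete, here is a small Python sketch of the check being argued for. It is only an illustration of the argument, not how iD's validator works; the function `name_vs_tag` and the lookup table are hypothetical, and only the "rumah" to "house" pairing comes from this discussion.

```python
# Hypothetical lookup: generic word -> the building=* value it literally restates.
# Only "rumah" comes from this thread; a real list would need review by local mappers.
GENERIC_WORD_TO_BUILDING = {"rumah": "house"}


def name_vs_tag(tags):
    """Classify how a generic name relates to the element's building=* tag."""
    name = tags.get("name", "").strip().lower()
    implied = GENERIC_WORD_TO_BUILDING.get(name)
    if implied is None:
        # The name is not in the generic-word list, so this check says nothing about it.
        return "not_generic"
    if tags.get("building") == implied:
        # building=house + name=RUMAH: the name only repeats the tag.
        return "repeats_tag"
    # building=yes + name=RUMAH: the name carries detail the tag does not.
    return "more_specific_than_tag"


print(name_vs_tag({"building": "house", "name": "RUMAH"}))
print(name_vs_tag({"building": "yes", "name": "RUMAH"}))
```

Under this reading, only the first case is a plain "remove the name" fix; the second suggests moving the information into building=house before dropping the name.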

PS: just to be clear about the changeset message I quoted, I might not be considering the context. For example, when giving feedback to the mapper, assume the mapper may actually be using "rumah" for something that, if someone else surveyed it, would turn out to have the English meaning not of "house" https://en.wiktionary.org/wiki/house#English, but of "building" https://en.wiktionary.org/wiki/building#English (and this is statistically more significant than the pure guesswork any armchair mapper could do without local knowledge; e.g. a local mapper using imagery and labeling everything as "rumah"). My argument is that if it is possible to reasonably affirm building=house, it is better to use building=house, not building=yes. A data consumer will likely assume building=house is more meaningful.

and

Please stop doing that. In particular, explicitly tell everyone on "OSM US Slack" doing QA to stop doing it. Let me explain.

I already complained recently on Telegram (around this message https://t.me/OpenStreetMapOrg/103550) about some kind of foreign wiki trolling plus calling DWG to try to enforce QA.

Whether DWG decides to engage or not is not my point here, because it turns out that DWG might be called in to try to enforce one worldview when another mapper decides to ignore what makes no sense (and this becomes very obvious when the message itself is in a foreign language, not merely the arguments; sometimes it can be both the language and poor arguments). But this trolling gets locals so angry, to the point of being pissed off, that they simply use offensive language in English, because the information is already contradictory, and even those who understand English could simply decide to ignore nonsense. I would say that the increasingly common "0-day block trick", used against a lack of changeset response from mappers when different people writing in English complain about a local mapper, also looks like trolling, like here

then it worked

about tagging usage, but note that not only is there no documentation in their language to explain the alternatives, but the Wiki as a whole actually allows uses different from the one being imposed, which is justified by the English wiki version alone, without waiting for feedback from local mappers.

So, I could agree with "Their edits are generally fine otherwise so I presume it's a language barrier issue.", but it is unclear to me whether the problem is less about the local mappers (and the lack of localized wiki documentation) and more about a poor understanding of English itself, plus arbitrary, literal mass edits based on the Wiki, where the non-local mappers do the bare minimum to avoid complaints to DWG and, if any arise, blame the local mappers.

With all this said

If "the plan" is not merely to discuss voluntary adoption but to implicitly enforce it, then the same principle (mass editing, with rudimentary language knowledge needed, and likely not even that) could then be used by foreigners editing in the USA: focusing on changeset comments to editors who have not been recently active in your rural areas (which would mirror the likelihood of a fast response). That's why I think it is a good idea to talk with everyone on OSM US Slack about how your QA-fixers are "fixing"/"cleaning" overseas.

Either I don't understand that, or I don't agree with it. Whilst there can be issues associated with the mapping of places in country A by people in country B who are not entirely familiar with country A, I don't think that this is one of those.

I definitely do agree that it makes sense for messages (changeset discussion comments, and also "please read your changeset discussion comments" messages) to be in a language that the recipient is likely to understand, and it's not safe to assume that all 8 billion people have a working knowledge of English or a means to translate it. Google Translate et al aren't perfect, but at least there will be some words in there that the recipient will recognise.

Perhaps you could rephrase exactly what you mean here because otherwise I suspect it might be misinterpreted.

(and yes, if someone is persistently adding "name=house" in whatever language to every house they map, then a little friendly help explaining how OSM tagging works is needed)

I do agree with all this.

Yes, we're not talking about the same problem. And, to oversimplify, it would be name=rumah, not name=house.

But to comment on the linguistics part, terms can have, and often will have, very different uses. Even within one language such as Portuguese, European Portuguese and Brazilian Portuguese can have contradictory definitions without any slang or regionalisms involved, which from time to time leads to mistagging from iD suggestions and disproportionately affects places outside Brazil. Also, at the country border, both Spanish and Portuguese can lead to warnings in validators. And in this thread, even machine translation is being used, which can introduce additional errors.

But again, I'm talking about a different issue, and I would invite others from "OSM US Slack" to pay attention to how such QA-fixing is being done.

Sorry, but you are even aware that this could be an addressing scheme, yet it is still okay to simply remove it without asking first?

While I cannot discuss this case in particular, some kinds of patterns used on OpenStreetMap are used for addressing. Here in Brazil this even includes emergency response, which uses OSM data on offline maps where the central government does not have official addressing. So, if this kind of undiscussed editing happened in one of these spots here, there's a chance that the first actual complaint about the undiscussed edit would be something like the police not finding a place in a rural area, in the worst case possible. Also, the offline apps used with OpenStreetMap can typically go up to a month on whatever data was stored, so even once the problem is discovered, it could be almost a month before the data works again.

Please do not ask the locals to "revert" in case of error, because if you are not sure and don't even take the time to check, then you shouldn't do it.

I'll try to cover the various questions/comments above. Let me know if I've missed anything. I'll start with specific things and get more general.

Potential addresses in name fields
To start, I have reverted 135992512. Happy to do so. Presuming they are addresses, I would hope they eventually get moved to a tag that has semantic meaning.

Helping mappers with general mapping practice
I take your point about interactions in a non-native language. I worry about that a lot. That said, I have had many great interactions with both parties using mechanical translations.

My goal is to help another mapper avoid two of the most common issues I see when reviewing edits (unsquared buildings, and improper/unnecessary tagging). It's my understanding that asking for a 0-day block is not uncommon; my intention is not to troll or harass. As soon as I realized this wasn't the path to success, I asked here for someone with native language experience. I explicitly didn't escalate back to DWG because that's way way way way way overkill.

I think this is a useful conversation but would prefer to have it in a different thread so it can get the room it deserves without being tangled up in this specific set of edits.

Mapping all over the globe
I disagree that editing should be contained within boundaries. Edits should definitely be done cooperatively though! I absolutely recognize that I cannot be a global expert. It's also why I'm happy to revert things whenever asked. I think this thread is actually a good example of cooperating across boundaries.

General comments
The goal is to improve the map. Taking descriptive name= tags and ensuring they are reasonably converted into structured tags makes the map data better and more useful. The exercise can also show where the common tagging patterns are insufficient/confusing/etc. This can help reduce the issue in the future. It's a virtuous cycle.

I should be more specific about my work pattern as there do seem to be assumptions about how the work is happening. Using the guidance provided by @rtnf, I am reviewing each object to make sure any retags make sense. Any time it's not clear, I leave it alone. I try very hard not to damage the map.

Hello! I got back from my trip and did a bit of editing this week. Mostly follow-up cleanup of "house" and "building".

I started to evaluate "school" but haven't really done much as it seems like the actual name of the building could be School. It's much harder to tell if the tagging is intentional or haphazard when there's only 1 building in the area with a name=* tag. My current thinking is to leave all these for now, perhaps as a marker for a more local mapper to come and do some cleanup. I don't really feel comfortable doing this one from afar.

I'll try to do some evaluation later this week to see which of the other items have the most elements that need reviewing.

@watmildon Hello, welcome back! Do you mind sharing your findings related to the "school" that you are in doubt about? As local mappers, some of us might analyze if it is intentional or haphazard as you mentioned. Thanks!

It would be great to get some local perspective. Here's an Overpass query I used to get things into JOSM to look at a bit:

[out:json][timeout:120];
{{geocodeArea:Indonesia}}->.searchArea;
(  
  nwr["name"="sekolah"](area.searchArea);
  nwr["name"="SEKOLAH"](area.searchArea);
  nwr["name"="Sekolah"](area.searchArea);
);
// print results
out body;
>;
out skel qt;

There are mostly 2 classes of things here:

  1. lone buildings with amenity=school and name~"Sekolah"
  2. a small cluster of buildings all with name~"Sekolah"

The first set (and majority) is the one giving me pause. The second set could definitely use some form of cleanup; in the US I'd try to approximate the containing area for the school with an amenity=school and then remove the name tags from the buildings. Does that sound reasonable? Would love to know what you think of the first class of stuff.

Happy to keep working on things, just being somewhat cautious.

I have modified the tags based on your query. In general, we can delete the descriptive tag name=sekolah and use building=school or amenity=school instead to preserve the information. But further local knowledge, I mean from the ground, is needed to add more detail, i.e. school name, operator name, etc. Thanks @watmildon for sharing this information!
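
As a rough illustration of that rule, here is a minimal Python sketch working on a plain dictionary of tags. The function `retag_school` is hypothetical, and the choice to only upgrade building=yes to building=school, and to leave anything unclear untouched, is an assumption rather than an agreed procedure; every object would still need a human look before uploading.

```python
def retag_school(tags):
    """Return a copy of `tags` with a descriptive name=sekolah moved into structured tags.

    Hypothetical helper for illustration; it never edits OSM itself.
    """
    new_tags = dict(tags)
    if new_tags.get("name", "").strip().lower() != "sekolah":
        # A real name, or no name at all: nothing to do.
        return new_tags
    already_school = (
        new_tags.get("amenity") == "school" or new_tags.get("building") == "school"
    )
    if not already_school:
        if new_tags.get("building") != "yes":
            # Assumption: unclear cases keep their name and wait for a local mapper.
            return new_tags
        # Preserve the information by making the generic building value specific.
        new_tags["building"] = "school"
    del new_tags["name"]
    return new_tags


print(retag_school({"building": "yes", "name": "SEKOLAH"}))
print(retag_school({"amenity": "school", "building": "yes", "name": "Sekolah"}))
```

Details such as the real school name or operator are deliberately not guessed here, since those need knowledge from the ground.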
