Do Cerentino, Comunanza Cadenazzo/Monteceneri, Lavizzara, Personico, Onsernone and Campo still lack valid addresses?
Comunanza Cadenazzo/Monteceneri is completely uninhabited. Personico would seem to have legit addresses, but there are 0 of them in OSM.
Cerentino, Lavizzara and Onsernone have addresses in the building records but no corresponding entrance records. That means the workaround that worked for the rest of Ticino (using the building centroid as the coordinates for the entrance records) doesn’t work for these municipalities. Obviously I could hack together yet another workaround, but since it affects just a couple of hundred addresses I would prefer to wait a month or two to see if it gets fixed (I’ll give the canton corresponding feedback).
Some municipalities display abbreviated street names, not only in the Kurztext Strasse IT field.
I don’t think this is solvable without cooperation from the respective municipalities (if they even still have the relevant records). In Biasca, for example, some of the street names in question were surveyed by @lonvia, so we can safely assume that the abbreviated form is what was on the street sign. And it is not just the GWR: the swisstopo street name list also only contains the abbreviated name (and the common G. in Italian doesn’t even help with an educated guess).
I gave the data a quick look and couldn’t see a directly obvious issue. The only thing I can think of right now is that the municipality covers 8 (!) PLZ-6 areas. Because the addresses in OSM are missing at least the PLZ-4 and there seem to be a number of streets with the same name across those PLZ-6 areas, that may be causing havoc with the matching.
Sorry if I wasn’t clear, Simon. Most of the addresses are missing but the table shows only a few missing. Also, even addresses conflated directly from the GWR appear with the error “not official”.
Anyway, I also seize this occasion to thank you again for this very useful tool.
Famous last words or so, yes this is actually the issue.
As I wrote in Address and Street Data Updates - #31 by SimonPoole, I use a simple heuristic to check whether a municipality has validated its addresses or not, or put differently, whether the official flag can be used or not. If the flag can be used, addresses that don’t have the flag set are ignored. Evolène seems to have partially validated its addresses, which causes the heuristic to assume that the flag is valid. This should be fixable with some fine-tuning.
I’ve changed the heuristic so that at least 80% of the addresses in a municipality need to have the official flag set before it is considered an actual indication of the address being valid.
This can naturally still go wrong, that’s the nature of the beast, but it should be a lot better now.
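A minimal sketch of how such a heuristic could look, purely for illustration. The input layout (pairs of municipality and flag) and the function names are assumptions, not the code actually used for the statistics, and the threshold is passed in as a parameter rather than hard-coded.

```python
from collections import defaultdict


def municipalities_with_usable_flag(addresses, threshold):
    """Return the municipalities whose official flag looks trustworthy.

    addresses: iterable of (municipality, is_official) pairs (assumed layout).
    threshold: minimum share of flagged addresses, as a fraction of 1.
    """
    total = defaultdict(int)
    flagged = defaultdict(int)
    for municipality, is_official in addresses:
        total[municipality] += 1
        if is_official:
            flagged[municipality] += 1
    return {m for m in total if flagged[m] / total[m] >= threshold}


def usable_addresses(addresses, threshold):
    """Drop unflagged addresses, but only where the flag is considered meaningful."""
    addresses = list(addresses)
    trusted = municipalities_with_usable_flag(addresses, threshold)
    return [(m, official) for m, official in addresses
            if official or m not in trusted]
```

A partially validated municipality that stays below the threshold keeps all of its addresses, which is the behaviour wanted for a case like Evolène.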
The problem is that if you look at municipalities for which the official flag actually means something, you can see that up to 30% of the addresses don’t have that flag set, and it would be nice to be able to exclude such addresses over the whole country.
This is an issue because, for example, in some municipalities in Ticino the addresses are still simply located on the building centroid, and as a result you can have multiple GeoJSON points at the same location.
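If you want to spot such stacked points in one of the output files yourself, something along these lines should do. It is only a sketch: it assumes the features have already been loaded from the file as plain dicts, and the function name is made up.

```python
from collections import defaultdict


def colocated_points(features):
    """Group GeoJSON Point features by their exact coordinates.

    Returns only the coordinate tuples that are shared by more than one feature.
    """
    by_location = defaultdict(list)
    for feature in features:
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # ignore anything that is not a simple point
        by_location[tuple(geometry["coordinates"])].append(feature)
    return {loc: feats for loc, feats in by_location.items() if len(feats) > 1}
```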
Just another observation about the GWR data: there are in total ~170’000 buildings in the data that have a null gkat field, that is, the municipality hasn’t classified the building. After all other filtering there are ~90’000 left that are uncategorized. Some of these are legit addresses, some are not, and there doesn’t seem to be a way to further reduce the number of false positives short of turning one’s brain on.
A good example of this is Würenlingen where ~30 buildings that are clearly just random bits of the cement production facility can’t be filtered out.
You may have noticed some small shifts in the numbers after the weekend. These are mainly due to two changes:
we now handle cases where the same house number and street/place combination occurs more than once in a municipality (with different postal codes), on both the GWR side and the OSM side.
as a consequence we not only match the addresses correctly in these cases, we can also produce warnings for all affected elements (for example when multiple shops are tagged with the same address).
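To make the idea concrete, here is a rough sketch of matching on a key that includes the postal code. The dict layout, the field names and the assumption that the OSM side carries a postal code at all are illustrative only, this is not the actual matching code.

```python
from collections import defaultdict


def address_key(record):
    # Including the postal code keeps identical street/number pairs in
    # different parts of one municipality apart.
    return (record["municipality"], record["postcode"],
            record["street"].casefold(), record["housenumber"].casefold())


def match(gwr_records, osm_elements):
    """Return (missing, warnings).

    missing: sorted GWR keys without any OSM element.
    warnings: keys used by more than one OSM element, with the element ids.
    """
    gwr_keys = {address_key(r) for r in gwr_records}
    osm_by_key = defaultdict(list)
    for element in osm_elements:
        osm_by_key[address_key(element)].append(element["id"])
    missing = sorted(gwr_keys - set(osm_by_key))
    warnings = {k: ids for k, ids in osm_by_key.items() if len(ids) > 1}
    return missing, warnings
```

Two shops tagged with the same address end up under one key, so both element ids appear in the warning for that address.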
A further change is that the warning files only contain the actual issues now and no entries for aspects of the address that are actually OK, making them far easier to understand.
And yes there is a new per canton summary right at the bottom, not the most beautiful thing but functional.
The latest change has uncovered an issue in the GWR data that previously went unnoticed: there are situations in which the same address is assigned to multiple entrances and has multiple EGAIDs.
This is naturally slightly bizarre from a modelling pov and something that I can’t cater for right now, so if an address is present in OSM but it is still turning up as missing in the statistics, you might want to check that this isn’t one of these cases.
PS: according to the GWR documentation this isn’t an allowed situation, but that doesn’t really help:
Entrances of existing buildings (GSTAT 1004) whose street designation and entrance number, even if the latter is empty, occur more than once within the same 6-digit postal code are not permitted. EQ1968
I haven’t checked, but it could well be that the Ticino municipalities that had not yet recorded their data properly have now caught up, see Address and Street Data Updates - #41 by SimonPoole
I’ve done a bit of work on this now: I count the duplicate GWR addresses and remove them from further consideration (it is essentially random which of the duplicates are ignored, so don’t be surprised if the coordinates change from statistics run to statistics run for such an address).
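Roughly, the counting and removal could be sketched like this. The tuple layout and the function name are assumptions for illustration, and unlike the real processing this version keeps whichever record comes first in the input, so the outcome depends on input order.

```python
def drop_duplicate_addresses(records):
    """Remove GWR entrances that share an address within one 6-digit postal code.

    records: iterable of (egaid, postcode6, street, housenumber) tuples
    (assumed layout). Returns (unique_records, duplicate_count).
    """
    seen = {}
    duplicates = 0
    for record in records:
        egaid, postcode6, street, housenumber = record
        key = (postcode6, street, housenumber)
        if key in seen:
            duplicates += 1  # same address, different EGAID: count and skip
        else:
            seen[key] = record
    return list(seen.values()), duplicates
```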
The reason I added the count is so that we can see how big the issue is, and with 2’915 for the whole country I would suggest simply ignoring it for now. It is interesting that the issue is concentrated in specific municipalities though.