lonvia
(Lonvia)
1
I wondered how the Canadian community feels about the address interpolation lines that come with the CanVec imports. They have become a noticeable issue in the Nominatim search engine.
The address interpolation QA report gives you an overview of interpolation lines Nominatim takes issue with. At the time of this writing, more than 70,000 OSM ways are flagged as problematic in Canada. The two main errors are:
On top of that, there are interpolations that don't follow any street or path that is visible in OSM or aerial photos (Way: 143606764 | OpenStreetMap). These are not even flagged in the QA report.
Finally, I've noticed that the interpolation lines are not deleted when the actually existing house numbers are added. Example: OpenStreetMap
What would be the best way to go about improving the quality of this data? Would it be possible to add some preprocessing to the import that avoids importing the worst offenders?
Edit: Way: 143579633 | OpenStreetMap removed as bad example because interpolations in the middle of nowhere do have addr:street, so there is some information about the address.
3 Likes
Well, it certainly is a bit of a mess. A lot of the issues I've seen in my neck of the woods seem to have come from old imports that were cut off in odd places, causing interpolation points to "clamp" to the edge of... tiles, I assume? Before the import from earlier this year, you could see the cutoff lines in some parts of Longueuil.
As far as how I feel about it though, I think this data is indispensable and I'd rather have the current slightly-broken version than nothing.
I've only really been active for something like a year, maybe a year and a half, but the vast majority of the work I've done in that period is surveying addresses. This is because prior to the most recent import, huge portions of Longueuil had no address data -at all-. And sadly, since most of the regional mapping attention is limited to Montreal, fully surveying and keeping these addresses up to date is likely impossible without imported data. With the current data, it's at least possible to look up the rough location of addresses.
I've not been involved in any data imports, but I would welcome any work to improve its accuracy. From my experience, the recent import fixed a lot of weird & broken interpolations in my area, but seems to still be stretching single addresses to cover the full area around the building as you say. (Here's a recent example for folks; I could locate a bunch of these if needed.)
As an aside, I'm glad this came up, because I haven't been doing that very consistently after address surveys. I wasn't entirely sure if it should or shouldn't be done, so I mostly stopped deleting the interpolations along streets I'd surveyed out of an abundance of caution (that example wasn't one of mine though!)
This is probably just a case of newbie-itis though!
1 Like
jfd553
(Jfd553)
4
You're right. In order to respect OSM API capabilities at the time the Canvec product was created (2010-2012), the map sheets were divided into square tiles varying in size based on the number of features in each tile.
This is something that came up last year on the mailing list as well: [Talk-ca] Which is preferred? Addr:interpolation with a source or adding in actual addresses?
In addition to the sorts of errors or "quirks" of the interpolation ways you've already identified, I found that the ones I deleted last year in Bragg Creek, Alberta were often placed on the wrong sides of roads (i.e. the "odd" numbers and "even" numbers were erroneously swapped around), or went the wrong way (i.e. the lower number at the start and higher number at the end of the interpolation way were erroneously swapped around).
As I said in my reply to the list: I generally replace the interpolations with actual building-by-building or POI-by-POI addresses, and delete the interpolations when I'm done. Their accuracy is often dubious at best. In the absence of more precise addresses I think they're better than nothing, but with the addition of more precise addresses in my opinion they become entirely redundant and otherwise not worth keeping.
You ask,
What would be the best way to go about improving the quality of this data? Would it be possible to add some preprocessing to the import that avoids importing the worst offenders?
and I'm a little concerned: are you thinking of "re-importing" the CanVec data? I wouldn't; I think it's a pretty common sentiment that CanVec was better than nothing, but going forward isn't accurate enough to satisfy most users, and there's no point going back to the same "well" to fix the original errors.
Frankly, as time-consuming as it would be, I think the only way to do it properly is slowly and methodically: double-checking and fixing or replacing them one by one.
Also, as an aside:
FYI those addresses are simply out of alignment with the underlying data; the associated roads are on the map, but located about 600 m south of the address interpolation ways, and tagged (without names) as highway=path.
lonvia
(Lonvia)
6
Thanks for the responses. It is great to see that people are busy replacing the interpolation slowly with building house numbers sourced from surveys. And, yes, please do delete the interpolation lines when a street is completely surveyed. It will help reduce confusion for the data users (and also tell the next mapper that the street is done).
Oh no, I wasn't thinking that. Some of the broken interpolations have been created only yesterday (Way: 1282312246 | OpenStreetMap). This is also what prompted me to write in the forum here. I was wondering if it isn't better to stop these imports for a bit and improve the process before continuing.
That's more along the lines I was thinking. In particular, I'm wondering if there is something remote armchair mappers could help with.
One thing that comes to mind are the interpolation lines that are cut on CanVec tile boundaries. It should be fairly easy to create a list of problematic ways and to fix them remotely by joining the ways. Could be a MapRoulette challenge.
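To make the idea of such a list concrete, here is a minimal sketch of how candidate pairs could be found. It assumes the OSM data has already been loaded into plain dictionaries (a hypothetical in-memory layout, not any particular library's format) and that a split shows up as two interpolation ways whose untagged end nodes sit at the same position. That assumption would need checking against real CanVec data.

```python
from collections import defaultdict

def find_split_candidates(nodes, ways):
    """Group interpolation ways whose untagged end nodes share a position.

    nodes: {node_id: {"lat": float, "lon": float, "tags": dict}}  (assumed layout)
    ways:  {way_id: {"nodes": [node_id, ...], "tags": dict}}      (assumed layout)
    """
    by_position = defaultdict(set)
    for way_id, way in ways.items():
        if "addr:interpolation" not in way["tags"]:
            continue
        # Only the first and last node of a way can be a cut point.
        for node_id in (way["nodes"][0], way["nodes"][-1]):
            node = nodes[node_id]
            if "addr:housenumber" in node["tags"]:
                continue  # a numbered end node is a regular endpoint
            # OSM stores coordinates with 7 decimal places.
            key = (round(node["lat"], 7), round(node["lon"], 7))
            by_position[key].add(way_id)
    return [sorted(ids) for ids in by_position.values() if len(ids) >= 2]
```

Each returned group would be one task for a human to review, for instance in a MapRoulette challenge; the sketch deliberately does not join anything itself.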
Other improvements are more problematic without local knowledge and I wouldn't want to start larger edits without the consent of the local community: Condensing one-address interpolation lines to points might work with armchair mapping. At least short ones like Way: 357011203 | OpenStreetMap should be fairly safe to do (except for the fact that it is sometimes not clear if the number exists at all); the longer ones like Way: 1281057082 | OpenStreetMap are a bit trickier, but it might be possible from aerials to identify the one house along the road the address likely belongs to.
Maybe you already had other ideas for armchair fixing.
That's a case that is clearly not fixable without going out and surveying. But that begs the question, isn't it better to delete the data completely in that case? Right now it does quite a bit of damage because it leads people to the wrong place. Better to have no data. And when the error is fixed by surveying, it can be done with on-the-ground data. The CanVec data will not help.
Phew!
Yeah that's tricky... I don't really have a better suggestion than leaving it to local mappers, if we have 'em.
Your example of the "short, single-address ways" is emblematic of the sorts of problems we'll face without boots-on-the-ground knowledge: there's a building at the corner of Earnscliffe Avenue and Little Road with potentially three different addresses. Is it 2 Little Road (following the interpolation way on the west side of the building)? Is it 4 Little Road (following the interpolation way on the south side of the building)? Or is it 24 Earnscliffe Avenue, following the interpolation way on the east side of the building? It's perfectly conceivable they may all be correct, in some way: maybe there are multiple entrances to the building that each have different addresses. I don't know how to figure this out without the aid of someone with a lot of local knowledge.
Well, we could just move the interpolation ways 600ish metres south, and that would certainly improve the accuracy quite a bit.
Whether the interpolations themselves are correct in the first place is another, underlying issue.
1 Like
What I do is move the interpolation lines out of buildings in JOSM and place them in front of the building, because if they are inside the building, they are usually mistaken for the address of the building, and it never gets surveyed (e.g. StreetComplete will not ask for an address if the interpolation is inside the building). A lot of places where I am (the GTA) have this, and it's annoying because I want to survey the addresses, but SC doesn't give me the option to.
Once it gets manually surveyed, I delete the interpolations via JOSM, but that is a manual process.
lonvia
(Lonvia)
9
I really don't see how this is realistically achievable, even if the Canadian community suddenly grows by an order of magnitude. There are now 1.4 million interpolation lines in Canada. From what I understand from the responses in this thread, they are all potentially of questionable quality and therefore all need to be resurveyed.
That makes me wonder if this data belongs imported into OSM in the first place. It would be much better to prepare the data as an external dataset in a way that it can be easily used with OSM as fallback data. We've been doing this for many years now in the US with the house number interpolation data from TIGER. Nominatim (the search engine) can import it on the side and use it as a fallback in the US but it will always prefer house numbers from OSM if they are available. The added bonus of an external dataset would be that it can be easily updated with each new version of CanVec.
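To illustrate the "prefer OSM, fall back to external data" idea, here is a rough sketch. It is not how Nominatim implements it; the function names, the dictionary layout and the straight-line interpolation are all assumptions made for the example.

```python
def interpolate_position(start, end, number):
    """Estimate a position on the straight line between two numbered endpoints.

    start, end: (house_number, lat, lon) tuples (assumed layout).
    """
    (n0, lat0, lon0), (n1, lat1, lon1) = start, end
    if n0 == n1:
        return (lat0, lon0)
    t = (number - n0) / (n1 - n0)
    return (lat0 + t * (lat1 - lat0), lon0 + t * (lon1 - lon0))

def lookup(street, number, osm_addresses, fallback_ranges):
    """Return (position, source); exact OSM addresses always win."""
    exact = osm_addresses.get((street, number))
    if exact is not None:
        return exact, "osm"
    # Hypothetical external dataset: a list of numbered ranges per street.
    for rng in fallback_ranges.get(street, []):
        first, last = rng["start"][0], rng["end"][0]
        low, high = min(first, last), max(first, last)
        on_grid = (number - first) % rng["step"] == 0
        if low <= number <= high and on_grid:
            position = interpolate_position(rng["start"], rng["end"], number)
            return position, "interpolated"
    return None, "not found"
```

The point of returning the source alongside the position is that a data user can tell an estimate from a surveyed address, which is exactly what is hard to do when the interpolation lives inside OSM itself.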
There is a quirk with the current tagging schema of interpolations. Each interpolation creates at least two address nodes that are at first sight indistinguishable from exactly mapped house numbers. You'd have to look if the address node is part of an interpolation way to understand that it is not an exact number but an estimate. Very few data users do that. And that's where the low-quality interpolations do a lot more harm than it seems at first sight.
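For data users who want to make that distinction, the check could look roughly like the sketch below. It assumes nodes and ways are already available as plain dictionaries (a hypothetical layout) and simply marks every address node that is a member of an interpolation way.

```python
def classify_address_nodes(nodes, ways):
    """Label each address node as "exact" or "estimate".

    nodes: {node_id: {"tags": dict}}                          (assumed layout)
    ways:  {way_id: {"nodes": [node_id, ...], "tags": dict}}  (assumed layout)
    """
    on_interpolation = set()
    for way in ways.values():
        if "addr:interpolation" in way["tags"]:
            on_interpolation.update(way["nodes"])

    labels = {}
    for node_id, node in nodes.items():
        if "addr:housenumber" not in node["tags"]:
            continue  # not an address node at all
        labels[node_id] = "estimate" if node_id in on_interpolation else "exact"
    return labels
```

This needs the way membership of every node, which is why it is easy to skip when only the nodes are being processed.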
1 Like
PierZen
(Pier Zen)
10
Interpolations do their job well, and it is possible to improve quality before thinking about new imports / new problems.
Rather than criticizing both the data and the OSM community this way, I think it is more motivating for local communities if we identify the problems to solve and offer them solutions that are simple to carry out.
Could we think of Overpass queries, sometimes followed by JOSM filters as a second step, to solve this? Is there a contributor familiar enough with Overpass to complete the solution below? Several years ago I used JOSM to identify/correct several of these problems, notably when the start/end node does not carry the address tag.
-
way for a single address
way[addr:interpolation] Node(1)[addr:housenumber]=Node(-1)[addr:housenumber]
-
way with a missing start or end address
(
way[addr:interpolation] Node(1)![addr:housenumber]
way[addr:interpolation] Node(-1)![addr:housenumber]
)
-
way whose addresses are inconsistent with the interpolation type
(
way[addr:interpolation=even] Node(1)![addr:housenumber] inconsistent
way[addr:interpolation=even] Node(-1)![addr:housenumber] inconsistent
way[addr:interpolation=odd] Node(1)![addr:housenumber] inconsistent
way[addr:interpolation=odd] Node(-1)![addr:housenumber] inconsistent
)
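This is not the Overpass solution asked for above, but the three checks can be sketched in Python to pin down what they mean. The dictionary layout and the problem labels are assumptions for illustration; "inconsistent" is read here as a start or end number whose parity does not match an even/odd interpolation.

```python
def check_interpolation(way, nodes):
    """Return a list of problem labels for one interpolation way.

    way:   {"nodes": [node_id, ...], "tags": dict}  (assumed layout)
    nodes: {node_id: {"tags": dict}}                (assumed layout)
    """
    first = nodes[way["nodes"][0]]["tags"].get("addr:housenumber")
    last = nodes[way["nodes"][-1]]["tags"].get("addr:housenumber")

    # Check 2: start or end address missing; nothing else can be tested.
    if first is None or last is None:
        return ["missing endpoint number"]

    problems = []
    # Check 1: the way represents a single address.
    if first == last:
        problems.append("single address")

    # Check 3: endpoint numbers do not match the even/odd type.
    kind = way["tags"].get("addr:interpolation")
    if kind in ("even", "odd"):
        wanted = 0 if kind == "even" else 1
        for value in (first, last):
            if not value.isdigit() or int(value) % 2 != wanted:
                problems.append("parity mismatch")
                break
    return problems
```

House numbers with suffixes such as "12A" are reported as a parity mismatch here, simply because the sketch cannot parse them; a real check would need a rule for those.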
1 Like
jfd553
(Jfd553)
11
Canvec's address interpolations most often come from municipalities, which provided them to their provincial governments, which made them available to the federal government (i.e., Canvec), without a priori validation. We cannot therefore speak of wall-to-wall data quality (a mari usque ad mare qualitas ;-)). Rather, data quality should be assessed on a municipal basis.
I agree with Pierre on the fact that we must identify the problems to be solved and propose simple solutions.
I think an import of address interpolations from Canvec could only be relevant where nothing else exists.
PierZen
(Pier Zen)
12
A little spring cleaning of OSM address interpolation ways
Identifying and joining ways split into two segments
Addresses in OSM matter so that the various road navigation tools work optimally and so that deliveries, emergency services, etc. can find their way quickly. In Canada, you often find ways with the addr:interpolation tag representing a series of addresses along a street. The first and last nodes normally carry the addr:housenumber tag, with the lowest address at the start and the highest at the end.
During the Canvec import, these ways were often cut in two, which makes the address series non-functional. Below is a recipe to fix this problem.
I suggest the following recipe using JOSM + Download from Overpass API + the Admin Boundaries map paint style
-
F12 / Map Paint Styles / select Admin Boundaries / click > to add it to your personal list.
-
a. Zoom to the area where you want to download the addresses to validate
b. Download / Download from Overpass API
c. Instructions: click to download
way ["addr:interpolation"]({{bbox}});
node(w:1,-1)[!"addr:housenumber"]; out meta;
node(around:0); out meta;
way(bn)["addr:interpolation"];
out meta; >; out meta;
Then note that the red circles show the nodes where two ways meet. Here you simply click on the adjacent ways and press the shortcut key C to combine the two ways.
- Use the Validation function in the right-hand panel. A list of corrections will appear, which then lets you fix the various segments.
There you go!
1 Like
I can go through the interpolation lines while importing them and merge the adjacent ones at the tile borders. Would that help?
1 Like
PierZen
(Pier Zen)
14
The problem we have is that too many contributors without sufficient knowledge of address data have done imports without validating the data. I am currently reviewing Chateauguay, south of Montréal, where a contributor did a second import a few years later on top of the existing data, systematically creating duplicates of both ways and nodes. Multiple subsequent edits added to the problem, since contributors did not realize that the ways and nodes were duplicated.
Yes, it would be preferable to control the current access to this import data by inexperienced contributors.
As I have already shown, it is easy to identify the data to review. But partial corrections via MapRoulette are as damaging as the imports. MapRoulette contributors will only see two segments to connect, without seeing the duplicates hidden behind them, and will correct partially, adding yet more problems. Where there are two segments plus duplicates (4 ways + 8 nodes), I see situations where only one duplicated portion was deleted.
interpolation 1: n(housenumber) w(interpolation) n() - n() w(interpolation) n(housenumber)
interpolation 2: n(housenumber) w(interpolation) n()
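One way to surface such hidden duplicates before anyone joins segments is to group interpolation ways by geometry. The sketch below assumes duplicates have identical node positions (which a second import over the same source would plausibly produce, though that is an assumption) and uses a hypothetical dictionary layout.

```python
from collections import defaultdict

def find_duplicate_ways(nodes, ways):
    """Group interpolation ways that trace exactly the same coordinates.

    nodes: {node_id: {"lat": float, "lon": float, "tags": dict}}  (assumed layout)
    ways:  {way_id: {"nodes": [node_id, ...], "tags": dict}}      (assumed layout)
    """
    groups = defaultdict(list)
    for way_id, way in ways.items():
        if "addr:interpolation" not in way["tags"]:
            continue
        coords = tuple(
            (round(nodes[n]["lat"], 7), round(nodes[n]["lon"], 7))
            for n in way["nodes"]
        )
        # Treat a way and its reversed copy as the same geometry.
        key = min(coords, coords[::-1])
        groups[key].append(way_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]
```

Ways that were edited after the second import will no longer match exactly, so this only catches untouched duplicates; the rest still need a human look in JOSM.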
I would like Lonvia to document their interpretation that Canadian contributors find the data to be of dubious quality, which would justify re-importing it. A statement from this developer that is demotivating for the community, to say the least.
I do of course see possible improvements in the form of the data. But contrary to Lonvia's statement, I find that the interpolation data is generally correct, in Quebec at the very least. I invite them to look at the data more closely.
For Quebec, see the Quebec government WMS layer (the source of the data used by Canvec for Quebec): wms:https://servicescarto.mern.gouv.qc.ca:443/pes/services/Territoire/AQ_ADRESSES_WMS/MapServer/WmsServer?FORMAT=image/png&TRANSPARENT=TRUE&VERSION=1.3.0&SERVICE=WMS&REQUEST=GetMap&LAYERS=Adresses&STYLES=&CRS={proj}&WIDTH={width}&HEIGHT={height}&BBOX={bbox}
I am the Canadian contributor who finds the data to be of dubious quality.
I should explain that I am only one voice among everyone. It is only my opinion, and it is based on the data in Alberta, especially southern Alberta.
mtmail
16
@lonvia In the Nominatim database 334 CA interpolations didn't have a parent place (road) assigned. That's usually because Nominatim cannot find a road with the same name nearby. I believe I fixed 80% of those now. It's not easy to check because Nominatim doesn't update the interpolation in the database after a nearby OSM change.
Examples:
- Interpolations with road name "unknown road 59". Removed. Changeset: 151679814 | OpenStreetMap
- Interpolations but no road on the map yet. I created the roads.
- Interpolation where the name was slightly different. E.g. "Ryan s Lane" vs "Ryan's Lane"
- Interpolations from 10 years ago where today there is no road or house nearby.
- Interpolations where one node didn't have a house number. Sometimes those were next to another and could be merged together
- Interpolations where the start and end house number were the same. If there was clearly only one house I converted it to an address node
- Interpolations on RV camping areas which represent parking spot numbers, not house numbers. I didn't delete them. Way: 696410754 | OpenStreetMap
A couple are impossible to fix
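For the "slightly different name" case, a simple normalization can flag pairs that probably refer to the same road. This is only a sketch and not how Nominatim matches names; the function is hypothetical and deliberately crude.

```python
import re
import unicodedata

def normalize_street_name(name):
    """Reduce a street name to lowercase ASCII words for rough comparison."""
    # Split accented letters into base letter + combining mark, drop the marks.
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Keep only runs of letters and digits; apostrophes and other
    # punctuation act as word separators.
    words = re.findall(r"[a-z0-9]+", without_accents.casefold())
    return " ".join(words)
```

With this, "Ryan s Lane" and "Ryan's Lane" compare equal, so an interpolation's addr:street can be matched against nearby road names and the mismatches listed for manual review.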
jfd553
(Jfd553)
17
That is more or less what I explained in my previous post. Canvec address interpolations generally come from municipalities, which transferred them to the provinces, which transferred them to the federal government (Canvec), without any validation, at least at the federal level.
That is why they can be of poor quality in southern Alberta, very good in Edmonton and excellent in Calgary. Situations differ from one province to another, depending on the level of validation carried out by each province.
1 Like