Import Addresses from City of Guelph Data

Hello folks, I am proposing to import the Addresses dataset, sourced from the City of Guelph.

Documentation

I do not currently have a wiki page set up for this import; I wanted to hear from the community first before going ahead with it.

This is the source dataset’s website, along with the data download:

I have prepared a file that shows the data after it was translated to the OSM schema. If anyone knows of a place where I can host and share it with relative permanence, that would be welcome! Otherwise, I could use something like WeTransfer? Thank you to @watmildon for suggesting a git repo; the file is hosted there:

License

I have checked that this data is compatible with the ODbL. This data is distributed under the Open Data Licence Version 2.0, which is compatible according to the following source:

Abstract

I have no relationship to the dataset, which contains 53,579 records of addresses in Guelph. I am planning to use JOSM and the Conflation plugin to merge the data, and I am using the zipped address data from the source specifically. The data is imported into JOSM, and the tags are mapped as follows:

  • The streetno tag is replaced with addr:housenumber with no change to the source values.
  • The unit_no tag is replaced with addr:unit with no change to the source values.
  • The postcode tag is replaced with addr:postcode with no change to the source values.
  • The fullname tag is replaced with addr:street with no change to the source values.
  • The place tag is replaced with addr:city with all values set to ‘Guelph’.
  • The addr:province tag shall be added with all values set to ‘Ontario’.
  • These tags will not be transferred over from the source data: x, y, label, objectid, addid, streetid, streetname, qualifier, has_unit, gpid, pin, segmentid, status, parity, name, place, addlocinfo, landmkname, addleg, utm_x, utm_y, lat, long, roll_no.
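
The mapping in the list above could be sketched in Python roughly as follows. This is only an illustration, not the actual workflow (which is done in JOSM): the function name `to_osm_tags`, the sample record, and the choice to skip empty values are all assumptions.

```python
# Source field -> OSM key, as listed in the tag mapping above.
FIELD_TO_OSM = {
    "streetno": "addr:housenumber",
    "unit_no": "addr:unit",
    "postcode": "addr:postcode",
    "fullname": "addr:street",
}


def to_osm_tags(record: dict) -> dict:
    """Translate one source record into OSM address tags (hypothetical helper)."""
    tags = {}
    for field, key in FIELD_TO_OSM.items():
        value = record.get(field)
        # Assumption: empty source values produce no tag at all.
        if value is None or str(value) == "":
            continue
        # Values are copied over unchanged.
        tags[key] = str(value)
    # Constant tags; every other source field is simply not transferred.
    tags["addr:city"] = "Guelph"
    tags["addr:province"] = "Ontario"
    return tags


# Made-up sample record, for illustration only.
sample = {
    "streetno": "12",
    "unit_no": "",
    "postcode": "N1H 0A0",
    "fullname": "Example Street",
    "place": "GUELPH",
    "objectid": 1,
}
print(to_osm_tags(sample))
```

Fields such as objectid or place never reach the output because only the four mapped fields are read.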

I am unsure how long this process will take. It will mostly be a lot of menial work of double checking that the data is going into the right place, and I believe it will take 4 months perhaps? I am new to this process of importing data, however, I have done a small scale test in JOSM to ensure that this would work.

This project so far only involves me, and the process I plan on engaging with is chipping away at the data on a street by street basis.

I am aware that I haven’t put a post in the main community forum; I wanted to get feedback on this first before I go ahead with the process of asking! I welcome any past experiences, advice, and such :D.

EDIT: Something I wanted to mention: The data source is from September 2024, making it a year old. While this might be considered outdated, I believe it is a strong idea to proceed anyway as a majority of the buildings do not have a house number tag associated with them.

It’s pretty common for the intermediary data files to live in a git repository (github, codeberg, gitlab) and I find that pretty easy to work with. It also gives you a good spot to put any process documentation you want to keep.

I have done a ton of address work in JOSM and am always happy to talk about it. Here are a few things I’ve written that you may find interesting. They’re about using the National Address Database, which is a US dataset, but you can easily substitute in your own.

General thoughts about how to review and merge address datasets: watmildon's Diary | Adding addresses with JOSM and MapWithAI | OpenStreetMap
Using the address data to review/confirm roadway name data: watmildon's Diary | Using the US National Address Database to assist TIGER tag cleanup | OpenStreetMap
Conflation plugin and speeding up review: watmildon's Diary | Using the JOSM Conflation plugin to add 1500 addresses in 10 minutes | OpenStreetMap

Oh no duh, why did I not think of Github LOL. I can get that done then. Thank you for the links, I’ll comb through them sometime in the near future, but gotta say, Conflation is a great plugin, dunno what I’d do without it!

New users are only allowed to have 3 links in a post LOL, so another reply it is:

  • I’ve tested out a small edit using the process, and it works.
  • Also, a repo is set up over at Codeberg for future reference!

You should set the source tag on the changeset to something along the lines of Guelph Open Data and add a link to the documentation page/forum discussion.

That way it’s easier to track the changes.
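
As a rough illustration of that suggestion, the changeset tags might look something like the following, written here as a Python dictionary. The comment wording and the URL are placeholders, not the real documentation link; in JOSM these values are simply typed into the upload dialog.

```python
# Placeholder: replace with the real wiki page or forum thread URL.
DOCS_URL = "https://example.org/guelph-address-import"


def changeset_tags(area: str) -> dict:
    """Build changeset tags for one upload (hypothetical helper)."""
    return {
        # Free-text description of the upload, with a pointer to the documentation.
        "comment": f"Import of City of Guelph addresses ({area}), see {DOCS_URL}",
        # Where the data came from, as suggested above.
        "source": "Guelph Open Data",
    }


print(changeset_tags("Example Street"))
```

Keeping the same source value on every upload makes it possible to find all changesets from the import later.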

Hm, for whatever reason I don’t think I can edit my original post? Anyway, update: I have created a wiki page here, and I am also waiting till September 15th for any objections before I start importing the data. I’ll also be adding the “Guelph Open Data” value to the source tag in the changesets.

Oh hey, one of my favourite mapping topics, having applied 400k+ addresses in Canada :slight_smile: After roads and buildings, addresses are (imho) the next most important bit of data for people using maps to navigate within the built environment. So thanks for taking this project on!

As for your time estimate of ~4 months, that sounds about right? It took me about a year of plonking away in spare moments (mostly in the evening) to get ~200k points into Winnipeg using JOSM.

I have a small application I’ve written that does various “data massaging” tasks for these data sets, including fetching them from data layers in e.g. Esri feature services, normalizing tags (both keys and values), and generating usable geojson files from them that are broken out by neighbourhood, town or similar geographically convenient methods. Some things I’ve learned while adding address points to various parts of Canada:

  • Different datasets have the address points anchored to different locations. Some are on the centroid of the land parcel, some are on the center of the “front edge” (street frontage) of the parcel, some are in the building they relate to … depending on the dataset, this can be quite an annoyance, and it’s hard to programmatically get it right in a large number of circumstances, given that Canadian properties often have multiple outbuildings (more so in the countryside, of course, but also alley-facing garages as well as backyard sheds) that can make deciding where the address belongs a bit harder. But figuring out WHERE the address points are registered is an important first step, as that tends to be consistent throughout a single dataset.
  • Duplexes and terraces … these often require a good amount of manual effort, AND there are multiple possible approaches to this: just leave the address points as independent nodes, attach them to entrance nodes on the buildings, subdivide the buildings into terraces, …
  • Depending on the starting point … buildings may be missing, landscape features may be in less-than-great states. So sometimes getting the addresses into a state that ends up looking sensible in the dataset as a whole takes a bit more effort. Some of the towns I’ve done in e.g. Nova Scotia have had some … less than accurate sections of buildings.
  • Sometimes there are already SOME addresses laid in by hand, and those tend to have a high incidence of error (e.g. more than 1% are incorrect in some fashion).
  • Multilingual road names are STILL not really (excuse the pun) addressed well.
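
For the first point in the list above (figuring out where a dataset anchors its address points), one possible quick check is to measure what share of the points fall inside existing building outlines. The sketch below is not the application mentioned above; it is a minimal pure-Python illustration that assumes points and building outlines are already loaded as simple (x, y) coordinate lists and treats them as planar, which is a reasonable approximation at city scale. All names are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices of a simple polygon."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def share_inside_buildings(points, buildings):
    """Fraction of address points that lie inside any building outline."""
    if not points:
        return 0.0
    hits = sum(
        1
        for (x, y) in points
        if any(point_in_polygon(x, y, ring) for ring in buildings)
    )
    return hits / len(points)


# Made-up data: one square building and four address points.
buildings = [[(0, 0), (10, 0), (10, 10), (0, 10)]]
points = [(5, 5), (15, 5), (2, 8), (-1, -1)]
print(share_inside_buildings(points, buildings))
```

A share close to 1 would suggest the points are registered on the buildings themselves, while a low share would hint at parcel centroids or street frontages. The check compares every point against every building, so it is only meant for a small sample, not the full dataset.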

I’m also concerned about updating over time. For instance, there are areas in Winnipeg that have not had buildings drawn in, and in some cases they aren’t even showing up in aerial imagery yet … I have manually sorted those sections out into a “check later” folder on my computer. That’s not super sustainable.

BUUUT… it’s definitely all possible, and super valuable!

So, super interested in seeing your project evolve … and if you’re into discussing any of this in more detail, talking about how to get more of Canada’s settlements into a better place, working on tooling that could help us brave few, or just want someone to review changes or even do some address mapping hang outs online, hit me up. :slight_smile:

addresses are (imho) the next most important bit of data for people using maps to navigate within the built environment.

Cannot agree more! I’m using OSM as my go-to maps, however the lack of addresses has me resorting to Google Maps, so I’m excited to solve the pain point!

Duplexes and terraces

Yes, there are a number of times I’ve encountered a monolithic house that is actually a semi-detached house. I plan on checking the locations in person for information that is vague from the open data available, it’ll be a good incentive for me to get out of the house haha!

I would love to take you up on that offer of talking sometime! I’ve done some prep/test changes locally to make sure my workflow is sound, and JOSM is truly such a blessing! It’s super cool doing this work, really makes me feel like I’m making a difference.

Bumping this one last time, as the 2 weeks between posting it to the forum vs beginning the changes are almost up (expires on September 15th)!

Hi folks! Necroing this to say that the import is complete! Organic Maps has caught up to half of the data that I have imported and it’s super freaking sick to see everything in the app :grin:
