Regional subdivisions in Geofabrik data for Canada

Hello,

I’m not sure if this is the best forum, but I couldn’t find a better one.

Several Canadian provinces and territories are rather large in the Geofabrik regional exports: Nunavut is 1.2 GB, Quebec and British Columbia are 1.1 GB each, Ontario is 863 MB. Currently there are no subdivisions defined for them, so downloading an export requires getting the full province/territory.

I know some other countries have sub-regions defined below state/province level (e.g. Regierungsbezirke for some German states) – would it be possible to define sub-regions for the bigger Canadian regions too?

They have the following subdivisions in OSM:

  • Nunavut admin_level=5 has 3 subdivisions, plus a fourth that is also tagged boundary=aboriginal_lands and overlaps the others. If possible, I would suggest subdividing on boundary=administrative + admin_level=5
  • Quebec admin_level=5 has 16 subdivisions, which might be a bit too fine-grained. I don’t know whether larger subdivisions are in popular use. Maybe sub-regions could be assembled from several admin_level=5 areas combined, or would it be easier for Geofabrik if we manually drew rough dividing lines?
  • British Columbia has a couple of admin_level=5 + boundary=aboriginal_lands relations. Admin boundaries are at level 6 and there are 28 of them, which seems like too many to use for data export subdivisions. I would suggest a manual subdivision, something like: Vancouver Island, Metro Vancouver up to Chilliwack, then the rest divided into northern and southern?
  • Ontario admin_level=5 has 6 subdivisions (Overpass Turbo link) that make sense to me as an Ontarian

It might also be worth it to subdivide others…

  • Northwest Territories is 403 MB; it has five admin_level=5 subdivisions
  • Alberta is 312 MB; its admin_level=6 counties seem a bit too fine-grained. A rougher subdivision would probably be Calgary, Edmonton, southern Alberta, northern Alberta?
  • Newfoundland and Labrador is 245 MB, it has two admin_level=5 (for Labrador and Newfoundland) which would work

The others are under 200 MB and do not have conveniently large admin subdivisions in OSM, so it’s probably not worth subdividing them.

They could be grouped by natural features, such as the Saint Lawrence River (south/north) in Québec.

I’m not sure how useful splitting would be in terms of keeping data size down. Just eyeballing it from the map, Quebec has most of its data concentrated along the St. Lawrence River, Ontario in the area surrounded by the Great Lakes, and British Columbia is centered around Vancouver. They’re not like California, where a dividing line somewhere around Fresno will split the state into two similar-sized chunks of data.

As a rule of thumb, I have tried to introduce subdivisions when files
surpassed 1 GB. I’m happy to look into the larger Canadian
provinces, though the 16 subdivisions of Quebec do sound a little over
the top!
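
To illustrate that rule of thumb with the sizes quoted earlier in this thread, here is a minimal sketch. The 1000 MB cutoff (reading "1 GB" as 1000 MB) and the function name `split_candidates` are assumptions for illustration only, not part of any Geofabrik tooling.

```python
# Extract sizes in MB as quoted in this thread (GB figures converted at 1 GB = 1000 MB,
# which is an assumption made for this sketch).
EXTRACT_SIZES_MB = {
    "Nunavut": 1200,
    "Quebec": 1100,
    "British Columbia": 1100,
    "Ontario": 863,
    "Northwest Territories": 403,
    "Alberta": 312,
    "Newfoundland and Labrador": 245,
}

THRESHOLD_MB = 1000  # the "1 GB" rule of thumb


def split_candidates(sizes_mb, threshold_mb=THRESHOLD_MB):
    """Return region names whose extract exceeds the threshold, largest first."""
    over = [(name, size) for name, size in sizes_mb.items() if size > threshold_mb]
    # sorted() is stable, so regions with equal sizes keep their listed order.
    return [name for name, _ in sorted(over, key=lambda item: -item[1])]


print(split_candidates(EXTRACT_SIZES_MB))
```

By this reading, only Nunavut, Quebec and British Columbia are over the line; Ontario at 863 MB is not.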

The download server generally only has free shape files for the “leaf
nodes” of the region tree so splitting Nunavut in three will mean the
loss of the whole-Nunavut shape file.

I was thinking of subdivisions like “Montréal and suburbs”, “Québec and suburbs”, maybe Eastern Townships/Estrie, Outaouais, and then two or three subdivisions for the remaining areas. I’m guessing people are more likely to want Laval and Longueuil in one data file than Laval and Val d’Or, despite the Saint Lawrence separating Longueuil. But I’ll defer to Quebecers for subdivisions that make sense to them.

I’m not sure about that. Nunavut is 1.2 GB for 2 million square kilometers, and that’s overwhelmingly natural features (mostly waterways and water polygons, per Taginfo). Quebec is 1.1 GB for 1.5 million sq km, and much of it also has many lakes, so I would guess the natural features to be roughly 700-900 MB, and that’s going to be distributed throughout the area. Hence the proposed subdivision - people who want to work with Montréal’s data won’t need to download every lake in Nord-du-Québec.

Yes, I agree 16 subdivisions is a bit much. How do you split the regions in your toolchain? Does it go off admin boundaries in the OSM data you’re processing? Or do you have the boundaries extracted into standalone polygons before the processing? If the latter, could we provide polygon boundaries for sub-regions manually?
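
If standalone polygons turn out to be usable, one way to hand them over would be the Osmosis polygon filter (.poly) text format. The sketch below is only an illustration: whether the Geofabrik toolchain accepts manually drawn polygons is exactly the open question above, the helper name `to_poly` is made up, and the coordinates are a hypothetical rough box, not a proposed boundary.

```python
def to_poly(name, ring):
    """Format one outer ring of (lon, lat) pairs as Osmosis polygon filter text."""
    if len(ring) < 3:
        raise ValueError("a polygon needs at least three points")
    points = list(ring)
    if points[0] != points[-1]:
        points.append(points[0])  # close the ring explicitly
    lines = [name, "1"]  # polygon file name line, then the name of the first section
    lines += [f"   {lon:.6f}   {lat:.6f}" for lon, lat in points]
    lines += ["END", "END"]  # end of the section, then end of the file
    return "\n".join(lines) + "\n"


# Hypothetical rough box around Montréal, for illustration only.
MONTREAL_BOX = [(-74.2, 45.3), (-73.3, 45.3), (-73.3, 45.8), (-74.2, 45.8)]

print(to_poly("montreal-and-suburbs", MONTREAL_BOX))
```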

I will admit this hasn’t occurred to me, partly because I’m most interested in the .osm.pbf files. Do you have a way of knowing if the shp files are used, and by whom?

Below are Overpass queries that extract the admin_level=5 relations and group them into two subsets using the ref:qc tag. Adding Montreal + Laval to the South Saint-Lawrence subset reduces the size of the North subset.

// North
// The regex is anchored with ^(...)$ so only exact two-digit ref:qc values match.
[out:xml][timeout:120];
relation[boundary=administrative][admin_level=5]
  ["ref:qc"~"^(02|03|04|07|08|09|10|14|15)$"]
  (area:3600061549);
out meta; >; out skel qt;

// South Saint-Lawrence + Montreal + Laval
[out:xml][timeout:120];
relation[boundary=administrative][admin_level=5]
  ["ref:qc"~"^(01|05|06|11|12|13|16|17)$"]
  (area:3600061549);
out meta; >; out skel qt;
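
For anyone who wants to try other groupings, here is a small sketch that generates the same kind of query from a list of ref:qc codes and checks that the two subsets do not overlap. The function name `build_query` is hypothetical, and anchoring the regex with `^(...)$` is a precaution I added so that only exact code values match.

```python
QUEBEC_AREA_ID = 3600061549  # area id taken from the queries above

# The two groupings from the queries above, as lists of ref:qc codes.
NORTH_REFS = ["02", "03", "04", "07", "08", "09", "10", "14", "15"]
SOUTH_REFS = ["01", "05", "06", "11", "12", "13", "16", "17"]


def build_query(refs, area_id=QUEBEC_AREA_ID, timeout=120):
    """Return Overpass QL selecting admin_level=5 relations with the given ref:qc codes."""
    pattern = "^(" + "|".join(refs) + ")$"  # anchored: exact matches only
    return (
        f"[out:xml][timeout:{timeout}];\n"
        "relation[boundary=administrative][admin_level=5]"
        f'["ref:qc"~"{pattern}"](area:{area_id});\n'
        "out meta; >; out skel qt;"
    )


# A region should land in exactly one subset.
assert set(NORTH_REFS).isdisjoint(SOUTH_REFS)

print(build_query(NORTH_REFS))
print(build_query(SOUTH_REFS))
```

Moving a code from one list to the other is then enough to test a different split.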

No, it doesn’t. The extract has a couple of admin_level=5 relations because BC borders the Northwest Territories.

Speaking as a local, there is no obvious set of divisions. admin_level=6 entities aren’t something we really consider. In common usage people talk about cities or admin_level=8 entities, and probably the cities more often than the admin_level=8 entities unless you’re following local politics.

I’m fairly map-aware, and if you asked me at what point you leave Metro Vancouver when traveling up the Sea to Sky, I couldn’t answer you or tell you the name of the next district.

The admin_level=6 areas outside Metro Vancouver and the Valley are also quite messy in OSM with exclaves. I think most of them are a relic from old ways of tagging reserves and could be cleaned up, but it’s still difficult to make use of the data.

All in all, I wouldn’t split the Canadian provinces and territories yet. The largest are only just over the 1 GB mark and Nunavut is the only one that has a reasonable number of splits.

Why did you trim off the " + boundary=aboriginal_lands" in my message? Relation 8518419 (Nlaka'pamux Nation Tribal Council) is definitely within BC and currently tagged admin_level=5. I know it’s not an administrative boundary that’s useful in this case, but I wanted to mention it because I was discussing admin_level=5 in other provinces.

I think you have more storage and RAM than me, lucky!

Ontario’s 6 divisions for a million square kilometres and 15 million people seems reasonable to me, personally…

I had misread it as an “or”. Looking at it, I think it’s another case of old tagging for reserves.

Probably, but Frederik said 1 GB is the point at which he has tried to introduce subdivisions. The three largest extracts are 1.2, 1.1, and 1.1 GB. Of those, only Nunavut divides easily.

It’s not yet at 1 GB, which is the point where he normally tries to introduce subdivisions.

It takes my system about 13 seconds to extract Vancouver from the BC PBF on an HDD. I do this when I’m repeatedly testing a small area and would need to do it regardless of extract size.
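
For reference, a local extract like this can be scripted. The sketch below assumes osmium-tool is the extraction tool, which is only one of several options. The bounding box is a rough hypothetical box around Vancouver, and the file names are placeholders.

```python
def osmium_extract_cmd(source_pbf, output_pbf, bbox):
    """Build an `osmium extract` command line for a (left, bottom, right, top) bbox."""
    left, bottom, right, top = bbox
    if not (left < right and bottom < top):
        raise ValueError("bbox must be (left, bottom, right, top) in degrees")
    return [
        "osmium", "extract",
        "--bbox", f"{left},{bottom},{right},{top}",
        source_pbf,
        "--output", output_pbf,
        "--overwrite",  # replace the previous extract on repeated test runs
    ]


# Hypothetical rough bounding box around Vancouver (lon/lat degrees).
VANCOUVER_BBOX = (-123.3, 49.15, -122.9, 49.35)

cmd = osmium_extract_cmd(
    "british-columbia-latest.osm.pbf",  # placeholder input file name
    "vancouver.osm.pbf",                # placeholder output file name
    VANCOUVER_BBOX,
)
print(" ".join(cmd))

# To actually run it (requires osmium-tool and the input file):
#   import subprocess
#   subprocess.run(cmd, check=True)
```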