After thinking about it a bit more, I don’t think we should be adding any rivers or streams from the CanVec dataset. They’re not up to OSM standards. None of the rivers are mapped as centerlines through lakes and river areas, which results in watersheds being disconnected. This site demonstrates the extent of the problem. I contributed to the problem and I regret it.
Instead we should be using the 50K shapefiles provided by NRCAN here. They contain connected waterways and are available for all provinces and territories. They also don’t have the issue of being disjointed at tile boundaries.
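For anyone who wants to check connectivity on their own extract, here is a minimal sketch in Python. It assumes the waterways have already been loaded into a plain dict mapping a way ID to its list of node IDs (that loading step, and the name `count_components`, are my own assumptions and not part of any NRCAN or OSM tool). It counts how many separate groups of ways there are, treating two ways as connected when they share a node.

```python
def count_components(ways):
    """Count groups of ways that are connected through shared node IDs.

    `ways` maps a way ID to the ordered list of node IDs it passes through.
    This is a hypothetical in-memory format, not an OSM file parser.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Every node on a way belongs to the same group as that way's first node.
    for nodes in ways.values():
        for node in nodes[1:]:
            union(nodes[0], node)

    # One distinct root per connected group of ways.
    return len({find(nodes[0]) for nodes in ways.values() if nodes})
```

A stream that stops at a lake shore and a river that leaves the far side of the same lake come out as two groups. Adding a centerline way through the lake that shares a node with each brings the count down to one.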
The downsides are that you need to download the entire massive dataset, understand the basics of working with shapefiles, and process the raw data provided into tagging suitable for upload. These issues could be mitigated by either a) providing detailed instructions on the download, import, and tag processing required or b) performing these steps and providing pre-processed OSM files for use.
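To give an idea of what the tag processing step might look like, here is a rough Python sketch. Everything on the source side is a placeholder: the `feature_type` field and the values "watercourse", "waterbody" and "glacier" are hypothetical names I made up for illustration, not the real attribute names in the NRCAN shapefiles, and the real mapping would need to be worked out from the data itself.

```python
# HYPOTHETICAL mapping: the keys are placeholder feature types, NOT the
# actual attribute values used in the NRCAN shapefiles.
HYPOTHETICAL_TAG_MAP = {
    "watercourse": {"waterway": "stream"},
    "waterbody": {"natural": "water"},
    "glacier": {"natural": "glacier"},
}


def to_osm_tags(record, tag_map=HYPOTHETICAL_TAG_MAP):
    """Return OSM tags for one source record, or None if its type is unmapped.

    `record` is a plain dict of attributes for a single feature, using the
    assumed field names "feature_type" and "name".
    """
    base = tag_map.get(record.get("feature_type"))
    if base is None:
        # Unknown feature types are skipped rather than guessed at.
        return None
    tags = dict(base)  # copy so the shared mapping is never modified
    name = (record.get("name") or "").strip()
    if name:
        tags["name"] = name
    return tags
```

Returning None for anything unmapped is deliberate: features nobody has decided how to tag should be left out of the upload rather than given a default.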
However, I’m not confident that this NRCAN data is any more up to date than the CanVec files. I still don’t think the age of the data matters all that much in blank-space wilderness areas though. In the many hours I spent inspecting the data in coastal BC, I saw three cases:
a) The water has not moved in the last 50 years.
b) The water size/shape/route changes seasonally and will continue to change.
c) The water has changed size/shape/route due to a one time event such as a landslide or dam construction.
Small streams (which are the vast majority of waterways) almost entirely fall into case a). Lakes and ponds fall into a) or b). Smaller rivers mostly fall into a) while larger ones fall into b).
In case a) there is no need to update, and in case b) it is not particularly useful to update either, because it is a moving target and will not be kept up to date no matter how accurately you map it the first time. Even the latest satellite photos are likely already out of date.
Case c) should be updated, but these only account for a very small percentage of the data.
natural=water / waterway=* - See my previous post. We should be getting this data elsewhere. We might also want to reduce the fidelity of the data a bit, but that’s a matter for another post.
natural=wetland - I think this is fine to pull from CanVec.
natural=glacier - This should also be pulled from the NRCAN hydrography shapefiles. They are sliced into many more pieces in the CanVec files and uploaders have not been taking the time to splice them back together. See the glaciers south of Bella Coola for an example of how I believe it should be done, vs north of Bella Coola where they are split into many pieces.
natural=reef - Does CanVec have this tag? I seem to remember it having water=intermittent (or is it natural=water + intermittent=yes?). It seemed like NRCAN used this tag differently than we would use natural=reef, so I didn’t upload. I would need to refresh my memory.
natural=peak - Use these! But first, search for any with “Range”, “Mountains”, or “Hills” in the name (In JOSM use this search: “natural=peak name:Range OR natural=peak name:Mountains OR natural=peak name:Hills”). Inspect each of the search results to confirm that they are in fact mountain ranges, and change the natural=peak tag to natural=mountain_range.
highway=unclassified - I would lean towards not using this. They’re not going to be anywhere near complete and most major roads (which are shown in CanVec) will already be mapped. Could be used on a case-by-case basis I guess.
addr: - I tried using this data in a few places where it was missing, but JOSM threw so many validation errors that I quickly gave up. I don’t think it’s usable.
place=* - Should not be used by default. Occasionally you might find something useful. Most cities/towns/hamlets have already been mapped by other methods and the CanVec data had the node in the wrong spot in most cases. And they all seemed to be tagged place=locality which is wrong.
natural=land - Not useful on its own, but IIRC this tag contains many mountain passes. A search for “natural=land name:Pass OR natural=land name:Col” in JOSM will turn up most of them, and they should be tagged with natural=saddle.
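The two name-based checks above (peaks that are really ranges, and natural=land features that are really passes) could also be scripted. Here is a minimal Python sketch that only suggests a new value for manual review and changes nothing itself. The function name and the plain-dict tag format are my own assumptions. It also matches whole words rather than substrings, which is stricter than the JOSM searches quoted above, so that a name like “Columbia” is not picked up by “Col”.

```python
RANGE_WORDS = ("Range", "Mountains", "Hills")
PASS_WORDS = ("Pass", "Col")


def suggest_retag(tags):
    """Suggest a replacement natural=* value for manual review, or None.

    `tags` is a plain dict of OSM tags for a single feature. Matching is on
    whole, case-sensitive words in the name.
    """
    words = tags.get("name", "").split()
    natural = tags.get("natural")
    if natural == "peak" and any(word in words for word in RANGE_WORDS):
        return "mountain_range"
    if natural == "land" and any(word in words for word in PASS_WORDS):
        return "saddle"
    return None
```

Each suggestion still needs a human look before retagging, since a name containing “Hills” or “Pass” does not guarantee the feature is a range or a saddle.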
I’m starting to think we could produce our own version of CanVec files that wouldn’t suck so bad. At least for the natural features. Fewer splitting issues, connected waterways, properly tagged land features and mountain_ranges, etc. Is that an option? Most of these issues are easily resolved with a bit of filtering and selective re-tagging, but uploaders just don’t know to do it.
For discussion, I obtained the validity dates of the National Hydro Network - NHN - GeoBase Series content from NRCan. The same data is also offered in the current Canvec product (not Canvec/Osm). Here they are:
Alberta: 1954-2015
British Columbia: 1954-2010
Prince Edward Island: 1975-2005
Manitoba: 1974-2017
New Brunswick: 1951-2007
Nova Scotia: 1974-2016
Nunavut: 1949-2011
Ontario: 1978-2012
Quebec: 1948-2015
Saskatchewan: 1952-2013
Newfoundland and Labrador: 1946-2016
Northwest Territories: 1953-2013
Yukon: 1966-2013
The product’s compliance with network topology and toponymy requirements is not the same everywhere, as shown here.
I updated the Icefields Parkway (Hwy 93, AB) and surrounding natural features between the Athabasca Glacier and Bow Lake. Most of the current data is from Canvec/Osm and the region (AB, BC) has often been cited for the poor data quality of Canvec/Osm.
My feeling after all these edits? It all depends on the scale at which you look at the data. The map shows little difference between my edits and Canvec/Osm data up to zoom 12. Beyond that, the Canvec vegetation is too coarse. At zoom 16, the hydrography appears inaccurate but not wrong. The main problem is often a misalignment with the image that is “easily” fixed by moving the ways.
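To put numbers on the zoom levels, here is a small Python sketch using the usual ground-resolution formula for Web Mercator maps with 256-pixel tiles. The latitude of about 52°N for the Icefields Parkway area is my own rough assumption, and the function name is just for illustration.

```python
import math

# Ground resolution at the equator for zoom 0, Web Mercator, 256-pixel tiles.
EQUATOR_M_PER_PX_Z0 = 156543.03392


def metres_per_pixel(zoom, latitude_deg):
    """Approximate ground distance covered by one screen pixel."""
    return EQUATOR_M_PER_PX_Z0 * math.cos(math.radians(latitude_deg)) / (2 ** zoom)
```

At the assumed 52°N, `metres_per_pixel(12, 52.0)` comes out at about 23.5 m and `metres_per_pixel(16, 52.0)` at about 1.5 m. An offset of 20 m is therefore less than a pixel at zoom 12 but more than a dozen pixels at zoom 16.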
My thought is: as a first “draft” of the map, is Canvec/OSM very different from the PGS data used for large bodies of water around the world, or from the features I mapped using low-resolution imagery available 15 years ago? I am not sure…
However, using a source such as the current Canvec product (not Canvec/Osm) may be worthwhile, since some components may be more recent and potentially more accurate in some areas, and the hydrography network is available within water bodies.
CanVec does have the natural=reef tag; however, it is exceedingly rare. I’ve only encountered it a handful of times. Canada does have actual reefs, just not many.
I’ll add that while I agree it would be nice for the watershed to be connected, it’s still more useful to have the areas of major rivers instead of nothing. I was surprised how many massive bodies of water and super long rivers just did not exist in the OSM database before I started to add them. I feel like having a pretty good geometry of said features is better than nothing. However, I cannot comment on data quality in southern Canada, as all of my work is in Nunavut and NWT.
Also, for what it’s worth, the PGS data I’ve encountered has been absolutely terrible. Especially coastlines. The CanVec data in that case is miles ahead.
I would be interested in looking at the large data sets that Solarisphere is talking about; however, I have not worked with SHP files. If it’s an easy process to convert to .osm I’d love to see those files. But if that’s a lot of work then do not worry about it. Just curious.
Paradoxically, the map accuracy in northern Canada is generally higher than in the south. The south was mapped long before the north using analog methods, while the north was done more recently with the introduction of digital methods. For instance, the latest 50K scale maps were produced around 2010 on Ellesmere Island.