Proposal to connect NHD waterways in Oklahoma with a mechanical edit, requesting feedback

Thank you both!! I really appreciate everything you’re saying.

I’ve been gathering more information.

Yeah, in the northwest part, those lakes don’t have ArtificialPaths running through them, so there wasn’t anything to connect when looking only at waterways. I don’t know whether WaterwayMap considers small bodies of water when connecting streams and rivers. Even if it doesn’t, I’m looking at connecting ways in Oklahoma like this: w61823627, w61945487, w61858734. The nodes are there; I’d just have to merge them.

One thing I’m not sure of, though, is cases like way/138483691. It’s inside a small pond by itself and is already connected to the way in the direction it flows out of the pond (north in this case), but its first node is at the back of the pond, not connected to anything. Should I connect that, since the nodes are there to do so? Or leave it?

I found 4,125 instances of that: a waterway starting at the back of a pond but not connected to it there, with no other waterways nearby to connect to. Then there are 12,905 instances of every other case (a waterway flowing out of or into a pond while positioned outside the pond, OR a waterway inside the pond flowing up to the edge and stopping with no other waterway to connect to). Of the nodes I’d merge this time, 0.5% are at identical coordinates, 40% are closer than 3cm, and 60% are between 3 and 5 cm.
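For anyone wanting to reproduce that kind of distance breakdown, here is a minimal sketch, not the script actually used. It assumes node positions are plain latitude/longitude pairs and uses an equirectangular approximation, which should be adequate at centimeter scale. The function names, the Earth radius constant, and the bucket labels are all illustrative assumptions.

```python
import math

# Mean Earth radius in meters (an assumption; any nearby value works at this scale).
EARTH_RADIUS_M = 6_371_008.8


def distance_cm(lat1, lon1, lat2, lon2):
    """Approximate distance in centimeters between two nearby points."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)  # east-west component
    dy = math.radians(lat2 - lat1)                       # north-south component
    return math.hypot(dx, dy) * EARTH_RADIUS_M * 100


def bucket(d_cm):
    """Sort a distance into the ranges quoted in the post (labels are hypothetical)."""
    if d_cm == 0:
        return "identical"
    if d_cm < 3:
        return "under 3 cm"
    if d_cm <= 5:
        return "3 to 5 cm"
    return "over 5 cm"


# One step of 0.0000001 degrees of latitude is roughly a centimeter.
d = distance_cm(35.0, -97.0, 35.0000001, -97.0)
print(round(d, 2), bucket(d))
```

OSM stores coordinates with seven decimal places, so a single step in latitude is already about a centimeter, which is why so many near-misses land in the lowest buckets.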

Thank you for your time!

Again, you are “connecting waterways” and so, please connect them. If your mind is not imagining the “knit together” waterway system in this area, it should be. There are ebbs and flows. There are ponds for standing water (often with a stream right through it). The USA and OSM have these, they are “understood to exist in certain ways, with certain methods of flow and connectivity” (like “ways point downstream”). Two inches or less? Yes, join / connect them. That appears to be in the neighborhood of “rounding error,” so you likely are merging things together which should be, but you should “check” (maybe the Validator is doing part of that, I’m not perfectly clear on your workflow, though your results do appear to be “an inch or two off between nodes” in thousands of instances).

How much of your review is manual, and how much is automated? I don’t know, but be careful not to automate too much, especially if you are not sure how strongly you can trust the logic used to make a determination. Most of the time, actually nearly all of the time, what Validator reports is quite accurate and carefully presented to you. An error should always be fixed; a warning can be ignored, but should be looked at if you know what you’re doing, and you are wearing a hat that says “I know what I’m doing” as you import.

To your best extent possible, please integrate waterways into a “whole fabric” that includes the sort of connectivity that very adjacent nodes (inches, “mere centimeters”) enjoy with actual node-connectivity.

So far, you are showing us that you can map by your so-far-so-good example, though there appears to be more polish and shine ahead. Yup, especially when things are working well and as they should, that’s how we do things around here. Keep it up! (Your interactivity with us here helps).


Thank you. When I committed the changes over the weekend, I went through all of the validator messages. I fixed all the errors, and almost all of the warnings. I looked at every warning one at a time, for sure, leaving the ones where I didn’t think there should be a change. I don’t remember what the case was where that happened.

I’m automatically downloading and parsing the information from Overpass, and then dumping what my program finds to a file. I’m looking at what the waterways would become connected to if its end node(s) got merged with the nodes within 5cm. I’ve now narrowed it down to:

  • 16.8k instances of the waterway way connecting to a single closed way tagged natural=water
  • 145 instances of connecting to a way belonging to a relation tagged natural=water
  • 5 cases of connecting to a way just tagged landuse=reservoir, which I would change to natural=water manually
  • Fewer than 5 instances each of a waterway connecting to a way with natural=water and also one of: a natural=wood, a natural=wetland, or a second natural=water at the same position
  • Finally, about a dozen instances of waterways with ends touching that I didn’t find before because I only looked for ArtificialPaths and connectors.

There had been some cases of administrative boundaries, with a node at the same coordinates, but I don’t touch anything like that automatically.
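The categories above could be sorted with a small classifier like the sketch below. This is only an illustration under assumptions: the tag dictionaries, the `is_closed` flag, and the category labels are hypothetical, and the real program may structure this differently.

```python
def classify_target(way_tags, is_closed, relation_tags=()):
    """Classify the way a waterway end would become connected to.

    way_tags: dict of tags on the target way.
    is_closed: True if the target way's first and last node are the same.
    relation_tags: tag dicts of any relations the target way belongs to.
    """
    # Anything boundary-related is never touched automatically.
    if "boundary" in way_tags or any("boundary" in r for r in relation_tags):
        return "skip: boundary"
    if way_tags.get("natural") == "water" and is_closed:
        return "closed way natural=water"
    if any(r.get("natural") == "water" for r in relation_tags):
        return "member of natural=water relation"
    if way_tags.get("landuse") == "reservoir":
        return "landuse=reservoir (review manually)"
    if "waterway" in way_tags:
        return "waterway"
    return "other (review manually)"


print(classify_target({"natural": "water"}, is_closed=True))
print(classify_target({}, is_closed=False, relation_tags=[{"natural": "water"}]))
print(classify_target({"boundary": "administrative"}, is_closed=True))
```

Checking the boundary case first means a node shared with an administrative boundary is excluded before any water rule can match it.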

Once I was comfortable with what I was going to do, I loaded everything in JOSM and ran a script that goes to each group of nodes to merge and runs the functionality behind the “Merge Nodes” function on them.


Excellent. I’m doing a “handoff” here to @watmildon who is also watching closely and others here who are more quiet. Buff it to a polished shine, folks. Peace, out.


WaterwayMap only looks at things tagged as waterway=, so all of the natural=water features are completely invisible to it. I’m sure there’s some extra preprocessing that could, for example, collapse all water= features to points so all of the in and outflows are “connected”, but that requires features like w61823627 to get patched up as you suggest.
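To make the “collapse water features to points” idea concrete, here is a rough sketch. It is not how WaterwayMap works; it is a hypothetical preprocessing step using a union-find structure, and the input shapes (dicts of node id lists) are assumptions. Each water body becomes a single hub, so any waterway sharing a node with its outline ends up in the same connected group.

```python
def connected_groups(waterways, water_bodies):
    """Group waterways that connect directly or through a shared water body.

    waterways: dict of way id -> list of node ids.
    water_bodies: dict of water id -> list of outline node ids.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Collapse every water body to one hub "point".
    for water_id, nodes in water_bodies.items():
        for node in nodes:
            union(("water", water_id), node)

    # All nodes of a waterway belong together.
    for nodes in waterways.values():
        for node in nodes[1:]:
            union(nodes[0], node)

    groups = {}
    for way_id, nodes in waterways.items():
        groups.setdefault(find(nodes[0]), set()).add(way_id)
    return sorted(sorted(group) for group in groups.values())


ways = {"inflow": [1, 2], "outflow": [3, 4], "lonely": [9, 10]}
pond = {"pond": [2, 5, 3]}
print(connected_groups(ways, pond))
print(connected_groups(ways, {}))
```

The sketch only links a waterway to a pond when they actually share a node, which is why ends sitting a few centimeters off the outline still need to be merged first.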

Setting aside that this area needs a redraw, I think it’s unusual to have waterways begin at the back of a water= feature. I wouldn’t say it is wrong just unusual for how folks map them. I don’t think connecting them up is wrong but it’s also not meaningfully more correct than leaving it disconnected. Honestly, I’d be tempted to delete those “originator” ways so the import seemed a bit more consistent with OSM “style”. But again, I don’t feel strongly about them.


Oh wow, I hadn’t looked at the satellite. I’ve fixed those manually now. I deleted the ArtificialPath too.

I had read the “lakes are basically water-roundabouts” idea before, but I never thought of collapsing them to one node. That’s really interesting.

Thank you for continuing to bear with me. I’ve been trying to compare OSM to the NHD to confirm these ways inside ponds are really the beginning of a stream, that there isn’t a segment missing in OSM, outside the pond and flowing into it that should be connected to the ArtificialPath. I did find and fix a few instances manually today, where a small connection between two ponds or under a roadway probably got deleted at some point.

But then I found way/139502473, it doesn’t currently connect to anything with its first node, but I see a way dated 2019 in the NHD that leads right up to it:

            "type": "Feature",
            "id": 8456072,
            "geometry": {
                "type": "LineString",
                "coordinates": [
            "properties": {
                "OBJECTID": 8456072,
                "permanent_identifier": "{46333995-D382-42E4-8188-7BA840803DAE}",
                "fdate": 1550620800000,
                "resolution": 2,
                "gnis_id": null,
                "gnis_name": null,
                "lengthkm": 0.06329655,
                "reachcode": "11090201009372",
                "flowdir": 1,
                "wbarea_permanent_identifier": null,
                "ftype": 460,
                "fcode": 46007,
                "innetwork": 1,
                "mainpath": 0,
                "visibilityfilter": 24000,
                "Shape_Length": 77.7378083086172,
                "globalid": "{F4B011AB-2297-4D56-871E-9A4C4E3E7FF8}"

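As a side note on the “dated 2019” reading: the `fdate` value in the excerpt looks like a timestamp in milliseconds since the Unix epoch. Assuming that is the case and that it is in UTC, it can be decoded like this (the function name is made up for illustration):

```python
from datetime import datetime, timezone


def nhd_fdate_to_date(fdate_ms):
    """Convert an epoch-milliseconds value to a UTC calendar date."""
    return datetime.fromtimestamp(fdate_ms / 1000, tz=timezone.utc).date()


print(nhd_fdate_to_date(1550620800000))  # 2019-02-20
```
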
I understand getting rid of useless ArtificialPaths, but I kinda want to leave 'em for now… or at least I don’t know that I want to be responsible for deleting them, there’s about 4000 of these cases. Another thing I came across, is if I were to delete them, and their last node connects to a waterway heading out of the lake, that waterway could be left connected to nothing, since many of the imported waterways had already been connected without including the edge of the surrounding water.

So in short, I tried to make it easier for me to make a decision, and ended up just making it harder. :slight_smile:

I can work on figuring out how to delete these ArtificialPaths inside ponds and making sure anything connected to their last node gets connected back to the water’s edge. And if anybody does an import of newer waterways in the future, they already have to worry about connecting things and can figure it out then.

Or, I can still leave them and just connect everything else outside of the ponds as planned. I am definitely going to sleep on it.

Thank you!


One thing that is useful to think about is how easy it would be to “undo” this if, 6 months from now, we decide we want to backtrack. Given that this is all sourced from NHD, it’s pretty hard to do enough damage that a dedicated person couldn’t (through reverts or a redo of the import) patch things back to how they were. It’s even easier to backtrack for solely tagging changes.

In contrast, updating all the geometry for a ton of varying and connected feature types over many weeks that would need to be hand entered? Much more to worry about.

They’ve been there a while; having them in for a while longer doesn’t hurt anything. As you’ve noted, there are some that may be useful to keep around in case the import is redone or extended. I’m glad you’re thinking about what happens for the next folks to come along. I don’t think the map is damaged hugely either way.


(just to comment on the original question)

To be honest, that looks like a bug in the initial import? Now that you’ve fixed it, this stream and this lake look OK to me.


Yeah, this is a super common issue with older NHD imports.


Ok, I think I’ve figured out what I’d like to do. Here are before and after zipped *.osm files, none of these changes have been uploaded. The files contain the objects to be modified and everything sharing nodes with them.

Edit: I made a mistake before, there are a lot of connections to be made where the nodes are exactly 7.2cm apart for some darn reason. The below is updated to reflect upping the range limit from 5cm to 8cm.

  • - 18.3MB compressed, unzips to a 124MB osm file
  • - 19.1MB compressed, unzips to a 141.0MB osm file

5,226 ways to delete (all ArtificialPaths inside of bodies of water, with nothing flowing into the body of water to connect to it, and nothing in the NHD to connect to it either.)

For about 60% of them, the last node in the way is connected to the beginning of another waterway that would be left dangling, so that node is then merged with one belonging to the body of water (the nearest node within 8 cm).

For the other 40%, either the waterway flowing out of the body of water that our deleted ArtificialPath connects to is already connected to the lake edge, OR there’s just nothing the deleted way was connected to. So simply deleting the way is enough.

There are about 150 ArtificialPaths (already tagged as waterway=stream) inside of lakes I want to leave alone because of one of these cases:

  • The NHD shows something that could connect to it that doesn’t exist in OSM
  • The way inside the waterbody has a name tag
  • There’s a way outside the body of water flowing away from it that would be left connected to nothing because there isn’t a node belonging to the instance of natural=water within about 8cm to merge with
  • There are multiple ArtificialPaths connected to each other inside the same large lake
  • Any of the nodes in the way besides the very last one also belong to another way
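The exclusion rules in the list above could be expressed as a simple filter along these lines. This is a sketch only: the `PondWay` class and every field name are hypothetical stand-ins for whatever the real program tracks, and each flag would need to be computed from OSM and NHD data beforehand.

```python
from dataclasses import dataclass


@dataclass
class PondWay:
    # Hypothetical flags, one per bullet in the list above.
    nhd_has_missing_connection: bool = False
    has_name_tag: bool = False
    outflow_would_dangle: bool = False      # no natural=water node within ~8cm
    chained_to_other_artificial_path: bool = False
    interior_node_shared: bool = False      # any node except the last is shared


def reasons_to_keep(way):
    """Return the reasons this way should be left alone, if any."""
    checks = [
        (way.nhd_has_missing_connection, "NHD shows a possible connection missing in OSM"),
        (way.has_name_tag, "has a name tag"),
        (way.outflow_would_dangle, "outflow would be left connected to nothing"),
        (way.chained_to_other_artificial_path, "multiple ArtificialPaths connected in one lake"),
        (way.interior_node_shared, "a node other than the last belongs to another way"),
    ]
    return [label for flag, label in checks if flag]


def should_delete(way):
    """A way is only a deletion candidate when no exclusion applies."""
    return not reasons_to_keep(way)


print(should_delete(PondWay()))
print(reasons_to_keep(PondWay(has_name_tag=True, interior_node_shared=True)))
```

Returning the reasons rather than a bare yes/no makes it easy to publish a list of the ways left behind along with why each one was skipped.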

20,159 node merges, all but 10 of which involve only two nodes, connecting the end of a waterway to the edge of a body of water where that end had connected to nothing previously. Then 10 instances of 3 nodes being merged because I found some canals I didn’t pick up before whose ends are touching.

And then there are JOSM validator warnings I’ll work through manually before committing changes to an area. I didn’t address them all yet in the files linked above.

I’m really grateful for all of this feedback, and for the ability to pursue this. It’s really satisfying doing this work.

Is it okay for me to create a wiki page, and begin making these changes, do you think? Or should I slow down, or work more on part of it?

Thank you thank you thank you.


This looks sensible to me. I would love to know what the other 200 are and if we can work through them manually. I haven’t had a ton of time the last couple weeks but may be able to work through some (and the dams! so many dams!) in coming weeks.


Thank you! I updated my previous post: there were a large number of instances of nodes being exactly 7.2 centimeters apart. By looking within 5cm first, and upping the limit to 8cm only if nothing is found, there are a few thousand more connections to make.
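The two-stage search (5cm first, then 8cm only if nothing was found) might look roughly like the sketch below. It assumes positions have already been converted to a local planar frame measured in centimeters; the function name and the data layout are illustrative, not taken from the actual program.

```python
import math


def find_merge_target(end_xy_cm, candidates, limits_cm=(5.0, 8.0)):
    """Pick the nearest candidate node and report which limit it satisfied.

    end_xy_cm: (x, y) of the waterway end, in centimeters.
    candidates: dict of node id -> (x, y) in the same frame.
    Returns (node_id, distance_cm, limit_cm) or None if nothing is close enough.
    """
    if not candidates:
        return None
    node_id, xy = min(candidates.items(), key=lambda item: math.dist(end_xy_cm, item[1]))
    distance = math.dist(end_xy_cm, xy)
    # Try the tight limit first, then the wider one.
    for limit in limits_cm:
        if distance <= limit:
            return node_id, distance, limit
    return None


print(find_merge_target((0.0, 0.0), {1: (3.0, 4.0), 2: (20.0, 0.0)}))
print(find_merge_target((0.0, 0.0), {1: (7.2, 0.0)}))
print(find_merge_target((0.0, 0.0), {1: (9.0, 0.0)}))
```

Because the nearest node is always chosen, a match inside 5cm can never be displaced by one that only qualifies under the 8cm limit.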

I will put that list together of about 150 lone ways being left behind and post it here, later today probably.

Thank you for all of your time, I love the workflow of using MapRoulette for the dams. Any you’ve done or you do in the future (with or without MR, all is good) is wonderful, thank you. But don’t feel pressure! I’ll do them all eventually if nothing else. Your guidance has given me the freedom to work through them in a way that’s comfortable and efficient for me. So I appreciate it!


[ok_ways_left · GitHub]

Here are the 147 ways in Oklahoma with NHD tags, inside of bodies of water, that don’t have anything connecting to their first node and that I didn’t want to delete automatically. Most of them have names, and that’s why I left them. For the rest, the reason should be that the last node of the way was more than 8cm from the nearest node belonging to the body of water. I think. I haven’t looked at them all.

The columns mentioning old and new: old is “ways and relations the last node of the way is already connected to” and new is “ways and relations this way would become connected to if all nodes within 8cm of its last node were merged.”

Edit: Here’s a bonus, the 18 ways where the NHD has another way that doesn’t exist in OSM that might connect to it. Their ends are all about 1 meter apart though. And in one case I think the way in the NHD is the same way in OSM but its ids are different. I haven’t looked at all of these.

[ok_ways_left_nhd_conn · GitHub]

Query to view a way by object id for convenience (?) -*&returnGeometry=true&geometryPrecision=&outSR=4326&f=geojson


All done :slight_smile:

[Mechanical Edits/AutoMatt/Connecting NHD Waterways in Oklahoma 2 - OpenStreetMap Wiki]

[Changesets by AutoMatt_ | OpenStreetMap]


This is a huge amount of cleanup and greatly appreciated!


Thank you for letting me! I want to provide an update, and ask about one last mass mechanical change please (setting intermittent=yes on streams, more info at the end). First, I’ve been manually connecting things, there’s less confetti in the north now.

It’s not perfect but it’s a lot better. What I found was, there were many streams with big chunks missing. Here’s an example from Overpass. In green are the segments that I created manually.

Here’s another example. Green were missing, now created by hand.

Lastly, here are the waterway relations in Oklahoma, with the new ones I created in green. After verifying a named waterway end to end, it made sense to me to throw a relation on it. I didn’t do it for all of them; I did about 30, each long enough to have 20+ segments.

Question about a mass edit please:
There are 59,443 waterways in Oklahoma with NHD:FCode=46003 that don’t have an intermittent tag. Can, or should, I set intermittent=yes on them?
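For reference, a count like this can be obtained from Overpass. The sketch below only builds the query text; it is an assumed query, not necessarily the one used for the figure above, and it relies on Oklahoma being findable as an area through its ISO3166-2 code.

```python
def build_missing_intermittent_query(fcode, iso_code="US-OK"):
    """Build an Overpass QL query counting ways with a given NHD:FCode
    that have no intermittent tag, inside the area with the given ISO3166-2 code."""
    return (
        "[out:json][timeout:180];\n"
        f'area["ISO3166-2"="{iso_code}"]->.state;\n'
        f'way["NHD:FCode"="{fcode}"][!"intermittent"](area.state);\n'
        "out count;"
    )


print(build_missing_intermittent_query("46003"))
```

The printed text can be pasted into Overpass Turbo; swapping `out count;` for `out geom;` would return the ways themselves instead of a total.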

Thank you for your time!


Really curious how there’s that many sections missing. Probably lost to time why…

I think adding relations for longer waterways is totally fine. It’s very common and makes a nice single place to grab the whole thing.

Features with that FCode should be marked as intermittent and, given the geography of OK, I’m not surprised at the large number of ways. In case you haven’t seen this, the wiki has a page talking generally about NHD imports that mentions this. There may be other FCodes that could use some scrutiny while you’re there.

For some of the missing waterways I created manually, it looked like the NHD had the stretch as a polygon that either didn’t get imported, or did but was tagged as some kind of river bank and got deleted after the fact. And the ArtificialPath down the middle never existed. But then, yeah, a lot of the missing segments, there’s nothing in the way…

Looking at lakes and streams in Oklahoma, there are:

NHD:FCode | Count   | Count currently having intermittent | Wiki: feature type | Wiki: OSM tags                   | Wiki: description
46003     | 60,224  | 885                                 | STREAM/RIVER       | waterway=stream intermittent=yes | Hydrographic Category intermittent
46006     | 26,457  | 335                                 | STREAM/RIVER       | waterway=stream                  | Hydrographic Category perennial
39004     | 109,721 | 29                                  | LAKE/POND          | natural=water                    | Hydrographic Category perennial
55800     | 16,814  | 13                                  | ARTIFICIAL PATH    |                                  |
39001     | 6,756   | 15                                  | LAKE/POND          | natural=water intermittent=yes   | Hydrographic Category intermittent
39009     | 530     | 2                                   | LAKE/POND          | natural=water                    | Hydrographic Category perennial; stage average water elevation
33400     | 250     | 1                                   | CONNECTOR          |                                  |

So, I think that would be (?)

  • 59,399 waterways with NHD:FCode=46003 that need intermittent=yes
  • 6,741 water bodies with NHD:FCode=39001 that need intermittent=yes
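As a quick cross-check of those two figures, subtracting the “currently having intermittent” column from the total count for each FCode listed above gives the following. The function name is only for illustration.

```python
def still_untagged(total, already_tagged):
    """Features of a given FCode that do not yet carry an intermittent tag."""
    return total - already_tagged


# Totals and already-tagged counts as listed above for the two intermittent FCodes.
counts = {"46003": (60_224, 885), "39001": (6_756, 15)}
for fcode, (total, tagged) in counts.items():
    print(fcode, still_untagged(total, tagged))
```

This prints 59339 for 46003 and 6741 for 39001. The second matches the bullet above; the first differs from the quoted 59,399, so that figure may be a transposed digit or may come from a separate query, and is worth re-checking before relying on it.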

What do you think?


That seems sensible to me.


Thank you! Done!