Simplifying lake with 200,000 nodes

There is a lake in northern Quebec, Canada that is made up of 735 ways and 207,982 nodes. This causes pretty severe performance issues when the lake is loaded in JOSM, and since the lake is part of several boundaries and I’ve been doing boundary work in Quebec, I’ve been seeing it a lot. I also imagine it causes extra work for anyone making tiles or doing other calculations. I’d really like to simplify these ways.

Actually doing the work in JOSM is super easy, just a few steps: download the lake, select the ways, then Tools > Simplify Way. I’ve already done it, and it gives the following result when simplifying to the default 3m accuracy:
  • 735 ways
  • 1528.2 miles
  • 207,982 → 64,576 nodes
  • 143,406 nodes removed, or 69% (nice)

Screenshot from 2023-03-23 13-01-26
Screenshot from 2023-03-23 13-01-34

There are other lakes in the area drawn similarly. I estimate that I’d be deleting at least 200,000 nodes between them.

My only concern is that the change is very large. JOSM offers/requires breaking it into chunks of 1,000 or 10,000 objects, but that still means submitting on the order of 200 or 20 changesets. Is there a better way of submitting a change like this? Or anything else I should be aware of before submitting?

3 Likes

I have the feeling that it is oversimplified. It looks like more than 3 metres from node to node?

2 Likes

It also looks to me like it went too far and the geometry is now of worse quality.

What is on top seems fine to me.

5 Likes

Have you tried reporting it via Help → Report bug in JOSM (if not reported already)?

I have run into severe performance issues a few times, and the JOSM developers looked at the relevant code and improved it so the performance issue went away. (Obviously sometimes there will be no low-hanging fruit and you may just need more RAM.)

4 Likes

I believe 3m means that no point on the way will move by more than 3m. It is described as “maximum error”.
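To make “maximum error” concrete: Simplify Way behaves like a Ramer-Douglas-Peucker simplification, where a node is only dropped if the remaining way stays within the tolerance of it. A minimal sketch of that idea in Python, assuming planar coordinates already in metres (JOSM’s actual implementation differs in detail):

import math

def perpendicular_distance(pt, start, end):
    # Distance from pt to the infinite line through start and end
    # (planar approximation; a segment distance is the more careful choice).
    (x, y), (x1, y1), (x2, y2) = pt, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        return math.hypot(x - x1, y - y1)
    return abs(dy * x - dx * y + x2 * y1 - y2 * x1) / math.hypot(dx, dy)

def simplify(points, max_error):
    # Ramer-Douglas-Peucker: keep the endpoints, find the interior point
    # farthest from the chord, and recurse only if it exceeds the tolerance.
    if len(points) < 3:
        return points
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= max_error:
        return [points[0], points[-1]]
    left = simplify(points[:index + 1], max_error)
    right = simplify(points[index:], max_error)
    return left[:-1] + right  # drop the duplicated split point

Calling simplify(points, 3.0) with coordinates in metres would correspond to the default 3m setting mentioned above.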

You’re right, it is reducing detail. But the new level of detail is totally fine for any lake, especially this one, which is extremely remote and also very large.

If someone mapped 200,000 individual trees up here that would be considered a waste and they would probably get deleted. I see these nodes on the lake like that. These are 200,000 extra nodes that are technically adding detail but in reality provide no extra value and come with a huge cost.

The current one also is, and if I encountered the image below I would manually make it more detailed.

And while mapping something with less detail in the first place is fine, removing reasonable existing detail is not a good idea.

On the image you have chosen there is a noticeable reduction of detail, and according to aerial imagery the old version matched better; it was not fake or outdated detail.

That is not a valid reason to reduce data quality. Either the new version has no real data loss, in which case it is fine, or actual data is lost and it is not a good idea.

14 Likes

Maybe it would be good to contact the user who originally drew it and try to agree on optimising it?

1 Like

I can easily adjust the maximum error used by the Simplify Way tool. I couldn’t find the same spot as before, but here’s a similar one, this time at 1.5m:


At this detail, I would still be deleting 53% of the nodes on these ways. That’s 110,000 nodes on this lake, and probably up to 150,000 including the others.

Back to my original question, is there anything I need to know or do before deleting this many nodes?

I did the math: 1530 miles / 207,982 nodes ≈ 0.0074 miles per node, and 0.0074 × 1.6 × 1000 ≈ 12 metres per node.

That’s not so bad for a lake with curved banks.
I do agree that the lake is huge and it is a very large number of nodes.

Réservoir Manicouagan, 250 km to the east, has a 1719 km bank and 9,654 nodes. That’s 178 metres per node.
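
The same arithmetic as a quick sketch (figures from this thread; 1 mile taken as 1.6 km):

# Average node spacing for the two lakes discussed above
mistassini_m = 1530 * 1.6 * 1000        # shoreline length in metres
print(round(mistassini_m / 207_982, 1)) # ≈ 11.8 metres per node

manicouagan_m = 1719 * 1000             # 1719 km of bank
print(round(manicouagan_m / 9_654))     # ≈ 178 metres per node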


1 Like

When I use the JOSM simplify way feature on over-noded objects, I typically set the maximum error to 0.5 meters or lower. This removes excess nodes without noticeably changing the shape of the object at typical maximum map zoom levels. If that threshold doesn’t remove any nodes, I leave the object alone. Large complex objects are a problem in OSM; for example, Tongass National Forest won’t even load in the web interface. However, I don’t think the solution to this problem should be to overly simplify the geometry.

19 Likes

Assuming the maximum of 2,000 nodes per way (a member way of 1,899 nodes or fewer is not flagged by OSM Inspector), 735 ways for the lake outline seems very high when it could be done in, say, 110-120. Fewer ways may help performance when retrieving the full multipolygon outline. I recently did a fix on a reserve sea area with 2,000-plus members on the French/Italian coast, and it took forever with JOSM. I saved it as a data set in case I ever had to revisit it, and not a week later that paid off when a little islet was added.

That seems like a good threshold.

Can you try this one?

I would also ask the local community before making edits on this scale.

5 Likes

Sure, here are the resulting sizes for 0.5m increments of the maximum error, shown as nodes remaining (percentage of baseline):

  • Baseline: 100% (208,000)
  • 3.0m: 31.0% (65,000)
  • 2.5m: 34.6% (72,000)
  • 2.0m: 39.5% (82,000)
  • 1.5m: 46.7% (97,000)
  • 1.0m: 58.9% (122,000)
  • 0.5m: 81.5% (170,000)
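
Converting those retained percentages back into nodes removed, as a quick sketch from the rounded figures above:

baseline = 207_982
retained = {3.0: 0.310, 2.5: 0.346, 2.0: 0.395, 1.5: 0.467, 1.0: 0.589, 0.5: 0.815}
for max_error, frac in sorted(retained.items()):
    print(f"{max_error}m: {baseline * (1 - frac):,.0f} nodes removed")
# e.g. 1.5m removes about 110,900 nodes, matching the ~53% figure quoted earlier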

And here’s one more set of test images, this time zoomed a little farther out.
Baseline:
Mistassini Simplify Baseline

0.5m:
Mistassini Simplify 0-5m

1.0m:
Mistassini Simplify 1-0m

1.5m:
Mistassini Simplify 1-5m

2.0m:
Mistassini Simplify 2-0m

2.5m:
Mistassini Simplify 2-5m

3.0m:
Mistassini Simplify 3-0m

After flipping through these images some more and looking at other places on the lake I think I will probably go with 1.5m or 2m. Any less aggressive than that and it starts to leave behind clusters that really should be removed.

I don’t believe that removing these nodes even at a “maximum error” value of 3m is going to have any meaningful impact on the quality or functionality of the data. Anything you could do with the original data you could do after it’s simplified, only with less processing power. The only difference is that some curves will be a little less smooth when you’re zoomed in all the way.

Though I do agree that on other objects like roads or boundaries 0.5m should probably be used.

3 Likes

That is also a valid use of the data.

7 Likes

Yes, judging from the provided pictures I would not reduce the resolution. 0.5m could still be acceptable, but 3m is definitely too much and would take this lake from nice and detailed to ugly and edgy.

8 Likes

Sometimes I find JOSM slow when working with large objects. Do you know about JOSM’s Wireframe View? Toggle it with Ctrl-W. In my experience that speeds up JOSM’s rendering performance.

8 Likes

Apologies in advance for the length. TL;DR: skip to the end for a command line with a particular parameter that increases the Java heap space allocated to JOSM at launch.

I thank @Graptemys for editing in OSM, and with JOSM, which those who use it know has a definite learning curve. While I’m at it, I thank everybody who edits in OSM, with whatever editor whatsoever!

That said, I am also reluctant to see higher-resolution data become lower-resolution data, regardless of how “remote” it might be. (Earth is Earth is Earth; you are welcome to quote me on that.) There’s been a lot of good, quite technical advice offered here, including @amapanda_ᚐᚋᚐᚅᚇᚐ’s recent suggestion to use Wireframe View.

I’ve been editing massive (>100,000 node) objects in OSM for quite a while, including ginormous national forests about ten years ago, but I simply had to stop because it was crippling my machine. (A decent Mac at the time; I wouldn’t call it “big iron,” and it was big enough for most things, but those forests were just too much.) Of course, I’ve since upgraded, though for this scale of editing, 64 GB of RAM isn’t too much.

I did learn a few tricks with JOSM (it being a Java app); maybe they’ll work for you. Try increasing the heap space you offer the program at launch, and you might not need to ask about “simplifying” OSM data (let’s call this what it is: dumbing down our data). I don’t want to dumb down our data for the convenience of editing, so get down to the command line (a shell in Unix/Linux, a Terminal window in macOS, a CLI in Windows…) and see if you can launch JOSM like this (I’m on a Mac, hence the path you see here):

java -jar /Applications/josm-tested.jar

You really should be able to do this. Actually, I also add -Dsun.java2d.opengl=true and you might, too.

I’m talking about radically increasing JOSM’s heap space. If you have 32 GB (the minimum for what I’m talking about) or 64 GB (or more) of RAM, this can seriously help. My usual JOSM startup command “beefs up” the heap to 2 gigabytes (up from 0.5 GB in the old days, to 1 GB, now 2 GB):

java -Dsun.java2d.opengl=true -Xmx2048m -jar /Applications/josm-tested.jar

took a bit under 2 minutes (108 seconds) to open relation/6563858. But then I changed the -Xmx2048m (2 gigabytes of heap) command-line parameter to -Xmx8192m (8 gigabytes of heap). So, to be clear, the command line I use to throw a pretty large 8-gigabyte heap at JOSM is:

java -Dsun.java2d.opengl=true -Xmx8192m -jar /Applications/josm-tested.jar

JOSM opened this gigantic lake in 44 seconds, about a 60% saving in loading and display time.

While I’m all for “sane simplification” when warranted, let’s not needlessly simplify our data; let’s give our tools the resources they need to do their job properly.

I hope this helps.

5 Likes

Improving editor settings aside, I think it would be acceptable to modify OSM data so that it is easier (and less resource-hungry) to edit, but it should not involve downgrading the data. Rather, one could split huge polygons by converting them into a multipolygon and splitting the outline into several ways, as sketched below. For local modifications you would then not have to download the whole lake, just the local parts.
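
For illustration, a minimal sketch of that splitting step, assuming a hypothetical helper operating on one long way’s ordered node IDs (real edits would go through an editor or the API, with the resulting ways becoming outer members of the multipolygon):

def split_way(node_ids, max_nodes=2000):
    # Split a long list of node IDs into consecutive chunks; each pair of
    # neighbouring chunks shares one endpoint node so the ways stay connected.
    ways, start = [], 0
    while start < len(node_ids) - 1:
        end = min(start + max_nodes - 1, len(node_ids) - 1)
        ways.append(node_ids[start:end + 1])
        start = end
    return ways

At 2,000 nodes per way, the 207,982 nodes here would come out to roughly 105 connected ways, in line with the 110-120 estimate earlier in the thread.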

8 Likes

An excellent suggestion!