Rainy Lake super relation

mikeocool · April 2, 2023, 5:40pm

Hello!

There has been some discussion on a changeset and in OSM US Slack about how Rainy Lake, on the border of the US and Canada, is currently mapped. It currently has 16 multipolygons that are members of a ‘site’ super relation: Relation: Rainy Lake (12311888) | OpenStreetMap

Existing discussions:
Changeset: 132263642 | OpenStreetMap (note that the super relation actually pre-dates this changeset)
Slack
(there is also one more short thread in OSM US slack #tagging, but my new user 3-link limit is preventing me from posting it)

Among most of the folks who have participated in those discussions, the general consensus appears to be that the lake should be mapped as a single multipolygon and not as a super relation of multipolygons – which is consistent with other large/complex lakes in the area like Lake of the Woods and Lake Superior.

I was planning to make that change, but wanted to post here first to give a wider audience a chance to weigh in before moving forward.

Thanks!
Mike

jumbanho · April 2, 2023, 11:48pm

I think this is a good idea!

Richard · April 3, 2023, 9:13am

Without wanting to be too much of a contrarian, I think the general principle of several manageable multipolygons within one grouping, as opposed to one monster multipolygon, is a good one. It makes the data easier to edit and easier for consumers to handle.[1] There are some fairly egregious examples elsewhere, such as Lake Saimaa and our old friends the US National Forests.

I’m not necessarily convinced that site relations are the ideal vehicle for this, nor that we have anything better right now.

[1] I’m kind of worried that current mapping practice is encouraging OSM having a formal dependency on JTS/GEOS and that it’s increasingly impossible to process it with anything else…

jumbanho · April 3, 2023, 12:48pm

Can you tl;dr the issue with monster multipolygons? Is it the geographic size, number of member ways and/or constituent nodes or just the complexity of these objects?

Richard · April 3, 2023, 2:13pm

“Complexity” is probably the best call but yes, number of elements plays a part in that.

A lot of processing steps and algorithms that people might run over OSM data (e.g. validity-preserving simplification) get significantly slower or less reliable with large geometries or collections. Those algorithms run not only in what we traditionally think of as downstream applications, but also in OSM editing software, parts of the core stack (like osm2pgsql), and even the openstreetmap.org website itself. osm.org timing out when trying to display large relations is not a rare occurrence.

Big-assed relations have caused several problems over the years, occasionally leading to the rendering stack catching fire (not literally): [OSM-talk] Relation #2632934 is "killing" differential update; pgsql-mid fails on relations with more than 32767 members · Issue #713 · openstreetmap/osm2pgsql · GitHub; osm2pgsql fails with large relations · Issue #1607 · openstreetmap/osm2pgsql · GitHub; etc.
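
For anyone wanting to see where a relation stands against the 32767 figure in that issue title, here is a minimal sketch in Python. It assumes the relation has been saved as plain OSM XML; the function names and the tiny sample document are made up for illustration and are not part of any OSM tool. It only counts members, so it says nothing about geometric complexity.

```python
import xml.etree.ElementTree as ET

# Figure quoted in the osm2pgsql issue title above (largest signed 16-bit value).
MEMBER_LIMIT = 32767


def count_members(osm_xml: str) -> dict[int, int]:
    """Map each relation id in an OSM XML document to its number of members."""
    root = ET.fromstring(osm_xml)
    return {
        int(rel.get("id")): len(rel.findall("member"))
        for rel in root.iter("relation")
    }


def over_limit(counts: dict[int, int], limit: int = MEMBER_LIMIT) -> list[int]:
    """Return the ids of relations whose member count exceeds the limit."""
    return [rel_id for rel_id, n in counts.items() if n > limit]


# Hypothetical sample: one multipolygon-like relation with three member ways,
# and one super relation that has the first relation as its only member.
SAMPLE = """<osm version="0.6">
  <relation id="1">
    <member type="way" ref="10" role="outer"/>
    <member type="way" ref="11" role="outer"/>
    <member type="way" ref="12" role="inner"/>
  </relation>
  <relation id="2">
    <member type="relation" ref="1" role=""/>
  </relation>
</osm>"""

counts = count_members(SAMPLE)
```

Calling `over_limit(counts)` on real data would list any relation above the figure from the issue; a lower `limit` can be passed to flag relations that are merely getting large.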

jumbanho · April 3, 2023, 2:18pm

Thanks for your response. This will only have ~900 members, so orders of magnitude smaller than the ones you cited.

Richard · April 3, 2023, 2:31pm

Yes, but as I said, number of elements plays a part but is not the only signifier. Complexity of geometries can cause problems in processing, particularly with software that is not ultimately using JTS/GEOS.

I don’t have a particular opinion on Rainy Lake (I’d never heard of it until this morning) but am making the point that “large/complex lakes” are not necessarily best mapped as “a single multipolygon”. Lake Saimaa, for example, really should be a collection of its constituent lake multipolygons rather than one enormous multipolygon: https://www.openstreetmap.org/relation/7379046