I know that this topic has existed ever since the introduction of boundary relations but surprisingly I haven’t been able to find any threads on it on the general tagging forum.
So, let’s recap: what is actually the difference between type=boundary and type=multipolygon? Ever since the abolishment of the exclave and enclave roles, there’s been virtually no functional difference between the two.
I’ve heard arguments that e.g. a city boundary doesn’t actually represent an area but a literal boundary, so something closer to a multilinestring, but come on, really? Nobody thinks of it that way and, furthermore, there are inner and outer roles, which wouldn’t be necessary if the object were just a boundary. Also, you’d say that (administratively) you are inside that place, so the object does represent an area.
Adding extra roles like admin_centre or subarea is never an issue, as they can be added to any relation without breaking anything.
The existence of this duplicative tagging has caused many to unnecessarily use type=boundary relations for things that can be represented with a regular area (the only member of the relation is this area).
boundary=* can always exist as a top-level tag and it doesn’t need to be added onto a type=boundary. Other tags like admin_level=* work just fine too.
I’m curious to see what the community’s stance on this matter is now. Are we just too far gone, and would replacing boundaries with multipolygons just break data consumers for no reason?
I’d see type=boundary as a special case of type=multipolygon. It’s particularly useful to better describe what kind of area is represented using boundary=* (see Key:boundary on the OpenStreetMap Wiki). Technically, we could also choose to tag type=multipolygon with boundary=*, but that doesn’t look right.
I think if we want to remove type=boundary on the grounds that the relation represents an area, then we’ll need some other key name instead of boundary=*. But that’s just a name change and I don’t have a better one.
That’s just so unnecessary. We have subtags for that job and, as I see it, there should be as few types of relations (and relation objects) as possible, especially given the issues this creates, as pointed out above.
We don’t use type=road and type=bus but type=route + route=*.
information=* doesn’t look right at first glance either. But both are valid and all it takes is getting used to it and really thinking about the benefits.
Why? I think it’s a great tag concept which is similar to place=* but for objects with defined borders. It’s what I’d consider “location” by the Wikipedia article but with the emphasis on the defined boundaries.
There’s nothing wrong with using it on areas too, which is already prevalent among some tagging schemes. It’s also what I pointed out: people aren’t used to tagging it on areas even though it’s perfectly fine and less confusing and messy.
What I’d change is border_type=*, which has inconsistent naming (border vs boundary) and the obsolete _type suffix. I’d instead use something like admin_unit=*, probably with country-prefixed values as well.
Thanks for asking these questions, because the answer below is not super obvious and will probably surprise you.
These are absolutely necessary. Please see this thread where I discuss why. In short, boundary relations are necessary even in these cases in order to distinguish them from closed member ways of boundary relations, which proliferate in the database.
I don’t get it. Areas that aren’t multipolygons shouldn’t require a relation and aren’t treated any differently by software. That’s even more apparent with the example I provided, which represents only one thing that’s not related to anything else.
Before the community coalesced around type=boundary relations, boundaries were quite routinely tagged as multipolygons. But they were true multipolygons without any extra metadata members. This was also before the great migration from old-style multipolygons (all tags on the outer ways) to new-style multipolygons (all tags on the relation).
subarea breaks stuff routinely by breaking reasonable assumptions, most recently:
Multipolygons are formally defined as having only inner and outer members that are ways. This is one of the more stable, reliable tagging structures in OSM, so data consumers are likely making hard assumptions. I’d anticipate plenty of surprises if we were to redefine this type to have other miscellaneous members like label and admin_centre.
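To make that assumption concrete, here is a rough Python sketch of the kind of strict check a data consumer might apply to a multipolygon. The `Member` class and the function name are made up for illustration and aren’t taken from any real library; the rule itself (ways only, roles inner or outer only) is the simplified definition stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """Hypothetical, minimal stand-in for an OSM relation member."""
    type: str  # "node", "way", or "relation"
    ref: int
    role: str


def unexpected_multipolygon_members(members):
    """Return the members a strict multipolygon consumer would not expect.

    Assumption: a multipolygon consists only of ways with role
    "outer" or "inner"; anything else is reported.
    """
    return [
        m for m in members
        if m.type != "way" or m.role not in ("outer", "inner")
    ]


# A boundary-style member list, as it would look if it were retyped
# to type=multipolygon without removing the metadata members.
city = [
    Member("way", 1, "outer"),
    Member("way", 2, "inner"),
    Member("node", 3, "admin_centre"),
    Member("node", 4, "label"),
]

surprises = unexpected_multipolygon_members(city)
print([m.role for m in surprises])
```

A consumer built around a check like this would either reject such a relation or have to guess what to do with the extra members, which is the kind of surprise I’d expect.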
Most type=* values indicate feature types, primarily communicating semantics that might entail certain geometry. However, a few type=* values solely communicate geometry because the core data model lacks that geometry as a primitive. multipolygon is one of these geometric relation types.
In other words, you’re proposing to eliminate the type=boundary feature type or elevate boundary=* to a top-level feature tag. (It is already a top-level feature tag on ways, a vestige from before we routinely mapped boundaries as relations.) This is such a drastic change to such a common, high-profile tag that it really needs a rationale beyond tidiness or minimalism.
You would not propose adding more roles to =multipolygon, as it’s for making up areas only and has a single meaning. This keeps topological and semantic purity. It’s a substitute for an area object type. =low_emission_zone hasn’t been developed much. Proposing more roles is possible, e.g. relating the =enforcement and =monitoring_station devices.
Boundaries that could reasonably be modeled as simple polygons, that is, ones that don’t share a boundary segment with another boundary, are the absolute exception. Using boundary relations makes them one of the few things we actually tag and model reasonably consistently overall.
The issues with additional elements such as admin_centre and label have already been pointed out.
I forgot to mention in the original post that if you download, let’s say, an administrative boundary, that object will be of a multipolygon data type, so it makes no sense not to consider these multipolygons.
That sounds like an issue on the renderer’s end but either way, it works the same on boundaries as on multipolygons and I’m not saying we have to map subareas.
Those would just be personalized roles for different objects, but what data consumers need to take away is that the inner and outer roles are for the boundary itself and other roles are for something that’s defined by other tags on the relation object.
Yep, so it should be a top-level tag on everything. Maybe similarly type=site relations would be better than type=public_transport for public_transport=stop_area.
Anyways, it’s also used on multipolygons when somebody double-tags a forest for example.
And people add stuff like POI role to route relations which does not break the relation. There’s nothing wrong with using more roles than inner and outer and that won’t twist the meaning of inner and outer.
type=public_transport has public_transport=stop_area_group
That’s not 1 =site , but multiple =site
Having roles only solves how to distinguish them. It doesn’t answer whether a =multipolygon should have them. In GIS terms, a =multipolygon is what it is, and guaranteed to be that. Some applications or processing tools would try to guess what empty or unrecognized roles are, in an attempt to generate valid geometry. A =boundary or =site is e.g. a GeometryCollection or something else, depending on how it’s converted from OSM to other formats, which can be anything.
I disagree that this is an exception, because the database is full of simple, single-way closed polygons that have boundary tagging but are actually an inner or outer member of a boundary relation. In such cases, it is ambiguous whether that polygon represents its own unique boundary or is mis-tagged as duplicating the boundary tagging of its parent relation. This is especially problematic in cases where the simple closed way has a name that differs from that of the parent boundary relation, e.g. “<name of city> Boundary”.
That is why it’s necessary for boundaries to be a relation in all cases for them to be well-formed and understood unambiguously by data consumers, and why validators (including the one I developed for US boundaries) report this as a problem.
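As a rough illustration of the ambiguous case described above, here is a minimal Python sketch that flags closed ways carrying boundary tagging while also being an inner or outer member of a type=boundary relation. This is not the actual validator mentioned above; the dictionary layout, the function name, and the sample data are all assumptions made up for the example.

```python
def ambiguous_boundary_ways(ways, relations):
    """Flag closed, boundary-tagged ways that are also members of a
    type=boundary relation with role "outer" or "inner".

    ways:      {way_id: {"nodes": [node ids], "tags": {...}}}
    relations: [{"tags": {...}, "members": [(type, ref, role), ...]}]
    (Both layouts are hypothetical, chosen only for this sketch.)
    """
    member_way_ids = {
        ref
        for rel in relations
        if rel["tags"].get("type") == "boundary"
        for (member_type, ref, role) in rel["members"]
        if member_type == "way" and role in ("outer", "inner")
    }
    flagged = []
    for way_id, way in ways.items():
        nodes = way["nodes"]
        is_closed = len(nodes) >= 4 and nodes[0] == nodes[-1]
        if is_closed and "boundary" in way["tags"] and way_id in member_way_ids:
            flagged.append(way_id)
    return sorted(flagged)


ways = {
    # Closed, boundary-tagged, and a member of the relation: ambiguous.
    10: {"nodes": [1, 2, 3, 1],
         "tags": {"boundary": "administrative", "name": "Example City Boundary"}},
    # Closed and boundary-tagged, but not a member of any relation.
    11: {"nodes": [4, 5, 6, 4], "tags": {"boundary": "administrative"}},
    # A member of the relation, but not closed.
    12: {"nodes": [7, 8, 9], "tags": {"boundary": "administrative"}},
}
relations = [
    {"tags": {"type": "boundary", "boundary": "administrative", "name": "Example City"},
     "members": [("way", 10, "outer"), ("way", 12, "outer")]},
]

flagged = ambiguous_boundary_ways(ways, relations)
print(flagged)
```

Way 11 is not flagged here, but a consumer that only sees the way cannot tell it apart from way 10 without also checking relation membership, which is the point of requiring a relation in all cases.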
So your issue is that two similar things have the same tagging scheme? I’m not seeing the problem here. Unless you’re also suggesting that type=destination_sign and type=restriction are redundant because they both use the from-via-to scheme. But that still comes back around to “how is that anything other than convenient?”
That’s the data consumer’s fault for not processing only outer and inner elements. Why are we introducing unnecessary complications just to please less advanced software?
I’d even call it the exact same tagging scheme. It’s literally a duplicate that exists just to accommodate a tag that’s already often a top-level tag anyway.
They represent very different things while multipolygons and boundaries represent the same thing.
That’s because it’s a route: the way has its own tags and purpose and represents something, and the relation has its own tags and purpose and represents something else. The way determining the boundary of a low emission zone has no tags and represents nothing, so everything could just be expressed with a way instead of a relation, as is sometimes done, and in those cases a top-level boundary=* tag works fine.
I’d also like to point out that boundary=forest doesn’t need admin_centre, label and subarea so this one doesn’t differ a bit from a multipolygon.
In case you were wondering, it happened because someone submitted a seemingly benign change to Planetiler that copied a superrelation’s tags onto each of its member relations. This was intended for superroute relations (which you might start another thread about eliminating), but it turns out that subarea on boundaries turns boundaries into a different kind of superrelation whose attributes have nothing to do with the member.
Sure, it was a bug, but it was caused by a trap we laid. Allowing multipolygons to contain relations or nodes could cause more severe bugs in software you’ve never heard of. We usually try to be more backwards compatible than that.
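The failure mode described above can be sketched roughly as follows. This is not Planetiler’s code (Planetiler is written in Java, and I am not reproducing its logic); the function, the data layout, and the choice to fill in only missing keys are assumptions for illustration.

```python
def inherit_parent_tags(relations):
    """Naively copy each parent relation's tags onto its member relations.

    relations: {rel_id: {"tags": {...}, "members": [(type, ref, role), ...]}}
    Only keys the member lacks are filled in (an assumption for this sketch).
    The role is ignored, which is what makes the approach fragile.
    """
    inherited = {rel_id: dict(rel["tags"]) for rel_id, rel in relations.items()}
    for parent in relations.values():
        for member_type, ref, _role in parent["members"]:
            if member_type == "relation" and ref in inherited:
                for key, value in parent["tags"].items():
                    inherited[ref].setdefault(key, value)
    return inherited


relations = {
    # Intended case: a superroute passes its network on to a member route.
    1: {"tags": {"type": "superroute", "route": "bus", "network": "Example Network"},
        "members": [("relation", 2, "")]},
    2: {"tags": {"type": "route", "route": "bus", "ref": "12"}, "members": []},
    # Unintended case: a subarea member picks up its parent's population.
    3: {"tags": {"type": "boundary", "boundary": "administrative",
                 "admin_level": "4", "population": "5000000"},
        "members": [("relation", 4, "subarea")]},
    4: {"tags": {"type": "boundary", "boundary": "administrative",
                 "admin_level": "6"}, "members": []},
}

result = inherit_parent_tags(relations)
print(result[2]["network"])
print(result[4]["population"])
```

The route correctly gains its network, but the subarea boundary also ends up with a population figure that belongs to its parent, because nothing in the relation structure distinguishes the two kinds of superrelation.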
None of the boundary relations need these if there is no admin center or reason to specify a label location.
This comes up explicitly a lot in suburban America, where new all-residential districts are established with an HOA required to financially support their own security, streets, water lines, parks, sewer lines, and other civic infrastructure. Low-density single-family developments always cost cities around an order of magnitude more money to provide services to than they bring in through property taxes (and after the bankruptcy of Stockton, CA, almost everywhere categorically denies permits for these developments unless they have an HOA requirement). These districts rarely, if ever, have an administrative center and are frequently only a few blocks across, making the need for a label node also exceptional.
Yeah, I think it’s one of those things that should be left as they are instead of succumbing to some “OMG I need to tidy up this tagging scheme unless someone can demonstrate it has to be that way” reflex.
On the Geofabrik download server, when we split the planet into country-size bites, we complete cross-border multipolygons (so e.g. a natural=water multipolygon lake that lies partly in Germany and partly in Switzerland will be present, with all of its ways, in both country extracts), whereas we don’t do the same for boundary relations. Otherwise, the Germany country extract would contain the complete admin_level=2 boundary of all its neighbouring countries which would be unexpected for users. But obviously if boundaries were changed to become multipolygons then this rule could be modified to “complete cross-border multipolygons but not if they are administrative boundaries” or something like that.
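Written out as a sketch, the two rules look something like this. It is not the actual extract code; the function names and the tag-dictionary input are made up for illustration, and only the rule as described in words above is modelled.

```python
def complete_today(tags):
    """Current rule as described: complete cross-border multipolygons,
    but not boundary relations."""
    return tags.get("type") == "multipolygon"


def complete_if_boundaries_were_multipolygons(tags):
    """Hypothetical modified rule: complete multipolygons unless they
    are administrative boundaries."""
    return (
        tags.get("type") == "multipolygon"
        and tags.get("boundary") != "administrative"
    )


lake = {"type": "multipolygon", "natural": "water"}
border_now = {"type": "boundary", "boundary": "administrative", "admin_level": "2"}
border_retagged = {"type": "multipolygon", "boundary": "administrative", "admin_level": "2"}

print(complete_today(lake), complete_today(border_now))
print(complete_if_boundaries_were_multipolygons(lake),
      complete_if_boundaries_were_multipolygons(border_retagged))
```

The point is that today the relation type alone decides, whereas after a retagging the decision would have to look at boundary=* as well.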
So, to conclude this thread: the only benefit of type=boundary is that it tells data consumers to account for potential other roles, while type=multipolygon relations should consist of outer and inner roles only.
This benefit isn’t even a dealbreaker, and if we were deciding today whether to have type=boundary relations in OSM, we would probably have decided no. But since we’re so far gone, it’s fine to keep them so that they signal something to data consumers.
What I’m asking for, though, is to promote boundary=* on areas to a top-level tag, so that it’s not against the rules to map a paid parking zone as an area instead of an unnecessary relation, or to use a multipolygon for a protected area and for other values of boundary=* that, unlike boundary=administrative, don’t have any specific roles tied to them.