[RFC] Feature Proposal - landcover proposal V2

“Undesirable” doesn’t mean “impossible” or “never happens” or even “written in stone.” It often means hassles for many, even as it might provide benefits for some.

I do agree that “breaking old and unmaintained software is not necessarily and always a bad thing.” I think most of us might agree to that with some chin-stroking / serious consideration, as the “real world” does this (over time). My point is that, as OSM recognizes we will continue to want to do this, difficult as it is, we might fit it into a more structured approach, which I loosely referred to as “versioning,” but which would actually be more complex than simply assigning version numbers to phases of deprecation.

I appreciate the real patience, politeness and open-hearted attitudes I have been seeing here; thanks.

Thank you for this reminder of what OSM might consider a “partially or mostly-successful large-scale (global) essentially-but-not-full-scale deprecation” (landuse=farm “becomes” landuse=farmland). I remember when this began to occur in 2015-2017. I do not recall the specific reasons (something about rampant mis-tagging due to misunderstandings, largely in Germany), and it did take me a year or so to get used to using the new tag. Yet we still have thousands of the old tag, even though our Standard renderer distinctly chose to deprecate its display.

A useful example; let’s continue to discover what we might further learn from it in the instant case!

So would you look with good intentions and support a proposal for a versioning system for OSM?

Of course. Though I am largely befuddled at where and how to begin. And again, “versioning” is a loose and maybe not effective way to say it. What we’re talking about is “how OSM might deprecate existing tagging to support newer, Approved tagging in a way that OSM’s mapping / producer community and downstream consumers find acceptable.” In a structured way that can work for additional deprecations, should there be more. (We seem to find ourselves realizing, “there will be more”).

This is a big, big task, everybody, let’s not kid ourselves. With the excellent example of farm → farmland, we do have evidence this can be partially / somewhat / even mostly successful. That is one case, with its own particular reasons, and it seems to have something to teach us.

I suddenly feel like someone at the top of a mountain who has kicked over a snowball, and it is now rolling downhill, gathering both mass and speed. I say it a lot (and don’t often follow through), but I really want to step aside. This is hard work and while I don’t shy away from it, success here will only come from OSM’s usual voyage of many more of us contributing to these efforts together.

I am consistently confused by the notion that landuse=forest and natural=wood are exclusive of each other. In my area, large swaths of wooded terrain are not economically interesting. They are there to protect infrastructure from natural disasters, e.g. avalanches. They are groomed just as well, and owners receive subsidies to do that, as otherwise they’d be financially ruined. There are also areas with no trees at all, but governed by forestry law, i.e. landuse=forestry.

I think there is a lot of romance on the subject matter. I fear this is not helpful in getting to a concise, reproducible recipe for mappers.

I remember a proposal that set out to deprecate landuse=forestry. I quite sympathised with it. It only got approved after dropping the deprecation, see https://wiki.openstreetmap.org/wiki/Tag:boundary%3Dforest. Unfortunately it is of no use at all in my area, as the conditions are not met.

So, if anything, natural=wood is the closest to landcover=trees; the only difference is that landcover=trees is not affected by the size of the area. Perhaps it could even be used to map a single tree as an area?

Yep.

I think the key challenge here is defining exactly what it is that you’re trying to solve. This is where the vast majority of tagging proposals fail. In this case, we’ve got at least three tags that are interpreted by data consumers to mean “a place where there are trees”.

So what are some possible problems?

  • Mappers don’t have a way to indicate that there are trees here. This isn’t a problem, because mappers in fact have three ways to tag trees.
  • Data consumers don’t have a way to determine whether there are trees here. This is also not a problem because data consumers can simply consume all three of the tags that mean “there are trees here” and they’ll accomplish their goal.
  • Mappers will fight over which tag to use to indicate that there are trees here. This is a possible problem, but in practice it doesn’t seem to happen. It turns out that mappers don’t really care that much about which tag is used.
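To make the second bullet concrete, here is a minimal sketch of a consumer that treats the tree tags as equivalent. It assumes the three tags in question are the ones named in this thread (landuse=forest, natural=wood, landcover=trees); the name `has_trees` and the plain dict of tags are made up for illustration and not taken from any real consumer.

```python
# Tags assumed to mean "there are trees here" (the three discussed in this thread).
TREE_TAGS = {
    ("landuse", "forest"),
    ("natural", "wood"),
    ("landcover", "trees"),
}


def has_trees(tags: dict) -> bool:
    """Return True if any of the tree-indicating tags is present on the object."""
    return any(tags.get(key) == value for key, value in TREE_TAGS)
```

A consumer written this way does not care which of the three tags a mapper picked, which is why the second bullet is not really a problem.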

…and so on. I can’t come up with an exhaustive list of possible problems, but it’s really really important that you have a coherent problem that you can articulate, has real impact, and everyone can understand, before proposing a solution. Here’s a recent example from the United States of a proposed tagging change where I describe what I think the problem is and why the tagging change should be implemented.

The same analysis can be applied to the problem of how we create backwards-compatibility and versioning for data consumers. Do you understand the problem? Can you define it clearly for everyone? That is step 1. Only once you truly understand the problem can you start thinking about solutions.

The latter quite possibly has a great deal to do with the former, but I could be wrong.

OSM has known for a long time that we have more than one way to either say (tag) something IN our data or do (render, display, route, geocode, overlay with specific tags…) something WITH our data.

Proposals that want to deprecate tags? Again, a very uphill climb. And if we’re going to fix this, let’s not continue to re-invent the wheel (of deprecation), but at least begin to talk about some smarter methods with which we might ease that transition — what’s being called “versioning,” which must be much more than suffixing .1 on the end of something or bumping v1 to v2, though it does include sane time, manner, place and circumstances where it is wise and correct to do that.

Honestly, I’m exhausted.

Would you say that this approach of suffixes or even prefixes in order to distinguish “versions” is completely unreasonable or could something be planned and built with this method?

Are you talking about version tagging on individual objects, to be maintained by individual mappers, something like what was done with public transport v2? Could you give an example? I’m finding it hard to visualize the abstract concept.

I’m also trying to understand what scenario we’re talking about.

It could be:

  • key.v1=value
  • key=value.v1
  • v1.key=value
  • key=v1.value
  • key:v1=value
  • key=value:v1
  • v1:key=value
  • key=v1:value

And if so, is any one of these inconceivable, impractical in the long term, and bound to bring only more work? Or could something be built on this?

“Versioning” would be a comprehensive system of integrating into OSM newer tags that have been Approved, using a scheme which deprecates old tags to be replaced. It would involve the identification of downstream use cases, and a phased approach to the introduction (of new) and deprecation (of old) in a way that is well-understood (and anticipated) by the community so as to minimize the disruption of “losing features” when a tag deprecates. It is a great deal of specification, social harmony, technical execution and agreement. The specific “numbering of versions” would be a tiny aspect of all this in comparison to what I outline above.

OSM already does something (crudely) like this, when tags are newly introduced which duplicate or somewhat overlap existing tagging. But now, it is a “free for all,” rather chaotic anarchy, and we have three (or more?) ways to denote “a wooded area.” As in the example of farm → farmland, OSM has some (modest, imperfect) success at doing this, but seldom if ever with a project-wide emphasis on minimizing impact to those who use the older tag.

“Zooming out” even further (from being abstract about Versioning), Brian’s comments about “Proposals must define a specific solution” should guide new Proposals that introduce new tags which replace old tags, causing them to be deprecated. For Proposals that do this, recognizing the common ground in how tags get deprecated in general can lead to a boilerplate approach to such deprecation, from which a “phased approach” (versioning) can and should emerge.

Public_transport v1 becoming v2 is another example, but it is seen that these exist simultaneously rather than v2 completely replacing v1; v1 wasn’t really deprecated, as it is so well-supported and widespread.

I hope that helps. I’m being very high-level here, deliberately avoiding the specifics.

I don’t think embedding versioning within the tagging is a viable solution because it’s just too weird and different from how we’ve been tagging. Plus it puts extra data in the database that doesn’t add information.

What is needed is a way to access the data in a version-independent way. In other words, if I need “version 1” OSM data, I get it in that format, and if I need “version 2” OSM data, I request it from the same source, but it comes out in a different format. Then mappers can muck about in the tagging and as long as the translation software keeps up to date, then the data consumer doesn’t have to care about the version differences and we can put known service lifetimes on how long we’ll maintain “version 1” before a data consumer must upgrade.
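As a rough sketch of what such translation software could look like, the snippet below rewrites an object's tags into whichever "version" the consumer asks for. Everything here is hypothetical: the `translate` function, the mapping table, and the choice to treat landuse=farm as "version 1" and landuse=farmland as "version 2" are assumptions for illustration only, not an existing OSM service.

```python
# Hypothetical table: "version 2" tag -> its "version 1" equivalent.
V2_TO_V1 = {
    ("landuse", "farmland"): ("landuse", "farm"),
}
# Reverse table, derived so the two directions cannot drift apart.
V1_TO_V2 = {old: new for new, old in V2_TO_V1.items()}


def translate(tags: dict, version: int) -> dict:
    """Return a copy of `tags` expressed in the requested schema version (1 or 2)."""
    table = V1_TO_V2 if version == 2 else V2_TO_V1
    out = {}
    for key, value in tags.items():
        # Tags without an entry in the table pass through unchanged.
        new_key, new_value = table.get((key, value), (key, value))
        out[new_key] = new_value
    return out
```

For example, `translate({"landuse": "farmland"}, 1)` yields `{"landuse": "farm"}`, so a consumer pinned to "version 1" keeps working while mappers use the newer tag.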

Bottom line, I think this is a software problem, not a schema one.

Furthermore, this problem is effectively solved with standardized schemas. So if you’re using OpenMapTiles, Shortbread, Overture, Daylight, etc., those schemas effectively handle the versioning for you. So there’s a great argument that this is already solved upstream.

If I understood correctly, what you mean by “versioning” is the method of transitioning from one data scheme to another, one that involves replacing actual data.
If I asked what the difference is between this method called “versioning” and what this proposal attempts, would you say the latter is an “anarchy method”? Meaning that each proposal’s author chooses their own method, which makes it difficult to be consistent in this process of transitioning?

If farm -> farmland was an example of a transition of a well-used tag, how was it done? How did mappers and data consumers make the transition?

Yes

This proposal proposes that we will accept that some number of data consumers will break and/or have bad data after some period of time, and that we should be okay with that because the data model will be better organized for future data consumers.

I agree that this is a software problem; the software would be the one choosing and managing the version to operate on.

Then I ask you, or anyone who knows: how do those software projects manage versioning, and what exactly do they version? At the moment we don’t have a “versioning system” (otherwise we wouldn’t be debating this), so what would their “versioning system” be?
Could we somehow help them with the data itself?

As for the suffix or prefix approach, we agree that it would somewhat duplicate information, but wouldn’t it also help this software choose the data version more easily?

We don’t need more than 2 versions: 1 in use and 1 for upgrade.
Of course we can’t instantly make the upgrade version the one in use and delete the one currently used. We would need to incentivize the upgrade.

As someone that develops several pieces of software that consume OSM data, the answer is very simple:

When data changes in OSM in a way that I didn’t expect, my software stops working the way I expect it to.

I then have to go and find out why it broke. Often it is simply a mapper making a mistake or mapping in a weird way. But it could also be that there were multiple tagging schemes for something and I was only aware of one of them. So I then need to either modify my software or address the tagging dispute with the community or allow my software to remain broken.

As someone who is “plugged in” to the community, this is easier for me to do. For someone who is just developing software that uses OSM data and isn’t involved in the community aspects or the people or personalities, dealing with these differences can be a real headache.

If it sounds like this is a really hard problem to solve, it is, and that’s why it hasn’t been solved yet.

I agree with what you said. So let’s consider only the following hypothetical scenario.

We have the current data mapped as key=value, and now let’s say we add a way of mapping it as version:key=value.
You could say, with reason, that this duplicates information. By duplicating we get 2 types of data: the one without a version property and the one with it.
Couldn’t you, as a developer of software that retrieves this data, filter the data by whether it has the version property or not?

Example

natural=sand in the normal way we map

and

v1:landcover=sand with a version prefix.

Do we agree that if we filter the normal way, only natural=sand would match, and if we filter by version, only landcover=sand would match?
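A minimal sketch of that filter, assuming the hypothetical `vN:` key prefix from the example above (the `select_version` function and the prefix pattern are made up for illustration; nothing like this exists in OSM tagging today):

```python
import re
from typing import Optional

# Hypothetical convention: versioned keys look like "v1:landcover".
VERSION_PREFIX = re.compile(r"^v\d+:")


def select_version(tags: dict, version: Optional[str] = None) -> dict:
    """Keep only the tags of one version.

    With version=None, return the tags that carry no version prefix.
    With version="v1", return the "v1:"-prefixed tags with the prefix removed.
    """
    if version is None:
        return {k: v for k, v in tags.items() if not VERSION_PREFIX.match(k)}
    prefix = version + ":"
    return {k[len(prefix):]: v for k, v in tags.items() if k.startswith(prefix)}
```

With `tags = {"natural": "sand", "v1:landcover": "sand"}`, filtering the normal way returns only natural=sand, and filtering by `"v1"` returns only landcover=sand.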

If not, discard all the following text.

If so, great, but there would be a problem: anyone could then create v2, v3, v… indefinitely, right? What would be the criteria for new versions?

The solution I would consider most reasonable: proposals that only intend to add information, not replace or deprecate, are simply added, and the other type of proposals are compiled together.
This process of creating a new version would run on a yearly cycle. In that year, anyone could propose a change, deprecation (or addition) and reach a consensus with the authors of the other proposals so that we don’t have conflicts between them. After a year, all the proposals that would conflict with the rest of the new version would be set aside for future versions, and the rest would be approved for integration into the new version.

With this system, developers like you could work on rendering the new version so that it is supported. The previous one would keep operating and rendering.

What if we add new values or keys to the “default version”, meaning the one with simply key=value (although I would advise deprecating this one after a while and keeping only the prefixed versions)?
If the new values are only additions, never replacing or deprecating values, the process would stay the same as today; nothing would change.
But what about new replacements and deprecations?
That would be the new process, the version prefix system: they would be proposed for the new version and integrated only if there were no conflicts.

Is all this pure fiction and impossible in reality, or could we do it?

These schemas don’t really solve the problem. They break just the same as any other data consumer when a tagging style they recognize is replaced with one they do not in the source OSM database. An actual versioning and backwards compatibility solution for OSM would allow a period of time where mappers could submit either v1 or v2 tags to the API and consumers could get data out with either v1 or v2 tags. A translation layer on both ends would be required. With the tag agnostic philosophy of the project this seems unlikely to happen.

Being realistic here, can we more-widely acknowledge that after nearly a year and almost 200 posts in this topic that the OP proposal for landcover is “not viable” any longer? It certainly has stalled (evident in its wiki’s Talk page, too) and we have seriously skidded away from the original topic into a meta-discussion about “versioning” and greater-encompassing topics of tag deprecation. This is a huge, deep, very difficult topic, and deserves its own (new) topic here, if not an entire Working Group within OSM to address for the longer term. Thanks for your consideration.

A proper solution could be to store references to tags instead of tags themselves.

That way when a tag change is proposed all older tags will be updated in the new schema but still work in the old one, and all new tags will be backwards compatible with the old schema.

So, proposal authors could specify which tag a new tag will appear as in the old schema. For example, you could specify that highway=busway would look like highway=service for data users on the old schema.
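The fallback part of this idea could be sketched roughly as follows. This only illustrates the declared-fallback lookup, not the storage of tag references itself; the `FALLBACKS` table and the `for_old_schema` function are hypothetical names, and the single entry is the busway example above.

```python
# Hypothetical fallbacks a proposal author might declare: new tag -> older equivalent.
FALLBACKS = {
    ("highway", "busway"): ("highway", "service"),
}


def for_old_schema(tags: dict, known: set) -> dict:
    """Rewrite tags so a consumer that only understands `known` (key, value) pairs can read them."""
    out = {}
    for key, value in tags.items():
        pair = (key, value)
        seen = set()  # guards against accidental cycles in the fallback table
        # Follow declared fallbacks until the consumer recognizes the tag
        # or no further fallback is declared.
        while pair not in known and pair in FALLBACKS and pair not in seen:
            seen.add(pair)
            pair = FALLBACKS[pair]
        out[pair[0]] = pair[1]
    return out
```

A consumer that already understands highway=busway gets it unchanged; one that does not gets highway=service instead.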
