[RFC] Feature Proposal - landcover proposal V2

Not a native speaker, learned English mostly through listening to pop music - Seems “lost in a forest” @4:04 and “lost in the wood” @1:10 both possible.

Possible, and you would be understood, but it’s awkward phrasing in ordinary speech. Native speakers in conversational speech would say “lost in the woods” or “lost in the forest”.


ZeLonewolf (Brian M Sperlongano), January 10



Try to find a forest in the data. You cannot automatically tell whether there is a forest or just a few trees in a city.

From my purely American perspective, I do not know what “a forest” is. Would one tree be a forest? ten trees? a hectare of trees? A square kilometer?

In American terminology, a forest is just any large area of trees. It would also be correct, but very archaic, to say “a wood” to describe the exact same concept.

Generally, a forest has to be of a minimum size, otherwise it would not work, but having huge areas with trees growing is not necessarily a forest, it could be a tree farm or an orchard, or another kind of tree plantation as well.

The way my question was framed, it was intended to see the “forest” as an ecosystem. I agree there are many different definitions of forests, and the term is not well defined.

Language note: natural=wood and landuse=forest are probably used interchangeably here because the plain (American) English terms are also used interchangeably. In German-speaking areas, by contrast, you have a “Wald” and a “Forst”, which are both tree-related places but are actually distinct concepts. Thus German speakers no doubt expect wood and forest to correspond to Wald and Forst thanks to the false cognates.

in German actually not, the term “Forst” is not or rarely used in daily language when speaking about such areas, it is a formal term used for administration (e.g. “Forstverwaltung” = forest administration, “Förster” (forester)) and while I agree it is related to use of the forest, it doesn’t mean if it is a “Forst” it won’t be a “Wald”, contrarily I would say any “Forst” can also be referred to as “Wald”. If you want to make a point that it is a virgin forest / primeval forest, you would use the word “Urwald” or Primärwald.

Now, because I’m familiar with past discussions, I think what you might be describing when you say forest is something like “land use for forestry”

no, I was not aiming at use, and I would e.g. count clearcut areas as “land used for forestry”, but not as “forest” (no trees no forest)

think about the “rainforest”…

While I can see plainly that “rainforest” contains the word “forest,” given the (rare, virtually never…) opportunity for me to tag a “rainforest,” my first choice would be natural=wood. Indeed, in my area of California, there actually is primeval redwood forest — I have tagged in OSM a particular tree which is revered (the “Big Ben Tree”) that is between 1100 and 1200 years old, and the state park in which it is located has these sorts of trees and hiking trails you are asking us to think about. I know (and love, and have owned property with many upon them…) lovely old, 40-meter tall redwood trees, they are important members of the living fabric we share this Big Blue Marble with and literally “the lungs of the planet.”

In my opinion, a “rainforest” (or what I call a “primeval” forest, sometimes called “old growth” or “first growth” trees) is about as natural a wood as we’re going to find on this good old Earth!

I can’t help but wonder: are these two expressions of the same concept?

PS: My thinking explained: We have this archaic term, now we can use it to describe these archaic conditions.

This gets nuanced and difficult quickly, as while Brian and I both share what might be called the dialect of “American English,” (or US English) there are vast differences in the sub-dialects and even idiolects by its (approximately) 300 million speakers. While we are 100% mutually intelligible, I live on the West Coast, he lives on the East Coast and I am certain each of us can hear quite noticeable vowel differences, choices of vocabulary, even minor syntactic differences (e.g. some pronoun shifting has been and is underway in many sub-dialects of US English). This includes concepts like the tangle of what he means by “a wood,” I mean by “primeval (or old growth) forest,” you mean by “rainforest,” and OSM means by natural=wood.

All that said, we are using crude things called “words” (or “tags”) to denote only a few at a time of what are truly a vast number (relatively speaking) of subtle and complex semantics, for example, distinctions between trees which are ancient, harvested, referred to in a generic way, in a certain global ecological sense…there are dozens or hundreds more. That’s what makes this so difficult.

So when you ask “are these two syntactic (whether spoken, written or digital) denotations the same semantic concept?” the answer is most likely “no, they are not,” but it is quite likely they are “highly related.” This sort of denotational semantics (as it is known) is fairly straightforward for artificial languages (like C, Java or SQL), but for natural languages (like English or thousands more) it is very difficult to capture and express the myriad nuances that are extant. In programming, denotational semantics is somewhere around an upper-division undergraduate computer science course (or two). With natural languages, it is more like this exact discussion, the style of which has been going on for as long as humans have had language, and even after humans become extinct, shall continue to do so.

Just for curiosity, I can translate Brian’s quote directly into German:

In German terminology, a Wald is just any large area of trees. It would also be correct, but very officialese, to say “a Forst” to describe the exact same concept.

In German language, the use of natural=wood for rainforests is in no way special, it is all Wald/wood, where there are trees in large amounts :slight_smile:

And while it might feel somewhat self-congratulatory to say “in a language I know and am comfortable with, I find some solid alignment…” and that is GOOD, especially in a topic like this that drones on and on where finding some “common ground that can build consensus” is to be valued, (here comes the part where I will be called “the fly in the ointment” meaning “that which spoils what might be good”) there will always be somebody who not only disagrees, but is able to give a valid argument for why they believe what they believe.

Sure, we have dictionaries for words, and they really are useful tools in any given language. We even have what are somewhat less-exact (or even more precise, in some circumstances) wiki pages for our tagging which act like “dictionaries for tags.” But there are so many billions of people on Earth and so many languages, land uses, history of interactions with forests and trees (going back tens of thousands of years in some cultures, especially forest-rich cultures) that nailing down “exact aspects” of language to the extent OSM can tag well is still (still!) quite difficult.

Yet, I’d like us to continue. As I don’t want to grandstand or use up all the oxygen here, I step aside.

I believe that a point that was not clear enough and that is serving as an obstacle to the advancement of this proposal and similar ones, is the answer to a question.

What is the objective of OSM?

There are two possible answers, and both are valid, but the community needs to decide which one is the most important and should be followed.

Answer 1

If it’s just for routing and points of interest, which is where most of the community and customer interest lies at the moment (whether those customers are renderers or data consumers), then this proposal, and the one I recently tried to make regarding landuse and the “conflict” that also exists with natural and surface, become irrelevant.
In fact, the changes they bring to OSM are, in this context, not significant improvements, so they are a waste of time. In this case, I agree and will abandon any proposal that involves this type of context.

Answer 2

Now, on the other hand, if OSM can be expanded to multiple different uses, then this and similar proposals that pursue this goal are indeed important and should be considered.
According to the text in this link, the foundation itself promotes the growth, development and distribution of free geographic data for anyone and for any purpose.

In my view, OSM has enormous potential to be used in the educational and scientific context, which is where it currently fails. To achieve this, if we say that this project is a geographic information system, it must be able to represent reality as faithfully and accurately as possible.
There will then be a need to create, change and remove keys so that they best represent reality, and not those that best adapt to a routing system and points of interest exclusively.

If the community agrees with this objective, then arguments that address issues such as massively used tag transition or backward compatibility as obstacles no longer make sense, as the objective is to improve what is already a defect in OSM.

No one proposes to change all at once, that is, that the proposal is accepted and the next day indications begin to appear that x is deprecated and replaced by y.
First, let’s think about and obtain a system that organizes the keys correctly with these contexts, and then, create a transition plan that changes the data gradually.


Hm, just as I was headed towards the exit here…

This is (respectively and respectfully) a gross oversimplification and no, we do not.

What this sounds like is an agenda looking for a reason to exist.

OSM performs amazing tricks every day in the educational and scientific communities; I have used it to co-teach two classes (one in computer science, one in environmental studies) at the University of California.

A versioning system that can state precise boundaries (between proposals, development phases, alpha and beta declarations…) requires a sophisticated process and toolchains, and we are pretty messy with both today. It would help. What we roughly call “landcover v2” is but one cog in how this might evolve. There is a bigger machine into which this (and other blurry syntax) flows so that it can continue to improve. Yes, it is important to state clearly what OSM’s “objectives” are; we do on our wiki home page:

in creative, productive, or unexpected ways

This absolutely demands that OSM remain OPEN, our first name. Open to being flexible and (at the same time) open to being “exact” or “precise” or “high precision” or “accurately describing the world around us” and now we open up some very big cans of complexity.

Our project can grow with multiple tags providing multiple “services” to downstream consumers because we already do that. Tags DO co-exist. It makes for confusion until one realizes this. It seems our job is to express our ontology more clearly. That isn’t easy, especially as it seems one goal (this proposal) requires fitting into a larger goal (one which propels the project forward with a way for us to more clearly describe how we do things). This could be a rough sketch of categories to start with; our wiki and tools like taginfo and OT queries (when the data are visual and geographical) shoulder some of this burden now. I might imagine fancy clickable tree-diagrams of our syntax but that’s a little dreamy and futuristic. But we need to “better show ourselves” how we have “multiple ways to say things.”

Clamping my hand over my mouth already.


Speaking as someone from BC, Canada, where we have a lot of areas covered with trees and as someone who’s worked in the industry…

  • forest and wood are virtually interchangeable, but wood is seldom used
  • wood is often used to evoke a sense of a natural untouched place that is scenic. It’s not that way on the ground, but that’s what people who never go to these remote places think
  • a forest can be as equally ancient and scenic as a wood
  • The language is interesting around plurals and such. You’d go for a walk in the woods, but not the wood, or in a forest, but not the forests.
  • a city park could have a forested area, but it would generally be described as trees. e.g. the city says parks have an area with mature trees and beautiful flower beds, not a forest or woods.

From all the conversation above, I believe it is indisputable that people have different definitions for forest and wood, and they overlap. This means it will in practice be impossible to distinguish between natural=wood and landuse=forest.

Even in my search of what the area with trees was called in a local park, I found different usage in documents from the same city department. The idea that we will be able to come up with definitions that people around the globe will agree upon and consistently use is absurd. As a data consumer, I never distinguish between natural=wood and landuse=forest. In fact, they get transformed to the same thing as early as possible in the processing toolchain.
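
To make that last point concrete, here is a minimal sketch of what such an early normalization step might look like. It is an assumption for illustration only, not any particular consumer's actual toolchain: the names `TREE_TAGS` and `land_class` and the output class "trees" are hypothetical.

```python
# Hypothetical consumer-side normalization step (illustrative names only).
# Both tagging schemes are treated as the same thing: "there are trees here".
TREE_TAGS = {
    ("natural", "wood"),
    ("landuse", "forest"),
}


def land_class(tags):
    """Return "trees" if any tag marks a tree-covered area, else None.

    `tags` is a plain dict of OSM key/value pairs for one object.
    """
    for key, value in tags.items():
        if (key, value) in TREE_TAGS:
            return "trees"
    return None


if __name__ == "__main__":
    print(land_class({"natural": "wood"}))
    print(land_class({"landuse": "forest", "name": "Example"}))
    print(land_class({"landuse": "residential"}))
```

Once objects pass through a step like this, nothing downstream can tell which of the two tags the mapper originally chose, which is the sense in which the distinction disappears for the consumer.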


If I understand correctly, you state that OSM is in fact a project that aims for multiple uses. And you consider that there is an obstacle when it involves changing keys that are already heavily used.

Therefore, you would be thinking first about how to apply such changes and not exactly planning a system to solve a problem and then, with the solution, planning how to implement it?

I and OSM do not, here and now, aspire to “do all that” as it seems you put words into my mouth. There are logical false choices inherent in your understanding of what I say here, which does not seem correct or as I intend. Such clarity of understanding can be difficult, I mean no disrespect. Simplistic choices lead to squabbling which we must avoid.

I’ll stick with my hand clamped over my mouth (so nobody will be tempted to put words in it) and repeat: it seems we need to better express amongst ourselves that we have multiple ways of saying things (to wit and especially) “landcover” where distinctions between wood and forest are virtually hopelessly blurry, and yet multiple tags does not seem to hinder our present or future. I have said (and continue to believe) that if we examine this (at simple and perhaps increasingly sophisticated levels of detail with perhaps tools we develop and extend) we will see this better amongst a wider audience of ourselves and this will lead to both better tagging, better tag growth and development and better renderers / overlay layers / end-use cases we have yet to imagine.

The fancy click-on-the-root-or-branch-or-leaf of at first a nugget (like landcover, but that’s a good-sized iceberg tip) and increasingly our entire ontology, well, that’s a dream, but it rough-sketch blue-sky imagines a direction that can enlighten us along these lines. It seems it will naturally continue in good organic directions after people (at a simple level) realize “hm, we could ‘simplify’ tagging in this sub-realm, what about a stepwise approach, realizing that multiple versions “run together” right now…?”

I’m both dancing as fast as I can and trying to let others continue this conversation. Thanks, Paul.

Please step up with constructive criticism or real contributions, it does seem this nudges forward (after almost a year).

It wasn’t my goal to put words in your mouth, as you said I did.

What I want to understand is: what is the real impasse for the community regarding this type of proposal?

Is it because of the way OSM currently works that you think such changes will do more harm than good?
Even though it has been said several times that the process will have to be gradual?

I know it wasn’t your goal, I said “it seems” (or feels) that way to me. Again, I mean no disrespect, we are having a misunderstanding.

“Such changes” have been stated here by many august contributors that “they” (this proposal we assume) are a solution in search of a problem. What I see this as a symptom of is that OSM seems to be doing a poor job presently of “explaining” (describing, denoting, showing in wiki pages, OT query maps if appropriate, taginfo quantity heft, fancy futuristic syntax diagrams with pop-out clickable branches and leaves…) these complexities to ourselves.

If we start out simple, explaining “we have multiple methods of tagging landcover now…” in a way where people can see how, and choose what is best for them (as we already have multiple methods), this explanation, visualization, denotation… can only help better tagging to evolve. By better tagging I mean perhaps something is seen in an “a-ha” moment, the lightbulbs go off in many minds, OSM heads begin to nod…this is the organic way we already do a lot of things. But for tag improvement and development, let’s start with simple stripped-down diagrams (we did this for train_station recently, and it was fruitful, with things like good use of color in a simple block diagram really turning out to be helpful and visually communicative).

I do not want this to get too specific (by me or anybody), it must “rise up from the grass roots.” What we seem to have here is some astroturf (it seems like it is grass roots but is it?) that not an enthusiastic crowd around here wants on our front lawn. I’m being kind and offering a way to “make a silk purse out of a sow’s ear,” here [1]. We need to do this work sooner or later, really, and we are mature enough (as a project) to get the ball rolling.

[1] Making a silk purse from a sow’s ear

Edit: Added footnote

Well, I agree with you: the approach has to be a smooth transition, showing that, among the multiple ways of doing it, the most recent proposal (this one) brings benefits in simplicity, logic and intuitiveness. I believe that the authors of this proposal have precisely this objective.

As you will certainly have seen over the years, the proposal always had the objective of changing a lot of tags, no one denies that.

But in order to achieve a smooth transition, I ask for the creation of a system that classifies all types of cases, based on systems created by respective entities. At this stage it doesn’t matter if you just want to create, replace values that are in other keys, delete values…

Once we have a system that would work if put into practice, we would think of a plan for a smooth transition. It would certainly start with values that do not generate conflicts and only add information; we would then end up in a scenario with duplicated information, and with the current OSM system there is no way to escape this. Gradually, both the community and data consumers would see its usefulness.

This, of course, describes a perfect scenario where everything goes perfectly; obviously it won’t be like that.

Even “after almost a year,” this feels to me to be in its earliest of stages. I’m talking about sketching some new ways for us to “better reveal our own data to ourselves more smartly” and you are way ahead of that (nothing wrong with that, but it is much further along than what I’m saying are smart ways to build some highway towards this being much more clear and better-working in OSM’s future).

It is 0136 here and I need sleep. Thanks for dialog and please, I encourage all others reading to contribute to this, even if it is to simply mull it over as a series of steps-along-a-number-of-ways. There seem to be a lot of moving parts, many have sketched in some chalk and watercolor of good ideas.


In German language, the use of natural=wood for rainforests is in no way special, it is all Wald/wood, where there are trees in large amounts

not really, it depends on context and species, e.g. there is the word “Hain” like in Olivenhain, or there is Plantage as in Kokosplantage (although palm trees aren’t considered trees I believe, so this could be questioned)

Yes, it does more harm than good.

Data consumers today have a way to process “there are trees here”.

If you change this, it will cause problems with data consumers of all stripes. We don’t have a mechanism for change management and we don’t even have any way to contact data consumers and inform them of the change. We make changes, we wait for software to break, and expect data consumers to “deal with it”. Therefore, the path that makes OSM work the best for data consumers is not to change tagging systems that have millions of objects tagged.

There are also no real benefits to changing things. “Better tag organization” is not a real benefit, it’s just something that makes mappers that like things better organized happier. It makes data consumers sad.

