For example, at this location (shown above), I want to add a bridge (the section shown in red). In JOSM, if I split the stream way at that location to create a section for the stream to go under the bridge (ie add a layer=-1 tag), I’m offered the choice of one of the 3 ways, green, red, or pink, to place the original history. But, it seems silly to delete the entire history of green or pink just to put a bridge at red. Is it possible to preserve the edit history on green and pink, or maybe even all 3? Is this maybe an issue that I shouldn’t really be worrying about? It feels like bad practice to overwrite the edit history in this way for such a minor edit.
Probably easier in iD!
There, when you click on the track at that location, it’ll tell you that the track & stream cross, so do you want a bridge? “Yes” & it’ll insert one for you, which you can then adjust for size.
Same thing if you click on the stream, as it’ll then ask if you want a tunnel?
Only do one or the other though, not both!
Thank you for your response, but I don’t think that answers my question. I’m specifically only talking about preserving the existing changesets of a way that is being split — the bridge in my post was just an example — unless you mean that iD actually preserves the changeset history?
The element’s ID can only end up on one of the three resulting elements. So no, you cannot preserve the history in all three: two new ways will be created. I try to keep the element ID on the most significant of the new elements.
Ah, that’s quite unfortunate. Splitting a very long way right in the middle can cause quite a substantial loss of history.
I see your point, yes, the new object starts with version 1.
The good thing in JOSM is that you can choose which way is the new object; iD hides that from the user, I think for good reasons.
I do not see a simple way to detect the “full history”. I have been wanting to do that in order to write a script that gathers statistics on who added maxspeed.
In a changeset you can find the new way and the remainder. What you further know:
- The new way has version 1
- The old way was modified, version = version + 1
- The new way has nodes from the old way
- The new and old ways share 1 node, and that node is the begin or end node
So it is possible to find the old way with good reliability but not that easy.
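The four checks above can be sketched roughly as follows. This is only an illustration of the simple two-way split: the `WayVersion` class and the `find_split_parent` function are names made up here, and the caller is assumed to have already loaded the before and after versions of the ways touched by a changeset. Closed ways and three-way splits would need extra handling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WayVersion:
    """Hypothetical container for one version of a way."""
    way_id: int
    version: int
    nodes: tuple  # ordered node IDs


def find_split_parent(new_way, modified, previous):
    """Return IDs of modified ways that new_way was plausibly split from.

    modified: dict of way_id -> WayVersion as uploaded in the changeset
    previous: dict of way_id -> WayVersion just before the changeset
    """
    # The new way has version 1.
    if new_way.version != 1:
        return []
    candidates = []
    endpoints = {new_way.nodes[0], new_way.nodes[-1]}
    for way_id, after in modified.items():
        before = previous.get(way_id)
        # The old way was modified: version = version + 1.
        if before is None or after.version != before.version + 1:
            continue
        # The new way has nodes from the old way (its previous version).
        if not set(new_way.nodes) <= set(before.nodes):
            continue
        # New and old way share exactly 1 node, an end node of the new way.
        shared = set(new_way.nodes) & set(after.nodes)
        if len(shared) == 1 and shared <= endpoints:
            candidates.append(way_id)
    return candidates
```

As noted above, a split followed by further edits before upload can defeat every one of these checks, so an empty result does not prove there was no split.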
Short answer, this is not possible to achieve with the current version of the OSM API.
When you split a Way into two or more parts, from the API perspective it looks like you removed a set of nodes from the existing way and created a new way reusing those nodes (extracted from the original way). The new way doesn’t have any edit history because it holds no references to its ancestor features.
PS: This is one of the possible areas for API improvement.
Is there any plan to change that in the future?
No. If anything, if we do the transition to linear elements with their own geometry, history will become even more difficult to reconstruct. Currently you can reconstruct the history via the nodes that were part of the original way, but without way nodes that option goes away.
By chance, do you have a source that I could read for more information on these “linear elements” that you’re referring to? This is the first that I’m hearing of this change.
It would be possible though for editors to use the changeset tags to document which, if any, ways have been split in a changeset and which the resulting new object IDs were. Then you could look at v1 of an object and check the relevant changeset tags to trace the history. It would certainly be a novel use of changeset tags but hey, “creative, productive, or unexpected ways”, right?
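To make the idea concrete, here is a rough sketch of how an editor might encode and read back such tags. No editor does this today as far as this thread says; the `split:way:<id>` key scheme and both function names are invented purely for illustration. The 255-character limit on tag values is the one existing constraint the sketch takes into account.

```python
PREFIX = "split:way:"  # hypothetical key scheme, not an established convention
MAX_TAG_LENGTH = 255   # OSM tag values are limited to 255 characters


def encode_split_tags(splits):
    """Build changeset tags from {old_way_id: [new_way_id, ...]}."""
    tags = {}
    for old_id, new_ids in sorted(splits.items()):
        value = ";".join(str(new_id) for new_id in new_ids)
        if len(value) > MAX_TAG_LENGTH:
            raise ValueError(f"too many new ways to record for way {old_id}")
        tags[f"{PREFIX}{old_id}"] = value
    return tags


def find_ancestor(changeset_tags, way_id):
    """Return the way that way_id was split from, or None if not recorded."""
    for key, value in changeset_tags.items():
        if key.startswith(PREFIX) and str(way_id) in value.split(";"):
            return int(key[len(PREFIX):])
    return None
```

Looking at v1 of a way, you would fetch the tags of its changeset and call `find_ancestor` to step back to the way that kept the original ID.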
There are lots of things we could do in editors to hack the OSM data model (in this case I would lean toward simply adding a tag to the new way, which naturally has the problem that it will be inherited by future versions).
My favorite and very very very simple one would be to bump the version number of ways on all geometry changes.
#potlatch1wasright
Again, #potlatch1wasright (admittedly using way tags rather than changeset tags because we didn’t have changesets back then)
On the original topic, object history cleanliness is one of those things that people fuss over but really doesn’t make much difference at all. If someone really wants to research “who changed what” it’s pretty much always possible to do it by looking at the changesets and, if necessary, the member nodes. It doesn’t merit the disruption of changing the data model.
Here I want to share part of a private discussion on API evolution between me and @Minh_Nguyen, in which I explained my vision for improving how the history of features is tracked in OSM.
Disclaimer: For my part, I would like to emphasise that this is just an idea that popped into my head. I have no expectations, obligations, or demands on anyone regarding its implementation. However, anyone is free to take it up. If that happens, I would be delighted if I were mentioned.
Ideas for tracking the history of objects
We now have a change history for each element of the data model, and that’s good. What concerns me is the lack of interconnection: when objects are transformed from one type to another or recreated, they lose their previous history. So when we merge two ways into one, we keep the history of only one of them and mark the other as “invisible”. And when we look at the history of the “new” object, we will not find any trace of its other ancestors, only the one whose history we kept. I propose adding ancestor-descendant references to track the history of changes.
Of course, an object created from scratch will not have any ancestors. In version 1, this parameter should be left either empty/null or have some value that clearly indicates that there are no ancestors. In all subsequent versions, such an object will refer to itself, to its previous version, when it comes to ancestors. When an object is deleted and a new one is created instead, in order to show this connection between the past and the new object, a reference to the newly created object, its first version, is added to the parameters of the deleted object. The newly created object gets a reference to the ancestor object in its latest version.
When merging two or more objects into one (for example, when we merge two or more ways into one), all ancestors end their existence at that moment, their components become elements of the new object, and references to descendants and ancestors should be added accordingly.
When we divide a single element (way/polygon) into several parts, we also consider that the ancestor has ended its existence and two or more descendants have been created instead, and the ancestor and descendants properties are updated accordingly.
This means that one ancestor can have none, one or more descendants, and one descendant can have none, one or more ancestors.
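A minimal sketch of what these references could look like, assuming a simplified in-memory model. The `Feature` class and the `split` and `merge` helpers are hypothetical names, and the sketch references ancestors by ID only, leaving out the per-version references described above.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """Hypothetical element with ancestor and descendant references."""
    id: int
    nodes: list
    visible: bool = True
    ancestors: list = field(default_factory=list)
    descendants: list = field(default_factory=list)


def split(feature, index, new_ids):
    """Split at a node index: the ancestor ends, two descendants begin."""
    left = Feature(new_ids[0], feature.nodes[:index + 1], ancestors=[feature.id])
    right = Feature(new_ids[1], feature.nodes[index:], ancestors=[feature.id])
    feature.visible = False
    feature.descendants = [left.id, right.id]
    return left, right


def merge(first, second, new_id):
    """Merge two features, assuming first ends on the node where second starts."""
    merged = Feature(new_id, first.nodes + second.nodes[1:],
                     ancestors=[first.id, second.id])
    for ancestor in (first, second):
        ancestor.visible = False
        ancestor.descendants = [new_id]
    return merged
```

With this shape, walking `ancestors` recursively from any feature recovers every contributor to its geometry, which is exactly what is lost today when only one ID survives a split.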
We also need to keep in mind the interconnections between the features used to create other features.
A relationship between elements, where some elements are members of others, and tracking their history
Currently, a Way consists of a set of Nodes, and a Relation consists of Nodes, Ways, and other Relations. Ways and Relations contain refs to the IDs of their members. We implicitly assume that these are references to the latest/current versions of the members. However, this does not take into account the fact that the members themselves can change. For example, moving a Node in a Way does not increment the version of the Way, even though the geometry of the Way has actually changed: it is a differently shaped line. Nothing in the current data model indicates such a change in the geometry of objects, that is, a change in their position in space. So in addition to the member IDs, you need information about the member version. Changing the version of a member should trigger a change in the version of the element that depends on it. Thus, changing the position of a Node (changing the shape or position of a Way) would generate a new version of the Way, which is not the case now.
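As a toy illustration of that propagation rule (not how the current API behaves), here is a sketch in which moving a node bumps the version of every way that references it. The `Store` class and its methods are invented for this example.

```python
class Store:
    """Toy element store where node edits propagate to parent ways."""

    def __init__(self):
        self.node_versions = {}
        self.way_versions = {}
        self.way_nodes = {}

    def add_node(self, node_id):
        self.node_versions[node_id] = 1

    def add_way(self, way_id, node_ids):
        self.way_versions[way_id] = 1
        self.way_nodes[way_id] = list(node_ids)

    def move_node(self, node_id):
        """Bump the node, then every way whose geometry it changes."""
        self.node_versions[node_id] += 1
        bumped = [way_id for way_id, nodes in self.way_nodes.items()
                  if node_id in nodes]
        for way_id in bumped:
            self.way_versions[way_id] += 1
        return bumped
```

Note the cost: a single move of a junction node creates a new version of every way passing through it.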
Another issue. Relations (we assume that Ways are also Relations) know the IDs of their members. However, the members of a Relation have no idea which objects they belong to.
In my opinion, it would be nice for the members of Relations to carry information about what they belong to. Something like an array of 'part_of' values. For example, the intersection point of two ways will have information about this if the IDs of these ways are specified in the 'part_of' array in their shared node. Is it necessary to have information about the specific version of the element to which the member belongs in the member properties in some way besides the IDs?
If we separate information about geometry (spatial information) from meta-information, we will have two separate types of changes: one in which we change the geometry, and the other in which we change the properties (meta-information) of the objects of interest. Thus, changing the meta-information of an object of interest will not trigger changes to the versions of the geometry it relies on. So, in general, we can have information about the version of a Relation in its members’ information along with its ID, provided that changing the version of the Relation does not change the versions of its members.
Something like this. One day, I hope this will be brought to life.
PS. Written under the influence of the Twelve-Factor App methodology.
Exactly that - and of course it’s possible to run Overpass queries for a date in the past to see exactly what was there then.
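For anyone who has not tried this, a small sketch of such a query using the Overpass QL `[date:...]` setting. The function names are made up, and the public `overpass-api.de` endpoint is just one instance you could point this at.

```python
import json
import urllib.parse
import urllib.request

OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_history_query(way_id, date_iso):
    """Overpass QL asking for one way as it existed at the given date."""
    return f'[out:json][date:"{date_iso}"];way({way_id});out meta geom;'


def fetch_way_at(way_id, date_iso):
    """Run the query against a public Overpass instance (needs network)."""
    query = build_history_query(way_id, date_iso)
    data = urllib.parse.urlencode({"data": query}).encode()
    with urllib.request.urlopen(OVERPASS_URL, data=data) as response:
        return json.load(response)["elements"]
```

Running this for the dates just before and just after a changeset shows the way's node list and geometry on both sides of a split.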
As a complete aside from the discussion here, I think it’s not necessary to set layer=-1 for features under bridges, and often not really desirable. It is sufficient for the bridge feature to have layer=1. layer=0 is the default and means ground level, and does not need to be tagged except where really needed for disambiguation. layer=-1 should be used for features lower than default, so lower than ground level, so things like culverts.
JOSM asks if you have expert mode enabled, otherwise it will just pick one for you like iD.
That already exists
https://wiki.openstreetmap.org/wiki/API_v0.6#Ways_for_node:_GET_/api/0.6/node/#id/ways
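A short sketch of calling that endpoint, in case it helps. The function names and the User-Agent string are placeholders; the parsing assumes the usual API v0.6 XML layout of `way` elements containing `nd ref` children.

```python
import urllib.request
import xml.etree.ElementTree as ET

API_BASE = "https://api.openstreetmap.org/api/0.6"


def ways_for_node_url(node_id):
    """URL of the 'ways for node' call linked above."""
    return f"{API_BASE}/node/{node_id}/ways"


def parse_ways(xml_text):
    """Extract id, version and node refs from an API v0.6 XML response."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": int(way.get("id")),
            "version": int(way.get("version")),
            "nodes": [int(nd.get("ref")) for nd in way.findall("nd")],
        }
        for way in root.findall("way")
    ]


def fetch_ways_for_node(node_id):
    """Fetch and parse the current ways using a node (needs network)."""
    request = urllib.request.Request(
        ways_for_node_url(node_id),
        headers={"User-Agent": "way-history-sketch/0.1"},  # placeholder value
    )
    with urllib.request.urlopen(request) as response:
        return parse_ways(response.read().decode("utf-8"))
```

This is computed on request from the ways' node lists, so it answers "which ways use this node now" without the node itself storing anything like a 'part_of' array.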
I have briefly thought about implementing descendant references like this in JOSM in the past, by (ab)using changeset tags like woodpeck brought up to circumvent requiring API changes. (Initially for testing, could potentially be added to the API if it gains traction.) It unfortunately didn’t get past the thought experiment due to lack of time.
Good ol’ history=* enters the chat.
Ideally software could do this analysis automatically so we wouldn’t need to do it manually. Unfortunately, changesets aren’t quite the right level of abstraction, since a changeset can consist of multiple uploads (common with JOSM). Beyond that, there’s always the possibility of splitting, then doing all sorts of stuff to the affected elements, and only then uploading. None of those intermediate actions will be saved in the changeset, unless you’re looking at an ancient changeset that was saved multiple times using Potlatch 1’s live mode (and subsequently coalesced into a single changeset anyway).
This situation is reminiscent of Git: it doesn’t explicitly track file moves or renames, so that the “Renamed from” indicator in GitHub is based solely on diffing two files that were added and removed. If you move the file, change it enough, and only then commit your changes, Git will record it as a deletion of the old file and addition of an unrelated new file. In other words, it doesn’t track the action of moving, only the aftermath, which can be watered down.
When it comes to tracking changes across renames, Git is less reliable than Subversion and Perforce but still more helpful than CVS. Still, the software development world largely gets by with it.
In OSMUS Slack, @Mateusz_Konieczny shared a rough Python script for detecting way splits. Despite the caveats, this kind of detection can still be useful to OSM archaeologists. It would be cool to see this functionality work its way into something like augmented diffs, and maybe even the main site’s element history view if we could set appropriate expectations.
This occasionally comes up as a limitation even outside the context of change tracking. Selfishly as a mapper, I would love for the data model to support inverse relations for situations where the child elements are very weakly related to each other, ignoring the disruption that a new element type could cause throughout the software ecosystem. But if we also solve the problem of transcluding unversioned elements by bumping versions across the board, wouldn’t we potentially end up with massive version inflation in some cases?