Here I want to share part of a private discussion on API evolution between me and @Minh_Nguyen, in which I explained my vision for improving how the history of features is tracked in OSM.
Disclaimer: For my part, I would like to emphasise that this is just an idea that popped into my head. I have no expectations, obligations, or demands on anyone regarding its implementation. However, anyone is free to take it up. If that happens, I would be delighted if I were mentioned.
Ideas for tracking the history of objects
We now have a change history for each element of the data model, and that’s good. What concerns me is the lack of interconnection: when objects are transformed from one type to another or recreated, they lose their previous history. For example, when we merge two ways into one, we keep the history of only one of them and mark the other as “invisible”. When we then look at the history of the “new” object, we find no trace of its other ancestors, only of the one whose history we kept. I propose adding ancestor-descendant references to track the history of changes.
Of course, an object created from scratch will not have any ancestors. In version 1, the ancestor parameter should either be left empty/null or hold a value that clearly indicates there are no ancestors. In every subsequent version, the object's ancestor is simply its own previous version. When an object is deleted and a new one is created in its place, the link between the old and the new object is recorded in both directions: the deleted object gets a reference to the first version of the newly created object, and the newly created object gets a reference to the latest version of its ancestor.
When merging two or more objects into one (for example, when we merge two or more ways into one), all ancestors end their existence at that moment, their components become elements of the new object, and references to descendants and ancestors should be added accordingly.
When we divide a single element (way/polygon) into several parts, we also consider that the ancestor has ended its existence and two or more descendants have been created instead, and the ancestor and descendants properties are updated accordingly.
This means that one ancestor can have none, one or more descendants, and one descendant can have none, one or more ancestors.
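To make the many-to-many idea more concrete, here is a minimal sketch in Python. Everything in it is my own illustration and not part of any existing OSM API: the `Lineage` class, its method names, and the `(type, id, version)` tuple used as a reference are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# Hypothetical reference to one version of one element, e.g. ("way", 10, 3).
Ref = Tuple[str, int, int]


@dataclass
class Lineage:
    """Two-way ancestor/descendant links between element versions."""
    ancestors: Dict[Ref, List[Ref]] = field(default_factory=dict)
    descendants: Dict[Ref, List[Ref]] = field(default_factory=dict)

    def create(self, new: Ref) -> None:
        # An object created from scratch has an empty ancestor list.
        self.ancestors.setdefault(new, [])
        self.descendants.setdefault(new, [])

    def link(self, old: Ref, new: Ref) -> None:
        # Record the relation in both directions.
        self.ancestors.setdefault(new, []).append(old)
        self.descendants.setdefault(old, []).append(new)
        self.ancestors.setdefault(old, [])
        self.descendants.setdefault(new, [])

    def merge(self, olds: List[Ref], new: Ref) -> None:
        # Several ancestors end their existence, one descendant appears.
        for old in olds:
            self.link(old, new)

    def split(self, old: Ref, news: List[Ref]) -> None:
        # One ancestor ends its existence, several descendants appear.
        for new in news:
            self.link(old, new)

    def all_ancestors(self, ref: Ref) -> Set[Ref]:
        # Walk the ancestor links transitively to recover the full history.
        seen: Set[Ref] = set()
        stack = list(self.ancestors.get(ref, []))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.ancestors.get(current, []))
        return seen


lineage = Lineage()
lineage.create(("way", 1, 1))
lineage.create(("way", 2, 1))
lineage.link(("way", 1, 1), ("way", 1, 2))  # ordinary edit: v2 points to v1
lineage.merge([("way", 1, 2), ("way", 2, 1)], ("way", 3, 1))
print(sorted(lineage.all_ancestors(("way", 3, 1))))
```

With links like these, the history of the merged way would include both of its predecessors, not only the one whose ID happened to survive.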
We also need to keep in mind the interconnections between the features used to create other features.
Relationships between elements that are members of other elements, and tracking their history
Currently, a Way consists of a set of Nodes, and a Relation consists of Nodes, Ways, and other Relations. Ways and Relations contain refs to the IDs of their members. We implicitly assume that these are references to the latest/current versions of the members. However, this does not take into account that the members themselves can change. For example, moving a Node that belongs to a Way does not increment the version of the Way, even though the geometry of the Way has actually changed: it is now a differently shaped line. Nothing in the current data model indicates such a change in the geometry of an object or in its position in space. So in addition to the member IDs, we need information about the member versions. Changing the version of a member should trigger a change in the version of every element that depends on it. Thus, changing the position of a Node (and with it the shape or position of a Way) would generate a new version of the Way, which is not the case now.
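A rough sketch of what such version propagation could look like, again in Python. The `Node` and `Way` classes and the `move_node` function are hypothetical names I made up for illustration. The only assumption is the one described above: a Way stores `(node id, node version)` pairs instead of bare IDs.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Node:
    id: int
    version: int
    lat: float
    lon: float


@dataclass
class Way:
    id: int
    version: int
    members: List[Tuple[int, int]]  # (node id, node version) pairs


def move_node(nodes: Dict[int, Node], ways: Dict[int, Way],
              node_id: int, lat: float, lon: float) -> List[int]:
    """Move a node and bump the version of every way that contains it.

    Returns the IDs of the ways that received a new version.
    """
    node = nodes[node_id]
    node.lat, node.lon = lat, lon
    node.version += 1

    bumped = []
    for way in ways.values():
        if any(member_id == node_id for member_id, _ in way.members):
            # Point the way at the new node version and version the way itself.
            way.members = [
                (member_id, node.version if member_id == node_id else member_version)
                for member_id, member_version in way.members
            ]
            way.version += 1
            bumped.append(way.id)
    return bumped
```

Scanning all ways is of course only acceptable in a toy example. A real implementation would need some index from members to the elements that contain them.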
Another issue. Relations (we assume that Ways are also Relations) know the IDs of their members. However, the members of a Relation have no idea which objects they belong to.
In my opinion, it would be nice for the members of Relations to carry information about what they belong to, something like an array of 'part_of' values. For example, the intersection point of two ways would be recognisable as such because the IDs of both ways are listed in the 'part_of' array of their shared node. Is it necessary for the member properties to also record, besides the ID, the specific version of the element the member belongs to?
If we separate information about geometry (spatial information) from meta-information, we will have two separate types of changes: one in which we change the geometry, and the other in which we change the properties (meta-information) of the objects of interest. Thus, changing the meta-information of an object of interest will not trigger changes to the versions of the geometry it relies on. So, in general, we can have information about the version of a Relation in its members’ information along with its ID, provided that changing the version of the Relation does not change the versions of its members.
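As an illustration of the 'part_of' idea, here is a small Python sketch. Note the assumption: I proposed storing 'part_of' on the member itself, while this sketch derives it from the ways, simply to show what the data would contain. The `Way` tuple and both function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set, Tuple


class Way(NamedTuple):
    id: int
    version: int
    node_ids: List[int]


def build_part_of(ways: List[Way]) -> Dict[int, Set[Tuple[int, int]]]:
    """Map each node ID to the (way id, way version) pairs it is part of."""
    part_of: Dict[int, Set[Tuple[int, int]]] = defaultdict(set)
    for way in ways:
        for node_id in way.node_ids:
            part_of[node_id].add((way.id, way.version))
    return dict(part_of)


def shared_nodes(part_of: Dict[int, Set[Tuple[int, int]]]) -> Set[int]:
    """Nodes that belong to two or more different ways, e.g. intersections."""
    return {
        node_id
        for node_id, parents in part_of.items()
        if len({way_id for way_id, _ in parents}) >= 2
    }


ways = [Way(10, 3, [1, 2, 3]), Way(11, 1, [4, 2, 5])]
part_of = build_part_of(ways)
print(part_of[2])
print(shared_nodes(part_of))
```

Keeping the way version in 'part_of' only makes sense under the condition described above, namely that a new version of the parent does not in turn force a new version of its members.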
Something like this. One day, I hope this will be brought to life.
PS. Written under the influence of the Twelve-Factor App methodology.