Proposal to include changeset tags in the .osc file format

user10 · January 30, 2024, 12:38pm

I have created a proposal that affects the osmChange file format. If you are a software developer, or if you use this file format, then you may be interested in this technical proposal. Everyone is invited to comment on the wiki page.

Please cross-post this announcement to the tagging mailing list on my behalf by sending an email to: tagging@openstreetmap.org

Woazboat · January 30, 2024, 1:19pm

This isn’t really something that can be changed via a proposal on the wiki. You need to actively get into contact with the software developers and data users who use the file format and work with them to answer some questions.

Is there demand for a change like this in the first place? What is the actual intended outcome? Will the proposed change actually achieve that goal or are there other better options?
What are the ramifications of changing the format? Which data users are affected? Will this break existing implementations? Is the change worth the risk?
How should software make use of this new feature? Are changes in the user interface required? Is this compatible with the existing workflow?
What should the new format look like?
What is the development roadmap for implementing the feature in affected software?
Who will actually do the development? (for each affected software)

I can think of a few issues here. For example: editors usually only prompt for changeset information when actually uploading data to the server, so where should the data for the changeset info in the osmChange file come from?

user10 · January 30, 2024, 2:06pm

I realise that, but it seems smarter to have a discussion first, rather than every editor implementing things differently. We already have some software that puts metadata into the osc file via various different XML attributes.

I briefly mentioned this on the wiki page: if you’re exporting your edits from one editor into another, it would save time if the changeset tags were included in the exported file.

I’m not aware of any software that breaks if you have extra xml tags in the .osc file. So this is not a breaking change. Editors that allow importing .osc files can decide what they do with the changeset tags, if anything. For example, some editors won’t let you override tags like created_by for obvious reasons.
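To illustrate the point about importers deciding what to do with the tags, here is a minimal sketch of an import step that keeps the editor's own created_by. The layout of the changeset section (a `<changeset>` child of `<osmChange>` containing `<tag k="…" v="…"/>` elements) is an assumption made for this example, not a quote from the wiki page, and `import_changeset_tags` and `PROTECTED_KEYS` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Assumed layout for illustration only: <changeset> directly under
# <osmChange>, holding <tag k="..." v="..."/> children.
SAMPLE_OSC = """<osmChange version="0.6" generator="example">
  <changeset>
    <tag k="source" v="survey"/>
    <tag k="created_by" v="SomeOtherEditor 1.0"/>
  </changeset>
  <create>
    <node id="-1" lat="0.0" lon="0.0"/>
  </create>
</osmChange>"""

# Keys the importing editor always sets itself (hypothetical policy).
PROTECTED_KEYS = {"created_by"}


def import_changeset_tags(osc_text, editor_tags):
    """Merge changeset tags from the file into the editor's own tags.

    Tags whose key is protected are skipped, so the importing editor's
    value wins. A file without a <changeset> section changes nothing.
    """
    root = ET.fromstring(osc_text)
    merged = dict(editor_tags)
    for tag in root.findall("./changeset/tag"):
        key, value = tag.get("k"), tag.get("v")
        if key is None or value is None or key in PROTECTED_KEYS:
            continue
        merged[key] = value
    return merged


tags = import_changeset_tags(SAMPLE_OSC, {"created_by": "MyEditor 2.0"})
print(tags)
```

An editor could just as well ignore the section entirely; the sketch only shows that honouring some tags and refusing others is a small amount of code.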

I’ve included a proposed example on the wiki page.

It’s completely optional, no one is forced to include this extra metadata. I’m happy to implement this for iD, RapiD, and other web-based editors that I use. I’m not asking anyone to do extra work here, this proposal is just to define a standardised format.

In some editors, changeset tags are collected while the user edits. For example, imagery_used or source are automatically set by iD. These are the tags that would be included in the osc file. Depending on the editor’s UI, other tags like comment might not exist yet, so those tags can’t be included in the export.

SomeoneElse · January 30, 2024, 4:08pm

Just out of interest, what did you test?

SimonPoole · January 30, 2024, 4:58pm

Hmmm just realized that a number of the examples in the wiki are broken. The changeset attributes are mostly bogus and would not be present in editor output and particularly not if the element has template ids.

SimonPoole · January 30, 2024, 7:59pm

I suspect, even though it isn’t explicitly said, that the background of the proposal is that this is a workaround for conflict resolution issues in iD. As a first response I would suggest that those issues get addressed, as uploading with one of the other editors is just a workaround and not suitable for iD’s nominal audience.

The other problem is that the OSC format has a lot of caveats wrt its use in practical terms and I’m not totally convinced that we should promote it outside of emergency use.

Jochen_Topf · January 31, 2024, 8:44am

We have to be extremely careful if we change a file format that has been around for decades and that has no formal definition, because we just don’t know what software is out there that happens to rely on some interpretation of that format. Some software might just ignore the added section, others might break, and might even break in non-obvious ways. Experience shows that programmers sometimes take shortcuts and don’t make their software robust against changes like this. That being said, we have to be able to change things. But somebody has to be willing to do the due diligence and check popular software and warn developers in time and all that.

My greatest concern is that the .osc file format’s main purpose is to be used for API access and for the change files you download from planet.osm.org. The use in editors is certainly a niche use. Even if you specifically decide that API/download should not use the changeset section, it becomes a source for more confusion and potential problems. So we have to tread very carefully there.

But my main question at this point is: Why do you want this change? I personally have never felt the need to store a change in one editor and load it into another. Can you explain what the use cases for this are and why it is important?

user10 · January 31, 2024, 9:43am

Tools that already support the <changeset> XML tag when importing osc files:

Level0

Tools that ignore the <changeset> XML tag when importing osc files:

JOSM
Vespucci
Osmconvert
osmdiff
pyosm
(various tools I’ve created that aren’t important enough to list here)

Tools that break if they encounter a <changeset> XML tag when importing osc files:

(none)

Obviously there are more tools that I’m not aware of, but so far it seems like <changeset> is not a breaking change (unlike other extensions to the osc format, such as what OsmAnd does).
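For readers wondering what "ignores the <changeset> XML tag" looks like in practice, here is a rough sketch of a lenient reader. It is not taken from any of the tools listed above. It only handles the create, modify and delete blocks and skips anything else at the top level. The shape of the `<changeset>` element in the sample is an assumption for illustration, and `count_actions` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# The three action blocks this sketch understands.
ACTIONS = ("create", "modify", "delete")

# Sample file; the <changeset> section's layout is assumed, not specified.
SAMPLE_OSC = """<osmChange version="0.6" generator="example">
  <changeset>
    <tag k="source" v="survey"/>
  </changeset>
  <create>
    <node id="-1" lat="0.0" lon="0.0"/>
  </create>
  <delete>
    <node id="42" version="3"/>
  </delete>
</osmChange>"""


def count_actions(osc_text):
    """Count elements per action block, skipping unknown top-level elements.

    Returns (counts, ignored) where `ignored` lists the names of the
    top-level elements the reader did not recognise.
    """
    root = ET.fromstring(osc_text)
    counts = {action: 0 for action in ACTIONS}
    ignored = []
    for block in root:
        if block.tag in counts:
            counts[block.tag] += len(block)  # number of child elements
        else:
            ignored.append(block.tag)
    return counts, ignored


counts, ignored = count_actions(SAMPLE_OSC)
print(counts, ignored)
```

A reader written this way treats the extra section as harmless, which matches the behaviour reported for the tools in the "ignore" list. It says nothing about software that parses the format differently.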

If there are any other tools I should test, please let me know.

Actually no, conflict resolution isn’t the motivation. Where I edit there are very few mappers, so conflicts are super rare (once or twice a year for me).

My typical use case is saving my work when the wifi drops out (very common) so I can keep editing later.
Second example is continuing my edits in a more powerful editor. For example, switching from iD to JOSM or switching from OsmAnd to Level0.
Another use-case is when iD breaks PTv2 relations, this frustrates other mappers in my area, so I try to fix relations in another editor before uploading.

SimonPoole · January 31, 2024, 10:12am

Isn’t the argument really the other way around? Because we are in general a lazy bunch, we don’t do input validation against a schema and check that we are really getting what we expect. In other words, failing in some form would be the robust option.
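As a rough illustration of the "fail rather than guess" approach, a strict reader could reject any element it does not expect. There is no formal schema for the format, so the allowed names below are simply what this sketch assumes, and `validate_osc` is a hypothetical function.

```python
import xml.etree.ElementTree as ET

# What this sketch assumes a plain osmChange file may contain.
ALLOWED_BLOCKS = {"create", "modify", "delete"}
ALLOWED_ELEMENTS = {"node", "way", "relation"}

PLAIN_OSC = """<osmChange version="0.6">
  <modify>
    <node id="42" version="3" lat="0.0" lon="0.0"/>
  </modify>
</osmChange>"""

# Same file with an extra section whose layout is assumed for illustration.
EXTENDED_OSC = """<osmChange version="0.6">
  <changeset>
    <tag k="source" v="survey"/>
  </changeset>
  <modify>
    <node id="42" version="3" lat="0.0" lon="0.0"/>
  </modify>
</osmChange>"""


def validate_osc(osc_text):
    """Raise ValueError on anything outside the expected structure."""
    root = ET.fromstring(osc_text)
    if root.tag != "osmChange":
        raise ValueError(f"unexpected root element <{root.tag}>")
    for block in root:
        if block.tag not in ALLOWED_BLOCKS:
            raise ValueError(f"unexpected element <{block.tag}>")
        for element in block:
            if element.tag not in ALLOWED_ELEMENTS:
                raise ValueError(f"unexpected element <{element.tag}>")


validate_osc(PLAIN_OSC)  # passes silently
try:
    validate_osc(EXTENDED_OSC)
except ValueError as error:
    print("rejected:", error)
```

Software built like this would count as "breaking" on the new section, but it would break loudly and immediately rather than in a non-obvious way.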

Woazboat · February 1, 2024, 10:20pm

Being lenient with what you receive and strict with what you send out can have merits and ensure interoperability with a wider range of software without requiring changes. What it can unfortunately also lead to is growth of mutually incompatible ad-hoc extensions by individual implementations.

Minh_Nguyen · February 2, 2024, 12:03am

For these use cases, it sounds like one user workaround would be to copy and paste the changeset comment and changeset tags into a separate sidecar file. However, I agree that there’s an advantage to being able to store a changeset that’s more self-contained.

Your proposal seems sensible, but I wonder if it’s a sign that the format should formally allow for some extensibility. Since the osmChange format is based on XML, implementers like OsmAnd and Level0 should probably define and consume a separate XML namespace for any extensions they introduce, as is commonly done in other XML-based formats like SVG.
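A minimal sketch of how a consumer might separate namespaced extensions from the core elements follows. The namespace URI `https://example.org/osc-ext`, the `ext` prefix and the element names inside it are made up for this example; neither OsmAnd nor Level0 is known to define them. `split_extensions` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical extension namespace, invented for this example.
EXT_NS = "https://example.org/osc-ext"
NS = {"ext": EXT_NS}

SAMPLE_OSC = f"""<osmChange version="0.6" xmlns:ext="{EXT_NS}">
  <ext:changeset>
    <ext:tag k="comment" v="Fix bus route"/>
  </ext:changeset>
  <create>
    <node id="-1" lat="0.0" lon="0.0"/>
  </create>
</osmChange>"""


def split_extensions(osc_text):
    """Return (core_blocks, extension_tags).

    ElementTree expands a prefixed name to '{uri}local', so elements
    without a namespace are the core format and everything starting
    with '{' belongs to some extension.
    """
    root = ET.fromstring(osc_text)
    core_blocks = [child.tag for child in root if not child.tag.startswith("{")]
    extension_tags = {
        tag.get("k"): tag.get("v")
        for tag in root.findall("ext:changeset/ext:tag", NS)
    }
    return core_blocks, extension_tags


core, extras = split_extensions(SAMPLE_OSC)
print(core, extras)
```

The advantage of this design is that a consumer can tell core elements from extensions without knowing every extension in advance, and two vendors cannot collide on the same element name.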