Editing OSM Tags/POI data in a spreadsheet

Has anyone in here ever worked with OSM data in a spreadsheet?

There are several use cases when loading tag data for specific datasets into a spreadsheet can be very useful. Mainly for POIs and for aligning and fixing tags. Once the tag data is updated in spreadsheet there should be an easy way to load it to OSM. (Assuming you actually know what you are doing!)

After some research, I didn’t find any good solutions for this kind of problem so I decided to write a simple script myself. This script/tool can transform CSV tag data into xml that can be loaded into OSM editor:

Code for this tool is in jsbin and with a browser you can test it here.

Key considerations for designing this:

  • CSV file should only include element id and tags. Everything else must be excluded!
  • CSV file can contain only limited subset of tags that you want to work with.
  • New elements (nodes/ways/relations) cannot be created or deleted - this is only for updating tags for existing elements.
  • Empty values for tags are not deleted. If you want to delete a tag, it must be explicitly done by “!delete” to indicate deletion.

Using the tool should be straightforward:

  1. Check the structure of example CSV data.
  2. Copy tag data directly from a spreadsheet or export it as CSV and then copy the content of this CSV file.
  3. OSM xml can be fetched from overpass api or it can be copied manually into the tool.
  4. Tool compares OSM xml and tag data from csv and generates a new xml based on the changes.
  5. Save the output to a file.
  6. Output xml is in JOSM xml flavor - this can be opened with JOSM or Level0 editor.

Example CSV data does the following:

  • opening_hours tags are added to 3 nodes
  • fountain tag is added to 1 node
  • created_by tag is deleted from 1 node

Please treat this tool as untested script!

Any feedback on this tool is welcomed, and discussions on how spreadsheets can be used with OSM data are also encouraged.

WARNING! Using a spreadsheet like Excel, Google sheets or LibreOffice Calc for editing OSM data is generally a bad idea! I can write a long list of things that can can go wrong… Still, experienced spreadsheet users who understand the risks, should be able to choose the tools they what to use.

Mass edits/imports - guidelines and rules for such edits should always be followed, regardless of the tools used.

1 Like

To be honest I don’t have a clue what you are talking about. CSV?? XML???

But the word spreadsheet cought my attention for sure because of the mess we are making with the tagging of old bunkers and even older forts and the like.

I’m dreaming of a nice spreadsheet in which the mess is spread out so I can see what it looks like and maybe unmess it a littlebit.

Keep up the good work.

If you just want to open OSM data in a spreadsheet then there are several ways to do this. You can analyze tags and fix mistakes manually in iD/JOSM.

Doing edits in a spreadsheet and then moving changed data over to JOSM is actually easy with this tool but understanding what is inside these text based formats is really important to avoid mistakes.

CSV is a very simple text based file format that is supported by every spreadsheet program.

XML is another text based format and this is used for OSM data. In this instance it’s the same as JSOM .osm file.

If you want to understand these concepts you can play around with this tool and see if you understand what happens when you change example input. Just don’t submit any changes with JOSM to OSM. (Or use the OSM dev environment for submitting such changesets)

1 Like

It’s also possible to edit spreadsheet using OpenData plugin in JOSM. See LearnOSM

I looked into the OpenData plugin and found that importing new nodes to OMS is very easy. However, editing tags for existing nodes can be messy and it looks there is no clear distinction between importing new data and modifying existing OSM data. Perhaps I missed something, but is there a way to specify a node id in there? In a CSV file I want to define that node 2351682289 should be modified and tag fountain=nasone added/changed:

element;id;fountain
node;2351682289;nasone

Ideally I don’t want to specify anything else and all the other data/tags for this element should be retained.

What about geolocation? We often deal with sources which do not know the OSM object id. They often have an approximated geolocation.

If you don’t know the OSM id but you know the location then you could use coordinates to find matching OSM element. But this process needs manual review - there is no way to do it automatically in a way that is 100% reliable.

Currently this tool is designed in a way that no additional data processing steps are needed and the output xml can be safely loaded to OSM. This means all the data preparations steps like matching id’s should be done beforehand.

1 Like

I see. So in effect it is: output data; edit object details in spreadsheet; import data to the same objects.
I imagine this could help in some cases. In most cases, wouldn’t the group editing functions in JOSM suffice?

1 Like

Yes, this kind of workflow would work:

  1. use overpass the get csv data
  2. copy data to spreadsheet
  3. edit tags in spreadsheet
  4. copy data to this conversion tool
  5. save xml to file
  6. upload xml to OSM with JOSM/Level0

I mostly use iD editor and have no experience with group editing in JOSM so I can’t comment on that.

Your conversion tool uses JOSM, overpass and spreadsheet software, and multiple format transformations. Wouldn’t it be easier to use only JOSM?
What kind of edits would you use the spreadsheet program for?

1 Like

Just to be clear - this can’t be a simple “upload” because you’ll need to check that the objects in OSM haven’t changed while you edited them offline.

2 Likes

Well, I tried to keep the UI as minimal as possible but in the bottom right corner there are simple stats showing latest change in OSM data:
stats

This means you can compare date differences and do additional checks if data in your csv is older.

Spreadsheet software can be extremely powerful tool when used correctly. With this power also comes responsibility, since it is very easy to mess up your data if you don’t know what you are doing. Other advantages include working with external data* and larger dataset if needed. My personal recommendation is to only work small enough datasets where you are able to manually review the data.

The main reason I started working with this tool, was to be able to perform a small scale import. But I realized there are additional use cases when you might want to use spreadsheet over traditional editor. So once main the functionality of this tool was complete I also included csv validations checks and getting fresh OSM data from overpass.

External data - assuming there are no licensing issues and proper guidelines are followed.

I did exactly that, with a list of trailhead nodes. Had I known what JOSM can do, I would have done almost all editing in JOSM. It can also get data from OSM, search and filter within the downloaded dataset, change all values of a given key at once, move all values of a key to a different key, add keys and values, perform validations.

I would like to know specific examples of operations you would want to perform on the data, which can’t be done in JOSM but can be done in a spreadsheet.

1 Like

Good to know that JOSM has such powerful editing features. In the end it comes down to personal preference - if JOSM has a steep learing curve and the UX feels outdated then I prefer to use different tools. For normal edits I’m happy with iD editor and if I want to dig deep into a specific dataset I can use overpass and Excel.

Currently I’ve only tested this tool by uploading to dev environment. In the near future I’m planning to update tags for ~300 Estonian pharmacies based on public government data. Once that is complete I do see myself using this tool for some much smaller edits like aligning tags and maybe updating 5-40 objects at the time.

Well, if you combine the learning curves of the tools you are combining, starting with overpass, it’s pretty steep. Once JOSM has been installed and setup (and you have it installed, because ít’s in your workflow), and you know the basic workflow (download/edit/upload), which you know, because it’s in your workflow), you can start basic editing pretty much comparable to what Id lets you do.

For the rest, ask the community “How do I… in JOSM” and people will tell you.

update tags for ~300 Estonian pharmacies based on public government data

Are some or all of these pharmacies already in OSM? If so, what is the match key?
In general with imports, comparing the import with existing OSM-data, and deciding what to do with matches and non-matches is the bigger problem, and none of your tools solves that.

I’m afraid your personal preferences are getting in the way of doing it the easy way.

1 Like

Overpass is a strange beast. There are some instances where it is trivial to extract data but there also instances where I have spent a lot of time and still haven’t been able to get the desired results. But learning to use it is vital if you want to extract data from OSM.

Other tools - if I have a lot of experience with Excel and I know exactly what I need to do to get desired output in a reliable way, then I prefer not to experiment with unknown workflows that may or may not work for me.

Can I use JOSM with OpenData to load a csv file that has a node id and update its tag value? :slightly_smiling_face: eg:

element;id;fountain
node;2351682289;nasone

There are roughly 500 pharmacies in Estonia. Most of them are in OSM. Unfortunately these can’t be automatically matched in a reliable way. I already did the matching part, mostly manually in a spreadsheet. Now I have 3 sets of data: Pharmacies missing from OSM; Existing OSM pharmacies; OSM pharmacies that are no longer operational. (In this rare example I know that the government data is 100% complete. Also, pharmacies cannot operate without a license.)

Yes, this is something I fully agree with. This is the main reason I ended up using external tools with familiar workflows. Now I have distinct datasets where I know exactly what operations should be done for each one. This tool tries to solve only a single thing - how to change tags for specific elemets by id, without changing anything else.

… Good luck with your mass edit / import!

Just a quick update. So far I’ve used this tool several times without any issues. Tested it with both JOSM and Level0. Editing pharmacy data went smoothly.

One practical use case for this was to remove opening_hours:covid19 tag in Estonia. I one go I removed values same, open and also manually review two instances of closed that were not correct.

If there is any interest on this tool I should also release it properly. But firstly I need to come up with a good name.