Has anyone in here ever worked with OSM data in a spreadsheet?
There are several use cases when loading tag data for specific datasets into a spreadsheet can be very useful. Mainly for POIs and for aligning and fixing tags. Once the tag data is updated in spreadsheet there should be an easy way to load it to OSM. (Assuming you actually know what you are doing!)
After some research, I didn’t find any good solutions for this kind of problem so I decided to write a simple script myself. This script/tool can transform CSV tag data into xml that can be loaded into OSM editor:
CSV file should only include element id and tags. Everything else must be excluded!
CSV file can contain only limited subset of tags that you want to work with.
New elements (nodes/ways/relations) cannot be created or deleted - this is only for updating tags for existing elements.
Empty values for tags are not deleted. If you want to delete a tag, it must be explicitly done by “!delete” to indicate deletion.
Using the tool should be straightforward:
Check the structure of example CSV data.
Copy tag data directly from a spreadsheet or export it as CSV and then copy the content of this CSV file.
OSM xml can be fetched from overpass api or it can be copied manually into the tool.
Tool compares OSM xml and tag data from csv and generates a new xml based on the changes.
Save the output to a file.
Output xml is in JOSM xml flavor - this can be opened with JOSM or Level0 editor.
Example CSV data does the following:
opening_hours tags are added to 3 nodes
fountain tag is added to 1 node
created_by tag is deleted from 1 node
Please treat this tool as untested script!
Any feedback on this tool is welcomed, and discussions on how spreadsheets can be used with OSM data are also encouraged.
WARNING! Using a spreadsheet like Excel, Google sheets or LibreOffice Calc for editing OSM data is generally a bad idea! I can write a long list of things that can can go wrong… Still, experienced spreadsheet users who understand the risks, should be able to choose the tools they what to use.
Mass edits/imports - guidelines and rules for such edits should always be followed, regardless of the tools used.
If you just want to open OSM data in a spreadsheet then there are several ways to do this. You can analyze tags and fix mistakes manually in iD/JOSM.
Doing edits in a spreadsheet and then moving changed data over to JOSM is actually easy with this tool but understanding what is inside these text based formats is really important to avoid mistakes.
CSV is a very simple text based file format that is supported by every spreadsheet program.
XML is another text based format and this is used for OSM data. In this instance it’s the same as JSOM .osm file.
If you want to understand these concepts you can play around with this tool and see if you understand what happens when you change example input. Just don’t submit any changes with JOSM to OSM. (Or use the OSM dev environment for submitting such changesets)
I looked into the OpenData plugin and found that importing new nodes to OMS is very easy. However, editing tags for existing nodes can be messy and it looks there is no clear distinction between importing new data and modifying existing OSM data. Perhaps I missed something, but is there a way to specify a node id in there? In a CSV file I want to define that node 2351682289 should be modified and tag fountain=nasone added/changed:
Ideally I don’t want to specify anything else and all the other data/tags for this element should be retained.
If you don’t know the OSM id but you know the location then you could use coordinates to find matching OSM element. But this process needs manual review - there is no way to do it automatically in a way that is 100% reliable.
Currently this tool is designed in a way that no additional data processing steps are needed and the output xml can be safely loaded to OSM. This means all the data preparations steps like matching id’s should be done beforehand.
I see. So in effect it is: output data; edit object details in spreadsheet; import data to the same objects.
I imagine this could help in some cases. In most cases, wouldn’t the group editing functions in JOSM suffice?
Your conversion tool uses JOSM, overpass and spreadsheet software, and multiple format transformations. Wouldn’t it be easier to use only JOSM?
What kind of edits would you use the spreadsheet program for?
Well, I tried to keep the UI as minimal as possible but in the bottom right corner there are simple stats showing latest change in OSM data:
This means you can compare date differences and do additional checks if data in your csv is older.
Spreadsheet software can be extremely powerful tool when used correctly. With this power also comes responsibility, since it is very easy to mess up your data if you don’t know what you are doing. Other advantages include working with external data* and larger dataset if needed. My personal recommendation is to only work small enough datasets where you are able to manually review the data.
The main reason I started working with this tool, was to be able to perform a small scale import. But I realized there are additional use cases when you might want to use spreadsheet over traditional editor. So once main the functionality of this tool was complete I also included csv validations checks and getting fresh OSM data from overpass.
External data - assuming there are no licensing issues and proper guidelines are followed.
I did exactly that, with a list of trailhead nodes. Had I known what JOSM can do, I would have done almost all editing in JOSM. It can also get data from OSM, search and filter within the downloaded dataset, change all values of a given key at once, move all values of a key to a different key, add keys and values, perform validations.
I would like to know specific examples of operations you would want to perform on the data, which can’t be done in JOSM but can be done in a spreadsheet.
Good to know that JOSM has such powerful editing features. In the end it comes down to personal preference - if JOSM has a steep learing curve and the UX feels outdated then I prefer to use different tools. For normal edits I’m happy with iD editor and if I want to dig deep into a specific dataset I can use overpass and Excel.
Currently I’ve only tested this tool by uploading to dev environment. In the near future I’m planning to update tags for ~300 Estonian pharmacies based on public government data. Once that is complete I do see myself using this tool for some much smaller edits like aligning tags and maybe updating 5-40 objects at the time.
Well, if you combine the learning curves of the tools you are combining, starting with overpass, it’s pretty steep. Once JOSM has been installed and setup (and you have it installed, because ít’s in your workflow), and you know the basic workflow (download/edit/upload), which you know, because it’s in your workflow), you can start basic editing pretty much comparable to what Id lets you do.
For the rest, ask the community “How do I… in JOSM” and people will tell you.
update tags for ~300 Estonian pharmacies based on public government data
Are some or all of these pharmacies already in OSM? If so, what is the match key?
In general with imports, comparing the import with existing OSM-data, and deciding what to do with matches and non-matches is the bigger problem, and none of your tools solves that.
I’m afraid your personal preferences are getting in the way of doing it the easy way.
Overpass is a strange beast. There are some instances where it is trivial to extract data but there also instances where I have spent a lot of time and still haven’t been able to get the desired results. But learning to use it is vital if you want to extract data from OSM.
Other tools - if I have a lot of experience with Excel and I know exactly what I need to do to get desired output in a reliable way, then I prefer not to experiment with unknown workflows that may or may not work for me.
Can I use JOSM with OpenData to load a csv file that has a node id and update its tag value? eg:
There are roughly 500 pharmacies in Estonia. Most of them are in OSM. Unfortunately these can’t be automatically matched in a reliable way. I already did the matching part, mostly manually in a spreadsheet. Now I have 3 sets of data: Pharmacies missing from OSM; Existing OSM pharmacies; OSM pharmacies that are no longer operational. (In this rare example I know that the government data is 100% complete. Also, pharmacies cannot operate without a license.)
Yes, this is something I fully agree with. This is the main reason I ended up using external tools with familiar workflows. Now I have distinct datasets where I know exactly what operations should be done for each one. This tool tries to solve only a single thing - how to change tags for specific elemets by id, without changing anything else.