I did a quick comparison in QGIS and already deleted 6 pharmacies from OSM that were no longer operational. I also moved 2 pharmacies to their new locations.
Next step would be to add missing pharmacies to OSM. I personally don’t want to do this myself, but if there is interest I can quite easily match OSM data against the data from Ravimiamet and find the missing pharmacies. Then a MapRoulette challenge can be used to finish this task.
Unfortunately there are no useful IDs for Estonian pharmacies. Both the licence number and branch reference number (Tegevusloa number/Tegutsemiskoha kood) can change over time and these numbers can be reused on different pharmacies.
Finally, it’s important to point out that while pharmacies have an official registered name, there can be additional branches (haruapteegid) where the official name refers to the main site. In such cases it might be better to use a naming scheme that is more relevant to the actual location of a pharmacy.
So I guess a one-time import is also a possibility. But after having a second look at OSM data I noticed that existing data quality is quite inconsistent. Cleaning up duplicate entries is still something that needs to be done manually.
I already regret that I have mentioned this
Mapping existing OSM data with pharmacy data from Ravimiamet is a slow and mostly manual process. But since I’ve started with it I might as well try to do my best to finish it.
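A first pass of that matching could be automated before the manual review. The following is only a minimal sketch, assuming both the Ravimiamet rows and the OSM pharmacies are already available as dictionaries with `lat` and `lon` keys (geocoding the register addresses is not shown). The function names and the 100 m threshold are hypothetical choices, not something taken from the actual workflow.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two WGS84 points, in metres."""
    radius = 6371000.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))

def find_unmatched(register_rows, osm_rows, max_dist_m=100.0):
    """Return register rows with no OSM pharmacy within max_dist_m (assumed threshold)."""
    unmatched = []
    for reg in register_rows:
        has_neighbour = any(
            haversine_m(reg["lat"], reg["lon"], osm["lat"], osm["lon"]) <= max_dist_m
            for osm in osm_rows
        )
        if not has_neighbour:
            unmatched.append(reg)
    return unmatched
```

Rows returned by `find_unmatched` are only candidates for missing pharmacies. Nearby but different pharmacies, or moved ones, still need a manual look.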
Data import - if the scope of the change is really tiny and localized, like adding 10 new pharmacy nodes to a specific region, then I assume the planning/approval/documentation process is different from a larger import? Keep in mind that this data is manually prepared in a spreadsheet and then transformed to OSM XML.
License - hopefully somebody else can verify this but as I understand a legal regulation document called Ravimiameti tegevuslubade registri põhimäärus specifies what parts of the dataset belong to public domain (avalikud andmed).
Pharmacies imported/updated by Rocket Data - there are already duplicate pharmacies created by this process. During the clean-up process I plan to remove duplicates. I don’t see any potential issues with overwrites though. My only concern is that future rocketdata.io imports should be monitored to avoid duplicate entries.
Yesterday I finished validating Estonian OSM pharmacies based on license data from Ravimiamet. As a result I identified 163 Estonian retail pharmacies that are not in OSM. Using the public data from Ravimiamet I prepared a CSV file where these pharmacies can be viewed:
Importing this CSV file to OSM is a trivial task, but before proceeding, what should be the next steps? Should this simple import follow exactly the same procedure as any other import? My proposal would be to document/review this tiny import in this topic.
Technical details on how the import could be performed:
Create a new OSM account for modifying any Estonian pharmacy data
Split the import into smaller regional chunks so that the number of new nodes is less than 50 per changeset - this makes manual review easier
Use JOSM with OpenData to load the csv file and perform the import
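The regional split from the steps above can be sketched roughly as follows. This is an illustration only: the `county` column name and the `chunk_by_region` helper are assumptions, and the limit of 49 simply encodes "less than 50 per changeset".

```python
from collections import defaultdict

def chunk_by_region(rows, region_key="county", max_nodes=49):
    """Group rows by region, then cut each region into chunks of at most max_nodes.

    region_key is a hypothetical column name; use whatever the CSV really has.
    Returns a list of (region, rows) tuples, one per planned changeset.
    """
    by_region = defaultdict(list)
    for row in rows:
        by_region[row[region_key]].append(row)

    chunks = []
    for region in sorted(by_region):
        items = by_region[region]
        for start in range(0, len(items), max_nodes):
            chunks.append((region, items[start:start + max_nodes]))
    return chunks
```

Each returned chunk can then be saved as its own CSV file and loaded and uploaded separately, so every changeset stays within one region and is small enough to review by hand.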
Potential issues and basic risk assessment:
Something goes terribly wrong (tags and data misaligned, text encoding issues etc)
In this unlikely scenario changeset(s) can be easily reverted
Creating duplicate entries
I have already spent many hours manually reviewing every single pharmacy in OSM. The likelihood of this is close to zero.
Bad data quality from source
I have previous experience working with this dataset from Ravimiamet - the data quality has always been excellent.
Breaking existing OSM data
Not applicable - I don’t see how this is even possible with this particular import
If there are any suggestions on how to improve tagging/selecting what fields to import, I am happy to receive feedback. The CSV file can be downloaded from the airtable link. Most fields in the CSV file should be self-explanatory. Some additional details to point out:
ref tag is mapped to “Tegutsemiskoha kood”/“Site reference number”
brand tag is generated from the email domain name
addr:street and addr:housenumber data was extracted from a single address line field
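For reviewers, the three derivations listed above could look roughly like this. It is a sketch under assumptions, not the spreadsheet logic that was actually used: the `email` and `address` column names, the `BRAND_BY_DOMAIN` lookup and the address regular expression are all hypothetical, and only the "Tegutsemiskoha kood" column name comes from the register.

```python
import re

# Hypothetical lookup from email domain label to brand name.
# Only Benu is named in this thread; other entries would be filled in by hand.
BRAND_BY_DOMAIN = {"benu": "Benu"}

# Assumes "<street> <house number>", where the number may look like 87, 12a or 4/6.
ADDRESS_RE = re.compile(r"^(?P<street>.+?)\s+(?P<number>\d+\w*(?:[/-]\d+\w*)*)$")

def brand_from_email(email):
    """Map the second-level domain label of an email address to a known brand, or None."""
    domain = email.rsplit("@", 1)[-1].lower()
    labels = domain.split(".")
    if len(labels) < 2:
        return None
    return BRAND_BY_DOMAIN.get(labels[-2])

def split_address(line):
    """Split a single address line into (street, housenumber); housenumber may be None."""
    match = ADDRESS_RE.match(line.strip())
    if match is None:
        return line.strip(), None
    return match.group("street"), match.group("number")

def row_to_tags(row):
    """Build OSM tags for one CSV row (column names other than the ref column are assumed)."""
    tags = {"amenity": "pharmacy", "ref": row["Tegutsemiskoha kood"]}
    brand = brand_from_email(row.get("email", ""))
    if brand:
        tags["brand"] = brand
    street, number = split_address(row["address"])
    tags["addr:street"] = street
    if number:
        tags["addr:housenumber"] = number
    return tags
```

Addresses that do not fit the assumed pattern keep the whole line as the street and get no house number, so they are easy to spot and fix manually.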
An import of around 170 nodes falls somewhere between large formal imports and a small weekend project. A few years ago I was told on talk-ee that an import of 120-130 nodes is not worth the formal import procedure, and yet it’s the kind of volume that becomes annoying to handle manually.
If you are familiar with JOSM, you could use that, but you may want to consider Level0, which is less annoying and easier to use in scenarios where you have a text-based source dataset. You could write a basic script (ca 20-30 lines of Python) that converts CSV to an L0-compatible format, which can then be pasted into the editor and uploaded. Since L0 is also capable of running Overpass queries, it may be helpful in the future if you want to update existing pharmacies.
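Such a script might look like the sketch below. It assumes the CSV has `lat` and `lon` columns and that every other column header is already an OSM tag key. It also assumes the Level0L syntax for a new node is a `node: lat, lon` header followed by indented `key = value` lines, which should be checked against the Level0L documentation before pasting anything into the editor.

```python
import csv
import io

def rows_to_level0l(rows):
    """Render CSV rows as Level0L text (assumed syntax: 'node: lat, lon' plus indented tags)."""
    blocks = []
    for row in rows:
        lines = [f"node: {row['lat']}, {row['lon']}"]
        for key, value in row.items():
            # Coordinates go in the header; empty cells produce no tag.
            if key in ("lat", "lon") or not value:
                continue
            lines.append(f"  {key} = {value}")
        blocks.append("\n".join(lines))
    # Separate objects with a blank line for readability.
    return "\n\n".join(blocks) + "\n"

# Made-up sample row, only to show the shape of the input.
SAMPLE_CSV = """lat,lon,amenity,name,ref
59.437,24.7536,pharmacy,Example Apteek,12345
"""

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
    print(rows_to_level0l(reader))
```

For the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by `open(path, newline="", encoding="utf-8")`, which also takes care of the text encoding concern mentioned earlier in the thread.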
As for tagging shown in CSV, main thing to change is converting operator names to lowercase with capital letters (aka Title case?). You may however want to add additional tags, such as brand wikidata. Use Name Suggestion Index and editor presets as an inspiration.
Regarding licensing, the best course of action is to contact Ravimiamet and ask them for explicit permission. The põhimäärus you linked does not specify the license they are using; it just classifies the data as class 1/non-confidential (§5p3) and states there are no access restrictions on some parts of the data (§12p3). Usually when Estonian government agencies publish their open data, they either forget to add a license, publish it under standard CC-BY, or use a custom license with a BY clause. Often it’s the latter. For example, Maa-amet’s data, which has been imported for over a decade now, is also licensed on the condition that they are credited as the source, but they consider a mention in the changeset’s source tag sufficient to satisfy the BY clause.
I did find some other open data from Ravimiamet that is licensed under CC-BY 3.0, but it looks like a different dataset.
I will contact Ravimiamet and find out under which license that data is published. The opendata link you provided refers to the same source, but the dataset link in there is outdated.
operator tag, in this instance I wouldn’t touch it. There are examples like “MTPApteegid OÜ” where changing the case to any other form is just wrong. But is this operator tag actually useful? For most OSM end-users it is actually irrelevant. It might be useful for future maintenance and since this field is also used by rocketdata.io pharmacy imports I decided to include it.
brand:wikidata tag, sure, I can include this for Benu pharmacies.
Level0, this might be useful tool indeed for future maintenance of pharmacy data.
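To illustrate the operator point above: a naive title-case conversion really does mangle names like this one. The snippet below only demonstrates Python's built-in `str.title()`; the `naive_title_case` wrapper is a hypothetical name, and nothing like it is part of the prepared CSV.

```python
def naive_title_case(name):
    """Apply Python's built-in title casing, which lowercases everything after the first letter of each word."""
    return name.title()

original = "MTPApteegid OÜ"
converted = naive_title_case(original)
print(converted)  # prints "Mtpapteegid Oü"
```

Both the internal capitals and the legal form suffix are lost, so any case normalisation would need a hand-maintained list of exceptions.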
Finally got the response from Ravimiamet (State Agency of Medicines) where they clearly state that copyright regulations do not apply and this information belongs to the public domain. They also confirmed there are no limitations on how this data can be reused (i.e. no attribution requirements).
Full response (translated from Estonian):
Thank you for your question, and we apologise for the delay in replying.
The data in the Ravimiamet register of activity licences (Ravimiameti tegevuslubade register) is open data, that is, data made freely and publicly available for everyone to use, with no restrictions hindering its use. Open data is not protected by copyright, patent, trademark or trade secret regulation.
Open data can be restricted only under certain justified conditions. If setting conditions on reuse is necessary in the public interest, those conditions must be objective, proportionate and non-discriminatory. Restricting the reuse of the data in the Ravimiamet register of activity licences is not justified, and the data in the register may therefore be used without restrictions.