I have a question about understanding the date format in Geofabrik’s OpenStreetMap data extracts.
When I see a file named like denmark-170101.osm.pbf, I want to confirm what the date represents:
Does the date (170101 = 2017-01-01) mean:
A) This file contains all OSM data for the year 2017, or
B) This file was extracted on January 1st, 2017, and contains all cumulative OSM data from previous years up to that date?
In simple terms: If I want to download OSM data representing the year 2017, should I choose:
denmark-170101.osm.pbf (extracted January 1st, 2017), or
denmark-180101.osm.pbf (extracted January 1st, 2018)?
I assume the date represents the extraction date, meaning denmark-170101.osm.pbf would contain data up to the beginning of 2017 (essentially 2016 and earlier), while denmark-180101.osm.pbf would contain data through the end of 2017. But I want to make sure I understand this correctly before using the data.
Thank you for clarifying this!
Additional context: I’m looking at the Geofabrik download page and see files following the pattern region-YYMMDD.osm.pbf and want to understand what timeframe of data each file represents.
Yes, I see there’s a file for 2025-01-01.
However, I need to understand this naming convention because new objects may have been built in real life between 2018 and 2025, and I need to minimize the risk of applying newer labels to old satellite imagery.
I understand that the data quantity was relatively low in the early years (2014, 2015, and surrounding years), but I still need clarification on what the date in the filename actually represents to ensure I’m selecting the correct files for each time period. Thank you!
Things added after a date will depend on things added earlier, so if you want to do QA on things changed after a date, you might want to look at changesets after that date. That will only identify the “objects you are interested in”, though; you’ll then need to look in a cumulative set of OSM data to view the object details.
Thanks for the answer
One more thing: is there any wiki page or documentation about this naming convention? I’ve been looking for a day and still can’t find it.
This YYMMDD naming convention is used in other OSM spaces, e.g. the main planet download service.
A file named -YYMMDD contains the OSM data as of that date. It is “cumulative”, but it will not include deleted objects and will not include old versions of each object, only the latest version of each object at that date.
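If it helps, here is a minimal Python sketch of reading the snapshot date out of a file name and picking the file closest to a target date. It assumes the region-YYMMDD.osm.pbf pattern from the question; the function names snapshot_date and closest_snapshot are just made up for illustration and are not part of any OSM tool.

```python
import re
from datetime import date, datetime


def snapshot_date(filename: str) -> date:
    """Return the snapshot date encoded as -YYMMDD in an extract file name."""
    match = re.search(r"-(\d{6})\.osm\.pbf$", filename)
    if match is None:
        raise ValueError(f"no -YYMMDD date found in {filename!r}")
    # %y maps two-digit years 00-68 to 2000-2068, so "17" becomes 2017.
    return datetime.strptime(match.group(1), "%y%m%d").date()


def closest_snapshot(filenames: list[str], target: date) -> str:
    """Pick the file whose snapshot date is nearest to the target date."""
    return min(filenames, key=lambda name: abs(snapshot_date(name) - target))


files = ["denmark-170101.osm.pbf", "denmark-180101.osm.pbf"]
print(snapshot_date(files[0]))
print(closest_snapshot(files, date(2017, 12, 31)))
```

So for "OSM as it looked at the end of 2017", the 180101 file is the nearer snapshot of the two in the question.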
Osmium is a great command-line tool for working with OSM data files (pbf and history files). You can use osmium time-filter on an OSM history file (.osh.pbf) to extract how OSM looked at a given date and time.
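As a rough sketch of how that could be scripted from Python: the snippet below builds an osmium time-filter call and only runs it if osmium is installed and the input file exists. The file names denmark.osh.pbf and denmark-2017-end.osm.pbf are placeholders I made up, and the helper name time_filter_command is hypothetical; check the osmium time-filter manual page for the exact options your version supports.

```python
import shutil
import subprocess
from pathlib import Path


def time_filter_command(history_file: str, timestamp: str, output_file: str) -> list[str]:
    """Build an `osmium time-filter` command for one point in time.

    The timestamp is expected in the form yyyy-mm-ddThh:mm:ssZ.
    """
    return ["osmium", "time-filter", "-o", output_file, history_file, timestamp]


# Placeholder file names; replace them with your own history file and output path.
cmd = time_filter_command(
    "denmark.osh.pbf", "2017-12-31T23:59:59Z", "denmark-2017-end.osm.pbf"
)
print(" ".join(cmd))

# Only run the command when osmium is on PATH and the input file is present.
if shutil.which("osmium") and Path("denmark.osh.pbf").exists():
    subprocess.run(cmd, check=True)
```

The output should then be a normal .osm.pbf holding the latest version of each object as of that timestamp, which is the same kind of snapshot as a dated extract.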