Often I find myself in a situation where I need OSM data focused on a specific topic. I either use overpass turbo to get such a dataset, or download an extract from Geofabrik and filter it using ogr2ogr or osmium. Both methods involve some manual, repetitive work until I get to a GPKG (my current favourite file format).
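For illustration, here is a rough sketch of what the "download an extract and filter it" route could look like when scripted. It chains the two tools (osmium tags-filter first, then ogr2ogr to write the GPKG), which is just one possible arrangement. The file names, the drinking-water tag filter and the helper function names are made-up examples, and the sketch assumes osmium-tool and a GDAL build with OSM support are installed and that the output GPKG does not exist yet.

```python
import subprocess


def build_commands(pbf_in, tag_filter, layer, gpkg_out):
    """Return the two command lines for one thematic extract (hypothetical helper)."""
    # Intermediate file holding only the objects that match the tag filter.
    filtered = gpkg_out.rsplit(".", 1)[0] + ".osm.pbf"
    osmium_cmd = [
        "osmium", "tags-filter", pbf_in, tag_filter,
        "-o", filtered, "--overwrite",
    ]
    # GDAL's OSM driver exposes fixed layers such as points, lines, multipolygons.
    ogr_cmd = ["ogr2ogr", "-f", "GPKG", gpkg_out, filtered, layer]
    return [osmium_cmd, ogr_cmd]


def run_extract(pbf_in, tag_filter, layer, gpkg_out):
    """Run both steps, stopping at the first failure."""
    for cmd in build_commands(pbf_in, tag_filter, layer, gpkg_out):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Example values only: a regional extract and a nodes-only tag filter.
    for cmd in build_commands(
        "region.osm.pbf", "n/amenity=drinking_water", "points", "drinking_water.gpkg"
    ):
        print(" ".join(cmd))
```

Calling run_extract with the same arguments would execute the two commands instead of printing them.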
So, I wanted to build a website which offers thematic extracts. This turned out to be more complicated than I initially thought. For context, I don’t own hardware or have hardware expertise, so I am reliant on cloud providers.
I use these 3 services:
- GitHub pages to serve frontend code
- GitHub scheduled workflows to run extraction jobs daily
- Supabase object storage to store results of extraction jobs
GitHub scheduled workflows are limited by size, so forget processing a planet file or similar (unless one is happy to pay GitHub for the workflows). The Supabase service I can access without providing a bank card is limited to 1 GB of total storage and a maximum file size of 50 MB. I tried other providers, but I found there is always a catch:
- Backblaze: you can have a free account, but you can’t create a public bucket unless you give payment info (and then you have to worry about egress charges)
- Hetzner: cheap storage, but again, if I give payment info, then I can theoretically face runaway egress charges
- AWS, Azure, GCP: I use these enough at work, and I wanted to try something new in my free time (plus the theoretical threat of runaway egress charges)
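The Supabase limits mentioned above (1 GB in total, 50 MB per file) are easy to check before uploading anything. The sketch below is only an illustration: the function names are invented, and whether the provider counts a megabyte as 1024 * 1024 bytes or as 1,000,000 bytes is an assumption.

```python
from pathlib import Path

# Assumed binary units; the provider may count in decimal megabytes instead.
MAX_FILE_BYTES = 50 * 1024 * 1024
MAX_TOTAL_BYTES = 1024 * 1024 * 1024


def check_quota(sizes):
    """Given {file name: size in bytes}, return (files over the per-file limit, total fits)."""
    too_large = sorted(name for name, size in sizes.items() if size > MAX_FILE_BYTES)
    total_fits = sum(sizes.values()) <= MAX_TOTAL_BYTES
    return too_large, total_fits


def sizes_in_dir(directory, pattern="*.gpkg"):
    """Collect file sizes for all matching files in a directory (hypothetical helper)."""
    return {p.name: p.stat().st_size for p in Path(directory).glob(pattern)}
```

An extraction job could call check_quota(sizes_in_dir("output")) and skip any file that ends up in the first list.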
The end result is this pretty minuscule website: OSM extracts (source code).
I could probably include some more regions, albeit not large ones (due to the 50 MB file size limit). My main takeaway is that it is not an accident that a truly robust website offering GPKG (or similar) extracts does not exist: creating one takes more effort than a Sunday afternoon. I hope next time I have a similar idea, I’ll just go outside instead.