GeoDesk goes 1.0 (and what's coming in 2.0)

Hi everyone,

Version 1.0 is out! What’s new? Not much, really. It’s a boring release, in a good way. Our Java & Python toolkits have been stable for six months now, so we’re confident in recommending them for production use. (Of course, Murphy’s Law dictates that some epic bug will emerge the second I click “submit”.)

Quick recap: GeoDesk is an open-source database toolkit designed specifically for OpenStreetMap data. It turns OSM PBF files into a single-file GOL (Geographic Object Library), which is only 30% to 50% larger than the source PBF and supports a wide range of fast queries. Toolkits are available for Java, Python and C++ on all major platforms, with modest hardware requirements.

We’re already hard at work on Version 2.0, which is now far enough along for me to share some details about our planned rollout. The centerpiece will be a brand-new GOL Tool. We’ve redesigned the GOL file format and its toolchains to enable fast incremental updates. This has been the most requested feature. We keep hearing “Would love to adopt this, but the lack of update capability is a deal breaker” – so this is our #1 priority.

Currently, you have to rebuild a GOL from a fresh PBF file. This workflow is fast enough for analytics that run daily, but problematic for QA tools that need to analyze OSM data more frequently. With the new GOL Tool, changes (in .osc format) are downloaded from a replication server and applied directly, which takes a fraction of the time. The basics are in place; now begins the hard slog of ensuring that all conceivable edge cases work properly.
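
For anyone who hasn’t looked inside an .osc file: it is osmChange XML, with features grouped under create, modify and delete blocks. The sketch below is not how the GOL Tool processes updates; it only uses Python’s standard library to tally what a change file contains. The sample data and the `summarize_osc` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, hand-written osmChange snippet (not real OSM data).
SAMPLE_OSC = """<osmChange version="0.6" generator="example">
  <create>
    <node id="1" version="1" lat="48.1" lon="11.5"/>
  </create>
  <modify>
    <way id="2" version="3">
      <nd ref="1"/>
      <tag k="highway" v="residential"/>
    </way>
  </modify>
  <delete>
    <node id="3" version="2" lat="48.2" lon="11.6"/>
  </delete>
</osmChange>
"""

def summarize_osc(xml_text):
    """Count changed features per (action, feature type) pair."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for action in root:
        # Only the three osmChange actions are of interest here.
        if action.tag not in ("create", "modify", "delete"):
            continue
        for element in action:
            if element.tag in ("node", "way", "relation"):
                counts[(action.tag, element.tag)] += 1
    return counts

if __name__ == "__main__":
    for (action, kind), n in sorted(summarize_osc(SAMPLE_OSC).items()):
        print(f"{action:7} {kind:9} {n}")
```

A real change file from a replication server is typically gzip-compressed and far larger, so a streaming parser would be the better fit there.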

Our short-term goal for the new GOL Tool is to reach feature parity with 1.0. Even without the update capability, it will already be a substantial improvement: it is easier and more intuitive to use, and even in its unoptimized state it is significantly faster. Typical build time drops from 45 to 20 minutes, and a large-scale operation such as a GeoJSON extract of the world’s rivers now takes 5 seconds rather than a full minute. We’re looking to ship early-access builds in May.

We’re also working on an ultra-compressed file format (as a companion to GOL), which will simplify archival and distribution of OSM data and cut storage and bandwidth costs. This should be available by June, but we’ll likely publish details before then to get feedback.

We’re expecting 2.0 to be feature-complete by the end of the summer, with general availability by late fall. The Python and C++ toolkits will gain the new capabilities first. Once these are stable, we will port them to the Java toolkit. Despite the underlying format changes, the APIs will remain largely unchanged, making migration to 2.0 seamless for your applications.

This schedule isn’t carved in stone, of course. We’re a tiny, volunteer-only team without corporate backing. This keeps us independent, but it also makes our schedule more volatile than a commercial project’s. I will keep you all updated as things progress. As always, we’d love to hear your input! Follow @GeoDeskTeam (or better yet: @geodesk@en.osm.town), and keep an eye on our GitHub repos, where you can submit bug reports and vote to prioritize features (look for the green roadmap tags).

Big thanks again to everyone who helped us get here!
