GeoDesk for Python: Analyze & Visualize OpenStreetMap Data

Hi all,

We’re pleased to announce that GeoDesk is now available for Python.
GeoDesk is an open-source toolkit for geographic object libraries, which store OpenStreetMap data in a compact format and allow fast queries.

You may already know about our Java-based toolkit and the GOL tool that we shipped last year. Feedback has been very positive overall, but there has been one recurring theme: What if you need more complex queries than the GOL tool provides, but aren’t ready (or willing) to build a full-blown Java application?

Our latest release bridges this gap and puts the ability to work with OSM data into the hands of a much larger audience. Python scripts are easy to write (even a complete beginner can learn the basics in an afternoon) and open the door to a vast ecosystem of tools, from mapping to machine learning.

GeoDesk is 100% FOSS and has minimal hardware requirements. It currently supports Python 3.7 or above on Windows and Linux (macOS support is experimental and limited to Catalina 10.15 or later).

To get started, create a GOL from any .osm.pbf file using the GOL tool (download / tutorial). On any reasonably modern machine, this takes less than an hour for the planet, or a few minutes for a country-sized extract. (While a GOL is only about 40% larger than the .osm.pbf, you will need available storage for temporary files – typically 3x the source-file size, or about 200 GB for a complete planet.)
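
If you want to check your free disk space before building, the rules of thumb above can be turned into a quick estimate. This is only a sketch: the function name and the "peak" figure (which assumes the finished GOL and the temporary files exist side by side) are our illustration, not something the GOL tool reports.

```python
def gol_build_storage_gb(pbf_size_gb, gol_ratio=1.4, temp_ratio=3.0):
    """Rough disk-space estimate (in GB) for building a GOL from an .osm.pbf.

    gol_ratio and temp_ratio are the approximate figures quoted in the post
    (GOL about 40% larger than the source, temp files about 3x the source).
    They are rules of thumb, not guarantees.
    """
    gol_gb = pbf_size_gb * gol_ratio
    temp_gb = pbf_size_gb * temp_ratio
    # Conservative assumption: GOL and temp files coexist at the end of the build
    return {"gol": gol_gb, "temp": temp_gb, "peak": gol_gb + temp_gb}


# Hypothetical 4 GB country extract
estimate = gol_build_storage_gb(4.0)
print(estimate)
```

Plug in the size of your own extract; the ratios can be overridden if your experience differs.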

After you pip install geodesk, fire up the Python shell and open your GOL (e.g. france.gol):

>>> from geodesk import *
>>> france = Features("france")

Select the features you want:

>>> museums = france("na[tourism=museum]")

(The query language is similar to Overpass: na here means nodes or areas; the desired tags are placed in square brackets)

Visualize these features on a Leaflet-style map:

>>> museums.map.show()

(This creates an HTML temp file and opens it in a browser; on some flavors of Linux, you will need to explicitly name the file: museums.map("paris-museums").show())

Restrict queries to an area:

>>> paris = france("a[boundary=administrative][admin_level=8][name=Paris]").one
>>> paris_museums = museums(paris)

Work directly with individual features:

>>> for museum in museums:
...     print(museum.name)

Find nearby features:

>>> nearby_subways = subway_stops.around(museum, meters=500)
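
Conceptually, a radius query keeps only the features that lie within a given distance of a reference point. The standalone sketch below shows that idea with the haversine formula on plain lon/lat tuples. It is not how GeoDesk implements its query; the helper names and coordinates are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(center, candidates, meters):
    """Return the (name, lon, lat) candidates no farther than `meters` away."""
    clon, clat = center
    return [c for c in candidates
            if haversine_m(clon, clat, c[1], c[2]) <= meters]


# Made-up coordinates, for illustration only
museum = (2.3364, 48.8606)
stops = [("A", 2.3380, 48.8610), ("B", 2.3700, 48.8530)]
print(within_radius(museum, stops, 500))
```

A linear scan like this is fine for a handful of points; for real OSM data you would let the library's spatial query do the work.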

Once you’re comfortable with Python, you can do more interesting things. Here’s how you obtain an alphabetical list of the unique street names in a city:

>>> sorted({s.name for s in streets(berlin)})
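
If the set comprehension looks unfamiliar, here is the same pattern on stand-in objects that you can run without a GOL. The SimpleNamespace objects are placeholders for real features, and the extra "if s.name" filter assumes that unnamed features yield an empty name.

```python
from types import SimpleNamespace

# Stand-in objects with a .name attribute; in a real session these would be
# the features returned by a query
streets = [
    SimpleNamespace(name="Unter den Linden"),
    SimpleNamespace(name="Kastanienallee"),
    SimpleNamespace(name="Unter den Linden"),  # same street, second segment
    SimpleNamespace(name=""),                  # unnamed way
]

# The set removes duplicates, sorted() returns an alphabetical list
unique_names = sorted({s.name for s in streets if s.name})
print(unique_names)
```

Streets are usually split into many way segments, which is why de-duplicating by name matters here.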

Of course, you’re not limited to pithy one-liners – you can write proper scripts (and entire applications). There are some examples on the Wiki.

What else can you do?

  • Query features based on type, tags and spatial relationships (e.g. intersects, within, contains, connects_to)
  • Measure features: length, area, distance
  • Explore object graphs (nodes of ways, members of relations)
  • Obtain geometric shapes and process them using Shapely (Python wrapper for GEOS)
  • Convert features to various formats (just GeoJSON and WKT for now, but more are coming)

For details, check out the full documentation and the GitHub repository.

Please keep in mind that this is an Early Access release, which means you’ll likely encounter bugs and missing functionality. As always, we appreciate bug reports and other feedback.

You can post general usage questions in Help & Support of this forum (tag your post geodesk for faster responses).

What’s next

These are our plans for the coming months:

  • Ship at least two more releases (to implement all documented capabilities and address any bugs)
  • Port the enhancements in the Python version back into GeoDesk for Java
  • Enable incremental updating for GOLs (#2 most-requested)

We’ll keep you posted on our progress. Thanks for your interest & support!

Hi all,

We just released two new versions:

GeoDesk for Java 0.1.9

Bug Fixes

  • OsmPbfReader now shuts down cleanly when reading a corrupt source file (gol-tool#104) – this bug caused gol build to hang on truncated .osm.pbf files.

  • gol build: Fixed encoding bug that caused access to certain way-nodes to fail (gol-tool#105)
    Note: We recommend re-building any GOL with more than 16K tiles

Other Changes

  • MapMaker now uses the standard OSM Carto style by default (geodesk-py#17)

GeoDesk for Python 0.1.1

Install / upgrade: pip install geodesk -U

Enhancements

  • Creating Coordinate objects is now easier using lonlat() and latlon() (geodesk-py#10)

  • Feature sets now support __contains__ for Python’s in operator (geodesk-py#23)

  • Number of tags in a Tags object can now be obtained using len()

  • Partial support for Features.tiles
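
For readers new to Python: the in operator and len() work on any object that defines the __contains__ and __len__ methods. The class below is a hypothetical stand-in (not a GeoDesk class) that shows both hooks in one place.

```python
class TagSet:
    """Hypothetical container, used only to illustrate the two protocols."""

    def __init__(self, tags):
        self._tags = dict(tags)

    def __contains__(self, key):
        # Called for: key in tag_set
        return key in self._tags

    def __len__(self):
        # Called for: len(tag_set)
        return len(self._tags)


tags = TagSet({"tourism": "museum", "name": "Louvre"})
print("tourism" in tags, len(tags))
```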

Deprecations

  • Feature.is_placeholder has been deprecated and will be removed in the next major release. Missing relation members will no longer be represented by a “placeholder” feature, but will be omitted from the relation, and the relation will be tagged geodesk:missing_members=yes.

Bug fixes

  • Features.members_of() now properly returns an empty set if called on features that are not relations (geodesk-py#18)

  • Tags: Fixed bug that caused certain tags to be skipped while iterating

Other changes

  • Maps now use the standard OSM Carto style by default (geodesk-py#17)

Related issues & workarounds

  • geodesk-py#19: Way-node retrieval may fail when querying large GOLs (16K+ tiles) that were built with GOL Tool version 0.1.8 or below, due to an encoding bug in gol build (gol-tool#105, see above).

    We recommend upgrading your GOL Tool and re-building any affected GOLs.

Happy New Year!

GeoDesk for Python Version 0.1.3 is now available (release notes).

Install / upgrade: pip install geodesk -U

GeoDesk for Python Version 0.1.4 is now available.

  • Large-area spatial queries are 4x to 30x faster. GeoDesk now finds the 60+ million mapped buildings in the U.S. in less than one second (10x 2.3GHz Xeon, 32 GB RAM, NVMe SSD)

  • Area calculations for features near the polar regions are significantly more accurate (A scaling bug previously caused distortions of areas above 65 degrees latitude)

  • Other fixes (release notes)
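
Some background on why high latitudes are tricky for area calculations: on a Mercator map, lengths are stretched by 1/cos(latitude), so small areas are inflated by 1/cos²(latitude) and must be scaled back accordingly. The sketch below prints that factor for a few latitudes. It assumes areas are derived from Mercator-projected coordinates and is not a description of the actual bug or its fix.

```python
import math


def mercator_area_scale(lat_deg):
    """Factor by which a small area is inflated on a Mercator map at lat_deg."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2


for lat in (0, 45, 65, 80):
    print(f"{lat:>2} deg: x{mercator_area_scale(lat):.2f}")
```

The factor grows quickly toward the poles, so a small error in the scaling has a much larger effect there than near the equator.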

Install / upgrade: pip install geodesk -U

GeoDesk for Python Version 0.1.5 is now available (release notes):

  • A series of small but important fixes/enhancements for GeoJSON and WKT output.

Install / upgrade: pip install geodesk -U

GeoDesk for Python Version 0.1.6 is now available (release notes).

  • from_mercator(), nodes.parents, intersects() for nodes

Install / upgrade: pip install geodesk -U

GeoDesk for Python Version 0.1.7 is now available (release notes).

  • Enhancements & fixes related to queries

Install / upgrade: pip install geodesk -U

It’s fantastic how quickly you respond to bug reports. I reported one of the issues that is fixed in this release less than three hours ago :smiley:

Thanks! 0.1.7 was due for release, so I included what turned out to be an easy fix. Thanks so much for submitting all these issues; this has helped us greatly with our progress.

GeoDesk for Python Version 0.1.8 is now available (release notes).

  • Enabled autocomplete in IDEs

  • Fixed regex matching in tag queries

  • Improved coordinate handling

Install / upgrade: pip install geodesk -U

GeoDesk for Python Version 0.1.9 is now available (release notes).

  • Fixed negative tag queries

  • Improved parents_of() queries

Install / upgrade: pip install geodesk -U

GeoDesk for Python Version 0.1.10 is now available (release notes).

  • New geometric filters: max_area(), min_area(), max_length(), min_length()

Install / upgrade: pip install geodesk -U
