The best places to bite the elephant: Using osm-lump-ways to break large tasks into smaller ones

We all know how to eat an elephant (one bite at a time![1]). Mapping in OSM can be like that. Some tasks are so large, it seems impossible to tackle them. Where do we start? Where do we start biting the elephant?[2]

You all know osm-lump-ways as the software powering [WaterwayMap.org](https://waterwaymap.org). It's also useful for finding missing data, such as roads without certain tags. This post is about the new Betweenness Centrality feature, which shows where to break large groups of such roads into smaller ones.

Look at pedestrian routing. OSM now has good road network coverage. You can figure out the road connection from A to B[3]. But on foot? Only 1.5% of highways have a sidewalk tag, which records whether the road has a footpath.

osm-lump-ways shows there is a massive single connected blob of roads in Ireland lacking any sidewalk data.

Tags used: for the footpath discussion, the following tag filter is used. It's written in the [osm-lump-ways tag filter language](https://github.com/amandasaurus/osm-lump-ways#filtering-osm-data). It excludes roads that are correctly mapped (`→F`), and things that aren't roads.

sidewalk→F; sidewalk:both→F; sidewalk:left∧sidewalk:right→F; foot=permissive→F; service=driveway→F; highway∈path,motorway,motorway_link,footway,cycleway,track,proposed→F; highway→T; F
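For anyone who finds the arrows hard to read, here is roughly what that filter decides, written as a Python sketch. It assumes the rules are tried left to right and the first one that matches wins, with the final bare `F` as the default. That is my reading of the syntax; the function name `lacks_sidewalk_data` and the `NOT_A_ROAD` set are made up for illustration and are not part of osm-lump-ways.

```python
# Highway values that the filter treats as "not a road we care about".
NOT_A_ROAD = {
    "path", "motorway", "motorway_link", "footway",
    "cycleway", "track", "proposed",
}


def lacks_sidewalk_data(tags):
    """Return True if a way with these tags would be kept by the filter.

    Assumption: rules are checked in order and the first match decides.
    """
    if "sidewalk" in tags:                                  # sidewalk→F
        return False
    if "sidewalk:both" in tags:                             # sidewalk:both→F
        return False
    if "sidewalk:left" in tags and "sidewalk:right" in tags:
        return False                                        # left ∧ right→F
    if tags.get("foot") == "permissive":                    # foot=permissive→F
        return False
    if tags.get("service") == "driveway":                   # service=driveway→F
        return False
    if tags.get("highway") in NOT_A_ROAD:                   # highway∈...→F
        return False
    if "highway" in tags:                                   # highway→T
        return True
    return False                                            # trailing F


# A plain residential road is kept; one with a sidewalk tag is dropped.
kept = lacks_sidewalk_data({"highway": "residential"})
dropped = lacks_sidewalk_data({"highway": "residential", "sidewalk": "both"})
```

Note that a road with only `sidewalk:left` is still kept, because the filter wants both sides tagged before it counts the road as done.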

Imagine we take 2000 nodes in this network and calculate the shortest path from each one to each of the others. Then, for each segment, we add up how many of those shortest paths go over it. This is the betweenness centrality value of each segment, and it shows where the “choke points” are. If we want to split that single blob into 2 blobs, we should focus our sidewalk mapping on those “high” value segments. Doing that for Ireland, we get this map: blue = not used a lot, red = used a lot.
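To make the counting idea concrete, here is a minimal Python sketch of it on a toy network. It is not how osm-lump-ways implements the feature; it just samples nodes, runs Dijkstra from each sampled node, and counts how often each segment lies on a shortest path between two sampled nodes. The helper names (`build_graph`, `shortest_path_tree`, `segment_usage`) and the toy edge list are invented for this example, and ties between equally short paths are broken arbitrarily.

```python
import heapq
import random
from collections import Counter


def build_graph(edges):
    """Turn (a, b, length) triples into an undirected adjacency dict."""
    graph = {}
    for a, b, length in edges:
        graph.setdefault(a, {})[b] = length
        graph.setdefault(b, {})[a] = length
    return graph


def shortest_path_tree(graph, source):
    """Dijkstra: return {node: previous node} for nodes reachable from source."""
    dist = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale queue entry, a shorter route was already found
        for neighbour, length in graph[node].items():
            nd = d + length
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                prev[neighbour] = node
                heapq.heappush(queue, (nd, neighbour))
    return prev


def segment_usage(graph, sample_size=2000, seed=0):
    """Count, per segment, the shortest paths between sampled nodes using it."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    sample = rng.sample(nodes, min(sample_size, len(nodes)))
    usage = Counter()
    for source in sample:
        prev = shortest_path_tree(graph, source)
        for target in sample:
            node = target
            # Walk back towards the source, crediting every segment on the way.
            while node in prev:
                parent = prev[node]
                usage[frozenset((node, parent))] += 1
                node = parent
    return usage


# Toy network: two triangles joined by a single "bridge" segment c-d.
toy = build_graph([
    ("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
    ("c", "d", 1),
    ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
])
usage = segment_usage(toy)
busiest = max(usage, key=usage.get)
```

In the toy network, `busiest` comes out as the c-d segment, because every path between the two triangles has to cross it. Add sidewalk tags to that one segment and the blob falls into two.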

Those maps are from the start of the year. Using Mapillary, I have managed to split off the northern chunk[4] and most of Dublin city (mid-east coast). We now have this betweenness centrality map. I’ve made 3 choke points, and I’ve asked people nearby to take street level imagery.

Personally, I find it quite satisfying to track down these choke points, and I get a sense of achievement from splitting the big blobs into smaller blobs. The task of “adding sidewalk tags to everything in Ireland” seems less impossible. I hope this motivates other OSMers to map more, and helps us get to the next level of data completeness.

This is an example with “roads lacking footpath data”, but “roads without surface tags” or “… without lanes tag” are also possible.

Use osm-lump-ways with `--betweenness-output FILENAME.geojson` and you can make maps like this.


  1. It means to break a large seemingly impossible task into many smaller, manageable tasks ↩︎

  2. an acceptable number of elephants were harmed in the creation of this post ↩︎

  3. I’m simplifying! What’s the best road, the fastest, what happens if it’s closed, yes we don’t have traffic data, I’m simplifying ↩︎

  4. Ulster said no in the end… ↩︎


I may be swearing in a church, but as a hiker I care about whether or not I can use a highway for walking, sidewalk or not.

I’m also a hiker![1]. While mapping in Ireland, I was surprised by how rubbish the roads there are for walking. Maybe I’ve been away too long, and I’m too germanified.

Since then I’ve learned of the `shoulder=*` tag, and maybe I’ll map some of that.

But with this tool, you can choose the “high data quality” tagging combo that you want, and figure out where to start adding it.


  1. In my user profile I’m wearing a merino t-shirt, while on the train to hike up Germany’s highest mountain ↩︎