World Bank Electrical Grid Data Integration

Hey OSM Community :wave:,

I started my new position at Open Energy Transition this week, where I support the development of PyPSA-Earth by increasing the global coverage and data quality of OSM’s electrical grid data. After some extensive online research, I found several up-to-date World Bank datasets that have not yet been integrated into OSM, especially for Africa and Asia. Here is a list of datasets where a quick comparison with OpenInfraMap showed major grid data that is still missing in OSM.

In the import catalogue I could see that the user NEOers had planned to integrate some other World Bank datasets, but these imports seem to have been abandoned. User:NEOers - OpenStreetMap Wiki

Liberia Power dataset import | Liberia transmission grid power dataset from the World Bank Group | Liberia | CC1 | None | None | Planned | 2023-05 | N/A | NEOers | The dataset is ready and being reformatted for importing. | Unknown
Dominican Republic Power dataset import | Dominican Republic Power dataset import | Dominican Republic | CC1 | None | None | Planned | 2023-05 | N/A | NEOers | The dataset is ready and being reformatted for importing. | Unknown
Bangladesh power dataset import | Bangladesh transmission grid power dataset from the World Bank Group | Bangladesh | CC1 | None | None | Planned | 2023-05 | N/A | NEOers | The dataset is ready and being reformatted for importing | Unknown

Do you have any idea why this has not been done? Is it a matter of human resources to validate and integrate this dataset into OSM, or are there other reasons you know of why it has not been integrated?

Tobias, I don’t know about the status of these specific data imports. A data import in OSM is a complicated matter; you don’t just take some data set, convert it into a different format, and hit upload. First of all, you need to make sure that the license is compatible, and then you need to check that your data doesn’t duplicate stuff already in OSM, or implausibly intersect with other data (like power lines going straight through houses, power plants on lakes and so on).

We prefer this work to be done by local community members who know the area (instead of someone on the other side of the planet) - this greatly increases quality but also limits the available workforce.

I haven’t looked at all of the data sets that you have mentioned, but one of them says that it already contains OSM data so some de-duplication would certainly be required, and crucially it also says “this data is partially based on a digitized PDF map, and so is intended as a schematic of rough locations of the power network” - that would certainly be a no-go for a direct import, and every feature in that data set would have to be manually aligned with available aerial imagery.
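To give a rough idea of what a first de-duplication pass could look like, here is a minimal sketch in plain Python. It flags candidate lines whose vertices all lie close to vertices already in OSM. The function names, the sample coordinates, and the 0.5 km tolerance are assumptions made for illustration only; a real check would compare against full line geometries, and anything flagged (or not flagged) would still need manual review against aerial imagery.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def likely_duplicates(candidate_lines, osm_lines, tolerance_km=0.5):
    """Return ids of candidate lines whose every vertex is near some OSM vertex.

    Both arguments map an id to a list of (lat, lon) vertices.
    tolerance_km is an assumed value, not an OSM convention.
    """
    osm_vertices = [v for line in osm_lines.values() for v in line]
    flagged = []
    for line_id, vertices in candidate_lines.items():
        # Skip empty geometries; otherwise all() would be vacuously true.
        if vertices and all(
            any(haversine_km(v, o) <= tolerance_km for o in osm_vertices)
            for v in vertices
        ):
            flagged.append(line_id)
    return flagged

# Hypothetical sample data, not taken from any real dataset.
osm = {"way/1": [(6.30, -10.80), (6.35, -10.75)]}
candidates = {
    "wb-1": [(6.301, -10.801), (6.351, -10.751)],  # follows way/1 closely
    "wb-2": [(7.00, -9.50), (7.10, -9.40)],        # far from anything in OSM
}
print(likely_duplicates(candidates, osm))
```

A vertex-to-vertex comparison like this is deliberately crude: it only narrows down which features a human should look at first.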

For more on data imports, see Import - OpenStreetMap Wiki


Just for the record (as in this will have to be dealt with one way or the other): it seems that those datasets are licensed under CC BY 4.0, which requires a waiver for compatibility with OSM. See Use of CC BY 4.0 licensed data in OpenStreetMap | OpenStreetMap Blog


I hadn’t seen this data before, it is quite interesting. I think it would be best used as a source for manually improving the existing state of the data, or manually-assisted importing in some cases.

Doing a straight import would require conflating the data with what exists in OSM and in most cases that’s likely harder than just doing it manually.

The main issue here is that these datasets are licensed as CC-BY, which is not directly compatible with OSM. So in any case we’ll first need to get permission to use this data - it appears that NEOers didn’t get that far.
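As a small illustration of how the license question could be triaged before any other work starts, the sketch below sorts datasets by their stated license. The mapping is an assumption that only reflects what has been said in this thread (CC BY 4.0 needs a waiver, CC0 does not); the function name and the status strings are hypothetical, and anything not listed falls back to a manual check.

```python
# Assumed mapping, based only on the licenses mentioned in this thread.
LICENSE_ACTIONS = {
    "CC0": "no waiver needed",
    "CC BY 4.0": "needs waiver",
}

def license_action(license_name):
    """Return the assumed next step for a dataset with the given license string."""
    # Normalise spelling variants such as "cc-by 4.0" or "CC  BY 4.0".
    normalized = " ".join(license_name.replace("-", " ").upper().split())
    return LICENSE_ACTIONS.get(normalized, "check manually")

# License strings as they appear in this thread and in the catalogue entries.
datasets = {
    "Zambia": "CC0",
    "Namibia": "CC BY 4.0",
    "Liberia": "CC1",
}
for name, license_name in datasets.items():
    print(f"{name}: {license_action(license_name)}")
```

Passing the license check says nothing about data quality; the duplication and alignment checks still apply.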


Thanks for the very constructive and quick answers!

you don’t just take some data set, convert it into a different format, and hit upload.

I’m aware of this, but I’m not quite sure how much work is involved.

We prefer this work to be done by local community members who know the area (instead of someone on the other side of the planet) - this greatly increases quality but also limits the available workforce.

We also want to motivate the local community to support this endeavor, if possible.

CC BY 4.0 So in any case we’ll first need to get permission to use this data

I’m planning to get in touch with the World Bank directly, not just because of the license, but also to discuss how to streamline the integration of the data they are creating into OSM, both for updates and for future grids they are mapping.

Doing a straight import would require conflating the data with what exists in OSM and in most cases that’s likely harder than just doing it manually.

As we are mainly talking about high-voltage grids, I think the integration of the data is still manageable manually.

We are building a whole group of people at Open Energy Transition to focus solely on this challenge, so I think it is practically feasible.

With rare exceptions, it is a very large amount of work, especially if the import is done properly.

I would encourage you to start with a small and simple dataset, preferably in an area familiar to you, or with part of a larger dataset if all of them are large and complex.

Imports can be very helpful and powerful tools, but it is also easy to cause problems.


For the record, I am working with Open Energy Transition on improving power network data in OSM (and hopefully some other good things).

I’m (naturally) very familiar with how this infrastructure is mapped in OSM, and I’m also quite experienced with imports. So I’ll keep an eye on them :wink:

I think it’s good to have everything documented in this forum though.


Thank you again for the very helpful feedback. As the maintainer of the OpenSustain.tech dataset, I’m well aware of the importance of human review and validation, and I would never allow a mass upload of data here without reviewing each and every project.

I just came across all these datasets and wanted to document them somewhere, so doing it here seemed like the right choice. At least for a first good guess they should be very helpful.

energydata.info also releases a CC0 dataset for Zambia. An overlay with this data was merged into editor-layer-index, meaning the next iD release will have it. I find overlays helpful for, as Russ put it, “manually improving the existing state of the data”.

There is a CC BY 4.0 dataset for Namibia. I contacted energydata@worldbankgroup.org to ask them to release it under CC0 or provide a CC BY 4.0 waiver. They replied initially, but I never received a waiver or a CC0 release.

I can’t find the exact place right now, but I think I have seen an example where energydata.info relicenses ODbL data under some CC licence. I don’t think that’s legal. They don’t seem to fully know what they are doing regarding licensing, and they respond very slowly, so expect some difficulties when trying to get waivers.

If overlays help, I’m happy to make overlays for you.


I can also imagine that they will first have to go through a few internal loops to clarify this issue, so it might take some time to get an answer. I’m also going to contact the World Bank next year regarding the license issue. It might help if multiple people from different organizations ask.

I also found out why these new datasets were overlooked and had not been noticed before. It looks as if the large overview map and the associated dataset are maintained very irregularly: https://africagrid.energydata.info/

Even in the new overview map some new maps are missing.


Note that some of the datasets on energydata.info aren’t actually compiled by the World Bank: the West African data comes from ECREEE. It would be good to approach them and see if they could benefit from OSM.


Hello all

One additional question that should be asked is about the relevance of the import: what goals would importing World Bank data into OSM help to achieve, and what problems would it help to solve?
It could be about the power network topology, the physical capabilities of the power network, knowledge of particular power network components (transformers, compensators), or all of them at the same time.

PyPSA was mentioned on this forum when Bobby Xiong’s paper came out recently. I think this discussion is an opportunity to define what PyPSA requires, needs, and wishes from OSM, and what benefits it could bring back.
Answers to that will help to focus on what matters most.


Like OSM, the PyPSA community and user base is extremely heterogeneous, with organisations and individuals pursuing very different interests. Because PyPSA is extremely modular, it can be used with very different focuses and at very different scales, and it is the most widely used open source framework in this area, with over 80,000 downloads per month.

Our (Open Energy Transition’s) main focus is to improve grid data for the wider energy modelling community and other scientists, but also for grid operators, planners and policy makers, in order to have the greatest impact on the transition towards a sustainable energy system. The PyPSA-Earth project is perhaps the best example of how immediately and on what scale we, as Open Energy Transition, plan to use the dataset:

Fortunately, we’re still very flexible at the moment in terms of how best to use the resources of the project I’m responsible for. This raises the question of where to find the low-hanging fruit that will add value for the most people. Unfortunately, I have not been able to find any scientific publications on which investments have the greatest impact. Therefore, I think one of the first main tasks is to ask the different communities about their experiences with missing data, both in terms of quantity and quality. I definitely see the need for several workshops, preferably face-to-face, to organise and address these challenges. But please, let’s not discuss it all here in this post.

Here is a quick map comparing three sources: the nightlight-based Gridfinder dataset, the existing combined African grid data published by the World Bank, and the current OSM grid data. I hope this helps to give a rough idea of the coverage of OSM data in Africa and of where integration with the support of local mappers could be very useful.
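For anyone who wants a number to go with such a map, a very rough coverage comparison can be sketched in a few lines of Python: bin the vertices of each dataset into grid cells and list the cells where the reference data has lines but OSM has none. The function names, the 1-degree cell size, and the sample coordinates are assumptions for illustration; because only vertices are binned, long segments that cross a cell without a vertex in it are missed.

```python
import math

def occupied_cells(lines, cell_deg=1.0):
    """Return the set of (row, col) grid cells that contain at least one vertex.

    `lines` is a list of lines, each a list of (lat, lon) vertices in degrees.
    """
    cells = set()
    for vertices in lines:
        for lat, lon in vertices:
            cells.add((math.floor(lat / cell_deg), math.floor(lon / cell_deg)))
    return cells

def coverage_gaps(reference_lines, osm_lines, cell_deg=1.0):
    """Cells where the reference dataset has vertices but OSM has none."""
    gaps = occupied_cells(reference_lines, cell_deg) - occupied_cells(osm_lines, cell_deg)
    return sorted(gaps)

# Hypothetical sample data, not taken from any real dataset.
reference = [
    [(6.3, -10.8), (7.2, -9.4)],
    [(-15.4, 28.3), (-15.1, 28.9)],
]
osm = [[(6.4, -10.7), (6.6, -10.2)]]
print(coverage_gaps(reference, osm))
```

The resulting cell list is only a pointer to where a closer look, ideally together with local mappers, might be worthwhile.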

A small web application that shows all the data together might be helpful for the community, and I’m thinking about making one. Before that, I will try to get the other datasets that are not downloadable from energydata.info and find or create a more recent update of the Gridfinder data.
