Is GlobalBuildingAtlas properly licensed and crediting OpenStreetMap as a source?

Dear all,

The latest weeklyOSM issue mentions the GlobalBuildingAtlas dataset and states that it involves OpenStreetMap data among other sources (confirmed by §3.2, p. 6 here: https://arxiv.org/pdf/2506.04106).

The French community noticed that its current license is CC BY-NC 4.0, according to its Terms of use. The article credits OpenStreetMap and its license as a bibliography reference, but that credit is not extended to the resulting dataset.

Isn’t it surprising that the resulting dataset (or at least the OSM part of it) isn’t published under ODbL, as a Collective Database, which would also make it suitable for contributing back to OpenStreetMap?
Will reusers of GlobalBuildingAtlas be aware of this without reading the corresponding publication?

Anyway, this is an interesting case to look at to better understand the actual rights and duties that come with ODbL data reuse, particularly in scientific work.

Have a nice weekend

2 Likes

The resulting dataset contains ODbL-licensed data coming from OSM, but also from Microsoft.

My understanding of this publication is that OSM building footprints have been used to train a computer-vision model, but also as a source of “polygons”: 0.49 billion OSM polygons are present in the final dataset (and 0.43 billion from Microsoft) after the fusion of sources (see Fig. 11 in Appendix B). That’s 33% of the number of buildings in the dataset, and in Europe it is almost 100% (with more than 50% from OSM).

That’s why I’m surprised that the dataset is under CC-BY-NC instead of ODbL.

1 Like

Why would you be surprised? It is unlikely that anybody considered that the licensing of the input databases could have a bearing on their work. I would also note that the NC bit might be driven by the Planet Labs licensing, which they probably did have to take into account.

PS: I couldn’t be bothered to point the issue out a week ago, as it is likely not solvable without unpublishing the data, in other words: lots of drama, unless they have maintained some kind of source indication in the final dataset (then they could remove the OSM data).

They could also grant us permission to reuse the result to contribute back, which would also be relevant, even if not completely compliant.

One of the problems is that they deduplicated the polygons, and that is a process we have never accepted as a legitimate way to create a Collective Database. So even if they have maintained some indication of the source of each polygon, we would already have to turn a blind eye.

1 Like

The published GeoJSON files record the source of each building geometry, as well as its id.

{
"type": "FeatureCollection",
"name": "GUF04_DLR_v02_e010_n50_e015_n45_OGR04_lod1",
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:EPSG::3857" } },
"features": [
{ "type": "Feature", "properties": { "source": "osm", "id": "1210083016", "height": 6.7298097610473633, "var": 6.2188262939453125, "region": "DEU" }, "geometry": { "type": "Polygon", "coordinates": [ [ [ 1358133.554630329599604, 6412073.704819994978607 ], [ 1358109.76565514691174, 6412075.188091930933297 ], [ 1358110.667343022301793, 6412089.503404381684959 ], [ 1358134.445186255732551, 6412088.020129905082285 ], [ 1358133.554630329599604, 6412073.704819994978607 ] ] ] } },
{ "type": "Feature", "properties": { "source": "ms", "id": "Germany_120212221_13087", "height": 2.6195902824401855, "var": 1.8346726894378662, "region": "DEU" }, "geometry": { "type": "Polygon", "coordinates": [ [ [ 1359001.339176582638174, 6412954.902559894137084 ], [ 1358993.882185156689957, 6412954.665092710405588 ], [ 1358994.081657590111718, 6412948.401219916529953 ], [ 1359001.538649015594274, 6412948.63868709653616 ], [ 1359001.339176582638174, 6412954.902559894137084 ] ] ] } },

I won’t comment on geometries in EPSG:3857 with 15 decimals.
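
For anyone who wants to check how many polygons in a tile come from each source, here is a minimal sketch in Python. It only assumes what the excerpt above shows, namely that every feature carries a `source` property (`osm`, `ms`, ...). The sample embedded below reuses the two ids from the excerpt with the geometry dropped for brevity; the helper names `count_by_source` and `source_shares` are made up for this illustration and are not part of any tooling published with the dataset.

```python
import json
from collections import Counter

# Two-feature sample mirroring the excerpt above. Geometries are set to
# null for brevity (an assumption for illustration; real tiles have polygons).
SAMPLE = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"source": "osm", "id": "1210083016", "region": "DEU"},
     "geometry": null},
    {"type": "Feature",
     "properties": {"source": "ms", "id": "Germany_120212221_13087", "region": "DEU"},
     "geometry": null}
  ]
}
"""


def count_by_source(collection: dict) -> Counter:
    """Count features per value of the `source` property."""
    return Counter(
        (feature.get("properties") or {}).get("source", "unknown")
        for feature in collection.get("features", [])
    )


def source_shares(counts: Counter) -> dict:
    """Turn absolute counts into fractions of the total feature count."""
    total = sum(counts.values())
    return {src: n / total for src, n in counts.items()} if total else {}


counts = count_by_source(json.loads(SAMPLE))
print(counts)
print(source_shares(counts))
```

To run it on a downloaded tile, replace `json.loads(SAMPLE)` with `json.load(open(path))`, where `path` points to the GeoJSON file. Large tiles may need a streaming parser instead of loading the whole file into memory.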

What’s the next step? Send an email to the authors? Or is this something to let the LWG deal with?

1 Like

I’ve sent the following email to the paper author:

Hello,

I’m writing to you because I’m surprised by the choice of data license you’ve set on the GlobalBuildingAtlas dataset.

As mentioned and explained in your paper, at least two of the data sources you used to create this dataset are under the Open Database License (ODbL): the OpenStreetMap and Microsoft building datasets.

I’ve downloaded the data extract you provide to have a look at the final dataset, and it confirms that building polygons from OSM (and Microsoft) make up a substantial portion of the resulting dataset.

In such a case, your dataset must be published under the ODbL license (see section 4.2), because it is a Derivative Database (see section 1.0 of the ODbL for the definition).

A copy of this message has also been sent to the Legal Working Group of the OSM Foundation.

Thanks in advance for quickly fixing the license of the dataset you published. This will also allow OpenStreetMap contributors to use it to improve OpenStreetMap, which is not possible with the CC-BY-NC license you chose.

The ODbL license is published by Open Data Commons: Open Database License (ODbL) v1.0

13 Likes

Did you hear anything back? Has the LWG taken any steps?

This dataset was mentioned on a German GIS news blog. I pointed out the possible license problem there and added a link to this discussion: Gigantisch: Alle Gebäude der Erde erfasst?! | #geoObserver

3 Likes

#geoObserver: Thanks for the tip, the post has been updated with a corresponding note:
“Attention, addendum (09.12.2025, 13:25):
Please note comment 1 on this post; there may be a licensing problem with the data.”

1 Like

Thread at Hacker News