Issues with Calculating Building Characteristics, Footprint, Roof, and Facade Areas using Overpass API and OSM Data

Hi everyone,

I’m working on a project using Overpass API and OpenStreetMap (OSM) data to extract building characteristics such as footprint area, roof area, and facade areas in each direction. I’m also trying to determine if buildings are adjacent or standalone based on proximity.

Problems I’m Facing:

  1. Overestimated Footprint and Roof Areas:
  • The calculated footprint and roof areas are consistently higher than expected. I’m projecting coordinates from WGS84 to UTM using pyproj, but the results seem inflated.
  2. Facade Area Calculation Issues:
  • I’m calculating facade areas based on side lengths and building heights, accounting for adjacent buildings. However, the facade areas (especially East and West) appear to be overestimated, even when adjacent buildings are close.
  3. Height Estimation:
  • I estimate heights using OSM height data (if available), building levels, or the gross floor area and footprint. Despite this hierarchy, height estimates sometimes seem inaccurate, leading to further issues with facade areas.
  4. Adjacent vs Standalone Buildings:
  • I’m struggling to determine whether buildings in a given area are adjacent or standalone. I’m using a proximity threshold to identify adjacent buildings, but it doesn’t seem to work as expected.
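For reference, the footprint calculation in problem 1 can be cross-checked without pyproj. The following is only a minimal sketch, not the script used in this project: it assumes a spherical Earth, a building small enough that a local tangent-plane projection is adequate, and a flat roof (so roof area equals footprint area). The function names and the radius constant are illustrative assumptions.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation (assumption)


def local_xy(ring):
    """Project (lat, lon) pairs in degrees to metres on a plane tangent at the ring's mean position."""
    lat0_deg = sum(lat for lat, _ in ring) / len(ring)
    lon0_deg = sum(lon for _, lon in ring) / len(ring)
    cos_lat0 = math.cos(math.radians(lat0_deg))
    return [
        (
            EARTH_RADIUS_M * math.radians(lon - lon0_deg) * cos_lat0,  # x, east
            EARTH_RADIUS_M * math.radians(lat - lat0_deg),             # y, north
        )
        for lat, lon in ring
    ]


def shoelace_area(xy):
    """Planar polygon area; works whether or not the ring repeats its first point."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(xy, xy[1:] + xy[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def footprint_area_m2(ring):
    """Footprint area in square metres for a ring of (lat, lon) pairs."""
    return shoelace_area(local_xy(ring))


# Example: a roughly 8.4 m x 11.1 m rectangle near New York
ring = [(40.7000, -74.0000), (40.7000, -73.9999),
        (40.7001, -73.9999), (40.7001, -74.0000)]
print(round(footprint_area_m2(ring), 1))
```

If this value and the UTM value agree closely for the same building, the projection step is probably not the source of the inflation.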

Questions:

  • Is there a better way to handle coordinate projection or calculate building footprint/roof areas to avoid inflated values?
  • How can I improve the calculation of facade areas, especially for adjacent buildings?
  • Any suggestions on how to reliably determine if buildings are adjacent or standalone based on proximity?

Any advice or insights would be much appreciated!

1,2,3. Can you tell us what your reference dataset, validation method, or basis of comparison is for the expected result? Are you following any prior research methodology or learning materials?
In general, these data won’t be precise worldwide. Only building:levels= is ever needed for general map navigation purposes, and even that is still very incomplete. I doubt whether building:part= needs to be considered at this scale.
1,2. For completeness, and in case of any latitude issues, which country are you working in?
3. How are you estimating heights from floor area and footprint? Have you adjusted for different building= types (e.g. a smaller building=apartments could actually be a taller tower than a larger block) and used different correlations? Some building= values cover a range of possible floor counts, e.g. building=house, building=detached, etc. may be 1 to 2 floors.
4. You can’t use around: for checking whether they are attached. You need to use the data topology.

[out:json][timeout:115];
wr({{bbox}})[building]->.all;
(.all; .all >;) -> .alldata; 
node.alldata(way_link.alldata:3-)->.sharedpts;
way.alldata(bn.sharedpts);
(way._; rel.all(bw._););
out geom;

(This ignores overlapping areas that do not share any nodes, which may be a mapping mistake in the first place.)
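On the Python side, the same topology idea can be sketched as follows. This is an illustration only: it assumes `elements` is the "elements" list of an Overpass `[out:json]` response covering all buildings of interest, and that each way carries its "nodes" list of node IDs. Multipolygon relations are not handled, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def shared_node_counts(elements):
    """Map each pair of way IDs to the number of distinct nodes they share."""
    ways_by_node = defaultdict(set)
    for el in elements:
        if el.get("type") != "way":
            continue
        # set() so the repeated closing node of a ring is counted once
        for node_id in set(el.get("nodes", [])):
            ways_by_node[node_id].add(el["id"])

    counts = defaultdict(int)
    for way_ids in ways_by_node.values():
        for pair in combinations(sorted(way_ids), 2):
            counts[pair] += 1
    return dict(counts)


def standalone_ways(elements):
    """Way IDs that share no node with any other way in the input."""
    attached = {way_id for pair in shared_node_counts(elements) for way_id in pair}
    all_ways = {el["id"] for el in elements if el.get("type") == "way"}
    return all_ways - attached


# Hypothetical sample: ways 1 and 2 share nodes 2 and 3 (a common wall), way 3 is detached
elements = [
    {"type": "way", "id": 1, "nodes": [1, 2, 3, 4, 1]},
    {"type": "way", "id": 2, "nodes": [2, 5, 6, 3, 2]},
    {"type": "way", "id": 3, "nodes": [7, 8, 9, 7]},
]
print(shared_node_counts(elements))
print(standalone_ways(elements))
```

A pair with a count of 1 only touches at a corner; a count of 2 or more suggests a shared wall, though the shared nodes would still need to be consecutive along both rings to confirm that.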

1,2,3. I am using the OSM dataset. I use formulas in Excel to calculate the facade: I take the total gross floor area and the building levels to calculate the height, and then calculate the facade area.
1,2. I am using data for the USA and the UK. Specifically, I am checking New York, Denver and London.
3. I am estimating height from building levels and floor-to-floor height.
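The height hierarchy described in this thread (height tag, then building levels, then gross floor area over footprint) could look roughly like the sketch below. The 3.0 m floor-to-floor value is an assumed placeholder, since the thread does not state the value used, and the function names are hypothetical. Only plain metre values of height= are parsed; anything else falls through to the next step.

```python
import re

FLOOR_TO_FLOOR_M = 3.0  # assumed default, not a value taken from the thread


def parse_height_m(value):
    """Parse height values such as '12' or '12.5 m'; return None for other formats."""
    if value is None:
        return None
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*m?\s*", value)
    return float(match.group(1)) if match else None


def estimate_height(tags, gross_floor_area=None, footprint_area=None,
                    floor_to_floor=FLOOR_TO_FLOOR_M):
    """Return (height in metres, source label) using the fallback order from the thread."""
    height = parse_height_m(tags.get("height"))
    if height is not None:
        return height, "height"

    try:
        levels = float(tags["building:levels"])
    except (KeyError, ValueError):
        levels = None
    if levels is not None and levels > 0:
        return levels * floor_to_floor, "building:levels"

    if gross_floor_area and footprint_area:
        # gross floor area / footprint approximates the number of storeys
        return gross_floor_area / footprint_area * floor_to_floor, "gfa/footprint"

    return None, "unknown"


print(estimate_height({"height": "12.5 m"}))
print(estimate_height({"building:levels": "4"}))
print(estimate_height({}, gross_floor_area=300.0, footprint_area=100.0))
```

Returning the source label alongside the height makes it possible to check afterwards whether the inaccurate heights come mostly from one particular fallback.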

The floor levels and heights are similar in my methodology and in the OSM dataset. I am using a Python script to fetch the data from the OSM Overpass API and to calculate the floor area and the facade areas in 8 directions. The footprint and roof areas that the script calculates are consistently higher than expected.
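For comparison, one way the 8-direction facade calculation could be structured is sketched below. It is not the script discussed here. It assumes the footprint ring is already projected to metres (x east, y north), that the building is a simple extrusion with a single height, and that `shared_edges` (a hypothetical parameter) lists the indices of edges that abut a neighbouring building and should be skipped.

```python
import math

DIRECTIONS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def signed_area(ring):
    """Positive for a counter-clockwise ring, negative for clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def facade_areas(xy, height, shared_edges=frozenset()):
    """Sum wall area (edge length x height) per compass sector of the outward normal."""
    ring = xy[:-1] if len(xy) > 1 and xy[0] == xy[-1] else list(xy)
    ccw = signed_area(ring) > 0
    areas = dict.fromkeys(DIRECTIONS, 0.0)
    for i in range(len(ring)):
        if i in shared_edges:
            continue  # wall shared with a neighbour, not an exposed facade
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        dx, dy = x2 - x1, y2 - y1
        # The outward normal is to the right of the edge for a CCW ring, to the left for CW
        nx, ny = (dy, -dx) if ccw else (-dy, dx)
        bearing = math.degrees(math.atan2(nx, ny)) % 360.0  # 0 = north, clockwise
        sector = int((bearing + 22.5) // 45) % 8
        areas[DIRECTIONS[sector]] += math.hypot(dx, dy) * height
    return areas


# 10 m x 10 m building, 3 m high, east wall (edge index 1) shared with a neighbour
square = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
print(facade_areas(square, 3.0, shared_edges={1}))
```

Because each wall is assigned to a sector by the bearing of its outward normal, a wrong ring orientation or a distorted projection would move area between directions even when the total stays the same.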

Please specify the other reference dataset you are using as the authoritative source to validate and benchmark the OSM data. Without knowing its specs, it’s impossible to explain any differences. Are the buildings only “extruded” from a footprint and a single height, or what LoD are they?
I don’t know about Denver. For NYC and London, the pattern of architecture may mean that roof:shape= and roof:levels= won’t have that much effect. Again, this depends on whether you are comparing the datasets using the same criteria.
You still haven’t explained how large the deviation is, or what your tolerance is. There was a recent question about routing where the overestimation turned out to be only 10%, within an acceptable bound.
Area as viewed from 8 directions? That would be further affected by the error in orientation, on top of the error in each facade’s area.

I am only using the OSM dataset. I am validating the output with the Nextspace tool, which also uses the OSM dataset to calculate the roof area and the facade areas in 8 directions.
Yes, the buildings are extruded from a footprint and a single height. I am comparing against the same dataset.
The deviation in roof area is about 40%, and for facade area it is around 35%.
The roof area is viewed from one direction only, but the facade areas are considered in 8 directions.

Thank you. That is a likely suspect for your problem. Have you checked in the settings which coordinate system Nextspace is using? I can’t seem to find any details. Is the manual or documentation private? You can link it for me to read if you want.
They are using Cesium 3D OSM Buildings. But it is possible for Cesium 3D Tiles to use a CRS other than WGS84, and they aren’t technically limited to using Web Mercator for the whole world (I hope they aren’t using it). To be sure, you can rerun your scripts using EPSG:3857 instead of projecting to each UTM zone, to verify whether this is the culprit.
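As a rough guide to what that test might show: on a spherical Mercator projection, lengths measured in projected units are stretched by about 1/cos(latitude), and areas by about 1/cos²(latitude). The sketch below only evaluates those factors for approximate city latitudes (my own rounded values, not taken from the thread). It does not say anything about what Nextspace actually does.

```python
import math


def mercator_length_factor(lat_deg):
    """Approximate stretch of lengths measured in Web Mercator units at a given latitude."""
    return 1.0 / math.cos(math.radians(lat_deg))


def mercator_area_factor(lat_deg):
    """Approximate stretch of areas: the square of the length factor."""
    return mercator_length_factor(lat_deg) ** 2


# Approximate latitudes in degrees (rounded, for illustration only)
cities = {"New York": 40.7, "Denver": 39.7, "London": 51.5}
for name, lat in cities.items():
    print(f"{name}: length x{mercator_length_factor(lat):.2f}, "
          f"area x{mercator_area_factor(lat):.2f}")
```

Wall length (and so facade area, since height is not projected) would scale with the length factor, while footprint and roof area would scale with the area factor, so the two kinds of deviation would not be expected to match each other.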

I will try EPSG:3857 instead of UTM, but I don’t think it will give high-precision results.

What I mean is that if the EPSG:3857 result matches Nextspace, then they are the ones who are wrong, and you were correct to use UTM. In that case, you should find another reference tool that calculates this information from OSM data to validate your own program.
Unless you provide more details, share your own code snippet, source code, etc, we can only play a guessing game on what’s wrong. Or do you already have a public Git repo that we can look at?
Do you have a technical support plan with your Nextspace subscription? Have you asked them for help?