I’m building a project, a research-grade GPX audit and correction pipeline focused on mountain terrain. The core thesis is that GPX data quality in mountain environments is significantly worse than most tools acknowledge — silent timestamp anomalies, elevation channel unreliability, mixed barometric/GPS sensor types — and that honest, documented methodology matters more than aggressive correction.
I ran an initial case study on ~9,500 old tracks from a Kaggle dataset and have documented the audit and correction methodology publicly:
I want to expand the research with a focused Himalayan trace corpus — specifically to study elevation data quality at altitude and validate a sensor type inference approach (barometric vs GPS) using DEM comparison and terrain ruggedness index.
I know the bbox API exists and I’m happy to use it. Before I go down that path, has anyone done systematic bulk collection of OSM traces? Any prior art, existing datasets, or a more efficient access path for research purposes would be genuinely useful.
Data use is strictly research and methodology validation, not redistribution.
Whatever you do please do not try and scrape the API for bulk data - that is prohibited by the usage policy, is likely to disrupt access to the site, and will get you blocked.
I would be rather surprised if the source of the largest errors overall is not simply people mistakenly using the WGS84 ellipsoid (for example the default on many phones).
that contains no information about the device used to generate elevations - whether from GPS or other GNSS or from a device barometer, and if the latter whether it had been calibrated recently. Without that, it probably won’t be useful to you.
I don’t think that is necessarily the dominant source of error in practice, mainly because ellipsoid-vs-MSL differences tend to behave like a relatively stable offset over the spatial scale of an individual GPX trace.
Even for traces spanning hundreds of kilometers, the geoid separation itself changes fairly smoothly compared to the short-scale instability typically observed in recorded elevation channels. So in many cases the WGS84 to MSL conversion effectively behaves more like a deterministic offset correction than like the kind of non-stationary error seen in real traces.
What I’m observing much more often is inconsistency in the recording itself: local spikes, drift, discontinuities, sensor fusion artifacts, timestamp anomalies, etc. The error relative to terrain or DEM does not remain constant over the trace, which is why my impression so far is that the dominant issue is usually measurement instability rather than purely vertical datum choice.
Also, from what I’ve read, many modern devices and applications already convert GNSS ellipsoidal height to an MSL reference internally (often using something like EGM96 or similar geoid models under the hood), although implementation details are obviously device-dependent.
So I definitely think datum confusion can contribute to absolute offsets, but my current impression is that it is a comparatively smaller contributor than sensor behavior and recording inconsistency, especially in mountain terrain.
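To make the offset-versus-instability distinction concrete, here is a minimal sketch of how I would separate the two. It assumes the DEM has already been sampled at each trace point (that step is not shown), and the function name and sample values are purely illustrative, not part of my published pipeline. The median residual stands in for a constant datum-like offset, and the median absolute deviation around it stands in for the non-constant part.

```python
from statistics import median

def split_offset_and_instability(recorded_m, reference_m):
    """Split elevation residuals into a constant offset and a robust spread.

    recorded_m  -- elevations from the GPX trace, in metres
    reference_m -- reference (e.g. DEM) elevations at the same points, in metres
    Both inputs are assumed to be aligned point by point.
    """
    if len(recorded_m) != len(reference_m) or not recorded_m:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [r - d for r, d in zip(recorded_m, reference_m)]
    # A pure datum mismatch would show up almost entirely in this term.
    offset = median(residuals)
    # Median absolute deviation: how much the residual wanders around the offset.
    mad = median([abs(e - offset) for e in residuals])
    return offset, mad

# Illustrative values only (hypothetical, not real measurements).
dem = [1000, 1010, 1020, 1030, 1040]
datum_only = [d + 30 for d in dem]        # constant 30 m shift, nothing else
unstable = [1030, 1042, 1035, 1075, 1068]  # same median shift, but it wanders

offset_a, mad_a = split_offset_and_instability(datum_only, dem)
offset_b, mad_b = split_offset_and_instability(unstable, dem)
```

A trace with a large offset and near-zero spread looks like a datum problem; a trace with a large spread needs a different explanation, whatever its offset is.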
I probably went a little crazy with the feature ideas there, but even raw trace data is still very useful to me for testing objective correctness of a GPX file and detecting kinematic anomalies like outlier jumps, jitter, discontinuities, etc.
A lot of those issues can still be studied directly from the geometry and temporal structure of the trace itself, through implied speeds, local continuity, motion consistency for specific activities, and similar pattern-based signals, even when exact sensor provenance is unknown.
Absolutely, I’m trying to be careful about that side of it.
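As a rough illustration of the implied-speed idea, here is a small sketch that flags segments using only positions and timestamps. The function names, the spherical-Earth radius, and the 3 m/s default threshold (a loose guess for hiking) are my assumptions for the example; a real check would tune the threshold per activity.

```python
import math
from datetime import datetime, timedelta, timezone

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def flag_segments(points, max_speed_mps=3.0):
    """Return (index, reason) for each point whose incoming segment looks suspect.

    points -- list of (lat, lon, datetime) tuples in recorded order
    max_speed_mps -- hypothetical plausibility threshold, activity-dependent
    """
    flags = []
    for i in range(1, len(points)):
        lat1, lon1, t1 = points[i - 1]
        lat2, lon2, t2 = points[i]
        dt = (t2 - t1).total_seconds()
        if dt <= 0:
            # Duplicate or backwards timestamp: speed is undefined here.
            flags.append((i, "non_increasing_time"))
            continue
        if haversine_m(lat1, lon1, lat2, lon2) / dt > max_speed_mps:
            flags.append((i, "implausible_speed"))
    return flags

# Hypothetical five-point trace with one jump and one repeated timestamp.
t0 = datetime(2024, 1, 1, 6, 0, 0, tzinfo=timezone.utc)
trace = [
    (28.0000, 84.0, t0),
    (28.0001, 84.0, t0 + timedelta(seconds=10)),  # about 11 m in 10 s
    (28.0101, 84.0, t0 + timedelta(seconds=20)),  # about 1.1 km in 10 s
    (28.0102, 84.0, t0 + timedelta(seconds=20)),  # same timestamp as previous
    (28.0103, 84.0, t0 + timedelta(seconds=30)),
]
flags = flag_segments(trace)
```

None of this needs to know which sensor produced the trace, which is the point: these checks work on geometry and time alone.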
My intention is not to aggressively scrape the service or mirror data, just to build a focused research corpus for methodology validation. I’ve been looking into the bbox-based access patterns and keeping requests conservative/rate-limited specifically because I do not want to disrupt infrastructure or violate community norms.
Part of the reason I made the post was actually to ask whether there are already accepted datasets / prior collection efforts / better access paths for this kind of research before I go further down that route.
While I’m still sceptical about how useful it might be, there is old GPS planet data from 2013 or so, although obviously that will tell you nothing about devices released after 2013.
Thanks for giving it some thought. I know about it already, but that dataset is ancient and not very relevant, and it’s gigabytes of data where no one knows how much of it will be remotely useful in comparison to modern devices. I might be undervaluing it though; I don’t know what type of devices were in use back then.