Incompatible PBF format: osmupdate/osmconvert versus osmosis

Hi,

there have been some posts on this topic before, but none of them contains a solution for me.
I have tried to update a planet file (dated 13.06.2012) to 26.06.2012 with osmupdate. For this I use the command line:

osmupdate -v --daily --keep-tempfiles --emulate-osmosis --planet-url=http://planet.openstreetmap.org/redaction-period/ 20120613\planet.osm.pbf 2012-06-13T00:00:00Z  20120626\planet.osm.pbf

The updated planet file is then cut into continents using 5 polygon files.
For this cutting I tried osmosis using the command


osmosis.bat ^
--read-pbf file=20120626\planet.osm.pbf outPipe.0=1 ^
--tee 5 inPipe.0=1 outPipe.0=301 outPipe.1=302 outPipe.2=303 outPipe.3=304 outPipe.4=305 ^
--bounding-polygon file=polys\africa.poly inPipe.0=301 outPipe.0=401 ^
--bounding-polygon file=polys\america.poly inPipe.0=302 outPipe.0=402 ^
--bounding-polygon file=polys\europe.poly inPipe.0=303 outPipe.0=403 ^
--bounding-polygon file=polys\asia.poly inPipe.0=304 outPipe.0=404 ^
--bounding-polygon file=polys\australia-oceania.poly inPipe.0=305 outPipe.0=405 ^
--write-pbf file=africa\20120626\africa.osm.pbf omitmetadata=true inPipe.0=401 ^
--write-pbf file=america\20120626\america.osm.pbf omitmetadata=true inPipe.0=402 ^
--write-pbf file=europe\20120626\europe.osm.pbf omitmetadata=true inPipe.0=403 ^
--write-pbf file=asia\20120626\asia.osm.pbf omitmetadata=true inPipe.0=404 ^
--write-pbf file=australia-oceania\20120626\australia-oceania.osm.pbf omitmetadata=true inPipe.0=405

This osmosis call works fine for the original planet file from 13.06.2012.
But when osmosis is run on the updated planet file, the continent extracts are much too small.

As an example the statistics of africa obtained with osmconvert:
13.06. (original planet), size 183MB:
lon min: -29.1055035
lon max: 58.2506989
lat min: -35.9772629
lat max: 37.6017624
nodes: 24725481
ways: 2045490
relations: 12237
node id min: 197798
node id max: 1785762075
way id min: 3715718
way id max: 167173227
relation id min: 892
relation id max: 2228181
keyval pairs max: 205
noderefs max: 2000
relrefs max: 2506

26.06 (updated planet), size 11MB:
lon min: -25.3633000
lon max: 57.6910605
lat min: -34.7715300
lat max: 37.3481136
nodes: 1423315
ways: 122033
relations: 2435
node id min: 197798
node id max: 1802086439
way id min: 3715888
way id max: 169037258
relation id min: 3367
relation id max: 2248461
keyval pairs max: 73
noderefs max: 1992
relrefs max: 2494

So there seems to be some incompatibility in writing and reading the PBF format between osmosis and osmconvert. Similar things have been reported with mkgmap's splitter.

I would like to start an investigation about the problem but need some help.
Is there any reference reader for the PBF format?
@osmosis guys: Do you have any idea where the problem is located?
@Marqs (osmconvert developer): Do you have any idea where the problem is located?

Please don’t answer with “I guess it’s the other program” because that doesn’t really help. I don’t mind which of the programs has the problem. Probably it’s a different understanding of the PBF format, so it’s no one’s fault.
But it is important to find and fix these problems to get a compatible tool chain!

Thanks for all hints to find a solution!

WanMil

I have such problems too. It would be great if this error could be fixed. If I can help, let me know.

Hello!

I really would like to help, but at the moment I have no idea. :frowning:

Did I get it right? There is an osmconvert output file (.pbf) which might have been written incorrectly.
What if you took this file and ran it through Osmosis, converting it from .pbf to .pbf?

Do the contents of the resulting file differ from Osmosis’ input file?

I have the same problem too. Because OSMUpdate and Splitter are incompatible, I went back to Osmosis. See this topic.

At the moment - but I am sure we will find a solution :slight_smile:

Yes, the file updated by osmupdate which uses osmconvert cannot be handled correctly by osmosis (and it is not clear if that’s a problem of osmosis or osmconvert or whatever…). Interestingly osmosis does not throw an exception but it reads only a very very small subset of the elements that are expected to be in the file.

That’s a good idea! I will try that tomorrow.

It would also help if a debug version of osmconvert could output the deserialization of the PBF file, i.e. a printout with one line for each OSM element: “node/way/relation osmid (lat lon if it is a node)”. I could then compare that with the output of osmosis or splitter by changing their Java classes to print the same deserialization. Doing this for osmconvert myself would be a large effort. Maybe that would help to get closer to the problematic areas of the PBF file.

WanMil

Thanks! This could help.

This too is a good idea.
The easiest “deserialization” is to use osmconvert to convert .pbf to .osm.

Ok, here are the results.

Planet file updated from 13.06.2012 to 26.06.2012 using osmupdate. Size 17.276.231.932 bytes.
timestamp min: 2005-05-01T14:56:35Z
timestamp max: 2012-06-25T23:59:52Z
lon min: -180.0000000
lon max: 180.0000000
lat min: -90.0000000
lat max: 90.0000000
nodes: 1499200749
ways: 140615320
relations: 1460686
node id min: 1
node id max: 1802293460
way id min: 35
way id max: 169055533
relation id min: 11
relation id max: 2251643
keyval pairs max: 338
noderefs max: 2000
relrefs max: 10293

The updated planet file rewritten with osmosis (osmosis --read-pbf file=planet.osm.pbf --write-pbf file=osmosis_planet.osm.pbf)
Size: 13.198.594.326 bytes
timestamp min: 2005-05-01T14:56:35Z
timestamp max: 2012-06-25T23:59:52Z
lon min: -,.),(-*,(
lon max: 214.7483647
lat min: -90.0000000
lat max: 90.0000000
nodes: 1499200749
ways: 140615320
relations: 1460686
node id min: 1
node id max: 1802293460
way id min: 35
way id max: 169055533
relation id min: 11
relation id max: 2251643
keyval pairs max: 338
noderefs max: 2000
relrefs max: 10293

Yes, that’s really the output of osmconvert --statistics. So the issue seems to be the encoding of the longitudes.
@Marqs: Any ideas?

I will start to compare the pbfs by converting them to osm.xml.

WanMil

I have converted the updated planet file to osm.xml with

osmconvert planet.osm.pbf --emulate-osmosis > planet_osmconvert.osm

and

osmosis --read-pbf file=planet.osm.pbf --write-xml file=planet_osmosis.osm

I did not write out the whole files; I compared only the first million lines.
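A comparison like this can be scripted. The following is only a minimal sketch: the function name first_difference and the sample lines are made up for illustration, and it assumes both dumps list the elements in the same order, one element per line.

```python
from itertools import islice


def first_difference(lines_a, lines_b, limit=1_000_000):
    """Return (line_number, line_a, line_b) for the first differing line.

    Only the first `limit` lines of each input are compared; returns None
    if no difference is found within that range.
    """
    pairs = zip(islice(lines_a, limit), islice(lines_b, limit))
    for number, (line_a, line_b) in enumerate(pairs, start=1):
        # Ignore leading/trailing whitespace so indentation does not count.
        if line_a.strip() != line_b.strip():
            return number, line_a.strip(), line_b.strip()
    return None


if __name__ == "__main__":
    # Hypothetical sample lines, NOT taken from the real planet file.
    dump_a = ['<node id="1" lon="10.0"/>\n', '<node id="2" lon="-10.0"/>\n']
    dump_b = ['<node id="1" lon="10.0"/>\n', '<node id="2" lon="419.4967296"/>\n']
    print(first_difference(dump_a, dump_b))
```

For real files, open both (e.g. planet_osmconvert.osm and planet_osmosis.osm) and pass the two file objects in; they are read lazily, so the whole planet never has to fit into memory.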

The first relevant diff is:
osmconvert:


osmosis:

The difference of the longitude (osmosis - osmconvert) in all differences I checked is a constant number: 429.4967296

Does that help?

WanMil

I suggest you try the file with Osmium (http://wiki.openstreetmap.org/wiki/Osmium). There is an osmium_debug in the examples directory that will be helpful. As Osmium is another independent implementation of the OSM PBF file format, you might get an idea whether Osmosis or osmupdate is at fault.

And another thing: 10,000,000 is the factor used when converting lat/lon from a double to the internally used integer. 429.4967296*10000000 == 2^32.

4294967296 = 2^32

Yes, indeed! There seems to be an error in data types: one procedure treats longitude as a signed integer value (correct!), another procedure “thinks” this is an unsigned integer value (wrong!). In both cases the number is stored in the same way, so it’s just a question of how this data is interpreted.
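To make the arithmetic concrete, here is a small sketch. It only illustrates the signed/unsigned idea and is not code from either program; the helper name as_unsigned32 and the sample value are assumptions.

```python
import struct

SCALE = 10_000_000          # integer units per degree, as mentioned above
INT32_MAX = 2**31 - 1


def as_unsigned32(value):
    """Reinterpret the bit pattern of a signed 32-bit integer as unsigned."""
    return struct.unpack("<I", struct.pack("<i", value))[0]


# Illustrative stored value: a longitude of -25.3633 degrees in integer units.
stored = -253_633_000
misread = as_unsigned32(stored)

offset_units = misread - stored
print(offset_units == 2**32)        # the two readings differ by exactly 2^32
print(offset_units / SCALE)         # the same offset expressed in degrees
print(INT32_MAX / SCALE)            # largest longitude a signed 32-bit value can hold
```

The last line gives 214.7483647, the same number that shows up as "lon max" in the statistics of the file rewritten by osmosis.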

Well… what program is responsible and where is the error?

If I understood you right, you took an osmconvert-written PBF file. This file can be interpreted correctly by osmconvert but not by Osmosis.
I still cannot say if there is an error in osmconvert’s writing procedure or in Osmosis’ PBF reader. I would rather guess it’s an error in osmconvert, but I’m not sure at all, nor do I know where exactly this error could be. :frowning:

Will think about it and look into the code…

Hi Jochen, good idea, thanks!

WanMil: As you have demonstrated, the error is reproducible. Do you have a smaller example file? The whole planet file is somewhat unwieldy. :frowning:

Right - the error must occur within one of the programs then and cannot be a PBF interpretation mistake. PBF format stores signed numbers in a different manner:
0->0
1->2
2->4
10->20
-1->1
-2->3
-10->19
etc.
Thus we would not have to deal with a 2^32 offset.
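This mapping is the "zigzag" encoding that Protocol Buffers uses for sint32/sint64 fields. A minimal sketch that reproduces the table above (the function names are my own):

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def zigzag_encode(n):
    """Map a signed 64-bit integer to an unsigned one (protobuf sint64 style)."""
    return ((n << 1) ^ (n >> 63)) & MASK64


def zigzag_decode(z):
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)


if __name__ == "__main__":
    for value in (0, 1, 2, 10, -1, -2, -10):
        encoded = zigzag_encode(value)
        # Round trip must give back the original value.
        assert zigzag_decode(encoded) == value
        print(f"{value}->{encoded}")
```

Small negative numbers become small positive numbers, so a sign misinterpretation at this level would not show up as a clean 2^32 offset.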

Marqqs

I’m pretty sure it’s not “one of the programs” but this one program: osmconvert. :slight_smile:
Osmosis is much older, it has been used for many years now, hence it’s unlikely that it would fail in such basic functions.

WanMil:
Could you please change a line in the source and try it again?

Version 0.5Z, line 4697 from

  pw__objp= pw__dn_lon; pw__obj_add_sint64(lon-pw__dc_lon);

to

  pw__objp= pw__dn_lon; pw__obj_add_sint64((int64_t)lon-pw__dc_lon);

Maybe this will fix the bug…
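One plausible reading of why the cast could matter, sketched in Python: this assumes that lon and pw__dc_lon are 32-bit integers and that the subtraction was therefore done in 32 bits; the thread does not state the declared types, so treat this as a guess. The sample coordinates are hypothetical.

```python
def wrap_int32(value):
    """Emulate two's-complement wraparound of a 32-bit signed integer."""
    return (value + 2**31) % 2**32 - 2**31


# Hypothetical consecutive nodes on either side of the antimeridian,
# in units of 1e-7 degrees.
previous_lon = -1_700_000_000   # -170.0 degrees
current_lon = 1_700_000_000     # +170.0 degrees

delta_32 = wrap_int32(current_lon - previous_lon)   # subtraction kept in 32 bits
delta_64 = current_lon - previous_lon               # subtraction after widening

# A reader that accumulates deltas in 32 bits wraps back to the right value,
# while a reader that accumulates in 64 bits ends up off by 2^32.
reader_32 = wrap_int32(previous_lon + delta_32)
reader_64 = previous_lon + delta_32

print(delta_64 - delta_32 == 2**32)
print(reader_32 == current_lon)
print(current_lon - reader_64 == 2**32)
```

Under that assumption, widening to int64_t before subtracting keeps the delta exact, so both kinds of reader would reconstruct the same longitude.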

Great!!

I would like to, but I don’t have a working build environment, so it would take me a lot of time. Could you please compile the patch and post a URL where I can download the Windows version of the patched osmconvert? I do not have time to check before Monday, so take your time :slight_smile:

WanMil

No problem. :slight_smile:
m.m.i24.cc/osmconvert_new.exe
(Win32)

Using the patched osmconvert_new.exe I cannot reproduce the problems. So the fix seems to work well!

Thank you all for finding this tricky problem!!
WanMil

Great! Thanks for your patience and the test runs! Thanks also to Jochen Topf for suggestions.

I just uploaded the fixed source and the newly compiled executables.

Markus

Hi,
Is the link no longer working?
Chris
Edit: resolved.

Seems to work perfectly, thanks a lot!