Parsing .osm files

Parsing this file with Osmosis should take less than a minute. I don’t know how long inserting into your database takes, though.

If you want to try Osmosis, download the latest version and add the jars to your classpath. You don’t need all of them, but I suggest adding them all first, getting it to work, and then removing the ones that are not actually required. The code will then look roughly like this:


import java.io.File;
import java.io.FileInputStream;

import org.openstreetmap.osmosis.core.container.v0_6.EntityContainer;
import org.openstreetmap.osmosis.core.domain.v0_6.*;
import org.openstreetmap.osmosis.core.task.v0_6.*;
import org.openstreetmap.osmosis.xml.common.CompressionMethod;
import org.openstreetmap.osmosis.xml.v0_6.XmlReader;

...

File file = ...; // the input file

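// Receives every entity the reader produces.
// Note: newer Osmosis releases add further methods to Sink
// (e.g. initialize(Map<String, Object>)); implement whatever
// the interface in your version requires.
Sink sinkImplementation = new Sink() {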
    public void process(EntityContainer entityContainer) {
        Entity entity = entityContainer.getEntity();
        if (entity instanceof Node) {
            //do something with the node
        } else if (entity instanceof Way) {
            //do something with the way
        } else if (entity instanceof Relation) {
            //do something with the relation
        }
    }
    public void release() { }
    public void complete() { }
};

boolean pbf = false;
CompressionMethod compression = CompressionMethod.None;

if (file.getName().endsWith(".pbf")) {
    pbf = true;
} else if (file.getName().endsWith(".gz")) {
    compression = CompressionMethod.GZip;
} else if (file.getName().endsWith(".bz2")) {
    compression = CompressionMethod.BZip2;
}

RunnableSource reader;

if (pbf) {
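    // FileInputStream can throw FileNotFoundException;
    // the enclosing method must handle or declare it
    reader = new crosby.binary.osmosis.OsmosisReader(
            new FileInputStream(file));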
} else {
    reader = new XmlReader(file, false, compression);
}

reader.setSink(sinkImplementation);

Thread readerThread = new Thread(reader);
readerThread.start();

while (readerThread.isAlive()) {
    try {
        readerThread.join();
    } catch (InterruptedException e) {
        /* interrupted while waiting: loop again and keep waiting for the reader to finish */
    }
}

...

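This is copied from one of my own Osmosis-based tools, so besides plain .osm it also handles .osm.pbf, .osm.gz, and .osm.bz2 files. If you only need .osm (XML) parsing, the code gets a lot shorter: the extension checks and the PBF branch go away, and only the XmlReader with CompressionMethod.None remains.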
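
If you keep the multi-format support, the extension check can be pulled out of the reader setup into a small helper. Here is a minimal sketch in plain Java with no Osmosis dependency. The names OsmInputFormat and fromFileName are made up for illustration and are not part of Osmosis. Unlike the snippet above, it ignores upper/lower case in the file name, which is my own assumption about what you would want.

```java
import java.util.Locale;

// Hypothetical helper, not part of Osmosis: maps a file name to the
// input formats that the reader selection above distinguishes.
enum OsmInputFormat {
    XML, XML_GZIP, XML_BZIP2, PBF;

    static OsmInputFormat fromFileName(String fileName) {
        // Lower-case first so that "MAP.OSM.PBF" is recognised as well
        // (the original check is case-sensitive).
        String name = fileName.toLowerCase(Locale.ROOT);
        if (name.endsWith(".pbf")) {
            return PBF;
        } else if (name.endsWith(".gz")) {
            return XML_GZIP;
        } else if (name.endsWith(".bz2")) {
            return XML_BZIP2;
        }
        // Anything else is treated as uncompressed XML, as in the code above.
        return XML;
    }
}
```

You would then call OsmInputFormat.fromFileName(file.getName()) once and pick the reader (and the CompressionMethod for the XML cases) based on the result.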