22.09.2010 | Frederik Ramm
Geofabrik has been providing the OSM community with cut-down data extracts for various continents, countries, and smaller administrative divisions for over two years now. Our downloads at download.geofabrik.de form the basis of many community projects, and we’re happy to make working with OSM data easier for so many people.
Today we’re launching experimental downloads in a new binary format. The new “protobuf binary format” (.osm.pbf) is 30% smaller than the bzip2-compressed OSM XML, and it can be processed or extracted much faster than bzip2 files. Also, while we will continue supporting the bzip2 files for a while, we hope that we can ultimately free up some resources by dropping bz2 support, and use these resources to produce an even wider set of daily updated OSM extracts.
The protobuf binary format was developed by Scott Crosby and presented to the OSM community in April this year (wiki article with details). As the name implies, it relies on Google’s “Protocol Buffers” for its internal data representation. The format is supported by Osmosis starting with version 0.37; .osm.pbf files can be read directly by Osmosis, or converted to plain OSM XML first using a command like
osmosis --read-pbf myfile.osm.pbf --write-xml myfile.osm
The above command will run significantly faster than a bz2 decompression, and the .osm.pbf files made available by Geofabrik are 100% lossless. The format offers further compression options, such as stripping metadata or minimally reducing precision, but Geofabrik extracts will remain lossless.
Not only are .osm.pbf files smaller and faster to process than their bz2 counterparts; because they are quicker to produce, they are also likely to become available on the Geofabrik download site sooner than the regular .osm.bz2 files. Everyone is encouraged to give them a try.
(Edit on 2010-11-16: Initially the command line options to use were called “read-bin” and “write-bin”, but later releases of Osmosis now use “read-pbf” and “write-pbf”.)
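For anyone scripting the conversion, here is a minimal Python sketch that assembles the Osmosis command shown above and picks the option name according to the Osmosis release in use. The function names and the legacy_options parameter are illustrative assumptions, not part of Osmosis; only the read-pbf, read-bin and write-xml options come from this post.

```python
import subprocess


def pbf_to_xml_command(pbf_path, xml_path, legacy_options=False):
    """Build the Osmosis argument list for a .osm.pbf to OSM XML conversion.

    legacy_options is a hypothetical switch for early Osmosis releases,
    which called the reader option "read-bin" instead of "read-pbf".
    """
    read_option = "--read-bin" if legacy_options else "--read-pbf"
    return ["osmosis", read_option, pbf_path, "--write-xml", xml_path]


def convert_pbf_to_xml(pbf_path, xml_path, legacy_options=False):
    """Run the conversion; assumes an osmosis executable is on the PATH."""
    command = pbf_to_xml_command(pbf_path, xml_path, legacy_options)
    # check=True raises CalledProcessError if Osmosis exits with an error.
    subprocess.run(command, check=True)


# Show the command without running it.
print(" ".join(pbf_to_xml_command("myfile.osm.pbf", "myfile.osm")))
# prints: osmosis --read-pbf myfile.osm.pbf --write-xml myfile.osm
```

Calling convert_pbf_to_xml("myfile.osm.pbf", "myfile.osm") would then run the same conversion as the command line above.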