
Blog

Historic OpenStreetMap Data

13.11.2024 | Frederik Ramm

We frequently receive requests from researchers about “old” OpenStreetMap data. In one instance we’ve been asked whether we could provide data for a city road grid – for the year 1980! So let’s take a minute to explain what is possible and what isn’t.

In general, you can download older data files – for the 1st of January of each year – from our download server. Just click on the “raw directory index” link and you’re presented with a list of available files, either .osm.pbf or shape files. Where we have split a country into smaller regions at some point, you might find only more recent files for the small regions, but older files for the whole country.

OpenStreetMap keeps a full history of everything, so it is possible to extract even more fine-grained snapshots – say for the 1st of each month since 2014! This requires the “osmium” utility (with its “time-filter” command), as well as a “full history” file. Such files are available (for the whole world) from planet.openstreetmap.org, but also for individual continents or countries from our project-internal download server. (Because these files contain metadata that may be personal data, access is only allowed with an OSM login.)
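As a minimal sketch of the monthly-snapshot idea, the following Python helper builds one “osmium time-filter” command line per month. It only assembles the commands and does not run them. The input name region.osh.pbf and the output prefix are hypothetical placeholders, and the sketch assumes osmium-tool is installed and that you have already downloaded a full history file.

```python
from datetime import date


def monthly_snapshot_commands(history_file, first_year, last_year, prefix="snapshot"):
    """Build one `osmium time-filter` command (as an argument list) per month.

    history_file and prefix are placeholders chosen for this example.
    """
    commands = []
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            first_of_month = date(year, month, 1)
            # osmium expects timestamps in ISO format, e.g. 2014-01-01T00:00:00Z
            timestamp = first_of_month.strftime("%Y-%m-%dT00:00:00Z")
            output = f"{prefix}-{first_of_month:%Y-%m}.osm.pbf"
            commands.append(
                ["osmium", "time-filter", "-o", output, history_file, timestamp]
            )
    return commands


# Print the twelve command lines for 2014 so they can be reviewed before running.
for command in monthly_snapshot_commands("region.osh.pbf", 2014, 2014):
    print(" ".join(command))
```

Each argument list can be passed to subprocess.run() once you are happy with the file names.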

There are a number of caveats that apply to working with historic OSM data though. First and foremost: An OSM data set of time X tells you what OSM knew about the world at that time – and not what the world was like at that time. If a building has been there for 100 years but was only added to OSM in 2020, then the 2018 data set will not contain that building. You can get an idea of what to expect in an old OSM data set from the “chronology” tab of the taginfo service (globally at taginfo.openstreetmap.org, or for individual regions at taginfo.geofabrik.de). Here’s a graph of the “building” tag for Spain:

As you can see, OSM currently has about 4.5 million building objects in Spain, and in 2014 we had about 450,000. Does that mean that 90% of Spain’s buildings have been erected in the last 10 years? No – it only means that OSM was much smaller in 2014 than it is today!
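The arithmetic behind that 90% figure can be written out in a few lines. This is only an illustration using the rounded counts quoted above; the function name is made up for this example.

```python
def share_added_since(count_then, count_now):
    """Fraction of today's objects that were not yet in OSM at the earlier date."""
    if count_now <= 0:
        raise ValueError("count_now must be positive")
    return 1 - count_then / count_now


# Rounded counts from the text: about 450,000 buildings in 2014, about 4.5 million today.
share = share_added_since(450_000, 4_500_000)
print(f"{share:.0%} of today's building objects were added to OSM after 2014")
```

The result describes how OSM grew, not how many buildings were actually constructed in that period.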

Because OpenStreetMap was invented in 2004, obviously any data export for any time before 2004 will be completely empty. In the early years we also had a few incompatible data model changes, and the license change in 2012 meant that some data had to be removed again. So, going back further than the license change will really give you very patchy results, and is not recommended for most use cases.

Another thing to consider is that in some areas the mapping methods have changed, and something that was mapped as X in 2014 might today be Y. Depending on what research you are doing, it might be necessary to adapt your filters to that.

If you would like to enlist our help in extracting historic information from OpenStreetMap, we’re happy to make you an offer.

Shortbread Tiles on the Download Server

21.02.2024 | Frederik Ramm

We’re now offering Shortbread vector tile packages on our download server. We have been doing this experimentally for a while, serving .tar.gz files, but have now switched to the more popular “mbtiles” format.
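An mbtiles file is an SQLite database, so a single tile can be read with Python’s standard library alone. The sketch below relies on the “tiles” table and the TMS row numbering defined by the MBTiles specification. Whether the returned blob is gzip-compressed depends on the package, so the function hands back the raw bytes and leaves decoding to the caller.

```python
import sqlite3
from contextlib import closing


def read_tile(db_path, zoom, x, y_xyz):
    """Return the raw tile blob for an XYZ tile address, or None if it is missing.

    MBTiles numbers rows in the TMS scheme (origin at the bottom), so the
    y coordinate of the usual XYZ scheme has to be flipped.
    """
    y_tms = (2 ** zoom - 1) - y_xyz
    with closing(sqlite3.connect(db_path)) as conn:
        row = conn.execute(
            "SELECT tile_data FROM tiles "
            "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
            (zoom, x, y_tms),
        ).fetchone()
    return row[0] if row else None
```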

At present you will find download links for country-wide vector tile packages, except for large countries like the US where the packages are at state level. We’re looking into providing continent-wide and planet-wide downloads but these will likely utilize BitTorrent to keep some of the load off our servers.

Data is currently updated infrequently but we’ll sooner or later establish automatic, regular updating.

Internal Download Server now on OAuth2

15.02.2024 | Frederik Ramm

Our project-internal download server which supplies OSM data with full metadata has been upgraded to use OAuth2, in anticipation of OSMF switching off their support for OAuth1.0a.

(For those who are not aware – the public “no login required” download server doesn’t publish the author and timestamp of modifications in order to be squeaky clean when it comes to data protection; the project-internal server requires an OSM login, thereby ensuring that anyone downloading data from this server has consented to OSM’s policies which, among others, include a data protection clause.)

In fact, as some of you may have noticed, until now we had been using OAuth 1.0 rather than 1.0a. Support for that was unceremoniously given the boot last weekend, leading to a service interruption of a few days on the internal download server.

It’s all fixed now. Those of you who use some sort of automated download with a cookie will have to update your copy of oauth_cookie_client.py and obtain a new cookie as documented on the GitHub page.

Regional Taginfo Instances

2.10.2023 | Frederik Ramm

The regional taginfo server at taginfo.geofabrik.de has been updated with the newest taginfo codebase which means, among other things, that you can get a “chronology” view for many tags telling you how the use of that tag changed over time.

Those of you who have used the service in the past were probably used to frequent “internal server errors” forcing a reload; those teething pains should now be a thing of the past. We’ve also added direct links to the respective regions on download.geofabrik.de – taginfo should have pages for every region supported by the download server. Feel free to report any issues you still encounter.

For quite a while now we’ve been offering a region called “Alps” on our download server and simply assigned it a rectangular bounding box. There have been some legitimate complaints about this box missing some bits in the Southwest (notably the Maritime Alps). We’re therefore changing the boundary polygon, but in order to keep the size manageable, we’re also getting rid of some of the areas that were included until now but are definitely not part of the “Alps” by any definition. Here’s the old (rectangular) and the new (potato-shaped) areas:

Notably the old file contained all of Bern, Zürich, Munich, Linz, Venezia, Zagreb, and Milano; these cities are not in the new “Alps” any more.

We’ll be making the change on September 25 which will lead to a relatively large diff, deleting all the stuff that isn’t in the file any more and adding all that is new. If you somehow rely on the old polygon, you’ll have to stop updating now.

We’ve started making vector tile packages available on our download server at download.geofabrik.de. At the moment you’ll find a .tar.gz file with vector tiles typically on the same level where you’d also find shape files – i.e. we don’t give you a package for all of Europe, but for individual European countries, and we don’t have a package for all of the US, but for individual states and so on.

Excerpt of a screenshot of download.geofabrik.de, with the link to the .tar.gz vector tile package highlighted

These vector tiles use our Shortbread schema, and we create them with the excellent open source Tilemaker software. For suitable MapLibre vector styles, have a look at the VersaTiles repository.

This is supposed to be an experiment and we don’t yet make any promises about the structure and update frequency of this offer. We’re happy to hear your ideas though. At present, the .tar.gz files simply contain all vector tiles for the region as simple files (which are themselves .gz compressed so you might have to instruct your web server to add the appropriate encoding headers).
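If you serve the unpacked tiles from your own Python code rather than a stock web server, the “appropriate encoding headers” could be derived as in the sketch below. It checks for the two-byte gzip signature instead of trusting file names. The content type used here is the registered media type for Mapbox vector tiles; that choice is an assumption, not something the packages prescribe.

```python
GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def tile_response_headers(tile_bytes):
    """Return HTTP response headers for a vector tile read from disk."""
    # Assumption: clients expect the registered Mapbox vector tile media type.
    headers = {"Content-Type": "application/vnd.mapbox-vector-tile"}
    if tile_bytes[:2] == GZIP_MAGIC:
        # The tile is stored compressed, so tell the browser to decompress it.
        headers["Content-Encoding"] = "gzip"
    return headers
```

Without the Content-Encoding header, a map client would try to parse the compressed bytes as a tile and fail.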

Like everything else on our download server, these tiles are made from OpenStreetMap data and come under Open Database License 1.0 with an OpenStreetMap attribution requirement.

This weekend we’ll implement a change that affects the handling of boundary-straddling multipolygons in our OSM extracts. (See this 2017 blog article for some background.)

We’ll stop completing cross-border multipolygons except landuse polygons and a hand-picked list of natural polygons (e.g. water, grassland, wetland).

This has become necessary because of the propensity of OSM mappers to create huge multipolygons like “the Iberian peninsula” or “the Alps”, artifacts that not only unnecessarily increase the data volume of any given PBF but also have unexpected consequences – for example, for a while anyone processing the rivers of the Switzerland extract would find a stretch of the River Danube in Vienna, because it happened to be part of the outline of the “Alps” multipolygon.

We hope that by restricting multipolygon completion to landuse and a small list of natural polygons we’ll be able to curb these unexpected side effects of polygon completion.
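Expressed as code, the new rule might look roughly like the following. This is a sketch and not the actual extract software: the set of natural=* values contains only the three examples named above, and the real hand-picked list may well be longer.

```python
# Illustrative only: water, grassland and wetland are the examples given in the
# text; the complete hand-picked list is not published here.
COMPLETED_NATURAL_VALUES = {"water", "grassland", "wetland"}


def is_completed_across_border(tags):
    """Decide whether a boundary-straddling relation would still be completed."""
    if tags.get("type") != "multipolygon":
        return False
    if "landuse" in tags:
        return True
    return tags.get("natural") in COMPLETED_NATURAL_VALUES
```

Under this rule, a giant multipolygon tagged as a mountain range or a peninsula no longer drags far-away ways and nodes into an extract.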

As a result of this change, the .osc.gz files generated on Friday night will contain “delete” operations for ways and nodes that were heretofore contained in the extracts due to multipolygon completion, but are not any longer.

Some data extracts, notably those for small islands or archipelagos, will shrink by more than 10%, but for most extracts the size will not be affected dramatically.

A few days ago we added a couple of new road-related layers to our OSM Inspector.

Relations with highway=*

Screenshot of OSM Inspector showing a mountainbike route with highway=cycleway

In the past, highway=* was sometimes added to route relations (type=route + route=road). This is not in line with current tagging standards any more, and can even lead to duplicated line geometries when importing OSM data into PostgreSQL with Osm2pgsql.

The layer shows these old-style relations in dark red.

Not all relations with highway=* are a problem – for example, pedestrian areas and rest areas mapped as multipolygon relations (type=multipolygon) are totally fine and hence not shown as errors. This exception applies to multipolygons with a highway=* value of pedestrian, footway, service, rest_area or services.

Multipolygon relations with other highway=* values are likely candidates for changing to area:highway=*. See the OSM Wiki page about area:highway=* for details about mapping roads as areas.
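The rules described above can be summarised in a small classifier. This is a sketch of the logic as described in this post, not the code behind the OSM Inspector layer; the returned labels are invented for illustration, and relation types the post does not mention are simply reported as not covered.

```python
# highway=* values that are fine on multipolygon relations, as listed above.
ACCEPTED_AREA_HIGHWAYS = {"pedestrian", "footway", "service", "rest_area", "services"}


def classify_highway_relation(tags):
    """Classify a relation's tags according to the rules described in the text."""
    highway = tags.get("highway")
    if highway is None:
        return "ok"
    relation_type = tags.get("type")
    if relation_type == "multipolygon":
        if highway in ACCEPTED_AREA_HIGHWAYS:
            return "ok"
        return "candidate for area:highway"
    if relation_type == "route":
        return "old-style route relation"
    # Other relation types are not discussed in the post.
    return "not covered"
```

For example, a relation tagged type=route, route=mtb and highway=cycleway would be reported as an old-style route relation.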

Out-of-use Roads

Even though OSM records “facts on the ground” and not historic or future data, it is generally accepted to map roads and paths which are not in use any more (or are not completed yet).

In recent years, the so-called Lifecycle Tagging Schema has been adopted by many mappers. It is used to tag features that are no longer actively used, have been (partially) removed, or are expected to be created in the future. The schema works by adding the lifecycle state as a colon-separated prefix to the main key of the feature. If a road tagged as highway=secondary becomes disused, the tag is changed to disused:highway=secondary. Other tags of the feature remain unchanged.

(An older, but equally valid, way of describing lifecycle states is to put highway=construction together with construction=secondary. The new OSMI layers treat both methods in the same way.)

The following lifecycle states are in use:

  • razed or dismantled or removed (note that the use of these prefixes is often subject to discussion and may be discouraged in your region)
  • abandoned
  • disused
  • in use (normal state)
  • construction
  • proposed (note that mapping proposed features may contravene the on-the-ground mapping rule and may be discouraged in your region)
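To make the two tagging styles concrete, here is a minimal sketch of how a data consumer might read the lifecycle state of a highway feature. It handles the prefix style for all states listed above and the older style only for construction, since that is the one example given in the text. The function name and return format are invented for this illustration.

```python
LIFECYCLE_PREFIXES = (
    "razed", "dismantled", "removed",
    "abandoned", "disused", "construction", "proposed",
)


def highway_lifecycle(tags):
    """Return (state, road_class) for a highway feature, or None otherwise."""
    highway = tags.get("highway")
    if highway == "construction":
        # Older style: highway=construction together with construction=secondary.
        # road_class is None if the construction=* tag is missing.
        return ("construction", tags.get("construction"))
    if highway is not None:
        return ("in use", highway)
    # Prefix style: e.g. disused:highway=secondary
    for prefix in LIFECYCLE_PREFIXES:
        value = tags.get(f"{prefix}:highway")
        if value is not None:
            return (prefix, value)
    return None
```

Both construction:highway=secondary and the older highway=construction plus construction=secondary come out as ("construction", "secondary").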

OSM Inspector now has four new layers displaying all linear highway=* features which are abandoned, disused, under construction, or proposed.

Screenshot of OSM Inspector showing abandoned and disused paths in Saxon Switzerland National Park

Two new layers render frequent mapping mistakes:

  • Red: If a way has tags of contradicting lifecycle states, it is rendered on the Multiple lifecycle states layer in red. There can be valid reasons for such tagging, for example when a minor road is reconstructed to be a larger road (or a track in the forest is not maintained any more and becomes a narrow path). But in most cases this is an error.
  • Orange: If the old method of lifecycle tagging is used in an incomplete way (e.g. highway=construction without a construction=* tag), this is considered an error. Sometimes the correct tag value can be discovered from the object history; otherwise a survey is required.
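The two checks could be approximated as follows. This is a guess at the logic based purely on the descriptions above, not the actual OSM Inspector implementation. In particular, treating razed, dismantled and removed as one state is an assumption made for this sketch.

```python
# Map each key prefix to a lifecycle state. Grouping the three "gone" prefixes
# into a single state is an assumption made for this example.
PREFIX_TO_STATE = {
    "razed": "removed", "dismantled": "removed", "removed": "removed",
    "abandoned": "abandoned", "disused": "disused",
    "construction": "construction", "proposed": "proposed",
}


def lifecycle_states(tags):
    """Collect every lifecycle state that the highway-related tags claim."""
    states = set()
    highway = tags.get("highway")
    if highway == "construction":
        states.add("construction")
    elif highway is not None:
        states.add("in use")
    for prefix, state in PREFIX_TO_STATE.items():
        if f"{prefix}:highway" in tags:
            states.add(state)
    return states


def lifecycle_problems(tags):
    """Return the names of the problems found on a single way."""
    problems = []
    if len(lifecycle_states(tags)) > 1:
        problems.append("multiple lifecycle states")
    if tags.get("highway") == "construction" and "construction" not in tags:
        problems.append("incomplete old-style tagging")
    return problems
```

A forest track that has decayed into a path, tagged highway=path plus abandoned:highway=track, would be flagged by the first check even though the tagging may be intentional.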

Screenshot of OSM Inspector showing a trunk road under construction with an additional proposed=* tag

Use quality assurance tools responsibly

Please use quality assurance tools responsibly. They are often wrong. And even when they point you to poor mapping, simply repairing the mistakes is not a good choice. Poor mapping is often a sign of lack of knowledge (e.g. newbies), bad intentions, or mechanical edits, organised editing or imports gone wrong. It is worth having a look at an object’s history and other edits by the mappers involved to avoid simply “cleaning” the map and therefore possibly hiding a systemic issue. When contacting others about mistakes they have made, always remember that we all make mistakes and we can only become better by supporting each other.

OSM Inspector touch-up

1.07.2022 | Frederik Ramm

We have made a few small improvements to the OSM Inspector user interface, providing better responsiveness to different display sizes, using tiled WMS access for better performance, and improving the presentation of feature detail information.

Give it a spin at https://tools.geofabrik.de/osmi and tell us what you think!

Last night, a problem on the download server caused us to publish truncated data extracts for the following countries:

In Africa:

  • Benin
  • Cameroon
  • Chad
  • Comores
  • Kenya
  • Mali
  • Mauritius
  • Namibia
  • Niger
  • Nigeria
  • Saint Helena/Ascension/Tristan da Cunha
  • Sao Tome and Principe
  • Senegal
  • Seychelles
  • Swaziland
  • Togo
  • Zambia
  • Zimbabwe

In Asia:

  • Bangladesh
  • Cambodia
  • China
  • India
  • Indonesia
  • Iran
  • Iraq
  • Malaysia
  • Maldives
  • Nepal
  • North Korea
  • Philippines
  • South Korea
  • Sri Lanka
  • Syria
  • Taiwan
  • Vietnam

The extracts have been reset to yesterday’s version but if you have downloaded an extract between 2021-09-05 01:00 UTC and 2021-09-05 10:00 UTC then you have got a truncated file. Also, if you have an automatic process that loads daily updates, and you have loaded and processed an update for any of these countries in that timespan, you now have a truncated database and need to re-import data. (This will not automatically fix itself with the next update.)

If you have not loaded updates during this timespan, or if you have loaded updates but not for the countries mentioned, then you are not affected. (You are also not affected if you have used the full Africa or Asia files; these were correct.)
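If you keep a log of your downloads, the check can be automated along the lines of the sketch below. The region names are copied from the lists above; how they map to the names in your own logs is something you would have to adapt.

```python
from datetime import datetime, timezone

WINDOW_START = datetime(2021, 9, 5, 1, 0, tzinfo=timezone.utc)
WINDOW_END = datetime(2021, 9, 5, 10, 0, tzinfo=timezone.utc)

AFFECTED_REGIONS = frozenset({
    # Africa
    "Benin", "Cameroon", "Chad", "Comores", "Kenya", "Mali", "Mauritius",
    "Namibia", "Niger", "Nigeria", "Saint Helena/Ascension/Tristan da Cunha",
    "Sao Tome and Principe", "Senegal", "Seychelles", "Swaziland", "Togo",
    "Zambia", "Zimbabwe",
    # Asia
    "Bangladesh", "Cambodia", "China", "India", "Indonesia", "Iran", "Iraq",
    "Malaysia", "Maldives", "Nepal", "North Korea", "Philippines",
    "South Korea", "Sri Lanka", "Syria", "Taiwan", "Vietnam",
})


def is_affected(region, downloaded_at):
    """True if a download of this region at this time got truncated data."""
    if downloaded_at.tzinfo is None:
        raise ValueError("downloaded_at must be timezone-aware")
    in_window = WINDOW_START <= downloaded_at <= WINDOW_END
    return region in AFFECTED_REGIONS and in_window
```

Timestamps in other time zones are compared correctly as long as they carry their zone information.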