14.06.2019 | Michael Reichert
We recently rewrote the OSM Inspector Routing View. The old views were used quite a lot by mappers to find potential routing errors. Unfortunately, they made use of PostGIS to find nearby, unconnected roads which meant having to import the whole road network into a PostGIS database every day. To make matters worse, the island and duplicate segment detection was implemented using OSRM which also has relatively heavy system requirements, especially if you do not limit yourself to automobile routing. This made us have a closer look at using GraphHopper to generate the routing view.
The new backend code was implemented from scratch and does not aim to be a one-to-one reimplementation. It uses GraphHopper to read the planet, build up the graph, analyse it and find nearby edges. Our code uses a forked version of GraphHopper 0.12 with the following changes:
- `OSMReader` class. Our changes allow us to build additional indexes and populate maps mapping from internal edge or node IDs to OSM object IDs.
- `PrepareRoutingSubnetworks` class which removes islands from the graph is not extensible but we need to write the edges being removed to a GeoJSON file.
In addition to forking, we implemented a new routing profile (“flag encoder” in GraphHopper speak) accepting any road and ignoring access restrictions. It is used to find duplicated edges, unconnected roads, and islands.
The backend code is available on GitHub.
The new routing view has the following groups of layers:
These layers show edges which exist twice in the graph. The blue ones are almost always real errors and should be fixed by comparing the tags and history of both ways. The purple ones are locations where two edges have equal geometry and at least one of them is an area. Opinions on whether areas should share nodes with neighbouring roads differ within the OSM community. Therefore, these cases are shown in a separate layer at high zoom levels only.
The results of the duplicated edges layer (not involving areas) are similar to those of OSRM. Minor differences are possible where OSRM excluded a road due to access restrictions while our new implementation includes it.
The use of GraphHopper allows us to identify routing islands for multiple types of vehicles in one go. The OSM Inspector now provides islands for cars and bicycles using slightly modified versions of the original GraphHopper profiles (we use fewer bits to store the speed because island detection does not take travel times or distances into account). In addition, a layer with islands inaccessible to any vehicle is provided. This layer contains fewer entries in total but those it contains are more serious because they are not caused by correct or incorrect access restriction tagging in OSM.
Most development time was spent finding a set of unconnected nodes layers with a proper separation of very likely, likely, potential errors and likely false positives. We are grateful to the folks on the German OSM forum who pointed out many bugs and too high rates of false positives.
Nodes where a single edge ends but with other edges within 15 meters are assigned to one of 6 priority classes. A distance below 2 meters makes an unconnected node appear in the top layer. The further composition of the layers depends on the road class, the distance, and access restrictions. The priority of unconnected nodes involving private access roads is reduced by 1. Service roads and footpaths get low default levels. In addition, the following rules avoid too many false positives:
- `noexit=yes` and `entrance=*` make unconnected nodes disappear.
The “snap points” layer shows the snap points of open ends, helping to understand the situation.
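To make the classification rules above more tangible, here is a minimal Python sketch. It is not the code used by the backend: the function name, the split into node tags and way tags, the `DEFAULT_CLASS` table and the decision to apply the private-access penalty to every class are all assumptions made for illustration. Only the 15 meter and 2 meter thresholds, the six classes, the penalty of 1 for private access and the `noexit=yes`/`entrance=*` rule come from the description above.

```python
from typing import Dict, Optional

# Hypothetical default classes per road class (1 = very likely an error,
# 6 = likely a false positive). The real table is not given in the post.
DEFAULT_CLASS = {"primary": 2, "residential": 3, "service": 5, "footway": 5}


def priority_class(distance_m: float, highway: str,
                   node_tags: Dict[str, str],
                   way_tags: Dict[str, str]) -> Optional[int]:
    """Return a priority class from 1 to 6, or None if the node is not reported."""
    # Only open ends with another edge within 15 meters are candidates.
    if distance_m > 15:
        return None
    # noexit=yes and entrance=* make unconnected nodes disappear.
    if node_tags.get("noexit") == "yes" or "entrance" in node_tags:
        return None
    # A distance below 2 meters puts the node into the top layer.
    if distance_m < 2:
        cls = 1
    else:
        cls = DEFAULT_CLASS.get(highway, 3)  # assumed fallback class
    # Private access roads get their priority reduced by 1 (capped at class 6).
    if way_tags.get("access") == "private":
        cls = min(cls + 1, 6)
    return cls
```

With these assumed defaults, a residential road ending 5 meters from another edge lands in class 3, or in class 4 if it is tagged `access=private`.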
Not all entries in the unconnected nodes layers are mistakes. Often, adding a linear barrier is helpful. The more blueish a point is, the less likely it is a mistake at all. Don’t feel forced to add `noexit=yes` to every point the OSM Inspector complains about. You are not mapping for the validator 😉
23.10.2018 | Christine Karch
Since 2012, we have hosted an OpenStreetMap hack weekend in our offices two to three times a year.
It kicks off on Friday with a relaxed evening at the pub. Most guests trickle in over the course of the evening. Many have travelled a long way. They come from all parts of Germany, and often there are one or two guests from abroad as well.
From Saturday morning to Sunday evening, we get down to business. There are discussions, and sometimes flipcharts and whiteboards get filled up. Then everything goes quiet again because everyone is concentrating on their work at the laptop. In between, we interrupt our guests with pizza, biscuits, gummy bears or coffee. We also spend Saturday evening together at a restaurant.
Thanks to the routine that has developed over the years, this event has recently needed very little attention from us in preparation and follow-up. The wiki page is set up in two minutes (copy and paste from last year, the dates are adjusted, the old participants’ names are deleted more or less pro forma). We even recycle the old shopping list.
But our 16th hack weekend turned everything upside down. A casual remark to Martin Raifer, “we can talk about that at the next hack weekend”, resulted in a nine-person preparatory meeting for next year’s SotM in Heidelberg. And my efforts to intensify the exchange with the French had borne fruit. Vincent Privat (from Toulouse) advertised the event on the JOSM-dev list. Seven JOSM developers from all over Europe (from Austria via Poland and Belgium to France) signed up.
Suddenly it was clear that the 32 registered guests would never fit into our office. Fortunately, Prof. Dr. Günther-Dieringer arranged a room for us at the Hochschule Karlsruhe (University of Applied Sciences) at very short notice and without any fuss. We all thank him warmly for this.
In the end, the hack weekend was once again a great success. Even those visitors who had feared that it would no longer be as cosy were happy in the end. The Karlsruhe and Heidelberg members of the SotM working group settled many questions, drew up the schedule for next year and decided on the conference format. Minutes will soon be available on the OSMF wiki pages. The JOSM developers, some of whom had never met before, were visibly productive as well and filled endless sheets of flipchart paper.
We are already looking forward to the next hack weekend in February 2019!
4.05.2018 | Frederik Ramm
As mentioned two months ago, we have made some changes to our download server at download.geofabrik.de. The publicly available files no longer contain user or changeset information (see below for details). You can still get everything that was there before, but files that do contain user information were moved to a server that requires you to have an OSM login.
The public server continues to be download.geofabrik.de, and the URLs and paths are unchanged. The new server that requires a login is called osm-internal.download.geofabrik.de, and while it looks largely the same as the public server, the files it offers will have a name component of `-internal` in order not to confuse them with the public files.
Here’s a table that explains which files to get where:
files | public server | internal server |
---|---|---|
.osm.pbf files for all regions (original OSM data) | yes, without user data | yes, with user data |
.osc.gz files (diff updates) | yes, without user data | yes, with user data |
.osh.pbf files for all regions (original OSM data with full history) | no | yes, with user data |
.osm.bz2 files (old XML OSM data format) | yes, without user data | no |
.shp.zip files for all regions (ESRI shape files) | yes | no |
The change was originally made on 03 May. Unfortunately, osmosis and older versions of osmium, and all programs depending on them (e.g. osm2pgsql < 0.96.0), don’t work with PBF files lacking metadata fields. They fail to read PBF files which don’t have any `username`, `uid` and `changeset` field set. To work around that, starting 05 May, we’re now writing empty strings to all `user` fields and zeros to all `changeset` and `uid` fields.
OSM extracts and diffs with full metadata are still available but you have to log in with your OSM account.
For security reasons, it is not enough to pick the cookie out of the developer console of your web browser. The cookie will become invalid after 48 hours and the server will redirect you to the authorization form of openstreetmap.org if you send that cookie after it has expired. That’s why you have to retrieve a new cookie every 48 hours or any other time the server forces you to do so.
We have published a client program written in Python (read the documentation) automating the cookie retrieval. The repository containing the client also contains the server software if you are curious or want to use it on your own server providing sensitive data.
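If you script your downloads, the 48-hour lifetime suggests checking the age of a stored cookie before each run. The following Python sketch is only an illustration and is not part of our client: the function names are made up, it assumes the cookie has been saved to a file as a single `name=value` string, and it uses the file’s modification time as a stand-in for the time the cookie was issued.

```python
import os
import time
import urllib.request
from typing import Optional

# Cookies become invalid after 48 hours (see above).
COOKIE_LIFETIME_S = 48 * 3600


def cookie_needs_refresh(cookie_path: str, now: Optional[float] = None) -> bool:
    """Return True if the stored cookie file is missing or at least 48 hours old."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(cookie_path)
    except OSError:
        # No cookie file yet: a new cookie has to be retrieved.
        return True
    return age >= COOKIE_LIFETIME_S


def build_request(url: str, cookie: str) -> urllib.request.Request:
    """Build a download request that sends the cookie to the internal server."""
    return urllib.request.Request(url, headers={"Cookie": cookie})
```

A script would call `cookie_needs_refresh()` first, run the cookie client if it returns True, and only then pass the request to `urllib.request.urlopen()`. The server may still reject a cookie earlier than that, so a redirect to the authorization form should also be treated as a signal to fetch a new one.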
This blog post may be updated as things develop.
16.03.2018 | Frederik Ramm
On May 1st this year, we’ll make the following changes to the standard download.geofabrik.de server:
- `username`, `uid` and `changeset` fields), and might* have their `timestamp` fields diluted
The general functioning of the server and all URLs will remain unchanged. We believe that the majority of users of our download server, who use the extracts to populate rendering, routing, or geocoding engines, will not be affected by the changes.
In case you want to double-check if your software supports the files with missing fields, here are sample files for the UK county of Rutland with user data removed: osm.pbf, osm.bz2, osc.gz
If you need more test data, you can generate these files on your own using the latest (unreleased) Osmium Tool and the latest (unreleased) Libosmium library. The following Osmium Tool command removes all metadata except `version` and `timestamp` fields from a file:
osmium cat -o without-user-data.osm.pbf --output-format pbf,add_metadata=version+timestamp input.osm.pbf
(and similar for .osm.bz2 or .osc.gz)
At the same time, we will introduce a new service, osm-internal.download.geofabrik.de, which will be like the “old” download service, offering full history files, regional extracts, and diffs with full user information and un-diluted time stamps. This service will be 100% free of charge like the main download server, but will require an OAuth login with the openstreetmap.org web site and might* also require a click-through user agreement in order to safeguard personal data.
*) Some details are still being worked out.
24.10.2017 | Michael Reichert
We recently rewrote the Public Transport views of our OSM Inspector. The old views had been used rarely and were of limited use, so we went for a full rewrite. While the old views aimed to give an overview of both the infrastructure and the network, the new views are focussed on the routes and stops.
The public transport schema used today was proposed back in 2010/2011 but it was adopted rather slowly. Also, many mappers made lots of mistakes when trying to use the schema because the documentation on the wiki was confusing. To clearly distinguish public transport mapping that follows this schema from old-style public transport mapping, the new schema is now called “public transport v2”, even though there really wasn’t ever a version 1.
The new views of OSM Inspector try to support good mapping according to the “public transport v2” scheme, and act as a validator.
Here’s how you can use the two new views to do some public transport mapping:
The Stops view shows stop positions and platforms (multipolygons are not supported yet). Go to an area you are familiar with and check if there are missing bus stops. Use this view on zoom levels 10 and 11 to spot (rural) areas where no bus stops are mapped at all. Just take your car or bicycle and visit them!
Platforms are rendered in blue, stop positions in green. Bus stops with `highway=bus_stop` but without a `public_transport=*` tag are rendered in orange. They are not errors if they are the only object that represents a bus stop (per direction).
The Routes view shows route relations which claim to be v2 route relations (they have the tag `public_transport:version=2`). The view shows whether a route is valid (green) or not (black) and highlights objects which are either the reason for the invalidity or which are located next to it (orange and purple).
The tagging schema defines minimum requirements a route must comply with:
- `stop` or `stop_entry_only`.
- `platform`, `platform_entry_only` or `platform_exit_only`.
(Contrary to what many mappers seem to think, you do not have to map both the stop position and the platform. One is enough. Especially for vehicles where the platform and the vehicle are short, adding stop positions is overkill. But if a stop has both, both have to be added to the route relation.)
A correct route relation of a bus line would look like this (roles are given in parentheses):
- stop position of station 1 (stop)
- platform object of station 1 (platform)
- stop position of station 2 (stop)
- platform object of station 2 (platform)
- ...
- stop position of station n (stop)
- platform of station n (platform)
- first way with highway=* ()
- second way with highway=* ()
- ...
- last way with highway=* ()
This is how it looks in JOSM’s relation editor:
As said above, mapping both stops and platforms is not mandatory. One of the two is enough. (If you know the location of the platform and/or the bus stop sign, add it. Otherwise the stop position is a good replacement.) Therefore, the route could also look like this:
To get a working validator quickly, we decided not to implement all checks. Here is the list of implemented validation tests:
- `stop`, `stop_entry_only`, `stop_exit_only`, `platform`, `platform_entry_only`, `platform_exit_only`.
- `stop`, `stop_entry_only`, `stop_exit_only`, `platform`, `platform_entry_only`, `platform_exit_only` or an empty role.
- `platform`, `platform_entry_only`, `platform_exit_only`.
- `trolley_wire=yes` (or `trolley_wire:forward/backward=yes`). Roads/tracks under construction trigger errors.
- `route=*` tag of a route relation must have one of the following values: `train`, `tram`, `subway`, `bus`, `trolleybus`, `ferry`, `aerialway`.
There are some ideas for more tests but they haven’t been implemented yet:
osmi_pubtrans3 is free software, and patches are welcome.
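For readers who want to experiment with checks like these on their own data, here is a rough Python sketch of three of them: the allowed `route=*` values, the allowed member roles, and the member order shown in the bus line example above (stops and platforms first, ways with an empty role afterwards). It is not taken from osmi_pubtrans3. The function name and the representation of members as a plain list of role strings are assumptions, and the ordering rule is inferred from the example rather than from the validator’s code.

```python
from typing import Dict, List

VALID_ROUTE_VALUES = {"train", "tram", "subway", "bus", "trolleybus", "ferry", "aerialway"}
STOP_ROLES = {"stop", "stop_entry_only", "stop_exit_only"}
PLATFORM_ROLES = {"platform", "platform_entry_only", "platform_exit_only"}


def check_route(tags: Dict[str, str], member_roles: List[str]) -> List[str]:
    """Return a list of problems found in a v2 route relation (empty if none)."""
    errors = []
    if tags.get("route") not in VALID_ROUTE_VALUES:
        errors.append("unsupported route value")
    seen_route_way = False
    for role in member_roles:
        if role in STOP_ROLES or role in PLATFORM_ROLES:
            # Assumed ordering rule: stops and platforms come before the ways.
            if seen_route_way:
                errors.append("stop or platform listed after the route ways")
        elif role == "":
            # Members with an empty role are the ways the vehicle travels on.
            seen_route_way = True
        else:
            errors.append("unknown role: " + role)
    return errors
```

Called with `{"route": "bus"}` and the roles `["stop", "platform", "stop", "platform", "", ""]`, the function returns an empty list.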
4.09.2017 | Michael Reichert
The Tagging view of our OpenStreetMap Inspector tool got two new checks (represented as layers) a few days ago. Both look for oddities in OSM data.
The freedom of OpenStreetMap to tag what you want can lead to tagging errors. One common but potentially dangerous error is the usage of tags like `disused=yes`, `abandoned=yes`, `construction=yes` and `proposed=yes` on objects.
For example, if a mapper wants to map a road which is under construction, they usually tag `highway=construction` + `construction=secondary/tertiary/<whatever>`. This is a good way of tagging because data consumers who look for roads and who are not interested in roads under construction can just look at the value of the `highway=*` key and check if its value is in a defined set of “good” values (`primary`, `secondary`, `tertiary`, …).
Unfortunately, some mappers tend(ed) to use `highway=secondary/tertiary/<whatever>` + `construction=yes` in this case. This tagging is misleading. Data users have to check if an object is marked as being under construction. Because a minor key [1] such as `disused=*` inverts another tag, we call this tagging style “misleading”. OSM should be easy to use and that’s why the tagging of these objects should be improved.
Currently, the processing software only checks for misleading tagging if there is at least one of the following “main” keys and inverting keys:
Main keys are:
- `highway`
- `railway`
- `amenity`
If the value of a main key is `disused`, `abandoned`, `razed`, `dismantled`, `proposed` or `construction`, it is ignored, further checks are skipped, and the object is not reported as erroneous.
Inverting keys are:
- `disused=*`
- `abandoned=*`
- `razed=*`
- `dismantled=*`
- `construction=*`
- `proposed=*`
Inverting keys with a value of `no` are ignored because they do not invert the meaning of the main key.
As a side effect, objects with mixed tagging like `highway=primary` + `construction=primary` are reported as errors, too. They occur if a mapper changed the object from `highway=construction` to `highway=primary` but forgot to remove `construction=primary`. It is an indicator that these objects might be worth checking.
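The rules above fit into a few lines of Python. This is a sketch for illustration only, not the code of the processing software. The function name is made up, and for objects with several main keys it assumes that a single lifecycle value on any of them is enough to skip the object.

```python
from typing import Dict

MAIN_KEYS = ("highway", "railway", "amenity")
LIFECYCLE_VALUES = {"disused", "abandoned", "razed", "dismantled", "proposed", "construction"}
INVERTING_KEYS = ("disused", "abandoned", "razed", "dismantled", "construction", "proposed")


def is_misleading(tags: Dict[str, str]) -> bool:
    """Return True if the tags combine a main key with an inverting key."""
    main_keys = [key for key in MAIN_KEYS if key in tags]
    if not main_keys:
        return False
    # highway=construction and similar values are fine: skip further checks.
    if any(tags[key] in LIFECYCLE_VALUES for key in main_keys):
        return False
    # Any inverting key with a value other than "no" makes the tagging misleading.
    return any(key in tags and tags[key] != "no" for key in INVERTING_KEYS)
```

Note that `highway=primary` + `construction=primary` is flagged as well, which matches the side effect described above.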
It occurs rather often that newbies add objects to OSM without proper tags. They manage to add a name, a description and/or a website but that’s it. This is caused by both a lack of experience and a suboptimal user interface of the editor being used (and in rare cases a lack of suitable tags). These objects are more or less hidden from any data consumer because nobody wants to parse the value of a name or description tag to guess what it represents.
The processing software distinguishes between non-feature keys and feature keys.
A key is considered a non-feature key if it begins with one of the following strings:
- `name`
- `description`
- `comment`
- `website`
- `contact:website`
The following keys are considered feature keys, i.e. they describe what the object represents:
building, landuse, highway, railway, amenity, shop, natural, waterway, power, barrier, leisure, man_made, tourism, boundary, public_transport, sport, emergency, historic, route, aeroway, place, craft, entrance, playground, aerialway, healthcare, military, building:part, training, traffic_sign, xmas:feature, seamark:type, waterway:sign, university, pipeline, club, golf, junction
The `office` tag is only a feature tag if it has a value other than `yes`.
Keys are also considered feature keys if they begin with one of the following strings:
historic, razed, demolished, abandoned, disused, construction, proposed, temporary, TMC
All other keys are considered neutral and do not influence the evaluation.
If an object has a non-feature key but no feature key, it is flagged as an error.
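Expressed as code, the evaluation could look roughly like the following Python sketch. It follows the rules listed above but is not the implementation used by the processing software, and the function name is made up.

```python
from typing import Dict

NON_FEATURE_PREFIXES = ("name", "description", "comment", "website", "contact:website")
FEATURE_KEYS = {
    "building", "landuse", "highway", "railway", "amenity", "shop", "natural",
    "waterway", "power", "barrier", "leisure", "man_made", "tourism", "boundary",
    "public_transport", "sport", "emergency", "historic", "route", "aeroway",
    "place", "craft", "entrance", "playground", "aerialway", "healthcare",
    "military", "building:part", "training", "traffic_sign", "xmas:feature",
    "seamark:type", "waterway:sign", "university", "pipeline", "club", "golf",
    "junction",
}
FEATURE_PREFIXES = ("historic", "razed", "demolished", "abandoned", "disused",
                    "construction", "proposed", "temporary", "TMC")


def lacks_feature_tag(tags: Dict[str, str]) -> bool:
    """Return True if the object has a non-feature key but no feature key."""
    has_non_feature_key = False
    for key, value in tags.items():
        if key in FEATURE_KEYS or key.startswith(FEATURE_PREFIXES):
            return False
        # office=* only counts as a feature tag if the value is not "yes".
        if key == "office" and value != "yes":
            return False
        if key.startswith(NON_FEATURE_PREFIXES):
            has_non_feature_key = True
    # Objects carrying only neutral keys are not flagged.
    return has_non_feature_key
```

An object tagged only with `name=*` and `office=yes` is flagged, while `name=*` plus `office=lawyer` is not.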
If you find such an object, please check the following:
If you help fixing these errors, you will have to do a lot of research for tags. Consider this a valuable experience – you will learn lots of tags you did not know and you will learn what could be mapped or is mapped at OSM.
Due to changes in OSRM we had to set up our own process which looks for routing errors. We now use Open Source Routing Machine’s “small components” extractor but with the default car profile. Don’t be frightened. There are lots of new errors because our validation is now strict and flags everything which is unreachable using the default car profile most OSRM users probably use.
We added a new layer, called “sinks and sources”. A similar layer existed a few years ago. It will show all one-way roads (using the default OSRM car profile) which lead into nowhere or start nowhere, i.e. are not connected to the network at both ends. See the OSM wiki for further instructions.
[1] The OSM data model itself does not differentiate between major and minor, more or less important keys. But the tag usage that has become established since 2004 does, and there are tags like `amenity=shelter` which are further distinguished by using `shelter_type=*`.
30.06.2017 | Michael Reichert
We have rolled out some changes to the OpenStreetMap Inspector in the last days. They affect the views Geometry, Tagging, Places and Highways.
From now on the data which powers the rendering of these views is generated by a tool called osmi_simpleviews. We released its code on GitHub under GPLv3. It’s a C++ program which uses the libosmium library by Jochen Topf to read the planet and work with the objects in it, and GDAL to write the errors to a Spatialite database (other formats are also possible, but not properly tested). It generates one Spatialite file per view. It’s open source and you can run it on your own.
The Spatialite database is copied from a processing server at Hetzner’s data centre to the machine where all our tools available at tools.geofabrik.de are hosted. That machine uses the Spatialite file as a data source for the WMS service (Mapserver).
While the main goal was to reimplement things in C++ (instead of Perl and C previously), we changed some things which we want to explain here. Have a look at the OpenStreetMap Wiki where the full documentation of the views is located.
This view (documentation) shows errors and potential errors regarding the geometry of ways.
Tagging errors and strange tags are shown by the Tagging view (documentation).
- `fixme=*` and `todo=*` or which has any key with the value `fixme`. This means that `fixme=continue`, `fiXme=something`, or `highway=FixMe` are shown. Values which contain `fixme` preceded or followed by different characters are not shown any more to reduce the number of false positives.
Please keep in mind that the Tagging view is no invitation to do mechanical edits like changing all occurrences of a misspelled tag using the search&replace feature of your favourite editor. Please review all the objects manually, look into their history and check why they were written wrong. Maybe you will uncover a much larger problem which should be fixed at its roots instead of just cutting off the parts above ground. Read OpenStreetMap’s guideline for mechanical edits.
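Coming back to the fixme rule in the list above: a minimal Python sketch of such a check could look like this. It is not the osmi_simpleviews code (which is written in C++). The function name is made up, and treating the `todo` key case-insensitively is an assumption; the post only shows mixed case for `fixme`.

```python
from typing import Dict


def has_fixme(tags: Dict[str, str]) -> bool:
    """Return True if an object has a fixme/todo key or any value equal to fixme."""
    for key, value in tags.items():
        # fixme=continue and fiXme=something both match on the key.
        if key.lower() in ("fixme", "todo"):
            return True
        # highway=FixMe matches on the value; "fixme later" does not, because
        # values with extra characters around fixme are no longer reported.
        if value.lower() == "fixme":
            return True
    return False
```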
Places are a core feature of many maps and good data of places in OSM is important. But not only names are important, population numbers are a rather objective method to classify places of equal category and help map renderers to prefer the larger of two neighbouring cities.
The Places view (documentation) is a special-topic map showing places and only places above or without a base map. It highlights missing names and anomalies in the data.
- `place=neighbourhood` and `place=hamlet`. They are not flagged as “unknown value” any more. We added two new layers for them.
How should you reach a place if there is no way (`highway=*`)? Some checks are done by our Highways view (documentation). We did not change very much with this view:
- `deprecated` was removed. It used to show `highway=unsurfaced` and `highway=minor`, two very old and deprecated tags which completely disappeared from OSM some time ago, i.e. the layer was empty.
From now on, the processing software of almost all views is open source. You can search for the errors on your own, e.g. if you need more frequent updates.
16.03.2017 | Frederik Ramm
If you’re running a world-wide OSM tile or Nominatim server and you are consuming updates more frequently than once per hour, you might find that your update process got stuck at around 21:00 UTC on March 12, 2017. At that time, a relation (#7066589) was uploaded to OSM that had more than 2^15-1 (32,767) members. This is currently allowed by the API, but triggers a failure in the osm2pgsql utility used to update tile server and Nominatim databases. Depending on how your update process is constructed and when it runs, you could be lucky: the relation was deleted 55 minutes later and might therefore never have reached your osm2pgsql. But if your update process is constructed such that it downloads a diff file from OSM and then tries to apply that no matter what, then your process will be stuck because the file containing the large relation can never be successfully applied.
Your database will not be affected if you do not use the minutely, global updates from OSM. If you use them, you can either check your log files, or look at your database and check if an object created shortly after the problem polygon is there or not. If it is there, then everything is good; if not, then you need to investigate (of course it is possible that your updates are broken for a different reason and maybe for longer).
For a Nominatim database, try this:
nominatim=# select class from place where osm_type='R' and osm_id=7066630;
  class
----------
 boundary
(1 row)
For a tile server database, try this:
osm=# select boundary from planet_osm_polygon where osm_id=-7066630;
    boundary
----------------
 administrative
(1 row)
In each case, if you don’t get a result back then your database does not include a small boundary relation that was created shortly after the problem relation. (To those who read this long after 16 March 2017, note that the test relation we’re using here could have been changed or deleted on OSM in the meantime, and that would of course mean it’s normal that it wouldn’t be in your data; check http://www.openstreetmap.org/relation/7066630 before you panic.)
Nominatim, when run in an update loop with `--import-osmosis-all`, will create a file called “osmosischange.osc” in its data subdirectory, and will try to apply that every time it wakes up from its sleep. Fix the file by removing everything between `<relation id="7066589"...` and the matching `</relation>`.
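If you would rather not edit the change file by hand, the removal can be scripted. The following Python sketch uses the standard library’s ElementTree to drop every `<relation>` element with a given ID from an osmChange document. The function name is made up, and the sketch loads the whole document into memory, which we assume is acceptable for a single change file.

```python
import xml.etree.ElementTree as ET


def drop_relation(osc_xml: str, relation_id: str) -> str:
    """Return the osmChange XML without any <relation> element of the given ID."""
    root = ET.fromstring(osc_xml)
    # Collect the elements first so the tree is not modified while walking it.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "relation" and child.get("id") == relation_id:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")
```

Read the file, call `drop_relation(content, "7066589")` and write the result back. Keep a backup of the original file until the update has run through.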
Tile servers will use different ways to update and potentially accumulate changes; one method that we frequently use when setting up servers is to collect un-applied changes in a file named `/srv/osm-replicate/var/merged.osc`. If you have such a file then both the addition and the deletion of the relation are likely in that file, and you can either delete the relation manually (as described above), or run `osmosis --read-xml-change merged.osc --sort-change --simc --write-xml-change sorted.osc && mv sorted.osc merged.osc` to squash the relation from the file. If you don’t have such a file, try to find out where your tile server downloads change files to, and get rid of the relation.
The issue has led to a patch in osm2pgsql discussed here on GitHub. This patch will make osm2pgsql ignore too-large relations.
23.01.2017 | Frederik Ramm
We’ve just rolled out a small enhancement to the OSM extracts available on our download server, concerning the way we deal with objects that cross an extract boundary. In the last couple of years we’ve used a program called osm-history-splitter to create the extracts. This program is based on an older version of the Osmium library and is able to ensure referential integrity on the way/node level only.
The ways shown as a and b in the sketch above would have been fully contained in the extract cut out along the dotted boundary. A polygon formed by a relation c in which at least one way lies fully outside the boundary, however, would not be constructable from the extract because those ways would be missing.
We’ve now switched to the newest version of osmium, the command line companion to the libosmium library, for producing the extracts and deriving the change files. This allows us to offer referential integrity for boundary-crossing multipolygon relations, while other (non-multipolygon) relations are still handled the same as before. This is important because otherwise a large boundary or route relation crossing a small extract would blow up the size of that extract too much.
With the new complete multipolygon relations (called the “smart” strategy in osmium), the extracts we offer have seen a size increase of just 0.5% on average. Some very small extracts with lots of border-crossing multipolygons have become much larger – the Andorra extract, for example, is now three times as big as before. But we believe it is worth it! If you process nightly updates you’ll likely see a small spike today, with today’s update bringing in all those extra ways needed to complete polygons.
Using the new software has also brought down the overall processing time from around 10 hours to under 4 hours, a reduction of about 60%. Kudos to Jochen Topf and Mapbox for their tireless improvements to Osmium! This is a nice proof that even in times of the ever-scalable cloud, solid engineering still has its place.
29.09.2016 | Frederik Ramm
Our little band of Geofabrik people is back from this year’s State of the Map conference. Christine and Frederik, Geofabrik directors, were there as part of the SotM working group (Christine) and speaker (Frederik); Rory, a Geofabrik employee since 2014, even did two talks, and Michael, who is currently writing his master’s thesis at Geofabrik, was a speaker too.
Frederik, Rory, and Michael proudly wearing their #craftmapper t-shirts
SotM started out with a really nice keynote from the US ambassador to Turkmenistan, Allan Mustard, who sounded like a proper hacker when he playfully said: “And then I realized that I was the ambassador, and all these people were working for me, and I could tell them what to do!” (and sent them mapping Ashgabat). We had overheard a few people fearing the worst: “Well, a keynote by a diplomat, that sounds tiring” – they were most positively surprised.
As always, State of the Map provided amazing insights into what goes on in the OpenStreetMap universe. The project is growing steadily, and new people bring new ideas. There are new communities, new fields of endeavour, and new technologies every year – and we were lucky enough with the weather to be able to have lunch on the landuse=grass outside!
Rory presented on vector tile work he’d been doing for Geofabrik, as well as on his long-time project of Irish townland mapping. Michael was at his best explaining the nuts and bolts of railway mapping (including exactly which places in which trains were the best spots). Frederik held forth, as he likes to do, on mechanical edits in OpenStreetMap and why he didn’t encourage them.
The social event was held at the “Event Brewery” serving nice food (including Belgian waffles) in a great atmosphere.
We all had a good time and we’d like to thank the whole SotM team for running the conference. We’re looking forward to next year. We even heard the idea of an African SotM being discussed.