
Blog

Tagging and Routing View Got Additional Layers

4.09.2017 | Michael Reichert

The Tagging view of our OpenStreetMap Inspector tool got two new checks (represented as layers) a few days ago. Both look for oddities in OSM data.

Hidden Non-Operational Tagging

screen shot of OSM Inspector showing a toilet with disused=yes

The freedom in OpenStreetMap to tag whatever you want can lead to tagging errors. One common and potentially dangerous error is the use of tags like disused=yes, abandoned=yes, construction=yes and proposed=yes on objects.

For example, if a mapper wants to map a road which is under construction, they usually tag highway=construction + construction=secondary/tertiary/<whatever>. This is a good way of tagging because data consumers who look for roads and who are not interested in roads under construction can just look at the value of the highway=* key and check if its value is in a defined set of “good” values (primary, secondary, tertiary, …).

Unfortunately, some mappers tend(ed) to use highway=secondary/tertiary/<whatever> + construction=yes in this case. Data users then have to check every object for an additional tag that marks it as being under construction. Because a minor key [1] such as disused=* inverts the meaning of another tag, we call this tagging style “misleading”. OSM should be easy to use, and that’s why the tagging of these objects should be improved.

Currently, the processing software only checks an object for misleading tagging if it has at least one of the following “main” keys and at least one of the inverting keys:

Main keys are:

  • highway
  • railway
  • amenity

If the value of a main key is disused, abandoned, razed, dismantled, proposed or construction, it is ignored, further checks are skipped, and the object is not reported as erroneous.

Inverting keys are:

  • disused=*
  • abandoned=*
  • razed=*
  • dismantled=*
  • construction=*
  • proposed=*

Inverting keys with a value of no are ignored because they do not invert the meaning of the main key.

As a side effect, objects with mixed tagging like highway=primary + construction=primary are reported as errors, too. They occur if a mapper changed the object from highway=construction to highway=primary but forgot to remove construction=primary. It is an indicator that these objects might be worth checking.
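The rules above can be sketched in a few lines of Python. This is only an illustration of the logic as described in this post, not the code of the processing software; the function name has_misleading_tagging and the representation of tags as a plain dict are assumptions. It also reads "further checks are skipped" as "the whole object is skipped as soon as one main key has a lifecycle value".

```python
MAIN_KEYS = ("highway", "railway", "amenity")
INVERTING_KEYS = ("disused", "abandoned", "razed", "dismantled",
                  "construction", "proposed")
# Main-key values that already say "not operational" in the recommended way.
LIFECYCLE_VALUES = set(INVERTING_KEYS)


def has_misleading_tagging(tags):
    """Return True if an operational-looking main tag is inverted by a minor key.

    `tags` is a dict mapping OSM keys to values (hypothetical input format).
    """
    main = [key for key in MAIN_KEYS if key in tags]
    if not main:
        return False  # only objects with a main key are checked
    if any(tags[key] in LIFECYCLE_VALUES for key in main):
        return False  # e.g. highway=construction: fine, skip the object
    # Inverting keys with the value "no" do not invert anything.
    return any(tags.get(key, "no") != "no" for key in INVERTING_KEYS)
```

With this sketch, highway=secondary + construction=yes is flagged, highway=construction + construction=secondary is not, and the leftover highway=primary + construction=primary is flagged as well.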

Name/Description without Important Tags

screen shot of OSM Inspector showing a node which only has a name tag

It occurs rather often that newbies add objects to OSM without proper tags. They manage to add a name, a description and/or a website but that’s it. This is caused by both a lack of experience and a suboptimal user interface of the editor being used (and in rare cases a lack of suitable tags). These objects are more or less hidden from any data consumer because nobody wants to parse the value of a name or description tag to guess what it represents.

The processing software distinguishes between non-feature keys and feature keys.

A key is considered a non-feature key if it begins with one of the following strings:

  • name
  • description
  • comment
  • website
  • contact:website

The following keys are considered feature keys, i.e. they describe what the object represents:

building, landuse, highway, railway, amenity, shop, natural, waterway, power, barrier, leisure, man_made, tourism, boundary, public_transport, sport, emergency, historic, route, aeroway, place, craft, entrance, playground, aerialway, healthcare, military, building:part, training, traffic_sign, xmas:feature, seamark:type, waterway:sign, university, pipeline, club, golf, junction

The office tag is only a feature tag if it has a value other than yes.

Keys are also considered feature keys if they begin with one of the following strings:

historic, razed, demolished, abandoned, disused, construction, proposed, temporary, TMC

All other keys are considered neutral and do not influence the evaluation.

If an object has a non-feature key but no feature key it is flagged as an error.
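As a rough illustration, the classification described above could look like the following Python sketch. It is not the code of the processing software; the function names and the dict-based tag representation are assumptions, and the key lists are copied from this post.

```python
NON_FEATURE_PREFIXES = ("name", "description", "comment", "website",
                        "contact:website")
FEATURE_KEYS = {
    "building", "landuse", "highway", "railway", "amenity", "shop", "natural",
    "waterway", "power", "barrier", "leisure", "man_made", "tourism",
    "boundary", "public_transport", "sport", "emergency", "historic", "route",
    "aeroway", "place", "craft", "entrance", "playground", "aerialway",
    "healthcare", "military", "building:part", "training", "traffic_sign",
    "xmas:feature", "seamark:type", "waterway:sign", "university", "pipeline",
    "club", "golf", "junction",
}
FEATURE_PREFIXES = ("historic", "razed", "demolished", "abandoned", "disused",
                    "construction", "proposed", "temporary", "TMC")


def is_feature_tag(key, value):
    """Return True if the tag says what kind of object this is."""
    if key == "office":
        return value != "yes"  # office=yes does not count as a feature tag
    return key in FEATURE_KEYS or key.startswith(FEATURE_PREFIXES)


def lacks_feature_tags(tags):
    """Return True if the object would be flagged by this check."""
    has_non_feature = any(key.startswith(NON_FEATURE_PREFIXES) for key in tags)
    has_feature = any(is_feature_tag(k, v) for k, v in tags.items())
    return has_non_feature and not has_feature
```

Neutral keys such as opening_hours neither trigger nor suppress the flag in this sketch, which matches the description above.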

How To Repair?

If you find such an object, please check the following:

  • If the object has a description tag and the description contains non-factual information (e.g. “the best bakery in A-Town, founded by James Smith in 1786”), it might be some kind of search engine optimization. Remove all non-factual information (it might be justified to delete the description completely).
  • If the object has a website tag, check if the website still exists. We don’t need closed shops.
  • Check if the object is located properly. If it is obviously wrong (e.g. center of the road) or on top of an existing object, it might have been uploaded automatically.
  • Look for a proper tagging of the object. Add these tags. There are various sources of information:
  • Is the user a newbie who joined OSM recently? If yes, write a friendly welcome message. If they joined more than a month ago and are no longer active, use your time for more useful tasks.

If you help fix these errors, you will have to do a lot of research for tags. Consider this a valuable experience – you will learn lots of tags you did not know and you will learn what could be mapped or is mapped at OSM.

Changes To The Routing View

Due to changes in OSRM we had to set up our own process which looks for routing errors. We now use Open Source Routing Machine’s “small components” extractor but with the default car profile. Don’t be frightened. There are lots of new errors because our validation is now strict and flags everything which is unreachable using the default car profile most OSRM users probably use.

screen shot of OSMI showing islands and sinks & sources layer

We added a new layer, called “sinks and sources”. A similar layer existed a few years ago. It will show all one-way roads (using the default OSRM car profile) which lead into nowhere or start nowhere, i.e. are not connected to the network at both ends. See the OSM wiki for further instructions.

[1] The OSM data model itself does not differentiate between major and minor, or more and less important, keys. But the tagging practice that has become established since 2004 does, and there are tags like amenity=shelter which are further distinguished by using shelter_type=*.

Changes To The OpenStreetMap Inspector

30.06.2017 | Michael Reichert

Places View of OpenStreetMap Inspector

We have rolled out some changes to the OpenStreetMap Inspector over the last few days. They affect the views Geometry, Tagging, Places and Highways.

From now on the data which powers the rendering of these views is generated by a tool called osmi_simple_views. We released its code on GitHub under GPLv3. It’s a C++ program which uses the libosmium library by Jochen Topf to read the planet and work with the objects in it, and GDAL to write the errors to a Spatialite database (other formats are also possible, but not properly tested). It generates one Spatialite file per view. It’s open source and you can run it on your own.

The Spatialite database is copied from a processing server at Hetzner’s data centre to the machine that hosts all our tools available at tools.geofabrik.de. That machine uses the Spatialite file as a data source for the WMS service (Mapserver).

While the main goal was to reimplement things in C++ (instead of Perl and C previously), we changed some things which we want to explain here. Have a look at the OpenStreetMap Wiki where the full documentation of the views is located.

Geometry

This view (documentation) shows errors and potential errors regarding the geometry of ways.

  • Ways with long segments displays ways which have very long segments between two nodes. The threshold for what counts as “very long” has been changed from 0.3 degrees (previously) to 20 km.
  • Duplicate node in way used to only flag ways that contained the same node twice in sequence, and has been extended to also flag ways that contain two different nodes which share the same location.
  • We increased the minimum zoom level of some layers to speed up rendering on low zoom levels.
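To illustrate the long-segment check, here is a minimal Python sketch. It is not taken from osmi_simple_views: the post does not say how the distance is computed, so the haversine formula, the Earth radius constant and the function names are assumptions of this example. One plausible benefit of a threshold in kilometres is that it behaves the same at every latitude, whereas 0.3 degrees of longitude is roughly 33 km at the equator but only about 17 km at 60° latitude.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption of this sketch)
MAX_SEGMENT_KM = 20.0     # threshold named in the post


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def long_segments(coords, max_km=MAX_SEGMENT_KM):
    """Return (segment index, length in km) for each segment longer than max_km.

    `coords` is the way's node list as (lon, lat) tuples.
    """
    hits = []
    for i, ((lon1, lat1), (lon2, lat2)) in enumerate(zip(coords, coords[1:])):
        length = haversine_km(lon1, lat1, lon2, lat2)
        if length > max_km:
            hits.append((i, length))
    return hits
```

For a way given as [(0.0, 0.0), (0.1, 0.0), (0.4, 0.0)], only the second segment exceeds the threshold.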

Tagging

Tagging errors and strange tags are shown by the Tagging view (documentation).

  • The layer Misspelled key has been removed. Some of the functionality is provided by the “Similar” tab in Taginfo. Just search for a well known key on Taginfo and open the Similar tab of this key. It looks like this for the key building.
  • The layer Tagged with FIXME was modified. It shows every node and way which has fixme=* and todo=* or which has any key with the value fixme. This means that fixme=continue, fiXme=something, or highway=FixMe are shown. Values which contain fixme preceded or followed by other characters are no longer shown, which reduces the number of false positives.
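The FIXME rule can be sketched as follows. This is an interpretation, not the actual implementation: it assumes that an object with either a fixme or a todo key is shown, and it infers case-insensitive matching from the examples fiXme=something and highway=FixMe. The function name is hypothetical.

```python
FIXME_KEYS = {"fixme", "todo"}


def is_tagged_fixme(tags):
    """Return True if a tag dict would show up in the FIXME layer (as read here)."""
    for key, value in tags.items():
        # A fixme/todo key with any value, compared case-insensitively (assumption).
        if key.lower() in FIXME_KEYS:
            return True
        # Any key whose value is exactly "fixme", ignoring case; values that
        # merely contain the word are deliberately not matched.
        if value.lower() == "fixme":
            return True
    return False
```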

Please keep in mind that the Tagging view is no invitation to do mechanical edits like changing all occurrences of a misspelled tag using the search&replace feature of your favourite editor. Please review all the objects manually, look into their history and check why they were written wrong. Maybe you will uncover a much larger problem which should be fixed at its roots instead of just cutting off the parts above ground. Read OpenStreetMap’s guideline for mechanical edits.

Places

Places are a core feature of many maps, and good data about places in OSM is important. Names are not the only thing that matters: population numbers are a rather objective way to rank places of the same category and help map renderers to prefer the larger of two neighbouring cities.

The Places view (documentation) is a special-topic map showing places and only places above or without a base map. It highlights missing names and anomalies in the data.

  • This view now also supports place=neighbourhood and place=hamlet. They are not flagged as “unknown value” any more. We added two new layers for them.
  • The layer population number format was merged into population not a number as part of the rewrite. Every object is flagged if the value is not a plain number. The number must be an integer equal to or larger than 0 and must not be prefixed or followed by any characters – not even spaces – to make it easier for data users to parse the number.
  • Unusual population size was extended by some more checks.
  • We increased the minimum zoom level of some layers to speed up rendering on low zoom levels.
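The population rule described above (a non-negative integer with no other characters, not even spaces) maps naturally onto a full-string match. The following Python sketch shows one way to express it; the function name is hypothetical, and whether the real check accepts leading zeros is not stated in this post.

```python
import re

# ASCII digits only; fullmatch() rejects anything before or after them.
_PLAIN_NUMBER = re.compile(r"[0-9]+")


def population_is_plain_number(value):
    """Return True if a population=* value is a plain non-negative integer."""
    return _PLAIN_NUMBER.fullmatch(value) is not None
```

Values such as "12 345", " 100", "ca. 100", "1.5" or "-5" are rejected, so a data user can pass accepted values straight to an integer parser.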

Highways

How should you reach a place if there is no way (highway=*)? Some checks are done by our Highways view (documentation). We did not change very much with this view:

  • The layer deprecated was removed. It used to show highway=unsurfaced and highway=minor, two very old and deprecated tags which completely disappeared from OSM some time ago, i.e. the layer was empty.

Open Source OSM Inspector

From now on, the processing software of almost all views is open source. You can search for the errors on your own, e.g. if you need more frequent updates.

  • osmi_simple_views is the program which powers the views Geometry, Tagging, Places and Highways.
  • osmcoastline by Jochen Topf powers the Coastlines View.
  • area_stats_and_report by Jochen Topf powers the Areas View, which shows broken polygons and multipolygon relations.
  • osmi-addresses by Lukas Toggenburger powers the Addresses View.
  • osmi-water by Nathanael Lang powers the Water View.

Have your OSM updates suddenly stopped?

16.03.2017 | Frederik Ramm

If you’re running a world-wide OSM tile or Nominatim server and you are consuming updates more frequently than once per hour, you might find that your update process got stuck at around 21:00 UTC on March 12, 2017. At that time, a relation (#7066589) was uploaded to OSM that had more than 2^15-1 (32,767) members. This is currently allowed by the API, but triggers a failure in the osm2pgsql utility used to update tile server and Nominatim databases. Depending on how your update process is constructed and when it runs, you could be lucky – the relation was deleted 55 minutes later and might therefore never have reached your osm2pgsql. But if your update process is constructed such that it downloads a diff file from OSM and then tries to apply that no matter what, then your process will be stuck because the file containing the large relation can never be successfully applied.

How can I find out if my OSM database is affected?

Your database will not be affected if you do not use the minutely, global updates from OSM. If you use them, you can either check your log files, or look at your database and check if an object created shortly after the problem polygon is there or not. If it is there, then everything is good; if not, then you need to investigate (of course it is possible that your updates are broken for a different reason and maybe for longer).

For a Nominatim database, try this:

nominatim=# select class from place where osm_type='R' and osm_id=7066630;
class
----------
boundary
(1 row)

For a tile server database, try this:

osm=# select boundary from planet_osm_polygon where osm_id=-7066630;
boundary
----------------
administrative
(1 row)

In each case, if you don’t get a result back then your database does not include a small boundary relation that was created shortly after the problem relation. (To those who read this long after 16 March 2017, note that the test relation we’re using here could have been changed or deleted on OSM in the meantime, and that would of course mean it’s normal that it wouldn’t be in your data – check http://www.openstreetmap.org/relation/7066630 before you panic).

How can I repair the problem?

Nominatim, when run in an update loop with “--import-osmosis-all”, will create a file called “osmosischange.osc” in its data subdirectory, and will try to apply that every time it wakes up from its sleep. Fix the file by removing everything between <relation id="7066589"... and the matching </relation>.
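If you would rather not edit the file by hand, the removal can be scripted. The following Python sketch is one possible approach, not an official Nominatim tool: the function name drop_relation is hypothetical, and it assumes the file is a regular OsmChange document in which objects sit inside <create>, <modify> and <delete> blocks. It loads the whole document into memory, so keep a backup copy and expect it to be slow on very large change files.

```python
import xml.etree.ElementTree as ET


def drop_relation(osc_xml, relation_id):
    """Remove every version of one relation from OsmChange XML text.

    Returns the cleaned XML as a string and the number of elements removed.
    """
    root = ET.fromstring(osc_xml)
    removed = 0
    for action in root:  # <create>, <modify> or <delete> blocks
        # findall() returns a list, so removing while looping is safe.
        for relation in action.findall("relation"):
            if relation.get("id") == str(relation_id):
                action.remove(relation)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

To use it, read osmosischange.osc into a string, call drop_relation(text, 7066589), and write the returned XML back to the file before the next update run.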

Tile servers will use different ways to update and potentially accumulate changes; one method that we frequently use when setting up servers is to collect un-applied changes in a file named /srv/osm-replicate/var/merged.osc. If you have such a file then both the addition and the deletion of the relation are likely in that file, and you can either delete the relation manually (as described above), or run osmosis --read-xml-change merged.osc --sort-change --simc --write-xml-change sorted.osc && mv sorted.osc merged.osc to squash the relation from the file. If you don’t have such a file, try to find out where your tile server downloads change files to, and get rid of the relation.

Learn more

The issue has led to a patch in osm2pgsql discussed here on GitHub. This patch will make osm2pgsql ignore too-large relations.

Improving Geofabrik OSM Extracts

23.01.2017 | Frederik Ramm

We’ve just rolled out a small enhancement to the OSM extracts available on our download server, concerning the way we deal with objects that cross an extract boundary. In the last couple of years we’ve used a program called osm-history-splitter to create the extracts. This program is based on an older version of the Osmium library and is able to ensure referential integrity on the way/node level only.

Referential Integrity when making extracts

Referential Integrity when making extracts. Cases a and b covered until now, case c additionally covered from now on.

The ways shown as a and b in the sketch above would have been fully contained in the extract cut out along the dotted boundary. A polygon formed by a relation c in which at least one way lies fully outside the boundary, however, would not be constructable from the extract because those ways would be missing.

We’ve now switched to the newest version of osmium, the command line companion to the libosmium library, for producing the extracts and deriving the change files. This allows us to offer referential integrity for boundary-crossing multipolygon relations, while other (non-multipolygon) relations are still handled the same as before. This is important because otherwise a large boundary or route relation crossing a small extract would blow up the size of that extract too much.

With the new complete multipolygon relations (called the “smart” strategy in osmium), the extracts we offer have seen a size increase of just 0.5% on average. Some very small extracts with lots of border-crossing multipolygons have become much larger – the Andorra extract, for example, is now three times as big as before. But we believe it is worth it! If you process nightly updates you’ll likely see a small spike today, with today’s update bringing in all those extra ways needed to complete polygons.

Using the new software has also brought down the overall processing time from around 10 hours to under 4 hours, a reduction of roughly 60%. Kudos to Jochen Topf and Mapbox for their tireless improvements to Osmium! This is a nice proof that even in times of the ever-scalable cloud, solid engineering still has its place.

Our little band of Geofabrik people is back from this year’s State of the Map conference. Christine and Frederik, Geofabrik directors, were there as a member of the SotM working group (Christine) and as a speaker (Frederik); Rory, a Geofabrik employee since 2014, even did two talks, and Michael, who is currently writing his master’s thesis at Geofabrik, was a speaker too.


Frederik, Rory, and Michael proudly wearing their #craftmapper t-shirts

SotM started out with a really nice keynote from the US ambassador to Turkmenistan, Allan Mustard, who sounded like a proper hacker when he playfully said: “And then I realized that I was the ambassador, and all these people were working for me, and I could tell them what to do!” (and sent them mapping Ashgabat). We had overheard a few people fearing the worst: “Well, a keynote by a diplomat, that sounds tiring” – they were most positively surprised.

As always, State of the Map provided amazing insights into what goes on in the OpenStreetMap universe. The project is growing steadily, and new people bring new ideas. There are new communities, new fields of endeavour, and new technologies every year – and we were lucky enough with the weather to be able to have lunch on the landuse=grass outside!

Rory presented on vector tile work he’d been doing for Geofabrik, as well as on his long-time project of Irish townland mapping. Michael was at his best explaining the nuts and bolts of railway mapping (including exactly which places in which trains were the best spots). Frederik held forth, as he likes to do, on mechanical edits in OpenStreetMap and why he didn’t encourage them.

The social event was held at the “Event Brewery” serving nice food (including Belgian waffles) in a great atmosphere.

We all had a good time and we’d like to thank the whole SotM team for running the conference. We’re looking forward to next year. We even heard the idea of an African SotM being discussed.