Over the last years, research on OpenStreetMap data quality has become increasingly popular. At this year’s FOSS4G, I had the honor to present some work we did at the AIT to assess OSM quality in Vienna, Austria. In the meantime, our paper “Towards an Open Source Analysis Toolbox for Street Network Comparison” has been published for early access. Thanks to the conference organizers who made this possible! I’ve implemented comparison tools found in related OSM literature as well as new tools for oneway street and turn restriction comparison using Sextante scripts and models for QGIS 1.8. All code is available on Github to enable collaboration. If you are interested in OSM data quality research, I’d like to invite you to give the tools a try.
Since most users probably don’t have access to QGIS 1.8 anymore, I’ll be updating the tools to QGIS 2.0 Processing. I’m starting today with the positional accuracy comparison tool. It is based on a method described by Goodchild & Hunter (1997). Here’s the corresponding slide from my FOSS4G presentation:
The basic idea is to evaluate the positional accuracy of a street graph by comparing it with a reference graph. To do that, we check how much of the graph lies within a certain tolerance (buffer) of the reference graph.
The processing model uses the following input: the two street graphs which should be compared, the size of the buffer (tolerance for positional accuracy), a polygon layer with analysis regions, and the field containing the region id. This is how the model looks in Processing modeler:
First, all layers are reprojected into a common CRS. This will have to be adjusted if the tool is used in other geographic regions. Then the reference graph is buffered and – since I found that dissolving buffers directly in the buffer tool can become very slow with big datasets – the faster difference tool is used to dissolve the buffers before we calculate the graph length inside the buffer (
inbufLEN) as well as the total graph length in the analysis region (
totalLEN). Finally, the two results are joined based on the region id field and the percentage of graph length within the buffered reference graph (
inbufPERC) is calculated. A high percentage shows that both graphs agree very well geometrically.
The following image shows the tool applied to a sample of OpenStreetMap (red) and official data published by the city of Vienna (purple) at Wien Handelskai. OSM was used as a reference graph and the buffer size was set to 10 meters.
In general, both graphs agree quite well. The percentage of the official graph within 10 meters of the OSM graph is 93% in the 20th district. In the above image, we can see that links available in OSM are not contained in the official graph (mostly pedestrian/bike links) and there seem to be some connectivity issues as well in the upper right corner of the image.
In my opinion, Processing models are a great solution to document geoprocessing work flows and share them with others. If you want to collaborate on building more models for OSM-related analysis, just leave a comment bellow.