The city of Vienna provides both subdistrict geometries and population statistics. Mapping the city’s population density should be straightforward, right? Let’s see …

We should be able to join on ZBEZ and SUB_DISTRICT_CODE, check! But what about the actual population counts? Unfortunately, there is no file which simply lists population per subdistrict. The file I found contains four lines for each subdistrict: females 2011, males 2011, females 2012 and males 2012. That’s not the perfect format for mapping general population density.

A quick way to prepare our input data is applying pivot tables, eg. in Open Office: The goal is to have one row per subdistrict and columns for population in 2011 and 2012:

Export as CSV, add CSVT and load into QGIS. Finally, we can join geometries and CSV table:

A quick look at the joined data confirms that each subdistrict now has a population value. But visualizing absolute values results in misleading maps. Big subdistricts with only average density will overrule smaller but much denser subdistricts:

That’s why we need to calculate population density. This is easy to do using Field Calculator. The subdistrict file already contains area values but even if they were missing, we could calculate it using the $area operator: "pop2012" / ($area / 10000). The resulting population density in population per ha finally shows which subdistricts are the most densely populated:

One could argue that this is still no accurate representation of population density: Big parts of some subdistricts are actually covered by water – especially along the Danube – and therefore uninhabited. There are also big parks which could be excluded from the subdistrict area. But that’s going to be the topic of another post.

If you want to use my results so far, you can download the GeoJSON file from Github.

Today, I’ve finished my submission for the Hubway Data Visualization Challenge. All parts of the resulting dataviz were created using open source tools. My toolbox for this work contains: QGIS, Spatialite, Inkscape, Gimp and Open Office Calc. To see the complete submission and read more about it, check the project page.

Continuing where I left off yesterday, I went on to creating another visualization based on the assumption that people like to bike downhill but will be more reluctant to bike uphill.

As described in yesterday’s post, the Hubway network features stations with a negative bike balance (more bikes leaving than arriving) and others with a positive balance. Positively balanced stations are where people like biking to while negatively balanced stations are less attractive – like if they were on top of a hill and hard to reach.

Using this assumption and QGIS Grid Interpolation and Relief tool, I’ve created the following biking landscape:

With a negative balance of -2.76 bikes per day, Mayor Thomas M. Menino – Government Center station is the most unpopular station to bike to. Assuming the trip dataset is complete, this means that Hubway has to regularly transport bikes from positively balanced stations to this one to keep the system going.

Due to some trips containing NULL values in start/end station id, the total number of incoming and outgoing trips does not add up perfectly. But otherwise, this should represent the state of Hubway’s biking landscape pretty well.

Today, I’ve been working on some station statistics. From the trip data, I calculated incoming and outgoing trips per station as well as the station’s first day of operations. Combining this information makes it possible to calculate the average day’s “bike balance”. A balanced station has the same number of incoming and outgoing trips while an unbalanced station will either run out of bikes or empty slots for returns.

I’ve published the resulting station map on QGIS Cloud (http://qgiscloud.com/anitagraser/hubway_cloud1) where you can have a look at the bike balance values.

Additionally, I’ve created a mashup in Leaflet pulling together background tiles from Stamen and the cloud-hosted WMS for better orientation:

Today, I’ve been experimenting with a new way to visualize origin-destination pairs (ODs). The following image shows my first results:

The ideas was to add a notion of direction as well as uncertainty. The “flower petals” have a pointed origin and grow wider towards the middle. (Looking at the final result, they should probably go much narrower towards the end again.) The area covered by the petals is a simple approximation of where I’d expect the bike routes without performing any routing.

To get there, I reprojected the connection lines to EPSG:3857 and calculated connection length and line orientation using QGIS Field Calculator $length operator and the bearing formula given in QGIS Wiki:

(atan((xat(-1)-xat(0))/(yat(-1)-yat(0)))) * 180/3.14159 + (180 *(((yat(-1)-yat(0)) < 0) + (((xat(-1)-xat(0)) < 0 AND (yat(-1) - yat(0)) >0)*2)))

For the style, I created a new “flower petal” SVG symbol in Inkscape and styled it with varying transparency values: Rare connections are more transparent than popular ones. This style is applied to the connection start points. Using the advanced options “size scale” and “rotation”, it is possible to rotate the petals into the right direction as well as scale them using the previously calculated values for connection length and orientation.

Update

While the above example uses pretty wide petals this one is done with a much narrower petal. I think it’s more appropriate for the data at hand:

Most of the connections are clearly heading south east, across Charles River, except for that group of connections pointing the opposite direction, to Harvard Square.

Hubway is a bike sharing system in Boston and they are currently hosting a data visualization challenge. What a great chance to play with some real-world data!

To get started, I loaded both station Shapefile and trip CSV into a new Spatialite database. The GUI is really helpful here – everything is done in a few clicks. Afterwards, I decided to look into which station combinations are most popular. The following SQL script creates my connections table:

create table connections (
start_station_id INTEGER,
end_station_id INTEGER,
count INTEGER,
Geometry GEOMETRY);


insert into connections select 
start_station_id, 
end_station_id, 
count(*) as count, 
LineFromText('LINESTRING('||X(a.Geometry)||' '||Y(a.Geometry)||','
                          ||X(b.Geometry)||' '||Y(b.Geometry)||')') as Geometry
 from trips, stations a, stations b
where start_station_id = a.ID 
and end_station_id = b.ID
and a.ID != b.ID
and a.ID is not NULL
and b.ID is not NULL
group by start_station_id, end_station_id;

(Note: This is for Spatialite 2.4, so there is no MakeLine() method. Use MakeLine if you are using 3.0.)

For a first impression, I decided to map popular connections with more than one hundred entries. Wider lines mean more entries. The points show the station locations and they are color coded by starting letter. (I’m not yet sure if they mean anything. They seem to form groups.)

Some of the stations don’t seem to have any strong connections at all. Others are rather busy. The city center and the dark blue axis pointing west seem most popular.

I’m really looking forward to what everyone else will be finding in this dataset.

This is an update to my previous post “WFS to PostGIS in 3 Steps”. Thanks to Even Rouault’s comments and improvements to GDAL, it is now possible to load Latin1-encoded WFS (like the one by data.wien.gv.at) into PostGIS in just one simple step.

To use the following instructions, you’ll have to get the latest GDAL (release-1600-gdal-mapserver.zip)

You only need to run SDKShell.bat to set up the environment and ogr2ogr is ready for action:

C:\Users\Anita>cd C:\release-1600-gdal-mapserver
C:\release-1600-gdal-mapserver>SDKShell.bat
C:\release-1600-gdal-mapserver>ogr2ogr -overwrite -f PostgreSQL PG:"user=myuser password=mypassword dbname=wien_ogd" "WFS:http://data.wien.gv.at/daten/geoserver/ows?service=WFS&request=GetFeature&version=1.1.0&typeName=ogdwien:BEZIRKSGRENZEOGD&srsName=EPSG:4326"

Thanks everyone for your comments and help!

This is a quick note on how to download features from a WFS and import them into a PostGIS database. The first line downloads a zipped Shapefile from the WFS. The second one unzips it and the last one loads the data into my existing “gis_experimental” database:

wget "http://data.wien.gv.at/daten/geoserver/ows?service=WFS&request=GetFeature&version=1.1.0&typeName=ogdwien:BEZIRKSGRENZEOGD&srsName=EPSG:4326&outputFormat=shape-zip" -O BEZIRKSGRENZEOGD.zip
unzip -d /tmp BEZIRKSGRENZEOGD.zip
shp2pgsql -s 4326 -I -S -c -W Latin1 "/tmp/BEZIRKSGRENZEOGD.shp" | psql gis_experimental

Now, I’d just need a loop through the WFS Capabilities to automatically fetch all offered layers … Ideas anyone?

Thanks to Tim for his post “Batch importing shapefiles into PostGIS” which was very useful here.

Update: Many readers have pointed out that ogr2ogr is a great tool for this kind of use cases and can do the above in one line. That’s true – if it works. Unfortunately, it is picky about the supported encodings, e.g. doesn’t want to parse ISO-8859-15. In such cases, the three code lines above can be a good alternative.

Today’s post is a note-to-self on how to set up a really quick little Leaflet web map with Stamen’s “Toner” background and a WMS overlay served by QGIS Server.

Note the use of the map parameter when creating the QGIS Server WMS layer.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>

<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta http-equiv="content-language" content="en" />

<title>Leaflet, Stamen Toner and QGIS Server</title>

<link rel="stylesheet" href="http://leaflet.cloudmade.com/dist/leaflet.css" />
<link rel="stylesheet" href="my.css" type="text/css" />
 <!--[if lte IE 8]><link rel="stylesheet" href="http://leaflet.cloudmade.com/dist/leaflet.ie.css" /><![endif]-->

<script type="text/javascript" src="http://leaflet.cloudmade.com/dist/leaflet.js"></script>
<script type="text/javascript" src="http://maps.stamen.com/js/tile.stamen.js?v1.1.3"></script>
<script type="text/javascript">
function initialize() {
	var stamen = new L.StamenTileLayer("toner-lite");
	
	var myLayer = new L.TileLayer.WMS("http://10.101.21.28/cgi-bin/qgis_mapserv.fcgi", {
		map: "/usr/lib/cgi-bin/test/test.qgs",
		layers: 'mylayer',
		format: 'image/png',
		transparent: 'TRUE',
	});
	
	var map = new L.Map("map", {
		center: new L.LatLng(48.2,16.4),
		zoom: 13,
 		minZoom: 10,
 		maxZoom: 18,
		layers: [stamen,myLayer],
	});
}
</script>

</head>

<body onload="initialize()">
    <div id="map" class="map"></div>
</body>

</html>

With this my.css, the map will be displayed in “full screen”.

  div.map{
    display:block;
    position:absolute;
    top:0;
    left:0;
    width:100%;
    height:100%;
  }

Update

You can test an extended live example at http://anitagraser.github.com/Webmapping-Sandbox/leaflet.html

Data from various vehicles is collected for many purposes in cities worldwide. To get a feeling for just how much data is available, I created the following video using QGIS Time Manager which has been shown at the Austrian Museum of Applied Arts “MADE 4 YOU – Design for Change”. It shows one hour of taxi tracks in the city of Vienna:

If you like the video, please go to http://www.ertico.com/2012-its-video-competition-open-vote and vote for it in the category “Videos directed at the general public”.