Archive

PostGIS

Since I’ve started working, transport and movement data have been at the core of many of my projects. The spatial nature of movement data makes it interesting for GIScience but typical GIS tools are not a particularly good match.

Dealing with the temporal dynamics of geographic processes is one of the grand challenges for Geographic Information Science. Geographic Information Systems (GIS) and related spatial analysis methods are quite adept at handling spatial dimensions of patterns and processes, but the temporal and coupled space-time attributes of phenomena are difficult to represent and examine with contemporary GIS. (Dr. Paul M. Torrens, Center for Urban Science + Progress, New York University)

It’s still a hot topic right now, as the variety of related publications and events illustrates. For example, just this month, there is an Animove two-week professional training course (18–30 September 2016, Max-Planck Institute for Ornithology, Lake Konstanz) as well as the GIScience 2016 Workshop on Analysis of Movement Data (27 September 2016, Montreal, Canada).

Space-time cubes and animations are classics when it comes to visualizing movement data in GIS. They can be used for some visual analysis but have their limitations, particularly when it comes to working with and trying to understand lots of data. Visualization and analysis of spatio-temporal data in GIS is further complicated by the fact that the temporal information is not standardized in most GIS data formats. (Some notable exceptions of formats that do support time by design are GPX and NetCDF but those aren’t really first-class citizens in current desktop GIS.)

Most commonly, movement data is modeled as points (x,y, and optionally z) with a timestamp, object or tracker id, and potential additional info, such as speed, status, heading, and so on. With this data model, even simple questions like “Find all tracks that start in area A and end in area B” can become a real pain in “vanilla” desktop GIS. Even if the points come with a sequence number, which makes it easy to identify the start point, getting the end point is tricky without some custom code or queries. That’s why I have been storing the points in databases in order to at least have the powers of SQL to deal with the data. Even so, most queries were still painfully complex and performance unsatisfactory.

So I reached out to the Twitterverse asking for pointers towards moving objects database extensions for PostGIS and @bitnerd, @pwramsey, @hruske, and others replied. Amongst other useful tips, they pointed me towards the new temporal support, which ships with PostGIS 2.2. It includes the following neat functions:

  • ST_IsValidTrajectory — Returns true if the geometry is a valid trajectory.
  • ST_ClosestPointOfApproach — Returns the measure at which points interpolated along two lines are closest.
  • ST_DistanceCPA — Returns the distance between closest points of approach in two trajectories.
  • ST_CPAWithin — Returns true if the trajectories’ closest points of approach are within the specified distance.

Instead of  points, these functions expect trajectories that are stored as LinestringM (or LinestringZM) where M is the time dimension. This approach makes many analyses considerably easier to handle. For example, clustering trajectory start and end locations and identifying the most common connections:

animation_clusters

(data credits: GeoLife project)

Overall, it’s an interesting and promising approach but there are still some open questions I’ll have to look into, such as: Is there an efficient way to store additional info for each location along the trajectory (e.g. instantaneous speed or other status)? How well do desktop GIS play with LinestringM data and what’s the overhead of dealing with it?

Advertisements

This post covers a simple approach to calculating isochrones in a public transport network using pgRouting and QGIS.

For this example, I’m using the public transport network of Vienna which is loaded into a pgRouting-enable database as network.publictransport. To create the routable network run:

select pgr_createTopology('network.publictransport', 0.0005, 'geom', 'id');

Note that the tolerance parameter 0.0005 (units are degrees) controls how far link start and end points can be apart and still be considered as the same topological network node.

To create a view with the network nodes run:

create or replace view network.publictransport_nodes as
select id, st_centroid(st_collect(pt)) as geom
from (
	(select source as id, st_startpoint(geom) as pt
	from network.publictransport
	) 
union
	(select target as id, st_endpoint(geom) as pt
	from network.publictransport
	) 
) as foo
group by id;

To calculate isochrones, we need a cost attribute for our network links. To calculate travel times for each link, I used speed averages: 15 km/h for buses and trams and 32km/h for metro lines (similar to data published by the city of Vienna).

alter table network.publictransport add column length_m integer;
update network.publictransport set length_m = st_length(st_transform(geom,31287));

alter table network.publictransport add column traveltime_min double precision;
update network.publictransport set traveltime_min = length_m  / 15000.0 * 60; -- average is 15 km/h
update network.publictransport set traveltime_min = length_m  / 32000.0 * 60 where "LTYP" = '4'; -- average metro is 32 km/h

That’s all the preparations we need. Next, we can already calculate our isochrone data using pgr_drivingdistance, e.g. for network node #1:

create or replace view network.temp as
 SELECT seq, id1 AS node, id2 AS edge, cost, geom
  FROM pgr_drivingdistance(
    'SELECT id, source, target, traveltime_min as cost FROM network.publictransport',
    1, 100000, false, false
  ) as di
  JOIN network.publictransport_nodes pt
  ON di.id1 = pt.id;

The resulting view contains all network nodes which are reachable within 100,000 cost units (which are minutes in our case).

Let’s load the view into QGIS to visualize the isochrones:

isochrone_publictransport_1

The trick is to use data-defined size to calculate the different walking circles around the public transport stops. For example, we can set up 10 minute isochrones which take into account how much time was used to travel by pubic transport and show how far we can get by walking in the time that is left:

1. We want to scale the circle radius to reflect the remaining time left to walk. Therefore, enable Scale diameter in Advanced | Size scale field:

scale_diameter

2. In the Simple marker properties change size units to Map units.
3. Go to data defined properties to set up the dynamic circle size.

datadefined_size

The expression makes sure that only nodes reachable within 10 minutes are displayed. Then it calculates the remaining time (10-"cost") and assumes that we can walk 100 meters per minute which is left. It additionally multiplies by 2 since we are scaling the diameter instead of the radius.

To calculate isochrones for different start nodes, we simply update the definition of the view network.temp.

While this approach certainly has it’s limitations, it’s a good place to start learning how to create isochrones. A better solution should take into account that it takes time to change between different lines. While preparing the network, more care should to be taken to ensure that possible exchange nodes are modeled correctly. Some network links might only be usable in one direction. Not to mention that there are time tables which could be accounted for ;)

This post is a quick instruction for installing Postgres 9.2, PostGIS 2.0 and pgRouting 2.0.

  1. For Postgres, download the installer from enterprisedb.com.
  2. Run the installer. You’ll have to pick a superuser password – remember it, you’ll need it again soon.
  3. At the end of the installation process, allow the installer to start Stack Builder.
  4. In Stack Builder, select the Postgres 9.2 installation and install PostGIS from the list of available extensions.
  5. The PostGIS installation procedure will prompt you for the superuser password you picked before.
  6. I suggest letting the installer create a sample database We’ll need it later anyway.

Now for the pgRouting part:

  1. Download the pgRouting zip file for your system (32 or 64 bit) from Winnie.
  2. Unzip the file. It contains bin, lib and share folders as well as two text files.
  3. Copy these folders and files over to your Postgres installation. In my case C:\Program Files\PostgreSQL\9.2

Installation – done.

Next, fire up pgAdmin. If you allowed the PostGIS installer to create a sample database, you will find it named postgis20 or similar. Otherwise just create a new database using the PostGIS template database. You can enable pgRouting in a database using the following steps:

  1. In postgis20, execute the following query to create the pgrouting extension. This will add the pgRouting-specific functions:
    CREATE EXTENSION pgrouting;
  2. Test if everything worked fine:
    SELECT pgr_version();

    It should return "(2.0.0-dev,v2.0.0-beta,18,a3be38b,develop,1.46.1)" or similar – depending on the version you downloaded.

pgadmin_pgrouting

How about some test data? I’ll be using the public transport network of the city of Vienna provided in GeoJSON format from http://data.wien.gv.at/daten/wfs?service=WFS&request=GetFeature&version=1.1.0&typeName=ogdwien:OEFFLINIENOGD&srsName=EPSG:4326&outputFormat=json:

  1. Just copy paste the url in Add Vector Layer | Protocol to load the dataset.
  2. Use DB Manager to load the layer into your database. (As you can see in the screenshot, I created a schema called network but that’s optional.)
  3. import_publictransport

  4. To make the line vector table routable, we use pgr_createTopology. This function assumes the columsn called “source” and “target” exist so we have to create those as well:
    alter table network.publictransport add column source integer;
    alter table network.publictransport add column target integer;
    select pgr_createTopology('network.publictransport', 0.0001, 'geom', 'id');
    

    I’m quite generously using a tolerance of 0.0001 degrees to build the topology. Depending on your dataset, you might want to be more strict here.

  5. Let’s test it! Route from source #1 to target #3000 using pgr_dijkstra:
    SELECT seq, id1 AS node, id2 AS edge, cost, geom
      FROM pgr_dijkstra(
        'SELECT id, source, target, st_length(geom) as cost FROM network.publictransport',
        1, 3000, false, false
      ) as di
      JOIN network.publictransport pt
      ON di.id2 = pt.id ;

    Note how the query joins the routing results and the network table together. (I’m aware that using the link length as a cost attribute will not lead to realistic results in a public transport network but bear with me for this example.)

  6. We can immediately see the routing results using the Load as new layer option:

select_dijkstra
route

Nice work! pgRouting 2.0 has come a long way. In a post from April this year, Boston GIS even announced to add pgRouting into the Stack Builder. That’s going to make the installation even more smooth.

This is an update to my previous post “WFS to PostGIS in 3 Steps”. Thanks to Even Rouault’s comments and improvements to GDAL, it is now possible to load Latin1-encoded WFS (like the one by data.wien.gv.at) into PostGIS in just one simple step.

To use the following instructions, you’ll have to get the latest GDAL (release-1600-gdal-mapserver.zip)

You only need to run SDKShell.bat to set up the environment and ogr2ogr is ready for action:

C:\Users\Anita>cd C:\release-1600-gdal-mapserver
C:\release-1600-gdal-mapserver>SDKShell.bat
C:\release-1600-gdal-mapserver>ogr2ogr -overwrite -f PostgreSQL PG:"user=myuser password=mypassword dbname=wien_ogd" "WFS:http://data.wien.gv.at/daten/geoserver/ows?service=WFS&request=GetFeature&version=1.1.0&typeName=ogdwien:BEZIRKSGRENZEOGD&srsName=EPSG:4326"

Thanks everyone for your comments and help!

This is a quick note on how to download features from a WFS and import them into a PostGIS database. The first line downloads a zipped Shapefile from the WFS. The second one unzips it and the last one loads the data into my existing “gis_experimental” database:

wget "http://data.wien.gv.at/daten/geoserver/ows?service=WFS&request=GetFeature&version=1.1.0&typeName=ogdwien:BEZIRKSGRENZEOGD&srsName=EPSG:4326&outputFormat=shape-zip" -O BEZIRKSGRENZEOGD.zip
unzip -d /tmp BEZIRKSGRENZEOGD.zip
shp2pgsql -s 4326 -I -S -c -W Latin1 "/tmp/BEZIRKSGRENZEOGD.shp" | psql gis_experimental

Now, I’d just need a loop through the WFS Capabilities to automatically fetch all offered layers … Ideas anyone?

Thanks to Tim for his post “Batch importing shapefiles into PostGIS” which was very useful here.

Update: Many readers have pointed out that ogr2ogr is a great tool for this kind of use cases and can do the above in one line. That’s true – if it works. Unfortunately, it is picky about the supported encodings, e.g. doesn’t want to parse ISO-8859-15. In such cases, the three code lines above can be a good alternative.

This is the follow up post to “An osm2po Quickstart” which covers loading the OSM network into PostGIS and using the result with pgRouting. After parsing the OSM file, e.g.

C:\Users\Anita\temp\osm2po-4.2.30>java -jar osm2po-core-4.2.30-signed.jar prefix=at "C:\Users\Anita\Geodaten\OpenStreetMap Data\austria.osm.pbf"

you should find a folder with the name of the prefix you chose inside the osm2po folder. It contains a log file which in turn provides a command line template for importing the OSM network into PostGIS, e.g.

INFO commandline template: psql -U [username] -d [dbname] -q -f "C:\Users\Anita\temp\osm2po-4.2.30\at\at_2po_4pgr.sql"

Using this template, we can easily import the .sql file into an exiting database. My pgRouting-enabled database is called wien_ogd.

C:\Users\Anita\temp\osm2po-4.2.30\at>psql -U [username] -d wien_ogd -q -f C:\Users\Anita\temp\osm2po-4.2.30\at\at_2po_4pgr.sql

Now, the data is ready for usage in QGIS:

The osm2po table in QGIS

Using “pgRouting Layer” plugin, it’s now straightforward to calculate shortest paths. I had to apply some changes to the plugin code, so please get the latest version from Github.

A shortest path in osm2po network

Using osm2po turned out to be far less painful than I expected and I hope you’ll find this post useful too.

The function with the glorious name “find_node_by_nearest_link_within_distance” is part of pgRouting and can be found in matching.sql.

“This function finds nearest node as a source or target of the nearest link”
That means that we can use this function e.g. to find the best road network node for a given address.

The function returns an object of type link_point:

CREATE TYPE link_point AS (id integer, name varchar);

To access only the id value of the nearest node, you can use:

SELECT id(foo.x) 
FROM (
   SELECT find_node_by_nearest_link_within_distance(
	'POINT(14.111 47.911)',
	0.5,
	'nw_table')::link_point as x
) AS foo
%d bloggers like this: