Archive

Tag Archives: PostGIS

Alpha shapes are generalizations of the convex hull [1]. Convex hulls are well known and widely implemented in GIS systems. Alpha shapes are different in that they capture the shape of a point set. You can watch a great demo of how alpha shapes work on François Bélair’s website “Everything You Always Wanted to Know About Alpha Shapes But Were Afraid to Ask” I borrowed the following pictures from that site:

Alpha shapes for different values of alpha. The left one equals the convex hull of the point set.

Alpha shapes for different values of alpha. The left one equals the convex hull of the point set.

pgRouting comes with an implementation of alpha shapes. There is an alpha shape function: alphashape(sql text) and a convenience wrapper: points_as_polygon(query character varying). The weird thing is that you don’t get to set an alpha value. The only thing supplied to the function is a set of points. Let’s see what kind of results it produces!

Starting point for this experiment is a 10 km catchment zone around node #2699 in my osm road network. Travel costs to nodes are calculated using driving_distance() function. (You can find more information on using this function in Catchment Areas with pgRouting driving_distance().)

CREATE TABLE home_catchment10km AS
SELECT *
   FROM osm_nodes
   JOIN
   (SELECT * FROM driving_distance('
      SELECT gid AS id,
          source,
          target,
          meters AS cost
      FROM osm_roads',
      2699,
      10000,
      false,
      false)) AS route
   ON
   osm_nodes.id = route.vertex_id

After costs are calculated, we can create some alpha shapes. The following queries create the table and insert an alpha shape for all points with a cost of less than 1500:

CREATE TABLE home_isodist (id serial, max_cost double precision);
SELECT AddGeometryColumn('home_isodist','the_geom',4326,'POLYGON',2);

INSERT INTO home_isodist (max_cost, the_geom) (
SELECT 1500, ST_SetSRID(the_geom,4326)
FROM 
  points_as_polygon(
    'SELECT id, ST_X(the_geom) AS x, ST_Y(the_geom) AS y FROM home_catchment10km where cost < 1500'));

In previous posts, I’ve created catchment areas by first interpolating a cost raster and creating contours from there. Now, let’s see how the two different approaches compare!

The following picture shows resulting catchment areas for 500, 1000, 1500, and 2000 meters around a central node. Colored areas show the form of pgRouting alpha shape results. Black contours show the results of the interpolation method:

Comparison of pgRouting alpha shapes and interpolation method

At first glance, results look similar enough. Alpha shape results look like a generalized version of interpolation results. I guess that it would be possible to get even closer if the alpha value could be set to a smaller value. The function should then produce a finer, more detailed polygon.

For a general overview about which areas of a network are reachable within certain costs, pgRouting alpha shapes function seems a viable alternative to the interpolation method presented in previous posts. However, the alpha value used by pgRouting seems too big to produce detailed catchment areas.

[1] http://biogeometry.duke.edu/software/alphashapes/

Today, I’ve been experimenting with data from OpenFlights.org. They offer airport, airline and route data for download. The first idea that came to mind was to connect airports on a shared route by lines. This kind of visualization just looks much nicer if the connections are curved instead of simple straight lines.

Luckily, that’s pretty easy to do using PostGIS. After loading airport positions and route data, we can create the connection lines like this (based on [postgis-users] Great circle as a linestring):

UPDATE experimental.airroutes
SET the_geom = 
(SELECT ST_Transform(ST_Segmentize(ST_MakeLine(
       ST_Transform(a.the_geom, 953027),
       ST_Transform(b.the_geom, 953027)
     ), 100000 ), 4326 ) 
 FROM experimental.airports a, experimental.airports b
 WHERE a.id = airroutes.source_id  
   AND b.id = airroutes.dest_id
);

The CRS used in the query is not available in PostGIS by default. You can add it like this (source: spatialreference.org):

INSERT into spatial_ref_sys (srid, auth_name, auth_srid, proj4text, srtext) values ( 953027, 'esri', 53027, '+proj=eqdc +lat_0=0 +lon_0=0 +lat_1=60 +lat_2=60 +x_0=0 +y_0=0 +a=6371000 +b=6371000 +units=m +no_defs ', 'PROJCS["Sphere_Equidistant_Conic",GEOGCS["GCS_Sphere",DATUM["Not_specified_based_on_Authalic_Sphere",SPHEROID["Sphere",6371000,0]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]],PROJECTION["Equidistant_Conic"],PARAMETER["False_Easting",0],PARAMETER["False_Northing",0],PARAMETER["Central_Meridian",0],PARAMETER["Standard_Parallel_1",60],PARAMETER["Standard_Parallel_2",60],PARAMETER["Latitude_Of_Origin",0],UNIT["Meter",1],AUTHORITY["EPSG","53027"]]');

This is an example visualization (done in QGIS) showing only flight routes starting from Vienna International Airport:

Flight routes from Vienna International

Connections crossing the date line are currently more problematic. Lines would have to be split, otherwise this is what you’ll get:

Date line trouble

Do you need to create a table with a geometry column in PostGIS from scratch?
Can’t remember the syntax of AddGeometryColumn(varchar catalog_name, varchar schema_name, varchar table_name, varchar column_name, integer srid, varchar type, integer dimension)? I can’t. ;)

Let’s make our lives easier: QGIS PostGIS Manager offers a convenient GUI for creating tables with geometry columns:

PostGIS Manager Create Table dialog

The dialog works a lot like what you’re probably used to in pgAdmin – with the added nicety of supporting Geometry columns.

In my opinion, this is the fastest way so far to create a spatially enabled table. It provides a much better user experience than telling users (especially new ones) to use AddGeometryColumn(…

In a previous post, I’ve described how to create catchment areas with pgRouting shortest_path() function. The solution described there calculates costs from the starting node (aka vertex) to all other nodes in the network. Depending on the network size, this can take a long time. Especially, if you are only interested in relatively small catchment areas (e.g. 50 km around a node in a network covering 10,000 km) there is a lot of unnecessary calculation going on. This is where you might want to use driving_distance() instead.

Driving_distance() offers a parameter for maximum distance/cost and will stop calculations when the costs exceed this limit. But let’s start at the beginning: installing the necessary functions.

Installation

If you have followed my guide to installing pgRouting, you already have some routing functions installed – but not driving_distance(). Weirdly, the necessary SQL scripts are not shipped with the .zip file available on pgRouting’s download page. You need:

routing_dd.sql
routing_dd_wrappers.sql

Both are available through the project repository at Github. Get them and execute them in your pgRouting-enabled database. Now, you should be ready.

Calculating driving distances

To calculate driving distances, we need a query very similar to shortest_path():

CREATE OR REPLACE FUNCTION driving_distance(
sql text,
source_id integer,
distance float8,
directed boolean,
has_reverse_cost boolean)
RETURNS SETOF path_result

The only new value is “distance”. That’s the maximum distance/cost you want to be contained in the result set. “distance” has to be specified in the same units as the cost attribute (which is specified in the “sql” text parameter).

Note: In my opinion, the name “(driving) distance” is misleading. While you can use distance as a cost attribute, you’re not limited to distances. You can use any cost attribute you like, e.g. travel time, fuel consumption, co2 emissions, …

The actual query for a catchment area of 100 km around node # 2000 looks like this:

SELECT * FROM driving_distance('
      SELECT gid AS id,
          start_id::int4 AS source,
          end_id::int4 AS target,
          shape_leng::float8 AS cost
      FROM network',
      2000,
      100000,
      false,
      false)

Interpreting the result

These are the first lines of the result set:

vertex_id;edge_id;cost
294;7262;97400.433506144
297;7236;98012.620979231
335;1095;96849.456306244
347;7263;93617.693852324
364;7098;93573.849081386
366;2551;92702.443434779
378;7263;91994.328368081

The cost attribute contains the total cost of travel from the starting node to the vertex_id node.
We will only be using vertex_id and cost. The use of edge_id is a mystery to me.

Visualizing the result

The easiest way to visualize driving_distance() results is using RT Sql Layer plugin. We need to join the results of driving_distance() with the table containing node geometries:

SELECT *
   FROM node
   JOIN
   (SELECT * FROM driving_distance('
      SELECT gid AS id,
          start_id::int4 AS source,
          end_id::int4 AS target,
          shape_leng::float8 AS cost
      FROM network',
      2000,
      100000,
      false,
      false)) AS route
   ON
   node.id = route.vertex_id

If you color the nodes based on the cost attribute, it will look something like this:

result of pgRouting driving_distance() visualized in QGIS

This post covers the topic of creating MULTI* geometries or GEOMETRYCOLLECTIONs in PostGIS using ST_Collect or ST_Union.

Often it doesn’t matter if you use ST_Collect or ST_Union. Both will return MULTI* geometries or – if different geometry types or multi geometries are mixed – GEOMETRYCOLLECTIONS.

Why you might want to use ST_Collect instead of ST_Union

ST_Collect is much faster than ST_Union [1]. That’s already a good argument. It’s faster because it does not dissolve boundaries or check for overlapping regions. It simply combines geometries into MULTI*s or GEOMETRYCOLLECTIONs.

This leads to reason two for why you might prefer ST_Collect: Dissolving boundaries and overlapping regions – like ST_Union does – can lead to undesired effects. For example, combining three LINESTRING geometries that partially overlap, results in a MULTILINESTRING consisting of more than three lines. The input lines are split where they overlap each other or themselves.

Analysis and visualization based on the resulting geometries might lead to incorrect conclusions if the user is not aware of the processes performed by the union command.

[1] http://postgis.refractions.net/documentation/manual-svn/ST_Collect.html

PostGIS – our favorite spatial db – is nearing its 2.0 release! We are promised even better performance and stability and many new capabilities especially in the three new areas: raster data, topology, and 3D.

Raster

In 2.0, PostGIS Raster (formerly known as WKT Raster) will be fully integrated into the main application. Raster images are stored in special raster tables, which can be loaded from and exported to any GDAL-supported format. Additionally, there are functions for analyzing and operating on the pixel data. Rasters can be for example: vectorized, averaged, checked for intersections with vector geometries and edited inside the database.

For more information on what you can do with Rasters, check: PostGIS Doc, Chapter 8. Raster Reference.

Topology

PostGIS 2.0 represents the beginning of topology support in PostGIS. It will be possible to transform standard geometry into topological data, validate topology, and edit nodes and edges. Topology can be exported to GML.

Topology is covered in PostGIS Doc, Chapter 10. Topology.

3D

PostGIS 2.0 adds two 3D geometry types: polygonal surfaces and triangular irregular networks (TINs). Additional support operators for common tasks like finding areas (and volumes) of regions in 3D are included as well. Another improvement in this area is that existing spatial indices have been made 3D-aware, and a library of 3D-functions has been added. This allows calculation of distances and intersections in 3D, 3-dimensional bounding-boxes and many more things like 3D shortest-paths.

For a list of functions that support 3D, check PostGIS Doc, Chapter 12.

Read more on lwn.net.

Site analyses can benefit greatly from using “drive-time” isochrones to define the study area. Drive time isochrones are often significantly different from simple buffer areas which disregard natural barriers such as rivers or slow roads.

Of course, creating drive time isochrones requires more input data and more compute-intensive algorithms than a simple buffer analysis. It is necessary to create a routable network graph with adequate weights to be used by the routing algorithm.

One of the most popular routing  applications in the open source world is pgRouting for PostGIS enabled databases. I’ve already shown how to create drive time isochrones for one network node based on pgRouting and QGIS.  Today, I’ll show how to create drive time isochrones for a set of points – in this case all airports in Finland.

The first step is to find the closest network node to every airport:

ALTER TABLE airport
   ADD COLUMN nearest_node integer;

CREATE TABLE temp AS
   SELECT a.gid, b.id, min(a.dist)
   FROM
     (SELECT airport.gid, 
             min(distance(airport.the_geom, node.the_geom)) AS dist
      FROM airport, node
      GROUP BY airport.gid) AS a,
     (SELECT airport.gid, node.id, 
             distance(airport.the_geom, node.the_geom) AS dist
      FROM airport, node) AS b
   WHERE a.dist = b. dist
         AND a.gid = b.gid
   GROUP BY a.gid, b.id;

UPDATE airport
SET nearest_node = 
   (SELECT id 
    FROM temp
    WHERE temp.gid = airport.gid);

Then, we can calculate drive times between network nodes and “airport nodes”. I am still looking for the most efficient way to perform this calculation. The trivial solution I used for this example was to calculate all drive time values separately for each airport node (as described in “Creating Catchment Areas with pgRouting and QGIS”).
I create the table with the first node:

CREATE TABLE catchment_airport AS
SELECT 
    id,
    the_geom,
    (SELECT sum(cost) FROM (
	   SELECT * FROM shortest_path('
	   SELECT gid AS id,
		  start_id::int4 AS source,
		  end_id::int4 AS target,
		  traveltime::float8 AS cost
	   FROM network',
	   5657,
	   id,
	   false,
	   false)) AS foo ) AS cost
FROM node;

Every following node is done with an INSERT:

INSERT INTO catchment_airport (
SELECT 
    id,
    the_geom,
    (SELECT sum(cost) FROM (
	   SELECT * FROM shortest_path('
	   SELECT gid AS id,
		  start_id::int4 AS source,
		  end_id::int4 AS target,
		  traveltime::float8 AS cost
	   FROM network',
	   123, -- put a new index here!
	   id,
	   false,
	   false)) AS foo ) AS cost
FROM node);

Afterwards, I combined the values to find the minimum drive time for each network node:

CREATE table catchment_airport_final AS
SELECT id, the_geom, min (cost) AS cost
FROM catchment_airport
GROUP By id, the_geom;

The resulting point layer was imported into QGIS. Using TIN interpolation (from Interpolation plugin), you can calculate a continuous cost surface. And Contour function (from GDALTools) finally yields drive time isochrones.

Drive time isochrones around airports in northern Finland - spatial data © National Land Survey of Finland 2011

Based on this analysis, it is possible to determine how many inhabitants live within one hour driving distance from an airport or how many people have to drive longer than e.g. ninety minutes to reach any airport.

Based on the network created in my last post, I’ll now describe how to calculate the catchment area of a network node.

We need both network and node table. The cost attribute in my network table is called traveltime. (I used different speed values based on road category to calculate traveltime for road segments.) The result will be a new table containing all nodes and an additional cost attribute. And this is the query that calculates the catchment area around node #5657:

create table catchment_5657 as
select 
    id,
    the_geom,
    (select sum(cost) from (
	   SELECT * FROM shortest_path('
	   SELECT gid AS id,
		  start_id::int4 AS source,
		  end_id::int4 AS target,
		  traveltime::float8 AS cost
	   FROM network',
	   5657,
	   id,
	   false,
	   false)) as foo ) as cost
from node

Then, I loaded the point table into QGIS and calculated a TIN based on the cost attribute. With “Contour” from GdalTools, you can visualize equal-cost areas even better:

Catchment area around node #5657 with contour lines

Between contour lines, there is a difference of 10 minutes travel time.

If you are looking for a faster way to calculate small catchment areas (relative to the network size), check “Catchment Areas with pgRouting driving_distance().

Please read the new instructions for pgRouting 2.0.

The aim of this post is to describe the steps necessary to calculate routes with pgRouting. In the end, we’ll visualize the results in QGIS.

This guide assumes that you have the following installed and running:

  • Postgres with PostGIS and pgAdmin
  • QGIS with PostGIS Manager and RT Sql Layer plugins

Installing pgRouting

pgRouting can be downloaded from www.pgrouting.org.

Building from source is covered by pgRouting documentation. If you’re using Windows, download the binaries and copy the .dlls into PostGIS’ lib folder, e.g. C:\Program Files (x86)\PostgreSQL\8.4\lib.

Start pgAdmin and create a new database based on your PostGIS template. (I called mine ‘routing_template’.) Open a Query dialog, load and execute the three .sql files located in your pgRouting download (routing_core.sql, routing_core_wrappers.sql, routing_topology.sql). Congratulations, you now have a pgRouting-enabled database.

Creating a routable road network

The following description is based on the free road network published by National Land Survey of Finland (NLS) (Update January 2013: Sorry, this dataset has been removed). All you get is one Shapefile containing line geometries, a road type attribute and further attributes unrelated to routing.

First step is to load roads.shp into PostGIS. This is easy using PostGIS Manager – Data – Load Data from Shapefile.

pgRouting requires each road entry to have a start and an end node id. If your dataset already contains this information, you can skip this step. Otherwise we will create the node ids now. (Update: pgRouting also offers a special function called assign_vertex_id that will create start and end node ids for your network table. It will not create a node table though.)

Next, we create start and end point geometries. I used a view:

CREATE OR REPLACE VIEW road_ext AS
   SELECT *, startpoint(the_geom), endpoint(the_geom)
   FROM road;

Now, we create a table containing all the unique network nodes (start and end points) and we’ll also give them an id:

CREATE TABLE node AS 
   SELECT row_number() OVER (ORDER BY foo.p)::integer AS id, 
          foo.p AS the_geom
   FROM (         
      SELECT DISTINCT road_ext.startpoint AS p FROM road_ext
      UNION 
      SELECT DISTINCT road_ext.endpoint AS p FROM road_ext
   ) foo
   GROUP BY foo.p;

Finally, we can combine our road_ext view and node table to create the routable network table:

CREATE TABLE network AS
   SELECT a.*, b.id as start_id, c.id as end_id
   FROM road_ext AS a
      JOIN node AS b ON a.startpoint = b.the_geom
      JOIN node AS c ON a.endpoint = c.the_geom;

(This can take a while.)

I recommend adding a spatial index to the resulting table.

Calculating shortest routes

Let’s try pgRouting’s Shortest Path Dijkstra method. The following query returns the route from node #1 to node #5110:

SELECT * FROM shortest_path('
   SELECT gid AS id, 
          start_id::int4 AS source, 
          end_id::int4 AS target, 
          shape_leng::float8 AS cost
   FROM network',
1,
5110,
false,
false);

Final step: Visualization

With RT Sql Layer plugin, we can visualize the results of a query. The results will be loaded as a new layer. The query has to contain both geometry and a unique id. Therefore, we’ll join the results of the previous query with the network table containing the necessary geometries.

SELECT * 
   FROM network
   JOIN
   (SELECT * FROM shortest_path('
      SELECT gid AS id, 
          start_id::int4 AS source, 
          end_id::int4 AS target, 
          shape_leng::float8 AS cost
      FROM network',
      1,
      5110,
      false,
      false)) AS route
   ON
   network.gid = route.edge_id;

In my case, this is how the result looks like:

Route from node #1 to node #5110

For further pgRouting-related posts check my list of pgRouting posts.

pgRouting has become even more powerful: A DARP (Dial-a-Ride-Problem) solver is now available in the “darp branch” of the pgRouting repository.

The Dial-a-Ride Problem (DARP) solver tries to minimize transportation cost while satisfying customer service level constraints (time windows violation, waiting and travelling times) and fleet constraints (number of cars and capacity, as well as depot location).

Documentation can be found at www.pgrouting.org/docs.

%d bloggers like this: