This post covers the topic of creating MULTI* geometries or GEOMETRYCOLLECTIONs in PostGIS using ST_Collect or ST_Union.

Often it doesn’t matter if you use ST_Collect or ST_Union. Both will return MULTI* geometries or – if different geometry types or multi geometries are mixed – GEOMETRYCOLLECTIONS.

Why you might want to use ST_Collect instead of ST_Union

ST_Collect is much faster than ST_Union [1]. That’s already a good argument. It’s faster because it does not dissolve boundaries or check for overlapping regions. It simply combines geometries into MULTI*s or GEOMETRYCOLLECTIONs.

This leads to reason two for why you might prefer ST_Collect: Dissolving boundaries and overlapping regions – like ST_Union does – can lead to undesired effects. For example, combining three LINESTRING geometries that partially overlap, results in a MULTILINESTRING consisting of more than three lines. The input lines are split where they overlap each other or themselves.

Analysis and visualization based on the resulting geometries might lead to incorrect conclusions if the user is not aware of the processes performed by the union command.


PostGIS – our favorite spatial db – is nearing its 2.0 release! We are promised even better performance and stability and many new capabilities especially in the three new areas: raster data, topology, and 3D.


In 2.0, PostGIS Raster (formerly known as WKT Raster) will be fully integrated into the main application. Raster images are stored in special raster tables, which can be loaded from and exported to any GDAL-supported format. Additionally, there are functions for analyzing and operating on the pixel data. Rasters can be for example: vectorized, averaged, checked for intersections with vector geometries and edited inside the database.

For more information on what you can do with Rasters, check: PostGIS Doc, Chapter 8. Raster Reference.


PostGIS 2.0 represents the beginning of topology support in PostGIS. It will be possible to transform standard geometry into topological data, validate topology, and edit nodes and edges. Topology can be exported to GML.

Topology is covered in PostGIS Doc, Chapter 10. Topology.


PostGIS 2.0 adds two 3D geometry types: polygonal surfaces and triangular irregular networks (TINs). Additional support operators for common tasks like finding areas (and volumes) of regions in 3D are included as well. Another improvement in this area is that existing spatial indices have been made 3D-aware, and a library of 3D-functions has been added. This allows calculation of distances and intersections in 3D, 3-dimensional bounding-boxes and many more things like 3D shortest-paths.

For a list of functions that support 3D, check PostGIS Doc, Chapter 12.


Do you need a random sample of features in a Postgres table? Here is an example of how to select 1,000 random features from a table:

WHERE attribute = 'myValue'
ORDER BY random()
LIMIT 1000;

Site analyses can benefit greatly from using “drive-time” isochrones to define the study area. Drive time isochrones are often significantly different from simple buffer areas which disregard natural barriers such as rivers or slow roads.

Of course, creating drive time isochrones requires more input data and more compute-intensive algorithms than a simple buffer analysis. It is necessary to create a routable network graph with adequate weights to be used by the routing algorithm.

One of the most popular routing  applications in the open source world is pgRouting for PostGIS enabled databases. I’ve already shown how to create drive time isochrones for one network node based on pgRouting and QGIS.  Today, I’ll show how to create drive time isochrones for a set of points – in this case all airports in Finland.

The first step is to find the closest network node to every airport:

   ADD COLUMN nearest_node integer;

   SELECT a.gid,, min(a.dist)
     (SELECT airport.gid, 
             min(distance(airport.the_geom, node.the_geom)) AS dist
      FROM airport, node
      GROUP BY airport.gid) AS a,
     (SELECT airport.gid,, 
             distance(airport.the_geom, node.the_geom) AS dist
      FROM airport, node) AS b
   WHERE a.dist = b. dist
         AND a.gid = b.gid
   GROUP BY a.gid,;

UPDATE airport
SET nearest_node = 
   (SELECT id 
    FROM temp
    WHERE temp.gid = airport.gid);

Then, we can calculate drive times between network nodes and “airport nodes”. I am still looking for the most efficient way to perform this calculation. The trivial solution I used for this example was to calculate all drive time values separately for each airport node (as described in “Creating Catchment Areas with pgRouting and QGIS”).
I create the table with the first node:

CREATE TABLE catchment_airport AS
    (SELECT sum(cost) FROM (
	   SELECT * FROM shortest_path('
	   SELECT gid AS id,
		  start_id::int4 AS source,
		  end_id::int4 AS target,
		  traveltime::float8 AS cost
	   FROM network',
	   false)) AS foo ) AS cost
FROM node;

Every following node is done with an INSERT:

INSERT INTO catchment_airport (
    (SELECT sum(cost) FROM (
	   SELECT * FROM shortest_path('
	   SELECT gid AS id,
		  start_id::int4 AS source,
		  end_id::int4 AS target,
		  traveltime::float8 AS cost
	   FROM network',
	   123, -- put a new index here!
	   false)) AS foo ) AS cost
FROM node);

Afterwards, I combined the values to find the minimum drive time for each network node:

CREATE table catchment_airport_final AS
SELECT id, the_geom, min (cost) AS cost
FROM catchment_airport
GROUP By id, the_geom;

The resulting point layer was imported into QGIS. Using TIN interpolation (from Interpolation plugin), you can calculate a continuous cost surface. And Contour function (from GDALTools) finally yields drive time isochrones.

Drive time isochrones around airports in northern Finland - spatial data © National Land Survey of Finland 2011

Based on this analysis, it is possible to determine how many inhabitants live within one hour driving distance from an airport or how many people have to drive longer than e.g. ninety minutes to reach any airport.

Based on the network created in my last post, I’ll now describe how to calculate the catchment area of a network node.

We need both network and node table. The cost attribute in my network table is called traveltime. (I used different speed values based on road category to calculate traveltime for road segments.) The result will be a new table containing all nodes and an additional cost attribute. And this is the query that calculates the catchment area around node #5657:

create table catchment_5657 as
    (select sum(cost) from (
	   SELECT * FROM shortest_path('
	   SELECT gid AS id,
		  start_id::int4 AS source,
		  end_id::int4 AS target,
		  traveltime::float8 AS cost
	   FROM network',
	   false)) as foo ) as cost
from node

Then, I loaded the point table into QGIS and calculated a TIN based on the cost attribute. With “Contour” from GdalTools, you can visualize equal-cost areas even better:

Catchment area around node #5657 with contour lines

Between contour lines, there is a difference of 10 minutes travel time.

If you are looking for a faster way to calculate small catchment areas (relative to the network size), check “Catchment Areas with pgRouting driving_distance().

Please read the new instructions for pgRouting 2.0.

The aim of this post is to describe the steps necessary to calculate routes with pgRouting. In the end, we’ll visualize the results in QGIS.

This guide assumes that you have the following installed and running:

  • Postgres with PostGIS and pgAdmin
  • QGIS with PostGIS Manager and RT Sql Layer plugins

Installing pgRouting

pgRouting can be downloaded from

Building from source is covered by pgRouting documentation. If you’re using Windows, download the binaries and copy the .dlls into PostGIS’ lib folder, e.g. C:\Program Files (x86)\PostgreSQL\8.4\lib.

Start pgAdmin and create a new database based on your PostGIS template. (I called mine ‘routing_template’.) Open a Query dialog, load and execute the three .sql files located in your pgRouting download (routing_core.sql, routing_core_wrappers.sql, routing_topology.sql). Congratulations, you now have a pgRouting-enabled database.

Creating a routable road network

The following description is based on the free road network published by National Land Survey of Finland (NLS) (Update January 2013: Sorry, this dataset has been removed). All you get is one Shapefile containing line geometries, a road type attribute and further attributes unrelated to routing.

First step is to load roads.shp into PostGIS. This is easy using PostGIS Manager – Data – Load Data from Shapefile.

pgRouting requires each road entry to have a start and an end node id. If your dataset already contains this information, you can skip this step. Otherwise we will create the node ids now. (Update: pgRouting also offers a special function called assign_vertex_id that will create start and end node ids for your network table. It will not create a node table though.)

Next, we create start and end point geometries. I used a view:

   SELECT *, startpoint(the_geom), endpoint(the_geom)
   FROM road;

Now, we create a table containing all the unique network nodes (start and end points) and we’ll also give them an id:

   SELECT row_number() OVER (ORDER BY foo.p)::integer AS id, 
          foo.p AS the_geom
   FROM (         
      SELECT DISTINCT road_ext.startpoint AS p FROM road_ext
      SELECT DISTINCT road_ext.endpoint AS p FROM road_ext
   ) foo
   GROUP BY foo.p;

Finally, we can combine our road_ext view and node table to create the routable network table:

   SELECT a.*, as start_id, as end_id
   FROM road_ext AS a
      JOIN node AS b ON a.startpoint = b.the_geom
      JOIN node AS c ON a.endpoint = c.the_geom;

(This can take a while.)

I recommend adding a spatial index to the resulting table.

Calculating shortest routes

Let’s try pgRouting’s Shortest Path Dijkstra method. The following query returns the route from node #1 to node #5110:

SELECT * FROM shortest_path('
   SELECT gid AS id, 
          start_id::int4 AS source, 
          end_id::int4 AS target, 
          shape_leng::float8 AS cost
   FROM network',

Final step: Visualization

With RT Sql Layer plugin, we can visualize the results of a query. The results will be loaded as a new layer. The query has to contain both geometry and a unique id. Therefore, we’ll join the results of the previous query with the network table containing the necessary geometries.

   FROM network
   (SELECT * FROM shortest_path('
      SELECT gid AS id, 
          start_id::int4 AS source, 
          end_id::int4 AS target, 
          shape_leng::float8 AS cost
      FROM network',
      false)) AS route
   network.gid = route.edge_id;

In my case, this is how the result looks like:

Route from node #1 to node #5110

For further pgRouting-related posts check my list of pgRouting posts.

pgRouting has become even more powerful: A DARP (Dial-a-Ride-Problem) solver is now available in the “darp branch” of the pgRouting repository.

The Dial-a-Ride Problem (DARP) solver tries to minimize transportation cost while satisfying customer service level constraints (time windows violation, waiting and travelling times) and fleet constraints (number of cars and capacity, as well as depot location).

Documentation can be found at

Besides many other interesting topics, Opengeo’s PostGIS tutorial discusses “Tuning PostgreSQL for Spatial”.

The following values are recommended for production environments:

  • shared_buffers: 75 % of database memory (500 MB)
  • work_mem: 16 MB
  • maintenance_work_mem: 128 MB
  • wal_buffers: 1 MB
  • checkpoint_segments: 6
  • random_page_cost: 2.0
  • seq_page_cost: 1.0

All of these configuration parameters can edited in the database configuration file, C:\Documents and Settings\%USER\.opengeo\pgdata\%USER. This is a regular text file and can be edited using Notepad or any other text editor. An easier way of editing this configuration is by using the built-in “Backend Configuration Editor”. In pgAdmin, go to File > Open postgresql.conf…. It will ask for the location of the file, and navigate to C:\Documents and Settings\%USER\.opengeo\pgdata\%USER.

The changes will not take effect until the server is restarted.

Sometimes, we just want to visualize the contents of a PostGIS table containing some x/y data but no actual geometries in QGIS. But there the problems arise: We don’t have the right to add a geometry column, the table doesn’t have a suitable ID or OIDs (QGIS demands a unique integer ID) and we can’t or don’t want to mess with the database anyway. Loading the table with “Add PostGIS Layer” will result in a non-spatial layer (or fail if you use an older QGIS versions).

RT Sql Layer Plugin to the rescue!

I presented this plugin in a previous post. It allows you to execute any SQL SELECT statement, even really complex ones. Luckily, this time we don’t need anything fancy, only the two functions row_number() and makepoint():

  row_number() over (order by col1)::int AS my_id,
  x, y, 
  makepoint(x,y) as the_geom
from my_table

Have you ever wondered how to comfortable visualize PostGIS queries? Meet “RT Sql Layer” a powerful and comfortable QGIS plugin that allows building and visualizing queries on your PostGIS data.

RT Sql Layer comes with a graphic query builder:

RT Sql Layer Query Builder dialog

It allows saving/loading of queries to speed up your work flow.

The query results will be loaded as a new layer:

Loaded query layer

RT Sql Layer is available through Faunalia Plugin Repository.

For another great example on what can be achieved with this plugin, read Carson Farmer’s post on “pgRouting, OpenStreetMap, and QGIS” where he describes how to build your own routing database and visualize routing results in QGIS with RT Sql Layer.

More on RT Sql Layer: How to create Point Layers from x/y Data on the fly with PostGIS and QGIS

%d bloggers like this: