Archive

Tag Archives: PostGIS

This is an update to my previous post “WFS to PostGIS in 3 Steps”. Thanks to Even Rouault’s comments and improvements to GDAL, it is now possible to load Latin1-encoded WFS (like the one by data.wien.gv.at) into PostGIS in just one simple step.

To use the following instructions, you’ll have to get the latest GDAL (release-1600-gdal-mapserver.zip)

You only need to run SDKShell.bat to set up the environment and ogr2ogr is ready for action:

C:\Users\Anita>cd C:\release-1600-gdal-mapserver
C:\release-1600-gdal-mapserver>SDKShell.bat
C:\release-1600-gdal-mapserver>ogr2ogr -overwrite -f PostgreSQL PG:"user=myuser password=mypassword dbname=wien_ogd" "WFS:http://data.wien.gv.at/daten/geoserver/ows?service=WFS&request=GetFeature&version=1.1.0&typeName=ogdwien:BEZIRKSGRENZEOGD&srsName=EPSG:4326"

Thanks everyone for your comments and help!

About these ads

This is a quick note on how to download features from a WFS and import them into a PostGIS database. The first line downloads a zipped Shapefile from the WFS. The second one unzips it and the last one loads the data into my existing “gis_experimental” database:

wget "http://data.wien.gv.at/daten/geoserver/ows?service=WFS&request=GetFeature&version=1.1.0&typeName=ogdwien:BEZIRKSGRENZEOGD&srsName=EPSG:4326&outputFormat=shape-zip" -O BEZIRKSGRENZEOGD.zip
unzip -d /tmp BEZIRKSGRENZEOGD.zip
shp2pgsql -s 4326 -I -S -c -W Latin1 "/tmp/BEZIRKSGRENZEOGD.shp" | psql gis_experimental

Now, I’d just need a loop through the WFS Capabilities to automatically fetch all offered layers … Ideas anyone?

Thanks to Tim for his post “Batch importing shapefiles into PostGIS” which was very useful here.

Update: Many readers have pointed out that ogr2ogr is a great tool for this kind of use cases and can do the above in one line. That’s true – if it works. Unfortunately, it is picky about the supported encodings, e.g. doesn’t want to parse ISO-8859-15. In such cases, the three code lines above can be a good alternative.

This is the follow up post to “An osm2po Quickstart” which covers loading the OSM network into PostGIS and using the result with pgRouting. After parsing the OSM file, e.g.

C:\Users\Anita\temp\osm2po-4.2.30>java -jar osm2po-core-4.2.30-signed.jar prefix=at "C:\Users\Anita\Geodaten\OpenStreetMap Data\austria.osm.pbf"

you should find a folder with the name of the prefix you chose inside the osm2po folder. It contains a log file which in turn provides a command line template for importing the OSM network into PostGIS, e.g.

INFO commandline template: psql -U [username] -d [dbname] -q -f "C:\Users\Anita\temp\osm2po-4.2.30\at\at_2po_4pgr.sql"

Using this template, we can easily import the .sql file into an exiting database. My pgRouting-enabled database is called wien_ogd.

C:\Users\Anita\temp\osm2po-4.2.30\at>psql -U [username] -d wien_ogd -q -f C:\Users\Anita\temp\osm2po-4.2.30\at\at_2po_4pgr.sql

Now, the data is ready for usage in QGIS:

The osm2po table in QGIS

Using “pgRouting Layer” plugin, it’s now straightforward to calculate shortest paths. I had to apply some changes to the plugin code, so please get the latest version from Github.

A shortest path in osm2po network

Using osm2po turned out to be far less painful than I expected and I hope you’ll find this post useful too.

Alpha shapes are generalizations of the convex hull [1]. Convex hulls are well known and widely implemented in GIS systems. Alpha shapes are different in that they capture the shape of a point set. You can watch a great demo of how alpha shapes work on François Bélair’s website “Everything You Always Wanted to Know About Alpha Shapes But Were Afraid to Ask” I borrowed the following pictures from that site:

Alpha shapes for different values of alpha. The left one equals the convex hull of the point set.

Alpha shapes for different values of alpha. The left one equals the convex hull of the point set.

pgRouting comes with an implementation of alpha shapes. There is an alpha shape function: alphashape(sql text) and a convenience wrapper: points_as_polygon(query character varying). The weird thing is that you don’t get to set an alpha value. The only thing supplied to the function is a set of points. Let’s see what kind of results it produces!

Starting point for this experiment is a 10 km catchment zone around node #2699 in my osm road network. Travel costs to nodes are calculated using driving_distance() function. (You can find more information on using this function in Catchment Areas with pgRouting driving_distance().)

CREATE TABLE home_catchment10km AS
SELECT *
   FROM osm_nodes
   JOIN
   (SELECT * FROM driving_distance('
      SELECT gid AS id,
          source,
          target,
          meters AS cost
      FROM osm_roads',
      2699,
      10000,
      false,
      false)) AS route
   ON
   osm_nodes.id = route.vertex_id

After costs are calculated, we can create some alpha shapes. The following queries create the table and insert an alpha shape for all points with a cost of less than 1500:

CREATE TABLE home_isodist (id serial, max_cost double precision);
SELECT AddGeometryColumn('home_isodist','the_geom',4326,'POLYGON',2);

INSERT INTO home_isodist (max_cost, the_geom) (
SELECT 1500, ST_SetSRID(the_geom,4326)
FROM 
  points_as_polygon(
    'SELECT id, ST_X(the_geom) AS x, ST_Y(the_geom) AS y FROM home_catchment10km where cost < 1500'));

In previous posts, I’ve created catchment areas by first interpolating a cost raster and creating contours from there. Now, let’s see how the two different approaches compare!

The following picture shows resulting catchment areas for 500, 1000, 1500, and 2000 meters around a central node. Colored areas show the form of pgRouting alpha shape results. Black contours show the results of the interpolation method:

Comparison of pgRouting alpha shapes and interpolation method

At first glance, results look similar enough. Alpha shape results look like a generalized version of interpolation results. I guess that it would be possible to get even closer if the alpha value could be set to a smaller value. The function should then produce a finer, more detailed polygon.

For a general overview about which areas of a network are reachable within certain costs, pgRouting alpha shapes function seems a viable alternative to the interpolation method presented in previous posts. However, the alpha value used by pgRouting seems too big to produce detailed catchment areas.

[1] http://biogeometry.duke.edu/software/alphashapes/

Today, I’ve been experimenting with data from OpenFlights.org. They offer airport, airline and route data for download. The first idea that came to mind was to connect airports on a shared route by lines. This kind of visualization just looks much nicer if the connections are curved instead of simple straight lines.

Luckily, that’s pretty easy to do using PostGIS. After loading airport positions and route data, we can create the connection lines like this (based on [postgis-users] Great circle as a linestring):

UPDATE experimental.airroutes
SET the_geom = 
(SELECT ST_Transform(ST_Segmentize(ST_MakeLine(
       ST_Transform(a.the_geom, 953027),
       ST_Transform(b.the_geom, 953027)
     ), 100000 ), 4326 ) 
 FROM experimental.airports a, experimental.airports b
 WHERE a.id = airroutes.source_id  
   AND b.id = airroutes.dest_id
);

The CRS used in the query is not available in PostGIS by default. You can add it like this (source: spatialreference.org):

INSERT into spatial_ref_sys (srid, auth_name, auth_srid, proj4text, srtext) values ( 953027, 'esri', 53027, '+proj=eqdc +lat_0=0 +lon_0=0 +lat_1=60 +lat_2=60 +x_0=0 +y_0=0 +a=6371000 +b=6371000 +units=m +no_defs ', 'PROJCS["Sphere_Equidistant_Conic",GEOGCS["GCS_Sphere",DATUM["Not_specified_based_on_Authalic_Sphere",SPHEROID["Sphere",6371000,0]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]],PROJECTION["Equidistant_Conic"],PARAMETER["False_Easting",0],PARAMETER["False_Northing",0],PARAMETER["Central_Meridian",0],PARAMETER["Standard_Parallel_1",60],PARAMETER["Standard_Parallel_2",60],PARAMETER["Latitude_Of_Origin",0],UNIT["Meter",1],AUTHORITY["EPSG","53027"]]');

This is an example visualization (done in QGIS) showing only flight routes starting from Vienna International Airport:

Flight routes from Vienna International

Connections crossing the date line are currently more problematic. Lines would have to be split, otherwise this is what you’ll get:

Date line trouble

Do you need to create a table with a geometry column in PostGIS from scratch?
Can’t remember the syntax of AddGeometryColumn(varchar catalog_name, varchar schema_name, varchar table_name, varchar column_name, integer srid, varchar type, integer dimension)? I can’t. ;)

Let’s make our lives easier: QGIS PostGIS Manager offers a convenient GUI for creating tables with geometry columns:

PostGIS Manager Create Table dialog

The dialog works a lot like what you’re probably used to in pgAdmin – with the added nicety of supporting Geometry columns.

In my opinion, this is the fastest way so far to create a spatially enabled table. It provides a much better user experience than telling users (especially new ones) to use AddGeometryColumn(…

In a previous post, I’ve described how to create catchment areas with pgRouting shortest_path() function. The solution described there calculates costs from the starting node (aka vertex) to all other nodes in the network. Depending on the network size, this can take a long time. Especially, if you are only interested in relatively small catchment areas (e.g. 50 km around a node in a network covering 10,000 km) there is a lot of unnecessary calculation going on. This is where you might want to use driving_distance() instead.

Driving_distance() offers a parameter for maximum distance/cost and will stop calculations when the costs exceed this limit. But let’s start at the beginning: installing the necessary functions.

Installation

If you have followed my guide to installing pgRouting, you already have some routing functions installed – but not driving_distance(). Weirdly, the necessary SQL scripts are not shipped with the .zip file available on pgRouting’s download page. You need:

routing_dd.sql
routing_dd_wrappers.sql

Both are available through the project repository at Github. Get them and execute them in your pgRouting-enabled database. Now, you should be ready.

Calculating driving distances

To calculate driving distances, we need a query very similar to shortest_path():

CREATE OR REPLACE FUNCTION driving_distance(
sql text,
source_id integer,
distance float8,
directed boolean,
has_reverse_cost boolean)
RETURNS SETOF path_result

The only new value is “distance”. That’s the maximum distance/cost you want to be contained in the result set. “distance” has to be specified in the same units as the cost attribute (which is specified in the “sql” text parameter).

Note: In my opinion, the name “(driving) distance” is misleading. While you can use distance as a cost attribute, you’re not limited to distances. You can use any cost attribute you like, e.g. travel time, fuel consumption, co2 emissions, …

The actual query for a catchment area of 100 km around node # 2000 looks like this:

SELECT * FROM driving_distance('
      SELECT gid AS id,
          start_id::int4 AS source,
          end_id::int4 AS target,
          shape_leng::float8 AS cost
      FROM network',
      2000,
      100000,
      false,
      false)

Interpreting the result

These are the first lines of the result set:

vertex_id;edge_id;cost
294;7262;97400.433506144
297;7236;98012.620979231
335;1095;96849.456306244
347;7263;93617.693852324
364;7098;93573.849081386
366;2551;92702.443434779
378;7263;91994.328368081

The cost attribute contains the total cost of travel from the starting node to the vertex_id node.
We will only be using vertex_id and cost. The use of edge_id is a mystery to me.

Visualizing the result

The easiest way to visualize driving_distance() results is using RT Sql Layer plugin. We need to join the results of driving_distance() with the table containing node geometries:

SELECT *
   FROM node
   JOIN
   (SELECT * FROM driving_distance('
      SELECT gid AS id,
          start_id::int4 AS source,
          end_id::int4 AS target,
          shape_leng::float8 AS cost
      FROM network',
      2000,
      100000,
      false,
      false)) AS route
   ON
   node.id = route.vertex_id

If you color the nodes based on the cost attribute, it will look something like this:

result of pgRouting driving_distance() visualized in QGIS

This post covers the topic of creating MULTI* geometries or GEOMETRYCOLLECTIONs in PostGIS using ST_Collect or ST_Union.

Often it doesn’t matter if you use ST_Collect or ST_Union. Both will return MULTI* geometries or – if different geometry types or multi geometries are mixed – GEOMETRYCOLLECTIONS.

Why you might want to use ST_Collect instead of ST_Union

ST_Collect is much faster than ST_Union [1]. That’s already a good argument. It’s faster because it does not dissolve boundaries or check for overlapping regions. It simply combines geometries into MULTI*s or GEOMETRYCOLLECTIONs.

This leads to reason two for why you might prefer ST_Collect: Dissolving boundaries and overlapping regions – like ST_Union does – can lead to undesired effects. For example, combining three LINESTRING geometries that partially overlap, results in a MULTILINESTRING consisting of more than three lines. The input lines are split where they overlap each other or themselves.

Analysis and visualization based on the resulting geometries might lead to incorrect conclusions if the user is not aware of the processes performed by the union command.

[1] http://postgis.refractions.net/documentation/manual-svn/ST_Collect.html

PostGIS – our favorite spatial db – is nearing its 2.0 release! We are promised even better performance and stability and many new capabilities especially in the three new areas: raster data, topology, and 3D.

Raster

In 2.0, PostGIS Raster (formerly known as WKT Raster) will be fully integrated into the main application. Raster images are stored in special raster tables, which can be loaded from and exported to any GDAL-supported format. Additionally, there are functions for analyzing and operating on the pixel data. Rasters can be for example: vectorized, averaged, checked for intersections with vector geometries and edited inside the database.

For more information on what you can do with Rasters, check: PostGIS Doc, Chapter 8. Raster Reference.

Topology

PostGIS 2.0 represents the beginning of topology support in PostGIS. It will be possible to transform standard geometry into topological data, validate topology, and edit nodes and edges. Topology can be exported to GML.

Topology is covered in PostGIS Doc, Chapter 10. Topology.

3D

PostGIS 2.0 adds two 3D geometry types: polygonal surfaces and triangular irregular networks (TINs). Additional support operators for common tasks like finding areas (and volumes) of regions in 3D are included as well. Another improvement in this area is that existing spatial indices have been made 3D-aware, and a library of 3D-functions has been added. This allows calculation of distances and intersections in 3D, 3-dimensional bounding-boxes and many more things like 3D shortest-paths.

For a list of functions that support 3D, check PostGIS Doc, Chapter 12.

Read more on lwn.net.

Site analyses can benefit greatly from using “drive-time” isochrones to define the study area. Drive time isochrones are often significantly different from simple buffer areas which disregard natural barriers such as rivers or slow roads.

Of course, creating drive time isochrones requires more input data and more compute-intensive algorithms than a simple buffer analysis. It is necessary to create a routable network graph with adequate weights to be used by the routing algorithm.

One of the most popular routing  applications in the open source world is pgRouting for PostGIS enabled databases. I’ve already shown how to create drive time isochrones for one network node based on pgRouting and QGIS.  Today, I’ll show how to create drive time isochrones for a set of points – in this case all airports in Finland.

The first step is to find the closest network node to every airport:

ALTER TABLE airport
   ADD COLUMN nearest_node integer;

CREATE TABLE temp AS
   SELECT a.gid, b.id, min(a.dist)
   FROM
     (SELECT airport.gid, 
             min(distance(airport.the_geom, node.the_geom)) AS dist
      FROM airport, node
      GROUP BY airport.gid) AS a,
     (SELECT airport.gid, node.id, 
             distance(airport.the_geom, node.the_geom) AS dist
      FROM airport, node) AS b
   WHERE a.dist = b. dist
         AND a.gid = b.gid
   GROUP BY a.gid, b.id;

UPDATE airport
SET nearest_node = 
   (SELECT id 
    FROM temp
    WHERE temp.gid = airport.gid);

Then, we can calculate drive times between network nodes and “airport nodes”. I am still looking for the most efficient way to perform this calculation. The trivial solution I used for this example was to calculate all drive time values separately for each airport node (as described in “Creating Catchment Areas with pgRouting and QGIS”).
I create the table with the first node:

CREATE TABLE catchment_airport AS
SELECT 
    id,
    the_geom,
    (SELECT sum(cost) FROM (
	   SELECT * FROM shortest_path('
	   SELECT gid AS id,
		  start_id::int4 AS source,
		  end_id::int4 AS target,
		  traveltime::float8 AS cost
	   FROM network',
	   5657,
	   id,
	   false,
	   false)) AS foo ) AS cost
FROM node;

Every following node is done with an INSERT:

INSERT INTO catchment_airport (
SELECT 
    id,
    the_geom,
    (SELECT sum(cost) FROM (
	   SELECT * FROM shortest_path('
	   SELECT gid AS id,
		  start_id::int4 AS source,
		  end_id::int4 AS target,
		  traveltime::float8 AS cost
	   FROM network',
	   123, -- put a new index here!
	   id,
	   false,
	   false)) AS foo ) AS cost
FROM node);

Afterwards, I combined the values to find the minimum drive time for each network node:

CREATE table catchment_airport_final AS
SELECT id, the_geom, min (cost) AS cost
FROM catchment_airport
GROUP By id, the_geom;

The resulting point layer was imported into QGIS. Using TIN interpolation (from Interpolation plugin), you can calculate a continuous cost surface. And Contour function (from GDALTools) finally yields drive time isochrones.

Drive time isochrones around airports in northern Finland - spatial data © National Land Survey of Finland 2011

Based on this analysis, it is possible to determine how many inhabitants live within one hour driving distance from an airport or how many people have to drive longer than e.g. ninety minutes to reach any airport.

Follow

Get every new post delivered to your Inbox.

Join 1,493 other followers

%d bloggers like this: