Archive

PostgreSQL

This post is a quick guide to installing Postgres 9.2, PostGIS 2.0, and pgRouting 2.0.

  1. For Postgres, download the installer from enterprisedb.com.
  2. Run the installer. You’ll have to pick a superuser password – remember it, you’ll need it again soon.
  3. At the end of the installation process, allow the installer to start Stack Builder.
  4. In Stack Builder, select the Postgres 9.2 installation and install PostGIS from the list of available extensions.
  5. The PostGIS installation procedure will prompt you for the superuser password you picked before.
  6. I suggest letting the installer create a sample database. We’ll need it later anyway.

Now for the pgRouting part:

  1. Download the pgRouting zip file for your system (32 or 64 bit) from Winnie.
  2. Unzip the file. It contains bin, lib and share folders as well as two text files.
  3. Copy these folders and files over to your Postgres installation, in my case C:\Program Files\PostgreSQL\9.2.

Installation – done.

Next, fire up pgAdmin. If you allowed the PostGIS installer to create a sample database, you will find it named postgis20 or similar. Otherwise just create a new database using the PostGIS template database. You can enable pgRouting in a database using the following steps:

  1. In postgis20, execute the following query to create the pgrouting extension. This will add the pgRouting-specific functions:
    CREATE EXTENSION pgrouting;
  2. Test if everything worked fine:
    SELECT pgr_version();

    It should return "(2.0.0-dev,v2.0.0-beta,18,a3be38b,develop,1.46.1)" or similar – depending on the version you downloaded.

[Screenshot: pgadmin_pgrouting]

How about some test data? I’ll be using the public transport network of the city of Vienna provided in GeoJSON format from http://data.wien.gv.at/daten/wfs?service=WFS&request=GetFeature&version=1.1.0&typeName=ogdwien:OEFFLINIENOGD&srsName=EPSG:4326&outputFormat=json:

  1. Just copy-paste the URL into Add Vector Layer | Protocol to load the dataset.
  2. Use DB Manager to load the layer into your database. (As you can see in the screenshot, I created a schema called network but that’s optional.)
  3. [Screenshot: import_publictransport]

  4. To make the line vector table routable, we use pgr_createTopology. This function assumes that columns called “source” and “target” exist, so we have to create those as well:
    alter table network.publictransport add column source integer;
    alter table network.publictransport add column target integer;
    select pgr_createTopology('network.publictransport', 0.0001, 'geom', 'id');
    

    I’m quite generously using a tolerance of 0.0001 degrees to build the topology. Depending on your dataset, you might want to be stricter here. (A quick sanity check for the resulting topology is sketched right after this list.)

  5. Let’s test it! Route from source #1 to target #3000 using pgr_dijkstra:
    SELECT seq, id1 AS node, id2 AS edge, cost, geom
      FROM pgr_dijkstra(
        'SELECT id, source, target, st_length(geom) as cost FROM network.publictransport',
        1, 3000, false, false
      ) as di
      JOIN network.publictransport pt
      ON di.id2 = pt.id;

    Note how the query joins the routing results and the network table together. (I’m aware that using the link length as a cost attribute will not lead to realistic results in a public transport network but bear with me for this example.)

  6. We can immediately see the routing results using the Load as new layer option:

[Screenshot: select_dijkstra]
[Screenshot: route]
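
Before routing, it does not hurt to verify that the topology was actually built. Here is a minimal sanity check, assuming pgr_createTopology created its default vertices table network.publictransport_vertices_pgr next to the edge table:

-- number of network nodes created by pgr_createTopology
SELECT count(*) FROM network.publictransport_vertices_pgr;

-- edges that did not get connected to the topology
SELECT count(*) FROM network.publictransport
WHERE source IS NULL OR target IS NULL;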

Nice work! pgRouting 2.0 has come a long way. In a post from April this year, Boston GIS even announced that pgRouting will be added to Stack Builder. That’s going to make the installation even smoother.

For everyone working with spatial databases in QGIS there comes a time when “Add PostGIS/SpatiaLite Layer” and “RT Sql Layer” start to be annoying. You always have to retype or copy-paste your SQL queries into the user interface if you need to change the tiniest thing in the layer’s definition.

This is where “Fast SQL Layer” can be a real time saver. Fast SQL Layer is a new plugin for QGIS by Pablo T. Carreira. It basically adds an SQL console for loading layers from PostGIS/SpatiaLite into QGIS. And it even comes with syntax highlighting!

Installation

Fast SQL Layer comes with one dependency: Pygments, which is used for syntax highlighting.

On Ubuntu, all you have to do is install it with apt-get:

sudo apt-get install python-pygments

For Windows with OSGeo4W, @Mike_Toews posted this on gis.stackexchange:

I downloaded and extracted Pygments-1.4.tar.gz, then in an OSGeo4W shell within the Pygments-1.4 directory, type python setup.py build then python setup.py install

Usage

When you activate the plugin in plugin manager, a dock widget will appear which contains the console and some fields for specifying the database connection that should be used. Then, you can simply write your SQL query and load the results with one click.
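
For example, a query along these lines (table and column names are made up for illustration) can be typed straight into the console and loaded as a layer:

-- hypothetical example: load a filtered subset of a PostGIS table as a QGIS layer
SELECT gid, geom, name
FROM public.roads
WHERE name ILIKE '%bridge%';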

Fast SQL plugin

In this example, I renamed “gid” to “id”, but you can actually edit the values in the drop down boxes to adjust the column names for id and geometry:

A second layer loaded using Fast SQL plugin

It certainly needs some polishing on the user interface side but I really like it.

Today’s challenge is to read data from a CSV file and construct usable timestamps. This is the layout of the file, notice the missing leading zeros in the month, hour and seconds fields:

utc_date;utc_time
"2011/3/15 ";"6:17:3 "
"2011/3/15 ";"6:17:8 "
"2011/3/15 ";"6:17:13 "
"2011/3/15 ";"6:17:18 "
"2011/3/15 ";"6:17:23 "

To get the data into Postgres, I used the copy command with the csv header option. The table has to exist already. “csv header” tells the command to ignore the file’s header line, and since the file is semicolon-separated we also need delimiter ';'.
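
Since copy does not create the table, here is a minimal sketch of what it could look like (column names and types are my assumption; utc_timestamp is the column we will fill afterwards):

CREATE TABLE mytable (
    utc_date text,
    utc_time text,
    utc_timestamp timestamp
);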

copy mytable
from '~/my.csv'
csv header delimiter ';';

Now, we have to handle the “missing leading zeros” problem. Luckily, Postgres offers a template pattern modifier that addresses exactly this problem: the “FM” (fill mode) prefix. The pattern for a month without leading zeros therefore is ‘FMMM’. The full to_timestamp command looks like this:

update mytable
set utc_timestamp = to_timestamp(utc_date||' '||utc_time, 'YYYY/FMMM/FMDD FMHH24:FMMI:FMSS');
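
To sanity-check the pattern against one of the sample values before running the update:

SELECT to_timestamp('2011/3/15 6:17:3', 'YYYY/FMMM/FMDD FMHH24:FMMI:FMSS');
-- returns 2011-03-15 06:17:03 as a timestamp with time zone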

Do you need a random sample of features in a Postgres table? Here is an example of how to select 1,000 random features from a table:

SELECT * FROM myTable
WHERE attribute = 'myValue'
ORDER BY random()
LIMIT 1000;

Besides many other interesting topics, OpenGeo’s PostGIS tutorial discusses “Tuning PostgreSQL for Spatial”.

The following values are recommended for production environments:

  • shared_buffers: 75 % of database memory (500 MB)
  • work_mem: 16 MB
  • maintenance_work_mem: 128 MB
  • wal_buffers: 1 MB
  • checkpoint_segments: 6
  • random_page_cost: 2.0
  • seq_page_cost: 1.0

All of these configuration parameters can be edited in the database configuration file postgresql.conf, located in C:\Documents and Settings\%USER\.opengeo\pgdata\%USER. This is a regular text file and can be edited using Notepad or any other text editor. An easier way of editing this configuration is the built-in “Backend Configuration Editor”: in pgAdmin, go to File > Open postgresql.conf…, and when it asks for the location of the file, navigate to C:\Documents and Settings\%USER\.opengeo\pgdata\%USER.
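
In postgresql.conf, the settings from the list above look roughly like this:

shared_buffers = 500MB
work_mem = 16MB
maintenance_work_mem = 128MB
wal_buffers = 1MB
checkpoint_segments = 6
random_page_cost = 2.0
seq_page_cost = 1.0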

The changes will not take effect until the server is restarted.

Maybe this is a bit off-topic, but I just spent quite some time on this and I need to write it down so I can look it up again later :)

These are instructions for Ubuntu running Postgres 8.4. By default, Postgres ships without PL/Python so we need to get it first:

sudo apt-get install postgresql-plpython-8.4

Next, we need to create the language for our database. I’m using PgAdmin3. From there, I ran:

CREATE PROCEDURAL LANGUAGE 'plpython' HANDLER plpython_call_handler;

This should have been it. Let’s try with a simple function:

CREATE FUNCTION replace_e_to_a(text) RETURNS text AS
'
import re
Text1 = re.sub(''e'', ''a'',args[0])
return Text1
'
LANGUAGE 'plpython';

SELECT replace_e_to_a('Meee');

… should return ‘Maaa’.

Now for the juicy part: Let’s create an INSERT trigger function!

First, let’s have a look at the corresponding table structure. We have two tables, “user_data” and “user_keywords”. “user_data” is the table that’s being filled with information from external functions. “user_keywords” has to be kept up to date: it is supposed to count the occurrences of keywords on a per-user basis.

user_data                                   user_keywords
user_id, event_id, keywords                 user_id, keyword,   count
1,       1,        'music,rock'             1,       'music',   2
1,       2,        'music,classic'          1,       'rock',    1
                                            1,       'classic', 1

First, the keyword list has to be split. Then a row has to be inserted for new keywords (see insert_plan) and the counter has to be increased for existing keywords (see update_plan).

The values that are about to be inserted can be accessed via TD["new"]["column_name"].

CREATE FUNCTION update_keyword_count() RETURNS trigger AS '

# values that are about to be inserted into user_data
keywords = TD["new"]["keywords"]
user = TD["new"]["user_id"]

# plan for adding a row for a keyword we have not seen for this user yet
insert_plan = plpy.prepare("INSERT INTO user_keywords (keyword, count, user_id) VALUES ($1, $2, $3)", ["text", "int", "int"])

# plan for increasing the counter of an existing user/keyword combination
update_plan = plpy.prepare("UPDATE user_keywords SET count = $3 WHERE user_id = $1 AND keyword = $2", ["int", "text", "int"])

for keyword in keywords.split(","):
  # check whether this user/keyword combination already exists
  select_cnt_rows = plpy.prepare("SELECT count(*) AS cnt FROM user_keywords WHERE user_id = $1 AND keyword = $2", ["int", "text"])
  cnt_rows = plpy.execute(select_cnt_rows, [user, keyword])

  # fetch the current counter value (empty result if the keyword is new)
  select_plan = plpy.prepare("SELECT count AS cnt FROM user_keywords WHERE user_id = $1 AND keyword = $2", ["int", "text"])
  results = plpy.execute(select_plan, [user, keyword])

  if cnt_rows[0]["cnt"] == 0:
   rv = plpy.execute(insert_plan, [keyword, 1, user])
  else:
   rv = plpy.execute(update_plan, [user, keyword, results[0]["cnt"]+1])

' LANGUAGE plpython;

Now, we need to wire it up by defining the trigger:

CREATE TRIGGER update_keywords
BEFORE INSERT ON user_data
FOR EACH ROW
EXECUTE PROCEDURE update_keyword_count();
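
To see the trigger in action, here is a quick hypothetical test based on the example tables above (assuming both tables exist with the columns shown):

INSERT INTO user_data (user_id, event_id, keywords) VALUES (1, 1, 'music,rock');
INSERT INTO user_data (user_id, event_id, keywords) VALUES (1, 2, 'music,classic');

SELECT * FROM user_keywords ORDER BY keyword;
-- expected: (1, 'classic', 1), (1, 'music', 2), (1, 'rock', 1)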

… Wasn’t that bad ;)
