Advertisements

Archive

PostgreSQL

This is a guest post by Chris Kohler .

Introduction:

This guide provides step-by-step instructions to produce drive-time isochrones using a single vector shapefile. The method described here involves building a routing network using a single vector shapefile of your roads data within a Virtual Box. Furthermore, the network is built by creating start and end nodes (source and target nodes) on each road segment. We will use Postgresql, with PostGIS and Pgrouting extensions, as our database. Please consider this type of routing to be fair, regarding accuracy, as the routing algorithms are based off the nodes locations and not specific addresses. I am currently working on an improved workflow to have site address points serve as nodes to optimize results. One of the many benefits of this workflow is no financial cost to produce (outside collecting your roads data). I will provide instructions for creating, and using your virtual machine within this guide.

Steps:–Getting Virtual Box(begin)–

Intro 1. Download/Install Oracle VM(https://www.virtualbox.org/wiki/Downloads)

Intro 2. Start the download/install OSGeo-Live 11(https://live.osgeo.org/en/overview/overview.html).

Pictures used in this workflow will show 10.5, though version 11 can be applied similarly. Make sure you download the version: osgeo-live-11-amd64.iso. If you have trouble finding it, here is the direct link to the download (https://sourceforge.net/projects/osgeo-live/files/10.5/osgeo-live-10.5-amd64.iso/download)
Intro 3. Ready for virtual machine creation: We will utilize the downloaded OSGeo-Live 11 suite with a virtual machine we create to begin our workflow. The steps to create your virtual machine are listed below. Also, here are steps from an earlier workshop with additional details with setting up your virtual machine with osgeo live(http://workshop.pgrouting.org/2.2.10/en/chapters/installation.html).

1.  Create Virutal Machine: In this step we begin creating the virtual machine housing our database.

Open Oracle VM VirtualBox Manager and select “New” located at the top left of the window.

VBstep1

Then fill out name, operating system, memory, etc. to create your first VM.

vbstep1.2

2. Add IDE Controller:  The purpose of this step is to create a placeholder for the osgeo 11 suite to be implemented. In the virtual box main window, right-click your newly-created vm and open the settings.

vbstep2

In the settings window, on the left side select the storage tab.

Find “adds new storage controller button located at the bottom of the tab. Be careful of other buttons labeled “adds new storage attachment”! Select “adds new storage controller button and a drop-down menu will appear. From the top of the drop-down select “Add IDE Controller”.

vbstep2.2

vbstep2.3

You will see a new item appear in the center of the window under the “Storage Tree”.

3.  Add Optical Drive: The osgeo 11 suite will be implemented into the virtual machine via an optical drive. Highlight the new controller IDE you created and select “add optical drive”.

vbstep3

A new window will pop-up and select “Choose Disk”.

vbstep3.2

Locate your downloaded file “osgeo-live 11 amd64.iso” and click open. A new object should appear in the middle window under your new controller displaying “osgeo-live-11.0-amd64.iso”.

vbstep3.3

Finally your virtual machine is ready for use.
Start your new Virtual Box, then wait and follow the onscreen prompts to begin using your virtual machine.

vbstep3.4

–Getting Virtual Box(end)—

4. Creating the routing database, and both extensions (postgis, pgrouting): The database we create and both extensions we add will provide the functions capable of producing isochrones.

To begin, start by opening the command line tool (hold control+left-alt+T) then log in to postgresql by typing “psql -U user;” into the command line and then press Enter. For the purpose of clear instruction I will refer to database name in this guide as “routing”, feel free to choose your own database name. Please input the command, seen in the figure below, to create the database:

CREATE DATABASE routing;

You can use “\c routing” to connect to the database after creation.

step4

The next step after creating and connecting to your new database is to create both extensions. I find it easier to take two-birds-with-one-stone typing “psql -U user routing;” this will simultaneously log you into postgresql and your routing database.

When your logged into your database, apply the commands below to add both extensions

CREATE EXTENSION postgis;
CREATE EXTENSION pgrouting;

step4.2

step4.3

5. Load shapefile to database: In this next step, the shapefile of your roads data must be placed into your virtual machine and further into your database.

My method is using email to send myself the roads shapefile then download and copy it from within my virtual machines web browser. From the desktop of your Virtual Machine, open the folder named “Databases” and select the application “shape2pgsql”.

step5

Follow the UI of shp2pgsql to connect to your routing database you created in Step 4.

step5.2

Next, select “Add File” and find your roads shapefile (in this guide we will call our shapefile “roads_table”) you want to use for your isochrones and click Open.

step5.3

Finally, click “Import” to place your shapefile into your routing database.

6. Add source & target columns: The purpose of this step is to create columns which will serve as placeholders for our nodes data we create later.

There are multiple ways to add these columns into the roads_table. The most important part of this step is which table you choose to edit, the names of the columns you create, and the format of the columns. Take time to ensure the source & target columns are integer format. Below are the commands used in your command line for these functions.

ALTER TABLE roads_table ADD COLUMN "source" integer;
ALTER TABLE roads_table ADD COLUMN "target" integer;

step6

step6.2

7. Create topology: Next, we will use a function to attach a node to each end of every road segment in the roads_table. The function in this step will create these nodes. These newly-created nodes will be stored in the source and target columns we created earlier in step 6.

As well as creating nodes, this function will also create a new table which will contain all these nodes. The suffix “_vertices_pgr” is added to the name of your shapefile to create this new table. For example, using our guide’s shapefile name , “roads_table”, the nodes table will be named accordingly: roads_table_vertices_pgr. However, we will not use the new table created from this function (roads_table_vertices_pgr). Below is the function, and a second simplified version, to be used in the command line for populating our source and target columns, in other words creating our network topology. Note the input format, the “geom” column in my case was called “the_geom” within my shapefile:

pgr_createTopology('roads_table', 0.001, 'geom', 'id',
 'source', 'target', rows_where := 'true', clean := f)

step7

Here is a direct link for more information on this function: http://docs.pgrouting.org/2.3/en/src/topology/doc/pgr_createTopology.html#pgr-create-topology

Below is an example(simplified) function for my roads shapefile:

SELECT pgr_createTopology('roads_table', 0.001, 'the_geom', 'id')

8. Create a second nodes table: A second nodes table will be created for later use. This second node table will contain the node data generated from pgr_createtopology function and be named “node”. Below is the command function for this process. Fill in your appropriate source and target fields following the manner seen in the command below, as well as your shapefile name.

To begin, find the folder on the Virtual Machines desktop named “Databases” and open the program “pgAdmin lll” located within.

step8

Connect to your routing database in pgAdmin window. Then highlight your routing database, and find “SQL” tool at the top of the pgAdmin window. The tool resembles a small magnifying glass.

step8.2

We input the below function into the SQL window of pgAdmin. Feel free to refer to this link for further information: (https://anitagraser.com/2011/02/07/a-beginners-guide-to-pgrouting/)

CREATE TABLE node AS
   SELECT row_number() OVER (ORDER BY foo.p)::integer AS id,
          foo.p AS the_geom
   FROM (     
      SELECT DISTINCT roads_table.source AS p FROM roads_table
      UNION
      SELECT DISTINCT roads_table.target AS p FROM roads_table
   ) foo
   GROUP BY foo.p;

step8.3

  1.  Create a routable network: After creating the second node table from step 8,  we will combine this node table(node) with our shapefile(roads_table) into one, new, table(network) that will be used as the routing network. This table will be called “network” and will be capable of processing routing queries.  Please input this command and execute in SQL pgAdmin tool as we did in step 8. Here is a reference for more information:(https://anitagraser.com/2011/02/07/a-beginners-guide-to-pgrouting/)   

step8.2

 

CREATE TABLE network AS
   SELECT a.*, b.id as start_id, c.id as end_id
   FROM roads_table AS a
      JOIN node AS b ON a.source = b.the_geom
      JOIN node AS c ON a.target = c.the_geom;

step9.2

10. Create a “noded” view of the network:  This new view will later be used to calculate the visual isochrones in later steps. Input this command and execute in SQL pgAdmin tool.

CREATE OR REPLACE VIEW network_nodes AS 
SELECT foo.id,
 st_centroid(st_collect(foo.pt)) AS geom 
FROM ( 
  SELECT network.source AS id,
         st_geometryn (st_multi(network.geom),1) AS pt 
  FROM network
  UNION 
  SELECT network.target AS id, 
         st_boundary(st_multi(network.geom)) AS pt 
  FROM network) foo 
GROUP BY foo.id;

step10

11.​ Add column for speed:​ This step may, or may not, apply if your original shapefile contained a field of values for road speeds.

In reality a network of roads will typically contain multiple speed limits. The shapefile you choose may have a speed field, otherwise the discrimination for the following steps will not allow varying speeds to be applied to your routing network respectfully.

If values of speed exists in your shapefile we will implement these values into a new field, “traveltime“, that will show rate of travel for every road segment in our network based off their geometry. Firstly, we will need to create a column to store individual traveling speeds. The name of our column will be “traveltime” using the format: ​double precision.​ Input this command and execute in the command line tool as seen below.

ALTER TABLE network ADD COLUMN traveltime double precision;

step11

Next, we will populate the new column “traveltime” by calculating traveling speeds using an equation. This equation will take each road segments geometry(shape_leng) and divide by the rate of travel(either mph or kph). The sample command I’m using below utilizes mph as the rate while our geometry(shape_leng) units for my roads_table is in feet​. If you are using either mph or kph, input this command and execute in SQL pgAdmin tool. Below further details explain the variable “X”.

UPDATE network SET traveltime = shape_leng / X*60

step11.2

How to find X​, ​here is an example​: Using example 30 mph as rate. To find X, we convert 30 miles to feet, we know 5280 ft = 1 mile, so we multiply 30 by 5280 and this gives us 158400 ft. Our rate has been converted from 30 miles per hour to 158400 feet per hour. For a rate of 30 mph, our equation for the field “traveltime”  equates to “shape_leng / 158400*60″. To discriminate this calculations output, we will insert additional details such as “where speed = 30;”. What this additional detail does is apply our calculated output to features with a “30” value in our “speed” field. Note: your “speed” field may be named differently.

UPDATE network SET traveltime = shape_leng / 158400*60 where speed = 30;

Repeat this step for each speed value in your shapefile examples:

UPDATE network SET traveltime = shape_leng / X*60 where speed = 45;
UPDATE network SET traveltime = shape_leng / X*60 where speed = 55;

The back end is done. Great Job!

Our next step will be visualizing our data in QGIS. Open and connect QGIS to your routing database by right-clicking “PostGIS” in the Browser Panel within QGIS main window. Confirm the checkbox “Also list tables with no geometry” is checked to allow you to see the interior of your database more clearly. Fill out the name or your routing database and click “OK”.

If done correctly, from QGIS you will have access to tables and views created in your routing database. Feel free to visualize your network by drag-and-drop the network table into your QGIS Layers Panel. From here you can use the identify tool to select each road segment, and see the source and target nodes contained within that road segment. The node you choose will be used in the next step to create the views of drive-time.

12.Create views​: In this step, we create views from a function designed to determine the travel time cost. Transforming these views with tools will visualize the travel time costs as isochrones.

The command below will be how you start querying your database to create drive-time isochrones. Begin in QGIS by draging your network table into the contents. The visual will show your network as vector(lines). Simply select the road segment closest to your point of interest you would like to build your isochrone around. Then identify the road segment using the identify tool and locate the source and target fields.

step12

step12.2

Place the source or target field value in the below command where you see ​VALUE​, in all caps​.

This will serve you now as an isochrone catchment function for this workflow. Please feel free to use this command repeatedly for creating new isochrones by substituting the source value. Please input this command and execute in SQL pgAdmin tool.

*AT THE BOTTOM OF THIS WORKFLOW I PROVIDED AN EXAMPLE USING SOURCE VALUE “2022”

CREATE OR REPLACE VIEW "​view_name" AS 
SELECT di.seq, 
       di.id1, 
       di.id2, 
       di.cost, 
       pt.id, 
       pt.geom 
FROM pgr_drivingdistance('SELECT
     gid::integer AS id, 
     Source::integer AS source, 
     Target::integer AS target,                                    
     Traveltime::double precision AS cost 
       FROM network'::text, ​VALUE::bigint, 
    100000::double precision, false, false)
    di(seq, id1, id2, cost)
JOIN network_nodes pt ON di.id1 = pt.id;

step12.3

13.Visualize Isochrone: Applying tools to the view will allow us to adjust the visual aspect to a more suitable isochrone overlay.

​After creating your view, a new item in your routing database is created, using the “view_name” you chose. Drag-and-drop this item into your QGIS LayersPanel. You will see lots of small dots which represent the nodes.

In the figure below, I named my view “take1“.

step13

Each node you see contains a drive-time value, “cost”, which represents the time used to travel from the node you input in step 12’s function.

step13.2

Start by installing the QGIS plug-in Interpolation” by opening the Plugin Manager in QGIS interface.

step13.3

Next, at the top of QGIS window select “Raster” and a drop-down will appear, select “Interpolation”.

step13.4

 

A new window pops up and asks you for input.

step13.5

Select your “​view”​ as the​ vector layer​, select ​”cost​” as your ​interpolation attribute​, and then click “Add”.

step13.6

A new vector layer will show up in the bottom of the window, take care the type is Points. For output, on the other half of the window, keep the interpolation method as “TIN”, edit the ​output file​ location and name. Check the box “​Add result to project​”.

Note: decreasing the cellsize of X and Y will increase the resolution but at the cost of performance.

Click “OK” on the bottom right of the window.

step13.7

A black and white raster will appear in QGIS, also in the Layers Panel a new item was created.

step13.8

Take some time to visualize the raster by coloring and adjusting values in symbology until you are comfortable with the look.

step13.9

step13.10

14. ​Create contours of our isochrone:​ Contours can be calculated from the isochrone as well.

Find near the top of QGIS window, open the “Raster” menu drop-down and select Extraction → Contour.

step14

Fill out the appropriate interval between contour lines but leave the check box “Attribute name” unchecked. Click “OK”.

step14.2

step14.3

15.​ Zip and Share:​ Find where you saved your TIN and contours, compress them in a zip folder by highlighting them both and right-click to select “compress”. Email the compressed folder to yourself to export out of your virtual machine.

Example Isochrone catchment for this workflow:

CREATE OR REPLACE VIEW "2022" AS 
SELECT di.seq, Di.id1, Di.id2, Di.cost,                           
       Pt.id, Pt.geom 
FROM pgr_drivingdistance('SELECT gid::integer AS id,                                       
     Source::integer AS source, Target::integer AS target, 
     Traveltime::double precision AS cost FROM network'::text, 
     2022::bigint, 100000::double precision, false, false) 
   di(seq, id1, id2, cost) 
JOIN netowrk_nodes pt 
ON di.id1 = pt.id;

References: Virtual Box ORACLE VM, OSGeo-Live 11  amd64 iso, Workshop FOSS4G Bonn(​http://workshop.pgrouting.org/2.2.10/en/index.html​),

Advertisements

This post is a quick instruction for installing Postgres 9.2, PostGIS 2.0 and pgRouting 2.0.

  1. For Postgres, download the installer from enterprisedb.com.
  2. Run the installer. You’ll have to pick a superuser password – remember it, you’ll need it again soon.
  3. At the end of the installation process, allow the installer to start Stack Builder.
  4. In Stack Builder, select the Postgres 9.2 installation and install PostGIS from the list of available extensions.
  5. The PostGIS installation procedure will prompt you for the superuser password you picked before.
  6. I suggest letting the installer create a sample database We’ll need it later anyway.

Now for the pgRouting part:

  1. Download the pgRouting zip file for your system (32 or 64 bit) from Winnie.
  2. Unzip the file. It contains bin, lib and share folders as well as two text files.
  3. Copy these folders and files over to your Postgres installation. In my case C:\Program Files\PostgreSQL\9.2

Installation – done.

Next, fire up pgAdmin. If you allowed the PostGIS installer to create a sample database, you will find it named postgis20 or similar. Otherwise just create a new database using the PostGIS template database. You can enable pgRouting in a database using the following steps:

  1. In postgis20, execute the following query to create the pgrouting extension. This will add the pgRouting-specific functions:
    CREATE EXTENSION pgrouting;
  2. Test if everything worked fine:
    SELECT pgr_version();

    It should return "(2.0.0-dev,v2.0.0-beta,18,a3be38b,develop,1.46.1)" or similar – depending on the version you downloaded.

pgadmin_pgrouting

How about some test data? I’ll be using the public transport network of the city of Vienna provided in GeoJSON format from http://data.wien.gv.at/daten/geo?service=WFS&request=GetFeature&version=1.1.0&typeName=ogdwien:OEFFLINIENOGD&srsName=EPSG:4326&outputFormat=json:

    1. Just copy paste the url in Add Vector Layer | Protocol to load the dataset.
    2. Use DB Manager to load the layer into your database. (As you can see in the screenshot, I created a schema called network but that’s optional.)

import_publictransport

  1. To make the line vector table routable, we use pgr_createTopology. This function assumes the columsn called “source” and “target” exist so we have to create those as well:
    alter table network.publictransport add column source integer;
    alter table network.publictransport add column target integer;
    select pgr_createTopology('network.publictransport', 0.0001, 'geom', 'id');
    

    I’m quite generously using a tolerance of 0.0001 degrees to build the topology. Depending on your dataset, you might want to be more strict here.

  2. Let’s test it! Route from source #1 to target #3000 using pgr_dijkstra:
    SELECT seq, id1 AS node, id2 AS edge, cost, geom
      FROM pgr_dijkstra(
        'SELECT id, source, target, st_length(geom) as cost FROM network.publictransport',
        1, 3000, false, false
      ) as di
      JOIN network.publictransport pt
      ON di.id2 = pt.id ;

    Note how the query joins the routing results and the network table together. (I’m aware that using the link length as a cost attribute will not lead to realistic results in a public transport network but bear with me for this example.)

  3. We can immediately see the routing results using the Load as new layer option:

select_dijkstra
route

Nice work! pgRouting 2.0 has come a long way. In a post from April this year, Boston GIS even announced to add pgRouting into the Stack Builder. That’s going to make the installation even more smooth.

For everyone working with spatial databases in QGIS there comes a time when “Add PostGIS/SpatiaLite Layer” and “RT Sql Layer” start to be annoying. You always have to retype or copy-paste your SQL queries into the user interface if you need to change the tiniest thing in the layer’s definition.

This is where “Fast SQL Layer” can be a real time saver. Fast SQL Layer is a new plugin for QGIS by Pablo T. Carreira. It basically adds an SQL console for loading layers from PostGIS/SpatiaLite into QGIS. And it even comes with syntax highlighting!

Installation

Fast SQL Layer comes with one dependency: Pygments, which is used for syntax highlighting.

On Ubuntu, all you have to do is install it with apt-get:

sudo apt-get install python-pygments

For Windows with OSGeo4W, @Mike_Toews posted this on gis.stackexchange:

I downloaded and extracted Pygments-1.4.tar.gz, then in an OSGeo4W shell within the Pygments-1.4 directory, type python setup.py build then python setup.py install

Usage

When you activate the plugin in plugin manager, a dock widget will appear which contains the console and some fields for specifying the database connection that should be used. Then, you can simply write your SQL query and load the results with one click.

Fast SQL plugin

In this example, I renamed “gid” to “id”, but you can actually edit the values in the drop down boxes to adjust the column names for id and geometry:

A second layer loaded using Fast SQL plugin

It certainly needs some polishing on the user interface side but I really like it.

Today’s challenge is to read data from a CSV file and construct usable timestamps. This is the layout of the file, notice the missing leading zeros in the month, hour and seconds fields:

utc_date;utc_time
"2011/3/15 ";"6:17:3 "
"2011/3/15 ";"6:17:8 "
"2011/3/15 ";"6:17:13 "
"2011/3/15 ";"6:17:18 "
"2011/3/15 ";"6:17:23 "

To get the data into Postgres, I used the copy command with csv header option. The table has to exist already. “csv header” tells the command to ignore the file’s header line.

copy mytable 
from '~/my.csv' 
csv header;

Now, we have to handle the “missing leading zeros” problem. Luckily, Postgres offers a template pattern modifier that addresses exactly this problem: the “FM” (fill mode) prefix. The pattern for a month without leading zeros therefore is ‘FMMM’. The full to_timestamp command looks like this:

update mytable
set utc_timestamp = to_timestamp(utc_date||' '||utc_time,'YYYY/FMMM/FMDD FMHH24:FMMI:FMSS' )

Do you need a random sample of features in a Postgres table? Here is an example of how to select 1,000 random features from a table:

SELECT * FROM myTable
WHERE attribute = 'myValue'
ORDER BY random()
LIMIT 1000;

Besides many other interesting topics, Opengeo’s PostGIS tutorial discusses “Tuning PostgreSQL for Spatial”.

The following values are recommended for production environments:

  • shared_buffers: 75 % of database memory (500 MB)
  • work_mem: 16 MB
  • maintenance_work_mem: 128 MB
  • wal_buffers: 1 MB
  • checkpoint_segments: 6
  • random_page_cost: 2.0
  • seq_page_cost: 1.0

All of these configuration parameters can edited in the database configuration file, C:\Documents and Settings\%USER\.opengeo\pgdata\%USER. This is a regular text file and can be edited using Notepad or any other text editor. An easier way of editing this configuration is by using the built-in “Backend Configuration Editor”. In pgAdmin, go to File > Open postgresql.conf…. It will ask for the location of the file, and navigate to C:\Documents and Settings\%USER\.opengeo\pgdata\%USER.

The changes will not take effect until the server is restarted.

Maybe this is a bit off-topic, but I just spent quite some time on this and I need to write it down so I can look it up again later :)

These are instructions for Ubuntu running Postgres 8.4. By default, Postgres ships without PL/Python so we need to get it first:

sudo apt-get install postgresql-plpython-8.4

Next, we need to create the language for our database. I’m using PgAdmin3. From there, I ran:

CREATE PROCEDURAL LANGUAGE 'plpython' HANDLER plpython_call_handler;

This should have been it. Let’s try with a simple function:

CREATE FUNCTION replace_e_to_a(text) RETURNS text AS
'
import re
Text1 = re.sub(''e'', ''a'',args[0])
return Text1
'
LANGUAGE 'plpython';

SELECT replace_e_to_a('Meee');

… should return ‘Maaa’.

Now for the juicy part: Let’s create an INSERT trigger function!

First, let’s have a look at the corresponding table structure. We have two tables “user_data” and “user_keywords”. “user_data” is the table that’s being filled with information from external functions. “user_keywords” has to be kept up-to-date. It is supposed to count the appearance of keywords on a per-user base.

user_data                                   user_keywords
user_id, event_id, keywords                 user_id, keyword,   count
1,       1,        'music,rock'             1,       'music',   2
1,       2,        'music,classic'          1,       'rock',    1
                                            1,       'classic', 1

First, the keyword list has to be split. Then a row has to be inserted for new keywords (compare insert_plan) and the counter has to be increased for existing keywords (update_plan).

The values that are about to be inserted can be accessed via TD[“new”][“column_name”].

CREATE FUNCTION update_keyword_count() RETURNS trigger AS '

keywords = TD["new"]["keywords"]
user = TD["new"]["user_id"]

insert_plan = plpy.prepare("INSERT INTO user_keywords (keyword, count, user_id) VALUES ($1, $2, $3)", ["text", "int", "int"])

update_plan = plpy.prepare("UPDATE user_keywords SET count = $3 WHERE user_id = $1 AND keyword = $2", ["int", "text", "int"])

for keyword in keywords.split(","):
  select_cnt_rows = plpy.prepare("SELECT count(*) AS cnt FROM user_keywords WHERE user_id = $1 AND keyword = $2", ["int", "text"])
  cnt_rows = plpy.execute(select_cnt_rows, [user, keyword])

  select_plan = plpy.prepare("SELECT count AS cnt FROM user_keywords WHERE user_id = $1 AND keyword = $2", ["int", "text"])
  results = plpy.execute(select_plan, [user, keyword])

  if cnt_rows[0]["cnt"] == 0:
   rv = plpy.execute(insert_plan, [keyword, 1, user])
  else:
   rv = plpy.execute(update_plan, [user, keyword, results[0]["cnt"]+1])

' LANGUAGE plpython;

Now, we need to wire it up by defining the trigger:

CREATE TRIGGER update_keywords
BEFORE INSERT ON user_data
FORE EACH ROW
EXECUTE PROCEDURE update_keyword_count();

… Wasn’t that bad ;)

%d bloggers like this: