Advertisements

Archive

Author Archives: underdark

In short: both writing trajectory queries as well as executing them is considerably faster using PostGIS trajectories (as LinestringM) rather than the commonly used point-based approach.

Here are a couple of examples to give you an impression of the differences.

Spoiler alert! Trajectory queries are up to 500 times faster than comparable point-based queries.

Length

First, let’s see how to determine trajectory length for all observed moving objects (identified by a tracker id).

Using the point-based approach, we first need to ensure that the points are in the correct temporal order, create the lines, and finally sum up their length:

WITH ordered AS (
 SELECT trajectory_id, tracker, t, pt
 FROM geolife.trajectory_pt
 ORDER BY t
), tmp AS (
 SELECT trajectory_id, tracker, st_makeline(pt) traj
 FROM ordered 
 GROUP BY trajectory_id, tracker
)
SELECT tracker, round(sum(ST_Length(traj::geography)))
FROM tmp
GROUP BY tracker 
ORDER BY tracker

With trajectories, we can go right to computing lengths:

SELECT tracker, round(sum(ST_Length(track::geography)))
FROM geolife.trajectory_ext
GROUP BY tracker
ORDER BY tracker

On my test system, the trajectory query run time is 22.7 sec instead of 43.0 sec for the point-based approach:

Duration

Compared to trajectory length, duration is less complicated in the point-based approach:

WITH tmp AS (
 SELECT trajectory_id, tracker, min(t) start_time, max(t) end_time
 FROM geolife.trajectory_pt
 GROUP BY trajectory_id, tracker
)
SELECT tracker, sum(end_time - start_time)
FROM tmp
GROUP BY tracker
ORDER BY tracker

Still, the trajectory query is less complex and much faster at 31 ms instead of 6.0 sec:

SELECT tracker, sum(upper(time_range) - lower(time_range))
FROM geolife.trajectory_ext
GROUP BY tracker
ORDER BY tracker

Temporal filter

Extracting trajectories that occurred during a certain time frame is another common use case:

WITH tmp AS (
 SELECT trajectory_id, tracker, min(t) start_time, max(t) end_time
 FROM geolife.trajectory_pt
 GROUP BY trajectory_id, tracker
)
SELECT trajectory_id, tracker, start_time, end_time
FROM tmp
WHERE end_time > '2008-11-26 11:00'
AND start_time < '2008-11-26 15:00'
ORDER BY tracker

This point-based query takes 6.0 sec while the shorter trajectory query finishes in 12 ms:

SELECT id, tracker, time_range
FROM geolife.trajectory_ext
WHERE time_range && '[2008-11-26 11:00+1,2008-11-26 15:00+01]'::tstzrange

or equally fast (12 ms) by making use of the n-dimensional index:

WHERE track &&&	ST_Collect(
 ST_MakePointM(-180, -90, extract(epoch from '2008-11-26 11:00'::timestamptz)),
 ST_MakePointM(180, 90, extract(epoch from '2008-11-26 15:00'::timestamptz))
)

Spatial filter

Finally, of course, let’s have a look at spatial filters, for example, trajectories that start in a certain area:

WITH my AS ( 
 SELECT ST_Buffer(ST_SetSRID(ST_MakePoint(116.31894,39.97472),4326),0.0005) areaA
), tmp AS (
 SELECT trajectory_id, tracker, min(t) t 
 FROM geolife.trajectory_pt
 GROUP BY trajectory_id, tracker
)
SELECT distinct traj.tracker, traj.trajectory_id 
FROM tmp
JOIN geolife.trajectory_pt traj
ON tmp.trajectory_id = traj.trajectory_id AND traj.t = tmp.t
JOIN my
ON ST_Within(traj.pt, my.areaA)

This point-based query takes 6.0 sec while the shorter trajectory query finishes in 488 ms:

WITH my AS ( 
 SELECT ST_Buffer(ST_SetSRID(ST_MakePoint(116.31894, 39.97472),4326),0.0005) areaA
)
SELECT id, tracker, ST_AsText(track)
FROM geolife.trajectory_ext
JOIN my
ON areaA && track
AND ST_Within(ST_StartPoint(track), areaA)

For more generic “does this trajectory intersect another geometry”, the points can also be aggregated to a linestring on the fly but that takes 21.9 sec:

I’ll be presenting more work on PostGIS trajectories at GI_Forum in Salzburg in July. In the talk, I’ll also have a look at the custom PG-Trajectory datatype.


Read more:

Advertisements

Remember the good old times when all parameters in Processing were mandatory?

Inputs and outputs are fixed, and optional parameters or outputs are not supported. [Graser & Olaya, 2015]

Since QGIS 2.14, this is no longer the case. Scripts, as well as models, can now have optional parameters. Here is how for QGIS 3:

When defining a Processing script parameter, the parameter’s constructor takes a boolean flag indicating whether the parameter should be optional. It’s false by default:

class qgis.core.QgsProcessingParameterNumber(
   name: str, description: str = '', 
   type: QgsProcessingParameterNumber.Type = QgsProcessingParameterNumber.Integer, 
   defaultValue: Any = None, 
   optional: bool = False,
   minValue: float = -DBL_MAX+1, maxValue: float = DBL_MAX)

(Source: http://python.qgis.org/api/core/Processing/QgsProcessingParameterNumber.html)

One standard tool that uses optional parameters is Add autoincremental field:

From Python, this algorithm can be called with or without the optional parameters:

When building a model, an optional input can be assigned to the optional parameter. To create an optional input, make sure to deactivate the mandatory checkbox at the bottom of the input parameter definition:

Then this optional input can be used in an algorithm. For example, here the numerical input optional_value is passed to the Start values at parameter:

You can get access to all available inputs by clicking the … button next to the Start values at field. In this example, I have access to values of the input layer as well as  the optional value:

Once this is set up, this is how it looks when the model is run:

You can see that the optional value is indeed Not set.

References

Graser, A., & Olaya, V. (2015). Processing: A Python Framework for the Seamless Integration of Geoprocessing Tools in QGIS. ISPRS Int. J. Geo-Inf. 2015, 4, 2219-2245. doi:10.3390/ijgi4042219.

Processing has been overhauled significantly for QGIS 3.0. Besides speed-ups, one of the most obvious changes is the way to write Processing scripts. Instead of the old Processing-specific syntax, Processing scripts for QGIS3 are purely pythonic implementations of QgsProcessingAlgorithm.

Here’s a template that you can use to develop your own algorithms:

from qgis.PyQt.QtCore import QCoreApplication, QVariant
from qgis.core import (QgsField, QgsFeature, QgsFeatureSink, QgsFeatureRequest, QgsProcessing, QgsProcessingAlgorithm, QgsProcessingParameterFeatureSource, QgsProcessingParameterFeatureSink)
                      
class ExAlgo(QgsProcessingAlgorithm):
    INPUT = 'INPUT'
    OUTPUT = 'OUTPUT'

    def __init__(self):
        super().__init__()

    def name(self):
        return "exalgo"
    
    def tr(self, text):
        return QCoreApplication.translate("exalgo", text)
        
    def displayName(self):
        return self.tr("Example script")

    def group(self):
        return self.tr("Examples")

    def groupId(self):
        return "examples"

    def shortHelpString(self):
        return self.tr("Example script without logic")

    def helpUrl(self):
        return "https://qgis.org"
        
    def createInstance(self):
        return type(self)()
  
    def initAlgorithm(self, config=None):
        self.addParameter(QgsProcessingParameterFeatureSource(
            self.INPUT,
            self.tr("Input layer"),
            [QgsProcessing.TypeVectorAnyGeometry]))
        self.addParameter(QgsProcessingParameterFeatureSink(
            self.OUTPUT,
            self.tr("Output layer"),
            QgsProcessing.TypeVectorAnyGeometry))

    def processAlgorithm(self, parameters, context, feedback):
        source = self.parameterAsSource(parameters, self.INPUT, context)
        (sink, dest_id) = self.parameterAsSink(parameters, self.OUTPUT, context,
                                               source.fields(), source.wkbType(), source.sourceCrs())

        features = source.getFeatures(QgsFeatureRequest())
        for feat in features:
            out_feat = QgsFeature()
            out_feat.setGeometry(feat.geometry())
            out_feat.setAttributes(feat.attributes())
            sink.addFeature(out_feat, QgsFeatureSink.FastInsert)

        return {self.OUTPUT: dest_id}

This script just copies the features of the input layer to the output layer without any modifications. Add your logic to the processAlgorithm() function to get started.

Use Create New Script from the Toolbox toolbar:

Paste the example script:

Once saved, the script will show up in the Processing toolbox:

Joining polygon attributes to points based on their location is a very common GIS task. In QGIS 2, QGIS’ own implementation of “Join attributes by location” was much slower than SAGA’s “Add polygon attributes to points”. Thus, installations without SAGA were out of good options.

Luckily this issue (and many more) has been fixed by the rewrite of many geoprocessing algorithms for QGIS 3! Let’s revisit the comparison:

I’m using publicly available datasets from Naturalearth: The small scale populated places (243 points) and the large scale countries (255 polygons with many nodes). Turns out that QGIS 3’s built-in tool takes a little less than two seconds while the SAGA Processing tool requires a litte less than six seconds:

Like in the previous comparison, times were measured using the Python Console:

In both tools, only the countries’ SOVEREIGNT attribute is joined to the point attribute table:

import processing
t0 = datetime.datetime.now()
print("QGIS Join attributes by location ...")
processing.runAndLoadResults(
   "qgis:joinattributesbylocation", 
   {'INPUT':'E:/Geodata/NaturalEarth/vector_v4/natural_earth_vector/110m_cultural/ne_110m_populated_places.shp',
   'JOIN':'E:/Geodata/NaturalEarth/vector_v4/natural_earth_vector/10m_cultural/ne_10m_admin_0_countries.shp',
   'PREDICATE':[5],'JOIN_FIELDS':['SOVEREIGNT'],
   'METHOD':0,'DISCARD_NONMATCHING':False,'OUTPUT':'memory:'})
t1 = datetime.datetime.now()
print("Runtime: "+str(t1-t0))
print("SAGA Add polygon attributers to points ...")
processing.runAndLoadResults("saga:addpolygonattributestopoints", 
   {'INPUT':'E:/Geodata/NaturalEarth/vector_v4/natural_earth_vector/110m_cultural/ne_110m_populated_places.shp',
   'POLYGONS':'E:/Geodata/NaturalEarth/vector_v4/natural_earth_vector/10m_cultural/ne_10m_admin_0_countries.shp',
   'FIELDS':'SOVEREIGNT','OUTPUT':'C:/Users/anita/AppData/Local/Temp/processing_8b1bbde78de5490285dd530e115cca52/099660d88bf14c54a853cc230e388e55/OUTPUT.shp'})
t2 = datetime.datetime.now()
print("Runtime: "+str(t2-t1))

It is worth noting that it takes longer if more attributes are to be joined to the point layer attribute table. For example, if the JOIN_FIELDS parameter is empty:

'JOIN_FIELDS':[]

instead of

'JOIN_FIELDS':['SOVEREIGNT']

then the the Join attributes by location takes almost 16 seconds. (The country layer contains 71 attributes after all.)

(The SAGA tool currently allows only joining one attribute at a time.)

The release of 3.0 is really close now. If you want to know what’s new or are just looking for interesting ways to pass the time until the packages land, check out the following QGIS3 resources.

For users

For more recordings from the developer meeting in Madeira check my Youtube playlist.

For developers

If you have further reading recommendations, please post them in the comments below.

 

TimeManager 2.5 is quite likely going to be the final TimeManager release for the QGIS 2 series. It comes with a couple of bug fixes and enhancements:

  • Fixed #245: updated help.htm
  • Fixed #240: now hiding unmanageable WFS layers
  • Fixed #220: fixed issues with label size
  • Fixed #194: now exposing additional functions: animation_time_frame_size, animation_time_frame_type, animation_start_datetime, animation_end_datetime

Besides updating the help, I also decided to display it more prominently in the settings dialog (similarly to how the help is displayed in the field calculator or in Processing):

So far, I haven’t started porting to QGIS 3 yet. If you are interested in TimeManager and want to help, please get in touch.

On this note, let me leave you with a couple of animation inspirations from the Twitterverse:

I’ll start with some tech talk first. Feel free to jump to the usage example further down if you are here for the edge bundling plugin.

As you certainly know, QGIS 3 brings a lot of improvements and under-the-hood changes. One of those changes affects all Python scripts. They need to be updated to Python 3 and the new PyQGIS API. (See the official migration guide for details.)

To get ready for the big 3.0 release, I’ve started porting my Processing tools. The edge bundling script is my first candidate for porting to QGIS 3. I also wanted to use this opportunity to “upgrade” from a simple script to a plugin that integrates into Processing.

I used Alexander Bruy’s “prepair for Processing” plugin as a template but you can also find an example template in your Processing folder. (On my system, it is located in C:\OSGeo4W64\apps\qgis-dev\python\plugins\processing\algs\exampleprovider.)

Since I didn’t want to miss the advantages of a good IDE, I set up PyCharm as described by Heikki Vesanto. This will give you code completion for Python 3 and PyQGIS which is very helpful for refactoring and porting. (I also tried Eclipse with PyDev but if you don’t have a favorite IDE yet, I find PyCharm easier to install and configure.)

My PyCharm startup script qgis3_pycharm.bat is a copy of C:\OSGeo4W64\bin\python-qgis-dev.bat with the last line altered to start PyCharm:

@echo off
call "%~dp0\o4w_env.bat"
call qt5_env.bat
call py3_env.bat
@echo off<span data-mce-type="bookmark" style="display: inline-block; width: 0px; overflow: hidden; line-height: 0;" class="mce_SELRES_start"></span>
path %OSGEO4W_ROOT%\apps\qgis-dev\bin;%PATH%
set QGIS_PREFIX_PATH=%OSGEO4W_ROOT:\=/%/apps/qgis-dev
set GDAL_FILENAME_IS_UTF8=YES
rem Set VSI cache to be used as buffer, see #6448
set VSI_CACHE=TRUE
set VSI_CACHE_SIZE=1000000
set QT_PLUGIN_PATH=%OSGEO4W_ROOT%\apps\qgis-dev\qtplugins;%OSGEO4W_ROOT%\apps\qt5\plugins
set PYTHONPATH=%OSGEO4W_ROOT%\apps\qgis-dev\python;%PYTHONPATH%
start /d "C:\Program Files\JetBrains\PyCharm\bin\" pycharm64.exe

In PyCharm File | Settings, I configured the OSGeo4W Python 3.6 interpreter and added qgis-dev and the plugin folder to its path:

With this setup done, we can go back to the code.

I first resolved all occurrences of import * in my script to follow good coding practices. For example:

from qgis.core import *

became

from qgis.core import QgsFeature, QgsPoint, QgsVector, QgsGeometry, QgsField, QGis<span data-mce-type="bookmark" style="display: inline-block; width: 0px; overflow: hidden; line-height: 0;" class="mce_SELRES_start"></span>

in this PR.

I didn’t even run the 2to3 script that is provided to make porting from Python 2 to Python 3 easier. Since the edge bundling code is mostly Numpy, there were almost no changes necessary. The only head scratching moment was when Numpy refused to add a map() return value to an array. So (with the help of Stackoverflow of course) I added a work around to convert the map() return value to an array as well:

flocal_x = map(forcecalcx, subtr_x, subtr_y, distance)
electrostaticforces_x[e_idx, :] += np.array(list(flocal_x))

The biggest change related to Processing is that the VectorWriter has been replaced by a QgsFeatureSink. It’s defined as a parameter of the edgebundling QgsProcessingAlgorithm:

self.addParameter(QgsProcessingParameterFeatureSink(
   self.OUTPUT,
   self.tr("Bundled edges"),
   QgsProcessing.TypeVectorLine)
)

And when the algorithm is run, the sink is filled with the output features:

(sink, dest_id) = self.parameterAsSink(
   parameters, self.OUTPUT, context,
   source.fields(), source.wkbType(), source.sourceCrs()
)

# code that creates features

sink.addFeature(feat, QgsFeatureSink.FastInsert)

The ported plugin is available on Github.

The edge bundling plugin in action

I haven’t uploaded the plugin to the official plugin repository yet, but you can already download if from Github and give it a try:

For this example, I’m using taxi pick-up and drop-off data provided by the NYC Taxi & Limousine Commission. I downloaded the January 2017 green taxi data and extracted all trips for the 1st of January. Then I created origin-destination (OD) lines using the QGIS virtual layer feature:

To get an interesting subset of the data, I extracted only those OD flows that cross the East River and have a count of at least 5 taxis:

Now the data is ready for bundling.

If you have installed the edge bundling plugin, the force-directed edge bundling algorithm should be available in the Processing toolbox. The UI of the edge bundling algorithm looks pretty much the same as it did for the QGIS 2 Processing script:

Since this is a small dataset with only 148 OD flows, the edge bundling processes is pretty quick and we can explore the results:

Beyond this core edge bundling algorithm, the repository also contains two more scripts that still need to be ported. They include dependencies on sklearn, so it will be interesting to see how straightforward it is to convert them.

%d bloggers like this: