Advertisements

Archive

QGIS

In Movement data in GIS #16, I presented a new way to deal with trajectory data using GeoPandas and how to load the trajectory GeoDataframes as a QGIS layer. Following up on this initial experiment, I’ve now implemented a first version of an algorithm that performs a spatial analysis on my GeoPandas trajectories.

The first spatial analysis algorithm I’ve implemented is Clip trajectories by extent. Implementing this algorithm revealed a couple of pitfalls:

  • To achieve correct results, we need to compute spatial intersections between linear trajectory segments and the extent. Therefore, we need to convert our point GeoDataframe to a line GeoDataframe.
  • Based on the spatial intersection, we need to take care of computing the corresponding timestamps of the events when trajectories enter or leave the extent.
  • A trajectory can intersect the extent multiple times. Therefore, we cannot simply use the global minimum and maximum timestamp of intersecting segments.
  • GeoPandas provides spatial intersection functionality but if the trajectory contains consecutive rows without location change, these will result in zero length lines and those cause an empty intersection result.

So far, the clip result only contains the trajectory id plus a suffix indicating the sequence of the intersection segments for a specific trajectory (because one trajectory can intersect the extent multiple times). The following screenshot shows one highlighted trajectory that intersects the extent three times and the resulting clipped trajectories:

This algorithm together with the basic trajectory from points algorithm is now available in a Processing algorithm provider plugin called Processing Trajectory.

Note: This plugin depends on GeoPandas.

Note for Windows users: GeoPandas is not a standard package that is available in OSGeo4W, so you’ll have to install it manually. (For the necessary steps, see this answer on gis.stackexchange.com)

The implemented tests show how to use the Trajectory class independently of QGIS. So far, I’m only testing the spatial properties though:

def test_two_intersections_with_same_polygon(self):
    polygon = Polygon([(5,-5),(7,-5),(7,12),(5,12),(5,-5)])
    data = [{'id':1, 'geometry':Point(0,0), 't':datetime(2018,1,1,12,0,0)},
        {'id':1, 'geometry':Point(6,0), 't':datetime(2018,1,1,12,10,0)},
        {'id':1, 'geometry':Point(10,0), 't':datetime(2018,1,1,12,15,0)},
        {'id':1, 'geometry':Point(10,10), 't':datetime(2018,1,1,12,30,0)},
        {'id':1, 'geometry':Point(0,10), 't':datetime(2018,1,1,13,0,0)}]
    df = pd.DataFrame(data).set_index('t')
    geo_df = GeoDataFrame(df, crs={'init': '31256'})
    traj = Trajectory(1, geo_df)
    intersections = traj.intersection(polygon)
    result = []
    for x in intersections:
        result.append(x.to_linestring())
    expected_result = [LineString([(5,0),(6,0),(7,0)]), LineString([(7,10),(5,10)])]
    self.assertEqual(result, expected_result) 

One issue with implementing the algorithms as QGIS Processing tools in this way is that the tools are independent of one another. That means that each tool has to repeat the expensive step of creating the trajectory objects in memory. I’m not sure this can be solved.

Advertisements

Bugfix release 3.0.2 fixes an issue where “accumulate features” was broken for timestamps with milliseconds.

If you like TimeManager, know your way around setting up Travis for testing QGIS plugins, and want to help improve TimeManager stability, please get in touch!

Many of my previous posts in this series [1][2][3] have relied on PostGIS for trajectory data handling. While I love PostGIS, it feels like overkill to require a database to analyze smaller movement datasets. Wouldn’t it be great to have a pure Python solution?

If we look into moving object data literature, beyond the “trajectories are points with timestamps” perspective, which is common in GIS, we also encounter the “trajectories are time series with coordinates” perspective. I don’t know about you, but if I hear “time series” and Python, I think Pandas! In the Python Data Science Handbook, Jake VanderPlas writes:

Pandas was developed in the context of financial modeling, so as you might expect, it contains a fairly extensive set of tools for working with dates, times, and time-indexed data.

Of course, time series are one thing, but spatial data handling is another. Lucky for us, this is where GeoPandas comes in. GeoPandas has been around for a while and version 0.4 has been released in June 2018. So far, I haven’t found examples that use GeoPandas to manage movement data, so I’ve set out to give it a shot. My trajectory class uses a GeoDataFrame df for data storage. For visualization purposes, it can be converted to a LineString:

import pandas as pd 
from geopandas import GeoDataFrame
from shapely.geometry import Point, LineString

class Trajectory():
    def __init__(self, id, df, id_col):
        self.id = id
        self.df = df    
        self.id_col = id_col
        
    def __str__(self):
        return "Trajectory {1} ({2} to {3}) | Size: {0}".format(
            self.df.geometry.count(), self.id, self.get_start_time(), 
            self.get_end_time())
        
    def get_start_time(self):
        return self.df.index.min()
        
    def get_end_time(self):
        return self.df.index.max()
        
    def to_linestring(self):
        return self.make_line(self.df)
        
    def make_line(self, df):
        if df.size > 1:
            return df.groupby(self.id_col)['geometry'].apply(
                lambda x: LineString(x.tolist())).values[0]
        else:
            raise RuntimeError('Dataframe needs at least two points to make line!')

    def get_position_at(self, t):
        try:
            return self.df.loc[t]['geometry'][0]
        except:
            return self.df.iloc[self.df.index.drop_duplicates().get_loc(
                t, method='nearest')]['geometry']

Of course, this class can be used in stand-alone Python scripts, but it can also be used in QGIS. The following script takes data from a QGIS point layer, creates a GeoDataFrame, and finally generates trajectories. These trajectories can then be added to the map as a line layer.

All we need to do to ensure that our data is ordered by time is to set the GeoDataFrame’s index to the time field. From then on, Pandas takes care of the time series aspects and we can access the index as shown in the Trajectory.get_position_at() function above.

# Get data from a point layer
l = iface.activeLayer()
time_field_name = 't'
trajectory_id_field = 'trajectory_id' 
names = [field.name() for field in l.fields()]
data = []
for feature in l.getFeatures():
    my_dict = {}
    for i, a in enumerate(feature.attributes()):
        my_dict[names[i]] = a
    x = feature.geometry().asPoint().x()
    y = feature.geometry().asPoint().y()
    my_dict['geometry']=Point((x,y))
    data.append(my_dict)

# Create a GeoDataFrame
df = pd.DataFrame(data).set_index(time_field_name)
crs = {'init': l.crs().geographicCrsAuthId()} 
geo_df = GeoDataFrame(df, crs=crs)
print(geo_df)

# Test if spatial functions work
print(geo_df.dissolve([True]*len(geo_df)).centroid)

# Create a QGIS layer for trajectory lines
vl = QgsVectorLayer("LineString", "trajectories", "memory")
vl.setCrs(l.crs()) # doesn't stop popup :(
pr = vl.dataProvider()
pr.addAttributes([QgsField("id", QVariant.String)])
vl.updateFields() 

df_by_id = dict(tuple(geo_df.groupby(trajectory_id_field)))
trajectories = {}
for key, value in df_by_id.items():
    traj = Trajectory(key, value, trajectory_id_field)
    trajectories[key] = traj
    line = QgsGeometry.fromWkt(traj.to_linestring().wkt)
    f = QgsFeature()
    f.setGeometry(line)
    f.setAttributes([key])
    pr.addFeature(f) 
print(trajectories[1])

vl.updateExtents() 
QgsProject.instance().addMapLayer(vl)

The following screenshot shows this script applied to a sample of the Geolife datasets containing 100 trajectories with a total of 236,776 points. On my notebook, the runtime is approx. 20 seconds.

So far, GeoPandas has proven to be a convenient way to handle time series with coordinates. Trying to implement some trajectory analysis tools will show if it is indeed a promising data structure for trajectories.

If you follow me on Twitter, you have probably already heard that the ebook of “QGIS Map Design 2nd Edition” has now been published and we are expecting the print version to be up for sale later this month. Gretchen Peterson and I – together with our editor Gary Sherman (yes, that Gary Sherman!) – have been working hard to provide you with tons of new and improved map design workflows and many many completely new maps. By Gretchen’s count, this edition contains 23 new maps, so it’s very hard to pick a favorite!

Like the 1st edition, we provide increasingly advanced recipes in three chapters, each focusing on either layer styling, labeling, or creating print layouts. If I had to pick a favorite, I’d have to go with “Mastering Rotated Maps”, one of the advanced recipes in the print layouts chapter. It looks deceptively simple but it combines a variety of great QGIS features and clever ideas to design a map that provides information on multiple levels of detail. Besides the name inspiring rotated map items, this design combines

  • map overviews
  • map themes
  • graduated lines and polygons
  • a rotated north arrow
  • fancy leader lines

all in one:

“QGIS Map Design 2nd Edition” provides how-to instructions, as well as data and project files for each recipe. So you can jump right into it and work with the provided materials or apply the techniques to your own data.

The ebook is available at LocatePress.

If you’re are following me on Twitter, you’ve certainly already read that I’m working on PyQGIS 101 a tutorial to help GIS users to get started with Python programming for QGIS.

I’ve often been asked to recommend Python tutorials for beginners and I’ve been surprised how difficult it can be to find an engaging tutorial for Python 3 that does not assume that the reader already knows all kinds of programming concepts.

It’s been a while since I started programming, but I do teach QGIS and Python programming for QGIS to university students and therefore have some ideas of which concepts are challenging. Nonetheless, it’s well possible that I overlook something that is not self explanatory. If you’re using PyQGIS 101 and find that some points could use further explanations, please leave a comment on the corresponding page.

PyQGIS 101 is a work in progress. I’d appreciate any feedback, particularly from beginners!

In Movement data in GIS #2: visualization I mentioned that it should be possible to label trajectory segments without having to break the original trajectory feature. While it’s not a straightforward process, it is indeed possible to create timestamp labels at desired intervals:

The main point here is that we cannot use regular labels because there would be only one label for the whole trajectory feature. Instead, we are using a marker line with a font marker:

By default, font markers only display one character from a given font but by using expressions we can make it display longer text, including datetime strings:

If you want to have a label at every node of the trajectory, the expression looks like this:

format_date( 
   to_datetime('1970-01-01T00:00:00Z')+to_interval(
      m(start_point(geometry_n(
         segments_to_lines( $geometry ),
         @geometry_part_num)
      ))||' seconds'
   ),
   'HH:mm:ss'
)

You probably remember those parts of the expression that extract the m value from previous posts. Note that – compared to 2016 – it is now necessary to add the segments_to_lines() function.

The m value (which stores time as seconds since Unix epoch) is then converted to datetime and finally formatted to only show time. Of course you can edit the datetime format string to also include the date.

If we only want a label every 30 seconds, we can add a case statement around that:

CASE WHEN 
m(start_point(geometry_n(
   segments_to_lines( $geometry ),
   @geometry_part_num)
)) % 30 = 0
THEN
format_date( 
   to_datetime('1970-01-01T00:00:00Z')+to_interval(
      m(start_point(geometry_n(
         segments_to_lines( $geometry ),
         @geometry_part_num)
      ))||' seconds'
   ),
   'HH:mm:ss'
)
END

This works well if the trajectory sampling interval is fairly regular. This is not always the case and that means that the above case statement wouldn’t find many nodes with a timestamp that ends in :30 or :00. In such a case, we could resort to labeling nodes based on their order in the linestring:

CASE WHEN 
 @geometry_part_num  % 30 = 0
THEN
...

Thanks a lot to @JuergenEFischer for providing a solution for converting seconds since Unix epoch to datetime without a custom function!

Note that expressions using @geometry_part_num currently suffer from the following issue: Combination of segments_to_lines($geometry) and @geometry_part_num gives wrong segment numbers


This post is part of a series. Read more about movement data in GIS.

Remember the good old times when all parameters in Processing were mandatory?

Inputs and outputs are fixed, and optional parameters or outputs are not supported. [Graser & Olaya, 2015]

Since QGIS 2.14, this is no longer the case. Scripts, as well as models, can now have optional parameters. Here is how for QGIS 3:

When defining a Processing script parameter, the parameter’s constructor takes a boolean flag indicating whether the parameter should be optional. It’s false by default:

class qgis.core.QgsProcessingParameterNumber(
   name: str, description: str = '', 
   type: QgsProcessingParameterNumber.Type = QgsProcessingParameterNumber.Integer, 
   defaultValue: Any = None, 
   optional: bool = False,
   minValue: float = -DBL_MAX+1, maxValue: float = DBL_MAX)

(Source: http://python.qgis.org/api/core/Processing/QgsProcessingParameterNumber.html)

One standard tool that uses optional parameters is Add autoincremental field:

From Python, this algorithm can be called with or without the optional parameters:

When building a model, an optional input can be assigned to the optional parameter. To create an optional input, make sure to deactivate the mandatory checkbox at the bottom of the input parameter definition:

Then this optional input can be used in an algorithm. For example, here the numerical input optional_value is passed to the Start values at parameter:

You can get access to all available inputs by clicking the … button next to the Start values at field. In this example, I have access to values of the input layer as well as  the optional value:

Once this is set up, this is how it looks when the model is run:

You can see that the optional value is indeed Not set.

References

Graser, A., & Olaya, V. (2015). Processing: A Python Framework for the Seamless Integration of Geoprocessing Tools in QGIS. ISPRS Int. J. Geo-Inf. 2015, 4, 2219-2245. doi:10.3390/ijgi4042219.

%d bloggers like this: