Archive

Tag Archives: movement data

It’s my pleasure to share with you that Trajectools 2.0 just landed in the official QGIS Plugin Repository.

This is the first version without the “experimental” flag. If you look at the plugin release history, you will see that the previous release was from 2020. That’s quite a while ago and a lot has happened since, including the development of MovingPandas.

Let’s have a look what’s new!

The old “Trajectories from point layer”, “Add heading to points”, and “Add speed (m/s) to points” algorithms have been superseded by the new “Create trajectories” algorithm which automatically computes speeds and headings when creating the trajectory outputs.

“Day trajectories from point layer” is covered by the new “Split trajectories at time intervals” which supports splitting by hour, day, month, and year.

“Clip trajectories by extent” still exists but, additionally, we can now also “Clip trajectories by polygon layer”

There are two new event extraction algorithms to “Extract OD points” and “Extract OD points”, as well as the related “Split trajectories at stops”. Additionally, we can also “Split trajectories at observation gaps”.

Trajectory outputs, by default, come as a pair of a point layer and a line layer. Depending on your use case, you can use both or pick just one of them. By default, the line layer is styled with a gradient line that makes it easy to see the movement direction:

while the default point layer style shows the movement speed:

How to use Trajectools

Trajectools 2.0 is powered by MovingPandas. You will need to install MovingPandas in your QGIS Python environment. I recommend installing both QGIS and MovingPandas from conda-forge:

(base) conda create -n qgis -c conda-forge python=3.9 
(base) conda activate qgis
(qgis) mamba install -c conda-forge qgis movingpandas

The plugin download includes small trajectory sample datasets so you can get started immediately.

Outlook

There is still some work to do to reach feature parity with MovingPandas. Stay tuned for more trajectory algorithms, including but not limited to down-sampling, smoothing, and outlier cleaning.

I’m also reviewing other existing QGIS plugins to see how they can complement each other. If you know a plugin I should look into, please leave a note in the comments.

The Trajectools toolbox has continued growing:

I’m continuously testing the algorithms integrated so far to see if they work as GIS users would expect and can to ensure that they can be integrated in Processing model seamlessly.

Because naming things is tricky, I’m currently struggling with how to best group the toolbox algorithms into meaningful categories. I looked into the categories mentioned in OGC Moving Features Access but honestly found them kind of lacking:

Andrienko et al.’s book “Visual Analytics of Movement” comes closer to what I’m looking for:

… but I’m not convinced yet. So take the above listed three categories with a grain of salt. Those may change before the release. (Any inputs / feedback / recommendation welcome!)

Let me close this quick status update with a screencast showcasing stop detection in AIS data, featuring the recently added trajectory styling using interpolated lines:

While Trajectools is getting ready for its 2.0 release, you can get the current development version directly from https://github.com/movingpandas/qgis-processing-trajectory.

Trajectools development started back in 2018 but has been on hold since 2020 when I realized that it would be necessary to first develop a solid trajectory analysis library. With the MovingPandas library in place, I’ve now started to reboot Trajectools.

Trajectools v2 builds on MovingPandas and exposes its trajectory analysis algorithms in the QGIS Processing Toolbox. So far, I have integrated the basic steps of

  1. Building trajectories including speed and direction information from timestamped points and
  2. Splitting trajectories at observation gaps, stops, or regular time intervals.

The algorithms create two output layers:

  • Trajectory points with speed and direction information that are styled using arrow markers
  • Trajectories as LineStringMs which makes it straightforward to count the number of trajectories and to visualize where one trajectory ends and another starts.

So far, the default style for the trajectory points is hard-coded to apply the Turbo color ramp on the speed column with values from 0 to 50 (since I’m simply loading a ready-made QML). By default, the speed is calculated as km/h but that can be customized:

I don’t have a solution yet to automatically create a style for the trajectory lines layer. Ideally, the style should be a categorized renderer that assigns random colors based on the trajectory id column. But in this case, it’s not enough to just load a QML.

In the meantime, I might instead include an Interpolated Line style. What do you think?

Of course, the goal is to make Trajectools interoperable with as many existing QGIS Processing Toolbox algorithms as possible to enable efficient Mobility Data Science workflows.

The easiest way to set up QGIS with MovingPandas Python environment is to install both from conda. You can find the instructions together with the latest Trajectools development version at: https://github.com/movingpandas/qgis-processing-trajectory


This post is part of a series. Read more about movement data in GIS.

Today, I want to point out a blog post over at

https://carto.com/blog/analyzing-mobility-hotspots-with-movingpandas

written together with my fellow co-authors and EMERALDS project team members Argyrios Kyrgiazos and Helen McKenzie.

In this blog post, we walk you through a trajectory hotspot analysis using open taxi trajectory data from Kaggle, combining data preparation with MovingPandas (including the new OutlierCleaner illustrated above) and spatiotemporal hotspot analysis from Carto.

Today’s post is a first quick dive into Neo4J (really just getting my toes wet). It’s based on a publicly available Neo4J dump containing mobility data, ship trajectories to be specific. You can find this data and the setup instructions at:

Maryam Maslek ELayam, Cyril Ray, & Christophe Claramunt. (2022). A hierarchical graph-based model for mobility data representation and analysis [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6405212

I was made aware of this work since they cited MovingPandas in their paper in Data & Knowledge Engineering“The implementation combines several open source tools such as Python, MovingPandas library, Uber H3 index, Neo4j graph database management system”

Once set up, this gives us a database with three hierarchical levels:

Neo4j comes with a nice graphical browser that lets us explore the data. We can switch between levels and click on individual node labels to get a quick preview:

Level 2 is a generalization / aggregation of level 1. Expanding the graph of one of the level 2 nodes shows its connection to level 1. For example, the level 2 port node “Audierne” actually refers to two level 1 nodes:

Every “road” level 1 relationship between ports provide information about the ship, its arrival, departure, travel time, and speed. We can see that this two level 1 ports must be pretty close since travel times are only 5 minutes:

Further expanding one of the port level 1 nodes shows its connection to waypoints of level1:

Switching to level 2, we gain access to nodes of type Traj(ectory). Additionally, the road level 2 relationships represent aggregations of the trajectories, for example, here’s a relationship with only one associated trajectory:

There are also some odd relationships, for example, trajectory 43 has two ends and begins relationships and there are also two road relationships referencing this trajectory (with identical information, only differing in their automatic <id>). I’m not yet sure if that is a feature or a bug:

On level 1, we also have access to ship nodes. They are connected to ports and waypoints. However, exploring them visually is challenging. Things look fine at first:

But after a while, once all relationships have loaded, we have it: the MIGHTY BALL OF YARN ™:

I guess this is the point where it becomes necessary to get accustomed to the query language. And no, it’s not SQL, it is Cypher. For example, selecting a specific trajectory with id 0, looks like this:

 MATCH (t1 {traj_id: 0}) RETURN t1

But more on this another time.


This post is part of a series. Read more about movement data in GIS.

This summer, I had the honor to — once again — speak at the OpenGeoHub Summer School. This time, I wanted to challenge the students and myself by not just doing MovingPandas but by introducing both MovingPandas and DVC for Mobility Data Science.

I’ve previously written about DVC and how it may be used to track geoprocessing workflows with QGIS & DVC. In my summer school session, we go into details on how to use DVC to keep track of MovingPandas movement data analytics workflow.

Here is the recording of the session live stream and you can find the materials at https://github.com/movingpandas/movingpandas-examples/blob/opengeohub2023/README.md


This post is part of a series. Read more about movement data in GIS.

Did you know that MovingPandas also supports local image coordinates? Indeed, it does.

In today’s post, we will explore how we can use this feature to analyze bicycle tracks extracted from video footage published by Michael Szell @mszll:

The bicycle trajectory coordinates are stored in two separate lists: xs_640x360 and ys640x360:

This format is kind of similar to the Kaggle Taxi dataset, we worked with in the previous post. However, to use the solution we implemented there, we need to combine the x and y coordinates into nice (x,y) tuples:

df['coordinates'] = df.apply(
    lambda row: list(zip(row['xs_640x360'], row['ys_640x360'])), axis=1)
df.drop(columns=['xs_640x360', 'ys_640x360'], inplace=True)

Afterwards, we can create the points and compute the proper timestamps from the frame numbers:

def compute_datetime(row):
    # some educated guessing going on here: the paper states that the video covers 2021-06-09 07:00-08:00
    d = datetime(2021,6,9,7,0,0) + (row['frame_in'] + row['running_number']) * timedelta(seconds=2)
    return d
def create_point(xy):
    try: 
        return Point(xy)
    except TypeError:  # when there are nan values in the input data
        return None
new_df = df.head().explode('coordinates')
new_df['geometry'] = new_df['coordinates'].apply(create_point)
new_df['running_number'] = new_df.groupby('id').cumcount()
new_df['datetime'] = new_df.apply(compute_datetime, axis=1)
new_df.drop(columns=['coordinates', 'frame_in', 'running_number'], inplace=True)
new_df

Once the points and timestamps are ready, we can create the MovingPandas TrajectoryCollection. Note how we explicitly state that there is no CRS for this dataset (crs=None):

trajs = mpd.TrajectoryCollection(
    gpd.GeoDataFrame(new_df), 
    traj_id_col='id',  t='datetime', crs=None)

Plotting trajectories with image coordinates

Similarly, to plot these trajectories, we should tell hvplot that it should not fetch any background map tiles (’tiles’:None) and that the coordinates are not geographic (‘geo’:False):

If you want to explore the full source code, you can find my Github fork with the Jupyter notebook at: https://github.com/anitagraser/desirelines/blob/main/mpd.ipynb

The repository also contains a camera image of the intersection, which we can use as a background for our trajectory plots:

bg_img = hv.RGB.load_image('img/intersection2.png', bounds=(0,0,640,360)) 

One important caveat is that speed will be calculated in pixels per second. So when we plot the bicycle speed, the segments closer to the camera will appear faster than the segments in the background:

To fix this issue, we would have to correct for the distortions of the camera lens and perspective. I’m sure that there is specialized software for this task but, for the purpose of this post, I’m going to grab the opportunity to finally test out the VectorBender plugin.

Georeferencing the trajectories using QGIS VectorBender plugin

Let’s load the five test trajectories and the camera image to QGIS. To make sure that they align properly, both are set to the same CRS and I’ve created the following basic world file for the camera image:

1
0
0
-1
0
360

Then we can use the VectorBender tools to georeference the trajectories by linking locations from the camera image to locations on aerial images. You can see the whole process in action here:

After around 15 minutes linking control points, VectorBender comes up with the following georeferenced trajectory result:

Not bad for a quick-and-dirty hack. Some points on the borders of the image could not be georeferenced since I wasn’t always able to identify suitable control points at the camera image borders. So it won’t be perfect but should improve speed estimates.


This post is part of a series. Read more about movement data in GIS.

Kaggle’s “Taxi Trajectory Data from ECML/PKDD 15: Taxi Trip Time Prediction (II) Competition” is one of the most used mobility / vehicle trajectory datasets in computer science. However, in contrast to other similar datasets, Kaggle’s taxi trajectories are provided in a format that is not readily usable in MovingPandas since the spatiotemporal information is provided as:

  • TIMESTAMP: (integer) Unix Timestamp (in seconds). It identifies the trip’s start;
  • POLYLINE: (String): It contains a list of GPS coordinates (i.e. WGS84 format) mapped as a string. The beginning and the end of the string are identified with brackets (i.e. [ and ], respectively). Each pair of coordinates is also identified by the same brackets as [LONGITUDE, LATITUDE]. This list contains one pair of coordinates for each 15 seconds of trip. The last list item corresponds to the trip’s destination while the first one represents its start;

Therefore, we need to create a DataFrame with one point + timestamp per row before we can use MovingPandas to create Trajectories and analyze them.

But first things first. Let’s download the dataset:

import datetime
import pandas as pd
import geopandas as gpd
import movingpandas as mpd
import opendatasets as od
from os.path import exists
from shapely.geometry import Point

input_file_path = 'taxi-trajectory/train.csv'

def get_porto_taxi_from_kaggle():
    if not exists(input_file_path):
        od.download("https://www.kaggle.com/datasets/crailtap/taxi-trajectory")

get_porto_taxi_from_kaggle()
df = pd.read_csv(input_file_path, nrows=10, usecols=['TRIP_ID', 'TAXI_ID', 'TIMESTAMP', 'MISSING_DATA', 'POLYLINE'])
df.POLYLINE = df.POLYLINE.apply(eval)  # string to list
df

And now for the remodelling:

def unixtime_to_datetime(unix_time):
    return datetime.datetime.fromtimestamp(unix_time)

def compute_datetime(row):
    unix_time = row['TIMESTAMP']
    offset = row['running_number'] * datetime.timedelta(seconds=15)
    return unixtime_to_datetime(unix_time) + offset

def create_point(xy):
    try: 
        return Point(xy)
    except TypeError:  # when there are nan values in the input data
        return None
 
new_df = df.explode('POLYLINE')
new_df['geometry'] = new_df['POLYLINE'].apply(create_point)
new_df['running_number'] = new_df.groupby('TRIP_ID').cumcount()
new_df['datetime'] = new_df.apply(compute_datetime, axis=1)
new_df.drop(columns=['POLYLINE', 'TIMESTAMP', 'running_number'], inplace=True)
new_df

And that’s it. Now we can create the trajectories:

trajs = mpd.TrajectoryCollection(
    gpd.GeoDataFrame(new_df, crs=4326), 
    traj_id_col='TRIP_ID', obj_id_col='TAXI_ID', t='datetime')
trajs.hvplot(title='Kaggle Taxi Trajectory Data', tiles='CartoLight')

That’s it. Now our MovingPandas.TrajectoryCollection is ready for further analysis.

By the way, the plot above illustrates a new feature in the recent MovingPandas 0.16 release which, among other features, introduced plots with arrow markers that show the movement direction. Other new features include a completely new custom distance, speed, and acceleration unit support. This means that, for example, instead of always getting speed in meters per second, you can now specify your desired output units, including km/h, mph, or nm/h (knots).


This post is part of a series. Read more about movement data in GIS.

In the previous post, we — creatively ;-) — used MobilityDB to visualize stationary IOT sensor measurements.

This post covers the more obvious use case of visualizing trajectories. Thus bringing together the MobilityDB trajectories created in Detecting close encounters using MobilityDB 1.0 and visualization using Temporal Controller.

Like in the previous post, the valueAtTimestamp function does the heavy lifting. This time, we also apply it to the geometry time series column called trip:

SELECT mmsi,
    valueAtTimestamp(trip, '2017-05-07 08:55:40') geom,
    valueAtTimestamp(SOG, '2017-05-07 08:55:40') SOG
FROM "public"."ships"

Using this SQL query, we again set up a — not yet Temporal Controller-controlled — QueryLayer.

To configure Temporal Controller to update the timestamp in our SQL query, we again need to run the Python script from the previous post.

With this done, we are all set up to animate and explore the movement patterns in our dataset:


This post is part of a series. Read more about movement data in GIS.

It’s been a while since we last talked about MobilityDB in 2019 and 2020. Since then, the project has come a long way. It joined OSGeo as a community project and formed a first PSC, including the project founders Mahmoud Sakr and Esteban Zimányi as well as Vicky Vergara (of pgRouting fame) and yours truly.

This post is a quick teaser tutorial from zero to computing closest points of approach (CPAs) between trajectories using MobilityDB.

Setting up MobilityDB with Docker

The easiest way to get started with MobilityDB is to use the ready-made Docker container provided by the project. I’m using Docker and WSL (Windows Subsystem Linux on Windows 10) here. Installing WLS/Docker is out of scope of this post. Please refer to the official documentation for your operating system.

Once Docker is ready, we can pull the official container and fire it up:

docker pull mobilitydb/mobilitydb
docker volume create mobilitydb_data
docker run --name "mobilitydb" -d -p 25432:5432 -v mobilitydb_data:/var/lib/postgresql mobilitydb/mobilitydb
psql -h localhost -p 25432 -d mobilitydb -U docker

Currently, the container provides PostGIS 3.2 and MobilityDB 1.0:

Loading movement data into MobilityDB

Once the container is running, we can already connect to it from QGIS. This is my preferred way to load data into MobilityDB because we can simply drag-and-drop any timestamped point layer into the database:

For this post, I’m using an AIS data sample in the region of Gothenburg, Sweden.

After loading this data into a new table called ais, it is necessary to remove duplicate and convert timestamps:

CREATE TABLE AISInputFiltered AS
SELECT DISTINCT ON("MMSI","Timestamp") *
FROM ais;

ALTER TABLE AISInputFiltered ADD COLUMN t timestamp;
UPDATE AISInputFiltered SET t = "Timestamp"::timestamp;

Afterwards, we can create the MobilityDB trajectories:

CREATE TABLE Ships AS
SELECT "MMSI" mmsi,
tgeompoint_seq(array_agg(tgeompoint_inst(Geom, t) ORDER BY t)) AS Trip,
tfloat_seq(array_agg(tfloat_inst("SOG", t) ORDER BY t) FILTER (WHERE "SOG" IS NOT NULL) ) AS SOG,
tfloat_seq(array_agg(tfloat_inst("COG", t) ORDER BY t) FILTER (WHERE "COG" IS NOT NULL) ) AS COG
FROM AISInputFiltered
GROUP BY "MMSI";

ALTER TABLE Ships ADD COLUMN Traj geometry;
UPDATE Ships SET Traj = trajectory(Trip);

Once this is done, we can load the resulting Ships layer and the trajectories will be loaded as lines:

Computing closest points of approach

To compute the closest point of approach between two moving objects, MobilityDB provides a shortestLine function. To be correct, this function computes the line connecting the nearest approach point between the two tgeompoint_seq. In addition, we can use the time-weighted average function twavg to compute representative average movement speeds and eliminate stationary or very slowly moving objects:

SELECT S1.MMSI mmsi1, S2.MMSI mmsi2, 
       shortestLine(S1.trip, S2.trip) Approach,
       ST_Length(shortestLine(S1.trip, S2.trip)) distance
FROM Ships S1, Ships S2
WHERE S1.MMSI > S2.MMSI AND
twavg(S1.SOG) > 1 AND twavg(S2.SOG) > 1 AND
dwithin(S1.trip, S2.trip, 0.003)

In the QGIS Browser panel, we can right-click the MobilityDB connection to bring up an SQL input using Execute SQL:

The resulting query layer shows where moving objects get close to each other:

To better see what’s going on, we’ll look at individual CPAs:

Having a closer look with the Temporal Controller

Since our filtered AIS layer has proper timestamps, we can animate it using the Temporal Controller. This enables us to replay the movement and see what was going on in a certain time frame.

I let the animation run and stopped it once I spotted a close encounter. Looking at the AIS points and the shortest line, we can see that MobilityDB computed the CPAs along the trajectories:

A more targeted way to investigate a specific CPA is to use the Temporal Controllers’ fixed temporal range mode to jump to a specific time frame. This is helpful if we already know the time frame we are interested in. For the CPA use case, this means that we can look up the timestamp of a nearby AIS position and set up the Temporal Controller accordingly:

More

I hope you enjoyed this quick dive into MobilityDB. For more details, including talks by the project founders, check out the project website.


This post is part of a series. Read more about movement data in GIS.