Archive

Tag Archives: spatio-temporal data

In the previous post, I presented an approach to generalize big trajectory datasets by extracting flows between cells of a data-driven irregular grid. This generalization provides a much better overview of the flow and directionality than a simple plot of the original raw trajectory data can. The paper introducing this method also contains more advanced visualizations that show cell statistics, such as the overall count of trajectories or the generalization quality. Another bit of information that is often of interest when exploring movement data is the time of the movement. For example, at LBS2016 last week, M. Jahnke presented an application that allows users to explore the number of taxi pickups and dropoffs at certain locations:

By adopting this approach for the generalized flow maps, we can, for example, explore which parts of the research area are busy at which time of the day. Here I have divided the day into four quarters: night from 0 to 6 (light blue), morning from 6 to 12 (orange), afternoon from 12 to 18 (red), and evening from 18 to 24 (dark blue).


Aggregated trajectories with time-of-day markers at flow network nodes (data credits: GeoLife project, map tiles: Carto, map data: OSM)

The resulting visualization shows that overall, there is less movement during the night hours from midnight to 6 in the morning (light blue quarter). Sounds reasonable!

One implementation detail worth considering is which timestamp should be used for counting the number of movements. Should it be the time of the first trajectory point entering a cell, or the time when the trajectory leaves the cell, or some average value? In the current implementation, I have opted for the entry time. This means that if the tracked person spends a long time within a cell (e.g. at the work location), the trip home only adds to the evening trip count of the neighboring cell along the trajectory.

Since the time information stored in a PostGIS LinestringM feature’s m-value does not contain any time zone information, we also have to take care of any necessary offsets. For example, the GeoLife documentation states that all timestamps are provided in GMT while Beijing is in the GMT+8 time zone. This offset has to be accounted for in the analysis script, otherwise the counts per time of day will be shifted by eight hours.
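As a rough illustration of the entry-time binning and the time zone offset, here is a minimal Python sketch. It is not the analysis script itself. It assumes that the m-values are Unix epoch seconds in GMT, and the function names are made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: m-values are Unix epoch seconds in GMT/UTC.
BEIJING = timezone(timedelta(hours=8))  # GMT+8
QUARTERS = ("night", "morning", "afternoon", "evening")


def time_of_day_quarter(m_value, tz=BEIJING):
    """Return the quarter of the day (local time) for one entry timestamp."""
    local = datetime.fromtimestamp(m_value, tz=timezone.utc).astimezone(tz)
    return QUARTERS[local.hour // 6]  # 0-6, 6-12, 12-18, 18-24


def count_entries_by_quarter(entry_m_values, tz=BEIJING):
    """Count cell entries per quarter of the day."""
    counts = {quarter: 0 for quarter in QUARTERS}
    for m_value in entry_m_values:
        counts[time_of_day_quarter(m_value, tz)] += 1
    return counts
```

Without the offset, a trip that enters a cell at 07:30 Beijing time (23:30 GMT on the previous day) would be counted as an evening trip instead of a morning trip.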

Using the same approach, we could also investigate other variations, e.g. over different days of the week, seasonal variations, or the development over multiple years.

In the first two parts of the Movement Data in GIS series, I discussed modeling trajectories as LinestringM features in PostGIS to overcome some common issues of movement data in GIS and presented a way to efficiently render speed changes along a trajectory in QGIS without having to split the trajectory into shorter segments.

While visualizing individual trajectories is important, the real challenge is trying to visualize massive trajectory datasets in a way that enables further analysis. The out-of-the-box functionality of GIS is painfully limited. Except for some transparency and heatmap approaches, there is not much that can be done to help interpret “hairballs” of trajectories. Luckily researchers in visual analytics have already put considerable effort into finding solutions for this visualization challenge. The approach I want to talk about today is described in Andrienko, N., & Andrienko, G. (2011). Spatial generalization and aggregation of massive movement data. IEEE Transactions on Visualization and Computer Graphics, 17(2), 205-219. It consists of the following main steps:

  1. Extracting characteristic points from the trajectories
  2. Grouping the extracted points by spatial proximity
  3. Computing group centroids and corresponding Voronoi cells
  4. Dividing trajectories into segments according to the Voronoi cells
  5. Counting transitions from one cell to another

The authors do a great job at describing the concepts and algorithms, which made it relatively straightforward to implement them in QGIS Processing. So far, I’ve implemented the basic logic but the paper contains further suggestions for improvements. This was also my first pyQGIS project that makes use of the measurement value support in the new geometry engine. The time information stored in the m-values is used to detect stop points, which – together with start, end, and turning points – make up the characteristic points of a trajectory.

The following animation illustrates the current state of the implementation: First the “hairball” of trajectories is rendered. Then we extract the characteristic points and group them by proximity. The big black dots are the resulting group centroids. From there, I skipped the Voronoi cells and directly counted transitions from “nearest to centroid A” to “nearest to centroid B”.
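The shortcut of counting transitions between nearest centroids can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions, not the QGIS Processing script: coordinates are treated as planar, trajectories are lists of (x, y) tuples, and all function names are hypothetical.

```python
from collections import Counter
from math import hypot


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to the point (planar distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: hypot(point[0] - centroids[i][0], point[1] - centroids[i][1]),
    )


def count_transitions(trajectories, centroids):
    """Count directed transitions between nearest centroids over all trajectories."""
    flows = Counter()
    for trajectory in trajectories:
        cells = [nearest_centroid(point, centroids) for point in trajectory]
        for origin, destination in zip(cells, cells[1:]):
            if origin != destination:  # only count actual changes of cell
                flows[(origin, destination)] += 1
    return flows


def filter_flows(flows, min_count):
    """Keep only connections with at least min_count transitions."""
    return {pair: count for pair, count in flows.items() if count >= min_count}
```

Because the keys are ordered pairs, the flow from A to B and the flow from B to A are counted separately, which is what makes it possible to show directionality.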


From thousands of individual trajectories to a generalized representation of overall movement patterns (data credits: GeoLife project, map tiles: Stamen, map data: OSM)

The resulting visualization makes it possible to analyze flow strength as well as directionality. I have deliberately excluded all connections with a count below 10 transitions to reduce visual clutter. The cell size / distance between point groups – and therefore the level-of-detail – is one of the input parameters. In my example, I used a target cell size of approximately 2 km. This setting results in connections which follow the major roads outside the city center very well. In the city center, where the road grid is tighter, trajectories on different roads mix and the connections are less clear.

Since trajectories in this dataset are not limited to car trips, it is expected to find additional movement that is not restricted to the road network. This is particularly noticeable in the dense area in the west where many slow trajectories – most likely from walking trips – are located. The paper also covers how to ensure that connections are limited to neighboring cells by densifying the trajectories before computing step 4.


Running the scripts for over 18,000 trajectories requires patience. It would be worth evaluating if the first three steps can be run with only a subsample of the data without impacting the results in a negative way.

One thing I’m not satisfied with yet is the way to specify the target cell size. While it’s possible to measure ellipsoidal distances in meters using QgsDistanceArea (irrespective of the trajectory layer’s CRS), the initial regular grid used in step 2 in order to group the extracted points has to be specified in the trajectory layer’s CRS units – quite likely degrees. Instead, it may be best to transform everything into an equidistant projection before running any calculations.
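To get a feeling for why a grid specified in degrees is awkward, the following sketch converts a target cell size in meters into approximate degree steps. It is a back-of-the-envelope calculation assuming a spherical Earth with the mean radius given below; the function name is hypothetical and it is no replacement for a proper reprojection.

```python
from math import cos, pi, radians

EARTH_RADIUS_M = 6371008.8  # assumption: spherical Earth with mean radius


def cell_size_in_degrees(cell_size_m, latitude_deg):
    """Approximate (d_lon, d_lat) in degrees for a cell size in meters."""
    meters_per_degree = 2 * pi * EARTH_RADIUS_M / 360
    d_lat = cell_size_m / meters_per_degree
    # Meridians converge towards the poles, so a degree of longitude
    # covers fewer meters the further we are from the equator.
    d_lon = d_lat / cos(radians(latitude_deg))
    return d_lon, d_lat
```

At the latitude of Beijing (about 39.9°N), a 2 km cell comes out at roughly 0.023° in longitude but only 0.018° in latitude, so a grid that is square in degrees is not square on the ground.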

It’s good to see that PyQGIS enables us to use the information encoded in PostGIS LinestringM features to perform spatio-temporal analysis. However, working with m or z values involves a lot of v2 geometry classes which work slightly differently than their v1 counterparts. It certainly takes some getting used to. This situation might get cleaned up as part of the QGIS 3 API refactoring effort. If you can, please support work on QGIS 3. Now is the time to shape the PyQGIS API for the following years!

In the first part of the Movement Data in GIS series, I discussed some of the common issues of modeling movement data in GIS, followed by a recommendation to model trajectories as LinestringM features in PostGIS to simplify analyses and improve query performance.

Of course, we don’t only want to analyse movement data within the database. We also want to visualize it to gain a better understanding of the data or communicate analysis results. For example, take one trajectory:

(data credits: GeoLife project)

Visualizing movement direction is easy: just slap an arrow head on the end of the line and done. What about movement speed? Sure! Mean speed, max speed, which should it be?

Speed along the trajectory, a value for each segment between consecutive positions.

With the usual GIS data model, we are back to square one. A line usually has one color and width. Of course we can create dotted and dashed lines but that’s not getting us anywhere here. To visualize speed variations along the trajectory, we therefore split the original trajectory into its segments, 1429 in this case. Then we can calculate speed for each segment and use a graduated or data defined renderer to show the results:


Speed along trajectory: red = slow to blue = fast

Very unsatisfactory! We had to increase the number of features 1429 times just to show speed variations along the trajectory, even though the original single trajectory feature already contained all the necessary information and QGIS does support geometries with measurement values.

Starting from QGIS 2.14, we have an alternative way to deal with this issue. We can stick to the original single trajectory feature and render it using the new geometry generator symbol layer. (This functionality is also used under the hood of the 2.5D renderer.) Using the segments_to_lines() function, the geometry generator basically creates individual segment lines on the fly:


segments_to_lines($geometry) returns a multi line geometry consisting of a line for every segment in the input geometry

Once this is set up, we can style the segments with a data-defined expression that determines the speed on the segment and returns the respective color along a color ramp:


Speed is calculated using the length of the segment and the time between segment start and end point. Then speed values from 0 to 50 km/h are mapped to the red-yellow-blue color ramp:

ramp_color(
  'RdYlBu',
  scale_linear(
    length(
      transform(
        geometry_n($geometry, @geometry_part_num),
        'EPSG:4326', 'EPSG:54027'
      )
    ) / (
      m(end_point(  geometry_n($geometry, @geometry_part_num))) -
      m(start_point(geometry_n($geometry, @geometry_part_num)))
    ) * 3.6,
    0, 50,
    0, 1
  )
)
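For readers who want to check the numbers outside of QGIS, the same calculation can be sketched in plain Python. This is an approximation and not what the expression engine does internally: it assumes vertices given as (lon, lat, m) tuples with m in seconds, and it swaps the reprojection for a haversine distance on a spherical Earth. All names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371008.8  # assumption: spherical Earth with mean radius


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat positions."""
    d_lon = radians(lon2 - lon1)
    d_lat = radians(lat2 - lat1)
    a = sin(d_lat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(d_lon / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def segment_speeds_kmh(vertices):
    """Speed in km/h for each segment; None if the time difference is not positive."""
    speeds = []
    for (x1, y1, m1), (x2, y2, m2) in zip(vertices, vertices[1:]):
        seconds = m2 - m1
        if seconds <= 0:
            speeds.append(None)
            continue
        speeds.append(haversine_m(x1, y1, x2, y2) / seconds * 3.6)  # m/s to km/h
    return speeds


def scale_linear(value, domain_min, domain_max, range_min, range_max):
    """Linearly map value from the domain to the range, clamping at both ends."""
    value = max(domain_min, min(domain_max, value))
    share = (value - domain_min) / (domain_max - domain_min)
    return range_min + share * (range_max - range_min)
```

The result of scale_linear(speed, 0, 50, 0, 1) is the position on the color ramp, so everything at or above 50 km/h ends up at the blue end.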

Thanks a lot to @nyalldawson for all the help figuring out the details!

While the following map might look just like the previous one in the end, note that we now only deal with the original single line feature:


Similar approaches can be used to label segments or positions along the trajectory without having to break the original feature. Thanks to the geometry generator functionality, we can make direct use of the LinestringM data model for trajectory visualization.

Since I’ve started working, transport and movement data have been at the core of many of my projects. The spatial nature of movement data makes it interesting for GIScience but typical GIS tools are not a particularly good match.

Dealing with the temporal dynamics of geographic processes is one of the grand challenges for Geographic Information Science. Geographic Information Systems (GIS) and related spatial analysis methods are quite adept at handling spatial dimensions of patterns and processes, but the temporal and coupled space-time attributes of phenomena are difficult to represent and examine with contemporary GIS. (Dr. Paul M. Torrens, Center for Urban Science + Progress, New York University)

It’s still a hot topic right now, as the variety of related publications and events illustrates. For example, just this month, there is an Animove two-week professional training course (18–30 September 2016, Max-Planck Institute for Ornithology, Lake Konstanz) as well as the GIScience 2016 Workshop on Analysis of Movement Data (27 September 2016, Montreal, Canada).

Space-time cubes and animations are classics when it comes to visualizing movement data in GIS. They can be used for some visual analysis but have their limitations, particularly when it comes to working with and trying to understand lots of data. Visualization and analysis of spatio-temporal data in GIS is further complicated by the fact that the temporal information is not standardized in most GIS data formats. (Some notable exceptions of formats that do support time by design are GPX and NetCDF but those aren’t really first-class citizens in current desktop GIS.)

Most commonly, movement data is modeled as points (x,y, and optionally z) with a timestamp, object or tracker id, and potential additional info, such as speed, status, heading, and so on. With this data model, even simple questions like “Find all tracks that start in area A and end in area B” can become a real pain in “vanilla” desktop GIS. Even if the points come with a sequence number, which makes it easy to identify the start point, getting the end point is tricky without some custom code or queries. That’s why I have been storing the points in databases in order to at least have the powers of SQL to deal with the data. Even so, most queries were still painfully complex and performance unsatisfactory.
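To illustrate why even this simple question needs custom code with the point model, here is a minimal Python sketch. It assumes that points are (track_id, timestamp, x, y) tuples and that the areas are simple bounding boxes; both the data layout and the function names are made up for this example.

```python
from collections import defaultdict


def in_bbox(x, y, bbox):
    """True if the position lies within the (xmin, ymin, xmax, ymax) box."""
    xmin, ymin, xmax, ymax = bbox
    return xmin <= x <= xmax and ymin <= y <= ymax


def tracks_from_a_to_b(points, area_a, area_b):
    """Return ids of tracks whose first point is in area A and last point in area B."""
    by_track = defaultdict(list)
    for track_id, timestamp, x, y in points:
        by_track[track_id].append((timestamp, x, y))

    matches = []
    for track_id, track_points in by_track.items():
        track_points.sort()  # order by timestamp to find start and end
        _, start_x, start_y = track_points[0]
        _, end_x, end_y = track_points[-1]
        if in_bbox(start_x, start_y, area_a) and in_bbox(end_x, end_y, area_b):
            matches.append(track_id)
    return sorted(matches)
```

Every point has to be grouped and sorted before the first and last position of a track are known, which is exactly the bookkeeping that a line-based trajectory model makes unnecessary.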

So I reached out to the Twitterverse asking for pointers towards moving objects database extensions for PostGIS and @bitnerd, @pwramsey, @hruske, and others replied. Amongst other useful tips, they pointed me towards the new temporal support, which ships with PostGIS 2.2. It includes the following neat functions:

  • ST_IsValidTrajectory — Returns true if the geometry is a valid trajectory.
  • ST_ClosestPointOfApproach — Returns the measure at which points interpolated along two lines are closest.
  • ST_DistanceCPA — Returns the distance between closest points of approach in two trajectories.
  • ST_CPAWithin — Returns true if the trajectories’ closest points of approach are within the specified distance.

Instead of points, these functions expect trajectories that are stored as LinestringM (or LinestringZM) where M is the time dimension. This approach makes many analyses considerably easier to handle. For example, clustering trajectory start and end locations and identifying the most common connections:


(data credits: GeoLife project)
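To make the LinestringM model a bit more concrete, the following Python sketch builds the WKT for one trajectory and checks the main rule that, as far as I understand it, ST_IsValidTrajectory enforces: measures have to increase from each vertex to the next. The helper names are hypothetical and the check is only a stand-in for the PostGIS function, not a reimplementation.

```python
def to_linestring_m_wkt(vertices):
    """Build LINESTRING M WKT from (x, y, m) tuples, with m as the time value."""
    coordinates = ", ".join(f"{x} {y} {m}" for x, y, m in vertices)
    return f"LINESTRING M ({coordinates})"


def is_valid_trajectory(vertices):
    """True if there are at least two vertices and the m-values strictly increase."""
    if len(vertices) < 2:
        return False
    return all(
        later > earlier
        for (_, _, earlier), (_, _, later) in zip(vertices, vertices[1:])
    )
```

Points that were recorded out of order or with duplicate timestamps would fail this check, so they need to be sorted and cleaned before the line is assembled.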

Overall, it’s an interesting and promising approach but there are still some open questions I’ll have to look into, such as: Is there an efficient way to store additional info for each location along the trajectory (e.g. instantaneous speed or other status)? How well do desktop GIS play with LinestringM data and what’s the overhead of dealing with it?

Today’s post is a short tutorial for creating trajectory animations with a fadeout effect using QGIS Time Manager. This is the result we are aiming for:

The animation shows the current movement in pink which fades out and leaves behind green traces of the trajectories.

About the data

GeoLife GPS Trajectories were collected within the (Microsoft Research Asia) Geolife project by 182 users in a period of over five years (from April 2007 to August 2012). [1,2,3] The GeoLife GPS Trajectories download contains many text files organized in multiple directories. The data files are basically CSVs with 6 lines of header information. They contain the following fields:

Field 1: Latitude in decimal degrees.
Field 2: Longitude in decimal degrees.
Field 3: All set to 0 for this dataset.
Field 4: Altitude in feet (-777 if not valid).
Field 5: Date – number of days (with fractional part) that have passed since 12/30/1899.
Field 6: Date as a string.
Field 7: Time as a string.
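Based on the field list above, reading one of these files could look roughly like the following Python sketch. It skips the 6 header lines, derives the timestamp from field 5 (days since 12/30/1899) and converts the altitude from feet to meters. The record layout and function names are my own choice for this example.

```python
import csv
from datetime import datetime, timedelta

HEADER_LINES = 6
FEET_TO_M = 0.3048
INVALID_ALTITUDE = -777
DAY_ZERO = datetime(1899, 12, 30)


def days_to_datetime(days):
    """Convert fractional days since 12/30/1899 to a datetime."""
    return DAY_ZERO + timedelta(days=days)


def parse_plt(lines):
    """Parse the lines of one trajectory file into a list of point records."""
    records = []
    for row in csv.reader(list(lines)[HEADER_LINES:]):
        if len(row) < 7:  # skip empty or incomplete lines
            continue
        altitude_ft = float(row[3])
        records.append({
            "lat": float(row[0]),
            "lon": float(row[1]),
            "altitude_m": None if altitude_ft == INVALID_ALTITUDE else altitude_ft * FEET_TO_M,
            "t": days_to_datetime(float(row[4])),
            "date": row[5],
            "time": row[6],
        })
    return records
```

A file can be passed in directly, for example with parse_plt(open(path)), where path points to one of the downloaded text files.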

Data prep: PostGIS

Since any kind of GIS operation on text files will be quite inefficient, I decided to load the data into a PostGIS database. This table of millions of GPS points can then be sliced into appropriate chunks for exploration, for example, a day in Beijing:

-- One day of GPS points around central Beijing
CREATE MATERIALIZED VIEW geolife.beijing AS
SELECT trajectories.id,
    trajectories.t_datetime,
    trajectories.t_datetime + interval '1 day' AS t_to_datetime,
    trajectories.geom,
    trajectories.oid
FROM geolife.trajectories
-- for geometries in EPSG:4326, the 0.1 search distance is in degrees
WHERE ST_DWithin(trajectories.geom,
        ST_SetSRID(ST_MakePoint(116.3974589, 39.9388838), 4326),
        0.1)
    AND trajectories.t_datetime >= '2008-11-11 00:00:00'
    AND trajectories.t_datetime < '2008-11-12 00:00:00'
WITH DATA;

Trajectory viz: a fadeout effect for point markers

The idea behind this visualization is to show both the current movement as well as the history of the trajectories. This can be achieved with a fadeout effect which leaves behind traces of past movement while the most recent positions are highlighted to stand out.

Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under ODbL.


This effect can be created using a Single Symbol renderer with a marker symbol with two symbol layers: one layer serves as the highlights layer (pink) while the second layer represents the traces (green) which linger after the highlights disappear. Feature blending is used to achieve the desired effect for overlapping markers.


The highlights layer has two expression-based properties: color and size. The color fades to white and the point size shrinks as the point ages. The age can be computed by comparing the point’s t_datetime timestamp to the Time Manager animation time $animation_datetime.

This expression creates the color fading effect:

color_hsv(  
  311,
  scale_exp( 
    minute(age(animation_datetime(),"t_datetime")),
    0,60,
    100,0,
    0.2
  ),
  90
)

(Note that before QGIS 2.10, we had to use $animation_datetime instead of animation_datetime().)

and this expression makes the point size shrink:

scale_exp(
  minute(age(animation_datetime(),"t_datetime")),
  0,60,
  24,0,
  0.2
)
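To see what the exponent of 0.2 does to the fading, here is a small Python stand-in for scale_exp. It follows my understanding of how the QGIS function behaves (exponential interpolation with clamping to the output range) and should be read as a sketch rather than as the QGIS implementation.

```python
def scale_exp(value, domain_min, domain_max, range_min, range_max, exponent):
    """Map value from the domain to the range along an exponential curve."""
    if value <= domain_min:
        return range_min
    if value >= domain_max:
        return range_max
    share = (value - domain_min) / (domain_max - domain_min)
    return range_min + (range_max - range_min) * share ** exponent


# Marker size for a point that is 0, 5, 30 and 60 minutes old
sizes = [scale_exp(age, 0, 60, 24, 0, 0.2) for age in (0, 5, 30, 60)]
```

With an exponent below 1, most of the change happens early: after 30 minutes, the marker has already shrunk from 24 to about 3, and the remaining half hour only removes the rest.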

Outlook

I’m currently preparing this and a couple of other examples for my Time Manager workshop at the upcoming 1st QGIS conference in Nødebo. The workshop materials will be made available online afterwards.

Literature

[1] Yu Zheng, Lizhu Zhang, Xing Xie, Wei-Ying Ma. Mining interesting locations and travel sequences from GPS trajectories. In Proceedings of International conference on World Wide Web (WWW 2009), Madrid Spain. ACM Press: 791-800.
[2] Yu Zheng, Quannan Li, Yukun Chen, Xing Xie, Wei-Ying Ma. Understanding Mobility Based on GPS Data. In Proceedings of ACM conference on Ubiquitous Computing (UbiComp 2008), Seoul, Korea. ACM Press: 312-321.
[3] Yu Zheng, Xing Xie, Wei-Ying Ma, GeoLife: A Collaborative Social Networking Service among User, location and trajectory. Invited paper, in IEEE Data Engineering Bulletin. 33, 2, 2010, pp. 32-40.

Data from various vehicles is collected for many purposes in cities worldwide. To get a feeling for just how much data is available, I created the following video using QGIS Time Manager which has been shown at the Austrian Museum of Applied Arts “MADE 4 YOU – Design for Change”. It shows one hour of taxi tracks in the city of Vienna:

If you like the video, please go to http://www.ertico.com/2012-its-video-competition-open-vote and vote for it in the category “Videos directed at the general public”.

Today, I’ve compiled a short video showcasing one of the possible uses of the Time Manager plugin: storm tracking. (Storm data can be downloaded from www.nhc.noaa.gov.)

Point size shows storm class, labels read maximum speed in mph.

If you are using Time Manager for your work, I’d love to hear about it.
