Advertisements

Archive

spatio-temporal data

In the previous post, I presented an approach to generalize big trajectory datasets by extracting flows between cells of a data-driven irregular grid. This generalization provides a much better overview of the flow and directionality than a simple plot of the original raw trajectory data can. The paper introducing this method also contains more advanced visualizations that show cell statistics, such as the overall count of trajectories or the generalization quality. Another bit of information that is often of interest when exploring movement data, is the time of the movement. For example, at LBS2016 last week, M. Jahnke presented an application that allows users to explore the number of taxi pickups and dropoffs at certain locations:

By adopting this approach for the generalized flow maps, we can, for example, explore which parts of the research area are busy at which time of the day. Here I have divided the day into four quarters: night from 0 to 6 (light blue), morning from 6 to 12 (orange), afternoon from 12 to 18 (red), and evening from 18 to 24 (dark blue).

 (data credits: GeoLife project,

Aggregated trajectories with time-of-day markers at flow network nodes (data credits: GeoLife project, map tiles: Carto, map data: OSM)

The resulting visualization shows that overall, there is less movement during the night hours from midnight to 6 in the morning (light blue quarter). Sounds reasonable!

One implementation detail worth considering is which timestamp should be used for counting the number of movements. Should it be the time of the first trajectory point entering a cell, or the time when the trajectory leaves the cell, or some average value? In the current implementation, I have opted for the entry time. This means that if the tracked person spends a long time within a cell (e.g. at the work location) the trip home only adds to the evening trip count of the neighboring cell along the trajectory.

Since the time information stored in a PostGIS LinestringM feature’s m-value does not contain any time zone information, we also have to pay attention to handle any necessary offsets. For example, the GeoLife documentation states that all timestamps are provided in GMT while Beijing is in the GMT+8 time zone. This offset has to be accounted for in the analysis script, otherwise the counts per time of day will be all over the place.

Using the same approach, we could also investigate other variations, e.g. over different days of the week, seasonal variations, or the development over multiple years.

Advertisements

In the fist two parts of the Movement Data in GIS series, I discussed modeling trajectories as LinestringM features in PostGIS to overcome some common issues of movement data in GIS and presented a way to efficiently render speed changes along a trajectory in QGIS without having to split the trajectory into shorter segments.

While visualizing individual trajectories is important, the real challenge is trying to visualize massive trajectory datasets in a way that enables further analysis. The out-of-the-box functionality of GIS is painfully limited. Except for some transparency and heatmap approaches, there is not much that can be done to help interpret “hairballs” of trajectories. Luckily researchers in visual analytics have already put considerable effort into finding solutions for this visualization challenge. The approach I want to talk about today is by Andrienko, N., & Andrienko, G. (2011). Spatial generalization and aggregation of massive movement data. IEEE Transactions on visualization and computer graphics, 17(2), 205-219. and consists of the following main steps:

  1. Extracting characteristic points from the trajectories
  2. Grouping the extracted points by spatial proximity
  3. Computing group centroids and corresponding Voronoi cells
  4. Dividing trajectories into segments according to the Voronoi cells
  5. Counting transitions from one cell to another

The authors do a great job at describing the concepts and algorithms, which made it relatively straightforward to implement them in QGIS Processing. So far, I’ve implemented the basic logic but the paper contains further suggestions for improvements. This was also my first pyQGIS project that makes use of the measurement value support in the new geometry engine. The time information stored in the m-values is used to detect stop points, which – together with start, end, and turning points – make up the characteristic points of a trajectory.

The following animation illustrates the current state of the implementation: First the “hairball” of trajectories is rendered. Then we extract the characteristic points and group them by proximity. The big black dots are the resulting group centroids. From there, I skipped the Voronoi cells and directly counted transitions from “nearest to centroid A” to “nearest to centroid B”.

(data credits: GeoLife project)

From thousands of individual trajectories to a generalized representation of overall movement patterns (data credits: GeoLife project, map tiles: Stamen, map data: OSM)

The resulting visualization makes it possible to analyze flow strength as well as directionality. I have deliberately excluded all connections with a count below 10 transitions to reduce visual clutter. The cell size / distance between point groups – and therefore the level-of-detail – is one of the input parameters. In my example, I used a target cell size of approximately 2km. This setting results in connections which follow the major roads outside the city center very well. In the city center, where the road grid is tighter, trajectories on different roads mix and the connections are less clear.

Since trajectories in this dataset are not limited to car trips, it is expected to find additional movement that is not restricted to the road network. This is particularly noticeable in the dense area in the west where many slow trajectories – most likely from walking trips – are located. The paper also covers how to ensure that connections are limited to neighboring cells by densifying the trajectories before computing step 4.

trajectory_generalization

Running the scripts for over 18,000 trajectories requires patience. It would be worth evaluating if the first three steps can be run with only a subsample of the data without impacting the results in a negative way.

One thing I’m not satisfied with yet is the way to specify the target cell size. While it’s possible to measure ellipsoidal distances in meters using QgsDistanceArea (irrespective of the trajectory layer’s CRS), the initial regular grid used in step 2 in order to group the extracted points has to be specified in the trajectory layer’s CRS units – quite likely degrees. Instead, it may be best to transform everything into an equidistant projection before running any calculations.

It’s good to see that PyQGIS enables us to use the information encoded in PostGIS LinestringM features to perform spatio-temporal analysis. However, working with m or z values involves a lot of v2 geometry classes which work slightly differently than their v1 counterparts. It certainly takes some getting used to. This situation might get cleaned up as part of the QGIS 3 API refactoring effort. If you can, please support work on QGIS 3. Now is the time to shape the PyQGIS API for the following years!

In the first part of the Movement Data in GIS series, I discussed some of the common issues of modeling movement data in GIS, followed by a recommendation to model trajectories as LinestringM features in PostGIS to simplify analyses and improve query performance.

Of course, we don’t only want to analyse movement data within the database. We also want to visualize it to gain a better understanding of the data or communicate analysis results. For example, take one trajectory:

(data credits: GeoLife project)

Visualizing movement direction is easy: just slap an arrow head on the end of the line and done. What about movement speed? Sure! Mean speed, max speed, which should it  be?

Speed along the trajectory, a value for each segment between consecutive positions.

With the usual GIS data model, we are back to square one. A line usually has one color and width. Of course we can create doted and dashed lines but that’s not getting us anywhere here. To visualize speed variations along the trajectory, we therefore split the original trajectory into its segments, 1429 in this case. Then we can calculate speed for each segment and use a graduated or data defined renderer to show the results:

trajectory_segment_features

Speed along trajectory: red = slow to blue = fast

Very unsatisfactory! We had to increase the number of features 1429 times just to show speed variations along the trajectory, even though the original single trajectory feature already contained all the necessary information and QGIS does support geometries with measurement values.

Starting from QGIS 2.14, we have an alternative way to deal with this issue. We can stick to the original single trajectory feature and render it using the new geometry generator symbol layer. (This functionality is also used under the hood of the 2.5D renderer.) Using the segments_to_lines() function, the geometry generator basically creates individual segment lines on the fly:

geomgenerator

Segments_to_lines( $geometry) returns a multi line geometry consisting of a line for every segment in the input geometry

Once this is set up, we can style the segments with a data-defined expression that determines the speed on the segment and returns the respective color along a color ramp:

segment_speed_color

Speed is calculated using the length of the segment and the time between segment start and end point. Then speed values from 0 to 50 km/h are mapped to the red-yellow-blue color ramp:

ramp_color(
  'RdYlBu',
  scale_linear(
    length( 
      transform(
	    geometry_n($geometry,@geometry_part_num),
		'EPSG:4326','EPSG:54027'
		)
    ) / (
      m(end_point(  geometry_n($geometry,@geometry_part_num))) -
      m(start_point(geometry_n($geometry,@geometry_part_num)))
    ) * 3.6,
    0,50,
    0,1
  )
)

Thanks a lot to @nyalldawson for all the help figuring out the details!

While the following map might look just like the previous one in the end, note that we now only deal with the original single line feature:

trajectory_geomgenerator

Similar approaches can be used to label segments or positions along the trajectory without having to break the original feature. Thanks to the geometry generator functionality, we can make direct use of the LinestringM data model for trajectory visualization.

A common use case of the QGIS TimeManager plugin is visualizing tracking data such as animal migration data. This post illustrates the steps necessary to create an animation from bird migration data. I’m using a dataset published on Movebank:

Fraser KC, Shave A, Savage A, Ritchie A, Bell K, Siegrist J, Ray JD, Applegate K, Pearman M (2016) Data from: Determining fine-scale migratory connectivity and habitat selection for a migratory songbird by using new GPS technology. Movebank Data Repository. doi:10.5441/001/1.5q5gn84d.

It’s a CSV file which can be loaded into QGIS using the Add delimited text layer tool. Once loaded, we can get started:

1. Identify time and ID columns

Especially if you are new to the dataset, have a look at the attribute table and identify the attributes containing timestamps and ID of the moving object. In our sample dataset, time is stored in the aptly named timestamp attribute and uses ISO standard formatting %Y-%m-%d %H:%M:%S.%f. This format is ideal for TimeManager and we can use it without any changes. The object ID attribute is titled individual-local-identifier.

movebank_data

The dataset contains 128 positions of 14 different birds. This means that there are rather long gaps between consecutive observations. In our animation, we’ll want to fill these gaps with interpolated positions to get uninterrupted movement traces.

2. Configuring TimeManager

To set up the animation, go to the TimeManager panel and click Settings | Add Layer. In the following dialog we can specify the time and ID attributes which we identified in the previous step. We also enable linear interpolation. The interpolation option will create an additional point layer in the QGIS project, which contains the interpolated positions.

timemanager_settings

When using the interpolation option, please note that it currently only works if the point layer is styled with a Single symbol renderer. If a different renderer is configured, it will fail to create the interpolation layer.

Once the layer is configured, the minimum and maximum timestamps will be displayed in the TimeManager dock right bellow the time slider. For this dataset, it makes sense to set the Time frame size, that is the time between animation frames, to one day, so we will see one frame per day:

timemanager_dock

Now you can test the animation by pressing the TimeManager’s play button. Feel free to add more data, such as background maps or other layers, to your project. Besides exploring the animated data in QGIS, you can also create a video to share your results.

3. Creating a video

To export the animation, click the Export video button. If you are using Linux, you can export videos directly from QGIS. On Windows, you first need to export the animation frames as individual pictures, which you can then convert to a video (for example using the free Windows Movie Maker application).

These are the basic steps to set up an animation for migration data. There are many potential extensions to this animation, including adding permanent traces of past movements. While this approach serves us well for visualizing bird migration routes, it is easy to imagine that other movement data would require different interpolation approaches. Vehicle data, for example, would profit from network-constrained interpolation between observed positions.

If you find the TimeManager plugin useful, please consider supporting its development or getting involved. Many features, such as interpolation, are weekend projects that are still in a proof-of-concept stage. In addition, we have the huge upcoming challenge of migrating the plugin to Python 3 and Qt5 to support QGIS3 ahead of us. Happy QGISing!

Since I’ve started working, transport and movement data have been at the core of many of my projects. The spatial nature of movement data makes it interesting for GIScience but typical GIS tools are not a particularly good match.

Dealing with the temporal dynamics of geographic processes is one of the grand challenges for Geographic Information Science. Geographic Information Systems (GIS) and related spatial analysis methods are quite adept at handling spatial dimensions of patterns and processes, but the temporal and coupled space-time attributes of phenomena are difficult to represent and examine with contemporary GIS. (Dr. Paul M. Torrens, Center for Urban Science + Progress, New York University)

It’s still a hot topic right now, as the variety of related publications and events illustrates. For example, just this month, there is an Animove two-week professional training course (18–30 September 2016, Max-Planck Institute for Ornithology, Lake Konstanz) as well as the GIScience 2016 Workshop on Analysis of Movement Data (27 September 2016, Montreal, Canada).

Space-time cubes and animations are classics when it comes to visualizing movement data in GIS. They can be used for some visual analysis but have their limitations, particularly when it comes to working with and trying to understand lots of data. Visualization and analysis of spatio-temporal data in GIS is further complicated by the fact that the temporal information is not standardized in most GIS data formats. (Some notable exceptions of formats that do support time by design are GPX and NetCDF but those aren’t really first-class citizens in current desktop GIS.)

Most commonly, movement data is modeled as points (x,y, and optionally z) with a timestamp, object or tracker id, and potential additional info, such as speed, status, heading, and so on. With this data model, even simple questions like “Find all tracks that start in area A and end in area B” can become a real pain in “vanilla” desktop GIS. Even if the points come with a sequence number, which makes it easy to identify the start point, getting the end point is tricky without some custom code or queries. That’s why I have been storing the points in databases in order to at least have the powers of SQL to deal with the data. Even so, most queries were still painfully complex and performance unsatisfactory.

So I reached out to the Twitterverse asking for pointers towards moving objects database extensions for PostGIS and @bitnerd, @pwramsey, @hruske, and others replied. Amongst other useful tips, they pointed me towards the new temporal support, which ships with PostGIS 2.2. It includes the following neat functions:

  • ST_IsValidTrajectory — Returns true if the geometry is a valid trajectory.
  • ST_ClosestPointOfApproach — Returns the measure at which points interpolated along two lines are closest.
  • ST_DistanceCPA — Returns the distance between closest points of approach in two trajectories.
  • ST_CPAWithin — Returns true if the trajectories’ closest points of approach are within the specified distance.

Instead of  points, these functions expect trajectories that are stored as LinestringM (or LinestringZM) where M is the time dimension. This approach makes many analyses considerably easier to handle. For example, clustering trajectory start and end locations and identifying the most common connections:

animation_clusters

(data credits: GeoLife project)

Overall, it’s an interesting and promising approach but there are still some open questions I’ll have to look into, such as: Is there an efficient way to store additional info for each location along the trajectory (e.g. instantaneous speed or other status)? How well do desktop GIS play with LinestringM data and what’s the overhead of dealing with it?


This is the first part of a series of posts, read more:

A previous version of this post has been published in German on Die bemerkenswerte Karte.

Visualizations of mobility data such as taxi or bike sharing trips have become very popular. One of the best most recent examples is cf. city flows developed by Till Nagel and Christopher Pietsch at the FH Potsdam. cf. city flows visualizes the rides in bike sharing systems in New York, Berlin and London at different levels of detail, from overviews of the whole city to detailed comparisons of individual stations:

The visualizations were developed using Unfolding, a library to create interactive maps and geovisualizations in Processing (the other Processing … not the QGIS Processing toolbox) and Java. (I tinkered with the Python port of Processing in 2012, but this is certainly on a completely different level.)

The insights into the design process, which are granted in the methodology section section of the project website are particularly interesting. Various approaches for presenting traffic flows between the stations were tested. Building on initial simple maps, where stations were connected by straight lines, consecutive design decisions are described in detail:

The results are impressive. Particularly the animated trips convey the dynamics of urban mobility very well:

However, a weak point of this (and many similar projects) is the underlying data. This is also addressed directly by the project website:

Lacking actual GPS tracks, the trip trajectories are rendered as smooth paths of the calculated optimal bike routes

This means that the actual route between start and drop off location is not known. The authors therefore estimated the routes using HERE’s routing service. The visualization therefore only shows one of many possible routes. However, cyclists don’t necessarily choose the “best” route as determined by an algorithm – be it the most direct or otherwise preferred. The visualization does not account for this uncertainty in the route selection. Rather, it gives the impression that the cyclist actually traveled on a certain route. It would therefore be undue to use this visualization to derive information about the popularity of certain routes (for example, for urban planning). Moreover, the data only contains information about the fulfilled demand, since only trips that were really performed are recorded. Demand for trips which could not take place due to lack of bicycles or stations, is therefore missing.

As always: exercise some caution when interpreting statistics or visualizations and then sit back and enjoy the animations.

If you want to read more about GIS and transportation modelling, check out
Loidl, M.; Wallentin, G.; Cyganski, R.; Graser, A.; Scholz, J.; Haslauer, E. GIS and Transport Modeling—Strengthening the Spatial Perspective. ISPRS Int. J. Geo-Inf. 2016, 5, 84. (It’s open access.)

This is a guest post by Karolina Alexiou (aka carolinux), Anita’s collaborator on the Time Manager plugin.

As of version 2.1.5, TimeManager provides some support for stepping through WMS-T layers, a format about which Anita has written  in the past.  From the official definition, the OpenGIS® Web Map Service Interface Standard (WMS) provides a simple HTTP interface for requesting geo-registered map images from one or more distributed geospatial databases. A WMS request defines the geographic layer(s) and area of interest to be processed. The response to the request is one or more geo-registered map images (returned as JPEG, PNG, etc) that can be displayed in a browser application. QGIS can display those images as a raster layer. The WMS-T standard allows the user of the service to set a time boundary in addition to a geographical boundary with their HTTP request.

We are going to add the following url as the web map provider service: http://mesonet.agron.iastate.edu/cgi-bin/wms/nexrad/n0r-t.cgi

From QGIS, go to Layer>Add Layer>Add WMS/WMST Layer and add a new server and connect to it. For the service we have chosen, we only need to specify a name and the url.

Select the top level layer, in our case named nexrad_base_reflect and click Add. Now you have added the layer to your QGIS project.

To add it to TimeManager as well, add it as a raster with the settings from the screenshot below. Start time and end time have the values 2005-08-29:03:10:00Z and 2005-08-30:03:10:00Z respectively, which is a period which overlaps with hurricane Katrina. Now, the WMS-T standard uses a handful of different time formats, and at this time, the plugin requires you to know this format and input the start and end values in this format. If there’s interest to sponsor this feature, in the future we may get the format directly from the web service description. The web service description is an XML document (see here for an example) which, among other information, contains a section that defines the format, default time and granularity of the time dimension.

add_raster

If we set the time step to 2 hours and click play, we will see that TimeManager renders each interval by querying the web map service for it, as you can see in this short video.

Querying the web service and waiting for the response takes some time. So, the plugin requires some patience for looking at this particular layer format in interactive mode. If we export the frames, however, we can get a nice result. This is an animation showing hurricane Katrina progressing over a 30 minute interval.

whoosh

If you want to sponsor further development of the Time Manager plugin, you can arrange a session with me – Karolina Alexiou – via Codementor.

%d bloggers like this: