Movement data in GIS: issues & ideas

Since I started working, transport and movement data have been at the core of many of my projects. The spatial nature of movement data makes it interesting for GIScience, but typical GIS tools are not a particularly good match.

Dealing with the temporal dynamics of geographic processes is one of the grand challenges for Geographic Information Science. Geographic Information Systems (GIS) and related spatial analysis methods are quite adept at handling spatial dimensions of patterns and processes, but the temporal and coupled space-time attributes of phenomena are difficult to represent and examine with contemporary GIS. (Dr. Paul M. Torrens, Center for Urban Science + Progress, New York University)

It’s still a hot topic right now, as the variety of related publications and events illustrates. For example, just this month, there is an Animove two-week professional training course (18–30 September 2016, Max Planck Institute for Ornithology, Lake Constance) as well as the GIScience 2016 Workshop on Analysis of Movement Data (27 September 2016, Montreal, Canada).

Space-time cubes and animations are classics when it comes to visualizing movement data in GIS. They can be used for some visual analysis but have their limitations, particularly when it comes to working with and trying to understand lots of data. Visualization and analysis of spatio-temporal data in GIS is further complicated by the fact that temporal information is not standardized in most GIS data formats. (Notable exceptions that do support time by design are GPX and NetCDF, but those aren’t really first-class citizens in current desktop GIS.)

Most commonly, movement data is modeled as points (x, y, and optionally z) with a timestamp, an object or tracker id, and potentially additional info such as speed, status, or heading. With this data model, even simple questions like “Find all tracks that start in area A and end in area B” can become a real pain in “vanilla” desktop GIS. Even if the points come with a sequence number, which makes it easy to identify the start point, getting the end point is tricky without some custom code or queries. That’s why I have been storing the points in databases, in order to at least have the power of SQL to deal with the data. Even so, most queries were still painfully complex and their performance unsatisfactory.
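To make the “start in A, end in B” question concrete, here is a minimal Python sketch of the custom code it takes with the point-based model. The record layout `(track_id, timestamp, x, y)`, the sample values, and the rectangular areas are all made up for illustration; real areas would be arbitrary polygons and real data would live in a database.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical point records: (track_id, timestamp, x, y)
points = [
    ("t1", datetime(2016, 9, 1, 8, 0), 0.5, 0.5),
    ("t1", datetime(2016, 9, 1, 8, 5), 3.0, 2.0),
    ("t1", datetime(2016, 9, 1, 8, 10), 9.5, 9.5),
    ("t2", datetime(2016, 9, 1, 9, 0), 9.2, 9.1),
    ("t2", datetime(2016, 9, 1, 9, 7), 0.4, 0.6),
    ("t3", datetime(2016, 9, 1, 10, 0), 0.1, 0.9),
    ("t3", datetime(2016, 9, 1, 10, 3), 5.0, 5.0),
]

# Assumption: areas are axis-aligned boxes (xmin, ymin, xmax, ymax)
AREA_A = (0.0, 0.0, 1.0, 1.0)
AREA_B = (9.0, 9.0, 10.0, 10.0)


def in_box(x, y, box):
    """True if (x, y) lies inside or on the border of the box."""
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax


def tracks_from_a_to_b(points, area_a, area_b):
    """Return ids of tracks whose first point is in area_a and last point in area_b."""
    by_track = defaultdict(list)
    for track_id, ts, x, y in points:
        by_track[track_id].append((ts, x, y))

    result = []
    for track_id, pts in by_track.items():
        pts.sort()  # tuples sort by timestamp first
        _, start_x, start_y = pts[0]
        _, end_x, end_y = pts[-1]
        if in_box(start_x, start_y, area_a) and in_box(end_x, end_y, area_b):
            result.append(track_id)
    return sorted(result)


matches = tracks_from_a_to_b(points, AREA_A, AREA_B)
print(matches)  # ['t1']
```

Even this toy version has to group, sort, and pick the first and last point per track before any spatial test can run, which is exactly the part that gets awkward in desktop GIS and verbose in SQL.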

So I reached out to the Twitterverse asking for pointers towards moving objects database extensions for PostGIS and @bitnerd, @pwramsey, @hruske, and others replied. Amongst other useful tips, they pointed me towards the new temporal support, which ships with PostGIS 2.2. It includes the following neat functions:

  • ST_IsValidTrajectory — Returns true if the geometry is a valid trajectory.
  • ST_ClosestPointOfApproach — Returns the measure at which points interpolated along two lines are closest.
  • ST_DistanceCPA — Returns the distance between closest points of approach in two trajectories.
  • ST_CPAWithin — Returns true if the trajectories’ closest points of approach are within the specified distance.
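To illustrate what these functions compute, here is a rough pure-Python sketch of the underlying idea. It is not the PostGIS implementation: it assumes a trajectory is a plain list of `(x, y, m)` vertices with planar coordinates, that objects move linearly between vertices, and the function names are my own.

```python
import math


def is_valid_trajectory(traj):
    """True if there are at least two vertices and M strictly increases."""
    return len(traj) >= 2 and all(a[2] < b[2] for a, b in zip(traj, traj[1:]))


def position_at(traj, t):
    """Linearly interpolate the (x, y) position at measure t."""
    for (x0, y0, m0), (x1, y1, m1) in zip(traj, traj[1:]):
        if m0 <= t <= m1:
            f = (t - m0) / (m1 - m0)
            return (x0 + f * (x1 - x0), y0 + f * (y1 - y0))
    raise ValueError("t is outside the trajectory's measure range")


def closest_point_of_approach(a, b):
    """Return (measure, distance) of the closest approach,
    or None if the measure ranges of a and b do not overlap."""
    if not (is_valid_trajectory(a) and is_valid_trajectory(b)):
        raise ValueError("both inputs must be valid trajectories")
    start = max(a[0][2], b[0][2])
    end = min(a[-1][2], b[-1][2])
    if start > end:
        return None

    # Between consecutive breakpoints both objects move linearly,
    # so their offset vector is a linear function of time.
    times = sorted({start, end} | {m for _, _, m in a + b if start < m < end})

    ax, ay = position_at(a, start)
    bx, by = position_at(b, start)
    best_t, best_d = start, math.hypot(ax - bx, ay - by)

    for t0, t1 in zip(times, times[1:]):
        ax0, ay0 = position_at(a, t0)
        bx0, by0 = position_at(b, t0)
        ax1, ay1 = position_at(a, t1)
        bx1, by1 = position_at(b, t1)
        dx0, dy0 = ax0 - bx0, ay0 - by0          # offset at t0
        vx, vy = (ax1 - bx1) - dx0, (ay1 - by1) - dy0  # change of offset over the interval
        denom = vx * vx + vy * vy
        # Fraction of the interval at which the offset is shortest, clamped to [0, 1]
        s = 0.0 if denom == 0 else min(1.0, max(0.0, -(dx0 * vx + dy0 * vy) / denom))
        d = math.hypot(dx0 + s * vx, dy0 + s * vy)
        if d < best_d:
            best_t, best_d = t0 + s * (t1 - t0), d
    return best_t, best_d


def cpa_within(a, b, max_distance):
    """True if the two trajectories ever come within max_distance of each other."""
    cpa = closest_point_of_approach(a, b)
    return cpa is not None and cpa[1] <= max_distance


# Two made-up trajectories that cross paths at the same time
east = [(0, 0, 0), (10, 0, 10)]
north = [(5, -5, 0), (5, 5, 10)]
cpa = closest_point_of_approach(east, north)
```

The point of the sketch is that “did these two objects meet?” turns into a single function call once each track is one geometry with time along it, instead of a self-join over two point tables.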

Instead of points, these functions expect trajectories that are stored as LinestringM (or LinestringZM) where M is the time dimension. This approach makes many analyses considerably easier to handle. For example, clustering trajectory start and end locations and identifying the most common connections:

[Animation: clusters of trajectory start and end locations and the most common connections between them]

(data credits: GeoLife project)
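Getting from timestamped points to the LinestringM representation described above is mostly a matter of ordering the fixes and writing the time into the M ordinate. The following Python sketch builds a WKT string that way. It assumes timezone-aware datetimes and uses Unix epoch seconds as M, which is one possible convention rather than a requirement; the function name and sample fixes are made up.

```python
from datetime import datetime, timezone


def to_linestring_m(fixes):
    """Build a 'LINESTRING M' WKT string from (x, y, datetime) fixes.

    M is the timestamp in whole epoch seconds. Fixes that would not
    increase M (duplicate timestamps) are dropped, because a valid
    trajectory needs increasing measures.
    """
    coords = []
    last_m = None
    for x, y, ts in sorted(fixes, key=lambda fix: fix[2]):
        m = int(ts.timestamp())
        if last_m is not None and m <= last_m:
            continue  # keep only the first fix for each second
        coords.append(f"{x} {y} {m}")
        last_m = m
    if len(coords) < 2:
        raise ValueError("need at least two fixes with distinct timestamps")
    return "LINESTRING M (" + ", ".join(coords) + ")"


# Hypothetical fixes of one track, deliberately out of order
fixes = [
    (116.32, 39.99, datetime(2016, 9, 1, 8, 5, tzinfo=timezone.utc)),
    (116.30, 39.98, datetime(2016, 9, 1, 8, 0, tzinfo=timezone.utc)),
    (116.35, 40.00, datetime(2016, 9, 1, 8, 10, tzinfo=timezone.utc)),
]
wkt = to_linestring_m(fixes)
```

The resulting string can be handed to a database as a geometry literal. Note that this layout has room for exactly one extra value per vertex, so attributes such as speed or status still need a separate home.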

Overall, it’s an interesting and promising approach but there are still some open questions I’ll have to look into, such as: Is there an efficient way to store additional info for each location along the trajectory (e.g. instantaneous speed or other status)? How well do desktop GIS play with LinestringM data and what’s the overhead of dealing with it?


This post is part of a series. Read more about movement data in GIS.

8 comments
  1. Muhammad Munir Ahmad said:

    I am a student at Kaduna Polytechnic in Kaduna, Nigeria. I study surveying and geoinformatics at postgraduate level. I am very much interested in materials regarding spatial science, i.e. books, software, tutorials, etc. I hope my correspondence with you will benefit me. Thank you.

  2. Hi Anita, Jason from GIS-SE here. What would you say is your ‘test case’ problem to solve spatio-temporally? Is the example you cited (finding tracks starting in A and ending in B) the one of particular interest, or is it the clustering problem of “similar departures and destinations within time ranges,” or conflict/congestion like “all tracks passing through B within X time interval”? And no, “All of the above” isn’t a swell answer.

    Here’s why I ask: I worked on linking big-data for GeoInt on Hadoop, and I believe if you lean that way, you will find the map-reduce transforms cull both the spatial and temporal data very efficiently.

    • Hi Jason! Thank you for chiming in! There are many questions that I’m interested in, and I haven’t decided yet which one has the highest priority because this work is not associated with a specific project.

      Possible questions include:
      – Do all tracks from A to B use the same route or are there multiple alternative routes?
      – Which trajectories meet in time and space (i.e. the persons carrying the tracker actually were near each other)?
      – Where did people stop, for how long, and are there hotspots with many stops?

      I’ve looked into some big data solutions, such as Geomesa, but it seems like it would take weeks or months to get into the whole stack that is required to start working with these tools.

  3. Hello Anita, this is a great topic! Thanks for posting these details, and with QGIS as well. I’ve been working on telematics projects with granular latitude, longitude, altitude and timestamp values for a while, often in a “big SQL” database like vertica or paraccel. I am so happy to see how mature the postgis functionality is, and that it keeps getting better. The most interesting application would seem to be aerospace, and I am thinking in particular of spacecraft.

    But, of course personal transit by many modes is a similar problem. I’m going to update to postgis 2.2 and check those functions out, including the time variance and altitude components.

    Jason, I’m curious how map-reduce on hadoop will be more efficient for this. Of course, when you throw a cluster of machines at a problem instead of a single node of postgres, brute force alone yields results. Some google guys tried “Big SQL”, then “Big Table”, and opened a company, SpaceCurve, to try to tackle this problem at scale, but they did not succeed. If you have some sharable details, I’d like to read them. Otherwise, postgres + postgis, maybe scaled up on redshift or greenplum, with parallelism, columnar storage with compression, and the functions that Anita describes, sounds pretty good.

    • Hi Geoffrey! I’d love to see some spacecraft examples. That’s a type of trajectory that I haven’t worked with at all yet. Are there public datasets of spacecraft trajectories?

      • Hello Anita, your question got me searching for open datasets for spacecraft. While I did not find any truly open datasets, I found some open source tools that should be able to generate them easily:

        * Predict – linux, command line -> http://www.qsl.net/kd2bd/predict.html
        * Gpredict – GUI version of same -> http://gpredict.oz9aec.net/index.php

        I’m going to install these and see if I can create a good example dataset based upon some Cold War history, the Soviet RORSATs -> https://en.wikipedia.org/wiki/US-A. Of course, satellite movements have all sorts of other remote sensing applications. I’ve been having some trouble clipping flight lines to the international date line, but your other posts are useful on that front. Once I generate these datasets and make some progress, I will share them with you here.

      • Hi Geoffrey, that sounds great! Definitely looking forward to hearing back from you and trying out some spacecraft data.