Advertisements

Archive

Tag Archives: spatio-temporal data

In Movement data in GIS #16, I presented a new way to deal with trajectory data using GeoPandas and how to load the trajectory GeoDataframes as a QGIS layer. Following up on this initial experiment, I’ve now implemented a first version of an algorithm that performs a spatial analysis on my GeoPandas trajectories.

The first spatial analysis algorithm I’ve implemented is Clip trajectories by extent. Implementing this algorithm revealed a couple of pitfalls:

  • To achieve correct results, we need to compute spatial intersections between linear trajectory segments and the extent. Therefore, we need to convert our point GeoDataframe to a line GeoDataframe.
  • Based on the spatial intersection, we need to take care of computing the corresponding timestamps of the events when trajectories enter or leave the extent.
  • A trajectory can intersect the extent multiple times. Therefore, we cannot simply use the global minimum and maximum timestamp of intersecting segments.
  • GeoPandas provides spatial intersection functionality but if the trajectory contains consecutive rows without location change, these will result in zero length lines and those cause an empty intersection result.

So far, the clip result only contains the trajectory id plus a suffix indicating the sequence of the intersection segments for a specific trajectory (because one trajectory can intersect the extent multiple times). The following screenshot shows one highlighted trajectory that intersects the extent three times and the resulting clipped trajectories:

This algorithm together with the basic trajectory from points algorithm is now available in a Processing algorithm provider plugin called Processing Trajectory.

Note: This plugin depends on GeoPandas.

Note for Windows users: GeoPandas is not a standard package that is available in OSGeo4W, so you’ll have to install it manually. (For the necessary steps, see this answer on gis.stackexchange.com)

The implemented tests show how to use the Trajectory class independently of QGIS. So far, I’m only testing the spatial properties though:

def test_two_intersections_with_same_polygon(self):
    polygon = Polygon([(5,-5),(7,-5),(7,12),(5,12),(5,-5)])
    data = [{'id':1, 'geometry':Point(0,0), 't':datetime(2018,1,1,12,0,0)},
        {'id':1, 'geometry':Point(6,0), 't':datetime(2018,1,1,12,10,0)},
        {'id':1, 'geometry':Point(10,0), 't':datetime(2018,1,1,12,15,0)},
        {'id':1, 'geometry':Point(10,10), 't':datetime(2018,1,1,12,30,0)},
        {'id':1, 'geometry':Point(0,10), 't':datetime(2018,1,1,13,0,0)}]
    df = pd.DataFrame(data).set_index('t')
    geo_df = GeoDataFrame(df, crs={'init': '31256'})
    traj = Trajectory(1, geo_df)
    intersections = traj.intersection(polygon)
    result = []
    for x in intersections:
        result.append(x.to_linestring())
    expected_result = [LineString([(5,0),(6,0),(7,0)]), LineString([(7,10),(5,10)])]
    self.assertEqual(result, expected_result) 

One issue with implementing the algorithms as QGIS Processing tools in this way is that the tools are independent of one another. That means that each tool has to repeat the expensive step of creating the trajectory objects in memory. I’m not sure this can be solved.

Advertisements

Working with movement data analysis, I’ve banged my head against performance issues every once in a while. For example, PostgreSQL – and therefore PostGIS – run queries in a single thread of execution. This is now changing, with more and more functionality being parallelized. PostgreSQL version 9.6 (released on 2016-09-29) included important steps towards parallelization, including parallel execution of sequential scans, joins and aggregates. Still, there is no parallel processing in PostGIS so far (but it is under development as described by Paul Ramsey in his posts “Parallel PostGIS II” and “PostGIS Scaling” from late 2017).

At the FOSS4G2016 in Bonn, I had the pleasure to chat with Shoaib Burq who ran the “An intro to Apache PySpark for Big Data GeoAnalysis” workshop. Back home, I downloaded the workshop material and gave it a try but since I wanted a scalable system for storing, analyzing, and visualizing spatial data, it didn’t really seem to fit the bill.

Around one year ago, my search grew more serious since we needed a solution that would support our research group’s new projects where we expected to work with billions of location records (timestamped points and associated attributes). I was happy to find that the fine folks at LocationTech have some very promising open source projects focusing on big spatial data, most notably GeoMesa and GeoWave. Both tools take care of storing and querying big spatio-temporal datasets and integrate into GeoServer for publication and visualization. (A good – if already slightly outdated – comparison of the two has been published by Azavea.)

My understanding at the time was that GeoMesa had a stronger vector data focus while GeoWave was more focused on raster data. This lead me to try out GeoMesa. I published my first steps in “Getting started with GeoMesa using Geodocker” but things only really started to take off once I joined the developer chats and was pointed towards CCRI’s cloud-local “a collection of bash scripts to set up a single-node cloud on your desktop, laptop, or NUC”. This enabled me to skip most of the setup pains and go straight to testing GeoMesa’s functionality.

The learning curve is rather significant: numerous big data stack components (including HDFS, Accumulo, and GeoMesa), a most likely new language (Scala), as well as the Spark computing system require some getting used to. One thing that softened the blow is the fact that writing queries in SparkSQL + GeoMesa is pretty close to writing PostGIS queries. It’s also rather impressive to browse hundreds of millions of points by connecting QGIS TimeManager to a GeoServer WMS-T with GeoMesa backend.

Spatial big data stack with GeoMesa

One of the first big datasets I’ve tested are taxi floating car data (FCD). At one million records per day, the three years in the following example amount to a total of around one billion timestamped points. A query for travel times between arbitrary start and destination locations took a couple of seconds:

Travel time statistics with GeoMesa (left) compared to Google Maps predictions (right)

Besides travel time predictions, I’m also looking into the potential for predicting future movement. After all, it seems not unreasonable to assume that an object would move in a similar fashion as other similar objects did in the past.

Early results of a proof of concept for GeoMesa based movement prediction

Big spatial data – both vector and raster – are an exciting challenge bringing new tools and approaches to our ever expanding spatial toolset. Development of components in open source big data stacks is rapid – not unlike the development speed of QGIS. This can make it challenging to keep up but it also holds promises for continuous improvements and quick turn-around times.

If you are using GeoMesa to work with spatio-temporal data, I’d love to hear about your experiences.

In Movement data in GIS #2: visualization I mentioned that it should be possible to label trajectory segments without having to break the original trajectory feature. While it’s not a straightforward process, it is indeed possible to create timestamp labels at desired intervals:

The main point here is that we cannot use regular labels because there would be only one label for the whole trajectory feature. Instead, we are using a marker line with a font marker:

By default, font markers only display one character from a given font but by using expressions we can make it display longer text, including datetime strings:

If you want to have a label at every node of the trajectory, the expression looks like this:

format_date( 
   to_datetime('1970-01-01T00:00:00Z')+to_interval(
      m(start_point(geometry_n(
         segments_to_lines( $geometry ),
         @geometry_part_num)
      ))||' seconds'
   ),
   'HH:mm:ss'
)

You probably remember those parts of the expression that extract the m value from previous posts. Note that – compared to 2016 – it is now necessary to add the segments_to_lines() function.

The m value (which stores time as seconds since Unix epoch) is then converted to datetime and finally formatted to only show time. Of course you can edit the datetime format string to also include the date.

If we only want a label every 30 seconds, we can add a case statement around that:

CASE WHEN 
m(start_point(geometry_n(
   segments_to_lines( $geometry ),
   @geometry_part_num)
)) % 30 = 0
THEN
format_date( 
   to_datetime('1970-01-01T00:00:00Z')+to_interval(
      m(start_point(geometry_n(
         segments_to_lines( $geometry ),
         @geometry_part_num)
      ))||' seconds'
   ),
   'HH:mm:ss'
)
END

This works well if the trajectory sampling interval is fairly regular. This is not always the case and that means that the above case statement wouldn’t find many nodes with a timestamp that ends in :30 or :00. In such a case, we could resort to labeling nodes based on their order in the linestring:

CASE WHEN 
 @geometry_part_num  % 30 = 0
THEN
...

Thanks a lot to @JuergenEFischer for providing a solution for converting seconds since Unix epoch to datetime without a custom function!

Note that expressions using @geometry_part_num currently suffer from the following issue: Combination of segments_to_lines($geometry) and @geometry_part_num gives wrong segment numbers


This post is part of a series. Read more about movement data in GIS.

TimeManager 2.5 is quite likely going to be the final TimeManager release for the QGIS 2 series. It comes with a couple of bug fixes and enhancements:

  • Fixed #245: updated help.htm
  • Fixed #240: now hiding unmanageable WFS layers
  • Fixed #220: fixed issues with label size
  • Fixed #194: now exposing additional functions: animation_time_frame_size, animation_time_frame_type, animation_start_datetime, animation_end_datetime

Besides updating the help, I also decided to display it more prominently in the settings dialog (similarly to how the help is displayed in the field calculator or in Processing):

So far, I haven’t started porting to QGIS 3 yet. If you are interested in TimeManager and want to help, please get in touch.

On this note, let me leave you with a couple of animation inspirations from the Twitterverse:

In this post, we use TimeManager to visualize the position of a moving object over time along a trajectory. This is another example of what is possible thanks to QGIS’ geometry generator feature. The result can look like this:

What makes this approach interesting is that the trajectory is stored in PostGIS as a LinestringM instead of storing individual trajectory points. So there is only one line feature loaded in QGIS:

(In part 2 of this series, we already saw how a geometry generator can be used to visualize speed along a trajectory.)

The layer is added to TimeManager using t_start and t_end attributes to define the trajectory’s temporal extent.

TimeManager exposes an animation_datetime() function which returns the current animation timestamp, that is, the timestamp that is also displayed in the TimeManager dock, as well as on the map (if we don’t explicitly disable this option).

Once TimeManager is set up, we can edit the line style to add a point marker to visualize the position of the moving object at the current animation timestamp. To do that, we interpolate the position along the trajectory segments. The first geometry generator expression splits the trajectory in its segments:

The second geometry generator expression interpolates the position on the segment that contains the current TimeManager animation time:

The WHEN statement compares the trajectory segment’s start and end times to the current TimeManager animation time. Afterwards, the line_interpolate_point function is used to draw the point marker at the correct position along the segment:

CASE 
WHEN (
m(end_point(geometry_n($geometry,@geometry_part_num)))
> second(age(animation_datetime(),to_datetime('1970-01-01 00:00')))
AND
m(start_point(geometry_n($geometry,@geometry_part_num)))
<= second(age(animation_datetime(),to_datetime('1970-01-01 00:00')))
)
THEN
line_interpolate_point( 
  geometry_n($geometry,@geometry_part_num),
  1.0 * (
    second(age(animation_datetime(),to_datetime('1970-01-01 00:00')))
	- m(start_point(geometry_n($geometry,@geometry_part_num)))
  ) / (
    m(end_point(geometry_n($geometry,@geometry_part_num)))
	- m(start_point(geometry_n($geometry,@geometry_part_num)))
  ) 
  * length(geometry_n($geometry,@geometry_part_num))
)
END

Here is the animation result for a part of the trajectory between 08:00 and 09:00:


This post is part of a series. Read more about movement data in GIS.

AGILE 2017 is the annual international conference on Geographic Information Science of the Association of Geographic Information Laboratories in Europe (AGILE) which was established in 1998 to promote academic teaching and research on GIS.

This years conference in Wageningen was my time at AGILE.  I had the honor to present our recent work on pedestrian navigation with landmarks [Graser, 2017].

If you are interested in trying it, there is an online demo. The conference also provided numerous pointers toward ideas for future improvements, including [Götze and Boye, 2016] and [Du et al., 2017]

On the issue of movement data in GIS, there weren’t too many talks on this topic at AGILE but on the conceptual side, I really enjoyed David Jonietz’ talk on how to describe trajectory processing steps:

Source: [Jonietz and Bucher, 2017]

In the pre-conference workshop I attended, there was also an interesting presentation on analyzing trajectory data with PostGIS by Phd candidate Meihan Jin.

I’m also looking forward to reading [Wiratma et al., 2017] “On Measures for Groups of Trajectories” because I think that the presentation only scratched the surface.

References

[Du et al, 2017] Du, S., Wang, X., Feng, C. C., & Zhang, X. (2017). Classifying natural-language spatial relation terms with random forest algorithm. International Journal of Geographical Information Science, 31(3), 542-568.
[Götze and Boye, 2016] Götze, J., & Boye, J. (2016). Learning landmark salience models from users’ route instructions. Journal of Location Based Services, 10(1), 47-63.
[Graser, 2017] Graser, A. (2017). Towards landmark-based instructions for pedestrian navigation systems using OpenStreetMap, AGILE2017, Wageningen, Netherlands.
[Jonietz and Bucher, 2017] Jonietz, D., Bucher, D. (2017). Towards an Analytical Framework for Enriching Movement Trajectories with Spatio-Temporal Context Data, AGILE2017, Wageningen, Netherlands.
[Wiratma et al., 2017] Wiratma L., van Kreveld M., Löffler M. (2017) On Measures for Groups of Trajectories. In: Bregt A., Sarjakoski T., van Lammeren R., Rip F. (eds) Societal Geo-innovation. GIScience 2017. Lecture Notes in Geoinformation and Cartography. Springer, Cham


This post is part of a series. Read more about movement data in GIS.

In the 1st part of this series, I mentioned the Workshop on Analysis of Movement Data at the GIScience 2016 conference. Since the workshop took place in September 2016, 11 abstracts have been published (the website seems to be down currently, see the cached version) covering topics from general concepts for movement data analysis, to transport, health, and ecology specific articles. Here’s a quick overview of what researchers are currently working on:

  • General topics
    • Interpolating trajectories with gaps in the GPS signal while taking into account the context of the gap [Hwang et al., 2016]
    • Adding time and weather context to understand their impact on origin-destination flows [Sila-Nowicka and Fotheringham, 2016]
    • Finding optimal locations for multiple moving objects to meet and still arrive at their destination in time [Gao and Zeng, 2016]
    • Modeling checkpoint-based movement data as sequence of transitions [Tao, 2016]
  • Transport domain
    • Estimating junction locations and traffic regulations using extended floating car data [Kuntzsch et al., 2016]
  • Health domain
    • Clarifying physical activity domain semantics using ontology design patterns [Sinha and Howe, 2016]
    • Recognizing activities based on Pebble Watch sensors and context for eight gestures, including brushing one’s teeth and combing one’s hair [Cherian et al., 2016]
    • Comparing GPS-based indicators of spatial activity with reported data [Fillekes et al., 2016]
  • Ecology domain
    • Linking bird movement with environmental context [Bohrer et al., 2016]
    • Quantifying interaction probabilities for moving and stationary objects using probabilistic space-time prisms [Loraamm et al., 2016]
    • Generating probability density surfaces using time-geographic density estimation [Downs and Hyzer, 2016]

If you are interested in movement data in the context of ecological research, don’t miss the workshop on spatio-temporal analysis, modelling and data visualisation for movement ecology at the Lorentz Center in Leiden in the Netherlands. There’s currently a call for applications for young researchers who want to attend this workshop.

Since I’m mostly working with human and vehicle movement data in outdoor settings, it is interesting to see the bigger picture of movement data analysis in GIScience. It is worth noting that the published texts are only abstracts, therefore there is not much detail about algorithms and whether the code will be available as open source.

For more reading: full papers of the previous workshop in 2014 have been published in the Int. Journal of Geographical Information Science, vol 30(5). More special issues on “Computational Movement Analysis” and “Representation and Analytical Models for Location-based Social Media Data and Tracking Data” have been announced.

References

[Bohrer et al., 2016] Bohrer, G., Davidson, S. C., Mcclain, K. M., Friedemann, G., Weinzierl, R., and Wikelski, M. (2016). Contextual Movement Data of Bird Flight – Direct Observations and Annotation from Remote Sensing.
[Cherian et al., 2016] Cherian, J., Goldberg, D., and Hammond, T. (2016). Sensing Day-to-Day Activities through Wearable Sensors and AI.
[Downs and Hyzer, 2016] Downs, J. A. and Hyzer, G. (2016). Spatial Uncertainty in Animal Tracking Data: Are We Throwing Away Useful Information?
[Fillekes et al., 2016] Fillekes, M., Bereuter, P. S., and Weibel, R. (2016). Comparing GPS-based Indicators of Spatial Activity to the Life-Space Questionnaire (LSQ) in Research on Health and Aging.
[Gao and Zeng, 2016] Gao, S. and Zeng, Y. (2016). Where to Meet: A Context-Based Geoprocessing Framework to Find Optimal Spatiotemporal Interaction Corridor for Multiple Moving Objects.
[Hwang et al., 2016] Hwang, S., Yalla, S., and Crews, R. (2016). Conditional resampling for segmenting GPS trajectory towards exposure assessment.
[Kuntzsch et al., 2016] Kuntzsch, C., Zourlidou, S., and Feuerhake, U. (2016). Learning the Traffic Regulation Context of Intersections from Speed Profile Data.
[Loraamm et al., 2016] Loraamm, R. W., Downs, J. A., and Lamb, D. (2016). A Time-Geographic Approach to Wildlife-Road Interactions.
[Sila-Nowicka and Fotheringham, 2016] Sila-Nowicka, K. and Fotheringham, A. (2016). A route map to calibrate spatial interaction models from GPS movement data.
[Sinha and Howe, 2016] Sinha, G. and Howe, C. (2016). An Ontology Design Pattern for Semantic Modelling of Children’s Physical Activities in School Playgrounds.
[Tao, 2016] Tao, Y. (2016). Data Modeling for Checkpoint-based Movement Data.


This post is part of a series. Read more about movement data in GIS.

%d bloggers like this: