Archive

Tag Archives: MovingPandas

The latest v0.10 release is now available from conda-forge.

This release contains some really cool new algorithms:

If you have questions about using MovingPandas or just want to discuss new ideas, you’re welcome to join our recently opened discussion forum.

As always, all tutorials are available from the movingpandas-examples repository and on MyBinder:

Besides others examples, the movingpandas-examples repo contains the following tech demo: an interactive app built with Panel that demonstrates different MovingPandas stop detection parameters

To start the app, open the stopdetection-app.ipynb notebook and press the green Panel button in the Jupyter Lab toolbar:

The latest v0.9 release is now available from conda-forge.

This release contains some really cool new algorithms:

The Kalman filter in action on the Geolife sample: smoother, less jiggly trajectories.
Top-Down Time Ratio generalization aka trajectory compression in action: reduces the number of positions along the trajectory without altering the spatiotemporal properties, such as speed, too much.

These new algorithms were contributed by Lyudmil Vladimirov and George S. Theodoropoulos.

Behind the scenes, Ray Bell took care of moving testing from Travis to Github Actions, and together we worked through the steps to ensure that the source code is now properly linted using flake8 and black.

Being able to work with so many awesome contributors has made this release really special for me. It’s great to see the project attracting more developer interest.

As always, all tutorials are available from the movingpandas-examples repository and on MyBinder:

The latest v0.8 release is now available from conda-forge.

New features include:

  • More convenient creation of TrajectoryCollection objects from (Geo)DataFrames (#137)
  • Support for different geometry column names (#112)

Last week, I also had the pleasure to speak about MovingPandas at Carto’s Spatial Data Science Conference SDSC21:

As always, all tutorials are available from the movingpandas-examples repository and on MyBinder:

The latest v0.7 release is now available from conda-forge.

New features include:

As always, all tutorials are available from the movingpandas-examples repository and on MyBinder:

In the last few days, there’s been a sharp rise in interest in vessel movements, and particularly, in understanding where and why vessels stop. Following the grounding of Ever Given in the Suez Canal, satellite images and vessel tracking data (AIS) visualizations are everywhere:

Using movement data analytics tools, such as MovingPandas, we can dig deeper and explore patterns in the data.

The MovingPandas.TrajectoryStopDetector is particularly useful in this situation. We can provide it with a Trajectory or TrajectoryCollection and let it detect all stops, that is, instances were the moving object stayed within a certain area (with a diameter of 1000m in this example) for a an extended duration (at least 3 hours).

stops = mpd.TrajectoryStopDetector(trajs).get_stop_segments(
    min_duration=timedelta(hours=3), max_diameter=1000)

The resulting stop segments include spatial and temporal information about the stop location and duration. To make this info more easily accessible, let’s turn the stop segment TrajectoryCollection into a point GeoDataFrame:

stop_pts = gpd.GeoDataFrame(columns=['geometry']).set_geometry('geometry')
stop_pts['stop_id'] = [track.id for track in stops.trajectories]
stop_pts= stop_pts.set_index('stop_id')

for stop in stops:
    stop_pts.at[stop.id, 'ID'] = stop.df['ID'][0]
    stop_pts.at[stop.id, 'datetime'] = stop.get_start_time()
    stop_pts.at[stop.id, 'duration_h'] = stop.get_duration().total_seconds()/3600
    stop_pts.at[stop.id, 'geometry'] = stop.get_start_location()

Indeed, I think the next version of MovingPandas should include a function that directly returns stops as points.

Now we can explore the stop information. For example, the map plot shows that stops are concentrated in three main areas: the northern and southern ends of the Canal, as well as the Great Bitter Lake in the middle. By looking at the timing of stops and their duration in a scatter plot, we can clearly see that the Ever Given stop (red) caused a chain reaction: the numerous points lining up on the diagonal of the scatter plot represent stops that very likely are results of the blockage:

Before the grounding, the stop distribution nicely illustrates the canal schedule. Vessels have to wait until it’s turn for their direction to go through:

You can see the full analysis workflow in the following video. Please turn on the captions for details.

Huge thanks to VesselsValue for supplying the data!

For another example of MovingPandas‘ stop dectection in action, have a look at Bryan R. Vallejo’s tutorial on detecting stops in bird tracking data which includes some awesome visualizations using KeplerGL:

Kepler.GL visualization by Bryan R. Vallejo

This post is part of a series. Read more about movement data in GIS.

After writing “Towards a template for exploring movement data” last year, I spent a lot of time thinking about how to develop a solid approach for movement data exploration that would help analysts and scientists to better understand their datasets. Finally, my search led me to the excellent paper “A protocol for data exploration to avoid common statistical problems” by Zuur et al. (2010). What they had done for the analysis of common ecological datasets was very close to what I was trying to achieve for movement data. I followed Zuur et al.’s approach of a exploratory data analysis (EDA) protocol and combined it with a typology of movement data quality problems building on Andrienko et al. (2016). Finally, I brought it all together in a Jupyter notebook implementation which you can now find on Github.

There are two options for running the notebook:

  1. The repo contains a Dockerfile you can use to spin up a container including all necessary datasets and a fitting Python environment.
  2. Alternatively, you can download the datasets manually and set up the Python environment using the provided environment.yml file.

The dataset contains over 10 million location records. Most visualizations are based on Holoviz Datashader with a sprinkling of MovingPandas for visualizing individual trajectories.

Point density map of 10 million location records, visualized using Datashader

Line density map for detecting gaps in tracks, visualized using Datashader

Example trajectory with strong jitter, visualized using MovingPandas & GeoViews

 

I hope this reference implementation will provide a starting point for many others who are working with movement data and who want to structure their data exploration workflow.

If you want to dive deeper, here’s the paper:

[1] Graser, A. (2021). An exploratory data analysis protocol for identifying problems in continuous movement data. Journal of Location Based Services. doi:10.1080/17489725.2021.1900612.

(If you don’t have institutional access to the journal, the publisher provides 50 free copies using this link. Once those are used up, just leave a comment below and I can email you a copy.)

References


This post is part of a series. Read more about movement data in GIS.

The latest v0.5 release is now available from conda-forge.

New features include:

As always, all tutorials are available on MyBinder:

Detected stops (left) and trajectory split at stops (right)

This post introduces Holoviz Panel, a library that makes it possible to create really quick dashboards in notebook environments as well as more sophisticated custom interactive web apps and dashboards.

The following example shows how to use Panel to explore a dataset (a trajectory collection in this case) and different parameter settings (relating to trajectory generalization). All the Panel code we need is a dict that defines the parameters that we want to explore. Then we can use Panel’s interact function to automatically generate a dashboard for our custom plotting function:

import panel as pn

kw = dict(traj_id=(1, len(traj_collection)), 
          tolerance=(10, 100, 10), 
          generalizer=['douglas-peucker', 'min-distance'])
pn.interact(plot_generalized, **kw)

Click to view the resulting dashboard in full resolution:

The plotting function uses the parameters to generate a Holoviews plot. First it fetches a specific trajectory from the trajectory collection. Then it generalizes the trajectory using the specified parameter settings. As you can see, we can easily combine maps and other plots to visualize different aspects of the data:

def plot_generalized(traj_id=1, tolerance=10, generalizer='douglas-peucker'):
  my_traj = traj_collection.get_trajectory(traj_id).to_crs(CRS(4088))
  if generalizer=='douglas-peucker':
    generalized = mpd.DouglasPeuckerGeneralizer(my_traj).generalize(tolerance)
  else:
    generalized = mpd.MinDistanceGeneralizer(my_traj).generalize(tolerance)
  generalized.add_speed(overwrite=True)
  return ( 
    generalized.hvplot(
      title='Trajectory {} (tolerance={})'.format(my_traj.id, tolerance), 
      c='speed', cmap='Viridis', colorbar=True, clim=(0,20), 
      line_width=10, width=500, height=500) + 
    generalized.df['speed'].hvplot.hist(
      title='Speed histogram', width=300, height=500) 
    )

Trajectory collections and generalization functions used in this example are part of the MovingPandas library. If you are interested in movement data analysis, you should check it out! You can find this example notebook in the MovingPandas tutorial section.

MovingPandas has come a long way since 2018 when I started to experiment with GeoPandas for trajectory data handling.

This week, MovingPandas passed peer review and was approved for pyOpenSci. This technical review process was extremely helpful in ensuring code, project, and documentation quality. I would strongly recommend it to everyone working on new data science libraries!

The lastest v0.3 release is now available from conda-forge.

All tutorials are available on MyBinder

New features include:

  • Support for GeoPandas 0.7
  • Trajectory collection aggregation functions to generate flow maps

 

This is a guest post by Bommakanti Krishna Chaitanya @chaitan94

Introduction

This post introduces mobilitydb-sqlalchemy, a tool I’m developing to make it easier for developers to use movement data in web applications. Many web developers use Object Relational Mappers such as SQLAlchemy to read/write Python objects from/to a database.

Mobilitydb-sqlalchemy integrates the moving objects database MobilityDB into SQLAlchemy and Flask. This is an important step towards dealing with trajectory data using appropriate spatiotemporal data structures rather than plain spatial points or polylines.

To make it even better, mobilitydb-sqlalchemy also supports MovingPandas. This makes it possible to write MovingPandas trajectory objects directly to MobilityDB.

For this post, I have made a demo application which you can find live at https://mobilitydb-sqlalchemy-demo.adonmo.com/. The code for this demo app is open source and available on GitHub. Feel free to explore both the demo app and code!

In the following sections, I will explain the most important parts of this demo app, to show how to use mobilitydb-sqlalchemy in your own webapp. If you want to reproduce this demo, you can clone the demo repository and do a “docker-compose up –build” as it automatically sets up this docker image for you along with running the backend and frontend. Just follow the instructions in README.md for more details.

Declaring your models

For the demo, we used a very simple table – with just two columns – an id and a tgeompoint column for the trip data. Using mobilitydb-sqlalchemy this is as simple as defining any regular table:

from flask_sqlalchemy import SQLAlchemy
from mobilitydb_sqlalchemy import TGeomPoint

db = SQLAlchemy()

class Trips(db.Model):
   __tablename__ = "trips"
   trip_id = db.Column(db.Integer, primary_key=True)
   trip = db.Column(TGeomPoint)

Note: The library also allows you to use the Trajectory class from MovingPandas as well. More about this is explained later in this tutorial.

Populating data

When adding data to the table, mobilitydb-sqlalchemy expects data in the tgeompoint column to be a time indexed pandas dataframe, with two columns – one for the spatial data  called “geometry” with Shapely Point objects and one for the temporal data “t” as regular python datetime objects.

from datetime import datetime
from shapely.geometry import Point

# Prepare and insert the data
# Typically it won’t be hardcoded like this, but it might be coming from 
# other data sources like a different database or maybe csv files
df = pd.DataFrame(
   [
       {"geometry": Point(0, 0), "t": datetime(2012, 1, 1, 8, 0, 0),},
       {"geometry": Point(2, 0), "t": datetime(2012, 1, 1, 8, 10, 0),},
       {"geometry": Point(2, -1.9), "t": datetime(2012, 1, 1, 8, 15, 0),},
   ]
).set_index("t")

trip = Trips(trip_id=1, trip=df)
db.session.add(trip)
db.session.commit()

Writing queries

In the demo, you see two modes. Both modes were designed specifically to explain how functions defined within MobilityDB can be leveraged by our webapp.

1. All trips mode – In this mode, we extract all trip data, along with distance travelled within each trip, and the average speed in that trip, both computed by MobilityDB itself using the ‘length’, ‘speed’ and ‘twAvg’ functions. This example also shows that MobilityDB functions can be chained to form more complicated queries.

mobilitydb-sqlalchemy-demo-1

trips = db.session.query(
   Trips.trip_id,
   Trips.trip,
   func.length(Trips.trip),
   func.twAvg(func.speed(Trips.trip))
).all()

2. Spatial query mode – In this mode, we extract only selective trip data, filtered by a user-selected region of interest. We then make a query to MobilityDB to extract only the trips which pass through the specified region. We use MobilityDB’s ‘intersects’ function to achieve this filtering at the database level itself.

mobilitydb-sqlalchemy-demo-2

trips = db.session.query(
   Trips.trip_id,
   Trips.trip,
   func.length(Trips.trip),
   func.twAvg(func.speed(Trips.trip))
).filter(
   func.intersects(Point(lat, lng).buffer(0.01).wkb, Trips.trip),
).all()

Using MovingPandas Trajectory objects

Mobilitydb-sqlalchemy also provides first-class support for MovingPandas Trajectory objects, which can be installed as an optional dependency of this library. Using this Trajectory class instead of plain DataFrames allows us to make use of much richer functionality over trajectory data like analysis speed, interpolation, splitting and simplification of trajectory points, calculating bounding boxes, etc. To make use of this feature, you have set the use_movingpandas flag to True while declaring your model, as shown in the below code snippet.

class TripsWithMovingPandas(db.Model):
   __tablename__ = "trips"
   trip_id = db.Column(db.Integer, primary_key=True)
   trip = db.Column(TGeomPoint(use_movingpandas=True))

Now when you query over this table, you automatically get the data parsed into Trajectory objects without having to do anything else. This also works during insertion of data – you can directly assign your movingpandas Trajectory objects to the trip column. In the below code snippet we show how inserting and querying works with movingpandas mode.

from datetime import datetime
from shapely.geometry import Point

# Prepare and insert the data
# Typically it won’t be hardcoded like this, but it might be coming from 
# other data sources like a different database or maybe csv files
df = pd.DataFrame(
   [
       {"geometry": Point(0, 0), "t": datetime(2012, 1, 1, 8, 0, 0),},
       {"geometry": Point(2, 0), "t": datetime(2012, 1, 1, 8, 10, 0),},
       {"geometry": Point(2, -1.9), "t": datetime(2012, 1, 1, 8, 15, 0),},
   ]
).set_index("t")

geo_df = GeoDataFrame(df)
traj = mpd.Trajectory(geo_df, 1)

trip = Trips(trip_id=1, trip=traj)
db.session.add(trip)
db.session.commit()

# Querying over this table would automatically map the resulting tgeompoint 
# column to movingpandas’ Trajectory class
result = db.session.query(TripsWithMovingPandas).filter(
   TripsWithMovingPandas.trip_id == 1
).first()

print(result.trip.__class__)
# <class 'movingpandas.trajectory.Trajectory'>

Bonus: trajectory data serialization

Along with mobilitydb-sqlalchemy, recently I have also released trajectory data serialization/compression libraries based on Google’s Encoded Polyline Format Algorithm, for python and javascript called trajectory and trajectory.js respectively. These libraries let you send trajectory data in a compressed format, resulting in smaller payloads if sending your data through human-readable serialization formats like JSON. In some of the internal APIs we use at Adonmo, we have seen this reduce our response sizes by more than half (>50%) sometimes upto 90%.

Want to learn more about mobilitydb-sqlalchemy? Check out the quick start & documentation.


This post is part of a series. Read more about movement data in GIS.

%d bloggers like this: