Author Archives: underdark

Thanks to the FOSS4G2021 video team, all talks including my keynote are now available online.

I had the honor to be invited to give the closing keynote, talking about how open source can help open science, particularly data science:

I’m convinced that efforts towards more open data science are a worthwhile investment even if current scientific incentive structures are stacked against it.

Until incentive policies catch up, we all can help encourage more people to go the extra mile(s) by properly valuing their efforts, e.g. by celebrating and citing reproducible publications, open research datasets, and open scientific software.

The Central Institution for Meteorology and Geodynamics (ZAMG) is Austria’s meteorological and geophysical service. As such, they have a large database of historical weather data, which they have now made publicly available, as announced on 28th Oct 2021:

The new ZAMG Data Hub provides weather and station data, mainly in NetCDF and CSV formats:

I decided to grab a NetCDF sample from their analysis and nowcasting system INCA. I went with all available parameters for a period of one day (the data has a temporal resolution of one hour) and a bounding box around Vienna:

https://frontend.hub.zamg.ac.at/grid/d512d5b5-4e9f-4954-98b9-806acbf754f6/historical/form?anonymous=true

The loading screen of QGIS 3.22 shows the different NetCDF layers:

After adding the incal-hourly layer to QGIS, the layer styling panel provides access to the different weather parameters. We can switch between these parameters by clicking the gradient icon next to the parameter names. Here you can see the air temperature:

And because the NetCDF layer is time-aware, we can also use the QGIS Temporal Controller to step through the hourly measurements / create an animation:

Make sure to grab the latest version of QGIS to get access to all the functionality shown here.

The latest MovingPandas v0.8 release is now available from conda-forge.

New features include:

  • More convenient creation of TrajectoryCollection objects from (Geo)DataFrames (#137)
  • Support for different geometry column names (#112)

Last week, I also had the pleasure to speak about MovingPandas at Carto’s Spatial Data Science Conference SDSC21:

As always, all tutorials are available from the movingpandas-examples repository and on MyBinder:

Two weeks ago, I had the pleasure to speak at SystemX’s seminar series. The talk features a live demonstration of my protocol for exploring movement data, powered by Jupyter, Pandas, Holoviews, Datashader, GeoPandas, and MovingPandas. So if you haven’t read the paper yet, here’s the chance to watch the talk version:

Kappazunder is the city of Vienna’s database created during their recent mobile mapping campaign. Using vehicle-mounted Lidar and cameras, they collected street-level Lidar and street view images.

Slide from the official announcement on Thursday, 23rd Sept 2021. Full slide deck: https://www.slideshare.net/DigitalesWien/kappazunder-testdatensatz-2020-ogd-wien

Yesterday, they published a first sample dataset on data.gv.at, containing one trajectory. The download contains documentation, vector data (.shp), images (.jpg), and point clouds (.laz):

Trajectory

The shapefiles contain vehicle location updates, photo locations, and areas describing the extent of the point clouds. Since the shapefiles lack .prj files, we need to manually specify the correct CRS (EPSG:31256 MGI / Austria GK East).

The vehicle location updates and photo locations contain timestamps as epoch. However, the format is a little special:

To display a human-readable timestamp, I therefore used the following label expression:

format_date( datetime_from_epoch( "epoch_s"*1000), 'HH:mm:ss')
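The same conversion can be sketched outside of QGIS. This is a minimal Python example, assuming (as the field name in the label expression suggests) that the value is in seconds since the Unix epoch; datetime_from_epoch expects milliseconds, hence the factor of 1000 above. The function name and the sample value are made up for illustration, and the output is formatted in UTC rather than local time.

```python
from datetime import datetime, timezone

def epoch_s_to_label(epoch_s: float) -> str:
    """Format epoch seconds as an HH:MM:SS string in UTC."""
    return datetime.fromtimestamp(epoch_s, tz=timezone.utc).strftime("%H:%M:%S")

# Hypothetical sample value, not taken from the Kappazunder dataset
print(epoch_s_to_label(1600000000))
```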

Adding these labels also reveals that the whole trajectory is just 2 minutes long. This puts the download size of over 5GB into perspective. The whole dataset will be massive.

Lidar

The .laz files are between 100 and 200 MB each. There are four .laz files, even though the previously loaded point cloud extent areas only suggested three:

Loading the .laz files for the first time takes a while and there seem to be some issues – either on the user end (me) or in the files themselves. Trying to load content of the ept_ folders only results in very few points and multiple “invalid data source” errors:

For the few points that are loaded, it looks like the height information is available:

Update on 2021-10-01: I’ve reported the data loss issue and Martin Dobias has provided a first work-around that makes it possible to view the data in QGIS:

Images

The street view images are published as cubemaps. Here’s a sample of the side view:

Can we reliably measure truck traffic from space? Compared to private transport, spatiotemporal data on freight transport is even harder to come by. Detecting trucks using remote sensing has been a promising lead for many years but often required access to pretty specialized sensors, such as TerraSAR-X. That is why I was really excited to read about a new approach that detects trucks in commonly available Sentinel-2 imagery developed by Henrik Fisser (Julius-Maximilians-University Würzburg, Germany). So I reached out to him to learn more about the possibilities this new technology opens up. 

Vehicles are visible and detectable in Sentinel-2 data if they are large and moving fast enough (image source: ESA)

To verify his truck detection results, Henrik had already used data from truck counting stations along the German autobahn network. However, these counters are quite rare and thus cannot provide full spatial coverage. Therefore, we started looking for more complete reference data. Fortunately, Nikolaus Kasper at the Austrian highway corporation ASFINAG offered his help. The Austrian autobahn toll system is gantry-based: it records when a truck passes a gantry. Using the timestamps of these truck passages and the current traffic speed, it is possible to estimate truck locations at arbitrary points in time, such as the time a Sentinel-2 image was taken. This makes it possible to assess the Sentinel-2-based truck detection along the autobahn network for complete Sentinel-2 images.
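To illustrate the idea of estimating truck locations from gantry passages, here is a minimal Python sketch. It assumes a constant traffic speed between the gantry passage and the image acquisition and only computes a distance along the road; the function name, timestamps, and speed are hypothetical, and this is not necessarily how the ASFINAG reference data was actually derived.

```python
from datetime import datetime

def estimate_offset_m(passage_time: datetime, image_time: datetime,
                      speed_kmh: float) -> float:
    """Distance in metres travelled past the gantry at image_time,
    assuming the truck keeps a constant speed (an assumption)."""
    elapsed_s = (image_time - passage_time).total_seconds()
    return elapsed_s * speed_kmh / 3.6  # km/h -> m/s

# Hypothetical example: gantry passage 90 seconds before the image was taken
passage = datetime(2020, 8, 28, 9, 58, 30)
image = datetime(2020, 8, 28, 10, 0, 0)
offset = estimate_offset_m(passage, image, speed_kmh=80)
print(f"{offset:.0f} m past the gantry")
```

A negative result would mean that the truck had not yet reached the gantry when the image was taken.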

Overall, Sentinel-2-based detections tend to underestimate the number of trucks. Henrik found a strong correlation (with an average r value > 0.8) between German traffic counting stations and trucks detected by the Sentinel-2 method. These counting stations were selected for their ideal characteristics, including distance from volatile traffic situations such as a high number of highway intersections. This is very different from our comparison, which covers autobahn sections in and near Vienna. We therefore expected larger detection errors. However, our new Austrian analysis reaches similar results (with r values of 0.79, 0.70, and 0.86 for the three days 2020-08-28, 2020-09-22, and 2020-11-06, respectively).

Thanks to the truck reference locations provided by ASFINAG, we were also able to analyze the spatial distribution of truck detections. We decided to compare ASFINAG data (truth) and Sentinel-2-based detections using a grid based approach with a cell size of 5×5 km. Confirming Henrik’s original results, grid cells with higher detection than ground truth values are clearly in the minority. Interestingly, many cells in Vienna (at the eastern border of the image extent) exhibit rather low relative errors compared to, for example, the cells along Westautobahn (the east-west running autobahn in the center of the image extent).
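A grid-based comparison of this kind can be sketched in a few lines of Python. The sketch below assumes point coordinates in a metric CRS and defines the relative error per cell as (detected - truth) / truth, which is an assumption for illustration and not necessarily the exact measure used in our analysis; all names and sample coordinates are made up.

```python
from collections import Counter

CELL_SIZE_M = 5000  # 5x5 km cells, as in the comparison described above

def count_per_cell(points, cell_size=CELL_SIZE_M):
    """Count points per grid cell; points are (x, y) tuples in a metric CRS."""
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)

def relative_errors(truth_pts, detected_pts, cell_size=CELL_SIZE_M):
    """Relative error per cell; cells without ground truth trucks are skipped."""
    truth = count_per_cell(truth_pts, cell_size)
    detected = count_per_cell(detected_pts, cell_size)
    return {cell: (detected.get(cell, 0) - n) / n for cell, n in truth.items()}

# Hypothetical coordinates, for illustration only
truth_pts = [(100, 100), (200, 300), (4000, 4500), (6000, 100)]
detected_pts = [(150, 120), (6100, 200), (6200, 300)]
print(relative_errors(truth_pts, detected_pts))
```

Negative values indicate cells where fewer trucks were detected than present in the ground truth, positive values indicate the opposite.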

Some important remarks: The Sentinel-2-based detection method only works for large vehicles moving at around 50 km/h or faster. It is hence less suited to detecting trucks in city traffic. Additionally, trucks in tunnel sections cannot be detected. To enable a fair comparison, we therefore flagged trucks in the ground truth dataset that were located in tunnels and excluded them from the analysis. Sentinel-2 captures the region around Vienna at around 10:00 in the morning. As a result, it is not possible to assess other times of day. Finally, cloud cover will reduce the accuracy. Therefore, we picked images with a low reported cloud cover percentage (< 5%).

It is really exciting to finally see a truck detection method that works with readily available remote sensing data because this means that it is potentially transferable to other areas of the world where no official traffic counts are available. Furthermore, this method should be in line with data protection regulations (avoiding identification of individuals and potential reconstruction of movement trajectories) thus making it possible to use and publish the resulting data without further anonymization steps.


This post was written in collaboration with Henrik Fisser (Uni Würzburg / DLR) and Nikolaus Kasper (Asfinag MSG). Keep your eyes open for upcoming detailed publications on the Sentinel-2-based method by Henrik.


This post is part of a series. Read more about movement data in GIS.

One of the new features in QGIS 3.20 is the option to trim the start and end of simple line symbols. This allows the line rendering to trim off the first and last sections of a line at a user-configured distance, as shown in the visual changelog entry:

This new feature makes it much easier to create decorative label callout (or leader) lines. If you know QGIS Map Design 2, the following map may look familiar. However, the leader lines shown here are even more intricate, making use of the new trimming capabilities:

To demonstrate some of the possibilities, I’ve created a set of four black and four white leader line styles:

You can download these symbols from the QGIS style sharing platform: https://plugins.qgis.org/styles/101/ to use them in your projects. Have fun mapping!

Today’s post is a video recommendation. In the following video, Alexandre Neto demonstrates an exciting array of tips, tricks, and hacks to create an automated Atlas map series of the Azores islands.

Highlights include:

1. A legend that includes automatically updating statistics

2. A way to support different page sizes

3. A solution for small areas overshooting the map border

You’ll find the video on the QGIS Youtube channel:

This video was recorded as part of the QGIS Open Day June edition. QGIS Open Days are organized monthly on the last Friday of the month. Anyone can take part and present their work for and with QGIS. For more details, see https://github.com/qgis/QGIS/wiki#qgis-open-day

The latest MovingPandas v0.7 release is now available from conda-forge.

New features include:

As always, all tutorials are available from the movingpandas-examples repository and on MyBinder:

In the last few days, there’s been a sharp rise in interest in vessel movements, and particularly, in understanding where and why vessels stop. Following the grounding of Ever Given in the Suez Canal, satellite images and vessel tracking data (AIS) visualizations are everywhere:

Using movement data analytics tools, such as MovingPandas, we can dig deeper and explore patterns in the data.

The MovingPandas.TrajectoryStopDetector is particularly useful in this situation. We can provide it with a Trajectory or TrajectoryCollection and let it detect all stops, that is, instances where the moving object stayed within a certain area (with a diameter of 1000m in this example) for an extended duration (at least 3 hours).

from datetime import timedelta
import movingpandas as mpd
stops = mpd.TrajectoryStopDetector(trajs).get_stop_segments(
    min_duration=timedelta(hours=3), max_diameter=1000)

The resulting stop segments include spatial and temporal information about the stop location and duration. To make this info more easily accessible, let’s turn the stop segment TrajectoryCollection into a point GeoDataFrame:

import geopandas as gpd

# One row per stop segment, indexed by the stop segment's id
stop_pts = gpd.GeoDataFrame(columns=['geometry']).set_geometry('geometry')
stop_pts['stop_id'] = [track.id for track in stops.trajectories]
stop_pts = stop_pts.set_index('stop_id')

for stop in stops.trajectories:
    # Take the object ID from the first record of the stop segment
    stop_pts.at[stop.id, 'ID'] = stop.df['ID'].iloc[0]
    stop_pts.at[stop.id, 'datetime'] = stop.get_start_time()
    stop_pts.at[stop.id, 'duration_h'] = stop.get_duration().total_seconds()/3600
    stop_pts.at[stop.id, 'geometry'] = stop.get_start_location()

Indeed, I think the next version of MovingPandas should include a function that directly returns stops as points.
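As an aside, for intuition about what the two stop detection parameters mean, here is a simplified pure-Python sketch of the stop criterion: a run of consecutive points that stay within max_diameter of each other for at least min_duration. This is not the MovingPandas implementation, just an illustration; it assumes time-ordered points in a metric CRS, and the function name and sample track are made up.

```python
from datetime import datetime, timedelta
from math import hypot

def detect_stops(points, max_diameter, min_duration):
    """Return (start_time, end_time) pairs of detected stops.
    points: time-ordered list of (time, x, y) tuples in a metric CRS."""
    stops = []
    i = 0
    while i < len(points):
        j = i
        # Extend the run while the next point stays within max_diameter
        # of every point that is already part of the run
        while j + 1 < len(points) and all(
            hypot(points[j + 1][1] - px, points[j + 1][2] - py) <= max_diameter
            for _, px, py in points[i:j + 1]
        ):
            j += 1
        if points[j][0] - points[i][0] >= min_duration:
            stops.append((points[i][0], points[j][0]))
            i = j + 1  # continue after the detected stop
        else:
            i += 1
    return stops

# Hypothetical track: moving, then nearly stationary for 3 hours, then moving
t0 = datetime(2021, 3, 23, 0, 0)
track = [
    (t0, 0, 0),
    (t0 + timedelta(hours=1), 5000, 0),
    (t0 + timedelta(hours=2), 10000, 0),
    (t0 + timedelta(hours=3), 10100, 50),
    (t0 + timedelta(hours=4), 10050, -20),
    (t0 + timedelta(hours=5), 10200, 0),
    (t0 + timedelta(hours=6), 20000, 0),
]
print(detect_stops(track, max_diameter=1000, min_duration=timedelta(hours=3)))
```

The brute-force distance check keeps the sketch short but scales poorly, so it is only meant for small examples.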

Now we can explore the stop information. For example, the map plot shows that stops are concentrated in three main areas: the northern and southern ends of the Canal, as well as the Great Bitter Lake in the middle. By looking at the timing of stops and their duration in a scatter plot, we can clearly see that the Ever Given stop (red) caused a chain reaction: the numerous points lining up on the diagonal of the scatter plot represent stops that very likely are results of the blockage:

Before the grounding, the stop distribution nicely illustrates the canal schedule. Vessels have to wait until it’s their direction’s turn to go through:

You can see the full analysis workflow in the following video. Please turn on the captions for details.

Huge thanks to VesselsValue for supplying the data!

For another example of MovingPandas’ stop detection in action, have a look at Bryan R. Vallejo’s tutorial on detecting stops in bird tracking data, which includes some awesome visualizations using KeplerGL:

Kepler.GL visualization by Bryan R. Vallejo

This post is part of a series. Read more about movement data in GIS.
