PyQGIS 101: Chaining Processing tools

In “Running Processing tools”, we explored the basics of running Processing tools from PyQGIS. This time, we’ll look into how to chain multiple tools into a workflow.

In this example, we’ll be building a workflow that computes which cities are located along the Danube river (data from NaturalEarthData). We’ll first use the GUI to run the corresponding tools (Extract by expression, Buffer, and Extract by location) and test our workflow. Then we will create a PyQGIS version of the workflow.

After using the GUI, we can look up the corresponding code in the Processing history. It looks something like this:

processing.run("native:extractbyexpression",
    {'INPUT':'E:/Geodata/NaturalEarth/natural_earth_vector.gpkg|layername=ne_110m_rivers_lake_centerlines',
    'EXPRESSION':'name = \'Donau\'','OUTPUT':'memory:'})
processing.run("native:buffer",
    {'INPUT':'MultiLineString?crs=EPSG:4326&...','DISTANCE':0.1,'SEGMENTS':5,
    'END_CAP_STYLE':0,'JOIN_STYLE':0,'MITER_LIMIT':2,'DISSOLVE':False,'OUTPUT':'memory:'})
processing.run("native:extractbylocation",
    {'INPUT':'E:/Geodata/NaturalEarth/natural_earth_vector.gpkg|layername=ne_110m_populated_places',
    'PREDICATE':[0],'INTERSECT':'MultiPolygon?crs=EPSG:4326&...','OUTPUT':'memory:'})

These code snippets from the Processing history provide us with a guide to the necessary syntax, but we cannot use the code directly. When we convert this to a script, we need to handle the intermediate results ourselves. Above, these intermediate results are stored in memory layers. The memory layer resulting from Extract by expression is passed to the Buffer tool input as 'MultiLineString?crs=EPSG:4326&...'. For our script, we need a different approach: we store a reference to each result in a variable.

The following script shows how the results of one tool can be passed to the next. For the tools we are using in this example, processing.run() always returns a dictionary with just one entry called 'OUTPUT'. By reading that entry with ['OUTPUT'], we store the result layer of Extract by expression in the variable danube. Then we use the danube layer as an input for the buffer tool, and so on:

my_gpkg = 'E:/Geodata/NaturalEarth/natural_earth_vector.gpkg'
rivers = f'{my_gpkg}|layername=ne_110m_rivers_lake_centerlines'
places = f'{my_gpkg}|layername=ne_110m_populated_places'

expression = "name = 'Donau'"
danube = processing.run("native:extractbyexpression",
    {'INPUT':rivers,'EXPRESSION':expression,'OUTPUT':'memory:'}
    )['OUTPUT']

buffer_distance = 0.1  # degrees

buffered_danube = processing.run("native:buffer",
    {'INPUT':danube,'DISTANCE':buffer_distance,'SEGMENTS':5,'END_CAP_STYLE':0,
    'JOIN_STYLE':0,'MITER_LIMIT':2,'DISSOLVE':False,'OUTPUT':'memory:'}
    )['OUTPUT']

places_along_danube = processing.run("native:extractbylocation",
    {'INPUT':places,'PREDICATE':[0],'INTERSECT':buffered_danube,'OUTPUT':'memory:'}
    )['OUTPUT']

QgsProject.instance().addMapLayer(places_along_danube)

for feature in places_along_danube.getFeatures():
    print(feature["name"])

Don’t worry if the syntax of some Processing tools looks a bit complex. Remember that you can look up the correct syntax for every Processing tool you’ve run through the GUI in the Processing history.

Running this script returns the following four cities:

Bratislava
Belgrade
Budapest
Vienna
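
The `)['OUTPUT']` at the end of each call in the script is ordinary dictionary indexing on the value that processing.run() returns. The sketch below shows the same access written in two steps and in one step. To keep it runnable outside QGIS, `fake_run` is a hypothetical stand-in for processing.run() that only mimics the shape of the return value; it is not part of QGIS.

```python
def fake_run(algorithm_id, parameters):
    """Hypothetical stand-in for processing.run(), NOT a QGIS function.

    It mimics only the shape of the return value for the tools used in
    this lesson: a dictionary with a single 'OUTPUT' entry.
    """
    return {'OUTPUT': f"result of {algorithm_id}"}


params = {'INPUT': 'rivers', 'EXPRESSION': "name = 'Donau'", 'OUTPUT': 'memory:'}

# Two steps: keep the whole dictionary, then pick the entry we need.
result = fake_run("native:extractbyexpression", params)
danube_two_steps = result['OUTPUT']

# One step, as in the script: index the returned dictionary directly.
danube_one_step = fake_run("native:extractbyexpression", params)['OUTPUT']

print(list(result.keys()))
print(danube_two_steps == danube_one_step)
```

Both forms give the same value. The one-step form is simply shorter when the rest of the dictionary is not needed.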

These are the basics of chaining Processing tools to build more complex workflows. The key is to figure out the correct syntax by examining the Processing history. Then we can store the results in variables and pass them on to successive tools. Finally, we can load the results and explore, style, or print them.
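
The pattern of storing each result and passing it on can also be wrapped in a small helper. The following is only a sketch under stated assumptions, not an official QGIS utility: `run_chain` and `recording_run` are hypothetical names, and `recording_run` is a stand-in that records calls so the example runs outside QGIS. Each step names the parameter that should receive the previous result, because Extract by location takes the buffer as 'INTERSECT' rather than 'INPUT'.

```python
def run_chain(run, first_input, steps):
    """Run each (algorithm_id, input_key, parameters) step in order.

    `run` is any callable with the call shape of processing.run(): it takes
    an algorithm id and a parameter dictionary and returns a dictionary
    that contains an 'OUTPUT' entry. (Hypothetical helper for illustration.)
    """
    current = first_input
    for algorithm_id, input_key, parameters in steps:
        step_parameters = dict(parameters)       # copy, leave the caller's dict alone
        step_parameters[input_key] = current     # feed in the previous result
        step_parameters.setdefault('OUTPUT', 'memory:')
        current = run(algorithm_id, step_parameters)['OUTPUT']
    return current


calls = []


def recording_run(algorithm_id, parameters):
    """Hypothetical stand-in for processing.run() that only records calls."""
    calls.append((algorithm_id, parameters))
    return {'OUTPUT': f"output of {algorithm_id}"}


steps = [
    ("native:extractbyexpression", 'INPUT', {'EXPRESSION': "name = 'Donau'"}),
    ("native:buffer", 'INPUT', {'DISTANCE': 0.1, 'SEGMENTS': 5, 'END_CAP_STYLE': 0,
                                'JOIN_STYLE': 0, 'MITER_LIMIT': 2, 'DISSOLVE': False}),
    ("native:extractbylocation", 'INTERSECT', {'INPUT': 'places', 'PREDICATE': [0]}),
]

final = run_chain(recording_run, 'rivers', steps)
print(final)
```

In the QGIS Python console, the idea would be to pass processing.run as `run` and the real layer sources (rivers and places from the script above) instead of the placeholder strings. The explicit variables in the original script remain easier to read and debug for a workflow of this size.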

PyQGIS 101 is a work in progress. I’d appreciate any feedback, particularly from beginners!

24 comments

soli004 said: 2019-01-05 16:27

I saved the csv file as a geopackage on the desktop, changed the path to the desktop, and now it runs through but with no result. As I’m not using lat/lon, I changed the buffer distance to 500 (I guess that would be meters) as I use EPSG:3006. I tried values like 100, 500, 1000, 5000 but the result is empty… so something is still not working for me… sorry for this long problem solving… the “vlayer.setSubsetString” is working for me at least halfway through my process, so a big improvement compared to doing it manually :)

soli004 said: 2019-01-05 20:07

data sent!
Thanks

- underdark said: 2019-01-05 23:43
  
  I’m confused. The river dataset you sent only covers Sweden and thus does not contain a river with name danube. How should the altered script you posted work in that case?
  
  - soli004 said: 2019-01-05 23:48
    
    Sorry, my bad. I thought you could put any name you wanted so I let it be danube….
    That is why it did not work… :#) I misunderstood you….
    
Leo said: 2019-01-11 05:49

Really cool stuff. Thanks for mentioning the processing history, it was completely unknown to me!
Running these outside of the QGIS Python Console seems to be a lot more fiddly since QGIS 3. I tried a million different ways to set up the environment variables, etc., but no luck. I currently have to use the QGIS Python shell that comes with the stand-alone version while calling OSGeo4W paths. The normal OSGeo4W shell doesn’t work :(
A neat install guide for python using OSGeo4W/QGIS tools outside of QGIS would be amazing!

- underdark said: 2019-03-03 21:12
  
  Hi Leo, I’ve now published a new tutorial on writing stand-alone PyQGIS scripts https://anitagraser.com/2019/03/03/stand-alone-pyqgis-scripts-with-osgeo4w/
  
  - Leo said: 2019-03-13 05:23
    
    Awesome! Thanks for the update, never heard of PyCharm. Thanks
mironemanuel said: 2019-04-15 10:18

Great Tutorial, thanks! Just a small note: it seems that in the second code-block in line 12 (for the buffered_danube) there is a “{” missing. Cheers

- underdark said: 2019-04-15 20:10
  
  Thank you for taking the time to report that mistake! It’s fixed now.
  
Nina Schnetzer said: 2019-06-30 12:18

Hi Anita,
why is ['OUTPUT'] needed after the curly brackets with the input parameters for the algorithms? What is its function in the script?
Thank you!

- underdark said: 2019-06-30 19:18
  
  Running the algorithm returns a dictionary of values. ‘OUTPUT’ is the default name (aka ‘key’) of the algorithm results in the dictionary. (Theoretically it could be any other name as well.) Have a look at the earlier example in https://anitagraser.com/pyqgis-101-introduction-to-qgis-python-programming-for-non-programmers/pyqgis-101-running-processing-tools/
  
Laura said: 2020-04-01 13:25

I have the same question, and I still don’t have the answer. Why is the extra ['OUTPUT'] needed? I understand the first is the dictionary (key), but what about the second ['OUTPUT']? I noticed that you use it when addMapLayer uses a variable like addMapLayer(places_along_danube), but you don’t use it with the runAndLoadResults function or the addMapLayer(result['OUTPUT']) function. I can’t figure it out: what’s the meaning of the second ['OUTPUT']?

- underdark said: 2020-04-02 16:20
  
  Hi Laura,
  Are you referring to the lesson https://anitagraser.com/pyqgis-101-introduction-to-qgis-python-programming-for-non-programmers/pyqgis-101-running-processing-tools/?
  
  - Laura said: 2020-04-02 16:22
    
    Yes i do :-)
- underdark said: 2020-04-02 16:44
  
  ['OUTPUT'] is always used to access the corresponding value in a dictionary. With runAndLoadResults(), we don’t need to access the result since it is automatically loaded anyway and we didn’t want to do anything else with the result in this example. However, it would be possible to write result = runAndLoadResults() and then use result['OUTPUT'].
  
  - Laura said: 2020-04-06 07:58
    
    okay, thank you!
jens said: 2020-05-20 08:51

Hi Anita,
this tutorial is very amazing and easy to understand. I followed it partly with my own data and encountered a problem for which I cannot find a solution on the web.
When I try to use the “gdal:cliprasterbymasklayer” algorithm, it works from the GUI. Then I copy it from the history to the Python script editor window as you mentioned, and it silently gives no output. All other algorithms I implemented in my script so far work as expected, but I cannot get an output from cliprasterbymasklayer.
I hope you can give me the solution, if it is not too much effort.
Here are two lines of my code. The first algorithm works and delivers the correct output (as I think) for the second one, which does not work. (“dgm_m” is a GeoTiff, the source for the “MASK” is a Shapefile with one polygon selected, “dgm_c” is a string of the full file path)

result = processing.run("gdal:fillnodata", {'INPUT':dgm_m,'BAND':1,'DISTANCE':5,'ITERATIONS':0,'NO_MASK':False,'MASK_LAYER':None,'OPTIONS':'','EXTRA':'','OUTPUT':dgm_f})['OUTPUT']

processing.run("gdal:cliprasterbymasklayer", {'INPUT':result,'MASK':QgsProcessingFeatureSourceDefinition('gp_2012_pol_abschnitte_f5621c7e_1625_4b67_b6b0_06348a48ebb3', True),'SOURCE_CRS':None,'TARGET_CRS':QgsCoordinateReferenceSystem('EPSG:4647'),'NODATA':None,'ALPHA_BAND':False,'CROP_TO_CUTLINE':False,'KEEP_RESOLUTION':True,'SET_RESOLUTION':False,'X_RESOLUTION':None,'Y_RESOLUTION':None,'MULTITHREADING':False,'OPTIONS':'','DATA_TYPE':0,'EXTRA':'','OUTPUT':dgm_c})
Thank you in advance
Jens

- underdark said: 2020-05-23 12:13
  
  Hi Jens, I haven’t worked with QgsProcessingFeatureSourceDefinition yet. Try replacing that with a regular file path.
  
  - jens said: 2020-05-26 20:05
    
    Hi Anita,
    it works now. The only thing I changed is using only the minimum of input parameters (So I deleted some parameters that may have caused the error)
    To replace the FeatureSourceDefinition with a regular file path is not the best idea, though. The FeatureSourceDefinition takes a layer with a selection in my case. As I tested it, the selection was gone when I used the regular file path (I think that’s logical).
    Thank you for taking the time to answer.
    Jens
Alessio said: 2020-12-26 11:04

Hi Anita! Great tutorial, thank you so much for doing this!

I get this error: File “”, line 33 ‘OUTPUT’: ‘memory:’ ^ SyntaxError: invalid syntax

Do you have any idea what causes this error? Do I have to set the path of the temporary memory before using this line of code?

Thanks in advance for your time!

- Alessio said: 2020-12-26 11:19
  
  I’m sorry, I copied the wrong line, the syntax error is actually occurring in the last line
  
  File “”, line 33 [‘OUTPUT’] ^ SyntaxError: invalid syntax
  
  so is [‘OUTPUT’] causing the problem, and since the syntax is correct, I’m wondering if the problem relates to define somehow the directory to the folder where the temporary output is stored.
  
  Thanks!
  
  - underdark said: 2020-12-30 23:46
    
    Are you maybe missing the closing brackets to the run function before accessing the output? Most lines are:
    
    )['OUTPUT']
Katrin Schmidt said: 2021-11-16 10:21

Hi Anita,
just found your page while trying to figure out programming in QGIS. This was super helpful and super easy to follow. I watched a lot of YouTube tutorials but they never clicked for me. Thank you for this post!
Katrin

Bernard said: 2023-02-21 09:32

Hi Anita! Great tutorial, thank you so much for doing this!

my_gpkg = 'C:/Geodata/NaturalEarth/natural_earth_vector.gpkg'
rivers = '{}|layername=ne_110m_rivers_lake_centerlines'.format(my_gpkg)

'{}|' doesn’t work!

but

'C:/Geodata/NaturalEarth/natural_earth_vector.gpkg|layername=ne_110m_rivers_lake_centerlines'.format(my_gpkg)

works

Wish you a Nice day
