Splitting many polygons into (almost) equal parts using a simple script in PyQGIS, the Python Console of QGIS

Fellow researchers and open-source GIS enthusiasts,

Welcome to my blog!

I’d like to start with a disclaimer – I may be a researcher of this very area but that doesn’t mean everything I do or write here will work for you, in your own desktop configurations and package versions. I have no responsibility if you lose data or mess up your installation. I also do not authorize any copies of my content, except for the gists, which are available under the MIT License.

Whatever is your geo-thing, you may have come across a Polygon which needed to be split in half, three parts, five parts (a hundred parts?). Possible applications of the split polygons are performing zonal statistics and terrain parceling (agriculture).

But wait! There is already a known pipeline used for splitting polygons in GIS Software. It was developed by Darafei Praliaskouski for PostGIS and tested by Paul Ramsey (Thank you!), and is available at Paul Ramsey’s website. The same pipeline was also adapted by Ujaval Gandhi to be used in QGIS (through the User Interface) and can be found in his blog. If you are dealing with a small number of polygons to be divided in QGIS, the aforementioned post from Ujaval Gandhi may be useful to you, especially if you are not familiar with the Python Console of QGIS (PyQGIS).

If you are a Portuguese speaker, you can also check this video on the YouTube channel of Geoaplicada, in which Jocilene Barros uses the Polygon Divider plugin.

By the way, if you are a Portuguese speaker, did you know you could read this exact blog post, and all others, in Portuguese? Just change the language by clicking the globe symbol on the upper right.

Why PyQGIS?

Oh, so many options! If you are still here, I guess you want to use PyQGIS for this task. But why would you want that? I listed four motives below:

  1. You have many Polygons to divide in equal parts each. Doing the same process one-by-one is such a waste of time. Honestly, this was my motive.

  2. You don’t want to install more plugins, but you also don’t want to spend too much time splitting polygons.

  3. You are a Python novice/intermediate/advanced and want to see a working example of how to use PyQGIS.

  4. You want to use only native tools of QGIS. This pipeline here uses only native tools.

A Disclaimer about the usage of the word “Equal”

The resulting polygons generated by pipelines which involve K-means clustering and Voronoi/Thiessen polygons (such as mine) most of the times do not generate a perfect division in terms of area. There is an error associated to each division, which we will be exploring later, on the section “Checking the results and associated errors”. Do not use my pipeline if you need perfectly equal resulting polygons.

My pipeline

For this development, I use a slightly different pipeline from the one described in Paul Ramsey’s website, but I was inspired by that pipeline to build this one.

My pipeline, in general lines:

  1. Preprocessing: Split the original vector file into N other files, each one with one polygon.

  2. In PyQGIS: Get the extents of the vector layer

  3. Generate regular spacing points within those extents

  4. Clip the regular points to the polygon area

  5. Cluster your points by using K-Means algorithm, use the number of clusters as equal to the number of divisions

  6. Calculate mean coordinates for each cluster of points

  7. Calculate Voronoi polygons for the mean coordinates

  8. Clip the polygon based on the Voronoi polygons

  9. Postprocessing: check the results

The example

I start with this Shapefile:

shapefile qgis

This is its attribute table:

shapefile qgis attribute table

And, in this example, I want to divide each of the polygons in three equal parts.

I am using QGIS 3.22 Bialowieza for this tutorial.

Preprocessing details

To make sure the polygons are divided in equal (or almost equal) parts, it is important to know the area of each of them beforehand. On the attribute table, open the field calculator and create a new field called area. I gave my field the type “Decimal number (real)”. The expression is simply “$area”. Generate this column, save, and turn off the editing mode. My attribute table now looks like this:

shapefile qgis attribute table

Important: use a projected coordinate system when performing area calculations.

Another important preprocessing step. To run the PyQGIS code on this post, I need to have each polygon in a different shapefile. If you have all the polygons you want to split in the same shapefile, use “Split Vector Layer” (native:splitvectorlayer) to separate the polygons. In my case, I used the id column to split them.

split vector layer qgis

Save them in a permanent folder, preferentially.

The resulting shapefiles are: split vector layer qgis result

Running the PyQGIS code

This is the gist, hosted on GitHub. Made available under MIT License.

#
# dividing a Polygon in (almost) equal parts using PyQGIS
# by Luisa Vieira Lucchese
# luisalucchese.com
# February 2022
# MIT License
#Copyright (c) 2022 Luisa V. Lucchese
#Permission is hereby granted, free of charge, to any person obtaining a copy
#of this software and associated documentation files (the "Software"), to deal
#in the Software without restriction, including without limitation the rights
#to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
#copies of the Software, and to permit persons to whom the Software is
#furnished to do so, subject to the following conditions:
#The above copyright notice and this permission notice shall be included in all
#copies or substantial portions of the Software.
#THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
#IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
#FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
#AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
#LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
#OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
#SOFTWARE.
#
from qgis.core import *
import processing
#
# use split vector layer before running this code
# and refer to the blog post about this code to know how to use it
#
#### EDIT INPUTS BELOW ####
# provide the folder of the polygons below
POLYLOC='C:/.../'
POLYLOC_ID=POLYLOC+'id_'
# spacing of regular points
SPACING_REG=100.0
#number of divisions
N_DIV=3
# provide the folder to save the resulting points and polygons
OUTLOC='C:/.../'
OUTPOLYGONS=OUTLOC+'polygon_'
# coordinate system
CRS_PTS='EPSG:32722'
# min number and max number of id to process
MINNUM=1
MAXNUM=4
#################################################
for cycle in range(MINNUM,MAXNUM+1):
try:
# open
polyloc_compose=POLYLOC_ID+str(cycle)+'.shp'
poly_qgs = QgsVectorLayer(polyloc_compose, 'poly_id_'+str(cycle), 'ogr')
QgsProject.instance().addMapLayers([poly_qgs])
# extent of the polygon
extentrect=poly_qgs.extent()
xmax = extentrect.xMaximum()
ymax = extentrect.yMaximum()
xmin = extentrect.xMinimum()
ymin = extentrect.yMinimum()
extentreg=str(xmin) + ',' + str(xmax) + ','+ str(ymin)+ ',' + str(ymax)
# generate regular points
paramreg ={'EXTENT':extentreg, 'SPACING':SPACING_REG, 'INSET':5, 'RANDOMIZE':0, 'IS_SPACING':1, 'CRS': CRS_PTS, 'OUTPUT':'memory:'}
regpoints=processing.run("qgis:regularpoints", paramreg)#temp
# extract the points inside the polygons
paramclip= {'INPUT': regpoints['OUTPUT'],'OVERLAY':poly_qgs, 'OUTPUT':'memory:'}
regpointsclip=processing.run("native:clip", paramclip)#temp
QgsProject.instance().addMapLayers([regpointsclip['OUTPUT']])
# clustering
paramcluster= {'CLUSTERS' : N_DIV, 'FIELD_NAME' : 'CLUSTER_ID','INPUT': regpointsclip['OUTPUT'],'OUTPUT':'memory:','SIZE_FIELD_NAME' : 'CLUSTER_SIZE'}
regpointsclusters=processing.run("native:kmeansclustering", paramcluster)#temp
# mean coordinates
parammean= {'INPUT': regpointsclusters['OUTPUT'],'OUTPUT':'memory:','UID' : 'CLUSTER_ID', 'WEIGHT' : ''}
regpointsmean=processing.run("native:meancoordinates", parammean)
QgsProject.instance().addMapLayers([regpointsmean['OUTPUT']])
# voronoi polygons
paramvoro={'BUFFER' : 100, 'INPUT' : regpointsmean['OUTPUT'], 'OUTPUT' : 'memory:' }
voronoi_poly=processing.run("qgis:voronoipolygons", paramvoro)
QgsProject.instance().addMapLayers([voronoi_poly['OUTPUT']])
# clip voronoi polygons by the original polygons
paramclip= {'INPUT': voronoi_poly['OUTPUT'],'OVERLAY':poly_qgs, 'OUTPUT':OUTPOLYGONS+str(cycle)}
polyclip=processing.run("native:clip", paramclip)
polyclip_qgs = QgsVectorLayer(OUTPOLYGONS+str(cycle)+'.gpkg', 'divided_'+str(cycle))
QgsProject.instance().addMapLayers([polyclip_qgs])
except:
print('the following polygon could not be processed:', cycle)

Make the adaptations necessary to each case, namely, coordinate system, number of divisions, spacing of the regular points used to divide the polygons, location of the files in the system, and name of the attribute table column used to split the polygons into different files.

And… run the code!

This is the resulting split for the example polygons:

divided polygons

Checking the results and associated errors

For polygon 4, I have that the total area is: 7080260.638, which means one third of it is 2360086.879.

In the attribute table of the divided polygon, I calculate “$area” (same as I did before). I get the values: 2050355.0129558295, 2675185.5178783312, and 2354716.308374524.

Therefore, we can notice there is a variation in area between the divided parts. This happens mainly because the clusters generated by K-Means Clustering have different sizes. Using Random Points instead of Regular Points does not help and can even induce a higher error, in some cases (like if you use a small number of random points for the area). If you do find a workaround for this issue, please let me know.

Extras

  • When I should NOT be using this script to divide my polygons?

You should not use this script, or any pipeline involving clusters of different sizes, if you need precision in your division.

  • If there are already pipelines to divide polygons into different areas, what exactly have you accomplished here?

The automation of a potentially tedious process, without the need to install any plugins.

  • You said you do not use any plugins, but Python Console is a plugin. Gotcha!

You’re right, “Python Console” is a plugin, as well as “Processing” and “GRASS GIS Provider” which are plugins too. I mentioned there is no need to install extra plugins to run my script.

Luísa Vieira Lucchese
Luísa Vieira Lucchese
Research Assistant Professor

Research Assistant Professor at University of Pittsburgh

Related