Dissolving shapefiles using shapely and psycopg2

Geospatial data is commonly stored in PostgreSQL which, through PostGIS, provides a great set of tools for geospatial analysis. However, if this analysis is part of a larger workflow chain it can sometimes be easier to work with the excellent shapley package that provides a easy set of calls for advanced geospatial analysis. The script below looks at how to extract data from PostgreSQL, insert into shapely and run a common geospatial query: dissolve (cascaded union).

import psycopg2
import shapely.wkb as wkb
import shapely.wkt as wkt 
from shapely.ops import cascaded_union
import json
def main():
    #Define our connection string
    conn_string = "host='' dbname='nathan_local' user='postgres' password='XXXXXXXXX'"
    # print the connection string
    print("Connecting to database %s" % (conn_string))
    # get a connection
    conn = psycopg2.connect(conn_string)
    # conn.cursor will return a cursor object, you can use this cursor to perform queries.
    cursor = conn.cursor()

    # use the cursor object to extract data from postgreSQL
    cursor.execute("SELECT tile_name, st_asewkt(wkb_geometry) FROM test.tablename")

    tiles = list(cursor.fetchall()) #fetchall returns all matching records

    coverage = {}
    for name, geometry in tiles:
        coverage[name] = wkt.loads(geometry.split(';')[1])

    geometry = cascaded_union(coverage.values())
    with open('/file/path/test.json', 'w') as outfile:
        json.dump(mapping(geometry), outfile)
if __name__ == "__main__":

Calculating number of overlaps between polygons

A common application within GIS is to create buffer / catchment areas – an interesting analysis is then to see how many overlaps occur with the each catchment indicating whether a site is under or over provisioned. This can be achieved using Postgis and the following fantastic tutorial.

This can you allow you to produce maps like the following:


Below gives a snapshot of the process:

create table boundaries_allot as select ST_Union(ST_ExteriorRing(geom)) from
(select (ST_DumpRings((st_dump(geom)).geom)).geom as geom from allotment_buffer) q 

This has been altered from the tutorial to include multi polygons. ST_Dump breaks down multi polygons into single polygons and ST_DumpRings provides the interior and exterior rings of polygons (for complex polygons)

CREATE SEQUENCE polyseq_allots;
CREATE TABLE polys_allots AS
SELECT nextval('polyseq_allots') AS id, (ST_Dump(ST_Polygonize(st_union))).geom AS geom
FROM boundaries_allot;

This creates individual polygons for each multi-linestring and uses ST_Dump to switch any multi polygons to single polygons.

UPDATE POLYS_allots set count = p.count
  SELECT count(*) AS count, p.id AS id  
  FROM polys_allots p 
  JOIN allotment_buffer c 
  ON ST_Contains(c.geom, ST_PointOnSurface(p.geom)) 
  GROUP BY p.id
) AS p
WHERE p.id = polys_allots.id;

Once we have the individual polygons, by using the centroids of the new small polygons with the set of original circles, we can calculate how many circles contain each centroid point.

PG routing – driving distance

Building on the work of previous posts looking at building routable networks and finding the shortest path in a network, this post will look at creating buffers around nodes based on a cost attribute specifed by the user.

The driving distance function, which comes with pg routing 2.0, creates an alpha shape around a node, normally based on the distance someone could travel from that node, or how far someone could go in a time interval.

For this example, we are looking at where someone could travel in 1km along the network from a central node. The following query selects all the nodes that are within 1 km of node 2000.

Continue reading

PG routing – shortest path algorithms

Once a routable network has been set up we can start to find the shortest path (using Dijkstra’s algorithm) between two points. From here on, I will be using the OSM data for the London road network.

The query use’s a ‘cost’ column – which could be anything thing you like – to calculate the shortest cost from one node to another. In this example I decided to use travel time (rather than distance).

First, I changed the distance of each vertex (the way table) from km to miles, and then calculated the time it would take to traverse each vertex whether walking, cycling or driving. To make this more accurate, I updated the max speed (forward and backward) along each vertex dependent on its class (as listed in the class table).

Continue reading

Processing OS Codepoint with polygons

Following on from an earlier post updating OS Codepoint data , this post will now look at how to process OS Codepoint with polygons. This dataset provides a polygon for each postcode, including vertical streets (represented by squares) which is where a building has more than one postcode.

1) From the data provided by OS all the shapefiles (for each postcode sector) need to be merged into one. This can be achieved using QGIS ‘Merge to one’ tool under the vector toolbar or by writing a simple python script in arcmap utilising the merge tool.

2) We are going to create two extra fields in the attribute table ‘PCOutcode’ and ‘PCSector’ – this will provide us with additional symbology options with GIS. For ‘PCOutcode’, use =RTrim(Left(Postcode,4)) to capture the first part of each postcode, and for ‘PCSector’ use =RTrim(Left(Postcode,5)) where Postcode is the postcode field and RTrim removes any trailing spaces.

Continue reading

PostgreSQL & PostGIS command line introduction

This post highlights some very useful commands for entering spatial data into a PostgreSQL database ready for analysis within GIS or using Postgis.

For entering spatial data into PostgreSQL you can either use the ogr tools or pgsql2shp commands. After download, both need to be added to your windows environment variable:

ogr2ogr: input csv

ogr2ogr -overwrite -f "postgresql" pg:"dbname=postgres host=localhost port = 5632 user=postgres password = XXXXX" C:\Link\To\file.csv -lco  OVERWRITE=YES

ogr2ogr: input shapefile

ogr2ogr –overwrite –nlt MULTIPOLYGON –f PostgreSQL PG:"dbname=Nathan_spatial host=localhost port=5632 user=postgres password=XXXXX" example.shp

ogr2ogr: input ESRI FileGDB

ogr2ogr –overwrite – skipfailures  –f PostgreSQL PG:"dbname=Nathan_spatial host=localhost port=5632 user=postgres password=XXXXX" "C:\somefolder\somegeodatabase.gdb" "example feature class"

Continue reading