Dissolving shapefiles using shapely and psycopg2

Geospatial data is commonly stored in PostgreSQL which, through PostGIS, provides a great set of tools for geospatial analysis. However, if this analysis is part of a larger workflow chain it can sometimes be easier to work with the excellent shapley package that provides a easy set of calls for advanced geospatial analysis. The script below looks at how to extract data from PostgreSQL, insert into shapely and run a common geospatial query: dissolve (cascaded union).


#!/usr/bin/python
import psycopg2
import shapely.wkb as wkb
import shapely.wkt as wkt 
from shapely.ops import cascaded_union
import json
 
def main():
    
    #Define our connection string
    conn_string = "host='12.34.56.78' dbname='nathan_local' user='postgres' password='XXXXXXXXX'"
 
    # print the connection string
    print("Connecting to database %s" % (conn_string))
 
    # get a connection
    conn = psycopg2.connect(conn_string)
 
    # conn.cursor will return a cursor object, you can use this cursor to perform queries.
    cursor = conn.cursor()
    print("Connected!\n")

    # use the cursor object to extract data from postgreSQL
    cursor.execute("SELECT tile_name, st_asewkt(wkb_geometry) FROM test.tablename")

    tiles = list(cursor.fetchall()) #fetchall returns all matching records

    coverage = {}
    for name, geometry in tiles:
        coverage[name] = wkt.loads(geometry.split(';')[1])

    geometry = cascaded_union(coverage.values())
	
    with open('/file/path/test.json', 'w') as outfile:
        json.dump(mapping(geometry), outfile)
	 
if __name__ == "__main__":
	main()

Calculating number of overlaps between polygons

A common application within GIS is to create buffer / catchment areas – an interesting analysis is then to see how many overlaps occur with the each catchment indicating whether a site is under or over provisioned. This can be achieved using Postgis and the following fantastic tutorial.

This can you allow you to produce maps like the following:

pastedImage

Below gives a snapshot of the process:

create table boundaries_allot as select ST_Union(ST_ExteriorRing(geom)) from
(select (ST_DumpRings((st_dump(geom)).geom)).geom as geom from allotment_buffer) q 

This has been altered from the tutorial to include multi polygons. ST_Dump breaks down multi polygons into single polygons and ST_DumpRings provides the interior and exterior rings of polygons (for complex polygons)

CREATE SEQUENCE polyseq_allots;
CREATE TABLE polys_allots AS
SELECT nextval('polyseq_allots') AS id, (ST_Dump(ST_Polygonize(st_union))).geom AS geom
FROM boundaries_allot;

This creates individual polygons for each multi-linestring and uses ST_Dump to switch any multi polygons to single polygons.

ALTER TABLE polys_allots ADD COLUMN count INTEGER DEFAULT 0;
UPDATE POLYS_allots set count = p.count
FROM (
  SELECT count(*) AS count, p.id AS id  
  FROM polys_allots p 
  JOIN allotment_buffer c 
  ON ST_Contains(c.geom, ST_PointOnSurface(p.geom)) 
  GROUP BY p.id
) AS p
WHERE p.id = polys_allots.id;

Once we have the individual polygons, by using the centroids of the new small polygons with the set of original circles, we can calculate how many circles contain each centroid point.

PG routing – shortest path algorithms

Once a routable network has been set up we can start to find the shortest path (using Dijkstra’s algorithm) between two points. From here on, I will be using the OSM data for the London road network.

The query use’s a ‘cost’ column – which could be anything thing you like – to calculate the shortest cost from one node to another. In this example I decided to use travel time (rather than distance).

First, I changed the distance of each vertex (the way table) from km to miles, and then calculated the time it would take to traverse each vertex whether walking, cycling or driving. To make this more accurate, I updated the max speed (forward and backward) along each vertex dependent on its class (as listed in the class table).

Continue reading

PostgreSQL & PostGIS command line introduction

This post highlights some very useful commands for entering spatial data into a PostgreSQL database ready for analysis within GIS or using Postgis.

For entering spatial data into PostgreSQL you can either use the ogr tools or pgsql2shp commands. After download, both need to be added to your windows environment variable:

ogr2ogr: input csv

ogr2ogr -overwrite -f "postgresql" pg:"dbname=postgres host=localhost port = 5632 user=postgres password = XXXXX" C:\Link\To\file.csv -lco  OVERWRITE=YES

ogr2ogr: input shapefile

ogr2ogr –overwrite –nlt MULTIPOLYGON –f PostgreSQL PG:"dbname=Nathan_spatial host=localhost port=5632 user=postgres password=XXXXX" example.shp

ogr2ogr: input ESRI FileGDB

ogr2ogr –overwrite – skipfailures  –f PostgreSQL PG:"dbname=Nathan_spatial host=localhost port=5632 user=postgres password=XXXXX" "C:\somefolder\somegeodatabase.gdb" "example feature class"

Continue reading