Processing OS Codepoint with polygons

Following on from an earlier post updating OS Codepoint data , this post will now look at how to process OS Codepoint with polygons. This dataset provides a polygon for each postcode, including vertical streets (represented by squares) which is where a building has more than one postcode.

1) From the data provided by OS all the shapefiles (for each postcode sector) need to be merged into one. This can be achieved using QGIS ‘Merge to one’ tool under the vector toolbar or by writing a simple python script in arcmap utilising the merge tool.

2) We are going to create two extra fields in the attribute table ‘PCOutcode’ and ‘PCSector’ – this will provide us with additional symbology options with GIS. For ‘PCOutcode’, use =RTrim(Left(Postcode,4)) to capture the first part of each postcode, and for ‘PCSector’ use =RTrim(Left(Postcode,5)) where Postcode is the postcode field and RTrim removes any trailing spaces.

3) Select all the postcodes that do not begin with a V (which indicates they are a vertical street) and then create a new shapefile from these.

4) Dissolve this shapefile on ‘PCOutcode’ and then ‘PCSector’.

For simplicity, I decided to give the gaps left by the vertical streets the same ‘PCOutcode’ and ‘PCSector’ as the polygon they fall with special cases for those that border more than one. You could leave the vertical streets by omitting step 3. Although you will lose information about postcodes in the vertical streets, you will create a much smoother layer.

5) In order to fill the holes we can either use PostGIS or the ‘Fill Tool’ found in QGIS under processing -> toolbox -> qgis. If using PostGIS then the code below provides a template for the ‘PCSector’ shapefile – just remember to change table and field names.

UPDATE codepointpolygonjan2015_pcsector SET wkb_geometry = a.st_collect FROM
SELECT st_collect(st_makepolygon(st_exteriorring(individual_geometry))), ogc_fid
        FROM (
               SELECT (st_dump(wkb_geometry)).geom individual_geometry, ogc_fid
		FROM codepointpolygonjan2015_pcsector
	     ) q
	group by ogc_fid
) a

WHERE a.ogc_fid = codepointpolygonjan2015_pcsector.ogc_fid;

6) This has filled the holes of vertical streets but not ones that were bordering two postcode boundaries. To achieve this, we now load the shapefile into ArcMap and perform a union on itself ensuring that the ‘Gaps allowed’ box is unchecked. This will create features of all the vertical streets left. Highlight all these features by searching for “” in the postcode field in the attribute table. Use the eliminate tool to merge them to the nearest polygon.

7) Now dissolve the resulting shapefile on ‘PCOutcode’ and ‘PCSector’ to complete the process. Note: there should be the same amount of shapefiles as the original layer that was loaded in the previous step.

One final issue that you may notice from the resulting shapefiles is there are quite a few islands (where postcode sectors/outputs are multipolygon’s). If using this layer at a small scale then these can corrected by hand for the area of interest. Alternatively, if you want to correct them for the whole of the UK you can use this method.

8) First we breakdown the multipolygons into single polygons and then find the largest area (as this represents the main postcode boundary). This is achieved using the following PostGIS code:

CREATE TABLE largest_postcode AS 
	SELECT distinct * FROM 	(
		SELECT pc_outcode, (ST_Dump(wkb_geometry)).geom As wkb_geometry,                                             
                                    ST_area((ST_Dump(wkb_geometry)).geom) as Area 
		                    FROM codepointpostcodeoutcode_final) as a 
		WHERE area = (
			SELECT max(b.area) 
			FROM (
				SELECT pc_outcode, (ST_Dump(wkb_geometry)).geom As wkb_geometry,     
                                ST_area((ST_Dump(wkb_geometry)).geom) as Area 
				FROM codepointpostcodeoutcode_final) as b WHERE b.pc_outcode = a.pc_outcode)


9) Load this table into QGIS and repeat steaps 5 – 7.

Note: For the ‘PCSector’ layer, you will find that some large sectors are being dropped through the process of choosing only the maximum area. To avoid this, we collect all the sectors that have been eliminated (by using the select by location or intersection tool) and then creating a layer for the large areas (say 10,000,000 and greater). This layer can then be merged with the original (thus preserving these large postcode sectors), with steps 5 – 7 run after to fill in the gaps.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s