This is part 2 of the blog on GeoPandas, in which we will complete the example workflow. The steps remaining now are to generate some random points around Victoria (to simulate addresses), create some rotated bounding boxes for our postcodes, and join the postcodes to our random points.
Worked Example, Pt. 2
Let’s continue.
import geopandas
Create some random points¶
To simulate the point-based locations, let's generate some random points within Victoria.
Let's import the dissolved Victoria shape to create these points.
vic_shape = geopandas.read_file('./outputs/vic_shape.shp')
ax = vic_shape.plot()
We can then use the convex_hull
property to get a simplified shape for processing
vic_hull = vic_shape.convex_hull
ax = vic_hull.plot()
To generate some points, we need to be able to create the points and assign them random coordinates.
For the points, we can use shapely geometry's Point()
constructor; and for the randomness, numpy's random.random()
function - so let's import both of those
import numpy as np
from shapely.geometry import Point
I'm going to generate the points using the max bounds of the convex hull, which we can access using the bounds
property. This returns a DataFrame object, with bounds values stored in its columns.
(there is only one feature in the convex hull GeoDataFrame, so we can access the first member directly with [0]
xmin, xmax, ymin, ymax = vic_hull.bounds['minx'][0], \
vic_hull.bounds['maxx'][0], \
vic_hull.bounds['miny'][0], \
vic_hull.bounds['maxy'][0]
vic_hull.bounds.head()
random.random(2000)
returns an iterable series of 2000 random numbers, which we then multiply by the difference between the maxs and mins, and add to the min.
We can then use the python builtin zip()
function and some list comprehension to generate a series of 2000 random cooridnate pairs, which we can then put into a GeoDataFrame.
xc = (xmax - xmin) * np.random.random(2000) + xmin
yc = (ymax - ymin) * np.random.random(2000) + ymin
points_geom = geopandas.GeoSeries([Point(x, y) for x, y in zip(xc, yc)])
points = geopandas.GeoDataFrame(geometry=points_geom, crs=vic_shape.crs)
The result:
ax = points.plot()
points.head()
Clipping and Joining¶
Now that we have our random points, I want to both trim them to those that fall within our Victorian postcodes, and assign them a postcode.
So let's get our postcodes data that we saved earlier.
vic_poas = geopandas.read_file('./outputs/vic_poas_plus.shp', crs=vic_shape.crs)
ax = vic_poas.plot()
vic_poas.head()
Then we can actually do the clip and join in a single step, using once again the GeoPandas excellent spatial join function sjoin()
, and the GeoDataFrame query()
method.
We can perform the join directly (the overlay method defaults to intersection
), and then filter in one line of code using query('index_right >= 0')
, such that we only keep records that were successfully joined.
(Note that rows wihtout joins have a value of
NaN
, so anything with a value >= 0 will clip out any unjoined points)
vic_points = geopandas.sjoin(points, vic_poas, how='left').query('index_right >= 0')
ax = vic_points.plot()
vic_points.head()
Our GeoDataFrame has some extra data that we don't need, so let's trim those columns off using the drop()
method.
vic_points = vic_points.drop(['index_right','index_righ','code'], axis=1)
vic_points.head()
Creating bounding boxes¶
The last step is to create some bounding boxes for the postcodes. We can do this too in one line of code using the envelope
property, then plot it (with some colour customisation this time)
ax = vic_poas.envelope.plot(edgecolor='red', facecolor='none')
However, these bounding boxes are square - I'd rather have them rotated to represent the orientation of underlying polygon better.
For this we can use the shapely property minimum_rotated_rectangle
. This property isn't available (yet?) on a GeoDataFrame or GeoSeries object, but the geometries in our GeoDataFrame are stored as shapely geometries, so we can still access them by going to the geometries directly.
print(type(vic_poas['geometry'][89]))
vic_poas['geometry'][89]
vic_poas['geometry'][89].minimum_rotated_rectangle
To get the minimum_rotated_rectangle
for each postcode area, we can iterate over the rows in the GeoDataFrame by using the iterrows()
method, and using list comprehension, build a GeoSeries with geometries populated from the minimum_rotated_rectangle
property.
box_calcs = geopandas.GeoSeries([ item['geometry'].minimum_rotated_rectangle for idx, item in vic_poas.iterrows() ])
boxes = geopandas.GeoDataFrame(geometry=box_calcs)
ax = boxes.plot(edgecolor='red', facecolor='none')
Finally, let's rejoin the postcode column to our newly generated boxes. We can just slap the column directly onto the end using boxes['postcode'] = vic_poas['POA_NAME']
because the boxes were generated directly from the same data and won't lose their order
boxes['postcode'] = vic_poas['POA_NAME']
boxes.head()
Now we have both our rotated bounding boxes and random points with postcodes, let's plot them together:
final_plot = boxes.plot(edgecolor='red', facecolor='none')
vic_points.plot(ax=final_plot, marker='.', color='green')
And that's it!