Installing Postgis 2.1.0SVN on Arch

I wanted to try the new Tiger geocoder extension that’s coming with PostGIS 2.1.0.  I set up a PKGBUILD to upgrade the 2.0.2 that’s in Arch’s repository.

Since this is a major version upgrade a hard upgrade is required.  The upgrade went smoothly except the spatial_ref_sys table wasn’t populated.  Running the spatial_ref_sys.sql script found at /usr/share/postgresql/contrib/postgis-2.1 solved that problem.

Changes to building QGIS 1.9 on Arch

Recently Arch added QT5 to their repositories and renamed the old QT package to QT4.  This causes a little confusion for the qgis-git PKGBUILD since it lists the old qt package as a dependency.  It’s an easy fix, just change qt to qt4 and add


to the ./configure options.

Once 1.9 is compiled and built you can see how far along this beta version has come.  The new loader built into the DB Manager plugin works flawlessly.  This is my favorite new feature in 1.9 / 2.0.  It makes it so if you can add a dataset to a QGIS map, it only takes a few clicks to import that data into a new PostGIS layer.  It really comes in handy with layers created by ad-hoc queries.

Plugin developers have been hard at work adjusting their plugins to work with the new API.  OpenLayers has been updated, just upgrade through the Fetch Python Plugins menu and you can once again display your Google, Bing, etc maps in QGIS.


Half off PostGIS In Action 2nd Edition!

James Fee found a 50% off coupon code for PostGIS In Action’s 2nd Edition.  Not only does the coupon code get you 50% off the 2nd edition, you also get the 1st Edition, and a PDF download of a preview of the 2nd edition as it is in progress.   Currently they’re up to chapter 5, and so far it is awesome.

The book really pushes OpenJump due to its ability to run and display ad-hoc PostGIS queries.  They do a run down of various OSGeo desktop programs and in their QGIS writeup they lament the fact that QGIS doesn’t have out of the box ad-hoc capabilities like OpenJump.  I’m not sure about that though, the DB Manager function added in 1.8 seems to be pretty complete.  You can send your query results to the map canvas, DB Manager has syntax highlighting and autocomplete, it’s awesome.  The 1.9 version is even better since it adds import / export functions.  You can use the import function to import shapefiles or any other format OGR supports.  It also allows you to import your ad-hoc results into your PostGIS database.  You can take your tabular results and copy / paste them to a text file or spreadsheet as tab delimited data.

The one limitation to DB Manager / QGIS is your ad-hoc queries have to have a unique integer field if you want to see the resulting layer.  I usually get around that by using Postgres’s row_number() window function.

Installing PGRouting on Arch Linux

PGrouting adds routing functionality to a PostGIS database.  For Arch users you can install it from AUR.  If you want PGRouting’s Travelling Sales Person (TSP) functionality installed you’ll need to install Gaul (available from AUR from gaul-devel) and if you want Driving Distance (from AUR as cgal).  After you install PGRouting run

sudo ldconfig

so you’re system knows where to find the new PGRouting libraries.

Next up is to add PGRouting to your PostGIS database.  PGRouting doesn’t have an extension yet so you’ll have to run some sql files.  I like to install PGRouting into its own schema so I start psql and use:

CREATE SCHEMA pgrouting;
SET search_path pgrouting, postgis;
\i /usr/share/postlbs/routing_core.sql
\i /usr/share/postlbs/routing_core_wrappers.sql
\i /usr/share/postlbs/routing_tsp.sql
\i /usr/share/postlbs/routing_tsp_wrappers.sql
\i /usr/share/postlbs/routing_dd.sql
\i /usr/share/postlbs/routing_dd_wrappers.sql

You may get some errors like

ERROR:  language "C" does not exist

If you do, just open the file you tried to run and find the line referenced.  Change the “C” to “c” and then re-run the .sql file.

Now all that is left to do is add the pgrouting schema to your search path.  Personally I use an ALTER DATABASE statement, but other people may prefer to make the change on a per user basis.

QGIS 1.9 breaks some plugins (for now)

As part of the beta process QGIS 1.9 has started to implement the API changes that will lead to QGIS 2.0.  These changes to the low level guts of the program have broken a few plugins.  Personally I’ve come across two:  OpenLayers and Points2One.

OpenLayers allows you to use outside map sources such as Google Maps and Bing Maps as backdrops to your maps.  Currently it crashes when the plugin tries to load.  Points2One takes point layers and makes lines or polygons from them.  It loads, but crashes when you try to create the new data set.

Other plugins work just fine, for example, Statist is currently running perfectly.  When the API changes are locked in the plugins that aren’t working will be able to be re-written to work with the new version.  Until then, if you have a plugin you rely avoid the beta builds and stay with QGIS 1.8.

EDIT:  OpenLayers has been updated to work with the new API.  Just upgrade the plugin through Fetch Python Plugins menu and it will work just fine.

Postgis 2.0.2 hits Arch repository

Arch has posted the PostGIS 2.0.2 release that came out December 3rd to the Community repository. After Pacman updates the package Postgres needs to be made aware of the upgrade.  Use:

ALTER EXTENSION postgis_topology TO "2.0.2";

Chernoff Faces of Bike Rides

Chernoff faces of 4 bike rides

Chernoff faces from 4 bike rides

I plugged some of the fit file data that I stored in a Postgres database into R to generate the image above.  Two R libraries, RpostgreSQL to connect to Postgres, and aplpack, to generate the faces were used.  Once the libraries were loaded the following R command pulls the data out of Postgres:

rs <- dbSendQuery(con, "
SELECT ride_date
, ST_Length(ST_Transform(ST_Makeline(the_geom ORDER BY ride_time),26916)) As length
, date_part('hour', max(ride_time))-date_part('hour', min(ride_time)) As ridetime
, avg(temperature) As temp, avg(speed) As avg_speed
, avg(altitude) As alt, max(altitude)-min(altitude) As alt_diff
, avg(cadence) As rpm 
FROM ride GROUP BY ride_date ORDER BY ride_date

Then fetch the data from the record set with:

rides<-fetch(rs, n=-1)

And finally plot the faces with:

faces(rides[,2:8], face.type=0, labels=rides$ride_date)

The faces function draws the faces based on the order of the variables.  The features are modified in this order:

height of face 
width of face 
structure of face 
height of mouth 
width of mouth 
height of eyes 
width of eyes 
height of hair 
width of hair 
style of hair 
height of nose 
width of nose 
width of ear 
height of ear

If you don’t have enough variables to fill this list it will wrap around and start again from the top.  For more configuration options use ?faces after you load the aplpack.

There are two small problems with the data, and they both caused by stopping the timer on your bike computer, then restarting at a later time and place.   The time ridden is calculated here by simply subtracting the minimum time from the maximum.  If you rode for 3 hours and took an hour break the query would return a time of 4 hours.  Similarly, if you ride for a mile, hop in a truck and drive 10, then ride for another mile you’d get a result of 12 miles.  There’s a way to work around this, I just haven’t figured it out yet.  I suppose the best solution is to harden up and not take breaks.

The data is stored as points in the database, to get the length ST_Makeline “connects the dots” from point to point.  An awesome feature of Postgres 9.0+ is the Order By in the middle an aggregate function.  It helped in this case because the first ride graphed doubled up and crossed over itself several times.  This lead to the query planner making some interesting decisions on where the line should go.  Forcing the process to follow the points in order the line followed my route perfectly.

Use SQLite Virtual Tables to Connect to Shapefiles

Your data is in a shapefile and you want to run some spatial queries but don’t want to load your data into PostGIS or some other spatial database, what do you do?  You can use SQLite’s virtual tables to create a quick link to your data, and here’s how.

  • First, from the directory your shapefile is in start sqlite3.
  • Next, use SELECT load_extension(‘’); to enable the spatialite extension.
  • Load your shapefile as a virtual table with:
USING VirtualShape(<shapefile_name>, 'UTF-8', );

Now you can run any read-only spatial query you want against your new table.  Note, the ‘UTF-8’ value will need to be changed if your shapefile isn’t using Unicode encoding.

Ranking the Indiana Breweries by Population

In my last post I used PostGIS to calculate the number of Hoosiers that live within 5 miles of an Indiana microbrewery.  Now I want to rank each brewery by population.  The total is going to be a little more than the 2.5 million-ish I came up in the last post due to the lucky duckies who live with 5 miles of more than one brewery.

Flat 12 of Indianapolis, with over 243,000 people within 5 miles of their location is our winner.  No big surprise, but the top 11, and 12 of the top 13 are all Indy breweries.

A few breweries, most notably Three Floyds in Munster are lower than they should be because I only used Indiana census data.  Three Floyds is only a half mile from the Indiana / Illinois border.  If I included Illinois data the nearby Chicago population would have pushed them up the list.

The full list and the SQL used to create it are below the jump.

Continue reading

How many Hoosiers live with 5 miles of an Indiana brewery?

2,539,798.  Approximately.  I think.  Maybe.  Here’s what I did to get there.

First, the Census department has put together some nice pre-digested block layers, they include the total population for the blocks so you don’t have to compile that data yourself.

Second, create a 5 mile (acutally I used 8046.72 meters because my data in a UTM projection) buffer around my brewery points.  Shp2pgsql, use it to shove the buffers into a PostGIS database and call the layer buffer_5mile.  Fun!

Third, load the block data into QGIS and add a numeric field (I used totarea) to store the area of the blocks.  Use shp2pgsql to reproject the census data to match your buffers and load the results into PostGIS.  I called my layer block_pop.

Fourth, update that new area field with

UPDATE block_pop SET totarea = st_area(geom);

Fifth, to get an estimate of the population within the buffers use:

round(sum(c.pop10*(ST_Area(ST_Intersection(c.geom, b.the_geom))) / c.totarea)) 
(SELECT ST_Union(geom) as the_geom FROM buffer_5mile) as b, block_pop c
WHERE ST_Intersects(the_geom, c.geom);

Bam.  You done.