Fixed the TIGER Georeference Loader Script

PostGIS in Action has a chapter on extensions to Postgres and PostGIS that might be helpful.  The first one it covers is a georeference tool for TIGER data.  The tool generates scripts that download an entire state’s worth of data, unzip it, and load the appropriate tables into Postgres.  The book warned that the Linux version of the script was untested.  It was more than untested; it was a disaster.

At the top, the script sets two path variables: one for the directory the data is downloaded to, the other for a temporary directory the data is unzipped into.  The destination path is quoted; the quotes aren’t necessary, but they don’t break anything.  The temporary path had only the opening quote, with no closing quote.  That does break things, because the shell treats everything up to the next quote character as part of the string.
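For reference, here is a minimal sketch of a correctly quoted pair of assignments.  The names BASE_DIR and STAGING_DIR and the /tmp/gisdata paths are assumptions for illustration, not copied from the generated script; TMPDIR matches the variable used in the unzip loops.

```shell
# Hypothetical locations; substitute your own. BASE_DIR and STAGING_DIR are illustrative names.
BASE_DIR="${BASE_DIR:-/tmp/gisdata}"

STAGING_DIR="$BASE_DIR/staging"   # where the zip files are downloaded to
TMPDIR="$BASE_DIR/temp"           # where the zip files are unzipped to

# Both quotes present on each line. Dropping a closing quote, e.g.
#   TMPDIR="$BASE_DIR/temp
# makes the shell keep reading the following lines as part of the string.

mkdir -p "$STAGING_DIR" "$TMPDIR"
```

Running bash -n on the generated script (a parse-only check that executes nothing) will usually flag an unterminated quote before any download starts.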

The TIGER data is organized by state on the Census Bureau’s website.  Each state directory contains zip files with state-level data, plus one subdirectory per county, each containing multiple zip files.  The script had a for loop to unzip the county-level zip files, but it skipped the state-level ones.  And the for loop’s syntax was very, very wrong.  The working version needs two loops:

for z in *.zip; do $UNZIPTOOL -o -d "$TMPDIR" "$z"; done

for z in */*.zip; do $UNZIPTOOL -o -d "$TMPDIR" "$z"; done

The script had only the second statement, and it was broken in three ways: the semicolons were missing, the $z argument was missing (so unzip had no idea what it was supposed to unzip), and the closing done was missing.
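A slightly more defensive version of the same idea can be sketched as a function that handles both levels in one loop and skips a pattern that matches nothing.  The function name unzip_all and its two arguments are my own invention for illustration; only the -o (overwrite) and -d (target directory) unzip flags come from the loops above.

```shell
# Sketch only: unzip_all is a hypothetical helper, not part of the generated script.
UNZIPTOOL="${UNZIPTOOL:-unzip}"

unzip_all() {
    # $1: state directory (state-level zips plus one subdirectory per county)
    # $2: directory to unzip everything into
    for z in "$1"/*.zip "$1"/*/*.zip; do
        # If a pattern matches no files, the shell leaves it as literal text; skip it.
        [ -e "$z" ] || continue
        "$UNZIPTOOL" -o -d "$2" "$z"
    done
}
```

It would be called as something like unzip_all "$STAGING_DIR" "$TMPDIR", with both variables pointing at directories that already exist.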

After the data is unzipped into the temp directory, shp2pgsql and psql are used to load it into Postgres.  The first statement was missing a quote.  The psql statements were missing the flags that tell psql which database to connect to and which user to connect as.  The tables with county-level data needed loops to run through each county’s set of files.  The loops had the wrong syntax again; they almost looked like the syntax for a Windows bat file, but even that wasn’t quite right.
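The shape of a working county-level loop is roughly the following.  This is a sketch under assumptions, not the script's actual code: the function name load_county_files, the database name geocoder, the user postgres, the latin1 encoding, and the file-name pattern are all placeholders I picked for illustration.  The flags themselves are real: shp2pgsql -a appends to an existing table, -s sets the SRID (4269 is NAD83, which TIGER uses), -W sets the input encoding, and psql -d and -U select the database and user.

```shell
# Sketch only: names and defaults below are assumptions, adjust to your setup.
SHP2PGSQL="${SHP2PGSQL:-shp2pgsql}"
PSQL="${PSQL:-psql}"
DB_NAME="${DB_NAME:-geocoder}"    # hypothetical database name
DB_USER="${DB_USER:-postgres}"    # hypothetical database user

load_county_files() {
    # $1: directory of unzipped files
    # $2: file-name suffix shared by every county, e.g. edges.shp
    # $3: existing table to append to
    for shp in "$1"/*_"$2"; do
        [ -e "$shp" ] || continue   # pattern matched nothing
        # shp2pgsql writes SQL to stdout; psql reads it on stdin and runs it.
        "$SHP2PGSQL" -a -s 4269 -W latin1 "$shp" "$3" |
            "$PSQL" -d "$DB_NAME" -U "$DB_USER"
    done
}
```

A call would look something like load_county_files "$TMPDIR" edges.shp edges, once per county-level table, assuming the target table has already been created.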

The book’s authors aren’t to blame for these errors.  The code was someone else’s, and they clearly state that the Linux portion of the code has not been tested or looked at.  I just wanted to document this so that next time I use it I know what I did to get it working.

 

  1. Phil,

    Sorry for the lateness of this. In case you don’t know, the TIGER 2010 packaged in PostGIS 2.0 has a fixed Linux script and numerous other fixes, one of them being an upgrade to work with 2010 TIGER data. It is packaged with PostGIS 2.0, but lots of people are using it with 1.5 (including us), and it works fine with that.

    More details here:
    http://www.postgis.org/documentation/manual-svn/Extras.html#Tiger_Geocoder

  2. Thanks!

    I really enjoyed your book, it has been very helpful, and of course thanks for all your work on PostGIS itself!
