Loading Landform Panorama Data into Postgres/PostGIS.

After a break enforced by my relocation from Singapore back to the UK I am now up and running again, so hopefully blog posts should become more regular!

I have taken a slight diversion over the last week or so inspired by Nick Whitelegg’s attempts to create a Landranger style map based on the Ordnance Survey Vector Map District and Landform Panorama datasets.  Nick posted his methodology to the OSM talk-gb mailing list and on the wiki.  I had previously looked at importing the Landform Panorama data into a Postgres/PostGIS database and tried to use ogr2ogr for this task. However I discovered that support for the DXF file format, which is used by Landform Panorama, did not seem to allow me to extract what I needed.  Nick’s approach was to write his own parser which read the DXF files, extracted the features and data he wanted and then loaded them into the database.

I felt that there were a couple of features missing from Nick’s code, so I worked for a few days to write a more complete parser program, modelled along the lines of the shp2pgsql program I used in a previous blog post.  My new program is imaginatively called lfp2pgsql and has the following features:

  • Builds using the standard autoconf/automake system.
  • Does not connect to the database directly, rather it writes the SQL statements to standard out.
  • Has the same command line arguments as shp2pgsql allowing the separation of creating a table in the database from the population of the data.
  • Stores the feature type as well as the height and geometry of the feature.
  • Stores the Contour, Lake, Breakline, Coastline, Ridgeline, and Formline feature types.
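
Both Nick’s parser and lfp2pgsql rely on the fact that an ASCII DXF file is a strict sequence of group-code/value line pairs: odd lines hold a numeric group code, even lines hold the value, and group code 0 introduces each new entity.  As a rough illustration of that structure (this is not lfp2pgsql itself, just a sketch, and the sample file is a made-up stand-in for a real tile such as nn00.dxf), a few lines of awk can inventory the entity types in a file:

```shell
# Build a throwaway sample in the DXF group-code/value pair layout.
printf '0\nSECTION\n0\nPOLYLINE\n0\nVERTEX\n0\nVERTEX\n0\nEOF\n' > sample.dxf

# Odd lines are group codes, even lines are values; count the values
# that follow group code 0, i.e. the entity types.
awk 'NR % 2 == 1 { code = $1 + 0; next }
     code == 0   { count[$1]++ }
     END         { for (e in count) print e, count[e] }' sample.dxf
```

Pointed at a real Landform Panorama tile this gives a quick feel for how many POLYLINE and VERTEX entities a parser has to deal with.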

The source code is available from GitHub: https://github.com/keithsharp/lfp2pgsql.  You can download the source using git or by clicking on the download button.  To build the program, unpack the source (if you downloaded a tarball rather than using git) and then do the following:

autoreconf --force --install
./configure
make

You should now have a binary in the src directory that you can use to parse the Landform Panorama DXF files.  To load the data into a Postgres/PostGIS database you would do something like:

lfp2pgsql -p -I nn00.dxf contours | psql -d opendata

which creates the contours table and an index on the geometry column.  You can then use a shell for loop to load the data from all of the files:

for FILE in *.dxf; do
    lfp2pgsql -a ${FILE} contours | psql -d opendata
done

Full details can be found in the README file included in the source.

As an example of what is possible I loaded all of the data for the NN grid area into a database and then used QGIS to render the data:

QGIS Showing Landform Panorama Data

I loaded the data from Postgres using the same technique as I described in a previous post.

Feedback or bug reports are welcome, either use the comments below or send me an email.

Visualising data with Quantum GIS.

The first thing that most people want to do with GIS data is to create a visual representation, better known as a map.  I’m no different, so this blog entry will cover how I hooked up the Quantum GIS application to the PostGIS database I created previously.

I started by examining the data that I had loaded into the database.  This ability to run fairly standard SQL queries against my spatial data is one of the main strengths of using a spatial database.  Having read the Strategi documentation I knew that there were two columns in each table that I was interested in: code, a numeric feature type identifier, and legend, an English language description of the feature type.  For example, I ran the following SQL against the strategi_region table:

opendata=# SELECT DISTINCT code, legend FROM strategi_region ;
 code |                legend                 
------+---------------------------------------
 5122 | Foreshore - other MLW exposed polygon
 5250 | Lake / other inland water polygon
 5120 | Foreshore - sand polygon
 5610 | Wood / Forest polygon
 5422 | Small Urban Area polygon
 5420 | Large Urban Area polygon
(6 rows)

This simple SQL statement selects the code and legend columns from the strategi_region table, and the DISTINCT keyword limits the query to returning only a single row for each match.  The output shows that there are 6 different types of feature stored in the table.  I ran the same query on the strategi_line and strategi_point tables returning 73 and 89 features respectively.

Once I knew what the data in my database represented I could turn my attention to creating a map.  As a first step I decided to use the Quantum GIS (QGIS) desktop GIS application.  To install QGIS I used yum from the command line:

sudo yum install qgis

With QGIS installed I launched the program from Applications Menu -> Graphics -> Quantum GIS.  When QGIS launched I was presented with a fairly typical application layout: menus across the top, a toolbar, and then a couple of blank areas where the data would be represented.  For more details on the QGIS interface and capabilities have a look at the manual (PDF).

There are a number of ways to add a new PostGIS layer to QGIS (see the manual for details).  I used the menu option: Layer -> Add PostGIS Layer.  This launches a new window:

Add PostGIS Layer

The first thing I did was to configure a new database connection by clicking on the New button and filling in the details in the resulting window:

New database connection

The fields should be pretty self-explanatory; the username and password are those I entered when I created my database user previously.  I unselected the “Allow geometryless tables” option, as this reduces the clutter on later screens by only showing tables with geometry columns.  Once I had filled in the details I clicked the Test Connect button, which confirmed my setup was working, then I clicked OK.

I was then returned to the Add PostGIS Table(s) window, where the Connect button was now enabled.  Clicking on this button showed me the tables in my database that had geometry:

Connected to the database

Selecting the strategi_line table allows me to click on the Build Query button which launches the query builder window:

The query builder window

From the screenshot you can see that I’ve started to build an SQL query to select a subset of the data in the table.  In this example I’m selecting on code = 5110, which is the code for Coastline (Natural).  Clicking on the Test button pops up a window telling me how many rows are returned by the query, and I then click on OK.  Back in the Add PostGIS Table(s) window, clicking on Add finally adds my new layer to QGIS and gives me a map with an outline of the UK:

Showing the coastline of the UK

I renamed the layer by right clicking on the layer name and selecting Rename from the context menu; I also changed the line colour by right clicking on the layer name and selecting Properties.  I then went ahead and added a number of other layers from PostGIS by repeating the previous steps but selecting different feature codes.  An excerpt from the resulting map:

Map excerpt showing Loch Lomond, coastline, rivers, canals, and woodland

When adding the additional layers I combined feature codes using the boolean operators in SQL.  For example, there are three sequential codes that cover the polylines that represent canals: 5240, 5241, and 5242, so in the query builder my SQL looked like:

"code" >= '5240' AND "code" <= '5242'

I like the simplicity of creating a map with QGIS, but I feel the styling of the map is cumbersome and limited.  Next I’m going to try to create a more complex map using Mapnik to render the image.

Creating and populating a database.

Once I had installed PostGIS it was relatively simple to create a spatial database instance.  For convenience, the first thing I did was create a database user with the same name as my Linux username.  This means that the default Postgres authentication system will just work transparently:

sudo -u postgres /usr/bin/createuser -s -e -P kms

If you run this command as is you’ll be prompted to enter a password for the newly created user.  This is not strictly necessary for what I’ll be doing today, but it will be needed when I install and configure tools to use the database.  You should also note that this command grants the user superuser privileges across the entire Postgres installation, so you probably shouldn’t do this on a production system.

The next four commands create and then configure the database:

createdb -e opendata
createlang -e plpgsql opendata
psql -d opendata -f /usr/share/pgsql/contrib/postgis-1.5/postgis.sql
psql -d opendata -f /usr/share/pgsql/contrib/postgis-1.5/spatial_ref_sys.sql

The first command creates the database, which I’ve chosen to call “opendata”.  The next command configures the new database to support the PL/pgSQL programming language, which is used by the functions that make up PostGIS.  The third command loads the PostGIS functions.  Finally, the fourth command creates a new table in the database called spatial_ref_sys, which holds information about different spatial reference systems and allows us to use database functions to transform between these systems.  Note that I’ve given the first two commands the -e flag; this forces the commands to echo the SQL statements to the console so you can see what’s happening.

To confirm that the database has been created I connected using the psql command and listed the tables:

$ psql -d opendata
psql (8.4.6)
Type "help" for help.

opendata=# \d
 List of relations
 Schema |       Name        | Type  | Owner
--------+-------------------+-------+-------
 public | geography_columns | view  | kms
 public | geometry_columns  | table | kms
 public | spatial_ref_sys   | table | kms
(3 rows)

opendata=#

The OS OpenData Strategi shapefile dataset that I previously downloaded arrives as a Zip file.  Unzipping this gives me a top level folder called Strategi Shape.  Under this are three directories: data, doc, and gazetteer.  I’m going to ignore the gazetteer data for the time being, and the doc directory contains README files and licensing information.

The data directory contains two sub-directories: GB_NORTH and GB_SOUTH.  These directories contain the actual shapefiles.  The shapefiles contain three different types of geometric data: LINES, POLYGONS, and POINTS.  The first step is to create a table for each data type:

cd /path/to/Strategi\ Shape/data/GB_NORTH/
shp2pgsql -p -I -s 27700 admin_polyline strategi_line | psql -d opendata
shp2pgsql -p -I -s 27700 admin_font_point strategi_point | psql -d opendata
shp2pgsql -p -I -s 27700 foreshor_region strategi_region | psql -d opendata

Replace “/path/to/” with the path to your unzipped data.  The -p flag to shp2pgsql triggers prepare mode; in this mode shp2pgsql only creates the tables and does not populate them.  The -I flag creates an index on the geometry column in the table.  You could do this as a separate step, but it’s convenient to do it here.  Finally, -s 27700 tells PostGIS the spatial reference system of the data; 27700 is the EPSG code for OSGB36, the British National Grid.  I’m planning a further article to go into the whole datum thing in more detail.

Notice that the output of shp2pgsql is piped into the psql command.  This is because the output of shp2pgsql is a sequence of SQL commands that create and/or populate the database.  If you chop off the pipe and the psql command you’ll see the SQL echoed to your screen.

At this point I can connect to the database again using psql and examine the tables that have been created:

opendata=# \d
 List of relations
 Schema |          Name           |   Type   | Owner
--------+-------------------------+----------+-------
 public | geography_columns       | view     | kms
 public | geometry_columns        | table    | kms
 public | spatial_ref_sys         | table    | kms
 public | strategi_line           | table    | kms
 public | strategi_line_gid_seq   | sequence | kms
 public | strategi_point          | table    | kms
 public | strategi_point_gid_seq  | sequence | kms
 public | strategi_region         | table    | kms
 public | strategi_region_gid_seq | sequence | kms
(9 rows)

This is where I ran into a problem.  While trying to populate the point table I kept getting constraint errors.  After battering at the problem for a couple of hours I turned to the GIS Stack Exchange website for help.  This is an excellent resource for those wanting to ask questions (or provide answers) on issues about GIS.  After a little bit of investigation and discussion, user amercader came up with the answer: there was a mismatch between the geometry type that shp2pgsql was using to create the table (MULTIPOINT) and the geometry of the data that it was trying to load (POINT).

The solution was to drop the table from the database and re-run the shp2pgsql command that creates the table, but direct the output to a file rather than piping it straight to psql:

$ shp2pgsql -p -I -s 27700 admin_font_point strategi_point > create_point.sql
Shapefile type: MultiPoint
Postgis type: MULTIPOINT[2]

I then edited the resulting SQL to change the geometry in the third last line to be POINT rather than MULTIPOINT:

SELECT AddGeometryColumn('','strategi_point','the_geom','27700','POINT',2);
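
The same edit can be scripted with sed rather than made in a text editor; a minimal sketch of the substitution, shown here on the AddGeometryColumn line itself:

```shell
# shp2pgsql declared the geometry column as MULTIPOINT, but the data
# holds plain POINT geometries; swap the type and leave the rest alone.
echo "SELECT AddGeometryColumn('','strategi_point','the_geom','27700','MULTIPOINT',2);" |
    sed "s/'MULTIPOINT'/'POINT'/"
```

Against the generated file that would be something like `sed "s/'MULTIPOINT'/'POINT'/" create_point.sql > create_point_fixed.sql` before loading the result with psql.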

I then loaded this SQL into the database using psql:

psql -d opendata -f create_point.sql

With the tables set up I could then write a short shell script to load each of the shapefiles into the correct table:

#!/bin/sh
# Append each shapefile to the table that matches its geometry type,
# first for GB_NORTH and then for GB_SOUTH.
for DIR in GB_NORTH GB_SOUTH; do
    cd /path/to/Strategi\ Shape/data/${DIR}/
    for F in *polyline*.shp; do
        B=`basename ${F} .shp`
        shp2pgsql -a -s 27700 ${B} strategi_line | psql -d opendata
    done
    for F in *point*.shp; do
        B=`basename ${F} .shp`
        shp2pgsql -a -s 27700 ${B} strategi_point | psql -d opendata
    done
    for F in *region*.shp; do
        B=`basename ${F} .shp`
        shp2pgsql -a -s 27700 ${B} strategi_region | psql -d opendata
    done
done

This could take some time to complete depending on how fast your machine and disks are, but once it’s done the database is set up and the data loaded.  The -a flag tells shp2pgsql to append the data to the existing table, rather than the default behaviour of dropping the existing table and then loading the new data.  To check that the data had been loaded I counted the rows in each table:

opendata=# SELECT COUNT(*) FROM strategi_line;
 count  
--------
 318204
(1 row)

opendata=# SELECT COUNT(*) FROM strategi_point;
 count  
--------
 372071
(1 row)

opendata=# SELECT COUNT(*) FROM strategi_region;
 count 
-------
 42943
(1 row)

Finally, to refresh the statistics used by the query planner, I ran the VACUUM ANALYZE command in psql.

The next step is to actually use the data, hopefully I’ll get to that later in the week.

Obtaining some data.

Now that I’ve got my base Linux installation up and running, and installed the central PostgreSQL database, I need to download some OpenData to process.  As I mentioned in my first post my main interest is in GIS and cartography so I’m going to use the datasets released by the Ordnance Survey under their OpenData program.

The Ordnance Survey (or OS) created the OpenData program to release and manage a number of different datasets under a free licence.  The process was set in motion by the Brown Labour Government, which ran a consultation to determine what the impact of freeing up different OS datasets would be.

The datasets that are currently available are:

  1. MiniScale, a small scale raster dataset.
  2. 1:250000 Scale Colour raster dataset.
  3. OS Street View, a 1:10000 scale street level digital colour raster dataset.
  4. Boundary Lines, a vector dataset of local government administrative boundaries.
  5. Code Point Open, a postcode to location reference in CSV format.
  6. 1:50000 Scale Gazetteer, in colon separated ASCII text format.
  7. Strategi, a small scale vector dataset companion to the 1:250000 raster dataset.
  8. Meridian 2, a mid-scale vector dataset focusing on topography and communications.
  9. OS Locator, a gazetteer of road names in colon separated ASCII text format.
  10. Land-Form PANORAMA, a vector and grid dataset of contours and spot heights.
  11. OS VectorMap District, a larger scale dataset available as both raster and vector.

I’ve decided to start out by looking at the Strategi vector dataset.  To download the data:

  1. Go to the OS OpenData Supply page
  2. Scroll down to the Strategi dataset
  3. Select the format as ESRI Shape
  4. Tick the right hand box for download (unless you want a CD)
  5. Scroll to the bottom of the page and click next
  6. Fill in the form as appropriate and click continue

A few minutes later I received an email in my inbox with a link, valid for three days, to download a Zip file of my data.

Getting the tools.

The basic tool that I am planning on using is the Fedora Linux Distribution.  At the time of writing the current version is Fedora 14.  There are lots of options available for downloading and installing Fedora, a detailed explanation of how to go about installing Linux is beyond the scope of this blog.  If you’re not familiar with Linux then I would suggest starting by downloading and creating a Fedora Live CD, and then moving onto the installation documentation.

Fedora has a GIS special interest group that has already packaged a number of GIS tools.  This means that I can largely just use the yum command to download and install the tools and any dependencies.  There are some tools on the GIS SIG wish list that I may look at later; using these will require more effort, as I’ll have to download the source and then build and install the tools and any dependencies myself.

The central tool to pretty much all of my use of OpenData will be a database.  I’ve chosen to use the PostgreSQL database.  There are two main reasons for this:

  1. PostgreSQL is packaged and available for Fedora.
  2. There is an extension to PostgreSQL called PostGIS which adds spatial support to the database.

To install PostgreSQL and PostGIS I just use yum from the command line:

$ sudo yum install postgresql-server postgis

You should see text scroll past on your screen as yum works out what dependencies are needed for these two packages.  Exactly what these dependencies are will vary depending on which packages you chose during system installation.  Yum will then prompt you to allow it to go ahead and download and install the packages.

Once PostgreSQL is installed I need to start the database system.  Prior to starting PostgreSQL for the first time I need to initialise the database:

$ sudo service postgresql initdb

Once the initialisation has completed I can start the database:

$ sudo service postgresql start

The database should now be up and running.  There’s one last, optional, thing to do.  I know that I’m going to be using PostgreSQL a lot, so I want to configure the system to start PostgreSQL automatically when the system boots:

$ sudo chkconfig postgresql on

And that’s it.  There’s a little more to do to configure PostGIS when I actually create a database instance, but I’ll cover that once I’ve explained how to get the data to populate the database.

Hello and welcome.

Hello and welcome to my new blog.  As it says on the about page, the purpose of this blog is to document my exploration of the growing amount of free and open data that is available through the Internet.  I’ll also be taking a look at free and opensource tools that help with the processing, analysis, and visualisation of the data that’s out there.

One of my main interests at the moment is GIS and cartography, so in the main that’s the area I’ll be focusing on.

Given that this is a hobby that I tinker with in my spare time, updates could be sporadic as my availability and interest ebbs and flows.
