Lee Hachadoorian on May 7th 2012
PostGIS and SpatiaLite offer a large number of spatial query operations, including buffers, intersections, and spatial joins. It’s really useful, especially when you’re still experimenting, to visualize your results without having to create a new spatial table, spatial view, or exporting to a shapefile or other GIS data format. Fortunately, there are several ways to do this, including at least three plugins for Quantum GIS: Fast SQL Layer, PostGIS Query Editor, and RT SQL Layer. In this blog post I will walk you through using Fast SQL Layer. I’ll assume you already know how to install a QGIS plugin (but see the link in the next section if you don’t), and that you’re somewhat familiar with SQL. Don’t worry if you’re not—SQL is a short step beyond the kind of querying that GIS users are already familiar with. The examples should be understandable, and maybe it will encourage you try it yourself. If you already are familiar with SQL, this post will give you a quick introduction to the plugin, as well as point out some gotchas that I haven’t seen documented elsewhere.
Installing Fast SQL Layer
Fast SQL Layer is installed like any other plugin. Fast SQL Layer has a number of dependencies which, unfortunately, are poorly documented. On Windows, it worked out of the box. (Note that I am using the full OSGeo4W stack, which probably took care of the dependencies.) On Ubuntu, I had to find the unmet dependencies by trial-and-error:
- Install the plugin.
- Relaunch QGIS, checking the error message for the missing package.
- Install the dependency.
- Repeat steps 2-4 until there are no more error messages.
The ones I found are: pygments, psycopg2, pyspatialite. The first two can be installed with APT, but pyspatialite cannot. It is installed via python, not APT, but to compile it you need to have python-dev, so you might as take care of that as well:
sudo apt-get install python-pygments python-psycopg2 python-dev
Pyspatialite has its own built-in spatialite, but Fast SQL Layer won’t work unless you choose the version that matches what’s installed on your computer! To install, download pyspatialite from the Python Package Index (http://pypi.python.org/pypi/pyspatialite/). Make sure to download the version that matches the version of spatialite you have installed. In my case, I dowloaded pyspatialite 3.0.1. Extract the zipped file. In the console, navigate to the folder you just extracted (e.g. ~/Downloads/pyspatialite-3.0.1), and run
sudo python setyp.py install
Note that while the Python Package Index warns against installing as sudo, this package will fail if it can’t write to /usr/local…
After this, Quantum GIS opened with no errors and the Fast SQL Layer plugin can be opened from the Database menu.
Using Fast SQL Layer
When you open Fast SQL Layer, It will open a floating window, which can also be docked to main QGIS window:
The interface is pretty spare. There are really only a handful of options. Most of the actual work will be done by your SQL statement.
The above screenshot shows a closeup of the Fast SQL Layer window. The first dropdown contains the name of any PostGIS or SpatiaLite connections you have defined in QGIS. Fast SQL Layer does not offer any way to create new database connections, relying upon the ones that you have already set up. If you need to connect to a new PostGIS server, you must first create the connection by clicking the New button in the Add PostGIS Layer dialog:
Even then, the connection won’t appear in Fast SQL Layer until you restart QGIS. How annoying this is will depend on how frequently you need to add new database connections. In my workflow I’m really only working from two different PostGIS servers. On the other hand, I do create new SpatiaLite databases with some regularity.
Next to the Connections dropdown is the Primary Key combo box (showing “gid” in the screenshot). This is the name of the primary key (unique id) in your spatial table. This only needs to be set for PostGIS tables. The combo box allows you to select between the names gid or id, but if your table uses a different field name, you can type in whatever it is. If you misspecify the primary key field when adding a PostGIS layer, the layer will appear in the Layers pane but won’t render, and an error message will pop up every time the map canvas renders (e.g. if you zoom, pan, or add a new layer). If you make this mistake, you will want to remove the offending layer immediately. When adding a SpatiaLite layer, the key field is not used; not only does it not matter whether you make a mistake in the primary key name, it can even be left blank.
Next is the Geometry Column combo box (“the_geom” in the screenshot), which is where you specify the geometry column of your spatial table. It allows you to select between the common choices the_geom and geom, but as with the Primary Key combo box, you can type in the correct column name if it is not among these options. Note that when you create spatial tables in SpatiaLite, geometry columns are created with Geometry as the default name. It would be great if Fast SQL Layer included this among the options, or if it remembered user entries in the Primary Key and Geometry Column combo boxes, but it does not, so you will have to type in Geometry each time you add a SpatiaLite table that uses this column name.
The next button is the Run button. To actually generate your query layer, you need to type in a SQL query. For this example, I am using a PostGIS server and have typed in a basic SELECT * (select all columns, in plain English) of a small dataset, the five boroughs of New York City (originally downloaded in shapefile format from Bytes of the Big Apple).
SELECT * FROM geo_boundaries.boro_nyc_2010_dcp_epsg2263
When you push Run, you get this:
Notice that the Layers pane on the left contains three layers named QueryLayer. That’s because I’ve done this three times. Each new layer is added with the default name QueryLayer. (Normally when adding a PostGIS or SpatiaLite layer in QGIS, the layer name would default to the name of the PostGIS or SpatiaLite table). If you intend to add several QueryLayers, I suggest renaming each one in the Layers pane (right-click and choose Rename, or double-click to open the Layer Properties dialog) immediately after creating it.
If you make a mistake in your query, such as a typo in the table name, you will get an error message and the layer won’t get added.
The Get Layer button (next) will return the connection string and SQL query that created the active layer (the layer selected in the Layers pane):
You can get the same information by hovering your mouse over the layer name in the Layers pane, or looking on the Metadata tab of the Layer Properties dialog. Note that although the connection string appears in the Fast SQL Layer query window, if you hit the Run button you will get an error message! The query window appears to only accept SQL statements, not full connection strings.
The next dropdown box has the options add layer / replace layer. I have some guesses as to what this should do, but in practice I don’t see any difference. In either case, when you hit Run, a new layer gets added to your map.
So far all we have done is add a data layer, no different from adding it with the normal Add [PostGIS|SpatiaLite] Layer… dialog. The real usefulness of Fast SQL Layer comes with being able to visualize your ad hoc queries. For a simple first example, I’m going to return the centroids of the five boroughs:
SELECT gid, boro_code, boro_name, ST_Centroid(the_geom) as the_geom FROM geo_boundaries.boro_nyc_2010_dcp_epsg2263
Note some things about the above spatial query. You must include a unique key in your select list, and you must return a geometry column, and the column names in the resultset must match what have in the Primary Key and Geometry Column combo boxes. In fact, you could alias these columns with arbitrary names as long as you use the same names in the combo boxes.
When it gets really interesting is when you spatially join your layers.
SELECT z.gid, name, subboro_name, ST_Buffer(z.the_geom, 100) FROM geo_boundaries.boro_nyc_2010_dcp_epsg2263 b JOIN geo_boundaries.zip_nyc_2007_cur_epsg26918 z ON ST_Intersects(ST_Centroid(b.the_geom), ST_Transform(z.the_geom, 2263))
There’s a few things going on here. First, I’m using ST_Centroid to return each borough’s centroid (as in the last example). Then I’m using ST_Intersects to find the ZIP code at each borough’s center. In order for ST_Intersects to work, the two geometries must be in the same spatial reference system, so ST_Transform is being used to transform the ZIP code layer from UTM 18N (EPSG 26918) to State Plane New York Long Island (EPSG 2263). Fortunately, I don’t have to look up the SRID of the boroughs layer because my table naming convention includes the SRID. Finally, I’m adding a 100 meter buffer to the each ZIP code that satisfies the spatial join. ST_Buffer takes a geometry and a buffer radius. The units are determined by the spatial reference system, and in this case I know that UTM 18N uses meters, so ST_Buffer(z.the_geom, 100) means give me a 100 meter buffer around each ZIP code polygon. The result looks like this:
Fast SQL Layer does need some work. A couple of queries that I attempted took a long time to run, and it would be great if there were a way to cancel the query if it was taking too long (particularly since sometimes when it takes too long you look at the query text and realize it’s taking too long because you made a mistake). As it is, a long query just hangs QGIS until it completes. It would also be great if the Get Layer button returned just the SQL query, so that it could be edited without having to first strip out the connection string parameters, and if the “replace layer” option actually replaced the selected layer with the new query.
Finally, the plugin really needs some kind of cache, either locally or in a PostGIS temp table. When the map canvas renders, the query gets resubmitted to the server. If your spatial query takes any length of time to execute (a not uncommon occurrence), you spend a lot of time waiting for minor visual changes. I know that the alternative plugin PostGIS Query Editor caches a shapefile locally, so I might try that one next, but on the whole, Fast SQL Layer’s no-frills approach is at least straightforward and easy to use.
Postscript: In writing this blog post I found someone who has taken Fast SQL Layer and turned it into a plugin for pgAdmin3. I haven’t tested it yet, but I do a lot of querying in pgAdmin, so this seems like a useful tool as well.
One Response to “QGIS Fast SQL Layer”