UNDERSTANDING TECHNOLOGY
Spatial Databases: Not Just for Data Anymore
Chris Andrews
Imagine a business process that compares an area of
interest (AOI) throughout the day against thousands of
project boundaries to alert a user when new projects fall
within the AOI.
One solution for this problem stores spatial data
in a function-poor repository, possibly a database or flat
files. This solution uses an application to extract data
from the repository and to perform spatial intersections.
An alternative solution proposes storage of the spatial
data in a database and uses database functionality to
perform the spatial operations. Both solutions use an
application layer to notify the user by email that a new
project of interest has been found in the AOI.
Many GIS users underestimate the utility of a
spatially-enabled database and would not necessarily see
the difference between the proposed solutions above. The
first solution requires that the application layer perform
many “expensive” data retrievals from the database.
These data exchanges across the network probably take
longer than the individual spatial operations and may vary
over time with the hardware load and performance. The
second solution removes the data exchange required for
these searches and allows spatial processing tools to
directly access the primary data source. In some cases, a
spatially-enabled database is the better-performing
alternative for spatial processing, an option that is not
well advertised in the geotechnology industry.
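
The contrast between the two solutions can be sketched in a few lines. The example below is only an illustration under loud assumptions: it uses Python's built-in sqlite3 module rather than a spatially-enabled database, the `projects` table and its columns are hypothetical, and each project boundary is simplified to a bounding box. A real spatial extension would test true geometries (PostGIS, for instance, offers `ST_Intersects`), but the shape of the trade-off is the same: the first function pulls every row across the connection, while the second lets the database return only the matches.

```python
import sqlite3

# Hypothetical schema: each project boundary is reduced to a bounding box.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT,"
    " xmin REAL, ymin REAL, xmax REAL, ymax REAL)"
)
conn.executemany(
    "INSERT INTO projects (name, xmin, ymin, xmax, ymax) VALUES (?, ?, ?, ?, ?)",
    [
        ("bridge repair", 0.0, 0.0, 2.0, 2.0),
        ("pipeline", 5.0, 5.0, 9.0, 6.0),
        ("substation", 1.5, 1.5, 3.0, 3.0),
    ],
)

AOI = (1.0, 1.0, 4.0, 4.0)  # xmin, ymin, xmax, ymax


def boxes_intersect(a, b):
    """True when two (xmin, ymin, xmax, ymax) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def projects_in_aoi_app_side(conn, aoi):
    """First solution: fetch every row, then test it in the application."""
    rows = conn.execute(
        "SELECT name, xmin, ymin, xmax, ymax FROM projects"
    ).fetchall()
    return sorted(r[0] for r in rows if boxes_intersect(r[1:], aoi))


def projects_in_aoi_db_side(conn, aoi):
    """Second solution: the database evaluates the test and returns only hits."""
    rows = conn.execute(
        "SELECT name FROM projects"
        " WHERE xmin <= ? AND xmax >= ? AND ymin <= ? AND ymax >= ?"
        " ORDER BY name",
        (aoi[2], aoi[0], aoi[3], aoi[1]),
    ).fetchall()
    return [r[0] for r in rows]


print(projects_in_aoi_app_side(conn, AOI))
print(projects_in_aoi_db_side(conn, AOI))
```

Both functions return the same project names. With three rows the difference is invisible; with thousands of project boundaries checked throughout the day, the first function's full-table transfer is the "expensive" retrieval described above.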
For years, high-powered databases from companies
such as Oracle, Sybase, and IBM have provided the
underlying automatic data processing and analysis
capability for business applications that have remained
outside the realm of the GIS professional. The vendors
developed common tools, such as triggers (database
“rules” which kickstart action when specific events
occur) and stored procedures (operations stored on the
server, available to many users), to keep high-volume data
operations in the database, rather than passing data back
and forth from the database to application layers. In the
mid-1990s, database vendors saw a potential new market and
created spatial extensions for their products. These
extensions enabled customers to store spatial information
in a database and allowed access to spatial data through
the standardized Structured Query Language (SQL).
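
As a rough sketch of how a trigger keeps this kind of work inside the database, the example below applies one to the AOI alert scenario from the opening of this article. Everything named here is an assumption made for illustration: the `aoi`, `projects`, and `alerts` tables are hypothetical, boundaries are simplified to bounding boxes, and SQLite (through Python's sqlite3 module) stands in for a commercial database. SQLite has no stored procedures, so only the trigger half of the toolset is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE aoi (id INTEGER PRIMARY KEY, owner TEXT,
                      xmin REAL, ymin REAL, xmax REAL, ymax REAL);
    CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT,
                           xmin REAL, ymin REAL, xmax REAL, ymax REAL);
    CREATE TABLE alerts (project_id INTEGER, aoi_id INTEGER);

    -- The trigger fires on each new project and records one alert per
    -- overlapping area of interest, without leaving the database.
    CREATE TRIGGER project_alert AFTER INSERT ON projects
    BEGIN
        INSERT INTO alerts (project_id, aoi_id)
        SELECT NEW.id, a.id FROM aoi AS a
        WHERE NEW.xmin <= a.xmax AND NEW.xmax >= a.xmin
          AND NEW.ymin <= a.ymax AND NEW.ymax >= a.ymin;
    END;
    """
)

# One hypothetical area of interest and two incoming projects.
conn.execute(
    "INSERT INTO aoi (owner, xmin, ymin, xmax, ymax)"
    " VALUES ('analyst', 1, 1, 4, 4)"
)
conn.executemany(
    "INSERT INTO projects (name, xmin, ymin, xmax, ymax) VALUES (?, ?, ?, ?, ?)",
    [("pipeline", 5, 5, 9, 6), ("substation", 1.5, 1.5, 3, 3)],
)

# The application layer only reads the finished alert list.
pending = conn.execute(
    "SELECT p.name, a.owner FROM alerts AS al"
    " JOIN projects AS p ON p.id = al.project_id"
    " JOIN aoi AS a ON a.id = al.aoi_id"
    " ORDER BY p.name"
).fetchall()
print(pending)
```

The application never sees the project that falls outside the AOI. Its remaining job is the one both solutions share: turning each pending row into an email.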
Several fundamental spatial manipulation and
analysis problems, versioning and network modeling among
them, adapt perfectly to the database and to SQL.
Versioning can be done by duplicating large time
slices of whole data sets or by tracking atomic changes of
spatial features. Common database tools can be adapted to
map edges and nodes in linear networks through a
relational model. The database industry solved important
aspects of both of these seemingly complex problems years
ago.
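
The network half of that claim can be illustrated with a small relational model. The sketch below is an assumption-laden example, not a description of any vendor's network tooling: the `nodes` and `edges` tables and their contents are invented, and SQLite's recursive common table expressions (via Python's sqlite3 module) stand in for whatever hierarchical query support a given database provides. It finds everything downstream of a starting node in a directed network.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE edges (from_node INTEGER REFERENCES nodes(id),
                        to_node INTEGER REFERENCES nodes(id));
    """
)
# Hypothetical water network: plant -> valve A -> valve B -> hydrant,
# plus one tank that is not connected to anything.
conn.executemany(
    "INSERT INTO nodes (id, label) VALUES (?, ?)",
    [(1, "plant"), (2, "valve A"), (3, "valve B"), (4, "hydrant"),
     (5, "isolated tank")],
)
conn.executemany(
    "INSERT INTO edges (from_node, to_node) VALUES (?, ?)",
    [(1, 2), (2, 3), (3, 4)],
)


def downstream(conn, start_id):
    """Labels of all nodes reachable from start_id by following edges."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(id) AS (
            SELECT ?
            UNION  -- UNION drops duplicates, so cycles cannot loop forever
            SELECT e.to_node
            FROM edges AS e JOIN reach AS r ON e.from_node = r.id
        )
        SELECT n.label
        FROM reach JOIN nodes AS n ON n.id = reach.id
        WHERE n.id <> ?
        ORDER BY n.id
        """,
        (start_id, start_id),
    ).fetchall()
    return [r[0] for r in rows]


print(downstream(conn, 1))
print(downstream(conn, 3))
```

The traversal runs entirely in the database, so the application receives only the reachable nodes rather than the whole edge table.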
The last decade saw the introduction of many
different spatial database storage options. The best of
these tools evolved with the information technology
industry and allow business application users to quickly
and easily incorporate spatial data and statistics as part
of their workflow. The worst act as cumbersome spatial
data gateways that require proprietary customization tools
for spatial analysis and data access. ESRI holds the
largest perceived market share in the spatial storage
realm with its ArcSDE middle-tier application. Oracle
places a close second with its powerful database that
incorporates spatial data types directly into the database
model. The PostGIS extension to the well-known open-source
PostgreSQL database offers a free, community-supported
alternative to commercial products. I recently heard of a
PostGIS implementation that manages hundreds of megabytes
of spatial data with sub-second response time and has run
flawlessly on a Windows machine for a year.
The migration of advanced geoprocessing
capabilities into the database is inevitable. Currently,
the geospatial software industry suffers because of the
divide between GIS vendors who have the domain knowledge
and database vendors who have the horsepower. I believe
that the majority of business applications that use
spatial processing consider spatial data to be critical,
but not necessarily central to the business function of
the system. For these businesses and organizations, the
ability to access a central data processing engine that
allows agnostic manipulation of spatial and non-spatial
data may eventually force them to choose horsepower over
familiar GIS brands.
Developing technologies such as grid processing,
object databases, and advanced tools for mapping objects
into relational databases provide new opportunities for
complex spatial processing and integration at the database
level. The capability to connect and analyze complex
utility networks at the database level exists now. Modern
object database tools and object-relational mapping tools
store entire object graphs in a database and could allow
applications to use the database as a single point of
entry into whole spatial object hierarchies without
expensive search and conversion time. The building blocks
for solutions such as these exist today and simply beg
talented, foresighted explorers to expand upon them.
About the Author
Chris Andrews has been an advocate for
standardizing GIS technology for the past eight years,
programming and listening to customers in a variety of
environments from private industry to the Kennedy Space
Center. Andrews is currently employed as a GIS Solution
Architect at Idea Integration in Denver, CO, and may be
contacted at [email protected].