Managing and Accessing Large Imagery Datasets: An Introduction
Richard Orchard
Imagery is big. Really, really big. In a typical
GIS implementation, 1 gigabyte is considered a lot of
data. Imagery is a whole different ball game than vectors.
It is not uncommon for organizations to have more than a
terabyte (1,000 gigabytes) of image data. Regular
collection of high-resolution imagery for a single city of
any size may mean more than 1TB of data is delivered
annually.
Imagery is Growing . . . Fast
Not only is imagery big, an image archive is always
getting bigger as organizations acquire new image
datasets. Technology developments mean increased
resolution is available, resulting in further growth in
the total size of the imagery database. The availability
of new sensors means a greater variety of products is
available for the same area.
It’s important to note too that organizations
store older image data in order to keep historical
information about land use and other characteristics.
Aerial and satellite photographs capture the state of land
at a given moment in time, making them a precious resource
for planners, environmental scientists, historians, and
others who can learn from the changing patterns.
Maintaining an archive allows the comparison of 2002 data
with 2004, for example, to distinguish any changes.
Recently, the Australian Greenhouse Office embarked
on a plan to make satellite imagery from the past 30 years
available. This dataset will be a valuable tool for
farmers and will help scientists monitor and measure
global warming. A recent article published by the
University of Nice cited the use of 50 years of aerial
photography as a basis for change detection across the
French Riviera.
While an image deployment system has to cater to
the users’ current needs, there must also be a plan to
add to its capacity in the future to maintain its value
over time.
The Value of Imagery
Imagery reveals additional characteristics of land
not present in typical GIS datasets. Imagery provides
“reality” and detail, improving understanding and
enhancing the users’ comprehension.
Users need access to imagery so that they have the
full picture when making decisions.
For example, a graphical look at a vector-based GIS
data source may tell you where a road is, but it won’t
tell you how many lanes it has, or what state of repair it
is in. Similarly, a vector map of a parcel and an image of
the same parcel reveal very different information (Figure 1).
Professional and casual users from many different
industries and interests find imagery invaluable to their
work and play. Users who don’t have access to imagery
want it. And users who have imagery want more of it.
Using and Managing Imagery
If users can’t efficiently use imagery, it is
worthless. On the other hand, sharing any sizeable amount
of imagery presents certain challenges. Opening up large
imagery files on a local area network (LAN) can be painful
and potentially time consuming. And, there may be access
issues: Are those who need access even on the LAN? What if
they are in another department, another state or another
country? How can they access a terabyte of image data?
The other part of the user equation must address
the user’s ability to use the imagery the way they want.
Do users have the appropriate software tools to access the
imagery? Is the imagery compatible with their needs?
In managing imagery datasets, organizations have
three broad objectives:
- Manage existing imagery datasets.
- Allow for future growth of imagery datasets.
- Efficiently provide appropriate-quality imagery to
all users.
All three goals must be addressed in a real-world
framework, with limited time, equipment, and funding.
Effective Use of Imagery
Imagery is unique in that comparable coverage may
require 1,000 times more data, or more, than a GIS vector
system. A 1TB image is of average size these days, whereas
1TB of vector data is considered a huge GIS system.
Complicating the problem, imagery needs to be accessed at
many different resolutions, allowing users to quickly zoom
from overviews down to detail views, at any scale.
That’s a very different technology challenge than
“zooming in” on vector data.
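The multi-resolution requirement can be illustrated with a
generic overview pyramid, in which each level halves the
resolution of the one before it. This is only a minimal
sketch under assumed values: the function names and the
256-pixel tile size are hypothetical, and it does not
describe how any particular compression format organizes
its resolution levels internally.

```python
import math


def pyramid_levels(width, height, tile=256):
    """Return (width, height) for each resolution level.

    Level 0 is full resolution. Each further level halves both
    dimensions until the whole image fits inside one tile.
    The tile size of 256 pixels is an assumed value.
    """
    levels = [(width, height)]
    while width > tile or height > tile:
        width = max(1, math.ceil(width / 2))
        height = max(1, math.ceil(height / 2))
        levels.append((width, height))
    return levels


def level_for_view(levels, pixels_needed_across):
    """Pick the coarsest level that is still wide enough for the view."""
    for level_width, level_height in reversed(levels):
        if level_width >= pixels_needed_across:
            return (level_width, level_height)
    # The view asks for more detail than exists: fall back to full resolution.
    return levels[0]


if __name__ == "__main__":
    # Hypothetical mosaic of 100,000 x 80,000 pixels.
    levels = pyramid_levels(100_000, 80_000)
    print("number of levels:", len(levels))
    # An overview window 1,000 pixels wide needs only a coarse level.
    print("level used for overview:", level_for_view(levels, 1_000))
```

A zoomed-out view reads from a small, coarse level, and only
a zoomed-in view touches full-resolution pixels, which is why
the overview of a very large image can still open quickly.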
These and other challenges have caused the industry
to turn to image compression technologies to enable large
images to be stored efficiently, and more importantly,
accessed quickly. Image compression offers several
advantages over storage in an uncompressed form.
Compression can:
- Decrease image file sizes.
- Reduce the number of image files by creating
high-quality image mosaics that are more relevant to
users.
- Allow imagery to be served using a specialized,
high performance system providing access via the Web, GIS,
CAD, and other desktop and mobile applications.
Compression: Shrink the Size of
Image Data
Compression reduces the total file size of the
image while retaining its visual quality. The key
benefits of the process revolve around the new, smaller
size of the data. Reduced file size means fewer hardware
resources are required, in particular less disk space. Also,
smaller files mean the imagery becomes more portable when
shared between users and across network infrastructure.
Consider that a 20GB image can’t fit on a CD, but a
500MB compressed image can.
Earth Resource Mapping’s ECW Compression, for
example, uses image processing algorithms to compress and
filter each image. The process discards redundant data
that isn’t necessary to display the image, making it
smaller. Using the ECW format, a color image such as an
air photo can be compressed to between 2% and 5% of its
original size. This can make a huge difference to the
overall volume of imagery, as it means that a 1TB image
reduces to 50GB or less.
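The arithmetic behind these figures can be checked with a
few lines of Python. This is a rough sketch only: the
function names are hypothetical, 1TB is taken as 1,000GB as
elsewhere in this article, and the 700MB CD capacity is an
assumed typical value.

```python
CD_CAPACITY_MB = 700  # assumed typical CD capacity


def compressed_size_mb(original_mb, percent_of_original):
    """Size after compression, given the percentage of the original kept."""
    return original_mb * percent_of_original / 100


def fits_on_cd(size_mb):
    """True if a file of this size fits on one CD of the assumed capacity."""
    return size_mb <= CD_CAPACITY_MB


if __name__ == "__main__":
    one_tb_mb = 1_000_000  # 1TB expressed in MB
    twenty_gb_mb = 20_000  # 20GB expressed in MB

    # A 1TB image kept at 5% of its original size.
    print(compressed_size_mb(one_tb_mb, 5) / 1000, "GB")

    # A 20GB image kept at 2.5% of its original size.
    small = compressed_size_mb(twenty_gb_mb, 2.5)
    print(small, "MB, fits on a CD:", fits_on_cd(small))
```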
JPEG2000 is another format for compressing images.
It offers both “lossless” and lossy compression.
However, lossless compression
typically only offers a 50% reduction in image size
compared to the 95% reduction in image size common with
lossy compression techniques.
Typically air photo and satellite imagery will be
compressed using lossy techniques, whereas digital terrain
height data (Digital Terrain Models, DTMs or Digital
Elevation Models, DEMs) will be compressed using lossless
compression.
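The rule of thumb above can be written down as a small
lookup. The sketch below is illustrative only: the data-kind
labels and function names are hypothetical, and the
percentages are simply the typical 50% and 95% reductions
quoted in this section, not guarantees for any given image.

```python
# Approximate percentage of the original size that remains after
# compression, from the typical reductions quoted in the text.
REMAINING_PERCENT = {"lossless": 50, "lossy": 5}

# Hypothetical labels for elevation rasters, whose cell values are
# measurements rather than pictures.
ELEVATION_KINDS = {"dtm", "dem"}


def choose_mode(kind):
    """Return 'lossless' for elevation data and 'lossy' for photography."""
    return "lossless" if kind.lower() in ELEVATION_KINDS else "lossy"


def estimated_size_gb(original_gb, kind):
    """Rough size after compression, using the typical reductions above."""
    return original_gb * REMAINING_PERCENT[choose_mode(kind)] / 100


if __name__ == "__main__":
    for kind in ("air photo", "satellite", "DEM"):
        print(kind, choose_mode(kind), estimated_size_gb(100, kind), "GB")
```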
Reduce the Number of Images and Maximize the
Value of Imagery
Raw image data is often delivered in a number of
separate images, sometimes in the hundreds, covering a
particular land area. Storing the image data in separate
image files may not be useful or efficient for the end
user. If an area of interest straddles two or more images,
it can be difficult for users to effectively analyze the
area. To look at an area of any size, analysts may have to
juggle tens or hundreds of separate images.
One solution is to convert separate files into one
seamless image mosaic. Figure 2 shows a land area
covered by a number of different aerial photographs. By
converting the raw data into a seamless ECW image mosaic,
using ER Mapper, the image data can be viewed and analyzed
as a single continuous image.
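To see why separate files are awkward, consider the lookup
that has to happen before an area of interest can be
displayed. The sketch below uses made-up file names and
coordinates, and plain bounding boxes instead of real
georeferencing; it is not how ER Mapper builds a mosaic.

```python
from typing import Dict, List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned footprint in map units (for example, meters)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def intersects(a: Box, b: Box) -> bool:
    """True if the two boxes share some area."""
    return (a.min_x < b.max_x and b.min_x < a.max_x
            and a.min_y < b.max_y and b.min_y < a.max_y)


def files_for_area(footprints: Dict[str, Box], area: Box) -> List[str]:
    """Names of the separate image files an analyst would have to open."""
    return sorted(name for name, box in footprints.items()
                  if intersects(box, area))


def mosaic_extent(footprints: Dict[str, Box]) -> Box:
    """Footprint of a single mosaic covering all of the input images."""
    boxes = list(footprints.values())
    return Box(min(b.min_x for b in boxes), min(b.min_y for b in boxes),
               max(b.max_x for b in boxes), max(b.max_y for b in boxes))


# Hypothetical 2 x 2 block of air photos, each 1,000 units square.
footprints = {
    "photo_sw.tif": Box(0, 0, 1000, 1000),
    "photo_se.tif": Box(1000, 0, 2000, 1000),
    "photo_nw.tif": Box(0, 1000, 1000, 2000),
    "photo_ne.tif": Box(1000, 1000, 2000, 2000),
}

if __name__ == "__main__":
    # A small area that happens to sit where the four photos meet.
    area = Box(900, 900, 1100, 1100)
    print("files needed:", files_for_area(footprints, area))
    print("one mosaic covers:", mosaic_extent(footprints))
```

Even a small area of interest can require every file in the
block, whereas a mosaic presents the same ground as one image.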
In Western Australia, the Department of Land
Information (DLI) is the custodian of a large amount of
aerial photography (5+ terabytes). Local governments look
to DLI to provide them with aerial photography. Rather
than provide each local government with tens or hundreds
of different shots for their geography, DLI provides each
one with a single mosaicked image, covering the
municipality’s area.
Providing Fast Access to Large
Imagery Datasets
Even with compression, a single image file can
still be quite large, weighing in at hundreds of megabytes
or up to several gigabytes. Opening these images over a
standard LAN environment can be time consuming. Providing
imagery access via an extranet or the Internet over
narrowband connections may prove impractical, if not
impossible.
Employing image deployment technology can speed up
the process of image delivery. A specialized,
high-performance application that serves image data via a
local area network and the Internet, such as Earth Resource
Mapping’s (ERM’s) Image Web Server, can alleviate this
challenge (Figure 3).
Image Web Server, and related solutions, allow for
the fast and efficient deployment of even the largest
image datasets, via image streaming. Image streaming is a
process that provides the client only the data needed for
a particular view. Unnecessary data is not downloaded and
does not clog the network or the user’s local resources.
For example, exploring a single property in a mosaic does
not require detailed information for properties nearby.
With compression and appropriate server technology, it is
possible to view a terabyte-size image over the Internet
with a 56K modem connection.
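A back-of-the-envelope estimate shows why streaming makes
this plausible. The sketch below is not a description of
Image Web Server’s protocol. It assumes a 1024 x 768 pixel
view, three 8-bit color bands, compression to 5% of the
original size, and an ideal 56,000 bits-per-second link with
no protocol overhead; all function names are hypothetical.

```python
def view_bytes(view_width_px, view_height_px, bands=3, percent_of_original=5):
    """Approximate compressed bytes needed to fill one view.

    Assumes one byte per band per pixel before compression.
    """
    raw_bytes = view_width_px * view_height_px * bands
    return raw_bytes * percent_of_original / 100


def transfer_seconds(n_bytes, link_bits_per_second):
    """Ideal transfer time, ignoring latency and protocol overhead."""
    return n_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    modem_bps = 56_000

    # Streaming: only the pixels for the current view are sent.
    one_view = view_bytes(1024, 768)
    print("one view: %.0f seconds" % transfer_seconds(one_view, modem_bps))

    # Downloading the whole file: a 1TB image compressed to 50GB.
    whole_file = 50e9
    days = transfer_seconds(whole_file, modem_bps) / 86_400
    print("whole file: %.0f days" % days)
```

The cost of a view depends on the size of the window on
screen, not on the size of the image behind it, so the same
estimate holds whether the mosaic is 1GB or 1TB.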
Arizona-based Aerials Express, a provider of aerial
photography for the whole of the United States, provides
online access to a multi-terabyte image repository via the
Internet using ERM’s compression and Image Web Server
solution.
Conclusion
Imagery is increasingly important to business
practices and workflows of many different industries, as
well as to the public at large. That places organizations
hosting the imagery in the position of managing and
delivering large quantities of image data to a variety of
users.
Applying relational database-centric solutions to
imagery hasn’t produced optimal results. Using
image-technology-based approaches, including compression and
specialty serving software, can bring large images to
computer desktops and other devices.
About the Author
Richard Orchard is a large-scale imagery deployment
and solutions specialist for Earth Resource Mapping.