Teraflops Tackle
Terabytes on the TeraGrid
Three Groups from The University of Texas at Austin Are
Working on Real-time Flood Hazard Prediction
Merry Maisel, with Gordon Wells
Imagine that a major hurricane will make landfall
over a heavily populated coastal city like Houston, Texas,
within the next 24 hours. With the storm approaching the
mainland, a continuing deluge of rainfall already blocks
several primary evacuation routes. You are responsible for
providing emergency managers with an accurate forecast of
the flash-flood potential in broad areas threatened by
constantly changing conditions, as rain bands sweep across
the coastline and streams begin to rise beyond their banks
(Figure 1). If you are Gordon Wells of the Center for Space
Research (CSR) at The University of Texas at Austin (UT
Austin), you have at your disposal hundreds of gigabytes
of detailed elevation data collected by LiDAR and recent
high-resolution orthoimagery from satellite and aerial
surveys. You can run a sophisticated hydraulic and
hydrologic model with which to simulate floods, using
real-time NEXRAD Doppler radar data for estimating
rainfall accumulation, and you can access the city’s GIS
containing information about the stormwater drainage
system, automated stream gages, evacuation routes, and
critical infrastructure. Now comes the crucial question, “one we ask
ourselves every day,” says Wells. “Even though the
necessary data are at hand, can we make predictions of the
impending disaster with the accuracy, detail, and
timeliness required to offer guidance to emergency
managers?” There are two approaches to an answer, both
involving computation. One is to simulate a large number
of flash-flood scenarios well in advance of any actual
storm, then attempt to match the real-time observed
conditions to the most relevant scenario. But Wells says,
“We doubt that any number of flood simulations could
capture the spatial and temporal complexities of rainfall
distribution and watershed response during an actual flood
in more than a generalized manner.” The second approach
is to create real-time model simulations of metropolitan
flooding as events take place. Where would CSR find the
required on-demand computational firepower?
Enter TACC and the TeraGrid
CSR already works with the Texas Advanced Computing
Center (TACC) at UT Austin on several data-intensive
processing requirements for NASA satellite missions, such
as the monthly terrestrial gravity models produced for the
Gravity Recovery and Climate Experiment (GRACE) launched
in 2002. CSR operates a direct broadcast satellite
receiving station as part of its Mid-American Geospatial
Information Center (MAGIC) program, and Wells and his
group take advantage of the TACC resources in processing
and producing data products from a variety of
satellite-borne remote sensing instruments.
In the past three years, under the direction of Jay
Boisseau, TACC has become one of the premier academic
computing centers in the nation. In addition to supplying
archival storage for CSR and other UT institutes’ data (TACC’s
new systems can store more than 2 petabytes—2 thousand
trillion bytes), TACC operates several very large
supercomputing systems. The most recently installed
system, a Cray-Dell Linux cluster with 856 processors, has
a theoretical peak speed of more than 5 trillion
floating-point operations per second (5 teraflops). This
cluster will soon grow to more than 1,000 processors with
more than 6 teraflops peak performance.
Last September, TACC received a multimillion-dollar
award from the National Science Foundation to become a
participant in the TeraGrid, the nation’s largest
academic grid computing project (www.teragrid.org). The
TeraGrid links the resources of nine universities and
national laboratories over the world’s fastest dedicated
network (running at 40 gigabits per second). With its many
tens of teraflops of combined computing power and multiple
scientific visualization resources all online in 2005, the
TeraGrid is seen by Wells and Boisseau as an ideal
platform on which to test a real-time, on-demand flood
prediction capability (Figure 2).
Real-Time Flood Prediction and
Management
To address the need for real-time flood
hazard forecasts for emergency management, TACC will
join with CSR, the Center for Research in Water
Resources (CRWR) at UT Austin, and fellow TeraGrid
participants at Oak Ridge National Laboratory (ORNL)
and Purdue University to develop the capability to
model flood events.
They will use the Map2Map model developed by
Professor David Maidment and his CRWR team. Map2Map, based
on ESRI’s ArcHydro data model, incorporates real-time
NEXRAD rainfall estimates into a standard hydraulic and
hydrologic model to predict inundation surfaces for
affected areas. Parallel processing will permit
regeneration of flood surfaces in near real-time when
triggered by sequences of NEXRAD inputs.
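The event-driven regeneration described above can be sketched in miniature. This is not the Map2Map code itself: the runoff coefficient, bank height, cell names, and rainfall values below are all hypothetical stand-ins for the full hydraulic and hydrologic models that Map2Map couples to NEXRAD estimates.

```python
RUNOFF_COEFF = 0.6   # hypothetical fraction of rainfall becoming runoff
BANK_HEIGHT_M = 1.5  # hypothetical depth (m) at which a cell floods

def update_flood_surface(depths, rainfall_mm):
    """Advance simulated stream depths one step from a rainfall grid (mm)."""
    new_depths = {}
    for cell, rain in rainfall_mm.items():
        runoff_m = RUNOFF_COEFF * rain / 1000.0
        new_depths[cell] = depths.get(cell, 0.0) + runoff_m
    return new_depths

def inundated(depths):
    """Cells whose simulated depth exceeds the bank height."""
    return sorted(c for c, d in depths.items() if d > BANK_HEIGHT_M)

# A sequence of three NEXRAD scans triggers three surface updates.
depths = {}
scans = [{"A1": 800, "A2": 200}, {"A1": 900, "A2": 300}, {"A1": 1000, "A2": 100}]
for scan in scans:
    depths = update_flood_surface(depths, scan)
print(inundated(depths))  # -> ['A1']
```

In the real system, each such update would be farmed out across many processors so that a new inundation surface is ready within minutes of a radar scan.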
The ORNL group, directed by Budhendra Bhaduri, is
preparing dynamic population data for the Houston area,
tracing the movements of people over a 24-hour cycle.
Where will they be when a hurricane makes landfall? ORNL
can examine the impact of floods that occur at different
times of the day, using a transportation model containing
evacuation routes tied to the dynamic population database.
Results from the flash flood simulations will draw
upon the TeraGrid visualization resources at TACC and
Purdue to stream geospatial representations of the
flooding to the State Operations Center in Austin and to
other Emergency Operations Centers throughout the state,
over the state’s high-speed data network for emergency
management.
Remote Sensing over the TeraGrid
Satellite remote sensing provides another example
of geospatial technology with massive data handling and
intensive computing requirements that will benefit from
the resources managed by TeraGrid participants. The
primary focus of the MAGIC receiving station, for example,
is to supply near real-time data products for the state
and federal agencies that monitor regional air quality,
water resources, and agriculture, and for emergency
management during natural and man-made disasters. A single
S/L-Band and two X-Band antennas currently collect
transmissions from 14 different satellites and produce 50
gigabytes of telemetry and data products for storage and
distribution each day. With the future addition of
high-resolution radar data to be transmitted from the
German DLR TerraSAR-X satellite scheduled for launch in
2006, the receiving station will collect more than 200
gigabytes of data per day (Figure 3).
Working with TACC, CSR manages the flow of data
from the processing systems of the receiving station to an
online Redundant Array of Independent Disks (RAID) storage
system for recent acquisitions of satellite data. After
two months, older datasets migrate to near-line storage on
archival tape in TACC’s multi-petabyte robotic retrieval
system. Data archive users select products for delivery by
specifying the file format, band combination, subset area,
and map projection through a graphical interface (http://synergyx.tacc.utexas.edu/DataQuery/).
Custom data products are prepared for delivery, and the
system sends an e-mail message to the user, who then
collects the requested data from an FTP site.
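The tiering policy behind this workflow is simple to sketch. The 60-day cutoff below is an approximation of the "two months" described above, and the function name is illustrative rather than part of any TACC software.

```python
import datetime

# Hypothetical cutoff approximating the two-month migration window.
MIGRATION_AGE = datetime.timedelta(days=60)

def storage_tier(acquired, today):
    """Decide where a dataset lives: recent acquisitions stay on the
    online RAID system; older datasets migrate to near-line tape."""
    return "raid" if today - acquired < MIGRATION_AGE else "tape"

today = datetime.date(2004, 9, 1)
print(storage_tier(datetime.date(2004, 8, 15), today))  # -> raid
print(storage_tier(datetime.date(2004, 5, 1), today))   # -> tape
```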
The availability of teraflops of TACC computing
resources also enables rapid reprocessing of archival
data. This is necessary to build and test improved
algorithms for aerosol detection, ocean color
discrimination, atmospheric correction, and other
processing procedures that extract more information from
the data stream. Extended time series of archival data,
efficiently processed through parallel computing, can
improve the analysis of changing conditions (Figure 4).
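Reprocessing an archive is naturally parallel: each scene can be handed to a different processor independently. The sketch below uses Python's standard process pool; the "correction" applied is a hypothetical placeholder for an improved algorithm such as a new atmospheric correction.

```python
from multiprocessing import Pool

def reprocess(scene):
    """Apply a stand-in correction to one archived scene and return
    its identifier with the corrected mean value."""
    ident, raw_values = scene
    corrected = [v * 0.95 for v in raw_values]  # hypothetical correction
    return ident, sum(corrected) / len(corrected)

if __name__ == "__main__":
    # A tiny stand-in archive of four scenes.
    archive = [(f"scene_{i}", [i + 1.0, i + 2.0]) for i in range(4)]
    with Pool(2) as pool:  # scenes reprocessed in parallel
        results = dict(pool.map(reprocess, archive))
    print(round(results["scene_0"], 3))  # -> 1.425
```

With thousands of processors, an entire multi-year time series can be reworked in this fashion each time an algorithm improves.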
Real-Time Satellite Remote Sensing
“The distributed resources of the TeraGrid should
likewise be able to contribute to near real-time satellite
remote sensing,” Wells says. Direct-broadcast receiving
stations connected together by high-speed networks can
cooperate to share the data transmitted by a satellite
overpass to generate near real-time data products. For
instance, NASA’s Moderate Resolution Imaging
Spectroradiometer (MODIS) instruments on the Terra and
Aqua satellites transmit entire orbital swaths of data
from their solid-state recorders as they pass over the
northern polar regions. The initial products from these
telemetry transmissions become available from NASA only
several hours or more after MODIS images a region. But
ground stations track and receive direct broadcasts from
Terra and Aqua within their line-of-sight and can generate
Level 1 (radiometric and geometrically calibrated)
products within several minutes (Figure 5).
CSR has conducted experiments in which
direct-broadcast data collected by CSR, Rutgers, and
Louisiana State University during the same satellite
overpasses were combined into composite data collections
for MODIS and the Indian IRS-P4 Ocean Color Monitor
(OCM). When the
Purdue Terrestrial Observatory (PTO) begins operation in
2005, the high-speed TeraGrid network will allow the CSR
and PTO receiving stations to exchange data collections,
compare datasets, perform data quality analyses, and
generate composite Level 1 data products within minutes of
data reception.
As more stations link to the TeraGrid and other
high-speed networks, Wells says, “I foresee tracking
satellites through a series of near real-time data
exchanges that collect and integrate data from portions of
the same orbital pass, from Canada across Latin
America.” The ability to receive and share data in this
manner is particularly important for satellites that do
not use onboard recorders to store data, such as the OCM,
for which the direct broadcasts are the only records of
each data collection.
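Compositing a pass from several stations amounts to merging the scan-line ranges each station received. The station names and line ranges below are illustrative; a real composite would also reconcile calibration and geolocation between stations.

```python
def composite(segments):
    """Merge per-station (station, start, stop) scan-line ranges into
    one coverage map, keeping the first station to see each line."""
    coverage = {}
    for station, start, stop in segments:
        for line in range(start, stop):
            coverage.setdefault(line, station)
    return coverage

# Hypothetical segments of one orbital pass, with overlaps.
segments = [
    ("CSR",     0, 120),    # sees the start of the pass
    ("Rutgers", 100, 260),  # overlaps CSR around lines 100-119
    ("LSU",     90, 200),
]
cov = composite(segments)
print(len(cov))   # -> 260 scan lines covered in total
print(cov[110])   # -> CSR (first station to cover the overlap)
```

Exchanged over a 40-gigabit-per-second network, such merges could plausibly complete within minutes of data reception.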
Future Cyberinfrastructure
“It is extremely rewarding for TACC to work with
such talented researchers on problems that really make a
difference in people’s lives,” says TACC Director Jay
Boisseau. “We’re passionate about using advanced
computing technologies to solve important problems, and
this is work that showcases the value that supercomputers,
visualization systems, databases, and grids have for
society.” As applications are developed that harness the
power of TeraGrid resources, the need will grow to create
portals to the data and model results that connect to
users who do not have access to a high-speed network. The
geospatial information community may be among the first to
benefit from portals to the TeraGrid.
Just as the DOD-sponsored ARPANET evolved over the
course of three decades to become the modern commodity
Internet, the TeraGrid may signal
the first stages of a broad-reaching cyberinfrastructure.
The TeraGrid brings data processing, storage, and
visualization capabilities to large numbers of users in
the same way that today’s Internet offers the standard
digital content of text, graphics, music, and video.
As grid supercomputing extends beyond the realm of
university research, the outcome will likely change the
way we view and use geospatial information.
About the Authors
Merry Maisel is a science writer at the Texas
Advanced Computing Center, The University of Texas at
Austin, and can be reached at [email protected].
Gordon Wells is the Program Manager for the
Mid-American Geospatial Information Center at the Center
for Space Research, The University of Texas at Austin, and
can be reached at [email protected].