Fast 3D Visualization of Large Image Datasets in a GIS

By Clayton Crawford, Salvador Bayarri, and Dragan Petrovic

Remote sensing technology has long provided vast quantities of image data. An increased number of sensor platforms, in conjunction with finer resolution data capture, ensures this trend will continue. For many users of this data, efficient display of large image datasets is critical. 3D display is often preferable but presents a unique set of problems. Specialized applications, particularly in the field of simulation, have been available for some time, but a generalized solution suitable for a mainstream GIS has remained elusive. This article discusses a technology that efficiently displays raster datasets, in interactive 3D, on a globe. It handles imagery of virtually any extent and spatial resolution, regardless of physical dataset size. Elevation data, in a variety of resolutions, can be incorporated to influence the surface geometry of the base globe. The technology works with modest hardware requirements by utilizing an intelligent, scale-dependent paging mechanism that operates on data processed into multiple levels-of-detail (LOD). The integration of these display capabilities within a GIS provides a framework useful for many applications. It adds the ability to efficiently display huge image databases in 3D to the existing comprehensive set of display and query operators of a GIS. Automatic mapping of georeferenced imagery to its proper location on the globe provides a user-friendly and intuitive interface. This, combined with Internet support, should prove useful for remote GIS services, including visualization and graphics-oriented data searching. It is hoped that this work will help the GIS and remote sensing communities take full advantage of the image data available today.

Professionals in the GIS community have ready access to very large image archives. Aerial photography, satellite imagery, and scanned maps can be easily procured or even obtained for free from the Internet. Over time, demand for increased accuracy has also resulted in finer resolution data. Sub-meter satellite imagery, available to the general public, is now a reality. Images exceeding 1 gigabyte in size are not uncommon. Analytic derivatives from these products increase data holdings even further. For many, efficient access to these quantities of data is an issue. The lucky few have specialized software and hardware to handle this problem, but most GIS users, working with less expensive, generic solutions, experience unwanted delays. The goal is to incorporate key aspects of these specialized systems into a GIS so the capability becomes less expensive and ubiquitous.

Simulation systems represent a prime example of a specialized application domain. These use specialized software and hardware to quickly render large amounts of information. When one looks to find commonalities in these systems, what is typically found is a data processing scheme used in conjunction with 3D graphics hardware (Falby, 1993). Input data, often sourced from a GIS database to take advantage of georeferencing and clean-up capabilities, is converted into a non-standard proprietary format (Polis, 1994) with multiple levels-of-detail (LOD). What these LODs provide is the ability to quickly return detailed data where needed, such as the foreground of a perspective display, and more generalized data elsewhere. The ultimate purpose is to reduce the amount of data that needs to be rendered at any given moment. This enables the graphics hardware to provide smooth real-time display of the data. These systems work in practice, because even when the volume of information for the entire database is large, the amount that needs to be retrieved and interacted with by the user at any moment is relatively small.

To accomplish a similar feat in a GIS is a challenge, but not an insurmountable one. As previously mentioned, many simulation environments already use a GIS as a back-end data processing system. The trick is to add the LOD data processing and graphics hardware capabilities to the GIS and make them as transparent as possible to the end user. An intuitive front-end that exploits the capability and fits naturally in a GIS environment also helps. Since GIS applications range from local to global in scope, an interface that easily handles this range in scale is important. What has been implemented is a globe-centric application, referred to here as ArcGlobe, which works as an extension to an existing GIS.

A Globe-Centric Viewing Environment

The viewing context of the ArcGlobe application is a continuous sphere in true 3D space (Figure 1). Users can spin it around, zoom in and out, fly, view from above like a map, or side-on (obliquely) in perspective. The surface geometry of the sphere can be augmented with elevation data. The application ships with, and displays, global data out of the box. This gives the user a base environment to work from that just gets richer in detail as higher resolution site specific data is added. This is easy to do as geo-referenced data is automatically placed at its proper location on the earth. Since the data is displayed on a sphere, it need not be projected ahead of time. All that is required is registration with a geographic coordinate system and a datum. A 3D perspective projection is applied at runtime. If the input image data is already in a projection, the system can "unproject" through reuse of tools already in the GIS. All the data is displayed using the same datum. The user selects the datum that the majority of data holdings are in. Any input data in a different datum will be transformed on the fly using the appropriate datum transformation code available in the GIS.

Aside from ease of use, another benefit of having data displayed on a sphere is the effect of Earth curvature. This yields an accurate horizon beyond which nothing is visible. There are many GIS applications, particularly in the domain of visibility analysis (e.g., line-of-sight, or LOS, and viewshed), that need to take curvature into consideration. If one wants to see what is visible from a particular vantage point, with an accurate horizon, one simply moves the virtual camera that defines the perspective to that location. Elevation data of sufficient accuracy is necessary, however.

The globe-centric nature of the application promotes a seamless view of geography. Everything is viewed in context with its surroundings. A user's data holdings are blended into the global base data creating complete continuity. There are no data voids, islands, or "edge of the world" effects that make it seem like there is nothing outside the user's study area.

3D Rendering

In order to display potentially massive amounts of data in a real-time graphics environment, one must apply tactics on a number of fronts. More specifically, this involves the graphics application programming interface (API), the graphics hardware, and data management. All three are closely related, although the first two are more so.

Graphics API and Hardware

ArcGlobe uses OpenGL, an industry-standard graphics API developed initially by Silicon Graphics Inc. (SGI) and subsequently turned into a standard by an industry consortium. It is the most commonly used API for scientific visualization. ArcGlobe benefits from OpenGL's support across computing platforms (UNIX and Wintel). Additionally, use of a standard graphics API translates into hardware acceleration. A competitive graphics hardware industry is constantly improving performance while simultaneously reducing costs. Consumer-level graphics cards in standard PCs are now capable of what was only possible in expensive specialized equipment just a few years ago. In this regard, ArcGlobe has benefited from the passage of time. The hardware required is now relatively commonplace. What remains a challenge is the data processing.

Global Indexing Scheme

Like many other applications that handle large amounts of data, ArcGlobe uses a divide-and-conquer technique. In this case, the methodology is similar in concept to research performed in support of the "Digital Earth" (Faust, 2000; Goodchild, 2000). A hierarchical global spatial indexing scheme, implemented through recursive subdivision, organizes data into manageable pieces (Hansen, 1999). All input data is indexed using the same logic to provide consistency. The coarsest level of subdivision views the globe as a cube, dividing it into six 90-degree square tiles (Figure 2). Four tiles encompass the equatorial regions of the globe, between 45 degrees north and south. The remaining two cover the polar regions.

Each 90-degree tile can be recursively subdivided into smaller areas. The location and degree of subdivision depend on where data is available and its resolution. It also has a direct relationship to LODs. Assume an image occupies a full 90-degree tile. Using a power-of-two subdivision, the first recursion breaks the tile into four 45-degree tiles, the second into sixteen 22.5-degree tiles, and so on (Figure 3). Resampling is performed at each step, producing a 512x512 raster for each tile. Thus, larger tiles contain coarser representations of the data. Subdivision is repeated until the effective tile resolution matches the data source. Up to 32 LODs are possible, which achieves sub-centimeter resolution at the equator. This tiling is not something the user needs to be concerned with. The software automatically generates the scheme at run-time. No data conversion needs to take place. It happens transparently, yet is crucial in achieving real-time performance.
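As a rough illustration of the arithmetic behind this scheme, the following sketch computes the effective ground resolution of a tile at a given LOD and the LOD needed to match a source dataset. The constants and function names are illustrative, not the actual ArcGlobe implementation.

```python
TILE_PIXELS = 512        # each tile is resampled to a 512x512 raster
M_PER_DEG = 111_320.0    # approximate metres per degree at the equator

def tile_span_deg(lod):
    """Angular extent of one tile at a given LOD (LOD 0 = a 90-degree tile)."""
    return 90.0 / (2 ** lod)

def ground_resolution_m(lod):
    """Approximate metres per pixel at the equator for a given LOD."""
    return tile_span_deg(lod) * M_PER_DEG / TILE_PIXELS

def lod_for_resolution(res_m):
    """Smallest LOD whose effective resolution is at least as fine as the source."""
    lod = 0
    while ground_resolution_m(lod) > res_m:
        lod += 1
    return lod
```

Under these assumptions, LOD 0 works out to roughly 19.6 km per pixel; 30-meter elevation data is fully represented by LOD 10, and sub-centimeter resolution is reached by LOD 21, comfortably within the 32 available levels.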

The rendering subsystem uses this spatial index when placing calls for data that must be retrieved for display. At any change in perspective, the subsystem quickly determines what tiles are needed and at what LOD. This is based on what's visible and its relative distance to the virtual camera (Figure 4). Foreground information requires greater detail than background. Since any number of datasets may be present, a determination must also be made to find which ones have a spatial extent overlapping the visible tiles. The appropriate subset of data is then retrieved for rendering. If the input datasets overlap in spatial extent, the highest resolution data will be displayed in the area of overlap. This behavior can be overridden if the thematic importance of an image takes precedence over resolution.
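A minimal sketch of this distance-based LOD selection, assuming a simple pinhole-camera model; the field-of-view, viewport size, and function name are hypothetical, and a production system would use a proper screen-space-error metric.

```python
import math

def required_lod(distance_m, fov_deg=45.0, viewport_px=1024, max_lod=31):
    """Choose a level of detail so that one texel covers roughly one screen
    pixel: ground size per pixel at distance d is about d * tan(fov) / viewport."""
    ground_per_pixel = distance_m * math.tan(math.radians(fov_deg)) / viewport_px
    res = 90.0 * 111_320.0 / 512   # LOD 0 resolution in metres per pixel
    lod = 0
    while lod < max_lod and res > ground_per_pixel:
        res /= 2.0
        lod += 1
    return lod
```

Tiles near the camera thus request a high LOD while distant background tiles get by with a coarse one, which is what keeps the per-frame data volume roughly constant.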

Resampled data generated from an image or elevation model are stored in a cache that follows the hierarchical spatial index. Only those areas that have been visited are cached. So, low-resolution textures might be generated for the entire extent of an image when viewed from a distance, but high-resolution textures are only made for the specific locations visited by the user in close perspective. Data in the cache are reused when an area that has already been cached is viewed again. This process of on-demand caching avoids the need to pre-process an entire dataset before it can be viewed. It can be a tremendous time saver when a relatively small area of a large dataset needs to be viewed. As an option, the user may choose to cache an entire dataset off-line in a batch-oriented pre-process before viewing it. This is useful when the utmost in performance is required and when the majority of the data is likely to be viewed in high detail.
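The on-demand behavior can be sketched as follows; `resample` is a hypothetical stand-in for the expensive resampling step, and the class is an illustration rather than ArcGlobe's actual cache.

```python
class TileCache:
    """On-demand tile cache keyed by (lod, row, col): a tile is resampled only
    the first time it is requested; revisits are served from the cache."""

    def __init__(self, resample):
        self._resample = resample  # callable (lod, row, col) -> tile data
        self._tiles = {}
        self.misses = 0            # number of tiles actually resampled

    def get(self, lod, row, col):
        key = (lod, row, col)
        if key not in self._tiles:
            self.misses += 1
            self._tiles[key] = self._resample(lod, row, col)
        return self._tiles[key]
```

Repeated visits to the same area incur no further resampling cost, which is exactly why only the regions a user actually explores need to be processed.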

As with other systems designed to handle large quantities of data, a dynamic paging mechanism is used to load information in and out of memory (Lindstrom, 2001). Only the data required for the current perspective is loaded. Any additional memory that is available is used to store the most recently viewed data, anticipating the likelihood it will be needed again. This paging ability keeps memory requirements low even when viewing terabytes' worth of imagery and elevation data.
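A least-recently-used paging policy of this kind can be sketched as below; the class and parameter names are illustrative, with `load` standing in for a disk or network read.

```python
from collections import OrderedDict

class TilePager:
    """Fixed-budget paging: keeps the most recently used tiles resident,
    evicting the least recently used tile once the budget is exceeded."""

    def __init__(self, max_tiles, load):
        self.max_tiles = max_tiles
        self._load = load              # callable key -> tile data
        self._resident = OrderedDict()

    def get(self, key):
        if key in self._resident:
            self._resident.move_to_end(key)         # mark as recently used
        else:
            if len(self._resident) >= self.max_tiles:
                self._resident.popitem(last=False)  # evict least recently used
            self._resident[key] = self._load(key)
        return self._resident[key]
```

The memory footprint is bounded by `max_tiles` regardless of total dataset size, which is the property that lets terabyte-scale archives be browsed on modest hardware.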

Any lag in acquiring the necessary data can be handled gracefully through the use of multiple threads. One thread can fetch data while another processes events from the user. So, for example, the user can still navigate and change perspective while more data is being acquired, rather than the system freezing up while waiting for the data to arrive.
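A sketch of this two-thread arrangement, using a request queue and a result queue; the function name and sentinel convention are assumptions for illustration, not ArcGlobe's actual design.

```python
import threading
import queue

def start_fetch_thread(requests, results, fetch):
    """Run tile fetches on a background thread so the main thread can keep
    processing user events; `None` on the request queue shuts the thread down."""
    def worker():
        while True:
            key = requests.get()
            if key is None:
                return
            results.put((key, fetch(key)))
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The main thread posts tile requests and drains completed results once per frame, so navigation stays responsive even when `fetch` is slow.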

All input datasets are treated as separate entities, or layers, that can be turned on and off at will. This is standard functionality in a GIS mapping environment but represents a significant departure from some other simulation and real-time display systems, which may require all data to be combined into one homogeneous dataset in a long pre-processing step before it can be viewed.

Direct Read of Raster Data in Its Native Format

One can imagine that requests for subsets of data, resampled to specific resolutions, from very large raster datasets would be slow. This is true, and it is the reason why other real-time graphics applications that support large datasets typically convert the input imagery or elevation data into some proprietary multi-resolution format. Nor is this only an issue for 3D applications. Planimetric mapping systems have a similar problem: panning and roaming around big images in 2D can take a long time unless some form of optimization is used. One readily adopted approach has been the use of image pyramids. These are basically downsampled versions of an image stored as a collection in an ancillary file. At small display scales, the full detail of the image cannot be seen, so instead of attempting to draw the entire full-resolution image, the appropriate pyramid is used. Only at large display scales is the full resolution used. Typically, a large display scale means only a small portion of the image is being viewed. Pyramids thus keep the amount of data that needs to be rendered at any one time relatively constant, regardless of display scale.
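The construction of such a pyramid can be sketched as repeated 2x2 averaging; this is a simplification for illustration, since production systems store pyramids in optimized binary files and may use other resampling kernels.

```python
def build_pyramid(image):
    """Build an image pyramid by repeatedly halving resolution, averaging each
    2x2 block; `image` is a square 2^n by 2^n grid (list of lists of numbers)."""
    levels = [image]
    while len(levels[-1]) > 1:
        src = levels[-1]
        half = len(src) // 2
        levels.append([
            [(src[2 * r][2 * c] + src[2 * r][2 * c + 1]
              + src[2 * r + 1][2 * c] + src[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(half)]
            for r in range(half)
        ])
    return levels
```

Each successive level holds a quarter of the pixels of the one before it, so the whole pyramid adds only about a third to the original storage cost.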

The same pyramids used for planimetric mapping can be used in 3D to quickly display raster data in its native format. When viewed from a distance, the application will ask an image for low-resolution textures, which can be generated quickly from its low-resolution pyramids. As the virtual camera comes in close, high-resolution data is requested, but only for a much smaller area, so retrieval remains fast. Thus, in the presence of pyramids, which are usually present anyway for the sake of efficient 2D mapping in the GIS, imagery can be read without the need for conversion. All image formats supported by the GIS, be they file based or served from an enterprise database, will work. The same approach also applies to elevation data, as pyramids can be used with raster DEMs.

Terrain

Surface geometry for topography and bathymetry is supported through the use of raster elevation models. The same spatial indexing scheme used to process imagery is used for terrain. As with the provision of base imagery, the application comes with worldwide elevation coverage: GTOPO30 for topography and ETOPO2 for bathymetry. The former is 30-second data with roughly 1 km resolution; the latter is 2-minute data with roughly 4 km resolution. Users can add higher resolution data where available. If there is overlap in coverage, the higher-resolution data takes precedence.

The potential for discontinuity in surface geometry exists due to the presence of tile-based LODs. Some systems that use LODs for topography exhibit visual anomalies by suddenly popping in higher resolution data or, worse yet, by revealing cracks in the surface geometry between LOD tile boundaries. What is desired is a continuous surface free of those artifacts (Duchaineau, 1997; Hoppe, 1998; Lindstrom, 2002). This is achieved through a tessellation that obtains heights across LODs through averaging, smoothly morphing the surface from one LOD to the next. The tessellation is calculated in real time based on the current perspective. A recursive technique forms a tessellation that is coarse in the background with increasing refinement toward the foreground (Figure 5). Based on distance to the observer, each node of the tessellation is assigned its LOD measure as a floating-point value rather than as a discrete integer. The decimal portion is used as a weight to interpolate a height between the corresponding LODs. For example, a node with a calculated LOD of 7.9 will average heights between LOD 7 and LOD 8, giving 90% of the weight to LOD 8.
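The fractional-LOD blending in the 7.9 example can be sketched as follows; `height_at` is a hypothetical sampler returning the height at an integer LOD, named here only for illustration.

```python
import math

def morphed_height(lod_measure, height_at):
    """Blend heights across adjacent LODs using the fractional part of a node's
    LOD measure; an LOD measure of 7.9 gives 90% weight to LOD 8."""
    lower = math.floor(lod_measure)
    weight = lod_measure - lower   # weight given to the finer LOD
    return (1.0 - weight) * height_at(lower) + weight * height_at(lower + 1)
```

Because the weight varies continuously with distance, the surface morphs smoothly between levels instead of popping, and neighboring tiles sampled this way agree along their shared edges.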

Benefits of GIS Integration

Having a technology like ArcGlobe that is integrated with a GIS is useful for both 3D graphics applications and traditional mapping requirements of the GIS. Successful integration requires a sharing of underlying technology, plus reuse of higher-level user interface constructs. To be considered useful in a GIS context the viewing application must offer a number of basic capabilities common to mapping applications. For example, "locate" and "identify" are important query operators. Layer transparency and rule-based visibility (e.g., distance/scale) are examples related to data display. ArcGlobe supports these operations either by direct use of the underlying GIS technology or through its own implementation. Examples of benefits coming directly from the GIS include:

  • data creation/editing

  • data management

  • raster (image/DEM) file format support

  • raster data in enterprise geodatabases

  • geo-referencing tools

  • symbology

  • pyramid layers

  • spatial database query

  • analysis, derivation of new data

Simulation systems have used the data processing capabilities of GIS for some time. The GIS is used to import data from a variety of formats, co-register them, define spatial referencing, apply geometry and topology operators to clean up the data, and so on. But this has typically meant that data must be converted between systems, and in the process the "map intelligence" of the GIS is lost. Querying for information about a location might only be supported at the most rudimentary level in the simulator. Once the 3D application is in the GIS, these issues go away.

The GIS benefits, too. Interactive display and query of map data has long been considered a primary strength of GIS. With this integration it can now be performed in interactive 3D which is arguably more intuitive and informative than planimetric display. Additionally, the rapid retrieval and display system implemented by ArcGlobe makes the process of viewing large collections of imagery more efficient. The use of hardware acceleration and the implementation of a caching mechanism offer significant performance gains.

Results

As proof of concept, the authors and others associated with the development project have successfully used terabytes' worth of imagery and elevation data. Worldwide datasets include NASA MODIS imagery, GTOPO30 topography, and ETOPO2 bathymetry. For large-area coverage with medium resolution data, we used DigitalGlobe's entire Millennium Mosaic, a pan-sharpened 15-meter mosaic of the conterminous United States. For the same region, we used the United States' 30-meter National Elevation Dataset (NED) for topography. Dozens of smaller study areas were covered with DigitalGlobe's sub-meter QuickBird imagery, various high-resolution terrain models, terrain model derivatives, scanned maps, and orthophotos. These could all be loaded into the application, in the same session, and viewed effectively.

Conclusion

Large image archives can be viewed efficiently in a GIS using a globe-centric interactive 3D viewing environment. Any number of images, of varying extent and spatial resolution, can be viewed along with the underlying topography. An intuitive user interface based on the globe, coupled with the ability to handle large datasets, makes a compelling environment for GIS users. For many it has the potential to become the display and query front-end of choice. But before this can happen, more functionality is required. Support for vector data is critical. Much of the same approach used for raster data can be applied to vectors, although vectors pose some unique problems. Unlike images, which can be easily sampled at any resolution in rectangular areas, vector features tend to be irregular, exhibiting variability in proximity, size, and shape. An additional area of growing importance is Web services. The viewing technology discussed here lends itself to the creation of thin clients that request data to display from one or more servers. Massive archives could potentially be served over the Internet in an effective manner. A successful implementation would promote the use of imagery in a distributed GIS environment.

Acknowledgments

The authors thank Paul Hansen of GeoFusion for his support. They would also like to thank DigitalGlobe, NASA, the USGS EROS Data Center, and WorldSat, who provided the majority of the imagery and topographic data used during development and testing.

About the Authors

Clayton Crawford is the Product Manager of the 3D Team at ESRI in Redlands, California, USA.

Dragan Petrovic is the Development Lead of the 3D Team at ESRI in Redlands, California, USA.

Salvador Bayarri is a Senior Software Developer in the 3D Team at ESRI in Redlands, California, USA.

References

Duchaineau, Mark and Wolinsky, Murray and Sigeti, David E. and Miller, Mark C. and Aldrich, Charles and Mineev-Weinstein, Mark B. (1997), "ROAMing Terrain: Real-Time Optimally Adapting Meshes," Technical Report UCRL-ID-127870, Lawrence Livermore National Laboratory, U.S. Dept. of Energy, Livermore, CA.

Falby, John S. and Zyda, Michael J. and Pratt, David R. and Mackey, Randy L. (1993), "NPSNET: Hierarchical Data Structures for Real-Time Three-Dimensional Visual Simulation," in Computers & Graphics, Vol. 17, No. 1, pp. 65-69.

Faust, Nickolas and Ribarsky, William and Jiang, T.Y. and Wasilewski, Tony (2000), "Real-Time Global Data Model for the Digital Earth," in Proceedings of the International Conference on Discrete Global Grids, National Center for Geographic Information and Analysis, Santa Barbara, CA, March 26-28.

Goodchild, Michael F. (2000), "Discrete Global Grids for Digital Earth," in Proceedings of the International Conference on Discrete Global Grids, National Center for Geographic Information and Analysis, Santa Barbara, CA, March 26-28.

Hansen, Paul (1999), "OpenGL Texture-Mapping With Very Large Datasets and Multi-Resolution Tiles," Slide Presentation, SIGGRAPH 1999, Los Angeles, CA. www.omnitect.com/hansen/slideshow/sld001.htm

Hoppe, Hugues (1998), "Smooth View-Dependent Level-of-Detail Control and its Application to Terrain Rendering," IEEE Visualization '98, pp. 35-42.

Lindstrom, P. and Pascucci, V. (2001), "Visualization of Large Terrains Made Easy," in Proceedings of IEEE Visualization 2001, San Diego, California, Oct. 21-26, pp. 363-370.

Lindstrom, P. and Pascucci, V. (2002), "Terrain Simplification Simplified: A General Framework for View-Dependent Out-of-Core Visualization," Technical Report UCRL-JC-147847, Lawrence Livermore National Laboratory, U.S. Dept. of Energy, Livermore, CA.

Polis, Michael F. and Gifford, Stephen J. and McKeown, David M. (1994), "Automating the Construction of Large Scale Virtual Worlds," in Proceedings of the ARPA Image Understanding Workshop, Nov. 13-16, Advanced Research Projects Agency, Morgan Kaufmann Publishers, pp. 931-946.