ESE Update
The Coming Revolution in Geospatial Imagery
By W. Frederick Limp

The articles in the special April issue of EOM showed how ESE's next-generation sensor systems, data processing, and value-added products will impact the remote sensing and GIS communities. While these major developments are underway in ESE, there are other fundamentally significant changes in GIS software and systems that will have equally dramatic, perhaps even greater, effects. There are three major themes: 1) the implications of the Open GIS Consortium's specification work; 2) the inclusion of imagery within new enterprise-class database management systems; and 3) the role and size of the server and client application in a distributed environment. All of these are, in turn, taking place within the larger contexts of digital convergence and high-bandwidth availability.

Before going on to the major points, a couple of words about these larger contexts are in order. One of the key long-term impacts of digital convergence is the continued blurring of boundaries between different data, sources, and applications. Within the Earth observation communities, this is becoming obvious as the boundaries between GIS, remote sensing, photogrammetry, and database management blur. High bandwidth means that traditional models of data structure, and of the way data are converted into useful information, will change. Both of these larger contexts affect the themes.

The Open GIS Consortium Specifications

The Open GIS Consortium (OGC) brings together more than 150 members from the commercial, governmental, and academic arenas. Its goal is "transparent access to heterogeneous geodata and geoprocessing resources in a networked environment... to provide a comprehensive suite of open interface specifications that enable developers to write interoperating components that provide these capabilities." OGC has developed specifications for simple features and is actively working on a total of thirteen different specifications. The work of OGC has already had a major impact on the GIS community and will begin to have a major impact on the remote sensing community, because new models of spatial data, and the resulting software, are already becoming available in the marketplace. Using object modeling, OGC specifications have been created that separate the data (actually an object) from services (what can be done with the object). This conceptual separation of data from the services that can be applied to it means that the monolithic structure of traditional image processing can be broken into components, though it does not have to be.

Geospatial Middleware

In such a situation, middleware becomes critical. Middleware can perform tasks that range from geo-rectification, to image enhancement, to information extraction, making it possible to define a network structure that runs from initial data acquisition, through the application of a number of services, to consumption. Because it is an interoperable network, content is drawn transparently from any compliant provider, and because services have wide application, there is opportunity for many different vendors to enter the marketplace. OGC has three tracks in place that will have considerable bearing on the future of remote sensing processing: coverages, catalogues, and Earth imaging. For more information, go to the OGC web site and download the specifications at www.opengis.org.
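To make the separation of data objects from services concrete, here is a minimal sketch in Python. The GeoImage class, the individual service functions, and the pipeline runner are hypothetical illustrations, not OGC-defined interfaces; they simply show how independent components (rectification, enhancement) could be chained over a single data object as it moves from provider to consumer.

```python
from dataclasses import dataclass, replace
from typing import Callable, List

# Hypothetical data object: the "what" (pixels plus georeferencing state),
# kept separate from the services that operate on it.
@dataclass(frozen=True)
class GeoImage:
    pixels: list            # rows of pixel values
    georeferenced: bool = False
    enhanced: bool = False

# Each service is an independent, interchangeable component.
def georectify(img: GeoImage) -> GeoImage:
    """Assign ground coordinates to the raster (placeholder logic)."""
    return replace(img, georeferenced=True)

def enhance(img: GeoImage) -> GeoImage:
    """Stretch pixel values to the 0-255 range (placeholder logic)."""
    lo = min(min(row) for row in img.pixels)
    hi = max(max(row) for row in img.pixels)
    span = (hi - lo) or 1
    stretched = [[(v - lo) * 255 // span for v in row] for row in img.pixels]
    return replace(img, pixels=stretched, enhanced=True)

# Middleware: apply whatever chain of services the consumer requests.
def run_pipeline(img: GeoImage,
                 services: List[Callable[[GeoImage], GeoImage]]) -> GeoImage:
    for service in services:
        img = service(img)
    return img

raw = GeoImage(pixels=[[10, 20], [30, 200]])
product = run_pipeline(raw, [georectify, enhance])
print(product.georeferenced, product.enhanced, product.pixels)
```

Because the services share nothing but the object they operate on, any compliant provider could in principle supply the data, and any vendor could supply an individual service in the chain.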
Enterprise Databases and Spatial Information

In parallel with the OGC efforts, there are major changes in basic data processing and storage. Until recently, spatial data generally, and imagery specifically, were seen as "special" data. In any large enterprise there are the "normal" information technologies, and somewhere (usually in a skunk works), far from the classic glass computer room, are the image processors and GIS types. Such a division is based on fundamental differences in the underlying data characteristics, though the cultures of the various groups no doubt contributed to the distance. Traditional database management systems were unable to handle spatial data effectively for the simple reason that spatial data are multi-dimensional, especially when it comes to getting data out of a database. Consider a list of names. It is quite easy to order the names (alphabetically, for example), assign a single index number to each name, and create an index that allows rapid retrieval of any specific name from the database. A GIS data set of polygons, or an image data set, consists of data that are (at least) two-dimensional. The only way such data could be easily stored and processed was in specialized file structures optimized for display and analysis. Thus, a polygon is stored in a specialized graphic-oriented structure with linkages to a database that may hold attributes (area, ownership, soil type, etc.) about the entity. Images, which are easily expressed as a matrix of values, can be accessed as a file of a rectangular array, displayed, and processed. While there are many immediate advantages to this "special" status, it has kept spatial data from becoming part of the mainstream.

Quad Tree Indexing

As computing power increases, and as the value of spatial data to an enterprise becomes clearer, the isolation of spatial data, both "vector" and "raster," is being breached. While many vendors are moving in this direction, Oracle, of Redwood Shores, California, has been the most aggressive. A key to the successful inclusion of spatial data in traditional IT structures is the use of the "quad tree" to create a single index value that reflects a two-dimensional area. The quad tree is quite simple in concept. Consider any area. Divide the entire area into four equal square subdivisions and give them the index numbers 0, 1, 2, and 3. Each of these subdivisions is then divided into four parts; the ones inside the area labeled 1 are themselves given index values of 10, 11, 12, and 13. This process can be repeated as many times as necessary, depending on how small the resulting "smallest" area needs to be. Subdivisions of area 12 are indexed 120, 121, 122, and 123. Each quad tree index value refers to a specific location, so you can enter a coordinate and get the index for it. Suppose you wanted to find all locations near one that fell in area 122. All geographic features in the database assigned the index 122 are within that same area. To generalize your search, you simply ask for all the indexes whose first two digits are 12!
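A minimal sketch of this indexing idea, assuming a unit-square study area and the quadrant digits 0-3 used above, might look like the following Python. The function name, the depth parameter, and the simple list-scan query are illustrative only; they are not Oracle's implementation.

```python
def quad_index(x: float, y: float, depth: int) -> str:
    """Return a quad tree index for a point in the unit square.

    At each level the current cell is split into four quadrants,
    labeled 0-3, and one more digit is appended to the index.
    """
    index = ""
    x0, y0, size = 0.0, 0.0, 1.0
    for _ in range(depth):
        size /= 2
        right = x >= x0 + size
        top = y >= y0 + size
        index += str((2 if top else 0) + (1 if right else 0))
        if right:
            x0 += size
        if top:
            y0 += size
    return index

# Index a few hypothetical features at three levels of subdivision.
features = {"parcel A": (0.30, 0.70), "parcel B": (0.32, 0.68), "well": (0.90, 0.10)}
indexed = {name: quad_index(x, y, 3) for name, (x, y) in features.items()}

# "Near" query: everything whose index starts with the coarser, two-digit prefix.
prefix = quad_index(0.30, 0.70, 2)
nearby = [name for name, idx in indexed.items() if idx.startswith(prefix)]
print(indexed, nearby)   # the two parcels share a prefix; the well does not
```

Because the two-dimensional position has been reduced to a single string, an ordinary one-dimensional database index and a prefix search are all that is needed to answer the spatial query.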
With version 7.x, and now with 8.0 and 8i, of Oracle's Enterprise Database, spatial data is a regular part of the enterprise database. Until 8i, only vector data was included. The next release, now in beta, will add georeferenced rasters as well.

Spatial Queries and Processing

Any new technology is only as good as it is useful in real operations. With these new enterprise spatial systems, it is possible to enter an address into the system. Geocoding identifies the coordinates of the address and associates them with a quad index value. With that same index value, any parcels at that location are returned, as is a raster TM image that contains the same quad index and has been classified by forest type. Attribute data associated with the parcel can then be compared to the imagery and current forest-harvesting data added. Only a small amount of data is retrieved from the database and quickly transmitted to the user.

Middleware, Thick and Thin Clients, and Data Stores

When the growing role of geospatial processing components, the increased bandwidth of the network, and enterprise spatial data stores are coupled, a new structure for the intake, storage, and use of geospatial data quickly emerges. In this context, each end-user desktop will have a client that is "thick enough" for the purposes at hand. For some, this will mean that the desktop need only support a downloadable Java applet; in other cases, substantial capabilities will need to be local. Each user will be able to acquire the level of service appropriate to the specific problem at hand. As data move from source (the sensor, in remote sensing) to the data store to the client, value and information will be added by applying specialized services to the object, not by creating multiple, different versions of it. This will reduce the amount of data transmitted and will reduce the confusion that arises from many different processed versions of the same original data. The overhead associated with data acquisition, storage, and maintenance will be off-loaded to large data stores, and data will be acquired by the end user only when needed, ensuring data freshness and reliability.

New Users and New Perspectives

These new approaches stand in considerable contrast to traditional ones. They will mean that the number of users of remotely sensed data grows exponentially. Conversely, the traditional artisan approach to adding value to sensor data will be replaced by an industrial model. The cost of developing components that add value to raw data will be large, but the market for such components will be orders of magnitude larger than today's. The software firms that will be most successful are those that can deconstruct the monolithic packages of today and create thin components. As components replace people in the information flow there will, paradoxically, be a great increase in the use of remotely sensed data in many areas, even as the number of individuals needed decreases.