Collecting Control Data for Remote Sensing Applications
in the Frontier Environment of the Ecuadorian Amazon
Brian G. Frizzelle,
Stephen J. Walsh,
Christine M. Erlien, and
Carlos F. Mena
The fusion of GPS technology, remote sensing methods, and social
survey practices
combined to generate sufficient control data for image processing
and analysis in a remote and inhospitable environment.
There are many places in the world where the environment offers
significant challenges to the successful application
of remote sensing technology and value-added image processing
for landscape characterization. Frontier environments, such
as the Northern Oriente region of Ecuador, pose considerable
problems associated with their limited access into and throughout
the region, and the complex and constantly changing nature of
their biophysical landscapes. These limitations manifest themselves
as constraints to the traditional practice of collecting numerous
widely distributed training samples per mapping class for land
use and land cover (LULC) classifications and for their corresponding
accuracy assessments. Scientists working in frontier environments
must overcome the in situ difficulties associated with such
environments before addressing the challenges inherent in the
data processing and analysis schemes.
Our study area lies in the Amazonian lowlands of northeastern
Ecuador (the Northern Oriente), east of the Andean mountains,
in the provinces of Sucumbios and Orellana (Figure 1). Natural
vegetation consists of vast expanses of tropical rainforest,
containing a complex structure and the highest biodiversity
in the entire Amazon Basin. The region also exhibits substantial
zones of secondary forest succession, particularly along rivers
altered by natural forces, land conversion by localized indigenous
communities, and broad areas of LULC change by recently
arriving colonists. Within this forested landscape is a large
contiguous agricultural zone composed of farms containing both
tree and non-tree crops, as well as areas of secondary succession
at a variety of stages associated with deforestation, agricultural
extensification, and community settlement patterns. The regional
infrastructure consists of a growing network of mostly unpaved
roads, dozens of small towns providing local services to nearby
farmers, and a handful of larger towns that serve as the centers
of commerce and trade within and outside the region. Small indigenous
villages lie within and adjacent to national parks and reserves,
primarily along rivers, but with some small agricultural plots
scattered in the forest.
As part of our ongoing NASA-funded research within the region,
we have assembled a deep time-series of satellite imagery ranging
from 1973 through 2002. Our earliest images (1973-1986) are
from the Landsat 1 and Landsat 4 Multispectral Scanner (MSS)
sensors. The later images (1986-2002) are from the Landsat 5
Thematic Mapper (TM) and the Landsat 7 Enhanced Thematic Mapper
Plus (ETM+) sensors. The images have been collected to identify the
regional LULC at each image date and patterns of LULC change
throughout the 30-year period. To accomplish these tasks, two
steps must be taken: register the images to a real world coordinate
system, in this case the Universal Transverse Mercator (UTM);
and classify the images into discrete LULC classes for change-detections
and models of change. Both steps require “control” data collected
both remotely and in situ. In this study, geodetic control was
collected to aid in image registration, and compositional control
was collected as reference data for LULC classifications.
Image Registration
The horizontal accuracy requirements for the image rectification
were set at 15 meters, or half the spatial resolution of a Landsat
TM pixel. Rectification, a form of georeferencing, requires
the use of control reference data. Pre-existing control data,
such as topographic maps, surveyed locations, and geodetic horizontal/vertical
control markers, are commonly used for rectification, but
our study areas lack them in sufficient quantity or quality.
Topographic maps exist at a 1:50,000 scale, but the other types
are not present. However, the maps’ allowable horizontal error
of 42.3 meters, based on the USGS National Map Accuracy Standards
that were used to create them, is too large to satisfy our accuracy
requirements. Therefore, we created our own network of
geodetic control points (GCPs) using Global Positioning System
(GPS) receivers. The data were post-processed using differential
correction for the highest possible accuracy.
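The accuracy comparison above reduces to simple arithmetic. The sketch below assumes that the 42.3 meter figure corresponds to a tolerance of 1/30 inch measured on the map sheet; that tolerance is inferred from the number itself and is not stated in the text, and the function name is hypothetical.

```python
INCH_TO_M = 0.0254  # exact conversion factor

def map_tolerance_m(scale_denominator, tolerance_in):
    """Ground distance (m) for a tolerance measured on the map sheet (inches)."""
    return tolerance_in * scale_denominator * INCH_TO_M

tm_pixel_m = 30.0                 # Landsat TM pixel size
required_m = tm_pixel_m / 2       # half-pixel requirement from the text

# ASSUMPTION: 1/30 inch at map scale, inferred from the 42.3 m figure.
map_error_m = map_tolerance_m(50_000, 1 / 30)

print(round(map_error_m, 1))      # 42.3
print(map_error_m <= required_m)  # False: the maps cannot meet the requirement
```

Under this assumption the map error is nearly three times the half-pixel requirement, which is why a GPS-based control network was needed.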
For the best possible rectification, the GCPs should be evenly
distributed over as much of the image as possible and collected
at large, static locations that are visible on the imagery (e.g.,
30x30 meters in size for Landsat TM). In a frontier region such
as the Ecuadorian Amazon, road intersections and bridges are
the main targets for GCPs. The GCP distribution was therefore
constrained by the road network, which is dense in the center
of the image and much sparser toward the edges (Figure 2). To
add to the difficulty, many of the roads are of poor quality
and not easily traversed, making accessibility to the more
remote regions of the image difficult if not impossible.
Land Cover Assessment
Most methods for converting a multispectral satellite image
into discrete LULC classes require some knowledge of the location
and composition of each land cover type of interest so that
classes can be properly labeled, training areas defined, and
the classification assessed for accuracy. LULC compositional
control data can be used to attribute statistical output clusters
from an unsupervised classification, or can be used as training
data sets to generate spectral signatures for a supervised classifier.
LULC samples are often collected in the field by finding locations
that are representative of the classes of interest and collecting
a GPS point at that site. The GPS points are then used to identify
one or more pixels on the image for attribution/training
purposes. As with geodetic control, it is a good practice to
collect compositional control in a widely distributed manner
to minimize small-area effects such as atmospheric haze and
other local anomalous conditions linked to environmental site
and situation factors.
In our study, the sparse road network and poor quality roads
affected LULC data collection just as they impacted GCP collection.
The classification scheme contains 18 classes, with a mixture
of natural vegetation, agriculture, water, and urban areas.
All classes, other than those easily identifiable in the imagery
(e.g., primary forest, urban, and water), were located in the
field, with the constraint that the field site be at least 3,600 m²,
or four TM pixels, in size. The four-pixel size requirement
was set to control for possible errors in the horizontal accuracy
of the imagery and the GPS data. The sites also needed to contain
a “pure” class to minimize the within-class spectral variance
and to maximize the between-class spectral variance.
Given the constraints imposed on data collection, multiple field
trips failed to produce a sufficient quantity and distribution
of training data for all classes, leading to the collection
of additional sources of compositional control. The final set
of compositional control only spanned the period of 1999 to
2002, requiring the creation of a method for generating a detailed
classification for image dates prior to 1999. This method is
briefly discussed below.
Collecting GCPs
When collecting GCPs for image rectification, a set of nine
for a rectangular image would be ideal if one point were located
in the center, four points at the corners, and four points on
the edges midway between the corners. If necessary, four more
points could be added at the center of each quadrant, with additional
points added evenly throughout the image until the rectification
requirements are satisfied. Figure 3 shows the nine ideal points
as blue diamonds, with the four additional points displayed
as red triangles. These are ideal placements, but rarely realistic
in a frontier environment.
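The ideal layout described above can be written down directly. This is a minimal sketch, assuming a rectangular image addressed in pixel coordinates with the origin at one corner; the function name and the example image size are hypothetical.

```python
def ideal_gcp_layout(width, height):
    """Return (primary, quadrant) lists of ideal GCP positions.

    primary:  center, four corners, and four edge midpoints (9 points)
    quadrant: centers of the four image quadrants (4 points)
    """
    xs = (0.0, width / 2, width)
    ys = (0.0, height / 2, height)
    primary = [(x, y) for y in ys for x in xs]
    quadrant = [(x, y)
                for y in (height / 4, 3 * height / 4)
                for x in (width / 4, 3 * width / 4)]
    return primary, quadrant

# Hypothetical image size in pixels, for illustration only.
primary, quadrant = ideal_gcp_layout(6000, 6000)
print(len(primary), len(quadrant))  # 9 4
```

In practice these positions serve only as targets; the collected GCP nearest to each target is the best available substitute.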
The GCPs in the study area were collected using mapping-grade
Trimble GeoExplorer II and GeoExplorer 3 GPS receivers.
All points were differentially corrected through post-processing,
using base station files from Quito. The resultant accuracy
of each corrected/averaged point was better than 15 meters,
within the accuracy requirements for the project.
GCP collection proceeded at road intersections along primary roads
and at bridges along the more heavily traveled roads, bypassing
the poorer quality roads that service the periphery of the region.
This resulted in a high density of GCPs in the center of the
image, necessitating further excursions to the peripheral regions.
Despite the fewer and more degraded roads along the periphery,
more points were added away from the center, but the cluster
in the center remained.
GCPs and Image Rectification
The central GCP was well placed. However, only one of the eight
boundary and edge “ideal” GCPs was within close proximity to
any collected GCPs (Figure 3). Therefore, those GCPs that were
located nearest to the edges and boundaries were selected for
the rectification along with a sample of GCPs that (1) were
evenly distributed throughout the image, and (2) resulted in
the lowest possible root mean squared error (RMSE). A total
of 15 GCPs were used to apply a second-order polynomial absolute
rectification to the November 1999 image. This resulted in an
RMSE of 0.3263 pixels, or 9.8 meters, which was well within
the horizontal accuracy requirement of 1/2 pixel, or 15 meters.
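A second-order polynomial rectification and its RMSE can be sketched as a least-squares fit. The code below is an illustration only, not the software used in the project: the 15 control points are synthetic, the map coordinates are local kilometers chosen to keep the fit well conditioned, and all function names are hypothetical.

```python
import numpy as np

def design_matrix(x, y):
    """Second-order polynomial terms: 1, x, y, xy, x^2, y^2."""
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])

def fit_second_order(src_xy, dst_xy):
    """Least-squares coefficients (6 x 2) mapping src coordinates to dst."""
    A = design_matrix(src_xy[:, 0], src_xy[:, 1])
    coeffs, *_ = np.linalg.lstsq(A, dst_xy, rcond=None)
    return coeffs

def rmse(src_xy, dst_xy, coeffs):
    """Root mean squared 2-D residual, in the units of dst_xy."""
    predicted = design_matrix(src_xy[:, 0], src_xy[:, 1]) @ coeffs
    residuals = predicted - dst_xy
    return float(np.sqrt(np.mean(np.sum(residuals**2, axis=1))))

# Synthetic stand-in for 15 GCPs (NOT the project's data).
rng = np.random.default_rng(0)
true_pixels = rng.uniform(0, 6000, size=(15, 2))    # column, row
map_km = true_pixels * 0.03                         # 30 m pixels, local km
measured_pixels = true_pixels + rng.normal(0, 0.3, size=(15, 2))

coeffs = fit_second_order(map_km, measured_pixels)
rmse_px = rmse(map_km, measured_pixels, coeffs)     # RMSE in pixels
rmse_m = rmse_px * 30.0                             # RMSE in meters
print(rmse_px < 0.5, rmse_m < 15.0)

# The reported RMSE converts the same way: 0.3263 pixels x 30 m per pixel.
reported_m = 0.3263 * 30.0
print(round(reported_m, 1))  # 9.8
```

A second-order fit has six coefficients per axis, so 15 GCPs leave redundancy for the residuals to be meaningful; with only six points the RMSE would be zero by construction.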
The other time-series images were rectified to the November
1999 image using the relative rectification method. This approach
was used to obtain the highest level of co-registration throughout
the time-series. Accurate co-registration was necessary for
successful implementation of our change-detection methodologies.
Multiple Data Sources
Compositional control was collected to build a training data
set for classifying the Landsat time-series. Five types of compositional
control were collected: (1) land use information from a regional
household survey questionnaire, (2) sketch maps created during
the household survey, (3) a convenience sample of GPS points of
selected LULC sites throughout the region, (4) IKONOS satellite
imagery, and (5) a detailed GPS-based LULC survey of a sample
of farms. Compositional control data were used with the Landsat
image dates that most closely corresponded with the dates of
data collection, and a database of LULC points and polygons
was created.
The land use survey questions were pulled from a 1999 household
socio-economic/demographic survey, in which the head of household
was asked questions regarding current and past land use on the
farm. Responses were used to inform analysts as to the type
of LULC on particular farms on a November 1999 Landsat TM image.
During the interviews, the head of household helped create a
sketch map of LULC for the farm. The sketch maps were not surveyed
or drawn to scale, but do contain the relative spatial composition
of LULC parcels (Figure 4). The general shape of each parcel
was hand-drawn, and information on the parcel’s LULC and size
in hectares was recorded. Analysts used the sketch maps to identify
large plots of LULC on the farms and delineate the plots on
the November 1999 TM image, using farm boundary polygons as
reference locations.
Selected LULC sites were collected with Trimble GeoExplorer
II and GeoExplorer 3 GPS receivers during various field trips
from 1999 to 2002. In most cases, the sites consist of one point
representing a patch of land cover, although some sites
were collected as polygons by walking the perimeter of the plot.
All point and polygon data were differentially corrected and
grouped into GIS data sets with information such as LULC
class and size of plot.
IKONOS satellite imagery was acquired for selected sites from
2000 to 2002. IKONOS’s 1-meter panchromatic and 4-meter multispectral
imagery allows for easy recognition of some LULC types, based
on expert knowledge and spatial pattern recognition (Figure
5). Georeferenced images were overlaid on the corresponding
Landsat image dates, allowing the analyst to visually interpret
the land cover on the IKONOS image and delineate polygons on
the co-registered Landsat image.
In November 2001, a GPS-based LULC survey was conducted on a
sample of farms from the 1999 survey. On each farm, the researcher
and farmer walked the 50 hectares of land with Trimble GPS receivers
and collected points to delineate the shape of each individual
parcel. The points were later connected to form polygons representing
specific LULC parcels, and were attributed with the current
LULC, the LULC 2 years prior, and the expected LULC 5 years
in the future (Figure 6). These farm-wide detailed LULC surveys
provided additional temporal LULC information for the classification
of Landsat data.
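Once GPS points are connected into a parcel polygon, its area follows from the shoelace formula, provided the coordinates are projected (for example, UTM meters). The sketch below uses a hypothetical rectangular parcel in local coordinates and also checks the parcel against the 3,600 m² (0.36 hectare) minimum site size used for field sites; the function and constant names are assumptions for illustration.

```python
MIN_SITE_HA = 0.36  # four 30 m TM pixels = 3,600 square meters

def polygon_area_ha(vertices):
    """Area in hectares of a simple polygon given as (easting, northing) in meters."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]   # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0 / 10_000.0

# Hypothetical parcel: 250 m x 400 m rectangle in local coordinates.
parcel = [(0, 0), (250, 0), (250, 400), (0, 400)]
area_ha = polygon_area_ha(parcel)
print(area_ha)                 # 10.0
print(area_ha >= MIN_SITE_HA)  # True: large enough to serve as a training site
```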
Land Cover Classification
The main concern for accurately classifying the Landsat time-series
was the lack of compositional control prior to 1999. A methodology
to resolve this issue was developed that included the normalization
of the time-series and the generation of a multi-temporal spectral
signature data set.
The image normalization was performed with a 5S Top of Atmosphere
(ToA) reflectance and atmospheric correction model (Teillet
et al. 1997), which was applied to each image in the time-series.
An ARTMAP neural network model classified all clouds and cloud
shadows, which were subsequently masked from the images.
Clouds are a particularly severe problem in the Ecuadorian Amazon.
Of the 407 Landsat 5 overpasses of the area between March 1,
1983 and December 31, 2000, only 7 (1.72%) scenes were considered
sufficiently cloud-free to suit our purposes, and those scenes
still contain clouds to various extents.
The compositional control was used in conjunction with the appropriate
corresponding image dates to create a multi-temporal spectral
signature data set. Signatures for each class were extracted
using each of the five compositional control types. The signatures
were then merged into one multi-temporal data set, which was
then applied to each image in the time-series using a standard
maximum likelihood classification algorithm. The classifications
were further processed using several different change-detection
methods to better grasp the dynamics of LULC change over time
throughout the region. The classifications and change-detections
required geodetic and compositional control to succeed.
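The signature merging and maximum likelihood step can be illustrated with a small Gaussian classifier. This is a sketch under stated assumptions, not the project's implementation: the training pixels are synthetic, the two class names and three bands are placeholders, priors are taken as equal, and the function names are hypothetical.

```python
import numpy as np

def build_signature(samples):
    """Mean vector and covariance matrix from (n_samples, n_bands) pixels."""
    samples = np.asarray(samples, dtype=float)
    return samples.mean(axis=0), np.cov(samples, rowvar=False)

def log_likelihood(pixels, mean, cov):
    """Gaussian log-likelihood per pixel, dropping the constant term."""
    diff = pixels - mean
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = np.einsum("ij,jk,ik->i", diff, inv, diff)
    return -0.5 * (logdet + mahalanobis)

def classify(pixels, signatures):
    """Assign each pixel to the class with the highest likelihood (equal priors)."""
    names = list(signatures)
    scores = np.stack(
        [log_likelihood(pixels, *signatures[n]) for n in names], axis=1)
    return [names[i] for i in scores.argmax(axis=1)]

# Synthetic reflectance samples (NOT real data) from two control sources per class.
rng = np.random.default_rng(1)
forest_a = rng.normal([0.03, 0.30, 0.12], 0.01, size=(40, 3))
forest_b = rng.normal([0.03, 0.30, 0.12], 0.01, size=(40, 3))
pasture_a = rng.normal([0.08, 0.35, 0.25], 0.01, size=(40, 3))
pasture_b = rng.normal([0.08, 0.35, 0.25], 0.01, size=(40, 3))

# Merge samples from all sources into one signature per class.
signatures = {
    "forest": build_signature(np.vstack([forest_a, forest_b])),
    "pasture": build_signature(np.vstack([pasture_a, pasture_b])),
}

pixels = np.array([[0.03, 0.30, 0.12], [0.08, 0.35, 0.25]])
labels = classify(pixels, signatures)
print(labels)  # ['forest', 'pasture']
```

Pooling samples across dates only makes sense when the images have been normalized to comparable reflectance, which is why the normalization step precedes signature extraction.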
Summary
In the frontier of the Ecuadorian Amazon, isolation, inaccessibility,
and a general lack of discernible landscape features make the
task of image rectification and classification difficult. Geodetic
and spectral control are fundamental for LULC characterization.
In this setting, obtaining such data was a considerable
challenge. Developing site-specific geodetic control was critical
to the success of the project. GPS technology was essential,
but the areal distribution and the number of control points
were still limited by road access. To augment the GPS work, alternative
compositional control was acquired through field sketch maps,
a longitudinal social survey, and high spatial resolution IKONOS
satellite data that were acquired over targeted features. As
a result, we were able to classify images from dates earlier
than any of our compositional control data at a level of LULC
detail otherwise unattainable. The fusion of GPS technology,
remote sensing methods, and social survey practices combined
to generate sufficient control data for image processing and
analysis in a remote and inhospitable environment.
Acknowledgements
This work is based on a project (R.E. Bilsborrow and S.J. Walsh,
Co-PIs) of the Carolina Population Center and the Departments
of Geography and Biostatistics at the University of North Carolina,
Chapel Hill. Ecuadorian collaborators include individuals from
EcoCiencia, a leading non-profit ecological research organization
in Quito, Ecuador, and Cepar, a leading survey processing center
in Quito. Funding for this research was provided by NASA (grant
NCC5-295) and the Mellon Foundation.
About the Authors
Brian G. Frizzelle is a Senior Spatial Analyst, Stephen
J. Walsh is a Professor in the Department of Geography and Research
Fellow, Christine M. Erlien is a PhD candidate in the Department
of Geography and a pre-doctoral trainee, and Carlos F. Mena
is a PhD candidate in the Department of Geography and a pre-doctoral
trainee at the Carolina Population Center, University of North
Carolina at Chapel Hill.
References
Teillet, P.M., Staenz, K., and Williams, D.J. 1997.
Effects of spectral, spatial, and radiometric characteristics
on remote sensing vegetation indices of forested regions. Remote
Sensing of Environment, 61(1): 139-149.