LAB 4: Digitizing and metadata
Suggested time for completion: Two weeks
4.1 Purpose
In this lab, you will:
4.2 Introduction and background
In this lab we will be making use of some remotely sensed imagery. Hopefully you have been exposed to the basics; if you are completely new to the topic, you may want to review some of the web pages listed in the Brief Intro to Landsat-7 section.
A Word on Rasters, Images, and Grids
In the context of ArcInfo, the words raster and grid are often used interchangeably. In some contexts, such as ArcToolbox, "grid" refers to a raster in ArcInfo format (there are dozens of methods and formats for storing raster data).
Digital images are always rasters. However, not all rasters are images, as we will see when we look at elevation data in Lab 6. ArcInfo8 can support and display many different image formats, but certain tools often work only on ArcInfo's native raster format, the grid. ArcToolbox, however, can convert many different image formats to grid format. Digital imagery can be derived either from scanned photos or from airborne or spaceborne electronic sensors. We will be using data from the ETM+ sensor on the new Landsat 7 satellite.
More on Landsat 7 & Remote Sensing
The first Landsat satellite was launched in 1972. The first five Landsat satellites carried the Multi-Spectral Scanner (MSS), which recorded light reflected from the Earth's surface in 4 bands of the electromagnetic spectrum. The resolution was ~80 m. Landsat 4 and 5 also carried the Thematic Mapper sensor, with a resolution of ~30 m for 7 bands. Landsat 5 was launched in 1984 and is amazingly still operating. Landsat 6 was lost after launch in 1993. Landsat 7 was launched on April 15, 1999 with the Enhanced Thematic Mapper Plus (ETM+) sensor. As a result of this combination of foresight and luck, the product of the Landsat program is an invaluable 25-year record of the Earth's surface with high spatial and temporal resolution. This record is being used to study land-use/land-cover change (and other topics) all around the world. The liberalized data policy for Landsat 7 will doubtless further encourage this research.
Example: North half of a Landsat 7 scene showing Cape Town and the Cape of Good Hope, South Africa. From the Landsat 7 Home Page.
Basic characteristics of MSS and TM:
Each pixel of a Landsat image raster contains a brightness value for each wavelength band. The brightness value, or digital number (DN), is a number between 0 and 255 that is proportional to the number of photons detected by the sensor. This includes light reflected from the surface as well as light scattered by atmospheric haze and light scattered by the atmosphere itself (the sky is blue because blue light is scattered by the atmosphere; some of this blue scattering is detected by the satellite as well). It is important to remember that DNs are not just abstract numbers; they represent actual physical measurements of the light reaching the Landsat sensor: sunlight that was reflected by the earth's surface and the atmosphere above it.
Different materials reflect and absorb different wavelengths of light differently. For example, chlorophyll in plants absorbs red and blue light but reflects green, which is why plants appear green. Plant leaves reflect very strongly in the near-infrared (NIR) -- if you could see in the NIR, plant leaves would probably appear bright white. This strong reflectance is exploited in remote sensing to differentiate vegetation and to monitor seasonal and inter-annual vegetation change. Soils tend to reflect more strongly in the middle-infrared. Rough surfaces will tend to be darker than smooth surfaces due to increased shadowing.
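One common way the red/NIR contrast described above is put to work is a normalized difference of the two bands, usually called NDVI. It is not part of this lab's procedure; the sketch below is only an illustration in Python. The DN values are hypothetical (not read from the lab image), and in practice the index is normally computed on calibrated reflectance rather than raw DNs.

```python
def ndvi(red_dn, nir_dn):
    """Normalized difference of NIR and red brightness values (range -1 to 1)."""
    total = nir_dn + red_dn
    if total == 0:
        return 0.0  # avoid dividing by zero for completely dark pixels
    return (nir_dn - red_dn) / total


# Hypothetical DNs as (red, NIR), i.e. TM bands 3 and 4; not taken from the lab image.
samples = {
    "golf course": (30, 160),
    "bare soil": (90, 110),
    "water": (20, 10),
}

for name, (red, nir) in samples.items():
    print(f"{name}: NDVI = {ndvi(red, nir):.2f}")
```

Vigorous vegetation gives values well above zero because NIR is much brighter than red; water gives negative values because it absorbs NIR strongly.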
The wavelength bands detected by Landsat are listed below (source: AGI GIS Dictionary):
The ETM+ sensor is similar to TM (Thematic Mapper), except that it adds a second thermal infrared band (band 6b) and a panchromatic band (wavelength 520-900 nm) with a resolution of 15 m. It also has improvements in other areas such as georegistration.
The thermal band(s) -- band 6 in TM, bands 6a & 6b in ETM -- are usually excluded from land-cover mapping projects. As you can see, the wavelength of band 6 is much longer than for the other ETM bands; this region of the electromagnetic spectrum is highly sensitive to atmospheric conditions such as water vapor, and the resolution of band 6 is much coarser (~120 m). For these reasons, band 6 is usually not very useful for land-cover mapping. For added confusion, however, band 7 is often referred to as band 6, as it is the sixth band when the thermal bands are excluded.
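The band-numbering confusion just described can be made concrete with a small lookup. This is a minimal Python sketch; it assumes a six-layer image stacked in band order with the thermal band(s) left out, which may not match how a given file was assembled.

```python
# ETM+ band numbers in a typical 6-layer stack with the thermal band(s) omitted.
# The stacking order is an assumption; check how your own image was assembled.
REFLECTIVE_BANDS = [1, 2, 3, 4, 5, 7]


def etm_band_for_layer(layer):
    """Return the ETM+ band number stored in a 1-based layer position."""
    return REFLECTIVE_BANDS[layer - 1]


for layer in range(1, len(REFLECTIVE_BANDS) + 1):
    print(f"layer {layer} -> ETM+ band {etm_band_for_layer(layer)}")
```

Layers 1 through 5 match their band numbers, but layer 6 holds band 7, which is why software may label it "band 6".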
The Landsat satellite operates in a sun-synchronous orbit, allowing it to cross the equator at approximately 10 a.m. every day. It repeats a flyover of a given location every 16 days. Thus the same region can be imaged repeatedly under similar lighting conditions and at high resolution (~30 m for Landsat TM and ETM+, and ~80 m for MSS). This data has proven highly valuable for vegetation and land-use classification, the study of sediment plumes in rivers and oceans, and the monitoring of land-use change (such as tropical deforestation).
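To get a feel for what a 16-day repeat cycle means for temporal coverage, the following Python sketch lists successive overpass dates for one location. The starting date is hypothetical, chosen only for illustration; real acquisition dates depend on the scene's path and row and on cloud cover.

```python
from datetime import date, timedelta

REPEAT_CYCLE_DAYS = 16  # repeat interval stated in the text above


def overpass_dates(first_overpass, count):
    """Return `count` overpass dates spaced one repeat cycle apart."""
    return [first_overpass + timedelta(days=REPEAT_CYCLE_DAYS * i)
            for i in range(count)]


first = date(2002, 1, 1)  # hypothetical first overpass, not a real acquisition date
for d in overpass_dates(first, 4):
    print(d.isoformat())
```

At this spacing a location gets roughly 22 or 23 imaging opportunities per year, before any are lost to clouds.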
Orthophotos are aerial photographs that have been orthorectified -- that is, they have been geometrically corrected so that ground locations are in their true positions on the image and any given area on the photo equals a proportional area on the ground. Artifacts that need to be removed can be caused by topography (the ground being closer to the plane than usual), variations in the flight of the plane (either in elevation or the tilt of the plane or camera), and by the edges of the photo being more "spread out" and representing more area than the center.
Correcting raw photos is computationally intensive, but a well-corrected, georeferenced orthophoto can be highly valuable as a basemap. GDT, for example, uses orthophotos to check and correct its street locations.
The federal government's National Aerial Photography Program (NAPP) has supervised the collection of standardized airphotos covering the entire 48 states, plus Hawaii, since the 1980s. These serve as data sources for updating the USGS quadmaps as well as for the National Digital Orthophoto Program (NDOP).
The USGS orthophotos are generally referred to as DOQs (Digital Orthophoto Quadrangles, or Digital Orthoquads for short) or DOQQs (Digital Orthophoto Quarter-Quadrangles). One would think that 'DOQQ' would refer to the individual orthophoto taken of a single quarter-quadrangle, while DOQ would refer to the area of an entire quadrangle. However, the USGS DOQ dataset-level metadata page uses the terms interchangeably.
More on Metadata
Metadata is data about data. You have probably all had the experience of digging through a list of files on a disk one by one because you can't remember the name of the file you want. You may have had a similar experience working with geographic files; after converting, processing, merging, and editing your data, it is difficult to remember what changes you have made, and which version of your file you wish to use. Now that geographic data is being made available on the internet on a massive scale, metadata has become the key that GIS professionals and the public will use to find the data they need.
Furthermore, GIS professionals will often be publishing their datasets and maps, for sale by a company or for distribution to the public. Publishing the metadata along with the data, in a standardized format accessible to online search engines, will be crucial to help other people find your data, to understand exactly what it contains, and to know the purpose for which the dataset was created.
Metadata standards in ArcInfo8
The Federal Geographic Data Committee, whose goal is to promote "the coordinated use, sharing, and dissemination of geospatial data on a national basis," has released a metadata standard called the Content Standard for Digital Geospatial Metadata, or CSDGM. ESRI has a metadata standard as well that is interconvertible with CSDGM.
More on Internet GIS
4.3 Data
The data you will use for this lab includes:
4.4 Procedures
Copy the lab 4 data into your working directory. First we will explore the Landsat image. If this is your first exposure to satellite images, you needn't worry as none of the tasks will be particularly difficult. However, you should make sure you understand the introduction above, and don't be afraid to ask questions of your TA or of other students who have taken remote sensing classes.
4.4.1 Displaying image data
Preview Individual Bands
Question #1: How many pixels are found in each image? How did you find out? Why does each band have a different number of records in its attribute table? What do the attributes "value" and "count" refer to? (If you aren't able to view the tables in ArcCatalog, try adding the bands individually into ArcMap and viewing the attribute tables.)
Question #2: A) Which resampling method looks best for general display? Which best shows you the original data and therefore is best for analysis purposes?
B) Change the display method back to RGB Composite. Which bands are initially represented by which color? What objects/features appear red, green, blue? Why? What objects appear black, white, gray? Why?
Question #3: A) Which color combination most closely resembles "natural" color (i.e., what you might expect to see in a color air-photo)? Which combination highlights urban areas, roads, and soils well? Which brings out vegetation? For the latter combination, what are the colors of golf courses, chaparral, dry grassland, and riparian vegetation, respectively? What accounts for the differences between these vegetation types?
4.4.2 Creation of Line and Polygon Themes from a TM Image
We will digitize new vector themes in ArcMap with the sbetm image as our guide. We will use the new geodatabase data model instead of the coverage or shapefile model. First, we will digitize a small feature: the shoreline of Campus Lagoon.
Alternative Instruction if Geodatabase Can't be used due to Errors
Saving Edits and Turning Off Edit Mode
When you are done creating and editing your feature, select Editor --> Save Edits and Editor --> Stop Editing.
4.4.3 The Effect of Resolution on Digitizing
As you could see when digitizing the lagoon, coarse resolution makes it impossible to draw borders with high precision. We will digitize the same lagoon using the 15-m panchromatic band from ETM.
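The effect of halving the pixel size can be estimated with simple arithmetic before you digitize anything. The Python sketch below counts roughly how many square pixels cover a water body at each resolution. The area used is a hypothetical round number, not a measurement of Campus Lagoon.

```python
def pixels_covering(area_m2, pixel_size_m):
    """Approximate number of square pixels needed to cover an area."""
    return area_m2 / (pixel_size_m ** 2)


LAGOON_AREA_M2 = 100_000  # hypothetical 10-hectare water body, not a measured value

for size_m in (30, 15):
    n = pixels_covering(LAGOON_AREA_M2, size_m)
    print(f"{size_m} m pixels: about {n:.0f} pixels cover the feature")
```

Halving the pixel size quadruples the number of pixels describing the same feature, so the shoreline is sampled at twice the linear density.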
Answer Question #4: A) Compare the resulting lagoon features. What are the differences between them? Account for the differences.
B) Does the 15-m resolution image contain mixed pixels? Would you expect mixed pixels in the high-resolution DOQ? Why or why not? Is it possible to have an image raster without it containing any mixed pixels? Note: To answer part B, it may be helpful to look at a wider range of resolutions. To see what the campus-lagoon region looks like at different resolutions, color composites, etc., check out the various parts of the UCSB Geography 115B lab #1 page.
Now, digitize Highways 101 and 217 across the whole ETM image. Use 30 m bands with a useful color arrangement, or the panchromatic band, whichever you prefer. Make sure the arcs are snapped where these roads meet. Add any other roads you want as long as you can identify them well in the image. Digitize these as lines, in a new Feature Class named something like etmroads. Note: You may need to save and close your ArcMap file in order to create the new Feature Class.
Digitizing Your Highway Theme
To digitize line features like highways, follow the instructions that we used for lakes. The only difference from digitizing a polygon is that you set the Geometry Type to Line while you're in the 'New Feature Class' wizard.
Answer Question #5: A) How do the two roads datasets (coverages or geodatabase feature datasets) compare (the one you just created, compared to the one which came with the lab)? Which one has smoother lines? What are some sources for the differences between these two datasets? B) What other attributes might you want to add to your roads theme?
Giving Your Digitized Feature Name Attributes
Now that you have digitized all the roads and water bodies that you want, you will attribute them with names (this is a repeat of the procedure used for Campus Lagoon).
4.4.4 Creating Land-Use/Land-cover Polygon Themes
We will now create a choropleth map showing broad land-use/land-cover categories for our ETM image. We will do this by visual inspection; however, the most common method of classifying an image is with a classification algorithm that assigns each pixel to a class based on the similarity of its spectrum to the spectra of training-site pixels.
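For comparison with the visual approach used in this lab, here is a minimal Python sketch of one simple training-site method, a minimum-distance-to-means classifier. It is only one of several such algorithms and is not what ArcInfo necessarily uses. The class names and DN values are hypothetical.

```python
import math


def class_means(training):
    """Average the training spectra for each class, band by band."""
    means = {}
    for label, spectra in training.items():
        n_bands = len(spectra[0])
        means[label] = tuple(
            sum(s[b] for s in spectra) / len(spectra) for b in range(n_bands)
        )
    return means


def classify(pixel, means):
    """Assign the pixel to the class whose mean spectrum is nearest (Euclidean)."""
    return min(means, key=lambda label: math.dist(pixel, means[label]))


# Hypothetical training DNs as (red, NIR, mid-IR); not taken from the lab image.
training = {
    "vegetation": [(30, 150, 60), (35, 160, 70)],
    "water": [(20, 10, 5), (25, 12, 8)],
    "urban": [(110, 100, 120), (120, 110, 130)],
}

means = class_means(training)
for pixel in [(32, 140, 65), (22, 11, 6), (100, 95, 115)]:
    print(pixel, "->", classify(pixel, means))
```

Each pixel is compared with the mean spectrum of every class and takes the label of the closest one, which is the "similarity to training-site pixels" idea in its simplest form.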
Possible Bug:
Using version 8.1, "autocomplete polygon" will ONLY work if your polygon theme is added before your TM image. It is unknown if this is also true in previous versions of ArcInfo.
Answer Question #6: A) Describe the decision-making process you used to determine land-use/land-cover boundaries. What qualities in the TM image helped you define the boundaries? B) How could you quantify the accuracy of your land-cover classes or boundaries? How was error potentially introduced into your land-cover theme?
4.4.5 Creating a Map Overlaying LU/LC Themes on Image
Now, you are going to create and print a map showing the themes you created on top of the ETM image. While designing your map, keep in mind your goals: you want to convey a sense of the topography and clearly show your LU/LC classes, along with your road theme.
Displaying Themes on Top of Satellite Image
For your etm_lulc theme,
Your map for Lab 4: Finish the map using the layout environment of ArcMap. Add a title, north arrow, legend, scale-bar, map cartographer (that would be you, the author), date, and a brief text description (a sentence or two) of the data sources (review the introduction).
*** Although you will print the map on a grayscale printer, focus on making a good color map. Color is an extremely effective tool, especially when working with rich data such as Landsat TM, and multiple themes.***
4.4.6 Metadata Display and Editing in ArcInfo8
ArcCatalog contains a metadata editor that allows the user to easily view metadata about a dataset's contents, attributes, and spatial referencing. The editor also allows metadata entry, conversion of metadata between formats, and the attachment of additional files to a metadata file. A number of optional settings allow the user to decide whether metadata standards will be enforced (presumably, whether all of the required fields must be filled in before the data is used) and whether metadata changes will be "logged"; that is, whether changes made to the data will be recorded, creating a history for the dataset.
eXtensible Markup Language
XML Page from the World Wide Web Consortium
The XML Cover Pages hosted by OASIS
ArcInfo8 stores metadata in XML format. XML stands for eXtensible Markup Language, which is similar to HTML (Hypertext Markup Language) used for webpages. It is a standardized, open language designed for exchange of structured documents on the web. http://www.xml-zone.com/ defines the difference between XML and HTML thus:
"While HTML specifies how a document should be displayed, it does not describe what kind of information the document contains, or how it's organized. XML fills this void and allows document authors to organize information in a standard way."
Metadata in HTML format
Identification Information: description, purpose, creator of the dataset and the area and time period it covers
Data Quality Information: sources, processes, accuracy statements
Spatial Organization Information: the data model used (internal structure of the data)
Spatial Reference Information: coordinate system/projection and datum
Entity and Attribute Information: details on what the dataset describes
Distribution Information: where/how to get the dataset
Metadata Reference Information: who wrote the metadata and what version it is
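Because the metadata is stored as XML, the seven sections listed above can be located by tag name rather than by how they are displayed. The Python sketch below reads a title and reports which sections are absent from a record. The tag names follow the CSDGM short names as they commonly appear in FGDC-style XML, but treat them as assumptions and compare them with a record exported from ArcCatalog. The sample record is hypothetical and heavily abbreviated.

```python
import xml.etree.ElementTree as ET

# Assumed CSDGM short tag names mapped to the section names used in this lab.
SECTIONS = {
    "idinfo": "Identification Information",
    "dataqual": "Data Quality Information",
    "spdoinfo": "Spatial Organization Information",
    "spref": "Spatial Reference Information",
    "eainfo": "Entity and Attribute Information",
    "distinfo": "Distribution Information",
    "metainfo": "Metadata Reference Information",
}


def missing_sections(xml_text):
    """List the standard sections that have no element in the record."""
    root = ET.fromstring(xml_text)
    present = {child.tag for child in root}
    return [name for tag, name in SECTIONS.items() if tag not in present]


# Hypothetical, heavily abbreviated record for illustration only.
sample = """<metadata>
  <idinfo>
    <citation><citeinfo><title>Example land-cover layer</title></citeinfo></citation>
  </idinfo>
  <spref/>
  <metainfo/>
</metadata>"""

root = ET.fromstring(sample)
print("Title:", root.findtext("idinfo/citation/citeinfo/title"))
for name in missing_sections(sample):
    print("Missing:", name)
```

The same lookup would be much harder against an HTML rendering of the record, since HTML marks up appearance rather than meaning.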
To familiarize yourself with metadata content and layout, examine the metadata for the LANDCOV layer found on the California Gap Analysis Project Webpage (hosted by the UCSB Biogeography Lab). You do not need to download the data from the website, as we have already clipped it and put it on the network for you. You will go to the webpage to look at the metadata, however. We will be using a subset of this data layer for part of Lab 6. Answer these questions:
Answer Question #7: A) What extra elements in addition to the seven standard elements are listed in the LANDCOV metadata?
B) Briefly describe (1 sentence) the LANDCOV layer. C) What is the MMU (minimum mapping unit) for the dataset? D) What was the main data source upon which the polygon boundaries were based? E) What projection and spheroid are the data in?
ArcCatalog Metadata Editor
You are already familiar with viewing metadata in ArcCatalog. You will now learn how to create metadata, convert metadata between display styles and formats, and attach additional files to a metadata record.
Creating Metadata Using the ArcCatalog Metadata Editor
You are already familiar with ArcCatalog's metadata tab:
Answer Question #8: A) What are the main differences between the stylesheets? Which do you prefer, and why?
B) What formats can you import metadata from? What formats can you export metadata to?
Print out your metadata in the FGDC stylesheet and attach it to your lab.
4.5 Conclusion
In this lab, you have learned how to digitize features on-screen. Digitizing is also often done directly from paper maps, but that procedure is better covered in a cartography class. You have also been exposed to some of the most common imagery used for digitizing and land-use/land-cover mapping. You now know how to alter raster display to improve digitizing and to gather more information from your image.
You have learned how to view, read, create, convert, and search metadata. While metadata creation is often considered tedious and expensive, remember that metadata is the key that allows you, as a data producer, to make your data available to the world GIS community. It also gives you, as a GIS user, the ability to find useful spatial data in the free-for-all of the World Wide Web.
4.6 To turn in
Lab originally created by Nicholas Matzke and Sarah Battersby
UC Santa Barbara, Department of Geography
© 2000, Regents of the University of California; redistributed by permission
Last update: April 25, 2002
http://dusk.geo.orst.edu/buffgis/Arc8Labs/lab4/lab4.html