Title: Spectroscopy and Hyperspectral Imaging for Mineral and Environmental Applications
1 Spectroscopy and Hyperspectral Imaging for Mineral and Environmental Applications
- Mark Berman
- CSIRO Mathematical and Information Sciences (CMIS), Sydney
- June 12, 2009
2 Outline of talk
- What are spectroscopy and hyperspectral imaging, and why are they useful?
- Major areas of CMIS hyperspectral research
- Applications in mineral exploration, biotechnology and environmental monitoring
3 What is Spectroscopy?
- Many different types of spectroscopy
- Typically, the amount of light that an object reflects, absorbs, transmits, scatters or fluoresces is measured at many wavelengths (often equally spaced)
- We will mostly be concerned with reflectance spectra
- These can often be represented as a smooth (low frequency) curve with absorption features (intermediate frequency)
- The spectrum of a material tells us about its chemistry
- Some examples follow
4 Some Examples of Visible Near Infrared (VNIR) Reflectance Spectra
- Visible light: blue (400-500 nm), green (500-600 nm), red (600-700 nm)
- ASD spectrometer: [350, 2500] nm, 1 nm apart
- Note the within-class variation
- Figure annotation: water features
5 Shortwave Infrared (SWIR) Spectra from 4 White Mica Classes (PIMA spectrometer, [1300, 2500] nm, 2 nm apart)
- Figure labels: (bound water), (K rich), (Na rich), (Fe or Mg rich)
- Note the higher frequency nature of the absorption features than in the VNIR
- Also note the variability of the low frequency backgrounds
6 What is a Hyperspectral Image?
- A hyperspectral image provides a spectrum (typically tens or hundreds of numbers) at each pixel
- So hyperspectral images enable us to map variations in chemistry (Chemical Imaging)
- The original hyperspectral sensors (in the 1980s and 1990s) were airborne and used mainly for exploration, environmental and military applications
- Several hyperspectral satellites are due for launch over the next 5 or 6 years
- There are now hyperspectral microscopes and cameras being used for terrestrial applications (e.g. medical diagnosis such as burns analysis and skin cancer, biosecurity, pharmaceuticals, forensics, agribusiness, exploration and mining)
- Can generate large volumes of data
- Consequent issues: automation, speed, accuracy of results ("All models are wrong", and in large data sets it shows)
7 Example: Excedrin tablet (Spectral Dimensions)
- 121 images, [1200, 2400] nm, 10 nm apart, 160 x 128 (= 20480) pixels
- Figure: 4 (of 20480) absorption spectra from the image
8 Application: Excedrin tablet constituent concentration maps and averages
- Map averages: Filler (15), Aspirin (34), Caffeine (14), Paracetamol (36)
- The above maps are based on pixelwise mixture models
- Can also test for the spatial distribution of constituents
9 Major Areas of CMIS Hyperspectral Research
- In many applications, most pixels contain a mixture of materials.
- Our research has been focussed on unmixing spectra into their constituent materials
- Fast and reliable identification of material mixtures with a spectral library of pure spectra: The Spectral Assistant (TSA)
- Fast and reliable identification of material mixtures without using a spectral library (the endmember problem): ICE (Iterated Constrained Endmembers)
- Algorithms are mainly based on linear mixture models with constraints on the coefficients: either non-negative or proportions (convex geometry)
10 1. Identification of material mixtures with a spectral library: The Spectral Assistant (TSA)
- TSA developed since the mid-90s (Berman, Bischof and Huntington (1999); Lagerstrom, Mason, Guo since)
- Identifies pure minerals and mixtures of 2 or more minerals from a spectral library.
- TSA (Version 6) uses 2 libraries of pure materials:
- (a) SWIR library: 299 wavelengths, 60 materials (various minerals, dry vegetation, wood, teflon, plastics)
- (b) VNIR library: 163 wavelengths, 17 materials (various iron ores, sulphides and green vegetation)
- Figure: PIMA field-portable IR ([1300, 2500] nm) spectrometer
11 Technical Details About TSA
- 3 major components:
- (i) SWIR model is multiplicative (Beer's Law). Its terms are:
- M is the number of materials in the mixture,
- $E_j$ is the mean spectrum of the jth library class (out of M),
- the coefficients of the $E_j$ are non-negative (to be estimated) and can be interpreted as proportions if renormalised to sum to 1,
- $B_k$ (k = 1, ..., 6) are the basis functions of a low-frequency smoothing spline (modelling the background; this needs automation),
- $B_7$ is the spectrum of water,
- the coefficients of the $B_k$ are unconstrained (to be estimated),
- the error term is assumed to have a common covariance matrix (this assumption is incorrect)
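The text lists the terms of the SWIR model but not the model equation. One form consistent with those terms is sketched below. It assumes that "multiplicative" means the model becomes additive after taking logarithms of the measured spectrum. The symbols $X$ (measured spectrum), $\beta_j$, $\gamma_k$ and $\varepsilon$ are assumed names, and neither the signs nor the exact transform can be recovered from the text.

```latex
% Hedged sketch only: an additive model on the log scale, using assumed symbol names.
\log X \;=\; \sum_{j=1}^{M} \beta_j E_j \;+\; \sum_{k=1}^{7} \gamma_k B_k \;+\; \varepsilon,
\qquad \beta_j \ge 0 \quad (j = 1, \dots, M)
```

Under this reading, the first sum carries the library materials, the second carries the spline background ($k = 1, \dots, 6$) plus water ($k = 7$), and renormalising the $\beta_j$ to sum to 1 gives the proportions.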
12 Technical Details About TSA (cont.)
- (ii) Penalised discriminant (canonical variate) analysis (Hastie, Buja and Tibshirani, 1995), for dimension reduction and decorrelating the data
- (iii) Fast subset selection procedures, for identifying the best mixtures of M materials; e.g. for the SWIR library there are 1770, 34,220 and 487,635 possible mixtures of 2, 3 and 4 materials respectively
- (iv) Empirical rules currently used to decide between M = 1, 2 or 3 only (because the model is incorrect)
- Incorporated into The Spectral Geologist (TSG), developed by CSIRO Exploration and Mining (CEM), which is sold commercially
- Can analyse tens of thousands of spectra in a few minutes
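The mixture counts quoted for the SWIR library are the numbers of ways to choose 2, 3 or 4 materials from its 60 materials. A minimal check, assuming a mixture is an unordered set of distinct library materials (the helper name `count_mixtures` is illustrative and not part of TSA):

```python
from math import comb


def count_mixtures(library_size: int, mixture_size: int) -> int:
    """Number of unordered subsets of `mixture_size` materials in the library."""
    return comb(library_size, mixture_size)


SWIR_LIBRARY_SIZE = 60  # materials in the SWIR library, as stated in the talk

counts = {m: count_mixtures(SWIR_LIBRARY_SIZE, m) for m in (2, 3, 4)}
for m, n in counts.items():
    print(f"{m} materials: {n:,} candidate mixtures")
```

The count grows roughly an order of magnitude with each extra material, which is why an exhaustive search quickly becomes expensive and fast subset selection matters.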
13 CEM's HyLogger(TM) Systems
- Typically 500 m (about 60,000 spectra) of drill core / day in an operational commercial environment
14 Emmie Bluff Drill Cores (measured by Alan Mauger and colleagues, PIRSA)
- These 67 trays represent one drill core!
- Data are n = 58,890 spectra, each with d = 522 observations in [400, 2500] nm (n x d ≈ 30.7 million)
- About 100 drill holes at each site!
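The data volume quoted above is simple arithmetic, spelled out here in Python. The per-site figure is an extrapolation: it assumes, purely for illustration, that each of the roughly 100 drill holes yields as much data as this one core.

```python
n_spectra = 58_890      # spectra measured along this drill core
n_wavelengths = 522     # observations per spectrum in [400, 2500] nm
holes_per_site = 100    # approximate number of drill holes at a site

values_per_core = n_spectra * n_wavelengths
# Assumption: every hole at the site resembles this core in length and sampling.
values_per_site = values_per_core * holes_per_site

print(f"per core: {values_per_core:,} values")
print(f"per site: {values_per_site:,} values (extrapolated)")
```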
15 The Spectral Geologist: The Spectral Assistant Log Screen
16 The Spectral Geologist: The Spectral Assistant Summary Screen
17 2. Identification of material mixtures without using a spectral library
- Building spectral libraries is a very time consuming process!
- Currently, it is easier to measure training samples with a spectrometer (e.g. a PIMA), which is more portable, than with a hyperspectral camera, especially if it is airborne or a microscope
- Also, airborne spectra are distorted by the atmosphere and the sun; correction algorithms are not yet good enough to enable reliable matching of airborne and terrestrial spectra
- There is a lot of interest in identifying the purest spectra in a scene (endmembers), i.e. blind unmixing; this still requires manual identification of the endmembers
18 Return to Linear Mixture Model
- For simplicity, assume that the background has been removed. For pixel i:
- $X_i = \sum_k p_{ik} E_k + e_i$
- where $p_{ik} \ge 0$, $E_k$ is the unknown kth endmember (k = 1, ..., M) and the $e_i$ are errors (again assumed to have a common covariance matrix).
- Note the scale indeterminacy between $p_{ik}$ and $E_k$.
- Common to assume $\sum_k p_{ik} = 1$: the convex geometry model.
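The scale indeterminacy can be seen in a small numerical sketch. Multiplying one coefficient by a constant and dividing the matching endmember by the same constant leaves the mixed spectrum unchanged. The endmember spectra and coefficients below are made-up illustrative values, and noise is omitted.

```python
def mix(coefficients, endmembers):
    """Noise-free linear mixture: sum over k of p_k * E_k, band by band."""
    n_bands = len(endmembers[0])
    return [
        sum(p * e[b] for p, e in zip(coefficients, endmembers))
        for b in range(n_bands)
    ]


# Hypothetical endmember spectra: 3 materials, 4 bands each.
endmembers = [
    [0.2, 0.4, 0.6, 0.8],
    [0.9, 0.7, 0.5, 0.3],
    [0.5, 0.5, 0.1, 0.5],
]
p = [0.5, 0.3, 0.2]  # non-negative and summing to 1

x = mix(p, endmembers)

# Double the first coefficient and halve the first endmember.
c = 2.0
p_scaled = [p[0] * c, p[1], p[2]]
endmembers_scaled = [[v / c for v in endmembers[0]], endmembers[1], endmembers[2]]
x_scaled = mix(p_scaled, endmembers_scaled)

same_spectrum = all(abs(a - b) < 1e-12 for a, b in zip(x, x_scaled))
print("identical mixed spectra:", same_spectrum)
print("sum of original coefficients:", sum(p))
print("sum of rescaled coefficients:", sum(p_scaled))
```

The rescaled coefficients no longer sum to 1, so the sum-to-one constraint excludes this particular rescaling.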
19 Implications of the Convex Geometry Model
- If there are M materials and there is no noise, then the data lie inside a simplex with M vertices in an (M-1)-dimensional subspace.
- The pure materials lie at the vertices.
- If M = 3, the simplex is a triangle in a two-dimensional subspace.
- If M = 4, the simplex is a tetrahedron (pyramid) in a three-dimensional subspace.
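For M = 3 the proportions of a point are its barycentric coordinates with respect to the triangle's vertices, and the point lies inside the simplex exactly when all three are non-negative. A minimal sketch, assuming the data have already been projected into the two-dimensional subspace (the function names and example vertices are illustrative):

```python
def barycentric(point, vertices):
    """Return (p1, p2, p3), summing to 1, with point = sum of p_k * vertex_k."""
    x, y = point
    (x1, y1), (x2, y2), (x3, y3) = vertices
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("vertices are collinear; the simplex is degenerate")
    p1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    p2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return (p1, p2, 1.0 - p1 - p2)


def inside_simplex(point, vertices, tol=1e-12):
    """True if every proportion is non-negative (up to a small tolerance)."""
    return all(p >= -tol for p in barycentric(point, vertices))


# Hypothetical endmember positions in the 2-D subspace.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

print(barycentric((0.25, 0.25), triangle))   # a mixed pixel
print(barycentric((1.0, 0.0), triangle))     # a pure pixel at a vertex
print(inside_simplex((1.0, 1.0), triangle))  # a point outside the triangle
```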
20 Toy convex geometry example: simulated data and true simplex (M = 3)
- Note:
- Noise in the data is due to natural variation in spectra or inadequacy of the linear mixture model
- Some vertices (endmembers) are not present in the data
21 Toy convex geometry example: Some existing solutions
- Craig (1994): encloses the data (sensitive to noise)
- Winter (1999): constrained to lie within the data (restrictive)
22 Iterated Constrained Endmembers (ICE) Algorithm
- Recently developed by CSIRO scientists (Berman et al., 2004, 2005, 2009)
- Unlike other endmember finding algorithms, ICE:
- accounts for noise in the data,
- does not assume that, for every material in a scene, there is at least one pixel which consists only of that material
23 Toy example
- Figure panels: Existing Solutions; ICE Solutions
- Endmember spectra, proportion and error (RSS) maps follow easily
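Once endmembers and per-pixel proportions are available, the error (RSS) map is the residual sum of squares of the linear mixture fit at each pixel. The sketch below shows only that last step. It does not implement ICE, and the endmembers, proportions and spectra are made-up illustrative values.

```python
def residual_sum_of_squares(spectrum, proportions, endmembers):
    """RSS between a measured spectrum and its fitted linear mixture."""
    fitted = [
        sum(p * e[b] for p, e in zip(proportions, endmembers))
        for b in range(len(spectrum))
    ]
    return sum((x - f) ** 2 for x, f in zip(spectrum, fitted))


# Hypothetical estimates: 2 endmembers, 3 bands.
endmembers = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
]

# Two pixels, each with a measured spectrum and estimated proportions.
pixels = [
    {"spectrum": [0.5, 0.5, 0.5], "proportions": [0.5, 0.5]},   # fits exactly
    {"spectrum": [1.0, 0.0, 0.75], "proportions": [1.0, 0.0]},  # misfit in band 3
]

rss_map = [
    residual_sum_of_squares(px["spectrum"], px["proportions"], endmembers)
    for px in pixels
]
print(rss_map)
```

Large values in such a map flag pixels that the fitted endmembers explain poorly.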
24 Hyperspectral Mapping for Environmental Applications
- Figure: Landsat (6 bands)
- CMIS Terrestrial Mapping and Monitoring (TeMM) Group is the world's first group to use low dimensional remote sensing data for reliable quantitative mapping on a continental scale
- Information obtained is being used in national and international environmental applications (e.g. carbon accounting)
- TeMM won the CSIRO Chairman's Medal in 2004 and a 2008 Eureka Medal
- Major issue is distortion of information due to atmospheric and solar effects
- Size of data set: $7.8 \times 10^{10}$ observations ($\approx 360 \times 6 \times 3.6 \times 10^7$)
- More if time series analysis is included (14 time slices analysed between 1972 and 2004)
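The quoted data set size can be reproduced from the three factors given on the slide. The variable names below are an interpretation (presumably images, bands and pixels per image) and are not stated in the source. Multiplying by the 14 time slices is likewise only a rough upper-end illustration.

```python
# Factors as quoted on the slide; the names are assumed interpretations.
n_images = 360                  # assumed: number of images in the mosaic
n_bands = 6                     # Landsat bands, as stated
pixels_per_image = 36_000_000   # assumed: 3.6 x 10^7 pixels per image

observations = n_images * n_bands * pixels_per_image
# Assumption: every one of the 14 time slices has the same size.
observations_with_time = observations * 14

print(f"single date:    {observations:.2e} observations")
print(f"14 time slices: {observations_with_time:.2e} observations")
```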
25 Hyperspectral Mapping for Environmental Applications (cont.)
- Hyperspectral satellites (which produce 200-dimensional spectra at each pixel) will be launched over the next few years
- TeMM wants to combine unmixing algorithms and atmospheric and solar correction algorithms to produce more detailed continental scale maps
- Preliminary work carried out using airborne data: HyMap (126 bands, 70 million pixels), near Mt. Isa
- Figure: Band 53 of Two Neighbouring Images
26 Some Sample Spectra from the Red Region
- Figure: Mean-standardised spectra
27 Independent ICE Proportion Maps for Left and Right Mean-Normalised Images (10 Endmembers in Each)
- Figure panels: Left Proportion Maps and RSS Map; Right Proportion Maps and RSS Map
- Which images match with which?
28 A Version of ICE Which Constrains Proportions in Overlap Regions to be Equal
- Figure: 13 proportion maps and RSS map
29 Long Term Goals for Satellite Hyperspectral Work
- To produce seamless continental scale maps for environmental and other mapping and monitoring applications
- Will need to use both TSA and ICE type algorithms
- Will need to build spectral libraries of relevant vegetation, soils, water, possibly man made objects
- Will need to improve our models to deal with atmospheric and solar distortions of spectra
- Later to incorporate time series information
- Will need to make algorithms robust and fast (including parallelisation)
- Storage and transfer of data and/or maps will also be a critical issue