Quantitative Methods in Palaeoecology and Palaeoclimatology

PAGES Valdivia October 2010

Analysis of Stratigraphical Data

John Birks

CONTENTS

- Introduction
- Temporal stratigraphical data
- Single sequence
  - Partitioning or zonation
  - Sequence splitting
  - Rate-of-change analysis
  - Gradient analysis and summarisation
  - Analogue matching
  - Relationships between two or more sets of variables in same sequence
- Two or more sequences
  - Sequence comparison and correlation
  - Combined scaling
  - Variance partitioning (space and time)
  - Difference diagrams
  - Mapping
- Locally weighted regression (LOWESS)
- Summary

INTRODUCTION

Analysis of quadrats, lakes, streams, etc.: assume no autocorrelation, namely the values of a variable at some point in space cannot be predicted from known values at other sampling points.

PALAEOECOLOGY: fixed sample order in time; strong temporal autocorrelation.

STRATIGRAPHICAL DATA: biostratigraphic, lithostratigraphic, geochemical, geophysical, morphometric, isotopic; multivariate; continuous or discontinuous time series; ordering very important for display, partitioning, trends, interpretation.

ZONATION OR PARTITIONING OF STRATIGRAPHICAL DATA

Useful for 1) description, 2) discussion and interpretation, 3) comparisons in time and space.

Zone: a sediment body with a broadly similar composition that differs from underlying and overlying sediment bodies in the kind and/or amount of its composition.

CONSTRAINED CLASSIFICATIONS

1) Constrained agglomerative procedures: CONSLINK, CONISS

2) Constrained binary divisive procedures

Partition into g groups by placing g - 1 boundaries. Number of possibilities, compared with non-constrained situation.

Criteria: within-group sum-of-squares or variance (SPLITLSQ); within-group information (SPLITINF)

n3

Pollen diagram and numerical zonation analyses for the complete Abernethy Forest 1974 data set.

Birks & Gordon 1985

CONISS = constrained incremental sum-of-squares (= constrained Ward's minimum variance)


OPTIMAL SUM OF SQUARES PARTITIONS OF THE ABERNETHY FOREST 1974 DATA

Number of groups g (zones)   Percentage of total sum-of-squares   Markers
 2                           59.3                                 15
 3                           28.4                                 15 32
 4                           18.9                                 15 33 41
 5                           14.7                                 15 33 41 45
 6                           10.6                                 15 32 34 41 45
 7                            8.1                                 15 26 32 34 41 45
 8                            5.8                                 8 15 26 32 34 41 45
 9                            4.7                                 8 15 24 29 32 34 41 45
10                            3.9                                 8 15 24 29 32 33 34 41 45
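A table of this kind can be produced by dynamic programming over the ordered samples. The following is a minimal sketch, assuming Euclidean within-group sum-of-squares on untransformed data; the function name `optimal_partition` and the toy data are illustrative and are not taken from the original programs. Markers are reported here as the zero-based index of the first sample in each new zone, which may differ from the convention used in the table above.

```python
import numpy as np

def optimal_partition(data, g):
    """Best split of ordered samples into g contiguous zones (least within-zone SS)."""
    x = np.asarray(data, dtype=float)
    n = x.shape[0]

    # cost[i, j] = within-group sum-of-squares of the contiguous block of samples i..j-1
    cost = np.zeros((n + 1, n + 1))
    for i in range(n):
        for j in range(i + 1, n + 1):
            block = x[i:j]
            cost[i, j] = ((block - block.mean(axis=0)) ** 2).sum()

    # best[k, j] = minimum total SS when the first j samples are split into k zones
    best = np.full((g + 1, n + 1), np.inf)
    back = np.zeros((g + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, g + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                total = best[k - 1, i] + cost[i, j]
                if total < best[k, j]:
                    best[k, j], back[k, j] = total, i

    # Trace back the g - 1 markers (index of the first sample of each new zone).
    markers, j = [], n
    for k in range(g, 1, -1):
        j = back[k, j]
        markers.append(j)
    percent = 100.0 * best[g, n] / cost[0, n]
    return sorted(markers), percent

# Toy sequence (made up): three obvious zones of four samples each.
toy = np.array([[1, 9], [1, 9], [2, 8], [1, 9],
                [5, 5], [5, 5], [6, 4], [5, 5],
                [9, 1], [9, 1], [8, 2], [9, 1]], dtype=float)
toy_markers, toy_percent = optimal_partition(toy, 3)
```

Because the sample order is fixed, only contiguous blocks need to be considered, which is what keeps the search tractable compared with unconstrained classification.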

HOW MANY ZONES?

K D Bennett (1996) Determination of the number of

zones in a biostratigraphical sequence. New

Phytologist 132, 155-170

Broken stick model

Ioannina Basin

Tzedakis 1994

Pollen percentage diagram plotted against depth. Lithostratigraphic column is represented; symbols are based on Troels-Smith (1955).

Ioannina Basin

Tzedakis 1994

Variance accounted for by the nth zone as a proportion of the total variance (fluctuating curve) compared with values from a broken-stick model (smooth curve): (a) randomized data set, (b) original data set. Zonation method: binary divisive using the information content statistic. Data set: Ioannina.

Original data

Broken stick model

Bennett 1996
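The broken-stick comparison can be sketched in a few lines. This is a minimal illustration, assuming the standard broken-stick expectation (the k-th largest of n pieces has expected length (1/n) times the sum of 1/i for i = k..n); how n is chosen in a given zonation program is not specified here, and the `observed` values are made up for illustration.

```python
def broken_stick(n):
    """Expected proportions of the n pieces of a randomly broken unit stick, largest first."""
    return [sum(1.0 / i for i in range(k, n + 1)) / n for k in range(1, n + 1)]

def zones_to_keep(observed, expected):
    """Count leading splits whose observed proportion exceeds the broken-stick expectation."""
    count = 0
    for obs, exp in zip(observed, expected):
        if obs <= exp:
            break
        count += 1
    return count

expected = broken_stick(10)
# Hypothetical proportions of total variance accounted for by successive splits.
observed = [0.41, 0.22, 0.09, 0.04, 0.03, 0.02, 0.02, 0.01, 0.01, 0.01]
n_significant = zones_to_keep(observed, expected)
```

Splits that account for more variance than the broken-stick expectation are treated as worth keeping; the count stops at the first split that falls below the model.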

SEQUENCE SPLITTING

Walker & Wilson 1978 J Biogeog 5, 1-21

Walker & Pittelkow 1981 J Biogeog 8, 37-51

SPLIT, SPLIT2, BOUND2

Need statistically independent curves:

- Pollen influx (grains cm-2 year-1)
- PCA or CA or DCA axes (CANOCO)
- Aitchison log-ratio transformation (LOGRATIO)
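The formula for the log-ratio transformation is not preserved in this transcript. As an illustration only, one common form is Aitchison's centred log-ratio, in which each value is expressed as the log of its ratio to the geometric mean of the sample. The sketch below assumes that form; the zero-replacement value `pseudo` is an assumption of this example and is not taken from the LOGRATIO program.

```python
import numpy as np

def centred_log_ratio(percentages, pseudo=0.5):
    """Centred log-ratio transform of each sample (row); `pseudo` replaces zeros."""
    x = np.asarray(percentages, dtype=float)
    x = np.where(x <= 0, pseudo, x)            # logs need strictly positive values
    x = x / x.sum(axis=1, keepdims=True)       # re-close each sample to sum to 1
    logs = np.log(x)
    # Subtracting the mean log equals dividing by the geometric mean of the sample.
    return logs - logs.mean(axis=1, keepdims=True)

# Hypothetical pollen percentages: 3 samples (rows) by 3 taxa (columns).
pollen = np.array([[60.0, 30.0, 10.0],
                   [20.0, 50.0, 30.0],
                   [ 0.0, 70.0, 30.0]])
clr = centred_log_ratio(pollen)
```

Each transformed row sums to zero, which removes the constant-sum constraint of percentage data before curves are compared.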


Correlograms of sequence splits with charcoal, inorganic matter and total pollen influxes for three sections of the pollen record. The vertical scales give correlations; the horizontal scales give time lag in years (assuming a sampling interval of 50 years).

RATE OF CHANGE ANALYSIS

Amount of palynological compositional change per unit time. Calculate dissimilarity between pollen assemblages of two adjacent samples and standardise to constant time unit, e.g. 250 14C years.

Jacobson & Grimm 1986 Ecology 67, 958-966

Grimm & Jacobson 1992 Climate Dynamics 6, 179-184

RATEPOL, POLSTACK (TILIA)
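The calculation described above can be sketched as follows. This is a minimal illustration, assuming chord distance as the dissimilarity coefficient and simple adjacent-sample differences; it does not reproduce RATEPOL or POLSTACK, which may smooth or interpolate first. The ages and proportions are made up.

```python
import numpy as np

def chord_distance(p, q):
    """Chord distance between two assemblages given as proportions."""
    return np.sqrt(((np.sqrt(p) - np.sqrt(q)) ** 2).sum())

def rate_of_change(proportions, ages, unit=250.0):
    """Dissimilarity between adjacent samples, standardised to `unit` years."""
    props = np.asarray(proportions, dtype=float)
    ages = np.asarray(ages, dtype=float)
    rates = []
    for i in range(len(ages) - 1):
        d = chord_distance(props[i], props[i + 1])
        dt = abs(ages[i + 1] - ages[i])        # elapsed time between the two samples
        rates.append(d / dt * unit)
    return np.array(rates)

props = np.array([[0.6, 0.3, 0.1],
                  [0.6, 0.3, 0.1],
                  [0.2, 0.3, 0.5]])
ages = np.array([1000.0, 1500.0, 1750.0])      # hypothetical ages in years BP
rates = rate_of_change(props, ages)
```

Dividing by the time elapsed between samples is what makes values comparable along a sequence with uneven sampling intervals.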

Graph of distance (number of standard deviations) moved every 100 yr in the first three dimensions of the ordination vs age. Greater distance indicates greater change in pollen spectra in 100 yr.

Jacobson & Grimm 1986


MANY PROXIES, ONE SITE

ONE PROXY, MANY SITES

- fertile

- poor

Chord distance between samples at Solsø, Skånsø,

and Kragsø, calculated on smoothed data with 35

taxa and interpolated at 400 year and 1,000 year

intervals.

- poor

Pollen percentages from Loch Lang, Western Isles,

plotted against age (radiocarbon years BP). Data

from Bennett (1990).

Pollen percentages from Hockham Mere, eastern

England, plotted against age (radiocarbon years

BP). Data from Bennett (1983).

Rate x5 that at Loch Lang

Comparison of Holocene rates of change at Loch Lang and Hockham Mere, with the χ² - 2 dissimilarity coefficient on unsmoothed data, with a radiocarbon timescale.

SE - continental

NW - oceanic

DATA SUMMARISATION BY ORDINATION OR GRADIENT

ANALYSIS OF SINGLE SEQUENCE

Ordination methods: CA/DCA (joint plot) or PCA (biplot)

- Sample summary
- Species arrangement

CA = correspondence analysis; DCA = detrended correspondence analysis; PCA = principal components analysis
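As a rough illustration of sample summarisation, sample scores on the first few PCA axes can be obtained from a singular value decomposition of the centred data and then plotted against depth or age. This is a sketch under simple assumptions (covariance-based PCA on untransformed percentages); the function name and data are hypothetical.

```python
import numpy as np

def pca_scores(data, n_axes=2):
    """Sample scores on the first principal components, plus variance fractions."""
    x = np.asarray(data, dtype=float)
    centred = x - x.mean(axis=0)               # centre each taxon (column)
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_axes] * s[:n_axes]
    explained = (s ** 2) / (s ** 2).sum()      # fraction of total variance per axis
    return scores, explained[:n_axes]

# Hypothetical percentages for 5 levels (rows, in stratigraphical order) and 3 taxa.
levels = np.array([[70, 20, 10],
                   [60, 25, 15],
                   [40, 35, 25],
                   [20, 45, 35],
                   [10, 50, 40]], dtype=float)
scores, explained = pca_scores(levels)
```

Plotting the first column of `scores` in stratigraphical order gives a single curve summarising the main direction of compositional change.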

Biplot

PCA biplot of the Kirchner Marsh data; C2 = 0.746. The lengths of the Picea and Quercus vectors have been scaled down relative to the other vectors, in the manner described in the text. Stratigraphically neighbouring levels are joined by a line.

Joint plot

Correspondence analysis representation of the Kirchner Marsh data; C2 = 0.620. Stratigraphically neighbouring levels are joined by a line. Joint plot. Gordon 1982

Stratigraphical plot of sample scores on the first correspondence analysis axis (left) and of rarefaction estimate of richness (E(Sn)) (right) for Diss Mere, England. Major pollen-stratigraphical and cultural levels are also shown. The vertical axis is depth (cm). The scale for sample scores runs from 1.0 (left) to 1.2 (right).


The 1st and 2nd axes of the Detrended Correspondence Analysis for Laguna Oprasa and Laguna Facil plotted against calibrated calendar age (cal yr BP). The 1st axis contrasts taxa from warmer forested sites with cooler herbaceous sites. The 2nd axis contrasts taxa preferring wetter sites with those preferring drier sites.

Haberle & Bennett 2005

Species arrangement

Percentage pollen and spore diagram from Abernethy Forest, Inverness-shire. The percentages are plotted against time, the age of each sample having been estimated from the deposition time. Nomenclatural conventions follow Birks (1973a) unless stated in Appendix 1. The sediment lithology is indicated on the left side, using the symbols of Troels-Smith (1955). The pollen sum, ΣP, includes all non-aquatic taxa. Aquatic taxa, pteridophytes, and algae are calculated on the basis of ΣP + Σgroup as indicated.

Pollen types re-arranged on the basis of the weighted average for depth (TRAN)
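Re-arranging taxa by their weighted average depth can be sketched as below. This is an illustration only, assuming each taxon's weighted average is its abundance-weighted mean depth and that every taxon occurs in at least one sample; it is not the TRAN program itself, and the taxon names and values are hypothetical.

```python
import numpy as np

def order_taxa_by_depth(percentages, depths, names):
    """Order taxa by the abundance-weighted average depth at which they occur."""
    y = np.asarray(percentages, dtype=float)   # rows = samples, columns = taxa
    d = np.asarray(depths, dtype=float)
    wa_depth = (y * d[:, None]).sum(axis=0) / y.sum(axis=0)
    order = np.argsort(wa_depth)               # shallowest weighted average first
    return [names[i] for i in order], wa_depth[order]

depths = np.array([10.0, 20.0, 30.0, 40.0])    # hypothetical depths (cm)
percentages = np.array([[ 5, 80, 15],
                        [10, 60, 30],
                        [40, 20, 40],
                        [70,  5, 25]], dtype=float)
names = ["Betula", "Pinus", "Corylus"]
ordered, wa = order_taxa_by_depth(percentages, depths, names)
```

Plotting the taxa in this order places those that peak at similar depths next to each other, which makes the stratigraphical succession easier to read.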

ANALOGUE ANALYSIS

Modern training set: similar taxonomy, similar sedimentary environment.

Compare fossil sample 1 with all modern samples using an appropriate dissimilarity coefficient (DC), find the sample in the modern set most like fossil sample 1 (i.e. lowest DC), call it the closest analogue, repeat for fossil sample 2, etc.

Overpeck et al 1985 Quat Res 23, 87-108

ANALOG, MATCH, MAT
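The matching procedure can be sketched as follows. This is a minimal illustration, assuming squared chord distance as the dissimilarity coefficient and assemblages expressed as proportions; it is not the ANALOG, MATCH or MAT code, and the data are made up.

```python
import numpy as np

def squared_chord(p, q):
    """Squared chord distance between assemblages expressed as proportions."""
    return ((np.sqrt(p) - np.sqrt(q)) ** 2).sum(axis=-1)

def closest_analogues(fossil, modern):
    """For each fossil sample, index of and distance to the most similar modern sample."""
    fossil = np.asarray(fossil, dtype=float)
    modern = np.asarray(modern, dtype=float)
    # dist[i, j] = dissimilarity between fossil sample i and modern sample j
    dist = squared_chord(fossil[:, None, :], modern[None, :, :])
    best = dist.argmin(axis=1)
    return best, dist[np.arange(len(fossil)), best]

modern = np.array([[0.7, 0.2, 0.1],
                   [0.3, 0.4, 0.3],
                   [0.1, 0.2, 0.7]])
fossil = np.array([[0.65, 0.25, 0.10],
                   [0.10, 0.25, 0.65]])
best, dmin = closest_analogues(fossil, modern)
```

The minimum distance for each fossil sample is worth keeping as well as the index, since a large minimum suggests that no good modern analogue exists in the training set.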


Dissimilarity coefficients, radiocarbon dates,

pollen zones, and vegetation types represented by

the top ten analogs from the Lake West Okoboji

site.

Maps of squared chord distance values with modern

samples at selected time intervals

Plots of the minimum squared chord-distance for

each fossil spectrum at each of the eight sites.

A schematic representation of how fossil diatom zones/samples in a sediment core from an acidified lake can be compared numerically with modern surface sediment samples collected from potential modern analogue lakes. In this space-for-time model the vertical axis represents sedimentary diatom zones defined by depth and time; the horizontal axis represents spatially distributed modern analogue lakes; and the dotted lines indicate good floristic matches (dij < 0.65), as defined by the mean squared Chi-squared estimate of dissimilarity (SCD, see text).

Flower et al. 1997

COMPARISON AND CORRELATION BETWEEN TIME SERIES

Two or more stratigraphical sets of variables from same sequence. Are the temporal patterns similar?

(1) Separate ordinations: oscillation log-likelihood G-test or χ² test

(2) Constrained ordinations: pollen data (3 or 4 ordination axes or major patterns of variation) = Y; chemical data (3 or 4 ordination axes) = X; depth as a covariable. Does 'chemistry' explain or predict 'pollen'? i.e. is variance in Y well explained by X?

Lotter et al., 1992 J. Quat. Sci.: pollen (Y), 16O/18O (X), depth (covariable)



Pollen, oxygen-isotope stratigraphy, and sediment

composition of Aegelsee core AE-1 (after

Wegmüller and Lotter 1990)

Pollen and oxygen-isotope stratigraphy of

Gerzensee core G-III (after Eicher and

Siegenthaler 1976)

Is there a statistically significant relationship between the pollen stratigraphy and the stable-isotope record?

Summary of the results from detrended correspondence analysis (DCA) of late-glacial pollen spectra from five sequences. The percentage variance represented by each DCA axis is listed. Reduce pollen data to DCA axes. Use these then as responses.

Site              No. of samples   No. of taxa   DCA axis 1   DCA axis 2   DCA axis 3   DCA axis 4
Aegelsee AE-1     100              26            57.2         12.0         2.3          1.4
Aegelsee AE-3      54              32            44.3          3.3         1.5          1.4
Gerzensee G-III    65              28            37.6          4.0         1.2          0.9
Faulenseemoos      62              25            44.1         18.8         5.0          3.8
Rotsee RL-250      44              23            38.2         13.3         3.1          2.3

Results of redundancy analysis and partial redundancy analysis permutation tests for the significance of axis 1 when oxygen isotopes and depth are predictor variables, when oxygen is the only predictor, and when oxygen isotopes are the predictor variable and depth is a covariable.

Site              Predictors: δ18O and depth   Predictor: δ18O   Covariable: depth; predictor: δ18O   No. of response variables (pollen DCA axes)
Aegelsee AE-1     0.01a                        0.01a             0.02a                                2
Aegelsee AE-3     0.01a                        0.16              0.20                                 1
Gerzensee G-III   0.01a                        0.46              0.57                                 1
Faulenseemoos     0.01a                        0.01a             0.01a                                3
Rotsee RL-250     0.01a                        0.21              0.08                                 2

a Significant at p < 0.05

(Lotter et al. 1992)
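The logic of a partial redundancy analysis with a permutation test can be sketched with ordinary least squares. This is only an illustration of the idea: the covariable is regressed out of both responses and predictor, the explained fraction is computed, and it is compared with values from permuted data. It tests all fitted variation rather than axis 1 alone, it uses unrestricted permutations (restricted permutations would normally be more appropriate for autocorrelated stratigraphical data), and it is not the CANOCO implementation. All names and data below are hypothetical.

```python
import numpy as np

def residualise(a, z):
    """Residuals of the columns of `a` after least-squares regression on `z` plus intercept."""
    design = np.column_stack([np.ones(len(z)), z])
    coef, *_ = np.linalg.lstsq(design, a, rcond=None)
    return a - design @ coef

def explained_fraction(y, x):
    """Fraction of the total sum-of-squares of `y` fitted by regression on `x`."""
    resid = residualise(y, x)
    yc = y - y.mean(axis=0)
    return 1.0 - (resid ** 2).sum() / (yc ** 2).sum()

def partial_rda_test(y, x, covariable, n_perm=199, seed=0):
    """Permutation p-value for the effect of `x` on `y` after removing `covariable`."""
    rng = np.random.default_rng(seed)
    y_res = residualise(y, covariable)
    x_res = residualise(x, covariable)
    observed = explained_fraction(y_res, x_res)
    hits = 1                                   # the observed value counts as one permutation
    for _ in range(n_perm):
        shuffled = y_res[rng.permutation(len(y_res))]
        if explained_fraction(shuffled, x_res) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)

# Synthetic example: two "pollen axes" that track an "isotope" curve plus a depth trend.
rng = np.random.default_rng(1)
depth = np.arange(40.0)
isotope = np.sin(depth / 4.0) + 0.1 * rng.normal(size=40)
pollen_axes = np.column_stack([2.0 * isotope + 0.05 * depth,
                               -1.0 * isotope + 0.02 * depth])
pollen_axes = pollen_axes + 0.2 * rng.normal(size=(40, 2))
fraction, p_value = partial_rda_test(pollen_axes, isotope, depth)
```

With 199 permutations the smallest attainable p-value is 1/200 = 0.005, so the number of permutations sets the resolution of the test.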

ANALYSIS OF TWO OR MORE SEQUENCES

Regional zones, description of common features, interpretation, detection of unique features.

- Sequence comparison and correlation: sequence slotting (SLOTSEQ, FITSEQ, CONSSLOT)
- Combined scaling of two or more sequences (CANOCO)
- Variance partitioning (CANOCO)
- Difference diagrams
- Mapping procedures

SLOTSEQ

Slotting of the sequences S1 (A1, A2, ..., A10)

and S2 (B1, B2, ..., B7), illustrating the

contributions to the measure of discordance ?

(S1, S2) and the 'length' of the sequences, ?(S1,

S2).

The results of sequence-slotting of the Wolf Creek and Horseshoe Lake pollen sequences (ψ = 2.095). Radiocarbon dates for the pollen zone boundaries are also given, expressed as radiocarbon years before present (BP).

Birks & Gordon 1985
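The core idea of sequence slotting, merging two ordered sequences into one while keeping the internal order of each and keeping neighbouring samples as similar as possible, can be sketched with dynamic programming. This is a simplified illustration that minimises the raw path length through the merged sequence using Euclidean distance; it does not compute the standardised discordance measure used by SLOTSEQ, and the function name and data are hypothetical.

```python
import numpy as np

def slot_sequences(a, b):
    """Merge two ordered sequences, keeping each one's order, with minimum path length."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = len(a), len(b)

    def dist(p, q):
        return float(np.linalg.norm(p - q))

    # cost[i, j, s]: shortest path through the first i samples of a and first j of b,
    # ending on a[i-1] (s = 0) or on b[j-1] (s = 1).
    cost = np.full((n + 1, m + 1, 2), np.inf)
    prev = {}
    for i in range(n + 1):
        for j in range(m + 1):
            for s in (0, 1):
                pi, pj = (i - 1, j) if s == 0 else (i, j - 1)   # state before the last item
                if pi < 0 or pj < 0:
                    continue
                last = a[i - 1] if s == 0 else b[j - 1]
                if pi + pj == 0:               # first item of the merged sequence
                    cost[i, j, s] = 0.0
                    continue
                for t in (0, 1):               # which sequence supplied the previous item
                    if not np.isfinite(cost[pi, pj, t]):
                        continue
                    before = a[pi - 1] if t == 0 else b[pj - 1]
                    total = cost[pi, pj, t] + dist(before, last)
                    if total < cost[i, j, s]:
                        cost[i, j, s] = total
                        prev[(i, j, s)] = (pi, pj, t)

    # Trace back from the cheaper of the two possible final samples.
    state, order = (n, m, int(np.argmin(cost[n, m]))), []
    while state is not None:
        i, j, s = state
        order.append(("A", i) if s == 0 else ("B", j))   # 1-based sample labels
        state = prev.get(state)
    return order[::-1], float(cost[n, m].min())

seq_a = np.array([[0.0], [2.0], [4.0]])        # hypothetical one-variable sequences
seq_b = np.array([[1.0], [3.0]])
slotting, length = slot_sequences(seq_a, seq_b)
```

For these toy sequences the samples interleave as A1, B1, A2, B2, A3. External constraints (such as a dated tephra that must line up in both sequences) would restrict which merged orders are allowed.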

Comparison of oxygen-isotope records from Swiss

lakes Aegelsee (AE-3), Faulenseemoos (FSM) and

Gerzensee (G-III) with the Greenland Dye 3 record

(Dansgaard et al, 1982). LST marks the position

of the Laacher See Tephra (11,000 yr BP). Letters

and numbers mark the position of synchronous

events (for details see text).

Lotter et al 1992

Psi values for pair-wise sequence slotting of the stable-isotope stratigraphy at five Swiss late-glacial sites and the Dye 3 site in Greenland. Values above the diagonal are constrained slotting, using the three major shifts shown in previous figure; values below the diagonal are for sequence slotting in the absence of any external constraints. The mean δ18O and standard deviation for each sequence is also listed.

CONSLOXY

FUGLA NESS, Shetland

Pollen diagram from Sel Ayre showing the frequencies of all determinable and indeterminable pollen and spores expressed as percentages of total pollen and spores (ΣP). Abbreviations: undiff. = undifferentiated, indet. = indeterminable.

COMBINED SCALING

Comparison of Bjärsjöholmssjön and Färskesjön

using principal component analysis. The mean

scores of the local pollen zones and the ranges

of the sample scores in each zone are plotted on

the first and second principal components, and

are joined up in stratigraphic order. The

Blekinge regional pollen assemblage zones are

also shown.

Comparison of Färskesjön and Lösensjön using

principal component analysis. The mean scores of

the local pollen zones and the ranges of the

sample scores in each zone are plotted on the

first and second principal components, and are

joined up in stratigraphic order. The regional

pollen assemblage zones are also shown.

Birks & Berglund 1979

SWISS LATE-GLACIAL

7 sites, 357 samples


Summary of results of detrended correspondence analyses of the biozone II and III assemblages at the seven individual sites. Gradient length is given in standard deviation units. The contrast statistic is explained in the text.

Site                  Number of biozone II and III samples   Sum of eigenvalues (total variance)   Gradient length   Biozone II/III contrast
Lobsigensee           32                                     0.18                                  1.14              0.58
Murifeld              22                                     0.12                                  0.63              0.24
Aegelsee              60                                     0.12                                  0.62              0.24 R
Saanenmöser           21                                     0.15                                  0.73              0.14
Zeneggen (Hellelen)   16                                     0.29                                  0.88              0.47 R
Hopschensee           22                                     0.21                                  0.77              0.46 R
Lago di Ganna         21                                     0.46                                  1.36              0.06

R = revertence

VARIANCE PARTITIONING

Total variance = between-site variance + within-site variance + unexplained (error) component

Use partial constrained ordinations to partition variance into:

a) Unexplained variance 13.8%
b) Between-site spatial variance 13.2% (p = 0.01)
c) Within-site temporal variance 73.0% (p = 0.01)

Within a sequence, variance partitioning:

a) Unexplained variance not captured by zonation 39.7%
b) Variance captured by zone II 33.2%
c) Variance captured by zone III 17.9%
d) Variance captured by zone I 9.2%

Can now do a partial ordination of the 39.7% unexplained variance to see what sort of patterns remain. Noise, chaos, trends or what?

Tzedakis & Bennett 1995
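The additive idea behind variance partitioning can be illustrated with a much simpler case than partial constrained ordination: a single response (for example scores on one ordination axis) recorded at several sites for the same set of time slices. In that balanced case the total sum-of-squares splits exactly into a between-site part, a temporal part and a remainder. The sketch below assumes that simplified setting; the values are made up and do not correspond to the percentages quoted above.

```python
import numpy as np

def partition_variance(values):
    """Split total sum-of-squares of a site x time table into site, time and residual parts."""
    v = np.asarray(values, dtype=float)        # rows = sites, columns = time slices
    grand = v.mean()
    site_means = v.mean(axis=1, keepdims=True)
    time_means = v.mean(axis=0, keepdims=True)
    total = ((v - grand) ** 2).sum()
    between_site = v.shape[1] * ((site_means - grand) ** 2).sum()
    temporal = v.shape[0] * ((time_means - grand) ** 2).sum()
    unexplained = ((v - site_means - time_means + grand) ** 2).sum()
    parts = np.array([between_site, temporal, unexplained])
    return 100.0 * parts / total               # percentages of the total

# Hypothetical scores on one ordination axis: 3 sites, each sampled at the same 4 times.
axis_scores = np.array([[1.0, 2.0, 4.0, 5.0],
                        [1.5, 2.5, 4.5, 5.0],
                        [2.0, 3.5, 5.0, 6.5]])
percent_site, percent_time, percent_error = partition_variance(axis_scores)
```

In this toy table most of the variation is temporal, which is the same kind of statement as the space and time percentages listed above.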

Pollen percentage diagram of selected taxa plotted against depth. Lithostratigraphic symbols are based on Troels-Smith (1955). For correlations and ages see Tzedakis (1993, 1994).

Pollen percentage diagrams of selected arboreal taxa of the Metsovon, Zista, Pamvotis and Dodoni I and II forest periods of Ioannina 249. Panel labels: 5e, 7c, 9c, 11a b c.

Tzedakis & Bennett 1995

DIFFERENCE DIAGRAMS

Pollen percentage difference diagram for the

Hockham Mere and Stow Bedon sequences for

selected taxa, plotted against radiocarbon age.

Note different percentage scale for each taxon.

Location of the two coring sites, Rezina Marsh

and Gramousti Lake, in relation to altitude.

Pollen percentage difference diagram to compare

results between the pollen percentage values of

selected taxa at Rezina Marsh and Gramousti Lake.

The values are plotted against an estimated time

scale and have been calculated at a time interval

of 250 yr. Values to the right of the axis (blue)

indicate a higher recorded percentage of a taxon

at Rezina Marsh, values to the left (red)

indicate a higher recorded percentage of the

taxon at Gramousti Lake.
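A difference curve of this kind can be sketched by interpolating each site's record onto a shared time grid and subtracting. This is a minimal illustration, assuming linear interpolation and ages given in increasing order; the ages and percentages are made up and are not the Rezina Marsh or Gramousti Lake data.

```python
import numpy as np

def difference_curve(ages_a, pct_a, ages_b, pct_b, step=250.0):
    """Interpolate one taxon at two sites onto a shared time grid and subtract (A - B)."""
    start = max(min(ages_a), min(ages_b))      # use only the period covered by both sites
    end = min(max(ages_a), max(ages_b))
    grid = np.arange(start, end + step / 2.0, step)
    a = np.interp(grid, ages_a, pct_a)         # np.interp needs ages in increasing order
    b = np.interp(grid, ages_b, pct_b)
    return grid, a - b

ages_a = np.array([0.0, 400.0, 900.0, 1500.0])
pct_a = np.array([10.0, 18.0, 30.0, 25.0])
ages_b = np.array([0.0, 500.0, 1000.0, 1500.0])
pct_b = np.array([10.0, 10.0, 20.0, 30.0])
grid, diff = difference_curve(ages_a, pct_a, ages_b, pct_b)
```

Positive values mean a higher percentage at site A and negative values a higher percentage at site B; the quality of the comparison depends on how well the two chronologies are constrained.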

MAPPING

Distribution in northern England of maximum

values for pollen of Tilia during the period 5000

to 3000 B.C.

Pinus

Betula

Maps of pollen frequencies 5,000 years B.P.

Ulmus

Corylus

Maps of pollen frequencies 5,000 years B.P.

Quercus

Tilia

Maps of pollen frequencies 5,000 years B.P.

Alnus

Map of pollen frequencies 5,000 years B.P.

Map of scores of pollen spectra on the first

principal component, 5,000 years B.P.

Map of scores of pollen spectra on the second

principal component, 5,000 years B.P.

Map of scores of pollen spectra on the third

principal component, 5,000 years B.P.

Provisional map of woodland types for the British Isles 5,000 years ago.

Vegetation regions reconstructed from pollen data

for 9,000, 6,000, 3,000, and 0 yr B.P.

LOCALLY WEIGHTED REGRESSION

W.S. Cleveland: LOWESS or LOESS, locally weighted regression scatterplot smoothing.

May be unreasonable to expect a single functional relationship between Y and X throughout range of X.

(Running averages for time-series: smooth by average of y(t-1), y(t), y(t+1), or add weights to y(t-1), y(t), y(t+1).)

Linear

(A) Survival rate (angularly transformed) of

tadpoles in a single enclosure plotted as a

function of the average body mass of the

survivors in the enclosure. Data from Travis

(1983). Line indicates the normal least-squares

regression. (B) Residuals from the linear

regression depicted in part A plotted as a

function of the independent variable, average

body mass.

Quadratic

LOWESS

LOWESS

(A) Data from previous graph A with a line depicting a least-squares quadratic model. (B) Data from previous graph A with a line depicting a LOWESS regression model with f = 0.67. (C) Data from previous graph A with a line depicting a LOWESS regression model with f = 0.33.

LOWESS - more general

- Decide how smooth the fitted relationship should be.
- Each observation given a weight depending on distance to observation x1 for all adjacent points considered.
- Fit simple linear regression for adjacent points using weighted least squares.
- Repeat for all observations.
- Calculate residuals (difference between observed and fitted y).
- Estimate robustness weights based on residuals, so that well-fitted points have high weight.
- Repeat LOWESS procedure but with new weights based on robustness weights and distance weights.
- Repeat for different degree of smoothness, to find optimal smoother.
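The steps listed above can be sketched as a short function. This is a minimal illustration, assuming tri-cube distance weights, a local straight-line fit, bisquare robustness weights and distinct x values; the span `f` and the number of robustness passes `n_iter` are parameters of this example, and choosing the optimal degree of smoothness (the last step) is left out.

```python
import numpy as np

def lowess(x, y, f=0.67, n_iter=2):
    """Locally weighted linear regression smoother with robustness iterations."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n = len(x)
    k = max(2, int(np.ceil(f * n)))            # number of neighbours in each window
    fitted = np.zeros(n)
    robust = np.ones(n)                        # robustness weights, all 1 on the first pass
    for _ in range(n_iter + 1):
        for i in range(n):
            d = np.abs(x - x[i])
            h = np.sort(d)[k - 1]              # distance to the k-th nearest point
            u = np.clip(d / h, 0.0, 1.0)
            w = (1.0 - u ** 3) ** 3 * robust   # tri-cube distance weights times robustness
            # Weighted least squares fit of a straight line around the target x[i]
            sw = w.sum()
            xm, ym = (w * x).sum() / sw, (w * y).sum() / sw
            sxx = (w * (x - xm) ** 2).sum()
            slope = (w * (x - xm) * (y - ym)).sum() / sxx if sxx > 0 else 0.0
            fitted[i] = ym + slope * (x[i] - xm)
        resid = y - fitted
        s = np.median(np.abs(resid))
        if s == 0:
            break
        # Bisquare weights: poorly fitted points (large residuals) get low or zero weight.
        robust = np.clip(1.0 - (resid / (6.0 * s)) ** 2, 0.0, None) ** 2
    return fitted

# Made-up example: a straight line with one gross outlier.
x_demo = np.arange(20.0)
y_demo = 2.0 * x_demo + 1.0
y_demo[10] += 50.0
smooth = lowess(x_demo, y_demo)
```

A smaller `f` follows local features more closely, while a larger `f` gives a smoother curve, as in the f = 0.67 and f = 0.33 panels described earlier.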

linear regression

tri-cube function

target value

How the LOESS smoother works. The shaded region

indicates the window of values around the target

value (arrow). A weighted linear regression

(broken line) is computed, using weights given by

the 'tri-cube' function (dotted curve). Repeating

this process for all target values gives the

solid curve.

Round Loch of Glenhead

LOWESS curve

SUMMARY

- Stratigraphical data have special numerical properties: fixed order of samples, often closed percentage data, many variables, many samples
- Numerical procedures that take account of these properties are available for partitioning (zonation), sequence splitting, rate-of-change analysis, summarisation of stratigraphical patterns, analogue matching, and establishing relationships between two or more sets of variables in the same sequence
- Numerical procedures for analysing two or more sequences from different sites are less well developed but there are robust techniques for sequence comparison and correlation, examining differences, and displaying spatial pattern at particular times
- Locally weighted regression (LOWESS) is a very valuable technique for highlighting signal in stratigraphical data
- Palaeoecologists now have a valuable set of robust numerical tools available for summarising patterns in stratigraphical data