Transcript and Presenter's Notes

Title: A NONLINEAR DIMENSION REDUCTION STRATEGY FOR GENERATING DATA DRIVEN STOCHASTIC INPUT MODELS


1
A NONLINEAR DIMENSION REDUCTION STRATEGY FOR
GENERATING DATA DRIVEN STOCHASTIC INPUT MODELS
Nicholas Zabaras and Baskar Ganapathysubramanian
Materials Process Design and Control Laboratory
Sibley School of Mechanical and Aerospace Engineering
Cornell University, Ithaca, NY 14853-3801
zabaras@cornell.edu, bg74@cornell.edu
http://mpdc.mae.cornell.edu
Symposium on Algorithms and Analysis in Uncertainty
Quantification, 2008 SIAM Annual Meeting, San Diego,
CA, July 7-11, 2008
2
TRANSPORT IN HETEROGENEOUS MEDIA
  • Thermal and fluid transport in heterogeneous
    media is ubiquitous
  • Applications range from large-scale systems
    (e.g., geothermal systems) to the small scale
  • Most critical devices/applications utilize
    heterogeneous/polycrystalline/functionally graded
    materials

Thermal transport through polycrystalline and
functionally graded materials
  • Properties depend on the distribution of the
    material/microstructure
  • But we possess only limited information about the
    microstructure/property distribution

- This naturally leads to stochastic analysis
Hydrodynamic transport through heterogeneous
permeable media
3
STOCHASTIC ANALYSIS: A BRIEF LOOK
Include the effects of uncertainties in initial and
boundary conditions, parameters, constitutive
relations, etc.
Statistical and non-statistical approaches:
Monte Carlo sampling: non-intrusive but slow
Representation of uncertainty as an additional
dimension
Spectral expansion of uncertainty: Generalized
Polynomial Chaos Expansion (GPCE)
GPCE methods: intrusive but fast
Collocation based methods: non-intrusive and fast
  • R.G. Ghanem, P.D. Spanos, Stochastic Finite
    Elements: A Spectral Approach, Dover Publications,
    1991.
  • D. Xiu, G.E. Karniadakis, Modeling uncertainty in
    steady state diffusion problems via generalized
    polynomial chaos, Comput. Methods Appl. Mech.
    Engrg. 191 (2002) 4927-4948.
  • X. Wan, G.E. Karniadakis, An adaptive
    multi-element generalized polynomial chaos method
    for stochastic differential equations, J. Comp.
    Physics 209 (2005) 617-642.
  • B. Ganapathysubramanian, N. Zabaras, Sparse grid
    collocation methods for stochastic natural
    convection problems, Journal of Computational
    Physics, 225 (2007) 652-685.

4
STOCHASTIC ANALYSIS: STATE-OF-THE-ART
  • Important requirements of a stochastic framework:
  • Scalable/parallelizable
  • Utilizes available deterministic simulators

Black-box stochastic analysis toolkit: adaptive,
parallel and non-intrusive
(Figure: temperature and Y-velocity contours)
  • N. Zabaras, B. Ganapathysubramanian, A scalable
    framework for the solution of stochastic inverse
    problems using a sparse grid collocation
    approach, Journal of Computational Physics, Vol.
    227, pp. 4697-4735, 2008.
  • B. Ganapathysubramanian, N. Zabaras, Modelling
    diffusion in random heterogeneous media:
    Data-driven models, stochastic collocation and
    the variational multi-scale method, Journal of
    Computational Physics, 226 (2007) 326-353.
  • B. Ganapathysubramanian, N. Zabaras, Sparse grid
    collocation methods for stochastic natural
    convection problems, Journal of Computational
    Physics, 225 (2007) 652-685.

5
GENERATING VIABLE STOCHASTIC INPUT MODELS
Realistic input models of the variability in
properties and parameters are necessary for drawing
meaningful conclusions from stochastic analysis.
Realistic input models link experimental data to
theoretical predictions.
(Diagram: experimental data -> input model ->
deterministic simulator, wrapped by a mathematical
framework / stochastic wrapper -> stochastic analysis)
Desired features of the input model: preferably
low-dimensional, the ability to dynamically
incorporate data, and the possibility to provide
error bounds.
What is given? Possible reconstructions of the
distribution.
6
GENERATING VIABLE STOCHASTIC INPUT MODELS
Recent efforts convert experimental data into
stochastic models by converting experimental
data/statistics into probability distributions.
These approaches are highly application specific,
and require expert knowledge in assigning
probabilities and heuristic parameter fitting.
Motivation: encode experimentally available data
directly into a low-dimensional continuous space.
First investigations into constructing data-driven
reduced order representations of topological/
material/property distributions utilized a Proper
Orthogonal Decomposition (POD/PCA/KLE) based
approach.
  • C. Desceliers, R. Ghanem, C. Soize, Maximum
    likelihood estimation of stochastic chaos
    representations from experimental data, IJNME 66
    (2006) 978-1001.
  • L. Guadagnini, A. Guadagnini, D. M. Tartakovsky,
    Probabilistic reconstruction of geologic facies,
    J. Hydrol. 294 (2004) 57-67.
  • B. Ganapathysubramanian, N. Zabaras, Modelling
    diffusion in random heterogeneous media:
    Data-driven models, stochastic collocation and
    the variational multi-scale method, J. Comp.
    Physics 226 (2007) 326-353.

7
PROBLEM OF INTEREST
Interested in modeling diffusion through
heterogeneous random media.
Aim: To develop a procedure to predict statistics
of properties of heterogeneous materials
undergoing diffusion-based transport.
Account for the uncertainties in the topology of
the heterogeneous media.
  • What is given?
  • Realistically speaking, one usually has access to
    a few experimental 3D or 2D images of the
    microstructure. Statistics of the heterogeneous
    microstructure can then be extracted from these
    images.
  • This is our starting point.

8
DATA TO CONSTRUCT INPUT MODELS
We only have a characterization of the property
variation in a finite number of regions, or a
finite number of realizations.
Consider the property variation and/or
microstructure to be a stochastic process.
Identify this stochastic process using the
experimental information available.
(Figure: 2D microstructure characterization and
tomographic characterization)
Process the data for statistical invariants of the
structure: volume fraction, 2-point correlation,
3-point correlations, etc.
All realizations of the stochastic process must
satisfy the experimental statistical relations.
These microstructures belong to a very large
(possibly infinite dimensional) space.
Convert this representation into a computationally
useful form: a finite dimensional representation.
9
FINITE DIMENSIONAL REPRESENTATION
The data extraction/reconstruction procedure gives
a set of 3D microstructures. These are samples from
the microstructural space. We need a qualitative,
functional representation of the topological
variation, and it must be finite dimensional for
this description to be useful. The necessity of
model reduction arises.

$I = I_{avg} + a_1 I_1 + a_2 I_2 + a_3 I_3 + \ldots + a_n I_n$

Represent any microstructure as a linear
combination of the microstructures or some
eigenimages.
Move the randomness from the image to the
coefficients (a_1, a_2, ..., a_n).
10
REDUCED MODEL OF THE TOPOLOGICAL VARIATIONS
Construct a descriptor from the sample images using
POD. Microstructure images (n x n x n pixels) are
represented as vectors I_i, i = 1, ..., M. The
eigenvectors of the covariance matrix are computed,
and the first N eigenimages are chosen to represent
the microstructures.
Represent any microstructure as a linear
combination of the microstructures or some
eigenimages.
Move the randomness from the image to the
coefficients (a_1, a_2, ..., a_n).
11
PROPER ORTHOGONAL DECOMPOSITION
Suppose we had a collection of data (from
experiments or simulations) for some
variable/process/parameter.
Proper Orthogonal Decomposition (POD), Principal
Component Analysis (PCA), Karhunen-Loeve
Expansion (KLE): Sirovich, Lumley, Ravindran, Ito.
PCA is mathematically defined as an orthogonal
linear transformation that transforms the data to
a new coordinate system such that the greatest
variance by any projection of the data comes to
lie on the first coordinate (called the first
principal component). PCA is theoretically the
optimal transform for a given data set in the
least-squares sense.
Is it possible to identify a basis such that this
data can be represented in the smallest possible
space? That is, find a basis such that it is
optimal for the data to be represented as a
truncated expansion in that basis.
12
PROPER ORTHOGONAL DECOMPOSITION
The data are usually collated in a matrix $X$ (one
snapshot per column) and shifted to zero mean. The
covariance matrix of this data is computed,
$C = \frac{1}{M} X X^T$,
and the eigenvalues and eigenvectors of the
covariance matrix are obtained from the eigenvalue
problem
$C \phi_i = \lambda_i \phi_i$.
The reduced description is given by the projection
onto the first few eigenvectors,
$x \approx \bar{x} + \sum_{i} a_i \phi_i$, with
$a_i = \phi_i^T (x - \bar{x})$.
This requires the computation of the covariance
matrix of the data and a subsequent eigen
decomposition, which can become computationally
demanding as the dimension of the data vectors
increases. A computationally simpler POD technique
is the method of snapshots: solve the M x M
eigenvalue problem
$\frac{1}{M} X^T X v_i = \lambda_i v_i$,
where the eigenimages are recovered as
$\phi_i = X v_i$.
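The following is a minimal NumPy sketch of the method of snapshots described on this slide. The array layout (one flattened snapshot per column), the energy-based truncation rule, and all variable names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def pod_method_of_snapshots(X, energy=0.95):
    """Method of snapshots. X: (n_pixels, M), one flattened snapshot per column."""
    Xc = X - X.mean(axis=1, keepdims=True)      # subtract the mean image
    M = Xc.shape[1]
    C = Xc.T @ Xc / M                           # M x M snapshot correlation matrix
    lam, V = np.linalg.eigh(C)                  # eigenpairs in ascending order
    lam, V = lam[::-1], V[:, ::-1]              # reorder: largest eigenvalues first
    # keep N modes so that the retained eigenvalues capture the requested energy
    N = int(np.searchsorted(np.cumsum(lam) / lam.sum(), energy)) + 1
    Phi = Xc @ V[:, :N]                         # eigenimages phi_i = X v_i
    Phi /= np.linalg.norm(Phi, axis=0)          # normalize each eigenimage
    A = Phi.T @ Xc                              # coefficients a_i for every snapshot
    return Phi, A, lam

# usage sketch: 200 microstructures stored as flattened 65^3-pixel vectors
# Phi, A, lam = pod_method_of_snapshots(np.random.rand(65**3, 200))
```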
13
REDUCED MODEL FOR THE STRUCTURE CONSTRAINTS
Let I be an arbitrary microstructure satisfying the
experimental statistical correlations. The PCA
method provides a unique representation of the
image; that is, PCA provides a function mapping
each image to its coefficient n-tuple.
The function is injective but not surjective: every
image has a unique mapping, but every n-tuple need
not define an admissible image.
Construct the subspace of allowable n-tuples.
14
CONSTRUCTING THE REDUCED SUBSPACE H
When does an image I belong to the class of
structures? It must satisfy certain conditions:
a) its volume fraction must equal the specified
volume fraction, b) the volume fraction at every
pixel must be between 0 and 1, and c) it should
satisfy the given two-point correlation. Thus the
n-tuple (a_1, a_2, ..., a_n) must further satisfy
some constraints. Enforce these constraints
sequentially.
1. Pixel based constraints
Microstructures are represented as discrete images,
and the pixels have bounds. This results in $2n^3$
inequality constraints.
15
CONSTRUCTING THE REDUCED SUBSPACE H
2. First order constraints
The microstructure must satisfy the experimental
volume fraction. This results in one linear
equality constraint on the n-tuple.
3. Second order constraints
The microstructure must satisfy the experimental
two-point correlation. This results in a set of
quadratic equality constraints on the n-tuple (a
sketch of the corresponding constraint set follows
below).
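A sketch of how the pixel-bound and volume-fraction constraints translate into linear constraints on the coefficient n-tuple under the eigenimage expansion of slide 9. The function name, array shapes, and argument names are illustrative assumptions; the second-order (two-point correlation) constraints would add quadratic equalities on top of this feasible set.

```python
import numpy as np

def coefficient_constraints(I_avg, eigenimages, target_vf):
    """Build the linear constraints on (a_1, ..., a_N) implied by
    (a) pixel bounds 0 <= I(p) <= 1 and (b) the experimental volume fraction.
    I_avg: flattened mean image, shape (P,); eigenimages: shape (P, N)."""
    P, N = eigenimages.shape

    # pixel bounds: 0 <= I_avg + E a <= 1  ->  2*P linear inequalities
    # (P = n^3 pixels, matching the 2n^3 count on the slide)
    G = np.vstack([eigenimages, -eigenimages])      # (2P, N)
    h = np.concatenate([1.0 - I_avg, I_avg])        # (2P,)

    # volume fraction: mean(I_avg + E a) = target_vf  ->  one linear equality
    c_vf = eigenimages.mean(axis=0)                 # (N,)
    b_vf = target_vf - I_avg.mean()

    # feasible coefficient set: {a : G a <= h,  c_vf . a = b_vf}
    return G, h, c_vf, b_vf
```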
16
SEQUENTIAL CONSTRUCTION OF THE SUBSPACE
Computational complexity: the pixel based
constraints and the first order constraints result
in a simple convex hull problem; enforcing the
second order constraints becomes a problem in
quadratic programming (sketched below).
Sequential construction of the subspace: first
enforce the first order statistics; on this reduced
subspace, enforce the second order statistics.
(Example for a three dimensional space: 3
eigenimages)
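Continuing the sketch above: the zeroth- and first-order constraints define a convex (linear) feasible set, and the two-point-correlation constraints turn the enforcement into a quadratic program. The SciPy-based sketch below is a hedged illustration of that two-stage idea, not the authors' solver; `corr_residual` is a hypothetical helper returning the mismatch in the two-point correlation for a candidate coefficient vector.

```python
import numpy as np
from scipy.optimize import linprog, minimize

def feasible_point(G, h, c_vf, b_vf):
    """Stage 1: find coefficients a satisfying the pixel bounds (G a <= h) and
    the volume-fraction equality (c_vf . a = b_vf) via a linear feasibility solve."""
    N = G.shape[1]
    res = linprog(c=np.zeros(N), A_ub=G, b_ub=h,
                  A_eq=c_vf[None, :], b_eq=[b_vf],
                  bounds=[(None, None)] * N)
    return res.x if res.success else None

def project_to_two_point(a0, corr_residual):
    """Stage 2: adjust a0 so that the quadratic two-point-correlation residual
    vanishes, staying as close as possible to a0 (a small QP-like problem)."""
    res = minimize(lambda a: np.sum((a - a0) ** 2), a0, method='SLSQP',
                   constraints=[{'type': 'eq', 'fun': corr_residual}])
    return res.x
```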
17
THE REDUCED MODEL
The sequential construction procedure yields a
subspace H such that all n-tuples from this space
result in acceptable microstructures.
H represents the space of coefficients that map to
allowable microstructures. Since H is a plane in
N dimensional space, we call it the material plane.
Since each of the microstructures in the material
plane satisfies all the required statistical
properties, they are equally probable. This
observation provides a way to construct the
stochastic model for the allowable microstructures
(a uniform sampling over H).
This is our reduced stochastic model of the random
topology of the microstructure class.
18
DEVELOPING INPUT STOCHASTIC MODELS
Data driven techniques encode the variability in
properties into a viable, finite dimensional
stochastic model. There have been advances using
Bayesian modeling and random domain decomposition.
The aim is to create a seamless technique that
utilizes the tools of the mature field of property/
microstructure reconstruction.
First investigations into constructing data-driven
reduced order representations of topological/
material/property distributions utilized a
Principal Component Analysis (PCA/POD/KLE) based
approach: generate 3D samples from the
microstructure space and apply PCA to them,
converting the variability of the property/
microstructure into variability of the
coefficients (a_1, a_2, ..., a_n).
Not all combinations of coefficients are allowed.
We developed a subspace reduction methodology [1]
to find the space of allowable coefficients that
reconstruct plausible microstructures.
  • B. Ganapathysubramanian, N. Zabaras, Modelling
    diffusion in random heterogeneous media:
    Data-driven models, stochastic collocation and
    the variational multi-scale method, J. Comp.
    Physics 226 (2007) 326-353.

19
INPUT STOCHASTIC MODELS: THE LINEAR APPROACH
PCA based approaches find the smallest coordinate
representation of the data, but they assume that
the data lie in a linear vector space. They are
only guaranteed to discover the true structure of
data lying on a linear subspace of the high
dimensional input space.
What is the result when the data lie in a nonlinear
space? As the number of input samples increases,
PCA based approaches tend to overestimate the
dimensionality of the reduced representation, and
they become computationally challenging.
  • Further related issues:
  • How does it generalize to other properties/
    structures? Can PCA be applied to other classes
    of microstructures, say, polycrystals?
  • How does convergence change as the amount of
    information increases? Computationally?

(Plot: number of eigenvectors required vs. number
of samples)
NONLINEAR APPROACHES TO MODEL REDUCTION: IDEAS
FROM IMAGE PROCESSING, PSYCHOLOGY
20
NONLINEAR REDUCTION: THE KEY IDEA
Consider a set of images, each 64x64 = 4096 pixels.
Each image is a point in 4096 dimensional space.
But each and every image is related (they are
pictures of the same object): the same object in
different poses. That is, all these images lie on a
unique curve (manifold) in R^4096. Can we get a
parametric representation of this curve?
Problem: Can the parameters that define this
manifold be extracted, given ONLY these images
(points in R^4096)?
Solution: Each image can be uniquely represented as
a point in 2D space (UD, LR). The strategy is based
on the manifold learning problem.
(Figure: different images of the same object with
changes in up-down (UD) and left-right (LR) poses)
21
NONLINEAR REDUCTION: EXTENSION TO INPUT MODELS
Given some experimental correlation that the
microstructure/property variation satisfies,
construct several plausible images of the
microstructure/property. Each of these images
consists of, say, n pixels, so each image is a
point in n dimensional space. But each and every
image is related. That is, all these images lie on
a unique curve (manifold) in R^n. Can a low
dimensional parameterization of this curve be
computed? The strategy is based on a variant of the
manifold learning problem.
(Figure: different microstructure realizations
satisfying some experimental correlations)
22
A FORMAL DEFINITION OF THE PROBLEM
State the problem as a parameterization problem
(also called the manifold learning problem):
Given a set of N unordered points belonging to a
manifold M embedded in a high dimensional space
R^n, find a low dimensional region A ⊂ R^d that
parameterizes M, where d << n.
Classical methods in manifold learning include
Principal Component Analysis (PCA) and
multidimensional scaling (MDS). These methods have
been shown to extract optimal mappings when the
manifold is embedded linearly or almost linearly in
the input space. In most cases of interest, the
manifold is nonlinearly embedded in the input
space, making the classical methods of dimension
reduction highly approximate. Two approaches have
been developed that can extract non-linear
structures while maintaining the computational
advantage offered by PCA [1,2].
  • J. B. Tenenbaum, V. de Silva, J. C. Langford, A
    global geometric framework for nonlinear
    dimensionality reduction, Science 290 (2000)
    2319-2323.
  • S. Roweis, L. Saul, Nonlinear dimensionality
    reduction by locally linear embedding, Science
    290 (2000) 2323-2326.

23
AN INTUITIVE PICTURE OF THE STRATEGY
The input data lie on a curved surface in a
high-dimensional space. The key is to unravel and
smooth out this curve to construct a
low-dimensional representation.
This unraveling and smoothing corresponds to a
topological transformation that preserves some
notion of the geometry of the manifold:
define the appropriate manifold and identify its
properties, then define the appropriate
transformation that results in the low-dimensional
equivalent space.
  • J. B. Tenenbaum, V. de Silva, J. C. Langford, A
    global geometric framework for nonlinear
    dimensionality reduction, Science 290 (2000)
    2319-2323.
  • S. Roweis, L. Saul, Nonlinear dimensionality
    reduction by locally linear embedding, Science
    290 (2000) 2323-2326.

24
KEY CONCEPT
  • Geometry can be preserved if the distances
    between the points are preserved: an isometric
    mapping.
  • The geometry of the manifold is reflected in the
    geodesic distance between points.
  • The first step towards a reduced representation
    is to construct the geodesic distances between
    all the sample points.

25
MATHEMATICAL ISSUES/DETAILS
1) The set of input data lies on a manifold
embedded in a high dimensional space: define the
appropriate manifold and identify its properties.
2) Identify the intrinsic dimensionality of the
manifold.
3) Construct a reliable transformation of points
on the manifold to a low dimensional surrogate
space.
26
The manifold M ⊂ R^n
Definition: By a microstructure x, we mean a
pixelized representation of a topology or property
variation (with n pixels).
Definition: Let M denote the set of microstructures
satisfying a given set of statistical relations
(the limited experimental information).
M is a subset of R^n; it is a curve (surface) in
this space. We show that, with an appropriate
choice of distance metric, M is a compact manifold.
27
Defining the distance metric
The distance measure ρ is based on how much the
microstructures vary. Two choices (both sketched
below):
A) When the experimental statistics are explicitly
known, the distance metric can be defined as the
difference in some statistical correlation between
two microstructures.
B) When only snapshots of the data are provided,
utilize the Euclidean metric as the distance
metric.
(A) is useful when the statistical difference in
the microstructures/property variations is
important. (B) is useful when we want to
incorporate the effect of rotation and scaling.
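A hedged sketch of the two metric choices. The two-point correlation estimator below is a crude, axis-averaged version that assumes binary 3D images and periodic shifts; the function names, the `max_r` cutoff, and the specific statistic are illustrative, not the metric defined in the paper.

```python
import numpy as np

def euclidean_metric(x1, x2):
    """Choice (B): plain Euclidean distance between two flattened microstructures."""
    return np.linalg.norm(x1 - x2)

def two_point_correlation(img, max_r):
    """Crude isotropic two-point correlation S2(r), averaged over the axes of a 3D
    binary image, using periodic shifts."""
    s2 = np.empty(max_r)
    for r in range(max_r):
        s2[r] = np.mean([(img * np.roll(img, r, axis=ax)).mean()
                         for ax in range(img.ndim)])
    return s2

def statistical_metric(img1, img2, max_r=16):
    """Choice (A): distance = difference in a statistical correlation of the images."""
    return np.linalg.norm(two_point_correlation(img1, max_r) -
                          two_point_correlation(img2, max_r))
```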
28
Properties of the manifold M ⊂ R^n
The key to a reasonable dimension reduction is a
good choice of the distance measure.
Any choice of function is allowable as long as it
satisfies the metric properties.
We show a sequence of properties that the manifold
satisfies:
a) (M, ρ) is a metric space.
b) (M, ρ) is bounded.
c) (M, ρ) is dense.
d) (M, ρ) is complete.
e) (M, ρ) is a compact metric space [1,2].
  • B. Ganapathysubramanian and N. Zabaras, "A
    non-linear dimension reduction methodology for
    generating data-driven stochastic input models",
    Journal of Computational Physics, Vol. 227, pp.
    6612-6637, 2008
  • J. R. Munkres, Topology, Second edition,
    Prentice-Hall, 2000.

29
Properties of the manifold M ⊂ R^n
Why do we have to show compactness?
Compactness is a very strong condition for a
manifold. In these problems, when the data satisfy
some correlations or have some structure, it is
straightforward to show.
The basic condition is that the manifold must be
able to be unraveled: it must not have holes or any
singularities. Compactness ensures these
well-behaved properties. But the strict compactness
condition can be relaxed (scope for future work).
30
Mapping a compact manifold to a low-d set
Map close points on the manifold to close points in
the low dimensional space. Map points far apart on
the manifold to points far apart in the low
dimensional space. This results in an isometric
transformation of the manifold embedded in a high
dimensional space to its low dimensional
counterpart.
31
Mapping a compact manifold to a low-d set
We have no notion of the geometry of the manifold
to start with, hence we cannot construct true
geodesic distances!
Approximate the geodesic distance using the concept
of graph distance d_G(i,j): the distance between
points far apart is computed as a sequence of small
hops. This approximation, d_G, asymptotically
matches the actual geodesic distance d_M in the
limit of a large number of samples [1,2] (Theorem
4.5 in [1]), based on results on graph
approximations to geodesics [2]. (A sketch of the
graph-distance computation follows the references
below.)
  • B. Ganapathysubramanian and N. Zabaras, "A
    non-linear dimension reduction methodology for
    generating data-driven stochastic input models",
    Journal of Computational Physics, Vol. 227, pp.
    6612-6637, 2008
  • M. Bernstein, V. de Silva, J. C. Langford,
    J. B. Tenenbaum, Graph approximations to geodesics
    on embedded manifolds, Dec 2000.
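The sketch below is the standard Isomap-style graph approximation of the geodesic distances, not necessarily the exact construction in the paper: connect each sample to its k nearest neighbours and run shortest paths on that graph. The neighbourhood size k and the Euclidean base metric are assumptions (the statistical metric from choice (A) could be substituted); if k is too small the graph may be disconnected and some distances come back infinite.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import shortest_path

def graph_geodesic_distances(samples, k=7):
    """Approximate geodesic distances d_M(i, j) by graph distances d_G(i, j).
    samples: (N, n) array, one flattened microstructure per row."""
    D = squareform(pdist(samples))              # dense pairwise Euclidean distances
    N = D.shape[0]
    knn = np.argsort(D, axis=1)[:, 1:k + 1]     # k nearest neighbours of each point
    W = np.zeros_like(D)                        # zero entries = "no edge" for csgraph
    rows = np.repeat(np.arange(N), k)
    W[rows, knn.ravel()] = D[rows, knn.ravel()] # keep only the k-NN edges
    W = np.maximum(W, W.T)                      # make the neighbourhood graph undirected
    return shortest_path(W, method='D', directed=False)   # Dijkstra: d_G matrix
```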

32
PAIRWISE DISTANCES TO LOW-D POINTS
  • Given the N unordered sample points
    (microstructures, property maps, ...)
  • Compute the geodesic distance between each pair
    of samples, d_M(i,j).
  • Given the pairwise distance matrix between N
    objects, compute the location of N points ξ_i
    in R^d such that the distance between these
    points is arbitrarily close to the given distance
    matrix d_M. This is the basic premise of the
    group of statistical methods called
    Multi-Dimensional Scaling [1] (MDS); a sketch of
    classical MDS follows the reference below.

Given N unordered samples ->
compute the pairwise geodesic distances ->
perform MDS on this distance matrix ->
N points in a low-dimensional space.
  • T. F. Cox, M. A. A. Cox, Multidimensional
    Scaling, Chapman and Hall, 1994.
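A minimal sketch of classical (Torgerson) MDS, the step Isomap applies to the geodesic distance matrix. The double-centering and eigendecomposition are the standard textbook construction; the function name and the clipping of small negative eigenvalues are illustrative choices.

```python
import numpy as np

def classical_mds(D, d):
    """Embed N objects in R^d from their pairwise distance matrix D (N x N)."""
    N = D.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centred squared distances
    lam, V = np.linalg.eigh(B)
    lam, V = lam[::-1], V[:, ::-1]               # descending eigenvalues
    pos = np.clip(lam[:d], 0.0, None)            # guard against tiny negative values
    return V[:, :d] * np.sqrt(pos)               # N x d coordinates xi_i
```

The decay of `lam` also gives one way to pick d (the eigen-spectrum criterion mentioned on the next slide); the slides instead use the MST-based geometric probability estimator described below.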

33
Choosing the dimensionality of the reduced space
Perform MDS on the geodesic matrix, i.e. perform an
eigenvalue decomposition of the (double-centered)
squared geodesic matrix. The eigenvectors
corresponding to the largest d eigenvalues, scaled
by the square roots of those eigenvalues, give the
coordinates of the N points.
The manifold has an intrinsic dimensionality. How
do we choose the correct value of d? (This is
related to issues of accuracy and computational
effort.)
One option is to choose the dimensionality based on
the eigen-spectrum, a procedure similar to that
followed in PCA.
Alternatively, estimate the dimensionality of the
manifold based on a novel geometrical probability
approach (developed by A. Hero et al.).
  • B. Ganapathysubramanian and N. Zabaras, "A
    non-linear dimension reduction methodology for
    generating data-driven stochastic input models",
    Journal of Computational Physics, Vol. 227, pp.
    6612-6637, 2008
  • J. A. Costa, A. O. Hero, Geodesic entropic graphs
    for dimension and entropy estimation in manifold
    learning, IEEE Trans. on Signal Processing, 52
    (2004) 2210-2221.

34
Choosing the dimensionality of the reduced space
The intrinsic dimension of an embedded manifold is
estimated using a novel geometrical probability
approach. This work is based on a powerful result
in geometric probability, the
Beardwood-Halton-Hammersley (BHH) theorem, where d
is linked to the rate of convergence of the length
functional of the minimal spanning tree of the
geodesic distance matrix of the unordered data
points in the high-dimensional space.
Consistent estimates of the intrinsic dimension d
of the sample set are obtained using a very simple
procedure.
35
Some basic definitions in graph theory
A graph consists of two types of elements, namely
vertices and edges. A weighted graph associates a
weight (here, the geodesic distance between the
vertices) with every edge in the graph. A tree is a
graph in which any two vertices are connected by
exactly one path. A spanning tree of a graph (with
k vertices) is a subset of k - 1 edges that form a
tree. The minimum spanning tree of a weighted graph
is a set of edges of minimum total weight which
form a spanning tree.
36
The BHH theorem and the link to the dimension
The rate of change of the length functional of the
minimal spanning tree as more points are included
is related to the dimensionality of the manifold:
$\log L_n = a \log n + b + \epsilon_n$, with
$a = (d-1)/d$, so the intrinsic dimension is
estimated as $\hat{d} = \mathrm{round}\,(1/(1-a))$
(a sketch follows below).
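A hedged sketch of this BHH/Costa-Hero style estimator: compute the geodesic-MST length over random subsets of increasing size, fit the slope of log L versus log n, and invert the (d-1)/d relation. The subset sizes, number of repeats, and final rounding are illustrative choices, not the authors' exact procedure.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def estimate_intrinsic_dimension(D_G, n_sizes=10, n_repeats=5, seed=0):
    """Estimate d from the growth rate of the geodesic-MST length functional:
    L_n ~ n^((d-1)/d), so the log-log slope a gives d = 1 / (1 - a)."""
    rng = np.random.default_rng(seed)
    N = D_G.shape[0]
    sizes = np.linspace(N // 4, N, n_sizes, dtype=int)
    log_L = []
    for n in sizes:
        lengths = []
        for _ in range(n_repeats):                    # average over random subsets
            idx = rng.choice(N, size=n, replace=False)
            lengths.append(minimum_spanning_tree(D_G[np.ix_(idx, idx)]).sum())
        log_L.append(np.log(np.mean(lengths)))
    a, _ = np.polyfit(np.log(sizes), log_L, 1)        # slope of the log-log fit
    a = np.clip(a, 0.0, 1.0 - 1e-6)
    return max(1, int(round(1.0 / (1.0 - a))))

# e.g. d_hat = estimate_intrinsic_dimension(D_G)  # D_G from the geodesic-distance step
```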
37
THE REDUCED ORDER STOCHASTIC MODEL
Given N unordered samples in M ⊂ R^n, the procedure
results in N points in a low-dimensional space
A ⊂ R^d. The geodesic-distance + MDS step (the
Isomap algorithm [1]) results in a low-dimensional
convex, connected region [2], A ⊂ R^d. Using the N
samples, the reduced space A is constructed as the
convex region spanned by the N reduced-coordinate
points.
A serves as the surrogate space for M: we access
the variability in M by sampling over A. BUT we
have so far only constructed the M -> A map; we
also need the A -> M map.
  • J. B. Tenenbaum, V. de Silva, J. C. Langford, A
    global geometric framework for nonlinear
    dimensionality reduction, Science 290 (2000)
    2319-2323.
  • B. Ganapathysubramanian and N. Zabaras, "A
    non-linear dimension reduction methodology for
    generating data-driven stochastic input models",
    Journal of Computational Physics, Vol. 227, pp.
    6612-6637, 2008.

38
THE REDUCED ORDER STOCHASTIC MODEL
We only have the N pairs (ξ_i, x_i) to construct
the A -> M map. There are various possibilities
based on the specific problem at hand, but one has
to be conscious of computational effort and
efficiency. Three such possibilities are listed
below (a code sketch of the first two follows the
reference); error bounds can be computed [1].
1. Nearest neighbor map (A ⊂ R^d -> M ⊂ R^n)
2. Local linear interpolation (A ⊂ R^d -> M ⊂ R^n)
3. Local linear interpolation with projection
   (A ⊂ R^d -> M ⊂ R^n)
1. B. Ganapathysubramanian and N. Zabaras, "A
non-linear dimension reduction methodology for
generating data-driven stochastic input models",
Journal of Computational Physics, Vol. 227, pp.
6612-6637, 2008.
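The sketch below illustrates options 1 and 2. "Local linear interpolation" is rendered here as a simple inverse-distance-weighted average of the k nearest samples, which is only one possible reading of the slide; option 3 would additionally project the interpolated image back onto the set of admissible microstructures (e.g., re-impose the volume fraction and pixel bounds). Function names and parameters are illustrative.

```python
import numpy as np

def nearest_neighbor_map(xi, Xi, X):
    """Option 1: return the sample whose reduced coordinate is closest to xi.
    Xi: (N, d) reduced coordinates, X: (N, n) flattened microstructures."""
    return X[np.argmin(np.linalg.norm(Xi - xi, axis=1))]

def local_linear_map(xi, Xi, X, k=8):
    """Option 2 (one variant): inverse-distance-weighted average of the k samples
    whose reduced coordinates are nearest to xi."""
    dist = np.linalg.norm(Xi - xi, axis=1)
    idx = np.argsort(dist)[:k]
    w = 1.0 / (dist[idx] + 1e-12)
    w /= w.sum()
    return w @ X[idx]          # (n,) interpolated microstructure
```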
39
THE LOW DIMENSIONAL STOCHASTIC MODEL
  • The algorithm consists of two parts.
  • 1) Compute the low dimensional representation of
    a set of N unordered sample points belonging to a
    high dimensional space: given N unordered
    samples, compute the pairwise geodesic distances
    and perform MDS on this distance matrix to obtain
    N points in a low dimensional space A ⊂ R^d.
  • 2) For using this model in a stochastic
    collocation framework, we must sample points in
    A. For an arbitrary point ξ ∈ A we must find the
    corresponding point x ∈ M, i.e. compute the
    mapping A -> M ⊂ R^n (a simple sampling sketch
    follows below).
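A small usage sketch of part 2: draw points inside the convex region A spanned by the reduced coordinates and push them back to microstructures with the map from the previous sketch. Random convex combinations are only a simple stand-in for the sampling step; in the framework described here the points would come from a sparse-grid collocation rule instead.

```python
import numpy as np

def sample_reduced_space(Xi, n_samples, seed=0):
    """Draw points inside the convex hull of the reduced coordinates Xi (N x d) by
    taking random convex combinations of the sample coordinates."""
    rng = np.random.default_rng(seed)
    w = rng.dirichlet(np.ones(Xi.shape[0]), size=n_samples)   # convex weights
    return w @ Xi                                              # (n_samples, d) points in A

# usage sketch:
# xi_points = sample_reduced_space(Xi, 100)
# microstructures = [local_linear_map(xi, Xi, X) for xi in xi_points]
```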
40
THE NONLINEAR MODEL REDUCTION ALGORITHM
Input: experimental information, gappy data.
This is a good alternative to linear strategies.
Sequential assimilation of data is now possible,
and a link with statistical learning offers
improved performance.
Issues: 1) generating an appropriate basis;
2) investigating low-dimensional mappings for the
output.
41
NUMERICAL EXAMPLE
Given an experimental image of a two-phase
metal-metal composite (a silver-tungsten
composite), find the variability in temperature
arising due to the uncertainty in the knowledge of
the exact 3D material distribution of the specified
microstructure.
Problem strategy: extract pertinent statistical
information from the experimental image;
reconstruct a dataset of plausible 3D
microstructures; construct a low dimensional
parametrization of this space of microstructures;
solve the SPDE for temperature evolution using this
input model in a stochastic collocation framework.
(Boundary conditions: T = -0.5 and T = 0.5)
1. S. Umekawa, R. Kotfila, O. D. Sherby, Elastic
properties of a tungsten-silver composite above and
below the melting point of silver, J. Mech. Phys.
Solids 13 (1965) 229-230.
42
TWO PHASE MATERIAL
(Figure: experimental image, experimental
statistics, GRF statistics, and realizations of the
3D microstructure)
43
NONLINEAR DIMENSION REDUCTION
The developments detailed above are applied to find
a low dimensional representation of these 1000
microstructure samples. The optimal representation
of these points was a 9 dimensional region.
We are able to show that these points in 9D space
form a convex region in R^9. This convex region now
represents the low dimensional stochastic input
space. Use sparse grid collocation strategies to
sample this space.
44
COMPUTATIONAL DETAILS
The stochastic solution is constructed through
sparse grid collocation (a level 5 interpolation
scheme is used). Number of deterministic problems
solved: 26017.
Computational domain of each deterministic problem:
65x65x65 pixels.
Total number of degrees of freedom:
65^3 x 26017 ≈ 7x10^9.
Computational platform: 50 nodes of the local Linux
cluster (2 x 3.2 GHz per node). Total time: 210
minutes. (A sketch of the collocation loop follows
below.)
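The following is a hedged sketch of the non-intrusive collocation loop implied by these details, not the authors' code: the nodes and weights are assumed to come from a sparse-grid rule, `reduced_to_image` stands for the A -> M map (e.g., `local_linear_map` above), and `solve_deterministic` is a hypothetical wrapper around the existing deterministic diffusion solver.

```python
import numpy as np

def collocation_statistics(nodes, weights, reduced_to_image, solve_deterministic):
    """For each collocation point xi in the reduced space A, reconstruct a
    microstructure, run the deterministic solver, and accumulate weighted moments."""
    mean = None
    second = None
    for xi, w in zip(nodes, weights):
        k_field = reduced_to_image(xi)        # e.g. local_linear_map(xi, Xi, X)
        T = solve_deterministic(k_field)      # one deterministic diffusion solve
        mean = w * T if mean is None else mean + w * T
        second = w * T**2 if second is None else second + w * T**2
    var = second - mean**2                    # second moment minus squared mean
    return mean, var
```

Each deterministic solve is independent, which is what makes the approach embarrassingly parallel across the cluster nodes mentioned above.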
45
MEAN TEMPERATURE PROFILE
(Figure: (a) temperature contour, (b-d) temperature
isocontours, (e-g) temperature slices)
46
HIGHER ORDER TEMPERATURE STATISTICS
(Figure: (a) temperature contour, (b) temperature
isocontours, (c) PDF of temperature, (d-f)
temperature slices)
47
EFFECT OF UNCERTAINTIES IN FLOW
Utilize the stochastic analysis framework to
analyze an interesting problem in flow through
heterogeneous random media.
Motivation: discrete permeability measurements with
some amount of randomness.
Question: What is more important, the large scale
or the fine scale structure?
State of the art: deterministic multiscaling
(Aarnes 2004, Juanes 2006, Arbogast 2007);
stochastic analysis limited to estimating moments
(Zhang 2002, Tartakovsky 2004).
48
EFFECT OF UNCERTAINTIES IN FLOW
Experimental data of the semi-variogram of the
permeability distribution is known. There is an
impermeable fault across the field, and there is
uncertainty in the location of the fault line as
well as in the permeability distribution.
(Figure: permeability field, 200 ft across; source:
Bureau of Economic Geology)
Construct a model for the fine-scale permeability
variation from the experimental correlations
determined. Assume a 10% error in the location and
characterization of the fault.
49
UNCERTAINTY IN FINE SCALE PERMEABILITY
Extract statistics from the experimental data and
represent the permeability variation as a
stochastic field. 25 dimensions are needed to
accurately represent the permeability.
The permeability is given on 100x100 grid blocks;
the coarse-scale representation uses 10x10
elements.
(Figure: log permeability)
50
UNCERTAINTY IN FINE SCALE PERMEABILITY
Compare with Monte Carlo results: 1x10^6
realizations for the statistics, 900 hours on 120
processors.
Assume that the fracture location is known.
Upscaled to 320 variables for the stochastic
analysis. Effective dof in a spectral setting:
4.36x10^9. 96 hours on 120 processors of the Linux
cluster. Conventional sparse grid: 2.9x10^9
samples.
(Figure: mean and variance of pressure and
velocity; interpolation errors e(p) and e(v))
51
UNCERTAINTY IN FINE SCALE PERMEABILITY
Uncertainty in the fine-scale permeability causes
diffusion everywhere, most prominently near the
source and the sink. A 10% uncertainty produces
localized regions of larger magnitude.
(Figure: mean and variance of pressure, mean and
variance of velocity, and the PDF at (180, 20))
52
UNCERTAINTY IN FAULT LOCATION
Investigate the uncertainties caused by long-range
permeability features: a 10% uncertainty in the
location and characteristics of the fault,
described by 4 uniform random variables. 4 hours on
160 processors of the Linux cluster.
Some of the dimensions are more important, so
adaptivity is very efficient; a conventional sparse
grid would require 8000 samples.
(Figure: adaptive grid; mean and variance of
pressure and velocity; interpolation errors e(p)
and e(v))
53
UNCERTAINTY IN FAULT LOCATION
Large deviations occur near the fault region. The
probability distribution is peaked at a few
locations, and there is evidence of mode shifts for
small variations in the fault location. Find the
sensitive dimension.
(Figure: mean pressure, mean velocity, variance of
velocity, and PDFs at (120, 40) and (100, 60))
54
UNCERTAINTY IN FAULT LOCATION
The most sensitive variable is the fault location,
and mode shifts are apparent: a 1% change in
location results in a substantial reorganization of
the flow. This is completely resolved by the coarse
grid. Large-scale features are critically more
important than smaller-scale features.
(Figure: difference between two realizations 0.01
apart; PDF at (120, 40); variance of velocity)
55
EFFECT OF LOCALIZED UNCERTAINTIES
Effect of localized uncertainties: the permeability
in one block of the domain is uncertain, with a
variation of 5 orders of magnitude and only limited
information about the uncertain block.
56
DATA DRIVEN MODELS
Given 1000 snapshots of plausible permeability
distributions (using reconstruction), assume they
share some statistical features. Construct a
reduced-order model that encodes this without
knowing the features a priori. The snapshots lie on
a manifold: unravel the manifold, compute the
pair-wise distances, and estimate the optimal
dimensionality. Dimensionality estimate: 5.
(Figure: log permeability snapshots)
57
UNCERTAINTY IN FINE SCALE PERMEABILITY
(Figure: mean and deviation of pressure, mean and
deviation of velocity)
58
UNCERTAINTY IN FINE SCALE PERMEABILITY
PDFs of pressure and flux at two points: the PDFs
have large support on the domain with uncertainty.
Pressure is more sensitive to the uncertainty than
flux.
59
CONCLUSIONS
  • Developed an efficient data-driven non-linear
    model reduction technique for converting
    experimental statistics into viable stochastic
    input models.
  • Seamlessly meshes with any reconstruction method.
  • Showcased the framework by constructing a reduced
    model of the topology of a two-phase material
    given limited statistical data.
  • This methodology has significant applications to
    problems where working in high dimensional spaces
    is computationally intractable: visualizing
    property evolution, process-property maps,
    searching and contouring.

RELATED PUBLICATIONS
  • B. Ganapathysubramanian and N. Zabaras, "A
    non-linear dimension reduction methodology for
    generating data-driven stochastic input models",
    Journal of Computational Physics, Vol. 227, pp.
    6612-6637, 2008.
  • B. Ganapathysubramanian and N. Zabaras,
    "Modelling diffusion in random heterogeneous
    media: Data-driven models, stochastic collocation
    and the variational multi-scale method", Journal
    of Computational Physics, Vol. 226, pp. 326-353,
    2007.
  • B. Ganapathysubramanian and N. Zabaras, "A
    stochastic multiscale framework for modeling flow
    through heterogeneous porous media", Journal of
    Computational Physics, under review.