
BAYESIAN NETWORKS IN MODEL AND DATA INTEGRATION AND DECISION MAKING IN RIVER BASIN MANAGEMENT

Consideration of opportunities for using Bayes networks in predictive water quality modelling

Olli Malve (M. Sc.), Water Resources Research and Management

Citations from:

Ames, D.P. & Neilson, B. (Utah Water Resources Laboratory) 2001. Bayesian Decision Networks for total maximum daily load analysis: East Canyon Creek Case Study (WWW document).

Reckhow, K.H. (UNC Water Resources Research Institute, North Carolina State University, Raleigh, USA) 1999. Water quality prediction and probability network models: probability network model for nitrogen enrichment and algal blooms in the Neuse River. Can. J. Fish. Aquat. Sci. 56: 1150-1158 (1999).

See also http://www2.ncsu.edu/ncsu/CIL/WRRI/ken's_page.html (home page of K. Reckhow) and http://www.epa.gov/OWOW/tmdl/ (Total Maximum Daily Load (TMDL) program).

Bayes network for discrete variables, implemented with Hugin software

- Does not include a real Bayesian update of parameters with new data.
- There are several other statistical and computational methods and software packages. One of the best, OpenBUGS for continuous variables, was used in hierarchical modelling of Finnish lakes.
- Resembles structural equation models.
- Both belong to the family of graphical probabilistic models.

Hierarchical linear chlorophyll a model: DAG diagram

[Figure: DAG with nodes β, σ², βi, σ²i, τ, βij, yijk, xijk]

Structural equation model

LAKE PYHÄJÄRVI in SÄKYLÄ research model

Planktiv = planktivorous fish, Z = zooplankton (Crustacea), A3 = Cyanobacteria, TP = total phosphorus, TN = total nitrogen

PHYSICAL WAY OF THINKING

- Hydraulic routing of ground and surface water flow in the drainage basin, in river channels, in lakes and in estuaries.
- Drainage basin, river, lake and estuary are linked with hydraulic principles.
- High spatial and temporal resolution.

STATISTICAL INFERENCE

Small-scale transport and transformation processes of pollutants in the drainage basin are summarized with probabilistic expressions that characterize the aggregate response of interest to the decision makers.

Outcomes expressed as probabilities are an acknowledgement of the lack of precision in predictive models.

BAYES NETWORKS

Formally, Bayes networks (BNTs) are directed acyclic graphs in which each node represents a random variable, or uncertain quantity, which can take two or more possible values.

Each node represents a multi-valued variable, comprising a collection of mutually exclusive hypotheses (state of a lake: Oligotrophic, Mesotrophic, Eutrophic) or observations (nutrient loading: Low, Medium, High).

The arcs signify the existence of direct causal influence between the linked variables, and the strength of these influences is quantified by conditional probabilities.

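Conditional probability: each direct link X → Y between discrete variables is quantified by a fixed conditional probability matrix M, in which the (x, y) entry is given by

\[
M_{y|x} \equiv P(y \mid x) \equiv P(Y = y \mid X = x)
\]

\[
M =
\begin{pmatrix}
P(y_1 \mid x_1) & P(y_2 \mid x_1) & \cdots & P(y_n \mid x_1) \\
P(y_1 \mid x_2) & P(y_2 \mid x_2) & \cdots & P(y_n \mid x_2) \\
\vdots & \vdots & \ddots & \vdots \\
P(y_1 \mid x_m) & P(y_2 \mid x_m) & \cdots & P(y_n \mid x_m)
\end{pmatrix}
\]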

QUANTIFYING THE LINKS

Bayes learning of the Conditional Probability Matrix (CPM) from:

1. Observational data

- Simultaneous observations of each variable are tabulated, sorted by the parent variables and converted into categories as prescribed in the node definitions.
- For every combination of states of the parent nodes, the number of occurrences of each state of the child is counted.
- Probabilities are calculated as the number of occurrences of a child state divided by the total number of observations for that combination of parent states.
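The counting procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the Hugin implementation: the function name `learn_cpt`, the variable names and the seven records are hypothetical and are not data from the presentation.

```python
from collections import Counter, defaultdict


def learn_cpt(records, parents, child):
    """Estimate P(child | parents) as relative frequencies of categorized records."""
    counts = defaultdict(Counter)
    for rec in records:
        # Group each observation by its combination of parent states.
        key = tuple(rec[p] for p in parents)
        counts[key][rec[child]] += 1
    cpt = {}
    for key, child_counts in counts.items():
        total = sum(child_counts.values())
        # Occurrences of a child state divided by all observations for this parent combination.
        cpt[key] = {state: n / total for state, n in child_counts.items()}
    return cpt


# Hypothetical, already-categorized observations (illustrative only).
records = [
    {"loading": "High", "state": "Eutrophic"},
    {"loading": "High", "state": "Eutrophic"},
    {"loading": "High", "state": "Mesotrophic"},
    {"loading": "Low", "state": "Oligotrophic"},
    {"loading": "Low", "state": "Mesotrophic"},
    {"loading": "Low", "state": "Oligotrophic"},
    {"loading": "Low", "state": "Oligotrophic"},
]

cpt = learn_cpt(records, parents=["loading"], child="state")
for parent_states, dist in sorted(cpt.items()):
    print(parent_states, dist)
```

Child states that never occur for a parent combination are simply absent from that row here; with sparse data one would probably want to add pseudo-counts before normalizing.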

2. Parameter learning from model simulations (uncertainty analysis such as Monte Carlo simulations)

- The selected input variables are varied about an appropriate distribution, and random samples are drawn from the model parameter distributions.
- The results of the simulations at the selected output variables are tabulated with their corresponding set of input variable conditions.
- The CPM is generated from this data tabulation using the same method described above for observational data.
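A rough sketch of this simulation-based route follows. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical stand-in for a real process model, and the input distribution, the Beta(2, 2) parameter distribution and the category thresholds are invented.

```python
import random
from collections import Counter, defaultdict


def toy_model(load, retention):
    """Hypothetical stand-in for a process model: concentration left after retention."""
    return load * (1.0 - retention)


def categorize(value, threshold):
    """Convert a continuous value into the discrete node states Low / High."""
    return "High" if value >= threshold else "Low"


rng = random.Random(42)  # fixed seed so the sketch is repeatable
counts = defaultdict(Counter)
for _ in range(10_000):
    load = rng.uniform(0.0, 10.0)      # input varied over an assumed distribution
    retention = rng.betavariate(2, 2)  # uncertain model parameter, assumed Beta(2, 2)
    conc = toy_model(load, retention)
    # Tabulate the simulated output category against its input category.
    counts[categorize(load, 5.0)][categorize(conc, 2.5)] += 1

# Same relative-frequency step as for observational data.
cpm = {
    parent: {child: n / sum(c.values()) for child, n in c.items()}
    for parent, c in counts.items()
}
for parent in ("Low", "High"):
    print(parent, cpm[parent])
```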

3. Parameter learning from scientists, experts, stakeholders, costs and benefits

If data are not available and typical models are not appropriate, conditional probability tables can be generated by eliciting information from experts and stakeholders.

- In the case of cost and benefit analysis, for example, the costs associated with a wastewater treatment plant upgrade will likely need to be elicited from experts and through market inquiries.
- Benefits associated with water quality improvement (recreation, biological habitat, aesthetics and other environmental benefits) are subjective in nature and are difficult to quantify without input from local individuals, stakeholders and experts.

The probabilistic relationships described here may be more difficult to generate than those calculated from data and models.

DECISIONS AND UTILITY

A Bayesian Decision Network (BDN) is a specific form of a Bayesian network that includes decision and utility nodes and is used to model the relationship between decisions and outcomes.

Decision nodes contain discrete options instead of a probability distribution across states. A decision node can only exist in one state at a time, representing a decision or management option chosen among multiple alternatives.

Utility nodes provide a simple means for estimating the expected values of different outcomes. The expected value E of an uncertain outcome with n states (i = 1, ..., n) is computed as

\[
E = \sum_{i=1}^{n} P_i B_i ,
\]

where B_i is the benefit associated with state i and P_i is the probability of being in state i.
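As a small worked illustration of the expected-value formula, the sketch below compares two management options. The option names, probabilities, benefits and costs are hypothetical numbers chosen for the example. Subtracting the cost of an option from its expected benefit is one simple way to rank options, not necessarily the scheme used in the cited studies.

```python
def expected_value(probabilities, benefits):
    """E = sum_i P_i * B_i over the n states of an uncertain outcome."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("state probabilities must sum to 1")
    return sum(p * b for p, b in zip(probabilities, benefits))


# Benefits of three water quality states (good, moderate, poor); hypothetical units.
benefits = [100.0, 40.0, -50.0]

# Each decision option changes the state probabilities and carries a cost (hypothetical).
options = {
    "no action": {"probs": [0.2, 0.5, 0.3], "cost": 0.0},
    "upgrade plant": {"probs": [0.5, 0.4, 0.1], "cost": 20.0},
}

net_value = {
    name: expected_value(opt["probs"], benefits) - opt["cost"]
    for name, opt in options.items()
}
best = max(net_value, key=net_value.get)
print(net_value, "best option:", best)
```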

APPLICATION OF Bayes Decision Networks

1. Defining the problem
2. Integrating disparate data sources
3. Scenario generation and analysis
4. Building a Bayesian Decision Network (influence diagram)
5. Obtaining probability distributions

Decision tree

- Bayesian networks can be transformed into a decision tree.

[Figure: a Bayes net with the nodes Hot sunshine, Algal bloom (yes/no), Go swimming (yes/no) and Get ill (yes/no), shown beside the corresponding decision tree, which branches on Algal bloom and Go swimming and ends in the outcomes Get ill or Feeling well; branch probabilities shown: 0.7, 0.3, 0.1, 0.9]

LIST OF REFERENCES

Varis, O. (1990 onwards)

1. Restoration of a temperate lake
2. Fisheries management in a tropical reservoir
3. Real-time monitoring system for a river
4. Rehabilitation of fisheries in a temperate river
5. Cod fisheries management
6. Salmon fisheries management
7. Cost-effective wastewater treatment for a river
8. A nationwide climatic change impact assessment

Ames, D.P. & Neilson, B. (Utah Water Resources Laboratory) 2001. Bayesian Decision Networks for total maximum daily load analysis: East Canyon Creek Case Study.

Reckhow, K.H. (UNC Water Resources Research Institute, North Carolina State University, Raleigh, USA) 1999. Water quality prediction and probability network models: probability network model for nitrogen enrichment and algal blooms in the Neuse River. Can. J. Fish. Aquat. Sci. 56: 1150-1158 (1999).

SUMMARY

Bayesian Decision Networks provide a successful way to make educated decisions. A BDN is simple enough for stakeholder involvement and understanding, while still containing proven and defensible science. A BDN is a tool for communication between scientists, stakeholders and decision makers.

A Bayesian Decision Network:

1. Provides a good conceptual framework for clearly defining relevant variables
2. Establishes the relationship between causes and effects in the system
3. Integrates different sources of information into a single analytic tool
4. Captures model responses for quick scenario generation and investigation
5. Quantifies risk, which can be used in establishing the margin of safety

A carefully devised and calibrated probability network model is ideally designed to communicate at the interface between scientists, stakeholders, and decision makers. By acknowledging the sometimes-substantial uncertainty in model predictions, we enhance, rather than diminish, the value of predictive modelling by focusing on the model's ability to estimate risk.

Bayesian Decision network (Influence diagram) of

Lake Säkylän Pyhäjärvi

Management scenario

Studying the effect of zooplankton and TotP-load

Studying the effect of management actions on the

costs and the attainment of water quality

standards

Conditional marginal distributions of costs, attainment of the water quality standard and Cyanobacteria (BlueGmax) summer maximum biomass, given buffer strip width (21 36 m), wetland percentage (1.1 1.25), forestation (25 31) and fish catch (3, on an artificial scale that will be replaced after expert judgement).

Water quality modelling and probability network models, with reference to Reckhow, K.H. Can. J. Fish. Aquat. Sci. 56: 1150-1158 (1999). Modelling of nitrogen enrichment and algal blooms in the Neuse River, North Carolina, USA, with Bayes nets: probabilistic prediction of eutrophication.

The initial forcing function, spring precipitation, is expressed as marginal probabilities assessed from statistics on historic precipitation data in the watershed. The distribution was segmented into three equally likely precipitation ranges (below average, average, above average).

The probabilities for percentage forested buffer reflect a judgemental assessment of the total perennial stream miles in the Neuse River watershed that would be required to have a maintained minimum-width buffer, based on the projected outcome of proposed management plans. The resultant probability estimates are given in the table.

Conditional probabilities were assessed for the four intermediate variables. Percentage of nitrogen load reduction was conditional only on the percentage of forested buffer. A scientific expert was consulted for a probabilistic statement reflecting the expected reduction in nitrogen loading due to buffers alone.

The nitrogen concentration was expressed as a function of spring precipitation and the nitrogen loading reduction. In the absence of data to fit a statistical model for these variables, nitrogen concentration was based on scientific judgement. The relationship between summer precipitation and summer streamflow was based on a statistical model developed from precipitation and streamflow data.

The conditional probabilities for the response variable algal bloom were based in part on scientific judgement (for the effect of nitrogen concentration) and in part on the interpretation of chlorophyll a versus flow data. Using the data, the chlorophyll levels were grouped into algal bloom categories, and the flow data were grouped into flow categories. The relative frequency of data points in each algal bloom / flow group determined the initial probabilities. These probabilities were further decomposed, using judgement, to account for the effect of nitrogen concentration.

Conditional probabilities for anoxia were based on judgement. These response variable conditional probabilities are presented in the table below.

The probabilities expressed on earlier pages can be combined into a joint probability on all variables, which then allows us to solve for a number of interesting variables. While all marginal and conditional probabilities can be easily calculated using the estimates, computation in larger problems is facilitated with Bayes net software.
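For a network this small, the joint probability can be built by hand and queried by brute-force enumeration, which is a reasonable mental model of what Bayes net software does more efficiently. The sketch below uses a hypothetical three-node chain (precip → flow → bloom) with invented probabilities. It is not the Neuse River network and does not reproduce the values in Reckhow's tables.

```python
from itertools import product

# Hypothetical marginal and conditional probabilities (illustrative only).
p_precip = {"below": 1 / 3, "average": 1 / 3, "above": 1 / 3}
p_flow = {  # P(flow | precip)
    "below": {"low": 0.7, "high": 0.3},
    "average": {"low": 0.4, "high": 0.6},
    "above": {"low": 0.1, "high": 0.9},
}
p_bloom = {  # P(bloom | flow)
    "low": {"yes": 0.4, "no": 0.6},
    "high": {"yes": 0.1, "no": 0.9},
}

VARS = ("precip", "flow", "bloom")

# Joint probability from the chain rule: P(precip) * P(flow | precip) * P(bloom | flow).
joint = {
    (pr, fl, bl): p_precip[pr] * p_flow[pr][fl] * p_bloom[fl][bl]
    for pr, fl, bl in product(p_precip, ("low", "high"), ("yes", "no"))
}


def matches(assignment, condition):
    """True if the full assignment agrees with every variable fixed in condition."""
    return all(assignment[VARS.index(var)] == val for var, val in condition.items())


def prob(query, evidence=None):
    """P(query | evidence) by summing matching entries of the joint table."""
    evidence = evidence or {}
    denominator = sum(p for a, p in joint.items() if matches(a, evidence))
    numerator = sum(
        p for a, p in joint.items() if matches(a, evidence) and matches(a, query)
    )
    return numerator / denominator


print(prob({"bloom": "yes"}))                       # marginal (prior) probability
print(prob({"bloom": "yes"}, {"precip": "below"}))  # conditional on observed precipitation
```

Enumeration scales exponentially with the number of nodes, which is why dedicated software becomes useful for larger networks.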

From the probabilities expressed earlier, the marginal probability of anoxia is 0.30. In Bayesian terms, this calculation reflects only prior information. If the implementation of a management option could assure that at least 95% of streams had the required buffer (p(95-100% forested buffer) = 1.0), then the anoxia probability drops slightly to 0.27. This calculation, although hypothetical, is indicative of the types of policy-related questions that can be addressed with a complete probability network model.

As another example, the probabilities presented earlier yield p(severe algal bloom) = 0.18. We can make the Bayes net more useful by combining the prior probabilities in the network with new (sample) information to produce a posterior probability, using Bayes' theorem. For example, if spring precipitation is observed to be above average, then the conditional probability becomes p(severe algal bloom | above-average spring precipitation) = 0.21. If, instead, summer stream flow is extremely low (<500 ft³/s), then p(severe algal bloom | summer flow <500) = 0.33. Both events together yield p(severe algal bloom | above-average spring precipitation, summer flow <500) = 0.37.

These types of what-if probabilistic calculations are relatively quick and easy, even with a much larger and more realistic probability network model. In addition, since Bayes' theorem allows new observational information to be combined with the prior probabilities, as more observational information is incorporated into the analysis, the often-subjective prior is dominated by the newer data-based sample evidence.

Outcomes expressed as probabilities are an acknowledgement of the lack of precision in predictive models. The probabilities, and the relative change in probabilities between scenarios, give decision makers and stakeholders an explicit characterization of risk.
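The remark that the subjective prior is gradually dominated by sample evidence can be illustrated with a conjugate Beta-Binomial update of a single bloom probability. This particular updating scheme is an assumption chosen for simplicity; the source does not say how parameters would be updated. The prior Beta(2, 8) and the 40% observed bloom frequency are hypothetical.

```python
def posterior_mean(prior_a, prior_b, blooms, years):
    """Posterior mean of a bloom probability with a Beta(prior_a, prior_b) prior
    after observing `blooms` bloom years out of `years` monitored years."""
    return (prior_a + blooms) / (prior_a + prior_b + years)


prior_a, prior_b = 2.0, 8.0  # hypothetical expert prior, mean 0.2
observed_rate = 0.4          # hypothetical long-run bloom frequency in the data

for years in (0, 10, 100, 1000):
    blooms = round(observed_rate * years)
    mean = posterior_mean(prior_a, prior_b, blooms, years)
    print(f"{years:5d} years of data -> posterior mean {mean:.3f}")
```

With no data the estimate equals the prior mean; as the number of monitored years grows, it moves toward the observed frequency.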

For example, is a probability of a severe algal bloom of 0.37 unacceptably high? If a management action could reduce this probability from 0.37 to 0.20, is that worthwhile, given the costs and changes in attributes? Questions like these are of interest to stakeholders and can be examined using probability network models.

The example discussed above suggests that the primary sources of information used to characterize probabilities are (1) observational/experimental evidence or data and (2) expert scientific judgement. In conventional modelling studies, observational information that is based on precise measurements of the variable or relationship of interest is likely to be the least controversial and most useful information. It is an unfortunate but common fact that observational data for the parametrization of water quality models are almost always woefully inadequate for the task.

What would be the basis for selection of a predictive model? Few will argue against the viewpoint that the model should be as simple as possible. However, it is also true that few will argue against the viewpoint that a model should be as accurate as possible, and it is likely that few will argue against the viewpoint that a model should correctly characterize processes. Unfortunately, these desirable features for models are often in conflict with one another.

Here a probability network model is recommended as a predictive model to guide Neuse River decision making because uncertainty, or accuracy, is believed to be an essential attribute for a predictive model. Does this mean that we can ignore correct process description and focus on probabilities? No! It is important to recognize that any process model can be easily incorporated into a probability network model if the accuracy of the mathematical process description can be quantified and is acceptable. For example, any (or all) of the mechanistic process descriptions in CE-QUAL-W2 can be represented in a probability network model if all relationships are expressed probabilistically. For this to happen, of course, a complete uncertainty analysis must be undertaken for the CE-QUAL-W2 process description.