Definizione e Attivazione della Rete per il Supporto Metodologico e l'Attività di Ricerca del Dipartim

1
Metodi statistici per l'integrazione di dati da più fonti
(Statistical methods for integrating data from multiple sources)
Rome, 9-10 December 2004
2
Bayesian networks and complex sampling from finite populations
  • Marco Ballin, Mauro Scanu
  • Istituto Nazionale di Statistica
  • Paola Vicard
  • Università degli Studi - Roma Tre

Prin 2002 - Unità di Perugia
3
Aim of the methodological project
  • To make Bayesian networks available in the
    framework of complex sampling designs on finite
    populations

Focus of the talk
Definition of Bayesian networks in the case of a
stratified sampling design
4
Why BNs?
Bayesian networks are a powerful tool for
managing problems with many variables.
In which contexts are BNs developed and applied?
  • Medicine and biostatistics (diagnosis)
  • Forensic statistics (identification)
  • Finance (customer segmentation and classification)
  • Troubleshooting (decision problems)
  • ...
5
Are BNs useful in official statistics?
Some experiences:
  • Imputation (Di Zio et al., ... later this afternoon)
  • Statistical matching (Istat working group on SAM,
    2004)
  • Description of census results (Getoor et al.,
    2001)
General useful characteristics of BNs:
  • Straightforward description of the statistical
    relationships among variables through a graphical
    representation
  • Fast and easy propagation of evidence
  • Useful tool for describing possible scenarios
  • Simple updating of high-dimensional distributions
    given auxiliary information

6
What is a BN?
  • Nodes = variables
  • Directed edges = directed relationships
  • Each node has an associated probability distribution given its parents

Chain rule: $P(X_1, \dots, X_k) = \prod_{i=1}^{k} P(X_i \mid X_1, \dots, X_{i-1}) = \prod_{i=1}^{k} P(X_i \mid \mathrm{parents}(X_i))$
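
As a minimal sketch of the chain rule above, the snippet below computes a joint probability as a product of one conditional distribution per node. The three-node network (A -> B, A -> C), the variable names and every probability are hypothetical and are not taken from the talk.

```python
from itertools import product

# Hypothetical three-node network A -> B, A -> C (all variables binary).
# Every probability below is invented for illustration only.
parents = {"A": (), "B": ("A",), "C": ("A",)}

# cpt[node][parent_values][value] = P(node = value | parents = parent_values)
cpt = {
    "A": {(): {0: 0.7, 1: 0.3}},
    "B": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.4, 1: 0.6}},
    "C": {(0,): {0: 0.8, 1: 0.2}, (1,): {0: 0.5, 1: 0.5}},
}


def joint_probability(assignment):
    """Return P(assignment) as the product of P(X_i | parents(X_i))."""
    prob = 1.0
    for node, node_parents in parents.items():
        parent_values = tuple(assignment[p] for p in node_parents)
        prob *= cpt[node][parent_values][assignment[node]]
    return prob


# Sanity check: the factorised joint sums to 1 over all configurations.
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in product([0, 1], repeat=len(parents))
)
```

Only the local tables are stored; the full joint table over all configurations is never built explicitly, which is what makes the representation workable with many variables.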
7
What is a BN?
A BN is defined by:
  • The set of edges (structure)
  • The conditional probability distributions
    (parameters)

Methods and software for BN estimation (structure
and parameters) are developed under the i.i.d.
assumption.
But...
8
What's the problem in using BNs in complex
surveys?
  • The sample design is not taken into consideration by
    the usual methods and software
Therefore:
  • The estimated structure and parameters are not
    consistent with those obtained through the usual
    finite population techniques
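
The inconsistency can be illustrated with a small sketch, assuming a stratified simple random sample with unequal sampling fractions. The stratum labels, population sizes and counts below are invented; the point is only that the pooled (i.i.d.-style) proportion and the usual stratum-weighted finite population estimate can differ widely.

```python
# Hypothetical stratified SRS. Each stratum maps to
# (population size N_h, sample size n_h, sampled units with the attribute).
# All figures are invented for illustration only.
strata = {
    "small farms": (9000, 100, 20),
    "large farms": (1000, 100, 80),
}

# Naive i.i.d. estimate: pool the sample and ignore the design.
n_total = sum(n for _, n, _ in strata.values())
naive = sum(hits for _, _, hits in strata.values()) / n_total

# Design-based estimate: weight each stratum proportion by W_h = N_h / N.
N_total = sum(N for N, _, _ in strata.values())
weighted = sum((N / N_total) * (hits / n) for N, n, hits in strata.values())

print(f"naive = {naive:.2f}, weighted = {weighted:.2f}")
```

Here the large farms are heavily oversampled, so the pooled proportion overstates how common the attribute is in the population.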

9
What's the problem in using BNs in complex
surveys?
Our proposal to overcome the problem:
  • Introduce an additional node S describing the
    sampling design
  • Add arrows from S to each variable
  • Learn the structure and estimate the parameters
    conditionally on S
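
One possible reading of these steps, as a rough sketch rather than the authors' implementation: treat the stratum as the node S, estimate each conditional distribution within strata, then sum S out using population stratum shares. The structure S -> X, S -> Y, X -> Y, the records and the shares W_h are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical sample records (stratum s, x, y); values are invented.
sample = [
    ("h1", 0, 0), ("h1", 0, 0), ("h1", 0, 1), ("h1", 1, 1),
    ("h2", 1, 1), ("h2", 1, 1), ("h2", 1, 0), ("h2", 0, 0),
]
# Assumption: population stratum shares W_h = N_h / N are known from the frame.
stratum_share = {"h1": 0.8, "h2": 0.2}

# Assumed structure for the sketch: S -> X, S -> Y, X -> Y.
count_s = Counter(s for s, _, _ in sample)
count_sx = Counter((s, x) for s, x, _ in sample)
count_sxy = Counter(sample)


def p_x_given_s(x, s):
    """Relative frequency of X = x within stratum s."""
    return count_sx[(s, x)] / count_s[s]


def p_y_given_xs(y, x, s):
    """Relative frequency of Y = y among units with X = x in stratum s."""
    return count_sxy[(s, x, y)] / count_sx[(s, x)]


def p_xy(x, y):
    """Marginalise S out: sum over h of W_h * P(x | h) * P(y | x, h)."""
    return sum(
        w * p_x_given_s(x, s) * p_y_given_xs(y, x, s)
        for s, w in stratum_share.items()
    )


# Pooled estimate that ignores the design, for comparison.
naive_p11 = sum(1 for _, x, y in sample if (x, y) == (1, 1)) / len(sample)
```

With these invented numbers, `p_xy(1, 1)` gives 0.3 while the pooled frequency `naive_p11` is 0.375, because stratum h2 is overrepresented in the sample relative to its population share.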

10
A toy example on real data
  • Survey: business survey on farms
  • Variables:
    multifunctionality,
    altimetry,
    Internet,
    classes of Gross Operating Margin
  • n = 989 (Tuscany)
  • Sample design: stratified (9 strata, SRS)

11
Future developments
  • Understanding the properties of the BN-based
    estimator and its analogies with calibration methods
  • Developing alternative methods to learn the
    structure in the case of a finite population and
    a complex sample design
  • Developing software suitable for the finite
    population context
  • Applying BNs in the following contexts:
  • Consistency among the results of different surveys
  • Integration of different sources
  • What happens when the data set is partially
    observed
  • Application to the imputation framework
  • Description (choosing contingency tables)

12
Metodi statistici per l'integrazione di dati da più fonti
(Statistical methods for integrating data from multiple sources)
Rome, 9-10 December 2004