Data Analysis - PowerPoint PPT Presentation

About This Presentation
Title:

Data Analysis

Description:

Data Analysis Part1: The Initial Questions of the AFCS Madhu Natarajan, Rama Ranganathan AFCS Annual Meeting 2003 The first five dimensions: the reduced calcium ... – PowerPoint PPT presentation

Number of Views:21
Avg rating:3.0/5.0
Date added: 9 March 2020
Slides: 44
Provided by: RRanga2
Transcript and Presenter's Notes

Title: Data Analysis


1
Data Analysis Part 1: The Initial Questions of the AFCS
Madhu Natarajan, Rama Ranganathan
AFCS Annual Meeting 2003
2
Data Analysis: The Initial Questions of the AFCS
  • What are the goals of data analysis right now?
    Our first general question: how complex is
    signaling in cells?

4
Data Analysis: The Initial Questions of the AFCS
  • What are the goals of the analysis?
  • (1) Quantitative measurement of the similarity
    (or dissimilarity) of the responses to different
    ligands.
  • (2) Quantitative evaluation of the interactions
    between pairs of ligand responses, and an
    estimation of total interaction density (the
    subject of the next talk).

5
Data Analysis: The Initial Questions of the AFCS
  • What are the goals of the analysis?
  • Quantitative measurement of the similarity (or
    dissimilarity) of the responses to different
    ligands. This experiment is designed to provide
    a response pattern for a ligand sampled at
    several points in the signaling network. It may
    or may not provide much information about
    specific mechanism.

7
Data Analysis: The Initial Questions of the AFCS
  • What are the goals of the analysis?

[Figure: output variables: calcium time points, cAMP time points, ...]
  • (1) Quantitative measurement of the similarity
    (or dissimilarity) of the responses to different
    ligands. Issues here:
  • A way of combining all the multivariate output
    data into general parameters that represent
    signaling.
  • A way of collapsing the non-independent
    outputs: how many independent variables are there
    in a calcium trace?

8
Data Analysis: The Initial Questions of the AFCS
  • What are the goals of the analysis?

[Figure: output variables: calcium time points, cAMP time points, ...]
  • (1) Quantitative measurement of the similarity
    (or dissimilarity) of the responses to different
    ligands. Issues here:
  • A way of combining all the multivariate output
    data into general parameters that represent
    signaling.
  • A way of collapsing the non-independent outputs.
  • A formalism for calculating the similarity of
    ligand responses.

9
Quantitative measurement of similarity in ligand
screen data
  • Merging different types of data. How can we put
    all the experimental variables on a common scale?
    If we do that, then how do we create a sensible
    representation of the complete dataset for each
    ligand?

11
Quantitative measurement of similarity in ligand
screen data
  • Merging different types of data. How can we put
    all the experimental variables on a common scale
    and then create a unified representation of the
    dataset for each ligand?
  • One approach is to make a Gaussian error model
    for the unstimulated value of each variable.
    Then convert each variable for a ligand into the
    statistical significance of observing the value
    given the unstimulated value and error model.

[Figure: Gaussian error model around the basal value, with standard deviation σ and the observed value marked]
So, we define a parameter S (for significance or
signaling).
13
Quantitative measurement of similarity in ligand
screen data
  • Merging different types of data. How can we put
    all the experimental variables on a common scale
    and then create a unified representation of the
    dataset for each ligand?
  • One approach is to make a Gaussian error model
    for the unstimulated value of each variable. Then
    convert each variable for a ligand into the
    statistical significance of observing the value
    given the unstimulated value and error model.

[Figure: error model with σ = 1.4, observed value 3.8, basal value 1.5]
So for an observed value of 3.8 given a basal
value of 1.5 and a standard deviation of 0.7, you
get an S value of (3.8 - 1.5) / 0.7 = 3.29. But if
the standard deviation were 1.4, then S would be
only about 1.64.
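As a rough illustration of the arithmetic on this slide, the sketch below assumes that S is the plain z-score of the observed value against the basal mean, S = (observed - basal) / σ. That reading reproduces the 3.29 quoted for σ = 0.7, but it is an assumption about the definition; the function name `s_value` is hypothetical and does not come from the talk.

```python
def s_value(observed: float, basal: float, sigma: float) -> float:
    """Express an observed value in units of the basal standard deviation.

    Assumption: S is the z-score of the observation under a Gaussian
    error model for the unstimulated (basal) value.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (observed - basal) / sigma


# The two cases discussed on the slide.
s_tight = s_value(3.8, 1.5, 0.7)
s_wide = s_value(3.8, 1.5, 1.4)
print(round(s_tight, 2))  # 3.29
print(round(s_wide, 2))   # 1.64
```

Under this assumption, doubling the basal standard deviation halves S, so the same raw change counts as a much weaker signal when the unstimulated value is noisier.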
15
Quantitative measurement of similarity in ligand
screen data
  • Merging different types of data. How can we put
    all the experimental variables on a common scale
    and then create a unified representation of the
    dataset for each ligand?

[Figure: S = 3.29]
Our observed variable and basal value get
transformed into these new units of statistical
significance. Why do this? Every data element
we collect (regardless of type, time scale, or
method of collection) can now be put on a common
basis for comparison, clustering, etc. The only
assumption is that the basal value is normally
distributed around its mean.
16
Quantitative measurement of similarity in ligand
screen data
  • Merging different types of data. How can we put
    all the experimental variables on a common scale
    and then create a unified representation of the
    dataset for each ligand?

[Figure: S = 3.29]
Our observed variable and basal value get
transformed into these new units of statistical
significance. Why do this? Every data element
we collect (regardless of type, time scale, or
method of collection) can now be put on a common
basis for comparison, clustering, etc. It also
provides a suitable measure for talking about the
interaction of two ligands: the additivity of S.
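One possible reading of "the additivity of S" is that two non-interacting ligands applied together would give an S vector equal to the sum of their individual S vectors, so any deviation from that sum hints at interaction. The sketch below follows that reading only. The function name `additivity_deviation` and all numbers are invented for illustration, and the talk does not spell out this formula.

```python
from typing import List, Sequence


def additivity_deviation(
    s_a: Sequence[float], s_b: Sequence[float], s_ab: Sequence[float]
) -> List[float]:
    """Per-dimension difference between the combined response and the sum
    of the two single-ligand responses (zero means purely additive)."""
    if not (len(s_a) == len(s_b) == len(s_ab)):
        raise ValueError("all S vectors must have the same length")
    return [ab - (a + b) for a, b, ab in zip(s_a, s_b, s_ab)]


# Hypothetical S values in two dimensions, invented for illustration.
s_ligand_a = [2.0, 0.5]
s_ligand_b = [1.0, 0.5]
s_both = [3.0, 4.0]

deviation = additivity_deviation(s_ligand_a, s_ligand_b, s_both)
print(deviation)  # [0.0, 3.0]
```

In this made-up case the first dimension is additive, while the second shows a combined response well above the sum of the single-ligand responses.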
17
Quantitative measurement of similarity in ligand
screen data
  • Merging different types of data. How can we put
    all the experimental variables on a common scale
    and then create a unified representation of the
    dataset for each ligand?

[Figure: S = 3.29]
Now, what about all the experimental variables?
18
Quantitative measurement of similarity in ligand
screen data
The Experiment Space
A highly multi-dimensional space, but one that
behaves just like three-dimensional space. Each
variable gets an independent dimension, and so a
complete single ligand dataset is one vector in
this space.
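A minimal sketch of how one ligand's dataset could become a single vector in this space, assuming each variable is converted with S = (observed - basal mean) / basal standard deviation. The variable names, basal statistics, and observed values below are all hypothetical and invented for illustration.

```python
from typing import Dict, List, Tuple

# (basal mean, basal standard deviation) for each measured variable.
# Hypothetical numbers: they are not taken from the AFCS data.
BASAL: Dict[str, Tuple[float, float]] = {
    "calcium_t1": (1.5, 0.7),
    "calcium_t2": (1.5, 0.7),
    "camp_t1": (10.0, 2.0),
}


def to_s_vector(
    observed: Dict[str, float], basal: Dict[str, Tuple[float, float]]
) -> List[float]:
    """Convert one ligand's raw measurements into an S vector.

    Dimensions are ordered by sorted variable name so that every ligand's
    vector uses the same axis order.
    """
    vector = []
    for name in sorted(basal):
        mean, sigma = basal[name]
        vector.append((observed[name] - mean) / sigma)
    return vector


ligand_raw = {"calcium_t1": 3.8, "calcium_t2": 1.5, "camp_t1": 14.0}
ligand_s = to_s_vector(ligand_raw, BASAL)
print([round(v, 2) for v in ligand_s])  # [3.29, 0.0, 2.0]
```

Fixing the axis order matters more than the particular order chosen: vectors for different ligands are only comparable if the same dimension always holds the same variable.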
24
Quantitative measurement of similarity in ligand
screen data
The Experiment Space
  • What can we learn from this representation?
  • The response profile for each ligand is the final
    S vector.
  • Differences between ligand responses have a
    natural meaning, and this preserves the dimensions
    along which the differences occur.
  • So let's start constructing the experiment space
    for the B cell data.

[Figure: difference vector ΔS1,2 between the S vectors of ligands 1 and 2]
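The difference between two ligand responses can be sketched as a vector subtraction. The sign convention (ligand 1 minus ligand 2) and the use of Euclidean length as an overall dissimilarity are assumptions made here for illustration; the slides only say that the space behaves like ordinary three-dimensional space. The S vectors are invented.

```python
import math
from typing import List, Sequence


def delta_s(s1: Sequence[float], s2: Sequence[float]) -> List[float]:
    """Difference vector between two S vectors, one entry per dimension."""
    if len(s1) != len(s2):
        raise ValueError("S vectors must have the same length")
    return [a - b for a, b in zip(s1, s2)]


def dissimilarity(s1: Sequence[float], s2: Sequence[float]) -> float:
    """Euclidean length of the difference vector (an assumed metric)."""
    return math.sqrt(sum(d * d for d in delta_s(s1, s2)))


# Hypothetical S vectors for two ligands in a three-dimensional space.
ligand_1 = [3.0, 0.0, 1.0]
ligand_2 = [0.0, 4.0, 1.0]

print(delta_s(ligand_1, ligand_2))        # [3.0, -4.0, 0.0]
print(dissimilarity(ligand_1, ligand_2))  # 5.0
```

The difference vector keeps the per-dimension detail (the two ligands differ in the first two dimensions and agree in the third), while the single distance number discards it.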
25
Converting raw data to S variables
[Figure: LPA response; fluorescence units vs. time (sec)]
26
Converting raw data to S variables
[Figure: LPA response in fluorescence units and in signaling units (S), each vs. time (sec)]
27
Converting raw data to S variables
[Figure: LPA, BLC, and AIG responses in fluorescence units and in signaling units (S), each vs. time (sec)]
28
Converting all the raw data for one experiment
type to S variables
[Figure: time axis from 0 to 600 sec]
29
Converting all the raw data for one experiment
type to S variables
[Figure: S vs. time, 0 to 600 sec]
30
Converting all the raw data for one experiment
type to S variables
[Figure: S vs. time, 0 to 600 sec]
31
Converting all the raw data for one experiment
type to S variables
[Figure: S vs. time, 0 to 600 sec]
Now, 200 separate variables for the calcium
traces is clearly idiotic.
32
Data reduction: a cluster-based approach
[Figure: time axis from 0 to 600 sec]
33
Data reduction: a cluster-based approach
[Figure: time axis from 0 to 600 sec, with clusters labeled 1 to 5]
34
The first five dimensions: the reduced calcium
response
[Figure: time axis from 0 to 600 sec, with dimensions labeled 1 to 5]
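The slides do not say which clustering algorithm was used, so the following is only a stand-in to show the idea of collapsing non-independent variables. It groups time-point variables whose values across ligands are highly correlated, then averages each group into one reduced dimension. The greedy grouping rule, the 0.9 threshold, the function names, and the data are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def cluster_variables(
    data: List[List[float]], threshold: float = 0.9
) -> List[List[int]]:
    """Greedily group variables (columns) by correlation across ligands.

    data[i][j] is the S value of variable j for ligand i. A variable joins
    the first cluster whose first member it correlates with at or above
    the threshold; otherwise it starts a new cluster.
    """
    n_vars = len(data[0])
    columns = [[row[j] for row in data] for j in range(n_vars)]
    clusters: List[List[int]] = []
    for j in range(n_vars):
        for cluster in clusters:
            if pearson(columns[cluster[0]], columns[j]) >= threshold:
                cluster.append(j)
                break
        else:
            clusters.append([j])
    return clusters


def reduce_dimensions(
    data: List[List[float]], clusters: List[List[int]]
) -> List[List[float]]:
    """Replace each cluster of variables by its mean, per ligand."""
    return [[sum(row[j] for j in c) / len(c) for c in clusters] for row in data]


# Hypothetical S values: 3 ligands (rows) by 4 time points (columns).
s_matrix = [
    [4.0, 4.2, 0.1, 0.0],
    [1.0, 1.1, 2.0, 2.1],
    [0.0, 0.1, 3.0, 3.2],
]
groups = cluster_variables(s_matrix)
reduced = reduce_dimensions(s_matrix, groups)
print(groups)  # [[0, 1], [2, 3]]
```

Here four time points collapse into two reduced dimensions, in the same spirit as the calcium traces being reduced to five dimensions. A real analysis would more likely use an established hierarchical or k-means implementation.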
35
All the dimensions (minus gene expression)
[Figure: all dimensions of the experiment space; recovered axis labels: 2.5, 5.0, 30, 15; 1 to 5; 3, 8, 20; .5, 1]
36
Clustering the experiment space
[Figure: clustering of the experiment space; recovered axis labels: 5.0, 2.5, 30, 15; 1 to 5; .5, 1; 3, 8, 20]
42
  • Conclusions
  • A simple transformation of raw data variables
    into dimensionless S variables (units of
    significance) permits construction of a unified
    experiment space of all data, regardless of
    source or differences in intrinsic dynamic range
    and signal-to-noise ratio.
  • A potentially serious danger is
    over-parameterization: the use of many
    non-independent variables to represent a
    biological process (say, the inactivation of a
    calcium response). We suggest that this can be
    addressed by clustering variables over many
    ligand responses.
  • 14 out of 32 ligands applied to the B cell showed
    some significant response in at least one of the
    54 experiment space dimensions.
  • Of the 14 with measurable responses, we discern 8
    distinct patterns of response.
  • The gene expression dataset will be integrated
    into the experiment space as soon as we clearly
    understand how to identify the gene clusters that
    should be collapsed into experiment space
    dimensions.
  • What predictions seem reasonable for the double
    ligand screen?
  • Combinations of ligands that show similar
    response patterns might be expected to show
    interaction.
  • Combinations of ligands that are very different
    in response might show less or no interaction.
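The responder count in the conclusions (ligands with a significant response in at least one dimension) can be sketched as a simple threshold test on each S vector. The cutoff of 3.0, the function name `responders`, the ligand names, and the values are assumptions for illustration; the slides do not state what cutoff was used.

```python
from typing import Dict, List, Sequence


def responders(
    s_vectors: Dict[str, Sequence[float]], threshold: float = 3.0
) -> List[str]:
    """Names of ligands whose |S| reaches the threshold in any dimension."""
    return sorted(
        name
        for name, vector in s_vectors.items()
        if any(abs(value) >= threshold for value in vector)
    )


# Hypothetical S vectors in a two-dimensional experiment space.
screen = {
    "ligand_1": [0.2, 3.5],
    "ligand_2": [0.1, -0.4],
    "ligand_3": [-4.0, 0.0],
}

hits = responders(screen)
print(hits)  # ['ligand_1', 'ligand_3']
```

Taking the absolute value counts both increases and decreases relative to basal as responses, which is a choice made here rather than something stated in the talk.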

43
Acknowledgements: Madhu Natarajan, Paul Sternweis,
Elliott Ross, Mel Simon, Al Gilman