Generation of patterns from gene expression by assigning confidence to differentially expressed genes - PowerPoint PPT Presentation

1
Generation of patterns from gene expression by
assigning confidence to differentially expressed
genes
  • Elisabetta Manduchi, Gregory R. Grant, Steven
    E.McKenzie, G. Christian Overton, Saul Surrey,
    Christian J. Stoeckert

Presented by Keith Betts
2
Goal
  • Provide tools to aid in the analysis of data
    collected from highly parallel gene expression
    experiments.
  • Generate descriptive and dependable expression
    patterns representing the differential expression
    of genes across cell types.

3
  • Identify those genes that are most likely to be
    differentially expressed.
  • Transform typical raw input into an easily
    interpretable list of patterns.

4
(No Transcript)
5
Patterns from Gene Expression
6
What is PaGE?
  • PaGE is free, downloadable Perl software (tested
    mainly on Unix systems) that can be used as a
    statistical test for differentially expressed
    genes between two experimental conditions, given
    replicated experiments.
  • Available at http://www.cbil.upenn.edu/PaGE/

7
Methods and Algorithm
  • Input consists of normalized data (the
    normalization procedure depends on the kind of
    experiments conducted)
  • The input normalized intensities are subjected to
    preprocessing steps.

8
Methods and Algorithm Cont.
  • In each gene tag's expression pattern there will
    be one symbol for each homotypic group (set of
    samples of the same type).
  • For each homotypic group and for each gene tag,
    compute the average intensity of that tag over
    the samples in the group that have values for
    that tag.
  • This average will represent the intensity of that
    tag at that group.
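
The averaging step on this slide can be sketched in Python as follows. This is a minimal illustration, not PaGE's own code (PaGE is written in Perl); the function name `group_averages`, the dictionary layout, and the use of `None` for a missing value are all assumptions made for the example.

```python
from statistics import mean


def group_averages(intensities, groups):
    """Average each tag's intensity over every homotypic group.

    intensities: dict mapping tag -> {sample name: value or None if missing}
    groups:      dict mapping group name -> list of sample names
    Returns a dict mapping tag -> {group name: average, or None if the
    group has no value at all for that tag}.
    """
    averages = {}
    for tag, by_sample in intensities.items():
        averages[tag] = {}
        for group, samples in groups.items():
            # Keep only the samples of this group that have a value for the tag.
            present = [by_sample[s] for s in samples
                       if by_sample.get(s) is not None]
            averages[tag][group] = mean(present) if present else None
    return averages


# Hypothetical toy data: two groups, one tag, one missing value.
groups = {"HEL": ["hel1", "hel2"], "CD34": ["cd1", "cd2"]}
intensities = {"tagA": {"hel1": 10.0, "hel2": 14.0, "cd1": 30.0, "cd2": None}}
result = group_averages(intensities, groups)
```

Here the CD34 average for `tagA` is taken over the single sample that has a value, which mirrors the rule of averaging only over samples with values for the tag.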

9
Two Stage Approach
  • First
  • Attach an ordered list of real numbers to each
    tag.
  • Second
  • Bin the numbers in this list, resulting in a
    pattern of integers.

10
First Stage
  • Fix an ordering of the groups in the collection.
  • Attach to each tag the ordered list of real
    numbers obtained by dividing each of its
    non-reference group intensities by the median of
    its group intensities.
  • List of ratios attached to the tag.
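
A small Python sketch of the first stage is given here. It assumes the per-group average intensities for one tag are already available; the function name `ratio_list` and the optional `reference` argument are illustrative choices, not part of PaGE. When `reference` is omitted, the sketch divides by the median of the tag's group intensities; when it is given, it divides by the reference group's intensity instead, which is the setup the later slides use.

```python
from statistics import median


def ratio_list(group_means, order, reference=None):
    """Return the ordered list of ratios for one gene tag.

    group_means: dict mapping group name -> average intensity of the tag
    order:       fixed ordering of the groups
    reference:   optional reference group; if None, divide by the median
                 of the tag's group intensities
    """
    if reference is None:
        denominator = median(group_means[g] for g in order)
        kept = list(order)
    else:
        denominator = group_means[reference]
        # The reference group itself does not get a ratio.
        kept = [g for g in order if g != reference]
    return [group_means[g] / denominator for g in kept]


# Hypothetical intensities for one tag in three groups.
means = {"CD34": 20.0, "ERY": 40.0, "HEL": 10.0}
order = ["CD34", "ERY", "HEL"]
to_reference = ratio_list(means, order, reference="HEL")
to_median = ratio_list(means, order)
```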

11
Second Stage
  • For each non-reference group, partition the range
    into disjoint subintervals.
  • Number the bins using consecutive integers
    $-m, \ldots, 0, \ldots, n$ (where 0 corresponds to ratio 1)
  • Attach the ordered list of integers to each gene
    tag.

12
Example
  • For group i, divide the range into $m_i + n_i + 1$
    bins.
  • The list of ratios from the first stage for a
    certain gene tag is $(r_1, r_2, \ldots, r_l)$.
  • Each $r_i$ belongs to exactly one of the bins $B_{i,j}$.
  • The expression pattern associated with this tag
    is then $(j_1, j_2, \ldots, j_l)$.
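
The binning in the second stage can be sketched as below. The cutoff values, the function names, and the choice of half-open bins (a ratio equal to a cutoff falls in the bin above it) are assumptions made for illustration; the slides do not specify how boundary values are handled.

```python
from bisect import bisect_right


def level(ratio, down_cutoffs, up_cutoffs):
    """Map one ratio to an integer bin number.

    down_cutoffs: ascending cutoffs below 1 (m of them)
    up_cutoffs:   ascending cutoffs above 1 (n of them)
    The m + n cutoffs split the range into m + n + 1 bins, numbered
    -m, ..., 0, ..., n, with bin 0 containing ratio 1.
    """
    edges = list(down_cutoffs) + list(up_cutoffs)
    return bisect_right(edges, ratio) - len(down_cutoffs)


def pattern(ratios, cutoffs_per_group):
    """Turn a tag's ordered ratios into its ordered list of integers."""
    return [level(r, down, up)
            for r, (down, up) in zip(ratios, cutoffs_per_group)]


# Hypothetical cutoffs, the same for all three groups: m = 2, n = 2.
cutoffs = [([0.25, 0.5], [2.0, 4.0])] * 3
example = pattern([3.0, 0.3, 1.0], cutoffs)
```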

13
Choose level cutoffs
  • Suppose we are taking ratios to a reference
    homotypic group (group 0) and are focusing on a
    fixed group (group i).
  • Suppose also that we have replicate experiments
    for each of the two groups.
  • Concentrate on up-regulation

14
Goal
  • Goal is to achieve a certain degree of confidence
    in the assertion: this gene is up-regulated at
    group i as compared to the reference group.

15
  • Each gene will have a distribution of intensities
    in a group, whose mean will be called the true
    mean intensity of the gene at that group
  • Denote the random variable giving the intensity
    of gene g at group j by $X_{g,j}$, and denote its mean
    and standard deviation by $\mu_{g,j}$ and $\sigma_{g,j}$.

16
False Positive Rate
  • $\mathrm{Prob}\big( X_{g,i} / X_{g,0} > C_i \mid \mu_{g,i} / \mu_{g,0} < 1 \big)$

17
  • Claim that
  • $(\mathrm{Ave}_{g,i} / \mu_{g,i}) / (\mathrm{Ave}_{g,0} / \mu_{g,0}) > C_i$
  • and
  • $\mu_{g,i} / \mu_{g,0} < 1$
  • are independent events.

18
  • Seek $C_i$ as small as possible such that
  • $\mathrm{Prob}\big( (\mathrm{Ave}_{g,i} / \mu_{g,i}) / (\mathrm{Ave}_{g,0} / \mu_{g,0}) > C_i \big) < s$

19
  • Approximate $\mathrm{Ave}_{g,j} / \mu_{g,j}$ for $j = 0, i$ by
  • $\big( (X_{g,j,k} / \mathrm{Ave}_{g,j}) - 1 \big) / \sqrt{t_j - 1} + 1$
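
The approximation on this slide can be sketched in Python. The sketch assumes that k indexes the replicates of a group and that t_j is the number of replicates in group j; neither is defined on the slides, and the function name is hypothetical. Each replicate's relative deviation from the group average is shrunk by the square root of (t_j - 1) and re-centered at 1, giving one pseudo-observation of the average-to-true-mean ratio per replicate.

```python
from math import sqrt
from statistics import mean


def mean_ratio_samples(replicates):
    """Pseudo-observations of (group average / true mean) for one gene.

    replicates: intensities of one gene in the t replicates of one group
                (t must be at least 2, otherwise sqrt(t - 1) is zero).
    """
    t = len(replicates)
    if t < 2:
        raise ValueError("need at least two replicates")
    average = mean(replicates)
    # ((X_k / average) - 1) / sqrt(t - 1) + 1 for every replicate k
    return [(x / average - 1.0) / sqrt(t - 1) + 1.0 for x in replicates]


# Hypothetical gene with two replicates in one group.
samples = mean_ratio_samples([8.0, 12.0])  # roughly [0.8, 1.2]
```

Pooling these values over many genes would give an empirical picture of the distribution for that group; whether and how to pool is left open here.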

20
  • Compute the desired Ci through integration
  • If $f_j$ ($j = 0, i$) is the density function for
    $\mathrm{Ave}_{g,j} / \mu_{g,j}$, and C is fixed, then evaluate the
    probability that the ratio exceeds C by
    integrating $f_i$ and $f_0$.

21
  • If this is above the desired false positive rate,
    then C is raised and the integral is
    recalculated.
  • Repeat process until the desired false positive
    rate is attained.
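
The raise-and-recompute loop can be sketched as follows. Note the substitution: instead of integrating density functions, this sketch estimates the probability as the fraction of sample pairs whose quotient exceeds C, using two lists of pseudo-observations (one for group i, one for the reference group). The function names, the starting value, and the step size are assumptions for illustration only.

```python
from itertools import product


def exceed_prob(samples_i, samples_0, c):
    """Fraction of (a, b) pairs with a / b > c, a from group i, b from group 0."""
    pairs = list(product(samples_i, samples_0))
    return sum(1 for a, b in pairs if a / b > c) / len(pairs)


def find_cutoff(samples_i, samples_0, s, start=1.0, step=0.01, max_steps=10_000):
    """Raise C in fixed steps until the estimated probability drops below s."""
    for k in range(max_steps + 1):
        c = start + k * step
        if exceed_prob(samples_i, samples_0, c) < s:
            return c
    raise ValueError("no cutoff found within max_steps")


# Hypothetical pseudo-observations for both groups and a toy rate s = 0.2.
group_i = [0.8, 1.0, 1.2]
group_0 = [0.8, 1.0, 1.2]
cutoff = find_cutoff(group_i, group_0, s=0.2)
```

A fixed step keeps the sketch simple; a finer step or a bisection search would locate the smallest admissible C more precisely.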

22
Down-regulation
  • Proceed in similar manner
  • Seek $c_i$ as small as possible such that
  • $\mathrm{Prob}\big( (\mathrm{Ave}_{g,i} / \mu_{g,i}) / (\mathrm{Ave}_{g,0} / \mu_{g,0}) > c_i \big) < s$

23
  • Once the $C_i$'s and the $c_i$'s are determined for
    each non-reference group i, if the ratio of the
    average intensity of a gene tag at group i to
    the average intensity of the same gene tag at the
    reference group is between $C_i$ and $C_i^2$, we say
    that the gene tag is up-regulated one level at
    this group as compared to the reference group.
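
As an illustration of this rule, the sketch below assigns an up-regulation level from a ratio and a cutoff. The slide only defines level 1 (ratio between the cutoff and its square); extending this to level k for ratios between the k-th and (k+1)-th powers of the cutoff is an assumption made here, as are the function name and the `max_level` cap.

```python
def up_level(ratio, c_up, max_level):
    """Number of successive powers of c_up that the ratio reaches.

    Level 1 means c_up <= ratio < c_up**2, as on the slide; higher
    levels follow the same pattern by assumption. The result is
    capped at max_level.
    """
    level = 0
    while level < max_level and ratio >= c_up ** (level + 1):
        level += 1
    return level


# Hypothetical cutoff of 2.0 with at most 4 up-regulation levels.
levels = [up_level(r, 2.0, 4) for r in (1.5, 3.0, 5.0, 100.0)]
```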

24
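  • One can now estimate the probability
    $\mathrm{Prob}(\text{not up} \mid \text{predicted up})$
  • $= \mathrm{Prob}(\text{not up}) \cdot \mathrm{Prob}(\text{predicted up} \mid \text{not up}) / \mathrm{Prob}(\text{predicted up})$
  • $\le \mathrm{Prob}(\text{predicted up} \mid \text{not up}) / \mathrm{Prob}(\text{predicted up})$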
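
The bound on this slide follows from Bayes' rule together with the fact that the probability of a gene not being up-regulated is at most 1. A minimal Python sketch is given below, assuming the probability of being predicted up is estimated as the observed fraction of tags called up; the function name and the example numbers are hypothetical.

```python
def not_up_given_predicted_up_bound(false_positive_rate, n_predicted_up, n_total):
    """Upper bound on Prob(not up | predicted up).

    false_positive_rate: Prob(predicted up | not up), the rate used to
                         choose the cutoff
    n_predicted_up:      number of tags called up-regulated
    n_total:             number of tags examined
    """
    if n_predicted_up == 0:
        raise ValueError("no tags were predicted up")
    prob_predicted_up = n_predicted_up / n_total
    # A probability can never exceed 1, so cap the bound there.
    return min(1.0, false_positive_rate / prob_predicted_up)


# Hypothetical numbers: rate 0.01, 50 of 1000 tags predicted up.
bound = not_up_given_predicted_up_bound(0.01, 50, 1000)  # about 0.2
```

Read this way, the confidence attached to a non-zero level is roughly one minus this bound.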

25
  • As a consequence of this approach, when we see a
    level different from 0, we have a certain
    confidence in the gene tag being up-regulated or
    down-regulated as compared to the reference
    group.
  • However, when we see a 0 there is no confidence
    implied.
  • We can only take 0 to mean that we do not have
    enough evidence to support a change in level.

26
Results
  • Application to an erythroid development nylon
    filter dataset

27
Background
  • Erythroid development dataset contains 5
    homotypic groups representing an erythroleukemic
    cell line and normal cells under different
    conditions
  • There are replicate data for each of the groups.

28
Background Continued
  • The groups are
  • CD34 positive cells
  • Human adult erythroblasts
  • Cord erythroblasts
  • HEL cells
  • HEL cells treated with hemin

29
Application
  • Available replicates
  • Two CD34
  • Three adult erythroblasts
  • Two cord blood erythroblasts
  • Three HEL
  • Two HEL hemin

30
  • The value of d is set at 15
  • Only the moderate to highly abundant mRNA classes
    are likely to have given hybridization signals
    above background on the filter array.
  • Set the HEL group as reference

31
Two approaches
  • PaGE was run once merging the adult and the cord
    erythroblasts into one group with five replicates.
  • PaGE was run a second time keeping the adult and
    cord erythroblasts in separate groups.

32
Performance
  • Running time was always under 90 seconds on
    an UltraSPARC IIi CPU at 300 MHz with 128 MB RAM.

33
Adult and Cord Merged Results
  • Total of 18,123 clones
  • 540 were above the minimum useful value in every
    group
  • 5,063 were above the minimum useful value in at
    least one group.

34
Merged Results Cont.
  • For s 1 (false positive rate)
  • 5 levels for CD34 (0 to 4)
  • 10 levels for erythroblasts (-1 to 8)
  • 6 levels for HEL hemin (-1 to 4)

35
Findings
  • Clones representing the same gene were usually
    found to have identical or very similar patterns.
  • Clones representing genes whose expression is
    known in these cells presented patterns
    compatible with what was expected.

36
(No Transcript)
37
New Application
  • Ask which genes are differentially expressed
    between normal and leukemic cells.
  • Ask which genes are induced by hemin to adopt a
    normal expression pattern.

38
Findings
  • Having more genes available to start with led to
    more genes identified as differentially expressed
    but at lower confidence.
  • At similar confidence levels, starting with more
    genes did not necessarily lead to more genes
    identified as differentially expressed between
    normal and HEL cells.