Title: Non-coding RNA


1
Non-coding RNA
  • William Liu
  • CS374 Algorithms in Biology
  • November 23, 2004

2
Non-Coding RNA
  • Background Basics
  • Biology Overview
  • Why ncRNA - Central Dogma?
  • Problem Space
  • HMM/sCFG Solution
  • Paper
  • Pair HMMs on Tree Structures
  • Alignment of Trees, Structural Alignment
  • Experimental Evaluation
  • Conclusion

3
Central Dogma of Molecular Biology
4
Biology Overview
  • Traditional view: RNA merely plays an accessory role
  • Complexity is defined by the proteins encoded in the
    genome

5
Biology Overview
  • Non-coding RNA (ncRNA) is an RNA molecule that
    functions without being translated into a protein
  • Most prominent examples: transfer RNA (tRNA) and
    ribosomal RNA (rRNA)

6
Why Non-coding RNA
  • Protein-coding genes can't account for all
    complexity
  • ncRNA is important!
  • Gene regulators

Szymanski and Barciszewski, Beyond the Proteome:
Non-coding Regulatory RNAs, Genome Biol., 2002
7
Non-coding RNA Problems
  • Finding ncRNA genes in the genome: locate these
    genes
  • Finding homologs of ncRNA: figure out what they do

8
Finding ncRNA Genes
  • Protein approaches: statistical bias (codon
    triplets), open reading frames
  • ncRNA approaches: high GC content
    (hyperthermophiles), promoter/terminator
    identification (E. coli)
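
The two simplest signals named above, open reading frames and GC content, can be sketched in a few lines of Python. This is a minimal illustration, not a gene finder: the function names, the forward-strand-only scan, the DNA alphabet, and the window and length thresholds are all assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq: str) -> float:
    """Fraction of G or C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc_windows(seq: str, window: int, step: int):
    """Yield (start, gc_fraction) for each full sliding window."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_content(seq[start:start + window])


def find_orfs(seq: str, min_codons: int = 3):
    """Return (start, end) spans of ATG...stop ORFs on the forward strand.

    end is exclusive and includes the stop codon. min_codons counts the
    start codon but not the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs
```

A protein-coding region should show up in `find_orfs`, whereas an ncRNA gene generally will not, which is why the GC-content scan (useful in AT-rich hyperthermophile genomes) is listed as a separate ncRNA approach.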

Comparative Genome Analysis
9
Genetic Code
10
Similarity Searching
  • Proteins
  • BLAST, Sequence Alignment (DP)
  • Genes that code for proteins are conserved across
    genomes (i.e. low rate of mutation)
  • ncRNA
  • Secondary structure usually conserved
  • Alignment scoring based on structure is imperative
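
To make the contrast concrete, here is a minimal sketch of sequence-only global alignment by dynamic programming (Needleman-Wunsch style). The scoring values and the function name are illustrative assumptions; nothing here models base pairing.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -2) -> int:
    """Best global alignment score of a and b with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]
```

Two hairpins such as `GGGAAAACCC` and `CCCAAAAGGG` can both fold with the same stem-loop shape, yet `global_alignment_score` only rewards the shared loop and penalizes every swapped stem base. That is the motivation for scoring alignments on structure as well as sequence.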

11
ncRNA Sequence vs Structure
12
Alignment Approaches
  • sCFGs: modeling secondary structure, scoring
    sequences
  • HMM: scoring of sequence and secondary
    structure alignment

13
Pair HMMs on Tree Structures
  • Outline
  • Alignment on Trees
  • Structural Alignment
  • Secondary Structure Representation
  • Hidden Markov Model
  • Recurrence Relations
  • Experimental Evaluation
  • Future Work

14
Alignment on Trees
15
Structural Alignment
  • Problem: given an RNA sequence with known
    secondary structure and an RNA sequence of unknown
    structure, obtain the optimal alignment of the
    two

16
Structural Representation
  • Skeletal Tree

  • Branch structure: ?(?, ?)
  • Base pairs: ?(X, ?, Y)
  • Unpaired bases: ?(X, ?) or ?(?, Y)
  • X, Y ∈ {A, U, G, C}
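
As a rough illustration of the node kinds listed above (branch, base pair, unpaired base on the left or right), the following sketch builds a nested-tuple tree from a sequence and its secondary structure. The dot-bracket input format, the tuple layout, and the function names are assumptions for this example and are not taken from the slides or the paper.

```python
def pair_table(structure: str) -> dict:
    """Map each paired position to its partner in a dot-bracket string."""
    stack, partner = [], {}
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {pos}")
            opener = stack.pop()
            partner[opener] = pos
            partner[pos] = opener
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced '('")
    return partner


def skeletal_tree(seq: str, structure: str):
    """Build a tree of pair / unpaired / branch nodes; None is an empty leaf."""
    if len(seq) != len(structure):
        raise ValueError("sequence and structure lengths differ")
    partner = pair_table(structure)

    def build(i: int, j: int):
        # Covers the half-open interval seq[i:j].
        if i == j:
            return None
        if structure[i] == ".":
            return ("unpaired_left", seq[i], build(i + 1, j))
        if structure[j - 1] == ".":
            return ("unpaired_right", build(i, j - 1), seq[j - 1])
        k = partner[i]
        if k == j - 1:
            # Outermost bases pair with each other.
            return ("pair", seq[i], build(i + 1, j - 1), seq[j - 1])
        # Otherwise the interval splits into two adjacent substructures.
        return ("branch", build(i, k + 1), build(k + 1, j))

    return build(0, len(seq))
```

For example, `skeletal_tree("GCAU", "()()")` yields one branch node whose two children are base-pair nodes, mirroring the branch and base-pair cases in the list above.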
17
Hidden Markov Model
  • M: match state, I: insertion state, D: deletion
    state
  • ?XY: state transition probability from X to Y
  • ?X: initial probability of state X
  • Emission probability for the pair (x, y)
  • X, Y ∈ {M, I, D}
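
The sketch below shows how these ingredients combine in the ordinary pair HMM over two plain sequences, using Viterbi in log space. It is not the tree-structured version from the paper: the function name, the parameter layout, and the requirement that all probabilities be strictly positive are assumptions chosen for illustration.

```python
import math

NEG_INF = float("-inf")
STATES = ("M", "I", "D")
# How far each state advances in (a, b): M emits a pair,
# D emits a symbol of a only, I emits a symbol of b only.
STEP = {"M": (1, 1), "D": (1, 0), "I": (0, 1)}


def viterbi_pair_hmm(a, b, init, trans, emit_match, emit_gap):
    """Log-probability of the best state path aligning a and b.

    init[X] and trans[X][Y] are probabilities (> 0) for X, Y in STATES;
    emit_match(x, y) and emit_gap(x) return emission probabilities (> 0).
    """
    n, m = len(a), len(b)
    if n == 0 and m == 0:
        return 0.0  # nothing to emit
    # v[s][i][j]: best log-prob of emitting a[:i], b[:j], ending in state s.
    v = {s: [[NEG_INF] * (m + 1) for _ in range(n + 1)] for s in STATES}
    for i in range(n + 1):
        for j in range(m + 1):
            for s in STATES:
                di, dj = STEP[s]
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0:
                    continue
                if s == "M":
                    e = emit_match(a[i - 1], b[j - 1])
                elif s == "D":
                    e = emit_gap(a[i - 1])
                else:
                    e = emit_gap(b[j - 1])
                if pi == 0 and pj == 0:
                    best = math.log(init[s])  # first emitting state
                else:
                    best = max(
                        v[p][pi][pj] + math.log(trans[p][s]) for p in STATES
                    )
                v[s][i][j] = best + math.log(e)
    return max(v[s][n][m] for s in STATES)
```

The recurrences on the following slides follow the same match, delete, and insert split, but run over tree nodes and substrings rather than two sequence prefixes.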

18
Notation
  • Let $w = a_1 a_2 \cdots a_n$ be an unfolded RNA
    sequence of length $n$
  • Let $w_i$ denote the $i$-th symbol in $w$
  • Let $w_{i,j}$ denote the substring
    $a_i a_{i+1} \cdots a_j$ of $w$

19
Notation
  • Let $T$ be a skeletal tree representing a folded
    RNA sequence (known structure)
  • Let $v(j)$ denote the label of node $j$ in tree $T$
  • Let $T_j$ denote the subtree rooted at node $j$ in
    tree $T$
  • Let $j_n$ denote the $n$-th child of node $j$ in
    tree $T$

20
Recurrence Relation (Match)
21
Recurrence Relation (Delete)
22
Recurrence Relation (Insert)
23
Structural Alignment
  • Intuition: given the ncRNA sequence b with
    unknown structure, generate a predicted folded
    structure for b and align the resulting tree with
    the ncRNA a, whose secondary structure is known
  • Complexity: $O(K M N^3)$, where
  • K is the number of states in the pair HMM,
  • M is the size of the skeletal tree,
  • N is the length of the unfolded sequence
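
To get a feel for the bound, here is a rough worked example under assumed values: K = 3 (the M, I, D states listed earlier), and M = N = 76, roughly the length of a typical tRNA. These numbers are illustrative and are not taken from the paper's experiments.

```latex
K \cdot M \cdot N^{3}
  = 3 \cdot 76 \cdot 76^{3}
  = 3 \cdot 33{,}362{,}176
  = 100{,}086{,}528
  \approx 10^{8}
```

Because of the cubic term, doubling the length of the unfolded sequence alone multiplies the count by 8.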

24
Experimental Evaluation
  • Dynamic Programming to calculate recurrence
    relations, prototype system to execute algorithm
  • Experiments on two RNA families: transfer RNAs
    and hammerhead ribozymes

25
Parameters
Gorodkin et al. (1997)
26
Results: tRNA
27
Results: Hammerhead Ribozyme
28
Future Work
  • Since the method is based on dynamic programming
    (as in pairwise alignment), many DP techniques can
    be applied
  • Refine emission probabilities and the related score
    matrix (reliable alignments for RNA families)

29
Conclusions
  • ncRNA space is quite open - no really great
    techniques yet
  • How many ncRNA genes are there?
  • Absence of evidence ≠ evidence of absence
  • Eddy's call to arms:

"It is time for RNA computational biologists to
step up"
30
Thanks!
31
References
  • Sakakibara, Y., Pair Hidden Markov Models on
    Tree Structures, Bioinformatics, 19:232-240,
    2003
  • Eddy, S., Computational Genomics of Noncoding
    RNA Genes, Cell, 109:137-140, 2002
  • Szymanski, M., Barciszewski, J., Beyond the
    Proteome: Non-coding Regulatory RNAs, Genome
    Biol., 2002