Modeling RNA motifs by graph-grammars, François.Major@UMontreal.CA

1
Modeling RNA motifs by graph-grammars
François.Major@UMontreal.CA
www.iric.ca
2
MC-Tools Functions
  • ( MC-Annotate 3-D ) -> graph
  • ( MC-Cycles graph ) -> NCM
  • ( MC-Seq graph ) -> sequence
  • ( MC-Fold sequence ) -> graph
  • ( MC-Cons ( sequence, graph ) ) -> graph
  • ( MC-Search ( graph, 3-D ) ) -> 3-D
  • ( MC-Sym graph ) -> 3-D

3
MC-Tools Objects (rat 28S rRNA sarcin/ricin stem-loop)
Nucleotide cyclic motifs
( MC-Sym graph ) -> 3-D
Graph
3-D structure
( MC-Fold sequence ) -> graph
Szewczak et al. PNAS (USA) 1993; Lemieux & Major
NAR 2006; Parisien, Thibault & Major (in prep.)
Sequence: GGGUGCUCAGUACGAGAGGAACCGCACCC
4
Graph
( MC-Annotate 3-D ) -> graph
Gendron, Lemieux & Major JMB 2001; Lemieux & Major
NAR 2002; Leontis & Westhof RNA 2001
5
Shortest Cycle Basis
( MC-Cycle graph ) -> NCM
Horton SIAM J Comp 1987; St-Onge et al. NAR 2007
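
To make the shortest-cycle-basis step concrete, here is a minimal Python sketch in the spirit of Horton's candidate-cycle approach. It is not the MC-Cycle(s) implementation: the function names, the integer vertex labels, and the toy hairpin graph (backbone edges plus two base pairs) are assumptions chosen only for illustration, and the graph is assumed to be simple (no parallel edges).

```python
from collections import deque


def bfs_parents(adj, root):
    """Return the parent map of a breadth-first (shortest-path) tree."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in sorted(adj[u]):
            if w not in parent:
                parent[w] = u
                queue.append(w)
    return parent


def path_to_root(parent, v):
    """List the vertices from v up to the root of the BFS tree."""
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def minimum_cycle_basis(edges):
    """Pick short cycles greedily, keeping those independent over GF(2)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    # One bit per edge, so a cycle becomes an integer bit mask.
    edge_bit = {frozenset(e): 1 << i for i, e in enumerate(edges)}

    # Candidate cycles: two shortest paths from v plus the edge (x, y).
    candidates = {}
    for v in adj:
        parent = bfs_parents(adj, v)
        for x, y in edges:
            if x not in parent or y not in parent:
                continue
            px, py = path_to_root(parent, x), path_to_root(parent, y)
            if set(px) & set(py) != {v}:
                continue  # paths overlap, so the walk is not a simple cycle
            cycle = px + py[-2::-1]  # x ... v ... y, closed by edge (y, x)
            if len(cycle) < 3:
                continue
            mask = 0
            for a, b in zip(cycle, cycle[1:] + cycle[:1]):
                mask ^= edge_bit[frozenset((a, b))]
            candidates.setdefault(mask, cycle)

    # Greedy selection by length with XOR (GF(2)) elimination.
    basis, pivots = [], {}
    for mask, cycle in sorted(candidates.items(), key=lambda kv: len(kv[1])):
        r = mask
        while r:
            top = r.bit_length() - 1
            if top not in pivots:
                pivots[top] = r
                basis.append(cycle)
                break
            r ^= pivots[top]
    return basis


# Hypothetical 6-nucleotide hairpin: backbone 1..6, base pairs 1-6 and 2-5.
backbone = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
pairs = [(1, 6), (2, 5)]
for cycle in minimum_cycle_basis(backbone + pairs):
    print(sorted(cycle))
```

In this toy graph the basis consists of the two 4-nucleotide cycles; the 6-nucleotide outer cycle is their sum over GF(2) and is therefore discarded.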
6
The Nucleotide Cyclic Motifs (NCM)
  • Encompass all base-pairing types without
    distinction (Watson-Crick and others)
  • Precisely designate how any nucleotide in the
    sequence relates to the others
  • Are joined through a common base pair (context).
    This helps us predict coherent chains of NCMs and
    project them in 3-D. Tentative definition of a
    motif: an ordered chain of NCMs.
  • Recur within and across all RNAs
  • Are short (< 10 nts; most are 3 to 5 nts)
  • Compose the classical motifs (cf. GNRA tetraloop,
    sarcin/ricin motif, etc.). There are exceptions
    (cf. AA platform).

Lemieux & Major (2006) NAR 34:2340; Parisien,
Thibault & Major (in prep.)
7
Aim
  • We want a computational model that can encode the
    valid sequences and structural features of RNA
    motifs.
  • Hypothesis: A relation between the sequence and
    the structure of RNA motifs exists.

8
Graph Grammars
  • A graph grammar is to a set of graphs what a
    formal generative grammar is to a set of strings,
    i.e. a precise and formal description of that
    set.
  • A graph-grammar consists of a set of rules or
    productions for transforming graphs.
  • Formally, a graph-grammar, H = (N, Σ, P),
    consists of a set of non-terminal symbols, N, a
    set of terminal symbols, Σ, and a set of
    production rules, P.
  • Hypothesis: NCMs are independent building
    blocks.

Nagl Computing 1976; Nagl, in H. Ehrig et al., eds.
1987; St-Onge et al. NAR 2007
9
Sarcin/Ricin Graph Grammar
  • N = {C1, C2, ..., C5}: the set of NCMs
  • Σ = {S1, S2, ..., S5}: the sets of sequences for
    each NCM
  • P is a set of consistent assignments of the
    sequences in Σ to the NCMs in N (production
    rules)

St-Onge et al. NAR 2007
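
One way to read "consistent assignment" is that neighbouring NCMs must agree on the base pair they share. The Python sketch below illustrates that reading only; the tuple layout (entry pair, interior nucleotides, exit pair), the three toy sequence sets, and the function name are all hypothetical and are not taken from St-Onge et al.

```python
from itertools import product

# Hypothetical toy data: each NCM sequence is written as
# (entry base pair, interior nucleotides, exit base pair).
S1 = [("GC", "", "AU"), ("GC", "", "GU")]
S2 = [("AU", "A", "GA"), ("CG", "", "GA")]
S3 = [("GA", "", "UA")]


def consistent_assignments(sequence_sets):
    """Keep the combinations whose adjacent NCMs share the same base pair."""
    kept = []
    for combo in product(*sequence_sets):
        # The exit pair of one NCM must equal the entry pair of the next.
        if all(left[2] == right[0] for left, right in zip(combo, combo[1:])):
            kept.append(combo)
    return kept


chains = consistent_assignments([S1, S2, S3])
print(len(S1) * len(S2) * len(S3), "unconstrained combinations")
print(len(chains), "consistent chains")
```

The shared-base-pair constraint is what prunes the product of the per-NCM sets; without it, every combination of per-NCM sequences would be generated.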
10
Sarcin/Ricin Building Blocks
  • C1
    • Theoretical: 256 (16 x 16)
    • IMs: 120 (10 x 12)
    • PDB: 7
  • C2
    • Theoretical: 64 (16 x 4)
    • IMs: 40 (10 x 4)
    • PDB: 5
  • (label missing)
    • Theoretical: 16
    • IMs: 10
    • PDB: 15
  • C3
    • Theoretical: 64 (16 x 4)
    • IMs: 56 (14 x 4)
    • PDB: 2
  • C4
    • Theoretical: 256 (16 x 16)
    • IMs: 160 (16 x 10)
    • PDB: 3
  • C5
    • Theoretical: 64 (16 x 4)
    • IMs: 40 (10 x 4)
    • PDB: 8

St-Onge et al. NAR 2007
11
( MC-Seq sarcin-ricin-graph ) -> sequence
  • Sequences supported by the NCMs in the PDB:
    • AGUA-GAA, AGUA-AAA
    • GGUA-GAA, GGUA-AAA
  • If we remove the instances of the sarcin/ricin
    motifs,
    • ( MC-Search ( sarcin-ricin-graph, PDB ) ) ->
      3-D
    • then the same four sequences are still supported
  • => NCMs are found outside the sarcin/ricin
    context

Larose et al. (in prep.); St-Onge et al. NAR 2007
12
Graph Grammar Parsing
806 sequences aligned according to E. coli 23S
rRNA structure site 204-207 / 189-191.
Westhof (personal comm.); St-Onge et al. NAR 2007
13
Validation (MC-Seq vs. PDB vs. Alignment)
Isostericity matrices
MC-Seq
PDB
GGUA-AAA
AGUA-AAA AGUA-GAA GGUA-GAA
10 000 sequences
AAUA-AAA AAUA-GAA ACUA-AAA ACUA-GAA ACUA-GAC AGUA-AAC
AGUA-CAA AGUA-GAC AGUA-GAU AGUA-GCC AGUA-GGG AGUA-GUG
AGUC-GAA AUUA-GAA CGUA-GAA GAUA-GAA GGUA-GAU GUUA-GAA
UGUA-GAA UGUA-GAC
Alignment 5S, 16S, 23S
St-Onge et al. NAR 2007
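
A simple way to look at such a comparison is to ask how far each observed sequence lies from the supported set. The sketch below is only an illustration: it uses the four sequences reported as supported by the PDB NCMs and a handful of the other sequences listed on this slide (assumed here to come from the alignment), and it treats the grammar's language as a plain finite set rather than performing real graph-grammar parsing. The names `SUPPORTED`, `OBSERVED`, `mismatches`, and `nearest_supported` are hypothetical.

```python
# The four sequences reported as supported by the NCMs in the PDB.
SUPPORTED = ["AGUA-GAA", "AGUA-AAA", "GGUA-GAA", "GGUA-AAA"]

# A few other sequences from the slide (assumed to come from the alignment).
OBSERVED = ["AAUA-GAA", "AGUA-GAC", "AGUC-GAA", "UGUA-GAC"]


def mismatches(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest_supported(seq, supported=SUPPORTED):
    """Return a closest supported sequence and its mismatch count."""
    best = min(supported, key=lambda s: mismatches(seq, s))
    return best, mismatches(seq, best)


for seq in OBSERVED:
    best, distance = nearest_supported(seq)
    print(seq, "closest to", best, "with", distance, "mismatch(es)")
```

When several supported sequences are equally close, `min` returns the first one in list order, so only the mismatch count should be relied on in that case.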
14
Perspectives
  • We want to develop a version of MC-Seq that would
    be useful during the alignment process.
  • The PDB does not yet seem to contain enough
    structural information.
  • To avoid generating too many sequences, the NCMs
    (context) are necessary.
  • Two more things need to be considered:

15
Sarcin/Ricin (Sequence/Structure Space Is Not Simple)
St-Onge et al. (in prep.)
16
Modeling In 3-D Might Be Necessary
MC-Fold CAUU-AAG (2.1Å)
Alignment AUUA-GAA (0.9Å)
St-Onge et al. NAR 2007
17
Acknowledgments
Martin Larose (Res. assistant)
Philippe Thibault (Res. assistant)
Patrick Gendron (Res. assistant)
Romain Rivière (Postdoc, CS)
Véronique Lisi (Ph.D. Molecular Biology)
Marc Parisien (Ph.D. Computer Science)
Emmanuelle Permal (Ph.D. Bioinformatics)
Karine St-Onge (Ph.D. Computer Science)
Louis-Philippe Lavoie (M.Sc. Bioinformatics)
Maxime Caron (M.Sc. Bioinformatics)
Caroline Louis-Jeune (M.Sc. Bioinformatics)
Montréal: Pascal Chartrand, Gerardo Ferbeyre, Sylvie Hamel,
Sébastien Lemieux, Pascale Legault, Luc Desgroseillers,
Kathy Borden, Daniel Lamarre
Éric Westhof (Strasbourg)
Alain Denise (Paris)
Dave Mathews (Rochester)