ENCODE Pseudogene Call Summary

Slides: 18
Provided by: MarkGe4
Transcript and Presenter's Notes


1
ENCODE Pseudogene Call Summary
  • Mark Gerstein
  • 2005.10.27, 11:00 EDT
  • (Draft for GT call on 2005.10.28, 10:00 EDT)

2
Pseudogene group
  • Core people: Jennifer Harrow (jla1_at_sanger.ac.uk),
    WEI Chia-Lin (weicl_at_gis.a-star.edu.sg), Adam
    Frankish (af2_at_sanger.ac.uk), "Dike, Sujit"
    (Sujit_Dike_at_affymetrix.com), Robert Baertsch
    (baertsch_at_SOE.UCSC.EDU), fdenoeud_at_imim.es, Deyou
    Zheng (zhengdy_at_csb.yale.edu), Yontao Lu
    (ytlu_at_SOE.UCSC.EDU),
    Alexandre.Reymond_at_medecine.unige.ch
  • Others: "Hoyem, Tara L" (Tara.Hoyem_at_pnl.gov),
    Roderic Guigo Serra (rguigo_at_imim.es), "Gingeras,
    Tom" (Tom_Gingeras_at_affymetrix.com),
    thomas.royce_at_yale.edu, Suganthi Balasubramanian
    (suganthi_at_csb.yale.edu)
  • 6 calls: Sept. 15, 22; Oct. 6, 13, 20, 27

3
Refresher: many repetitions of the Venn analysis below
[Figure: Venn diagram of the Havana-Gencode, Yale and UCSC pseudogene sets]
  • Havana-Gencode: 165 pseudogenes (167 - 2)
  • Yale: 167 pseudogenes (164 + 3)
  • UCSC retrogenes: 146 not expressed
  • Region counts, in the order they appear on the slide: 54 (2), 17 (2),
    16 (0), 81 (34), 15 (1), 16 (7), 33 (1)
  • Callout: 7 Havana agrees to be added (8, 11, 40, 59, 139, 152, 169).
    4 at coding loci. Yale agrees to delete 1 with weak sequence
    identity. 5 with non-real proteins.
  • Callout: 9 Havana agrees to be added. 2 at coding loci. Yale agrees
    to delete 1 with weak sequence identity. 2 with non-real proteins.
  • Numbers according to Adam's note
  • Solved by consistent protein set threshold
4
A proposal for a qualified union with uniform
criteria for boundaries
  • Identify a good set of human proteins (HAVANA
    set?).
  • Remove pseudogenes (from all 4 groups)
    overlapping with current GENCODE exons
    (does GENCODE have an updated version?).
  • Create a union of the remaining pseudogenes.
  • Find the best matching proteins for each
    pseudogene; remove entries without a BLAST hit
    (e-value cutoff issue?).
  • Realign each pseudogene to its parent protein to
    produce a uniform alignment and to define the
    start and end coordinates.
  • Apply a threshold to sequence identity and
    coverage? (No.)
  • Classify pseudogenes into processed and
    non-processed (how?)
  • Overall: 222 pseudogenes
  • Application of the above recipe gives 198 consensus
    pseudogenes
  • Intersection set of the above is 81 (proc), 49
    (non-proc)
  • On browser and ENCODE wiki: http://pseudogene.org/ENCODE

From Deyou Z., Robert B.
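
The exon-removal and union steps of the proposal can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the group's actual pipeline: the `Interval` type, the function names, the 0-based half-open coordinates, and the choice to merge overlapping calls from different groups into a single locus are all hypothetical. The BLAST, realignment and classification steps are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A genomic interval (hypothetical type; 0-based start, exclusive end)."""
    chrom: str
    start: int
    end: int


def overlaps(a, b):
    """True if two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def drop_exon_overlaps(pseudogenes, exons):
    """Remove pseudogene calls that overlap any annotated exon."""
    return [p for p in pseudogenes if not any(overlaps(p, e) for e in exons)]


def qualified_union(call_sets, exons):
    """Filter each group's calls against exons, then merge overlapping calls.

    Assumption: "union" is taken to mean that overlapping calls from
    different groups collapse into one locus spanning all of them.
    """
    kept = []
    for calls in call_sets:
        kept.extend(drop_exon_overlaps(calls, exons))
    kept.sort(key=lambda iv: (iv.chrom, iv.start, iv.end))

    merged = []
    for iv in kept:
        if merged and overlaps(merged[-1], iv):
            last = merged[-1]
            merged[-1] = Interval(last.chrom, last.start, max(last.end, iv.end))
        else:
            merged.append(iv)
    return merged


# Made-up coordinates, for illustration only.
havana = [Interval("chr1", 100, 200), Interval("chr1", 500, 600)]
yale = [Interval("chr1", 150, 260), Interval("chr2", 10, 50)]
exons = [Interval("chr1", 550, 580)]

union = qualified_union([havana, yale], exons)
for locus in union:
    print(locus.chrom, locus.start, locus.end)
```

In this toy input the chr1 500-600 call is dropped because it overlaps an exon, and the two overlapping chr1 calls collapse into one locus. Defining the final boundaries by realignment to the parent protein, as the proposal suggests, would replace the simple min/max merge used here.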
5
Insertion into processed pseudogene
From Adam F.
First insertion event:
  • heterogeneous nuclear ribonucleoprotein A1 (HNRPA1) pseudogene (parent on Chr12)
Remnant of a second, mitochondrial insertion event (has post-insertion deletions):
  • NADH dehydrogenase 2 (MTND2) pseudogene (parent mitochondrial)
  • NADH dehydrogenase 4 (MTND4) pseudogene (parent mitochondrial)
  • cytochrome b (CYTB) pseudogene (parent mitochondrial)
Protein evidence
6
Rearranged exon order in unprocessed pseudogene
From Adam F.
Dot plot: protein evidence vs genome
adaptor-related protein complex 1, beta 1 subunit (AP1B1) pseudogenes
Protein evidence
Exon 6
Exon 3
Splice sites same as parent gene
Following duplication of the AP1B1 locus, rearrangements/duplications have
produced two unprocessed pseudogenes corresponding to exons 6 and 3 of
the parent gene
7
Rearrangement of processed pseudogene
From Adam F.
mRNA dot plot
pseudogene similar to part of ribosomal protein L3 (RPL3)
Following insertion, one end of the RPL3 pseudogene has been flipped onto
the opposite strand (with some loss of internal sequence)
Protein dot plot
8
Transcription among 198 consensus pseudogenes
  • Nb overlapped by interrogated regions (Affy arrays): 180 (90.9%)
  • Nb overlapped by Yale TARs or Affy transfrags (union): 106 (53.5% of
    all; 58.9% of interrogated)
  • => There is evidence of transcription (from TARs or transfrags) of
    the pseudogene or the parent gene (if cross-hybridization) for 53.5%
    of the consensus pseudogenes
  • Nb overlapped by CAGE tags: 11 (5.5%)
  • Nb overlapped by ditag tags: 1 (0.5%) (83 (41.9%) are overlapped by
    full-length ditags)
From France D.
9
Pseudogene overlapped by TARs/transfrags and ditags: ENCODE_consensus_187
93% similar to parent
From France D.
10
Overlaps by TAR/transfrag subset
  • Nb overlapped by interrogated regions (Affy arrays): 180 (90.9%)
  • Nb overlapped by Yale TARs or Affy transfrags (union): 106 (53.5% of
    all; 58.9% of interrogated)
  • Nb overlapped by Yale TARs (union): 84 (42.4% of all; 46.7% of
    interrogated)
  • Nb overlapped by Affy transfrags (union): 102 (51.5% of all; 56.7%
    of interrogated)
  • Nb overlapped by polyA TARs/transfrags (union): 105 (53% of all;
    58.3% of interrogated)
  • Nb overlapped by total RNA TARs (union): 61 (30.8% of all; 33.9% of
    interrogated)
From France D.
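
The two percentages quoted for each subset follow from dividing the overlap count by the 198 consensus pseudogenes ("of all") and by the 180 pseudogenes covered by interrogated regions ("of interrogated"). A minimal Python sketch of that arithmetic, using only the counts on this slide; the variable and function names are illustrative:

```python
CONSENSUS = 198      # all consensus pseudogenes
INTERROGATED = 180   # consensus pseudogenes overlapped by interrogated regions


def pct(count, total):
    """Percentage of `total` represented by `count`, to one decimal place."""
    return round(100.0 * count / total, 1)


# Overlap counts as listed on the slide.
overlap_counts = {
    "Yale TARs or Affy transfrags (union)": 106,
    "Yale TARs (union)": 84,
    "Affy transfrags (union)": 102,
    "polyA TARs/transfrags (union)": 105,
    "total RNA TARs (union)": 61,
}

print(f"interrogated: {INTERROGATED} ({pct(INTERROGATED, CONSENSUS)}% of all)")
for label, n in overlap_counts.items():
    print(f"{label}: {n} "
          f"({pct(n, CONSENSUS)}% of all; {pct(n, INTERROGATED)}% of interrogated)")
```

Run as written, this gives the same one-decimal figures as the slide for every subset listed.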
11
ENCODE pseudogenes expression
  • ENCODE pseudogenes from the intersection part of
    consensus set
  • 49 non-processed, 125 processed
  • Designed oligos (25-mer, Tm 70 °C)
  • Either specific to pseudogene or shared between
    parental gene and pseudogene

From Alex R.
12
ENCODE pseudogenes expression 2
  • 5' RACE in 12 human tissues
  • Brain, heart, kidney, spleen, liver, colon, small
    intestine, muscle, lung, stomach, testis,
    placenta
  • First 96 pseudogenes: 5' RACEs done in 12 tissues
  • Last 78 will be done next week
  • To do: pool multiple RACEs, send to Santa Clara
    and hybridize to Affymetrix ENCODE 20-nucleotide
    resolution arrays

Stylianos Antonarakis, Robert Baertsch, Jorg
Drenkow, Tom Gingeras, Charlotte Henrichsen,
Philipp Kapranov, Catherine Ucla, Alexandre
Reymond
Affymetrix, UCSC, University of Geneva,
University of Lausanne
From Alex R.
13
Expression from pseudogene locus (1): putative
novel transcript
Aligned proteins (column collapsed)
HAVANA sialyltransferase pseudogene (RP3-477O4.5)
supported by protein evidence
Supporting EST (100% ID)
Putative novel transcript supported by a single
EST which has a polyA site and signal
polyA site and signal
There appears to be some transcription from this locus,
which is supported at the 3' end by a single EST
From Adam F.
14
Expression from pseudogene locus (2): 5' UTR of
known gene
From Adam F.
LILR pseudogene
Frameshift
Upstream pseudogene corresponds to exons 1-3 of
LILR family genes; the 3' exons have been lost. EST
evidence supports expression from the pseudogene
locus extending to known gene LILRA3.
LILRA3
15
Intersect Consensus Pseudogenes with ChIP-chip
Hits
From Deyou Z.
16
Consensus Pseudogenes with 2 ChIP-chip Hits
Has transcriptional evidence (intersects Gencode
transcript)
From Deyou Z.
17
Example Pseudogene with Binding Hits (177)
From Deyou Z.