Transcriptome - PowerPoint PPT Presentation

Slides: 36
Provided by: chucks96
Title: Transcriptome


1
Transcriptome
  • Gene Discovery
  • Quantitation of Gene Expression

BIO520 Bioinformatics Jim Lund
2
WHY?
  • The genes expressed determine the state of the
    cell.
  • Signaling.
  • Metabolic capabilities.
  • Differentiation state (cell type).
  • Response to changes in environment.
  • Verifies gene predictions.
  • Transcriptional regulation
  • Normal vs. abnormal
  • Conditional expression

3
Transcriptome Analysis
  • Gene (transcript) discovery
  • transcripts
  • alternative splicing/processing
  • Transcript assays
  • Promoter analysis
  • Transcription Factors
  • Cellular control networks

4
Gene Discovery
  • Inference from genomic DNA
  • Prokaryotes, fungi: OK
  • cDNA characterization
  • EST
  • SAGE

5
EST (Expressed Sequence Tag)
  • Sequence cDNA libraries
  • proportional libraries
  • subtracted or normalized libraries
  • Which end?
  • 5' or 3' or whole

6
Library Type
  • regular or proportional
  • Subtracted
  • Miss alternate transcripts
  • normalized
  • Tissue
  • Primer
  • dT vs random

7
Ideal cDNAs
8
Real cDNAs
9
Which end?
  • Whole cDNA
  • BEST, HARDEST (long)
  • 3' end
  • Consistent technically, limited information
  • 5' end
  • Coding identity highest
  • 5' AND 3'
  • Good, but a technical and informatic challenge

10
(No Transcript)
11
EST Data Analyses
  • Clustering Analysis
  • Assemble ESTs into genes.
  • Alternative splicing forms
  • Find coding SNPs.
  • Truncated, unspliced, and junk ESTs can be
    misleading
  • Project: UniGene
  • Program: stackPACK
  • Frequency analysis
  • Digital Differential Display
  • DDD is a computational method for comparing
    sequence-based gene representation profiles among
    individual cDNA libraries or pools of libraries.
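The frequency comparison behind this kind of analysis can be sketched as a 2x2 test on EST counts: how many ESTs from a gene's cluster appear in library A versus library B, relative to each library's size. The sketch below uses a two-sided Fisher exact test written with the standard library only. The function names and the example counts are illustrative assumptions, not taken from the DDD tool itself.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


def compare_libraries(gene_ests_a, library_size_a, gene_ests_b, library_size_b):
    """P-value that a gene's EST frequency differs between two libraries."""
    return fisher_exact_two_sided(
        gene_ests_a,
        library_size_a - gene_ests_a,
        gene_ests_b,
        library_size_b - gene_ests_b,
    )


if __name__ == "__main__":
    # Hypothetical counts: 12 of 500 ESTs in library A, 2 of 600 in library B.
    print(compare_libraries(12, 500, 2, 600))
```

Exact binomial coefficients keep the sketch dependency-free, but they get slow for libraries with hundreds of thousands of ESTs; a statistics library would be the practical choice there.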

12
EST Results (old)
  • Known genes (30%)
  • Similarities to other ORFs, ESTs (30%)
  • Infer Function?
  • Novel Class (30%, ? w/ time)

13
Typical Progress/Results
  • Humans
  • 6,694,833 ESTs
  • 124,179 clusters (sets)
  • 29,000 sets contain EST and mRNA seqs.
  • CGAP EST library plateau broken by
  • different tissues, different states
  • normalized libraries

14
Data Quality Considerations
  • 99% correct data (1% errors!).
  • Frameshifts: effects depend on tools
  • BLASTX: tool to find
  • How sensitive?
  • TBLASTX, TBLASTN: to use in other projects
  • How sensitive?
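Why frameshifts matter for translated searches can be shown with a small sketch. BLASTX translates a nucleotide query in all six reading frames before comparing it with proteins, so a single-base deletion in an EST moves the rest of the coding sequence into a different frame. The code below uses the standard genetic code; the two sequences are made-up examples, and only the three forward frames are shown for brevity.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate a DNA sequence starting at offset `frame` (0, 1 or 2)."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")  # X marks ambiguous codons
        for i in range(frame, len(seq) - 2, 3)
    )


if __name__ == "__main__":
    correct = "ATGGCCAAATGGTGA"
    with_deletion = "ATGCCAAATGGTGA"  # one G lost after the start codon
    for frame in range(3):
        print(frame, translate(correct, frame), translate(with_deletion, frame))
```

In the deleted read, frame 0 diverges right after the start codon, while the remainder of the original protein reappears in frame 2. A tool that scores each frame separately sees two short matches instead of one long one.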

15
Gene Expression Assay
  • EST (Poor method)
  • SAGE
  • Microarray Hybridization
  • Transcriptional Fusions
  • GFP, LacZ fusions

16
Serial Analysis of Gene Expression (SAGE)
  • Collect mRNA
  • Isolate short oligomers from each transcript.
  • Ligate together the oligomers and clone them.
  • Sequence thousands of clones.
  • Map the 1×10^4 to 1×10^5 oligomers to their genes.
  • Find which genes are transcribed and their
    relative expression levels.
  • http://www.sagenet.org (Vogelstein at JHU)
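The last two steps, mapping the sequenced oligomers (tags) to genes and reading off relative expression, can be sketched as a counting exercise. The tag-to-gene lookup table below is a hypothetical stand-in for a real tag reference database, and the tags themselves are invented.

```python
from collections import Counter


def tag_expression(observed_tags, tag_to_gene):
    """Count tags per gene and return (relative levels, unmapped tag count)."""
    counts = Counter()
    unmapped = 0
    for tag in observed_tags:
        gene = tag_to_gene.get(tag)
        if gene is None:
            unmapped += 1  # tag absent from the reference table
        else:
            counts[gene] += 1
    mapped_total = sum(counts.values())
    if mapped_total == 0:
        return {}, unmapped
    # Relative expression: fraction of all mapped tags from each gene.
    relative = {gene: n / mapped_total for gene, n in counts.items()}
    return relative, unmapped


if __name__ == "__main__":
    # Hypothetical reference table and sequenced tags.
    reference = {"AAAAAAAAAA": "geneA", "CCCCCCCCCC": "geneB"}
    tags = ["AAAAAAAAAA", "AAAAAAAAAA", "CCCCCCCCCC", "GGGGGGGGGG"]
    print(tag_expression(tags, reference))
```

Tags that map to no gene are reported separately rather than dropped silently, since they may come from unannotated transcripts or sequencing errors.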

17
SAGE technique
  • Prepare biotin labeled cDNA
  • Cleave with anchoring enzyme (NlaIII)

18
SAGE technique
  • Ligate on linkers
  • Cleave with tagging enzyme (BsmFI)

19
SAGE technique
  • Ligate, PCR, and gel purify ditags (102bp).
  • Recleave with anchoring enzyme (NlaIII), ligate
    to form concatemers.
  • Size select, clone and sequence concatemers.
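The tag that the technique produces for a given transcript can be predicted in silico. This is a minimal sketch, assuming the classic protocol in which the cDNA is held by its 3' end, so the retained fragment starts at the 3'-most NlaIII site (CATG) and the tag is the 10 bases immediately downstream. The 10-base tag length and the function name are assumptions made for illustration.

```python
ANCHOR_SITE = "CATG"  # NlaIII recognition sequence


def extract_sage_tag(cdna, tag_length=10):
    """Return the tag next to the 3'-most anchoring site, or None."""
    seq = cdna.upper()
    site = seq.rfind(ANCHOR_SITE)  # 3'-most occurrence of the site
    if site == -1:
        return None  # transcript has no anchoring-enzyme site
    start = site + len(ANCHOR_SITE)
    tag = seq[start:start + tag_length]
    # A site too close to the 3' end cannot yield a full-length tag.
    return tag if len(tag) == tag_length else None


if __name__ == "__main__":
    # Made-up cDNA with two CATG sites; only the 3'-most one counts.
    print(extract_sage_tag("AAACATGTTTTCATGACGTACGTACGGAAAA"))
```

Running this over every known transcript gives a tag-to-gene table of the kind needed to map sequenced tags back to genes.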

20
Colon cancer vs. normal colon epithelium (SAGE)
21
Microarray Hybridization
  • Determine gene expression by parallel
    hybridization of labeled cDNA to DNA attached to
    a fixed support.
  • http://cmgm.stanford.edu/pbrown/

22
Microarray Hybridization
  • Producing chips
  • Producing probes / reading arrays
  • Analyzing and interpreting data

23
Transcriptional Array
[Figure: grid of spotted ORFs (orf 1, orf 2, orf 3, ...; spots 1-9 shown). Scale: 200 spots per 3 cm, or 40,000 dots per 9 cm² (> all human genes). Samples: mRNA from Condition 1 and mRNA from Condition 2.]
24
Transcriptional Array-1
[Figure: grid of spotted ORFs (orf 1, orf 2, orf 3, ...; spots 1-9 shown). Scale: 200 spots per 3 cm, or 40,000 dots per 9 cm² (> all human genes). Samples: mRNA from Condition 1 and mRNA from Condition 2.]
25
Transcriptional Array-2
[Figure: grid of spotted ORFs (orf 1, orf 2, orf 3, ...; spots 1-9 shown). Scale: 200 spots per 3 cm, or 40,000 dots per 9 cm² (> all human genes). Samples: mRNA from Condition 1 and mRNA from Condition 2.]
26
Microarray Technologies
  • Spotted arrays (Brown et al.)
  • Spot arrays on glass slides
  • PCR fragments
  • Long (50-70bp) oligo arrays
  • Synthesis
  • Affymetrix (www.affymetrix.com)
  • High density array of 25 bp oligos
  • Made using light directed oligonucleotide
    synthesis and photolithography
  • Agilent, CombiMatrix
  • Made using light directed oligonucleotide
    synthesis and mirrors.

27
Spotted Arrays
28
Print Quill
29
Spotted microarray image
30
Affymetrix photolithographic technology
  • Lithographic masks are used to either block or
    transmit light onto specific locations of the
    array.
  • The surface is then flooded with a solution
    containing either adenine, thymine, cytosine, or
    guanine, and coupling occurs only in those
    regions on the glass that have been deprotected
    through illumination.
  • The coupled nucleotide also bears a
    light-sensitive protecting group, so the cycle
    can be repeated.
  • Microarray is built as the probes are synthesized
    through repeated cycles of deprotection and
    coupling.
  • (Typically ends at 25 bp.)
  • Current arrays have 1.3 million unique features
    per array.
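The deprotect-and-couple cycle described above can be simulated for a handful of probes. In this sketch each cycle offers one nucleotide, and the "mask" for that cycle is the set of features whose next required base matches it. The fixed A, C, G, T flow order and the function name are illustrative assumptions, and the direction of synthesis is ignored.

```python
def synthesis_masks(probes, flow_order="ACGT"):
    """Return [(base, [feature indices illuminated])] for each coupling cycle."""
    probes = [p.upper() for p in probes]
    for probe in probes:
        if any(base not in flow_order for base in probe):
            raise ValueError(f"probe {probe!r} has a base outside {flow_order!r}")

    progress = [0] * len(probes)  # bases already coupled at each feature
    cycles = []
    while any(done < len(p) for done, p in zip(progress, probes)):
        for base in flow_order:
            # Illuminate only the features whose next base is this one.
            lit = [
                i
                for i, p in enumerate(probes)
                if progress[i] < len(p) and p[progress[i]] == base
            ]
            if lit:
                cycles.append((base, lit))
                for i in lit:
                    progress[i] += 1
    return cycles


if __name__ == "__main__":
    for base, features in synthesis_masks(["AC", "CA", "AA"]):
        print(base, features)
```

With a fixed flow order, a set of 25-base probes needs at most 4 x 25 = 100 coupling cycles, however many features are on the array.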

31
GeneChip Expression Assay Design
32
Affymetrix GeneChips Expression Analysis
  • Available for humans and model organisms.
  • Made only by Affymetrix.
  • Chip designs change slowly.
  • GeneChips
  • Human: 50,000 RefSeq genes and ESTs
  • C. elegans: 22,500 genes (12/00 genome annotation)
  • Rat 230: 30,000 genes, ESTs
  • Yeast: 6,100 gene set
  • Tiling arrays for model organisms
  • http://affymetrix.com

33
Quantitation of fluorescence signals (Image to
data)
  • Hybridization, scan in chip image.
  • Gridding
  • Determine where the spots are.
  • Spot intensity and local background
    determination.
  • Normalization
  • Adjust to make the red and green total signal
    intensities the same.
  • Gene expression ratio.
  • Red channel/green channel.
  • Programs
  • ScanAlyze, http://rana.lbl.gov/EisenSoftware.htm
  • GenePix, http://www.axon.com/gn_GenePixSoftware.html

34
Microarray data
Big tables of numbers!
35
Viewing microarray data
Clustergram
Scatter plot: log(ch1) vs. log(ch2)
M vs. A: signal vs. expression change
Volcano plot: log(expr) vs. p-value
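The M and A values behind an M vs. A plot can be computed from the two channel intensities. This sketch uses the conventional definitions, M = log2(ch1) - log2(ch2) for the expression change and A = the mean of the two log2 intensities for the overall signal. The function name is an assumption, and intensities are assumed to be background-corrected and positive.

```python
from math import log2


def ma_values(ch1, ch2):
    """Return a list of (M, A) pairs for paired channel intensities."""
    if len(ch1) != len(ch2):
        raise ValueError("channels must have the same number of spots")
    pairs = []
    for x, y in zip(ch1, ch2):
        if x <= 0 or y <= 0:
            raise ValueError("intensities must be positive")
        m = log2(x) - log2(y)          # expression change (log ratio)
        a = 0.5 * (log2(x) + log2(y))  # average log signal
        pairs.append((m, a))
    return pairs


if __name__ == "__main__":
    # Made-up intensities for three spots.
    for m, a in ma_values([8.0, 4.0, 16.0], [2.0, 4.0, 64.0]):
        print(m, a)
```

Plotting M against A rotates the log(ch1) vs. log(ch2) scatter plot by 45 degrees, which makes intensity-dependent bias in the ratios easier to see.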