Title: No slide title. Author: nat. Last modified by: elf. Created: 8/13/2001 10:17:35 PM. Presentation format: on-screen show.



1
Introduction to DNA Microarrays
DNA microarray and DNA chip resources on the
web
2
INTRODUCTION
Microarray analysis is a new technology that
allows scientists to simultaneously detect
thousands of genes in a small sample and to
analyze the expression of those genes.
Microarrays are simply ordered sets of DNA
molecules of known sequence. Usually rectangular,
they can consist of a few hundred to hundreds of
thousands of sets. Each individual sequence is
placed on the array at a precisely defined
location.
3
Potential application domains
  • Identification of complex genetic diseases
  • Drug discovery and toxicology studies
  • Mutation/polymorphism detection (SNPs)
  • Pathogen analysis
  • Differing expression of genes over time, between
    tissues, and disease states
  • Preventive medicine
  • Specific genotype (population) targeted drugs
  • More targeted drug treatments (e.g. for AIDS)
  • Genetic testing and privacy

4
The technique
Based on already known methods, such as
fluorescence and hybridization, this is a
high-throughput, miniaturized technique. Its main
purpose is to compare gene transcription levels in
two or more different kinds of cells.
  • Microarrays
  • DNA chips
  • SAGE
  • Beads (liquid chip)

5
The challenge
The big revolution here is in the "micro" term.
New slides will contain a survey of the human
genome on a 2 cm² chip! This large-scale method
tends to create phenomenal amounts of data, which
then have to be analyzed, processed and stored. As
the technique is quite new, analyzing the data is
still a problem, and nothing is standardized yet.
A few databases and on-line repositories are
coming out, and the future standard will probably
be chosen among them.
This is a job for bioinformatics!
6
General overview
  • Making the chip
      - Experiment design, sequence selection, collection
        maintenance, PCR, spotting, printing, synthesis
  • Probe hybridization
      - Probe purification, labelling, hybridization,
        washing
  • Scanning and image treatment
      - Fluorescence correction, find spots, background
  • Analysing the data
      - Filtering, normalisation
      - Clustering (hierarchical, centroid, SPC)
  • Representation, storage
      - Graphics, databases, web public resources

wet lab
7
THE EXPERIMENT: making the chip
1- Designing the chip
  • Choosing genes of interest for the experiment
    and/or selecting the samples
  • Selecting sequences that represent the
    investigated genes
  • Finding sequences, usually in the EST database
  • Problems: sequencing errors, alternative
    splicing, chimeric sequences, contamination
8
THE EXPERIMENT: making the chip
  • 2- Spotting the sequences on the substrate
  • Substrate: usually glass, but also nylon
    membranes, plastic, ceramic
  • Sequences: cDNA (500-5000 nucleotides, DNA
    chips), oligonucleotides (20-80-mer oligos, oligo
    chips), genomic DNA ( 50000 bases)
  • Printing methods: microspotting, ink-jetting
    (for DNA chips) or in-situ printing,
    photolithography (for oligos, Affymetrix method)

9
THE EXPERIMENT: making the chip
Microspotting and ink-jetting
10
THE EXPERIMENT: making the chip
Microspotting is done by a robot called an
arrayer.
11
THE EXPERIMENT: making the chip
Oligo-spotting (Affymetrix method)
12
THE EXPERIMENT: hybridization
  • Sample preparation
  • Extracting DNA (for genomic studies) or mRNA (for
    gene expression studies) from the two or more
    samples to compare.
  • Making cDNAs from the extracts, and labeling them
    with different fluorochromes to allow direct
    comparison (Cy-3, Cy-5, DIG).
  • Some techniques use radiolabeling.

13
THE EXPERIMENT: hybridization
Probes are overlaid on the chip, put in a
hybridization chamber, and then washed.
14
THE EXPERIMENT: generating data
  • Chip scanning
  • Fluorescence measurements are made with a scanning
    laser fluorescence microscope that scans the
    slide, illuminating each DNA spot and measuring
    fluorescence for each dye separately. It creates
    one red and one green image.
  • The two images are then superimposed to give a
    composite picture of the RNA ratio between the
    two samples.

15
THE EXPERIMENT: generating data
16
1- Samples
2- Extracting mRNA
3- Labeling
4- Hybridizing
5- Scanning
6- Visualizing
17
Examples of images
Affymetrix chip
Stanford array
18
THE EXPERIMENT: generating data
  • Image analysis
  • These fluorescence measurements are then used to
    determine the ratio, and in turn the relative
    abundance, of the sequence of each specific gene
    in the two mRNA or DNA samples.
  • This analysis is performed by software such as
    Scanalyze, available at
    http://rana.lbl.gov/EisenSoftware.htm
  • or Spotfinder from TIGR
  • The files created can then be submitted to
    further analysis
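
As a rough illustration of how a per-spot ratio can be derived from the two channel intensities, here is a minimal sketch. The background subtraction and the log2 transform are common conventions assumed here, not a description of what Scanalyze or Spotfinder actually do, and the spot values are invented for the example.

```python
import math

def log_ratio(red, green, red_background=0.0, green_background=0.0):
    """Return log2(red / green) after background subtraction.

    Returns None when either corrected intensity is not positive,
    because the ratio is then undefined.
    """
    r = red - red_background
    g = green - green_background
    if r <= 0 or g <= 0:
        return None
    return math.log2(r / g)

# Hypothetical spot measurements: (red, green, red background, green background).
spots = {
    "gene_a": (900.0, 200.0, 100.0, 100.0),  # 8x more abundant in the red sample
    "gene_b": (300.0, 300.0, 100.0, 100.0),  # equal abundance in both samples
    "gene_c": (150.0, 500.0, 100.0, 100.0),  # 8x more abundant in the green sample
}

# One log ratio per gene: positive means "more red", negative "more green".
ratios = {name: log_ratio(*values) for name, values in spots.items()}
```

Working with log ratios makes a twofold increase and a twofold decrease symmetric around zero, which is convenient for the clustering steps that follow.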

19
THE EXPERIMENT: making sense of the data
Although the visual image of a microarray panel
is alluring, its information content, per se, is
still not human readable.
How to visualize, organize and explore the
meaning of information consisting of several
million measurements of expression of thousands
of genes under thousands of conditions?
20
THE EXPERIMENT: making sense of the data
Data mining depends on the questions which are
asked. The most frequent question is to find sets
of genes that have correlated expression profiles
(belonging to the same biological process and/or
co-regulated), or to divide conditions into groups
with similar gene expression profiles (for
example, to divide drugs according to their effect
on gene expression). The method used to answer
these questions is called CLUSTERING.
21
Clustering data
  • Input: N data points, $X_i$, $i = 1, 2, \ldots, N$
    (the color ratios measured with Scanalyze, for
    example) in a D-dimensional space. N and D will
    be either genes and conditions for gene
    clustering, or conditions and genes for condition
    clustering.
  • Goal: find natural groups or clusters.
  • Note: depending on the method, the number of
    clusters will be fixed from the beginning
    (centroid clustering) or determined after the
    analysis (hierarchical clustering).

22
Clustering data
Before clustering, a few steps to clean the data
are necessary (normalization, filtering).
Clustering methods (examples):
1- Agglomerative hierarchical
2- Centroids: K-means or SOM
3- Super-Paramagnetic Clustering
For a good introduction to the different clustering
techniques, read the article by Gavin Sherlock,
"Analysis of large-scale gene expression data",
Current Opinion in Immunology 2000, 12:201-205
(pdf): http://www.isrec.isb-sib.ch/vpraz/chips/Sherlock.pdf
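
The cleaning steps mentioned above are not detailed in the slides. The sketch below shows one plausible version, assuming median centering of the log ratios of each array as the normalization and a simple presence/amplitude rule as the filter. The thresholds (`min_present`, `min_abs`) are hypothetical choices for illustration only.

```python
from statistics import median

def median_center(values):
    """Normalize one array: subtract the median of its measured log ratios.

    Missing measurements are represented by None and are left untouched.
    """
    present = [v for v in values if v is not None]
    m = median(present)
    return [None if v is None else v - m for v in values]

def keep_gene(profile, min_present=0.8, min_abs=1.0):
    """Filter one gene: keep it only if enough values are present and
    at least one log ratio reaches min_abs in absolute value."""
    present = [v for v in profile if v is not None]
    if len(present) < min_present * len(profile):
        return False
    return any(abs(v) >= min_abs for v in present)

# Hypothetical log ratios of four spots on one array.
normalized = median_center([1.0, None, 3.0, 2.0])
```

Here `median_center` works on one column of the data matrix (one experiment), while `keep_gene` works on one row (one gene across experiments).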
23
Agglomerative Hierarchical Clustering
The dendrogram induces a linear ordering of the
data points.
[Figure: dendrogram; axis: distance between joined clusters]
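
To make the agglomerative procedure concrete, here is a minimal sketch in plain Python. It assumes Euclidean distance between points and takes the cluster-to-cluster rule as a parameter (`min` for closest pair, `max` for farthest pair). It is a naive illustration, not the algorithm used by any particular package.

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(points, linkage=min):
    """Repeatedly merge the two closest clusters until one remains.

    Returns (merges, order): merges lists (members_a, members_b, distance)
    for each join, i.e. the dendrogram from the leaves up, and order is
    the linear ordering of the point indices induced by those joins.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = linkage(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges, clusters[0]

# Hypothetical one-dimensional data points.
points = [(0.0,), (1.0,), (5.0,), (6.5,)]
merges, order = agglomerate(points)
heights = [distance for _, _, distance in merges]
```

The `heights` list corresponds to the "distance between joined clusters" axis of a dendrogram, and `order` is the leaf ordering read along its base.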
24
Agglomerative Hierarchical Clustering
Before doing such a clustering, one has to define
two things:
  • 1- The similarity measure between two genes (or
    experiments): centered correlation, uncentered
    correlation, absolute correlation, Euclidean
    distance
  • 2- The distance measure between the new cluster
    and the others:
      - Single linkage: distance between the closest
        pair
      - Complete linkage: distance between the farthest
        pair
      - Average linkage: distance between cluster
        centers (often called centroid linkage
        elsewhere; the usual definition of average
        linkage is the mean of all pairwise distances)
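
The similarity measures named on this slide can be written down in a few lines. The sketch below uses the usual textbook definitions. Treating "absolute correlation" as the absolute value of the centered correlation is an assumption, since the slide does not say which variant is meant.

```python
import math

def uncentered_correlation(x, y):
    """Cosine of the angle between the two profiles (no mean subtraction).

    Raises ZeroDivisionError if either profile is all zeros.
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def centered_correlation(x, y):
    """Pearson correlation: uncentered correlation of the mean-subtracted profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return uncentered_correlation([a - mean_x for a in x],
                                  [b - mean_y for b in y])

def absolute_correlation(x, y):
    """Treats strongly anti-correlated profiles as similar (assumed variant)."""
    return abs(centered_correlation(x, y))

def euclidean_distance(x, y):
    """A distance rather than a similarity: smaller means more alike."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical expression profiles over three conditions.
rising = [1.0, 2.0, 3.0]
rising_shifted = [11.0, 12.0, 13.0]
falling = [3.0, 2.0, 1.0]
```

The centered correlation ignores a constant offset between two profiles, whereas the uncentered correlation and the Euclidean distance do not, which is why the choice matters for the resulting clusters.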
25
Centroid methods - K-means
  • Start with random positions of K centroids.
  • Assign points to centroids
  • Move centroids to the center of assigned points
  • Iterate until centroids are stable

Iteration 0
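
The four steps above translate almost directly into code. This is a minimal sketch, assuming squared Euclidean distance and random starting centroids drawn from the data points themselves. The `seed` and `max_iterations` parameters are illustrative additions.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(group):
    """Center (coordinate-wise mean) of a non-empty list of points."""
    return tuple(sum(coords) / len(group) for coords in zip(*group))

def kmeans(points, k, max_iterations=100, seed=0):
    rng = random.Random(seed)
    # Step 1: start with K random centroids (here: K distinct data points).
    centroids = rng.sample(points, k)
    groups = [[] for _ in range(k)]
    for _ in range(max_iterations):
        # Step 2: assign each point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda c: squared_distance(p, centroids[c]))
            groups[nearest].append(p)
        # Step 3: move each centroid to the center of its assigned points
        # (an empty group keeps its previous centroid).
        updated = [mean_point(g) if g else centroids[c]
                   for c, g in enumerate(groups)]
        # Step 4: stop once the centroids no longer move.
        if updated == centroids:
            break
        centroids = updated
    return centroids, groups

# Hypothetical two-dimensional data with two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, groups = kmeans(points, k=2)
```

Unlike hierarchical clustering, the number of clusters `k` has to be chosen before the run, and different random starts can give different results on less clear-cut data.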
28
Self-organizing Maps
  • Choose a number of partitions
  • Assign a random reference vector to each
    partition.
  • Pick a gene randomly and assign it to its most
    similar reference vector.
  • Adjust that reference vector so that it is
    more similar to the chosen gene.
  • Adjust the other reference vectors.
  • Repeat thousands of times until partitions are
    stable.

A self-organizing map.
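
The steps above can be sketched as a one-dimensional self-organizing map. The learning-rate schedule, the shrinking neighbourhood and the rule for adjusting the other reference vectors are assumptions chosen for illustration; the slides do not specify them.

```python
import random

def nearest(refs, gene):
    """Index of the reference vector most similar to the gene (Euclidean)."""
    return min(range(len(refs)),
               key=lambda j: sum((r - g) ** 2 for r, g in zip(refs[j], gene)))

def train_som(genes, n_partitions, steps=2000, seed=0):
    rng = random.Random(seed)
    dim = len(genes[0])
    lo = min(min(g) for g in genes)
    hi = max(max(g) for g in genes)
    # Assign a random reference vector to each partition.
    refs = [[rng.uniform(lo, hi) for _ in range(dim)]
            for _ in range(n_partitions)]
    for t in range(steps):
        remaining = 1.0 - t / steps           # shrinks from 1 towards 0
        rate = 0.5 * remaining                # assumed learning-rate schedule
        radius = (n_partitions / 2) * remaining
        gene = rng.choice(genes)              # pick a gene randomly
        winner = nearest(refs, gene)          # its most similar reference vector
        for j, ref in enumerate(refs):
            grid_distance = abs(j - winner)
            if grid_distance <= radius:
                # The winner moves most; neighbours on the map move less.
                influence = rate / (1 + grid_distance)
                for d in range(dim):
                    ref[d] += influence * (gene[d] - ref[d])
    return refs

def assign(genes, refs):
    """Partition index of each gene after training."""
    return [nearest(refs, g) for g in genes]

# Hypothetical expression profiles over two conditions.
genes = [[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.0, 10.1]]
refs = train_som(genes, n_partitions=2)
partitions = assign(genes, refs)
```

Because neighbouring reference vectors are pulled along with the winner, partitions that are adjacent on the map tend to end up with similar profiles, which is what distinguishes a SOM from plain K-means.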
29
Super-Paramagnetic Clustering (SPC)
M. Blatt, S. Wiseman and E. Domany (1996), Neural Computation
  • The idea behind SPC is based on the physical
    properties of dilute magnets.
  • Calculating correlation between magnet
    orientations at different temperatures (T).

[Figures: T = low; T = high]
31
Super-Paramagnetic Clustering (SPC)
M. Blatt, S. Wiseman and E. Domany (1996), Neural Computation
  • The algorithm simulates the magnets' behavior over
    a range of temperatures and calculates their
    correlation.
  • The temperature (T) controls the resolution.

[Figure: T = intermediate]
32
Clustering data
Available clustering tools
  • M. Eisen's programs for clustering and display of
    results (Cluster, TreeView)
      - Predefined set of normalizations and filtering
      - Agglomerative, K-means, 1D SOM
  • Matlab
      - Agglomerative, public m-files
  • Dedicated software packages (SPC)
  • Web sites, e.g. http://ep.ebi.ac.uk/EP/EPCLUST/
  • Statistical programs (SPSS, SAS, S-plus)
  • And many more

33
Clustering data
The final data representation is then a big
matrix with rows being the genes and columns
representing the different experiments. To keep
the image coherent with the scan output, the
ratio numbers calculated by Scanalyze are
transformed back into colored spots on a
green-red scale.
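
A possible way to turn ratio values back into colors is sketched below. It assumes the matrix holds log2 ratios, that positive values are shown in red and negative values in green, and that intensities saturate at a hypothetical `max_abs` of 3. None of these conventions is stated in the slides.

```python
def ratio_to_rgb(log_ratio, max_abs=3.0):
    """Map a log2 ratio to an (R, G, B) color on a green-black-red scale."""
    clipped = max(-max_abs, min(max_abs, log_ratio))
    level = round(255 * abs(clipped) / max_abs)
    if clipped > 0:
        return (level, 0, 0)   # more abundant in the red-labeled sample
    if clipped < 0:
        return (0, level, 0)   # more abundant in the green-labeled sample
    return (0, 0, 0)           # equal abundance: black

# Hypothetical matrix: rows are genes, columns are experiments.
matrix = [
    [0.0, 1.0, 3.0],
    [-1.0, -3.0, -5.0],
]
image = [[ratio_to_rgb(value) for value in row] for row in matrix]
```

Each entry of `image` is one colored cell of the clustered display, in the same row and column order as the data matrix.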
34
Clustering data
Another way to represent these data is a graph
showing how each gene's expression varies across
the different experiments.
Expression variation of nine genes across the 19
experiments from Iyer et al. (Fibroblast response
to serum stimulation)
35
Web resources: data analysis tools
  • Expression Profiler: online clustering and analysis tools (EBI)
  • GenEx: database, repository and analysis tools (NCGR)
  • MAExplorer: MicroArray Explorer for data mining gene expression, free download
  • ArrayDB: downloadable tools, short online demo
  • MAXD: downloadable data warehouse and visualisation for expression data
  • Jexpress: Java tools for gene expression data analysis, free download
  • Eisen Lab: Michael Eisen's suite for image quantitation and data analysis (Scanalyze, Cluster, TreeView), downloadable
36
Web resources: public databases
  • SMD: the Stanford Microarray Database
  • Chip DB: searchable database on gene expression (MIT)
  • ExpressDB: public queries of E. coli and yeast data
  • GEO: gene expression data repository and online resource (NCBI)
  • RAD: RNA Abundance Database
  • Expression Connection: Saccharomyces Genome Database expression data retrieval
  • EpoDB: expression information retrieval for one gene at a time
  • yMGV: public queries of yeast data
37
Web resources: public databases
  • AMAD: downloadable web-driven database system
  • ArrayExpress: public data deposition and public queries (EBI)
  • maxdSQL: downloadable data warehouse and visualization environment
  • GXD: mouse expression data storage and integration
  • GeNet: distribution and visualization of gene expression data from any organism
38
Web resources: public databases
  • Drosophila microarray project: Drosophila Metamorphosis Time Course Database
  • Samson Lab: yeast transcriptional profiling experiments
  • SageMap: NCBI SAGE data and analysis tools
  • NCI60 cancer project: supplement to Ross et al. (Nat Genet., 2000)
  • Serum-response: supplement to Iyer et al. (1999) Science 283:83-87
  • Breast cancer: supplement to Perou et al. Nature 406:747-752 (2000)
  • Cancer Molecular Pharmacology: integration of large databases on gene expression and molecular pharmacology
39
Web resources: general information
  • Leung's: links page, software info
  • Davison's: DNA Microarray Methodology - Flash Animation
  • gene-chips: overview of the technique, papers
  • Chips microassays: general information
  • SMD guide: Stanford's links page, very complete
  • Introduction: online introduction to microarrays (EBI)
  • Brown Lab Guide: microarray protocols and arrayer construction