No slide title - PowerPoint PPT Presentation

About This Presentation

Description:

Title: No slide title. Author: nat. Last modified by: elf. Created Date: 8/13/2001 10:17:35 PM. Document presentation format: On-screen presentation

Slides: 40

Provided by: Nat109

Transcript and Presenter's Notes

1
Introduction to DNA Microarrays
DNA Microarrays and DNA chips: resources on the web
2
INTRODUCTION
Microarray analysis is a new technology that
allows scientists to simultaneously detect
thousands of genes in a small sample and to
analyze the expression of those genes.
Microarrays are simply ordered sets of DNA
molecules of known sequence. Usually rectangular
in shape, they can consist of a few hundred to
hundreds of thousands of sets. Each individual
sequence goes on the array at a precisely defined
location.
3
Potential application domains

Identification of complex genetic diseases
Drug discovery and toxicology studies
Mutation/polymorphism detection (SNPs)
Pathogen analysis
Differing expression of genes over time, between
tissues, and disease states
Preventive medicine
Specific genotype (population) targeted drugs
More targeted drug treatments: AIDS
Genetic testing and privacy

4
The technique
Based on already known methods, such as
fluorescence and hybridization. A high-throughput,
miniaturized method. Its main purpose is to
compare gene transcription levels in two or more
different kinds of cells.

- Microarrays
DNA chips
SAGE
Beads (liquid chip)

5
The challenge
The big revolution here is in the "micro" term.
New slides will contain a survey of the human
genome on a 2 cm² chip! The use of this
large-scale method tends to create phenomenal
amounts of data, which then have to be analyzed,
processed and stored. As the technique is quite
new, analyzing the data is still a problem, and
nothing is standardized yet. A few databases and
on-line repositories are coming out, and the
future standard will probably be chosen among
them.
This is a job for Bioinformatics!
6
General overview

Making the chip: experiment design, sequence selection, collection maintenance, PCR, spotting, printing, synthesis
Probe hybridization: probe purification, labelling, hybridization, washing
Scanning and image treatment: fluorescence correction, spot finding, background
Analysing the data: filtering, normalisation, clustering (hierarchical, centroid, SPC)
Representation, storage: graphics, databases, public web resources

wet lab
7
THE EXPERIMENT: making the chip
1- Designing the chip: choosing genes of interest for the experiment and/or selecting the samples
- Selection of sequences that represent the investigated genes.
- Finding sequences, usually in the EST database.
- Problems: sequencing errors, alternative splicing, chimeric sequences, contamination
8
THE EXPERIMENT: making the chip

2- Spotting the sequences on the substrate
Substrate: usually glass, but also nylon membranes, plastic, ceramic
Sequences: cDNA (500-5000 nucleotides, DNA chips), oligonucleotides (20-80-mer oligos, oligo chips), genomic DNA (50000 bases)
Printing methods: microspotting, ink-jetting (for DNA chips) or in-situ printing, photolithography (for oligos, Affymetrix method)

9
THE EXPERIMENT: making the chip
Microspotting and ink-jetting
10
THE EXPERIMENT: making the chip
Microspotting is done by a robot called an arrayer
11
THE EXPERIMENT: making the chip
Oligo-spotting (Affymetrix method)
12
THE EXPERIMENT: hybridization

Sample preparation
Extracting DNA (for genomic studies) or mRNA (for
gene expression studies) from the two or more
samples to compare.
Making cDNAs from the extracts and labeling them
with different fluorochromes (Cy-3, Cy-5, DIG) to
allow direct comparison.
Some techniques use radiolabeling.

13
THE EXPERIMENT: hybridization
The probes are overlaid on the chip, which is placed
in a hybridization chamber and then washed.
14
THE EXPERIMENT: generating data

Chip scanning
Fluorescence measurements are made with a scanning
laser fluorescence microscope that scans the
slide, illuminating each DNA spot and measuring
fluorescence for each dye separately. It creates
one red and one green image.
The two images are then superimposed to give a
virtual picture of the RNA ratio between the two samples.
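
The superimposition of the two scans can be sketched in a few lines. This is only an illustration, assuming each scan is a small grid of intensities already scaled to 0..255; the function name `overlay_channels` and the toy values are hypothetical and do not come from any scanner software.

```python
def overlay_channels(red_scan, green_scan):
    """Combine two single-dye scans into one RGB image.

    Each scan is a list of rows of intensities in 0..255 (an assumption).
    A spot bright in both scans comes out yellow, a spot bright in only
    one scan comes out red or green, and a spot dark in both stays black.
    """
    if len(red_scan) != len(green_scan):
        raise ValueError("scans must have the same number of rows")
    image = []
    for red_row, green_row in zip(red_scan, green_scan):
        if len(red_row) != len(green_row):
            raise ValueError("scans must have the same number of columns")
        # The blue channel is unused, so it is set to 0 everywhere.
        image.append([(r, g, 0) for r, g in zip(red_row, green_row)])
    return image


# Toy 2 x 2 scans (made-up values).
red = [[255, 0], [200, 10]]
green = [[0, 255], [200, 10]]
composite = overlay_channels(red, green)
```

In this toy example the top-left spot is pure red, the top-right pure green, and the bottom-left, equally bright in both scans, is yellow.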

15
THE EXPERIMENT: generating data
16
1- Samples
2- Extracting mRNA
3- Labeling
4- Hybridizing
5- Scanning
6- Visualizing
17
Examples of images
Affymetrix chip
Stanford array
18
THE EXPERIMENT: generating data

Image analysis
These fluorescence measures are then used to
determine the ratio, and in turn the relative
abundance, of the sequence of each specific gene
in the two mRNA or DNA samples.
This analysis is performed by software such as
Scanalyze, available at http://rana.lbl.gov/EisenSoftware.htm
or Spotfinder from TIGR.
The files created can then be submitted to
further analysis.
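
As a rough sketch of the ratio calculation for one spot: the background subtraction, the `floor` guard and the use of a base-2 logarithm are assumptions made for illustration (log ratios are a common convention), not a description of what Scanalyze or Spotfinder actually do. All names and numbers are hypothetical.

```python
import math


def log_ratio(red, green, red_background=0.0, green_background=0.0, floor=1.0):
    """Return log2(red / green) for one spot after background subtraction.

    `floor` is a hypothetical lower bound that keeps both signals positive
    so that the ratio and its logarithm are always defined.
    """
    r = max(red - red_background, floor)
    g = max(green - green_background, floor)
    return math.log2(r / g)


# Made-up (red, green) intensities for three spots.
spots = {
    "geneA": (1200.0, 300.0),
    "geneB": (250.0, 1000.0),
    "geneC": (500.0, 500.0),
}
ratios = {name: log_ratio(r, g, 100.0, 100.0) for name, (r, g) in spots.items()}
```

A positive value means the sequence is more abundant in the red-labeled sample, a negative value means it is more abundant in the green-labeled one, and zero means equal abundance.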

19
THE EXPERIMENT: making sense of the data
Although the visual image of a microarray panel
is alluring, its information content, per se, is
still not human readable.
How to visualize, organize and explore the
meaning of information consisting of several
million measurements of expression of thousands
of genes under thousands of conditions?
20
THE EXPERIMENT: making sense of the data
Data mining depends on the questions being
asked. The most frequent task is to find sets
of genes that have correlated expression profiles
(belonging to the same biological process and/or
co-regulated), or to divide conditions into groups
with similar gene expression profiles (for
example, to divide drugs according to their effect on
gene expression). The method used to answer
these questions is called CLUSTERING.
21
Clustering data

Input: N data points, Xi, i = 1, 2, ..., N (the color
ratios measured with Scanalyze, for example) in a
D-dimensional space. N and D will be either genes
and conditions for gene clustering, or conditions
and genes for condition clustering.
Goal: find natural groups or clusters.
Note: depending on the method, the number of
clusters is either fixed from the beginning
(centroid clustering) or determined after the
analysis (hierarchical clustering).
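
The two roles of N and D can be illustrated with a tiny made-up matrix. The helper `transpose` and the values below are hypothetical; the only point is that the same table is read row-wise for gene clustering and column-wise for condition clustering.

```python
def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


# Rows are genes, columns are conditions (made-up log ratios).
by_gene = [
    [1.0, 2.0, 3.0],    # gene 1 measured under 3 conditions
    [0.5, 0.0, -0.5],   # gene 2 measured under 3 conditions
]

# Gene clustering: N = 2 points (genes) in D = 3 dimensions (conditions).
n_genes, d_conditions = len(by_gene), len(by_gene[0])

# Condition clustering: N = 3 points (conditions) in D = 2 dimensions (genes).
by_condition = transpose(by_gene)
```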

22
Clustering data
Before clustering, a few steps to clean the
data are necessary (normalization, filtering).
Clustering methods (examples):
1- Agglomerative Hierarchical
2- Centroids: K-means or SOM
3- Super-Paramagnetic Clustering
For a good introduction to the different clustering
techniques, read the article by Gavin Sherlock,
"Analysis of large-scale gene expression data", in
Current Opinion in Immunology 2000, 12:201-205
(pdf): http://www.isrec.isb-sib.ch/vpraz/chips/Sherlock.pdf
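
The slide does not say which cleaning steps are meant, so the following is only one plausible sketch: genes with too many missing values or too little variation are filtered out, and each remaining profile is median-centered. The thresholds `min_present` and `min_range`, the function names and the data are all assumptions.

```python
from statistics import median


def filter_genes(rows, min_present=0.8, min_range=1.0):
    """Keep genes with enough measured values and enough variation.

    rows maps a gene name to its list of log ratios, with None marking a
    missing measurement. Both thresholds are hypothetical defaults.
    """
    kept = {}
    for gene, values in rows.items():
        present = [v for v in values if v is not None]
        if not present or len(present) < min_present * len(values):
            continue  # too many missing measurements
        if max(present) - min(present) < min_range:
            continue  # expression hardly changes across conditions
        kept[gene] = values
    return kept


def median_center(values):
    """Subtract the median of the measured values from each value."""
    m = median(v for v in values if v is not None)
    return [None if v is None else v - m for v in values]


rows = {
    "g1": [2.0, 0.0, -1.0, 1.0],
    "g2": [0.1, 0.0, 0.2, 0.1],     # almost flat profile
    "g3": [3.0, None, None, -3.0],  # half the values are missing
}
cleaned = {gene: median_center(v) for gene, v in filter_genes(rows).items()}
```

Here only g1 survives the filter, and its profile is shifted so that its median is zero.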
23
Agglomerative Hierarchical Clustering
Distance between joined clusters
The dendrogram induces a linear ordering of the
data points
Dendrogram
24
Agglomerative Hierarchical Clustering
Before doing such a clustering, one has to define
two things:

1- The similarity measure between two genes (or
experiments):
Centered correlation
Uncentered correlation
Absolute correlation
Euclidean distance
2- The distance measure between the new cluster
and the others:
Single Linkage: distance between the closest pair.
Complete Linkage: distance between the farthest pair.
Average Linkage: distance between cluster centers.
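
The measures listed above can be written out as a small sketch. Two points are assumptions rather than statements from the slide: the absolute correlation is taken here as the absolute value of the centered correlation, and two readings of "average" are offered, the distance between cluster centers as described above (`"centers"`) and the mean over all pairs (`"mean_pairwise"`), which is how average linkage is often defined elsewhere. All function names are hypothetical.

```python
import math


def uncentered_correlation(x, y):
    """Cosine of the angle between two profiles (no mean subtraction).

    Undefined (division by zero) if either profile is all zeros.
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def centered_correlation(x, y):
    """Pearson correlation: subtract each profile's mean, then correlate."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return uncentered_correlation([a - mean_x for a in x],
                                  [b - mean_y for b in y])


def absolute_correlation(x, y):
    """Treat correlated and anti-correlated profiles as equally similar."""
    return abs(centered_correlation(x, y))


def euclidean(x, y):
    """Straight-line distance between two profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def center(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    return [sum(column) / len(cluster) for column in zip(*cluster)]


def linkage_distance(cluster_a, cluster_b, distance=euclidean, method="single"):
    """Distance between two clusters under the chosen linkage rule."""
    if method == "centers":
        return distance(center(cluster_a), center(cluster_b))
    pairwise = [distance(p, q) for p in cluster_a for q in cluster_b]
    if method == "single":
        return min(pairwise)       # closest pair
    if method == "complete":
        return max(pairwise)       # farthest pair
    if method == "mean_pairwise":
        return sum(pairwise) / len(pairwise)
    raise ValueError("unknown linkage method: " + method)


profile = [1.0, 2.0, 3.0]
reversed_profile = [3.0, 2.0, 1.0]
cluster_a = [[0.0, 0.0], [0.0, 1.0]]
cluster_b = [[3.0, 0.0], [4.0, 0.0]]
```

For the reversed profile, the centered correlation is -1 while the absolute correlation is 1, which shows why the choice of measure changes which genes end up together.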
25
Centroid methods - K-means

Start with random position of K centroids.
Assign points to centroids
Move centroids to center of assigned points
Iterate until centroids are stable

Iteration 0
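
The four steps listed above can be sketched as follows. This is a minimal illustration, not the code of any particular package: the initial centroids are drawn from the data points themselves (one common way to pick random positions, assumed here), similarity is squared Euclidean distance, and "stable" is detected as "no point changed cluster". The names `kmeans` and `nearest` and the toy points are hypothetical.

```python
import random


def squared_distance(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))


def kmeans(points, k, max_iterations=100, seed=0):
    """Return (centroids, assignment) for K-means on a list of points."""
    rng = random.Random(seed)
    # Step 1: random starting positions (here: k distinct data points).
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = None
    for _ in range(max_iterations):
        # Step 2: assign every point to its nearest centroid.
        new_assignment = [nearest(p, centroids) for p in points]
        # Step 4: stop once no point changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: move each centroid to the center of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its old position
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment


# Two well-separated toy groups (made-up values).
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centroids, assignment = kmeans(points, k=2)
```

Note that K must be chosen before the run, which is the point made earlier about centroid methods fixing the number of clusters from the beginning.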
26
Centroid methods - K-means


Iteration 1
27
Centroid methods - K-means


Iteration 3
28
Self-organizing Maps

Choose a number of partitions
Assign a random reference vector to each
partition.
Pick a gene randomly and assign it to its most
similar reference vector.
Adjust that reference vector so that it is
more similar to the chosen gene.
Adjust the other reference vectors.
Repeat thousands of times until partitions are
stable.

A self-organizing map.
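
A one-dimensional self-organizing map following the steps above might look like the sketch below. Several details are assumptions made for illustration: similarity is Euclidean distance, the other reference vectors are adjusted with a Gaussian weight that falls off with their distance from the winner along the map, and both the learning rate and the neighbourhood radius shrink linearly over a fixed number of steps instead of testing for stability. All names and values are hypothetical.

```python
import math
import random


def squared_distance(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def best_match(gene, refs):
    """Index of the reference vector most similar to the gene profile."""
    return min(range(len(refs)), key=lambda i: squared_distance(gene, refs[i]))


def train_som(profiles, n_partitions, n_steps=2000,
              learning_rate=0.5, radius=1.0, seed=0):
    """Train a 1D self-organizing map and return its reference vectors."""
    rng = random.Random(seed)
    dim = len(profiles[0])
    low = min(min(p) for p in profiles)
    high = max(max(p) for p in profiles)
    # One random reference vector per partition, within the data range.
    refs = [[rng.uniform(low, high) for _ in range(dim)]
            for _ in range(n_partitions)]
    for step in range(n_steps):
        decay = 1.0 - step / n_steps          # goes from 1 towards 0
        gene = rng.choice(profiles)           # pick a gene at random
        winner = best_match(gene, refs)       # its most similar reference
        width = radius * decay + 1e-9
        for i, ref in enumerate(refs):
            # The winner moves most; its neighbours on the map move less.
            influence = math.exp(-((i - winner) ** 2) / (2.0 * width ** 2))
            rate = learning_rate * decay * influence
            for d in range(dim):
                ref[d] += rate * (gene[d] - ref[d])
    return refs


profiles = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
references = train_som(profiles, n_partitions=3)
partition_of = [best_match(p, references) for p in profiles]
```

After training, `partition_of` gives the partition index of each gene profile.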
29
Super-Paramagnetic Clustering (SPC) M.Blatt,
S.Weisman and E.Domany (1996) Neural Computation

The idea behind SPC is based on the physical
properties of dilute magnets.
Calculating correlation between magnet
orientations at different temperatures (T).

T = Low
30
Super-Paramagnetic Clustering (SPC) M.Blatt,
S.Weisman and E.Domany (1996) Neural Computation


T = High
31
Super-Paramagnetic Clustering (SPC) M.Blatt,
S.Weisman and E.Domany (1996) Neural Computation

The algorithm simulates the magnets' behavior over a
range of temperatures and calculates their
correlation.
The temperature (T) controls the resolution.

T = Intermediate
32
Clustering data
Available clustering tools

M. Eisen's programs for clustering and display of
results (Cluster, TreeView)
Predefined set of normalizations and filtering
Agglomerative, K-means, 1D SOM
Matlab
Agglomerative, public m-files.
Dedicated software packages (SPC)
Web sites, e.g. http://ep.ebi.ac.uk/EP/EPCLUST/
Statistical programs (SPSS, SAS, S-plus)
And much more

33
Clustering data
The final data representation is then a big
matrix with rows being the genes and columns
representing the different experiments. To keep
the image coherent with the scan output, the
ratio numbers calculated by Scanalyze are
transformed back into colored spots on a green-red
scale.
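
The mapping from a ratio back to a colored spot can be sketched as below. The details are assumptions: the input is taken to be a log2 ratio, positive values are drawn in red and negative values in green, zero is black, and `saturation` is a hypothetical cut-off beyond which the color stops getting brighter.

```python
def ratio_to_color(log_ratio, saturation=3.0):
    """Map a log2 ratio to an (R, G, B) tuple on a green-black-red scale."""
    # Clamp the ratio to [-1, 1] in units of the saturation level.
    level = max(-1.0, min(1.0, log_ratio / saturation))
    intensity = round(abs(level) * 255)
    if level > 0:
        return (intensity, 0, 0)   # higher in the red-labeled sample
    if level < 0:
        return (0, intensity, 0)   # higher in the green-labeled sample
    return (0, 0, 0)               # equal abundance


# One row of the matrix: a gene across four experiments (made-up values).
gene_row = [0.0, 3.0, -5.0, -3.0]
colors = [ratio_to_color(value) for value in gene_row]
```

Each row of the gene-by-experiment matrix then becomes a strip of colored cells, one per experiment.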
34
Clustering data
Another way to represent these data is a graph
showing the variation in gene expression across
the different experiments.
Expression variation of nine genes along the 19
experiments from Iyer et al. (Fibroblast response
to serum stimulation)
35
Web resources: data analysis tools
Expression Profiler: Online clustering and analysis tools (EBI)
GenEx: Database, repository and analysis tools (NCGR)
MAExplorer: MicroArray Explorer for data mining Gene Expression, free download
ArrayDB: Downloadable tools, short online demo
MAXD: Downloadable data warehouse and visualisation for expression data
Jexpress: Java tools for gene expression data analysis, free download
Eisen Lab: Michael Eisen's suite for image quantitation and data analysis (Scanalyze, Cluster, TreeView). Downloadable.
36
Web resources: public databases
SMD: The Stanford Microarray Database
Chip DB: Searchable database on gene expression (MIT)
ExpressDB: Public queries of E. coli and yeast data
GEO: Gene expression data repository and online resource (NCBI)
RAD: RNA Abundance Database
Expression Connection: Saccharomyces Genome Database expression data retrieval
EpoDB: Expression information retrieval for one gene at a time
yMGV: Public queries of yeast data
37
Web resources: public databases
AMAD: Downloadable web driven database system
ArrayExpress: Public data deposition and public queries (EBI)
maxdSQL: Downloadable data warehouse and visualization environment
GXD: Mouse expression data storage and integration
GeNet: Distribution and visualization of gene expression data from any organism
38
Web resources: public databases
Drosophila microarray project: Drosophila Metamorphosis Time Course Database
Samson Lab: Yeast Transcriptional Profiling Experiments
SageMap: NCBI SAGE data and analysis tools
NCI60 cancer project: Supplement to Ross et al. (Nat Genet., 2000).
Serum-response: Supplement to Iyer et al. (1999) Science 283:83-87
Breast cancer: Supplement to Perou et al. Nature 406:747-752 (2000)
Cancer Molecular Pharmacology: Integration of large databases on gene expression and molecular pharmacology.
39
Web resources: general information
Leung's: Links page, software, info
Davison's: DNA Microarray Methodology - Flash Animation
gene-chips: Overview of the technique, papers
Chips microassays: General information
SMD guide: Stanford's links page, very complete
Introduction: Online introduction to microarrays (EBI)
Brown Lab Guide: Microarray protocols and arrayer construction.
