Title: Microarray Databases and MIAME (Minimum Information About a Microarray Experiment)
1Microarray Databases and MIAME (Minimum
Information About a Microarray Experiment)
- Yong Liu
- Bioinformatics Unit
2Outline
- Review of microarray technology from a data/database perspective
- Motivation behind the MIAME standard
- MIAME: what's in it?
- Current existing microarray databases
- Future development
3DNA Microarray Technology
Cy3 550 nm
Cy5 650 nm
5Context is Everything!
- An observed phenotype is specific for the conditions under study (Pat Brown, Stanford University)
- Information recorded in a microarray database should be usable on a standalone basis
- Any background information
- Automated data analysis and mining, i.e. not only on a record-by-record basis
- Data from different laboratories and different technology platforms
6Capturing Data and Meta-data in Microarray
Experiments
7How Much Data?
- Experiments
- 100 000 genes in human
- 320 cell types
- 2000 compounds
- 3 time points
- 2 concentrations
- 2 replicates
- Data volume
- 8 x 10^11 data points
- 1 x 10^15 bytes = 1 petabyte of data
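The data-point estimate on this slide can be checked by multiplying the listed factors. The sketch below does only that arithmetic; the dictionary keys and the function name are illustrative, and the bytes-per-point figure is simply derived from the slide's two numbers (the slide does not say what the stored bytes consist of).

```python
# Experimental factors exactly as listed on the slide.
factors = {
    "genes": 100_000,
    "cell_types": 320,
    "compounds": 2_000,
    "time_points": 3,
    "concentrations": 2,
    "replicates": 2,
}

def total_data_points(factors):
    """Multiply all factor sizes to get the number of measurements."""
    total = 1
    for size in factors.values():
        total *= size
    return total

data_points = total_data_points(factors)
print(f"{data_points:.2e} data points")  # 7.68e+11, which the slide rounds to 8 x 10^11

# Storage per data point implied by the slide's 1 petabyte (10^15 bytes) figure.
bytes_per_point = 1e15 / data_points
print(f"about {bytes_per_point:.0f} bytes per data point")
```

The product is 7.68 x 10^11, so the slide's 8 x 10^11 is a rounded value, and the petabyte figure corresponds to roughly 1.3 kB stored per data point.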
8Gene Expression Matrix
The final gene expression matrix (on the right)
is needed for higher level analysis and mining.
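As a rough illustration of what a gene expression matrix looks like as a data structure, the sketch below pivots per-array measurements into a genes-by-samples table. The gene names, sample names, values, and the function `build_expression_matrix` are all hypothetical; real pipelines derive these values from image quantitation and normalization.

```python
# Hypothetical measurements: (gene, sample, expression value).
measurements = [
    ("geneA", "sample1", 2.0),
    ("geneA", "sample2", 0.5),
    ("geneB", "sample1", 1.0),
    ("geneB", "sample2", 4.0),
]

def build_expression_matrix(measurements):
    """Return (genes, samples, matrix) where matrix[i][j] is the
    value for genes[i] in samples[j]."""
    genes = sorted({gene for gene, _, _ in measurements})
    samples = sorted({sample for _, sample, _ in measurements})
    gene_idx = {gene: i for i, gene in enumerate(genes)}
    sample_idx = {sample: j for j, sample in enumerate(samples)}
    # None marks a gene/sample combination that was never measured.
    matrix = [[None] * len(samples) for _ in genes]
    for gene, sample, value in measurements:
        matrix[gene_idx[gene]][sample_idx[sample]] = value
    return genes, samples, matrix

genes, samples, matrix = build_expression_matrix(measurements)
for gene, row in zip(genes, matrix):
    print(gene, row)
```

Rows correspond to genes and columns to samples (conditions), which is the layout that clustering and other mining methods usually expect.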
9MGED and MIAME
- The need to establish a public repository or repositories for microarray gene expression data became apparent in 1998; this requires data standards
- MGED-1 (Microarray Gene Expression Database Group): November 14-15, 1999, Cambridge, UK
- Established five working groups, including the microarray data annotation group (MIAME)
- MGED-2: May 25-27, 2000, Heidelberg, Germany
- Endorsed a MIAME draft
- MGED-3: March 29-31, 2001, Stanford University
- Adopted MIAME 1.0
- MGED-4: Feb. 13-16, 2002, Boston
- Adopted MIAME 1.1
10MIAME Six Parts
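One way to picture the six parts, which the following slides describe in turn, is as a record with one field per part. This is only a sketch: the class `MiameRecord`, the helper `missing_parts`, and the dictionary contents are assumptions made for illustration and are not part of the MIAME standard itself.

```python
from dataclasses import dataclass, field, fields

@dataclass
class MiameRecord:
    """One field per MIAME part; the dict contents are illustrative only."""
    experimental_design: dict = field(default_factory=dict)
    array_design: dict = field(default_factory=dict)
    samples: dict = field(default_factory=dict)
    hybridizations: dict = field(default_factory=dict)
    measurements: dict = field(default_factory=dict)
    normalization: dict = field(default_factory=dict)

def missing_parts(record):
    """Names of the MIAME parts that have no annotation yet."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]

# A partially annotated experiment (hypothetical values).
record = MiameRecord(
    experimental_design={"type": "time course", "organisms": ["Homo sapiens"]},
    samples={"label": "Cy3"},
)
print(missing_parts(record))
```

A check like this is one simple way a lab-side system could flag incomplete annotation before data are submitted to a repository.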
11MIAME Part 1 - Experimental Design: the set of hybridisation experiments as a whole
- Author, contact information, citations
- Type of experiment (e.g., time course, normal vs diseased comparison)
- Experimental factors, i.e. tested parameters in the experiment (e.g. time, dose, genetic variation, response to a compound)
- List of organisms used in the experiment
- List of platforms used
12MIAME Part 2 - Array Design: each array used and each element (spot) on the array
- Array design related information (e.g. platform type: in situ synthesized or spotted; array provider; surface type: glass, membrane, other; etc.)
- Properties of each type of element on the array generated by similar protocols (e.g. synthesized oligos, PCR products, plasmids, colonies, others); elements may be simple or composite (Affymetrix)
- Each element (spot) on the array
13MIAME Part 3 - Samples: samples used, the extract preparation and labeling
- Sample source and treatment
- Hybridisation extract preparation
- Laboratory protocol, including extraction method, whether RNA, mRNA, or genomic DNA is extracted, amplification method
- Labelling
- Laboratory protocol, including amount of nucleic acids labelled, label used (e.g. Cy3, Cy5, 33P, etc.)
14MIAME Part 4 - Hybridizations: procedures and parameters
- The solution (e.g. concentration of solutes)
- Blocking agent
- Wash procedure
- Quantity of labelled target used
- Time, concentration, volume, temperature
- Description of the hybridisation instruments
15MIAME Part 5 - Measurements: images, quantitation, specifications
- Scanning information
- Scan parameters, including laser power, spatial resolution, pixel space, PMT voltage
- Laboratory protocol for scanning, including scanning hardware and software used
- Image analysis information
- Image analysis software specification
- All parameters
- Summarised information from possible replicates
16MIAME Part 6 - Normalization: types, values, specifications
- Normalisation strategy (spiking, housekeeping genes, total array, other)
- Normalisation algorithm
- Control array elements
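To make the "total array" strategy concrete, here is a minimal sketch of one common reading of it for a two-colour array: rescale one channel so that both channels have the same total intensity, then report log2 ratios. The function name and the intensity values are hypothetical, and actual normalization algorithms recorded under MIAME may differ considerably.

```python
import math

def total_array_normalize(cy5, cy3):
    """Scale the Cy3 channel so both channels have equal total
    intensity, then return log2(Cy5 / scaled Cy3) per spot."""
    scale = sum(cy5) / sum(cy3)
    return [math.log2(r / (g * scale)) for r, g in zip(cy5, cy3)]

# Hypothetical background-corrected spot intensities.
cy5 = [200.0, 400.0, 800.0, 200.0]
cy3 = [100.0, 200.0, 100.0, 400.0]

log_ratios = total_array_normalize(cy5, cy3)
print(log_ratios)
```

In this toy example the Cy5 channel is twice as bright overall, so the scaling removes that global offset before ratios are taken. This is why the strategy and algorithm need to be recorded: the same raw intensities give different ratios under different normalization choices.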
17Current Existing Microarray Databases
- Local Installation
- AMAD, GeneDirector, mAdb, maxdSQL, NOMAD
- Public Queries only
- ChipDB, RAD
- Public Queries and Local Installation
- SMD
- Public Data Deposition and Queries
- ArrayExpress, GEO, GXD
- GeneX and GeNet
FOR MORE INFO...
Margaret Gardiner-Garden and Timothy G.
Littlejohn, A comparison of microarray databases,
Briefings in Bioinformatics, May 2001
18MIAME-compliant Systems
- Different labs have different needs: a lab-centric system is more desirable
- MIAME-compliant microarray database systems are still under development
- Commercial
- GeneTraffic (www.iobion.com)
- PARTISAN arrayLIMS (www.clondiag.com)
- Rosetta Resolver (www.rosettabio.com)
- ...
- Open source
- GeneX and NOMAD, among others, are still under development to become MIAME-compliant
19Future Development
- Establishing MIAME-compliant databases
- Different labs continue to develop their own systems
- A data exchange format (MAGE-ML) for communicating MIAME information
- Microarray data has no central DB yet: distributed data queries and data mining?
- HTTP/XML
- SOAP (Simple Object Access Protocol)
- WSDL (Web Services Description Language)
- UDDI (Universal Description, Discovery, and Integration)
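The idea of exchanging annotation as XML between distributed systems can be sketched with Python's standard library. The element names below are hypothetical and deliberately simple; they are not the MAGE-ML schema, and no network transport (HTTP, SOAP) is shown.

```python
import xml.etree.ElementTree as ET

def to_xml(annotations):
    """Serialize a flat dict of annotations to an XML string.
    The layout is hypothetical, not MAGE-ML."""
    root = ET.Element("experiment")
    for key, value in annotations.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")

def from_xml(text):
    """Parse the XML produced by to_xml back into a dict."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}

# Hypothetical annotation values using terms from the MIAME slides.
annotations = {
    "experiment_type": "time course",
    "label": "Cy5",
    "normalisation": "total array",
}

payload = to_xml(annotations)
print(payload)
print(from_xml(payload) == annotations)
```

Because both sides agree on the document structure, a receiving database can rebuild the annotation without knowing anything about the sender's internal schema, which is the role a shared format such as MAGE-ML is meant to play.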