Microarray experiments. Database and Analysis Tools. - PowerPoint PPT Presentation

About This Presentation
Title:

Microarray experiments. Database and Analysis Tools.

Description:

Title: PowerPoint Presentation Author: Alexander Milov Last modified by: Aldo Created Date: 9/24/2002 11:47:14 PM Document presentation format: On-screen Show – PowerPoint PPT presentation

Slides: 29
Provided by: Alexande200

Transcript and Presenter's Notes

Title: Microarray experiments. Database and Analysis Tools.


1
Microarray experiments. Database and Analysis
Tools.
Kate Milova cDNA Microarray Facility March
24, 2005
2
Outline.
  • Microarray platforms and services available at
    AECOM
  • cDNA
  • Long Oligo
  • Affymetrix
  • Database (cDNA and Long Oligo) structure and
    content
  • Printing information
  • Chip layout
  • Annotation
  • Annotation algorithms and data mining
  • On-line Analysis Tools
  • Normalization
  • Signal filtering
  • Comparison
  • Statistical packages and Analysis software
  • Summary

3
Microarray Platforms at AECOM.
4
How to choose a microarray platform.
5
Before starting your microarray experiment.
6
cDNA Microarray Facility. Home page.
Standard and Custom Arrays. Description and Prices
Hybridization, labeling, bioinformatics, workshops
Database for cDNA and Long Oligo Arrays. Analysis
Pipeline
AECOM cDNA microarray facility. Supported
publications
Useful links of analysis tools
7
Database for Analysis of Microarrays at AECOM.
Contents.
Gene Annotation
  • Accession
  • Clone ID
  • Clone end
  • Vector name
  • Clone name
  • UniGene cluster ID
  • Best BLAST hit
  • Main BLAST parameters (score, E-value,
    identity, BLAST date, etc.)
  • Gene ID
  • Gene symbol
  • Gene synonyms
  • Chromosome
  • Map location
  • GO IDs
  • GO Annotation
Chip layout
  • Chip name
  • Spot information (accession, clone ID, or
    bacterial control)
  • Spot location
  • Library name
  • Clone location on 384 plate
  • Clone location on 96 plate
Printing Information
  • Chip name
  • Species
  • Number of spots
  • Number of controls
  • Number of pen domains
  • Number of slides
  • Printing pattern
  • Distance between spots
  • Number of rows
  • Number of columns
  • Printing date
  • Master chip

8
Annotation sources NCBI.
UniGene ID ← Accession
UniGene
UniGene ID ← BLAST against UniGene clusters
Entrez Gene
UniGene ID → Gene ID → GO ID
NCBI
BLAST Software
BLAST Search
RefSeq and NT databases → Annotation
9
Annotation sources NCBI.
UniGene ID ← Accession
UniGene
UniGene ID ← BLAST against UniGene clusters
  • NCBI → UniGene → UniGene ID
  • The UniGene ID for cDNA arrays is obtained from the
    UniGene source file, using the accession number
    of each clone.
  • NCBI → UniGene → BLAST
  • The UniGene ID for Long Oligo arrays is obtained
    from BLAST results.
  • The BLAST search was run with the set of oligo
    sequences against UniGene clusters, with cutoffs
    of 99% for sequence identity and 90% for
    overlap.
  • An oligo that hits multiple UniGene clusters is
    marked with an Ambiguous cluster ID.
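
The assignment rule for Long Oligo arrays described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the facility's code: the `OligoHit` record, its field names, and the function name are hypothetical, the cutoffs are read as percentages, and treating them as inclusive is a guess.

```python
from dataclasses import dataclass

IDENTITY_CUTOFF = 99.0  # percent identity cutoff quoted on the slide
OVERLAP_CUTOFF = 90.0   # percent overlap cutoff quoted on the slide


@dataclass(frozen=True)
class OligoHit:
    """One BLAST hit of an oligo against a UniGene cluster (hypothetical record)."""
    cluster_id: str
    identity: float  # percent identity of the alignment
    overlap: float   # percent of the oligo covered by the alignment


def assign_unigene_id(hits):
    """Return a cluster ID, "Ambiguous", or None if no hit passes the cutoffs.

    Assumption: a hit passes when it meets or exceeds both cutoffs.
    """
    clusters = {
        h.cluster_id
        for h in hits
        if h.identity >= IDENTITY_CUTOFF and h.overlap >= OVERLAP_CUTOFF
    }
    if not clusters:
        return None
    if len(clusters) > 1:
        return "Ambiguous"
    return next(iter(clusters))
```

Collecting the passing cluster IDs in a set means several hits to the same cluster still count as one unambiguous assignment.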

NCBI
10
Annotation sources NCBI.
  • UniGene ID → Gene ID
  • All information retrieved from the Entrez Gene
    project is based on the UniGene cluster ID and
    the corresponding Gene ID.
  • The link between Gene ID and UniGene cluster ID
    can be ambiguous.
  • A parsing filter was used to eliminate ambiguous
    Gene IDs.
  • Gene ID → GO ID
  • For each Gene ID, the corresponding Gene Ontology
    IDs were retrieved from the Entrez Gene source file.
  • A Gene ID may have anywhere from a few to more
    than 10 different GO IDs. All of them are collected.
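
One way to picture these two steps is the sketch below. The slides do not define the parsing filter, so the rule used here (keep a link only when the cluster has exactly one Gene ID and that Gene ID belongs to exactly one cluster) is an assumption, as are the function names and the input shapes. The IDs in the usage note are illustrative values only.

```python
from collections import defaultdict


def one_to_one_links(pairs):
    """Map UniGene cluster ID -> Gene ID, keeping only unambiguous links.

    pairs: iterable of (cluster_id, gene_id) tuples.
    Assumed rule: a link is ambiguous if either side has more than one partner.
    """
    genes_by_cluster = defaultdict(set)
    clusters_by_gene = defaultdict(set)
    for cluster_id, gene_id in pairs:
        genes_by_cluster[cluster_id].add(gene_id)
        clusters_by_gene[gene_id].add(cluster_id)

    links = {}
    for cluster_id, genes in genes_by_cluster.items():
        if len(genes) != 1:
            continue  # cluster linked to several Gene IDs
        gene_id = next(iter(genes))
        if len(clusters_by_gene[gene_id]) == 1:
            links[cluster_id] = gene_id
    return links


def collect_go_ids(links, go_by_gene):
    """Attach every GO ID recorded for each retained Gene ID (possibly none)."""
    return {
        cluster_id: list(go_by_gene.get(gene_id, []))
        for cluster_id, gene_id in links.items()
    }
```

For example, with the pairs `("Hs.1", 10)`, `("Hs.2", 20)`, and `("Hs.3", 20)`, only `Hs.1` survives, because Gene ID 20 is tied to two clusters.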

11
Annotation sources NCBI.
  • The BLAST software package is installed on the
    microarray server.
  • This software lets us format databases and run
    batch homology searches for any combination of
    custom databases and query sequences.
  • RefSeq and NT databases → Annotation
  • The databases are loaded, formatted, and
    periodically updated on the microarray server.
  • When the databases are updated, we run a BLAST
    search of the cDNA and Long Oligo sequences.
  • BLAST results are parsed using our algorithm for
    annotation extraction.
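
As a rough idea of what the parsing step might start from, the sketch below reads BLAST results saved in the standard 12-column tabular format (legacy `-m 8`, BLAST+ `-outfmt 6`). The slides do not say which output format the facility uses, so the format choice, the `BlastHit` record, and the function name are all assumptions.

```python
from typing import NamedTuple


class BlastHit(NamedTuple):
    """Selected columns of one tabular BLAST result line (hypothetical record)."""
    query: str
    subject: str
    identity: float  # percent identity
    length: int      # alignment length
    evalue: float
    bitscore: float


def parse_tabular(text):
    """Parse tab-separated BLAST output, skipping blank and comment lines.

    Column order assumed: query, subject, % identity, alignment length,
    mismatches, gap opens, q.start, q.end, s.start, s.end, E-value, bit score.
    """
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append(
            BlastHit(f[0], f[1], float(f[2]), int(f[3]), float(f[10]), float(f[11]))
        )
    return hits
```

Hits come back in file order, which for BLAST output is the ranking order within each query, so later filtering steps can rely on "first good hit" semantics.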

NCBI
Blast Software
Blast Search
RefSeq and NT databases → Annotation
12
Annotation Extraction Algorithm.
Raw Data
Sequences
Homology search against RefSeq and NT
Alignment quality check
>90% identity
>80% overlap
13
Annotation sources Gene Ontology.
Gene Ontology
Biological process
Molecular function
Cellular component
  • Gene Ontology.
  • Multiple GO IDs for each Gene ID are retrieved
    in the previous step from Entrez Gene (if
    available).
  • Gene Ontology annotation for all GO IDs is kept
    in three separate information fields: biological
    process, molecular function, and cellular
    component. For each field, all available
    annotation was filtered for redundancy and then
    concatenated.
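
The redundancy check and concatenation can be sketched as below. The function name, the `"; "` separator, and the input shape are assumptions. The third field is named `cellular_component` here, following Gene Ontology's own name for that namespace.

```python
FIELDS = ("biological_process", "molecular_function", "cellular_component")


def build_go_fields(terms, sep="; "):
    """Build one concatenated annotation string per GO namespace.

    terms: iterable of (namespace, description) pairs for a single Gene ID.
    Duplicate descriptions within a namespace are kept once, in first-seen order.
    """
    collected = {name: [] for name in FIELDS}
    for namespace, description in terms:
        if namespace in collected and description not in collected[namespace]:
            collected[namespace].append(description)
    return {name: sep.join(values) for name, values in collected.items()}
```

A namespace with no annotation yields an empty string, so every Gene ID ends up with the same three fields.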

14
cDNA Microarray Facility. Database.
15
Database Search.
16
Microarray Data Analysis Pipeline.
17
Pipeline. LOWESS Normalization.
18
cDNA Microarray Facility. Pipeline. Filtering.
19
Pipeline. Data set Comparison.
20
Summary

21
cDNA Microarray Facility. Services.
22
Annotation Extraction Algorithm.
Database of cDNA and Long Oligo sequences
BLAST search against RefSeq and NT databases
All hits are examined with an alignment quality check
Only hits with >90% identity are kept
All hits are divided into two groups: 1. >80%
overlap and 2. <80% overlap (partially similar)
All hits then go through a linguistic filter
Hits that pass both tests are defined as Good
Hits
  • The best BLAST hit is
  • the first good RefSeq hit from group 1, OR
  • the first good NT hit from group 1, OR
  • the first good RefSeq hit from group 2, OR
  • the first good NT hit from group 2
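
The selection order on this slide can be expressed as a short Python sketch. It is an illustration under assumptions rather than the facility's implementation: the `CandidateHit` record is hypothetical, the linguistic filter is not described in the slides and is represented only by a precomputed flag, and the "two tests" are taken to be the identity check and the linguistic filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateHit:
    """One BLAST hit for a clone or oligo sequence (hypothetical record)."""
    subject: str
    source: str      # "refseq" or "nt"
    identity: float  # percent identity
    overlap: float   # percent overlap with the query
    passes_linguistic_filter: bool  # result of a filter not specified in the slides


def best_hit(hits):
    """Pick the best hit from a list given in BLAST rank order, or None.

    Good hits have >90% identity and pass the linguistic filter.
    Group 1 has >80% overlap; everything else falls into group 2.
    """
    good = [h for h in hits if h.identity > 90.0 and h.passes_linguistic_filter]
    for in_group_1 in (True, False):
        for source in ("refseq", "nt"):
            for h in good:
                if h.source == source and (h.overlap > 80.0) == in_group_1:
                    return h
    return None
```

The nested loops mirror the OR chain directly: RefSeq before NT within a group, and group 1 before group 2.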
23
cDNA Microarray Facility. Arrays.
24
cDNA Microarray Facility. Publications.
25
Annotation Extraction Algorithm.
Raw Data
Sequences
Homology search against RefSeq and NT
26
Before starting your microarray experiment.
27
Microarray Experts.
28
Microarray Platforms at AECOM.