An evaluation of representing proteomics experiments within FuGE-OM

1
An evaluation of representing proteomics
experiments within FuGE-OM
  • Andrew Jones
  • Department of Computer Science,
  • University of Manchester
  • Angel Pizarro
  • Institute for Translational Medicine and
    Therapeutics,
  • University of Pennsylvania

2
Requirements of functional genomics
  • FG produces large, complex data sets (lots of
    metadata!)
  • Various standard formats being developed (PSI,
    MGED, SMRS)
  • Most investigation types have a hypothesis,
    biological sample, basic lab procedures etc
  • Similar analyses performed over different types
    of FG data set
  • Queries, statistics and so on

3
Benefits of shared model components
  • Queries over common annotation
  • Samples, hypotheses, protocols
  • Shared software for experimental annotation and
    analysis
  • Microarrays, proteomics and metabolomics
    performed in same lab
  • Developing standards for each technique is a hard
    problem
  • Shared resources could alleviate problems

4
The Functional Genomics Experiment Object Model
(FuGE-OM)
  • Aim
  • Model to support generation of functional
    genomics data formats
  • Learn from experience of MAGE-OM (and PEDRo)
  • Divide into generic and technology-specific
  • Result
  • FuGE-OM is far simpler than MAGE-OM
  • Covers a wide range of use cases due to its
    generic structure
  • Two namespaces in core: Bio and Common
  • MAGE extension in development

5
Purpose of this Evaluation
  • Test if proteome experiments could be modelled
    within FuGE-OM
  • To verify if Common and Bio are generic to
    functional genomics
  • Demonstrate correspondence with PSI-OM (and
    PEDRo) for experiment, lab procedures etc.
  • Does FuGE-OM give additional features to the
    models?
  • Gain feedback from microarray / proteome
    community on modelling issues

6
FuGE.Common
FuGE.Bio
  • Audit
  • Description
  • Ontology
  • Protocol
  • Reference
  • Experiment
  • Material
  • BioSequence
  • Data

Common does not rely upon Bio; it can be used
independently
7
Common.Protocol
  • Action
  • Ordered list of atomic actions (simple protocol
    steps)
  • Plain text or standard term from CV
  • Allows nesting of protocols
  • (Complex actions become a nested protocol)
  • Association to Equipment and Software
  • Parameters and default values can be captured

The class Protocol describes a standard protocol
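
The Protocol description above can be sketched in code. This is a minimal sketch, not the FuGE-OM definition: the attribute names, the `flatten` helper and the gel example are assumptions made for illustration, and only the concepts (ordered Actions, nested protocols, Equipment, Software, Parameters with defaults) come from the slide.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Parameter:
    name: str
    default_value: Optional[str] = None  # default captured on the Protocol


@dataclass
class Action:
    # Plain text or a standard term from a controlled vocabulary (CV).
    text: str
    # A complex action points at a nested Protocol instead of being atomic.
    child_protocol: Optional[Protocol] = None


@dataclass
class Protocol:
    name: str
    actions: list[Action] = field(default_factory=list)  # order matters
    equipment: list[str] = field(default_factory=list)
    software: list[str] = field(default_factory=list)
    parameters: list[Parameter] = field(default_factory=list)

    def flatten(self) -> list[str]:
        """Return the atomic steps in order, expanding nested protocols."""
        steps: list[str] = []
        for action in self.actions:
            if action.child_protocol is None:
                steps.append(action.text)
            else:
                steps.extend(action.child_protocol.flatten())
        return steps


# Hypothetical example: staining is a nested protocol inside a gel run.
staining = Protocol("Gel staining", actions=[Action("Fix gel"), Action("Stain gel")])
gel_run = Protocol(
    "2D gel electrophoresis",
    actions=[
        Action("Load sample"),
        Action("Run gel"),
        Action("Stain", child_protocol=staining),
    ],
    equipment=["Gel tank"],
    parameters=[Parameter("voltage", default_value="200 V")],
)
print(gel_run.flatten())
```

Keeping a nested protocol as a reference on the Action, rather than copying its steps, is one way to let the same sub-protocol be reused by several parent protocols.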
8
Common.Protocol
  • Date performed
  • If parameter values differ from default
  • Particular instrument used e.g. from a pool
  • Protocol operator (Person)

ProtocolApplication is an instance of a Protocol
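
A minimal sketch of the Protocol / ProtocolApplication split, assuming hypothetical attribute names and placeholder values. It shows one possible reading of the slide: the Protocol holds the defaults, and each application records the date, operator, instrument and only those parameter values that differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass
class Protocol:
    name: str
    default_parameters: dict[str, str] = field(default_factory=dict)


@dataclass
class ProtocolApplication:
    protocol: Protocol
    date_performed: date
    operator: str    # the Person who ran the protocol
    instrument: str  # the particular instrument used, e.g. one from a pool
    parameter_overrides: dict[str, str] = field(default_factory=dict)

    def effective_parameters(self) -> dict[str, str]:
        """Defaults from the Protocol, replaced wherever this run differed."""
        return {**self.protocol.default_parameters, **self.parameter_overrides}


# Placeholder values, for illustration only.
separation = Protocol("Gel separation", {"voltage": "200 V", "time": "5 h"})
run = ProtocolApplication(
    protocol=separation,
    date_performed=date(2005, 1, 1),
    operator="A. Operator",
    instrument="Gel tank 2",
    parameter_overrides={"voltage": "150 V"},
)
print(run.effective_parameters())
```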
9
Common.Protocol
1. Material package
  • Convert one type of substance to another
  • e.g. protein solubilisation
2. Data package
  • Acquire data by analysing a particular material
  • e.g. gel scanning
3. Data package
  • Create a new data set by transforming an existing
    data set
  • e.g. gel image analysis
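
The three cases above differ only in what goes in and what comes out. A minimal sketch, assuming hypothetical class and attribute names; the labels returned by `kind` are descriptive and are not claimed to be FuGE-OM class names.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Material:
    name: str


@dataclass(frozen=True)
class Data:
    name: str


@dataclass
class ProtocolApplication:
    description: str
    source: Union[Material, Data]
    product: Union[Material, Data]

    def kind(self) -> str:
        """Classify the step by the types of its input and output."""
        pair = (type(self.source), type(self.product))
        if pair == (Material, Material):
            return "material treatment"
        if pair == (Material, Data):
            return "data acquisition"
        if pair == (Data, Data):
            return "data transformation"
        raise ValueError(f"unsupported step: {self.description}")


# The three examples named on the slide; material and data names are made up.
steps = [
    ProtocolApplication("protein solubilisation", Material("tissue"), Material("protein solution")),
    ProtocolApplication("gel scanning", Material("separated gel"), Data("gel image")),
    ProtocolApplication("gel image analysis", Data("gel image"), Data("spot list")),
]
kinds = [step.kind() for step in steps]
print(kinds)
```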

10
Bio.Experiment
  • Package can describe a wide range of
    investigation types
  • Class names are open for discussion!

11
PSI-OM Study
  • Experimental Factors can be biological or
    technical
  • e.g. Time course
  • Factor Value captures difference between factors
  • e.g. Time points / strains / treatments
  • Link to Data dimension

Experiment is similar to Project; ExperimentDesign is
similar to Study
12
Experiment: find gene / protein expression in three different mouse strains
Description: replicate design, quality control etc.
(OT = OntologyTerm)
  • Experiment
  • ExperimentDesign
  • DesignType: BiologicalProperty (OT)
  • ExperimentalFactor
  • FactorType: BioSourceType (OT)
  • FactorValue (bioSource: Material): Strain A (OT)
  • FactorValue (bioSource: Material): Strain B (OT)
  • FactorValue (bioSource: Material): Strain C (OT)
  • Data of Interest
Link from each FactorValue to corresponding data values
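
The three-strain design can be sketched as nested objects. This is a sketch under stated assumptions: the class names follow the slide, but every attribute name, the `data_refs` field and the `data_by_factor_value` helper are hypothetical, and the data references are placeholders rather than real data.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class OntologyTerm:
    name: str


@dataclass
class FactorValue:
    value: OntologyTerm
    bio_source: str  # name of the Material this value refers to
    data_refs: list[str] = field(default_factory=list)  # link to data values


@dataclass
class ExperimentalFactor:
    factor_type: OntologyTerm
    values: list[FactorValue]


@dataclass
class ExperimentDesign:
    design_type: OntologyTerm
    factors: list[ExperimentalFactor]


@dataclass
class Experiment:
    hypothesis: str
    description: str
    design: ExperimentDesign


def data_by_factor_value(experiment: Experiment) -> dict[str, list[str]]:
    """Map each factor value name to the data linked from it."""
    return {
        fv.value.name: fv.data_refs
        for factor in experiment.design.factors
        for fv in factor.values
    }


strains = ["Strain A", "Strain B", "Strain C"]
factor = ExperimentalFactor(
    factor_type=OntologyTerm("BioSourceType"),
    values=[
        FactorValue(OntologyTerm(s), bio_source=f"Material ({s})", data_refs=[f"data for {s}"])
        for s in strains
    ],
)
experiment = Experiment(
    hypothesis="Find gene / protein expression in three different mouse strains",
    description="Replicate design, quality control etc.",
    design=ExperimentDesign(OntologyTerm("BiologicalProperty"), [factor]),
)
print(data_by_factor_value(experiment))
```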
13
Experiment: find gene / protein expression, sampling from a population at 4 time points
Description: replicate design, quality control etc.
(OT = OntologyTerm)
  • Experiment
  • ExperimentDesign
  • DesignType: MethodologicalDesign (OT)
  • ExperimentalFactor
  • FactorType: TimePoint (OT)
  • FactorValue: 1 hour (OT)
  • FactorValue: 2 hours (OT)
  • FactorValue: 4 hours (OT)
  • FactorValue: 24 hours (OT)
  • bioSource: Material
  • Data of Interest
Information about how sampling was done is held in the
Material package
14
Experiment: find difference in protein expression detected using two types of mass
spectrometer
Description: replicate design, quality control etc.
(OT = OntologyTerm)
  • Experiment
  • ExperimentDesign
  • DesignType: MethodologicalDesign (OT)
  • ExperimentalFactor
  • FactorType: hardwareVariation (OT)
  • FactorValue: Micromass (OT)
  • FactorValue: Q-star (OT)
  • bioSource: Material
  • Data of Interest
The workflows differ only in the Data package: different
DataAcquisitions
15
Bio.Material
  • Materials have a type and various
    characteristics, described using an ontology

Can have sub-components, e.g. Plate and Plate Well
Used for describing a cycle of treatments (in
conjunction with Protocol)
16
Proteome workflow
  • Material Measurement (x2)
  • Material: Gel2D (dimensions, composition etc.)
  • MaterialTreatment: Gel Separation (voltage, time)
  • Material: Separated Gel
  • MaterialTreatment: Spot Picking (X/Y coords)
  • Material
  • Create subclasses of Material and
    MaterialTreatment to specify attributes about
    gels and separation
  • Allows use of other Bio and Common features:
    ontology references (for Material), link to
    Experiment, auditing etc.
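
A minimal sketch of the subclassing idea, assuming hypothetical attribute names and placeholder values. The point is that code written against the generic Material and MaterialTreatment classes (here `lineage`) keeps working when gel-specific subclasses are added.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Material:
    name: str
    material_type: str  # would be an ontology reference in practice


@dataclass
class Gel2D(Material):
    dimensions: str = ""
    composition: str = ""


@dataclass
class SeparatedGel(Material):
    pass


@dataclass
class MaterialTreatment:
    source: Material
    product: Material


@dataclass
class GelSeparation(MaterialTreatment):
    voltage: str
    time: str


@dataclass
class SpotPicking(MaterialTreatment):
    x: float
    y: float


def lineage(treatments: list[MaterialTreatment]) -> list[str]:
    """Follow a chain of treatments using only the generic attributes."""
    return [treatments[0].source.name] + [t.product.name for t in treatments]


# Placeholder values, for illustration only.
gel = Gel2D("gel 1", "gel", dimensions="placeholder", composition="placeholder")
separated = SeparatedGel("gel 1 after separation", "gel")
spot = Material("picked spot", "gel spot")
workflow = [
    GelSeparation(gel, separated, voltage="placeholder", time="placeholder"),
    SpotPicking(separated, spot, x=0.0, y=0.0),
]
print(lineage(workflow))
```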

17
PSI-OM Assay
  • Material Measurement (x2)
  • Material: Gel2D
  • MaterialTreatment: Gel Separation
  • Material: Separated Gel
  • MaterialTreatment: Spot Picking
18
Proteome workflow
  • Material Measurement (x2)
  • Material: Column (type, size, beads etc.)
  • MaterialTreatment: Column Separation (time, flow rate)
  • Material
  • MaterialTreatment: Collect Fraction (startPoint, endPoint)
  • Material
  • Extensions allow specific attributes to be
    captured about column, separation and collection
    of fractions
  • End Material can go to any other kind of
    treatment e.g. mass spec / gel etc

19
2D LC-MS use case
  • Yeast culture
  • 1. Extract proteins
  • 2. Digest into peptides
  • Ion exchange
  • Reverse phase
  • Mass Spectrometry
  • Peak lists
  • MassSpec Informatics
  • Protein list: ABC1, DFR2, DFF . . .
20
2D LC-MS use case
Various ways mzData / mzIdent could reference a
FuGE entry
  • Diagram elements: Protocol Application,
    CollectFraction, mzData, mzIdent, IdentifiedProtein
  • mzIdent to IdentifiedProtein: 1 to n
  • Legend: Bio or Common; PSI extension
21
DIGE Use Case
1. Extract protein (x2)
2. Pool
3. Labelling (Cy2, Cy3, Cy5)
4. Pool
5. Gel Electrophoresis
6. Imaging
22
DIGE Use Case
Material (Sample A)
Material (Sample B)
Described using ontologies
MaterialTreatment (ProteinExtraction)
MaterialTreatment (ProteinExtraction)
Material (Protein mix.)
Material (Protein mix.)
Protocol Application
MaterialTreatment (ProteinExtraction)
Mat. Treat.
Protocol
Material (pooled)
Material (Cy3)
Protocol Application
MaterialTreatment (ProteinExtraction)
Material (Cy2)
MaterialTreatment (Labelling)
MaterialTreatment (Labelling)
MaterialTreatment (Labelling)
Material (Cy5)
Material (Cy2 Labelled)
Material (Cy3 Labelled)
Material (Cy5 Labelled)
23
DIGE Use Case
Material (Cy2 Labelled)
Material (Cy3 Labelled)
Material (Cy5 Labelled)
Mat. Treat (pooling)
Material (Pooled mix.)
Gel2D (gel params)
GelSeparation (separation params)
SeparatedGel

Bio or Common
DataAcquisition (Scanning)

PSI extension
ImageData
24
mzData
  • Could be linked by mapping mzData back to UML
  • OR
  • Use an XLink to associate a FuGE-ML entry with an
    mzData document
  • (for sample processing / experiment description
    etc.)
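
The XLink option could look roughly like the sketch below. The element names `FuGEEntry` and `ExternalData`, the identifier and the file name are all hypothetical, since no FuGE-ML schema is given in these slides; only the `xlink:type` and `xlink:href` attributes and their namespace come from the XLink standard.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK_NS)


def link_to_mzdata(entry_id: str, mzdata_uri: str) -> ET.Element:
    """Build a (hypothetical) FuGE-ML entry that points at an mzData file."""
    entry = ET.Element("FuGEEntry", {"identifier": entry_id})  # hypothetical element
    ref = ET.SubElement(entry, "ExternalData")  # hypothetical element
    ref.set(f"{{{XLINK_NS}}}type", "simple")
    ref.set(f"{{{XLINK_NS}}}href", mzdata_uri)
    return entry


entry = link_to_mzdata("DataAcquisition:1", "run01.mzData.xml")
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

With this approach the mzData document stays unchanged, and the sample processing and experiment description live in the linking entry.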

25
Issues for community involvement
  • How can FuGE help PSI model development process?
  • Does Experiment package capture all proteome /
    RSBI use cases?
  • How could mzData be integrated with FuGE?
  • E.g. XLink, UML level (shared model components)

26
Conclusions
  • Common and Bio allow descriptions of experiment,
    lab procedures, data dimensions
  • Could give additional annotation capabilities to
    current proteome models
  • PSI could extend components to make explicit
    proteome concepts to be captured
  • Mass spec formats could reference a FuGE entry
    for samples, hypothesis, experiment etc

27
Plan for future work
  • XML Schema generated in near future for creating
    test data sets
  • Use cases defined and made available to
    demonstrate correct use of model
  • Weekly conference call to solve modelling issues
  • We would like to encourage PSI involvement in
    further development of Bio and Common

http://fuge.sourceforge.net
28
Acknowledgements
  • Modelling
  • Paul Spellman, Chris Taylor, Norman Paton and
    many others at MGED meetings

http://fuge.sourceforge.net
29
ICAT Use Case
30
DIGE Use Case
Tracking materials through separations /
labelling / pooling
31
[Diagram: FactorValues linked to DimensionElements; Dimensions indexing a Matrix]
  • Data
  • FactorValue to DimensionElement: Strain A, Strain B, Strain C
  • DimensionElements: Array Spot 1 / Gel Spot 1, Array Spot 2 / Gel Spot 2,
    Array Spot 3 / Gel Spot 3, . . .
  • DimensionElements: Measure 1, Measure 2, Measure 3
  • Matrix
32
Common.Data
  • Ordered set of Dimensions
  • Data stored in Matrix
  • Matrix must be extended with subclasses
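
A minimal sketch of an ordered set of Dimensions over a Matrix. The abstract `Matrix` base with a concrete subclass is one reading of "must be extended with subclasses"; `DenseMatrix`, the row-major layout and all attribute names are assumptions, and the values are placeholders.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from math import prod


@dataclass
class Dimension:
    name: str
    elements: list[str]  # labels of the DimensionElements, in order


class Matrix(ABC):
    """Generic storage; concrete layouts are supplied by subclasses."""

    def __init__(self, dimensions: list[Dimension]) -> None:
        self.dimensions = list(dimensions)  # the order is significant

    @abstractmethod
    def get(self, *labels: str) -> float:
        ...


class DenseMatrix(Matrix):
    """Hypothetical subclass storing values in a flat, row-major list."""

    def __init__(self, dimensions: list[Dimension], values: list[float]) -> None:
        super().__init__(dimensions)
        expected = prod(len(d.elements) for d in self.dimensions)
        if len(values) != expected:
            raise ValueError(f"expected {expected} values, got {len(values)}")
        self.values = list(values)

    def get(self, *labels: str) -> float:
        if len(labels) != len(self.dimensions):
            raise ValueError("one label per dimension is required")
        offset = 0
        for dim, label in zip(self.dimensions, labels):
            offset = offset * len(dim.elements) + dim.elements.index(label)
        return self.values[offset]


dims = [
    Dimension("strain", ["Strain A", "Strain B", "Strain C"]),
    Dimension("spot", ["Gel Spot 1", "Gel Spot 2"]),
]
matrix = DenseMatrix(dims, [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])  # placeholder values
print(matrix.get("Strain B", "Gel Spot 2"))
```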

33
Common.Ontology
  • Source of term captured in DatabaseEntry
  • Term stored in name
  • Value: some ontology concepts need a value
    supplied by the user
  • Self-association for nested terms

34
class: Age
namespace: http://mged.sourceforge.net/ontologies/MGEDOntology.daml
documentation: The time period elapsed since an identifiable point
in the life cycle of an organism. If a developmental stage is
specified, the identifiable point would be the beginning of that
stage. Otherwise the identifiable point must be specified, such as
planting (e.g. 3 days post planting).
constraints:
  • restriction: has_measurement has-class Measurement
  • restriction: has_initial_time_point has-class InitialTimePoint
Example: 3 days post planting
  • OntologyTerm (Name: Age, Value: null)
    • hasMeasurement: OntologyTerm (Name: Measurement, Value: 3)
      • hasUnit: OntologyTerm (Name: Unit, Value: null)
        • OntologyTerm (Name: days, Value: null)
    • hasInitialTimePoint: OntologyTerm (Name: InitialTimePoint, Value: null)
      • OntologyTerm (Name: planting, Value: null)
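
The "3 days post planting" example can be sketched with a self-referencing term class. This is a sketch under assumptions: the `role` and `nested` attributes and the `outline` helper are hypothetical, and the exact nesting is one reading of the slide's diagram.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional

# Source of the terms, as given on the slide.
MGED_ONTOLOGY = "http://mged.sourceforge.net/ontologies/MGEDOntology.daml"


@dataclass
class OntologyTerm:
    name: str
    value: Optional[str] = None  # only some concepts need a user-supplied value
    role: Optional[str] = None   # association by which this term hangs off its parent
    source: str = MGED_ONTOLOGY  # stands in for the DatabaseEntry
    nested: list[OntologyTerm] = field(default_factory=list)  # self-association

    def outline(self, depth: int = 0) -> list[str]:
        """Render the term and its nested terms as indented lines."""
        label = f"{self.role}: " if self.role else ""
        lines = [f"{'  ' * depth}{label}{self.name} (value={self.value})"]
        for term in self.nested:
            lines.extend(term.outline(depth + 1))
        return lines


age = OntologyTerm(
    "Age",
    nested=[
        OntologyTerm(
            "Measurement",
            value="3",
            role="hasMeasurement",
            nested=[OntologyTerm("Unit", role="hasUnit", nested=[OntologyTerm("days")])],
        ),
        OntologyTerm(
            "InitialTimePoint",
            role="hasInitialTimePoint",
            nested=[OntologyTerm("planting")],
        ),
    ],
)
print("\n".join(age.outline()))
```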
35
Common.Reference
Subclasses of Identifiable can be linked to
bibliographic or database entries
36
Common.Description
  • Many classes inherit from Describable
  • Link to Audit / Security details
  • URI and text description

37
Common.Audit
  • Manages changes to the document
  • Linked to Contacts
