Mass Spectrometry Database Infrastructure and Data Import Module

Jennifer Van Puymbrouck [1], Eric Puryear [2], David Angulo [3], Alex Schilling [4], Kevin Drew [5], David Jabon [6], Gregor von Laszewski [7]

[1] DePaul University, jvanpuy_at_gmail.com
[2] DePaul University, epuryear_at_gmail.com
[3] DePaul University, dangulo_at_cs.depaul.edu
[4] University of Illinois at Chicago, aschilli_at_uic.edu
[5] The University of Chicago, kdrew_at_uchicago.edu
[6] DePaul University, djabon_at_depaul.edu
[7] Argonne National Laboratory, gregor_at_mcs.anl.gov

Project Background
Illinois Bio-Grid Background
About The Illinois Bio-Grid
The Illinois Bio-Grid (IBG) is a consortium of universities, national laboratories, and private corporations dedicated to researching and solving computationally complex biological problems. IBG members share computational resources, allowing all parties to collaborate and solve large problems that are too complex for any one organization to solve independently.

About the IOIM
Input and output of mzXML, including import into databases, are critical for several projects within the Illinois Bio-Grid (IBG). The Input Output and Import for mzXML (IOIM) project addresses this need by providing a means of importing mass spectrometry data into the IBG databases, as well as a general-purpose mzXML I/O library.

Figure: IOIM data flow. Mass Spectrometry Data (.dta, .mgf, .mzXML, other formats) -> mzXML File Conversion -> Data Manipulation -> mzXML File -> Database Import Module -> IBG Database

Mass spectrometry data can be in several different file formats, including .dta, .mgf, and .mzXML.
Data that is not already in the mzXML format is converted to mzXML.
Mass spectrometry data is now in the universal mzXML format and can be used by other components of the IOIM.
Data is processed and modified before being written to an mzXML file for storage on the file system.
Data is read from the mzXML file and imported into the database.
The IBG Database stores millions of empirically derived, annotated spectra.

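The first decision in the data flow, whether a file still needs conversion to mzXML, can be sketched as a lookup on the file extension. This is only an illustration: the enum SpectrumFormat and its methods are hypothetical names, and the poster does not say how the IOIM actually detects input formats.

```java
import java.util.Locale;

/** Hypothetical illustration (not IOIM code): classify an input file by its extension. */
enum SpectrumFormat {
    DTA, MGF, MZXML, OTHER;

    /** Guesses the format from the file name; assumes the extension is trustworthy. */
    static SpectrumFormat fromFileName(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".mzxml")) return MZXML;
        if (lower.endsWith(".mgf")) return MGF;
        if (lower.endsWith(".dta")) return DTA;
        return OTHER;
    }

    /** Anything that is not already mzXML goes through the conversion step. */
    boolean needsConversion() {
        return this != MZXML;
    }
}
```

For example, SpectrumFormat.fromFileName("run1.mgf").needsConversion() is true, while a file named "run1.mzXML" would skip conversion.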
This material is based upon work supported by the
National Science Foundation under Grant No.
0353989.
Implementation
The IOIM is implemented in Java and C. The portion implemented in C converts mass spectrometry data from the various formats (such as .dta and .mgf) to mzXML and handles the general-purpose I/O of mzXML files. These general-purpose I/O routines are also available as shared libraries, allowing other programs to read and write mzXML files. The Java portion of the IOIM retrieves metadata from the mzXML files and stores this information in the IBG mass spectrometry database. The spectra are then stored on the file system for more efficient access.

mzXML
The mzXML standard is an XML-based format for the storage of mass spectrometry data. This open format is designed to be easy to implement and feature rich, and because it is based on XML, it can be extended to suit future needs. The Mass Spectrometry I/O Project and its related tools use mzXML for the storage of mass spectrometry data. Below is a small sample of mass spectrometry data in the mzXML format.

    <msManufacturer category="msManufacturer" value="ThermoFinnigan" />
    <msModel category="msModel" value="LCQ Deca" />
    <msIonisation category="msIonisation" value="ESI" />
    <msMassAnalyzer category="msMassAnalyzer" value="Ion Trap" />
    <msDetector category="msDetector" value="EMT" />
    <software type="acquisition" name="Xcalibur" version="1.3 alpha 6" />
  </msInstrument>
  <dataProcessing centroided="1">
    <software type="conversion" name="Thermo2mzXML" version="1" />
  </dataProcessing>
  <scan num="1" msLevel="1" peaksCount="780" polarity="" retentionTime="PT180.11S" lowMz="400" highMz="1800" basePeakMz="1727.36" basePeakIntensity="6.75227e007" totIonCurrent="5.90564e008">

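As a rough illustration of the kind of metadata retrieval described for the Java portion, the sketch below reads a few attributes of the first scan element using the JDK's built-in DOM parser. The class ScanMetadata and its fields are hypothetical and are not taken from the IOIM source; the attribute names come from the sample above, and the retentionTime value is treated as an ISO 8601 duration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

/** Hypothetical helper (not IOIM code): holds a few attributes of one mzXML scan. */
final class ScanMetadata {
    final int num;
    final int msLevel;
    final int peaksCount;
    final double retentionSeconds;
    final double basePeakMz;

    ScanMetadata(int num, int msLevel, int peaksCount,
                 double retentionSeconds, double basePeakMz) {
        this.num = num;
        this.msLevel = msLevel;
        this.peaksCount = peaksCount;
        this.retentionSeconds = retentionSeconds;
        this.basePeakMz = basePeakMz;
    }

    /** Parses well-formed XML text and reads the first scan element it contains. */
    static ScanMetadata fromXml(String xml) {
        Document doc;
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse mzXML text", e);
        }
        Element scan = (Element) doc.getElementsByTagName("scan").item(0);
        if (scan == null) {
            throw new IllegalArgumentException("no scan element found");
        }
        // retentionTime looks like "PT180.11S", an ISO 8601 duration.
        Duration rt = Duration.parse(scan.getAttribute("retentionTime"));
        return new ScanMetadata(
            Integer.parseInt(scan.getAttribute("num")),
            Integer.parseInt(scan.getAttribute("msLevel")),
            Integer.parseInt(scan.getAttribute("peaksCount")),
            rt.toMillis() / 1000.0,
            Double.parseDouble(scan.getAttribute("basePeakMz")));
    }

    public static void main(String[] args) {
        // The msRun wrapper element is only there to make the fragment well formed.
        String xml = "<msRun><scan num=\"1\" msLevel=\"1\" peaksCount=\"780\" "
            + "retentionTime=\"PT180.11S\" basePeakMz=\"1727.36\"/></msRun>";
        ScanMetadata m = fromXml(xml);
        System.out.println(m.num + " " + m.peaksCount + " " + m.retentionSeconds);
    }
}
```

A DOM parser keeps the sketch short; for files holding many scans, a streaming parser would probably be a better fit, and values extracted this way would then be written to the database by whatever persistence layer the project uses.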
Below is a sample of the pseudocode used to read and write mass spectrometry data in various file formats.

  AccessionRecordPointer read_mzXML (mzXML file)
  // reads an mzXML file and creates an Accession Record that represents the file.

  MassSpectrumPointer read_mgf (mgf file)
  // reads an mgf file and creates a Mass Spectrum that represents the file.

  MassSpectrumPointer read_dta (dta file)
  // reads a dta file and creates a Mass Spectrum that represents the file.

  void write_mzXML (AccessionRecordPointer)
  // reads an Accession Record and writes the data to an mzXML file.
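
To make the read_mgf entry more concrete, here is a minimal sketch of an MGF reader. It is written in Java rather than C to keep a single example language, it works on in-memory text instead of a file handle, and the class MgfSpectrum is a hypothetical stand-in for the Mass Spectrum record. It assumes the usual Mascot Generic Format layout: BEGIN IONS and END IONS delimiters, KEY=value header lines, and whitespace-separated m/z and intensity pairs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the poster's "Mass Spectrum" record (not IOIM code). */
final class MgfSpectrum {
    String title = "";
    double precursorMz = Double.NaN;
    final List<double[]> peaks = new ArrayList<>(); // each entry is {m/z, intensity}

    /** Parses MGF text into one MgfSpectrum per BEGIN IONS / END IONS block. */
    static List<MgfSpectrum> readMgf(String text) {
        List<MgfSpectrum> spectra = new ArrayList<>();
        MgfSpectrum current = null;
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            if (line.equals("BEGIN IONS")) {
                current = new MgfSpectrum();
            } else if (line.equals("END IONS")) {
                if (current != null) {
                    spectra.add(current);
                }
                current = null;
            } else if (current == null) {
                continue; // parameters outside a spectrum block are ignored here
            } else if (line.startsWith("TITLE=")) {
                current.title = line.substring("TITLE=".length());
            } else if (line.startsWith("PEPMASS=")) {
                // PEPMASS may carry an optional intensity after the m/z value.
                String value = line.substring("PEPMASS=".length()).trim();
                current.precursorMz = Double.parseDouble(value.split("\\s+")[0]);
            } else if (line.contains("=")) {
                continue; // other KEY=value headers (such as CHARGE) are skipped
            } else {
                String[] parts = line.split("\\s+");
                if (parts.length >= 2) {
                    current.peaks.add(new double[] {
                        Double.parseDouble(parts[0]),
                        Double.parseDouble(parts[1])
                    });
                }
            }
        }
        return spectra;
    }
}
```

Calling MgfSpectrum.readMgf on the contents of an .mgf file returns a list with one entry per spectrum; a converter along the lines of write_mzXML could then serialize each entry as a scan element.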