Title: The CardioVascular Research Grid (CVRG): A National Infrastructure for Representing, Sharing, Analyzing, and Modeling Cardiovascular Data
1. The CardioVascular Research Grid (CVRG): A National Infrastructure for Representing, Sharing, Analyzing, and Modeling Cardiovascular Data
- Stephen J. Granite, MS
- Director of Database/Software Development
- The Johns Hopkins University Center for Cardiovascular Bioinformatics and Modeling
2. Why Is There A Need For The CVRG?
- The challenge of how best to represent CV data
  - Emerging data representation standards are seldom used
  - No standards for representing and no culture of sharing electrophysiological data
- The challenge of sharing data
  - National initiatives in CV genetics, genomics and proteomics are underway but there is no direct, easy way to discover data
  - Facilitate data discovery
- The challenge of how best to develop and deploy hardened data analysis workflows
- The challenge of discovering new knowledge from the CV data itself
3. Grids And E-Science
- Grids [1]
  - Interconnected networks of computers and storage systems
  - Running common software
  - Enabling resource sharing and problem solving in multi-institutional environments
- E-Science
  - Computationally intensive science carried out on grids
  - Science with immense data sets that require grid computing
- Two major bio-grids are active today
  - The Biomedical Informatics Research Network (BIRN, http://www.nbirn.net/)
  - The Cancer Biomedical Informatics Grid (caBIG, http://cabig.nci.nih.gov/)
[1] I. Foster and C. Kesselman (2004). The Grid: Blueprint for a New Computing Infrastructure. Elsevier.
4. The Biomedical Informatics Research Network (BIRN)
Principal Investigator: Mark Ellisman, UCSD
- Grid infrastructure for sharing, analyzing and visualizing brain imaging data sets
- 32 participating research sites, > 400 investigators
- 4 driving biological projects
- BIRN is a bottom-up effort (scientific applications drive technology)
[Figure labels: BWH Image Segmentation; End User Shape Visualization; UCLA Image Acquisition; JHU Shape Analysis]
5. The Cancer Biomedical Informatics Grid (caBIG)
caGrid Lead Development Team: Joel Saltz, OSU
- NCI intramural research effort, with selected external collaborators
- Develop open-source software
- Enables cancer researchers to become a caBIG node
- Share data with the cancer research community
- Develop controlled vocabularies for describing cancer phenotypes and multi-scale data
- Develop grid analytic services for analyzing cancer data sets
6. The CVRG Driving Biological Project (DBP)
The D. W. Reynolds Cardiovascular Clinical Research Center (PI: E. Marban)
- Center studies the cause and treatment of Sudden Cardiac Death (SCD) in the setting of heart failure (HF)
  - HF is the primary U.S. hospital discharge diagnosis
  - Incidence of 400,000 per year, prevalence of 4.5 million
  - Prevalence increasing as population ages
  - Leading cause of SCD (30-50% of deaths are sudden)
  - Medical expenditures: $20 billion per year
- Manifestation of HF occurs at multiple biological levels
  - Genetic Predisposition via Single/Multi-Gene Mutations
  - Modified Gene/Protein Expression
  - Electrophysiological Remodeling and Altered Cellular Function
  - Heart Shape and Motion Changes
  - Reduced cardiac output, mechanical pump failure
7. The CVRG Driving Biological Project (DBP)
The D. W. Reynolds Cardiovascular Clinical Research Center (PI: E. Marban)
- Large patient cohort (1,200) at high risk for SCD
- All have received ICD placement to prevent SCD
- Collecting multi-scale data for all these patients
- Patients with ICD firings are defined as high risk for SCD; patients without firings are defined as low risk
- Within the 1st year, only 5% of the ICDs implanted have actually fired
- Challenge: discover multi-scale biomarkers that are predictive of which patients should receive ICDs
Multi-Scale Data:
- Gene Expression Profiling
- Genetic Variability (SNPs)
- Electrophysiological Data
- Protein Expression Profiling
- Multi-Modal Imaging
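The labeling rule on this slide (ICD firing means high risk, no firing means low risk) and the first-year firing rate can be sketched in a few lines of Python. This is only an illustration: the `Patient` record, its field names, and the helper functions are hypothetical, and the cohort size is taken as exactly 1,200 with a 5% firing rate for the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical minimal patient record (not a CVRG data model)."""
    patient_id: str
    icd_fired: bool  # True if the implanted ICD fired during follow-up


def risk_label(patient: Patient) -> str:
    """Apply the slide's rule: ICD firing -> high risk, otherwise low risk."""
    return "high" if patient.icd_fired else "low"


def expected_high_risk(cohort_size: int, firing_rate: float) -> int:
    """Approximate number of high-risk patients implied by a firing rate."""
    return round(cohort_size * firing_rate)


# Two illustrative patients, labeled by the rule above.
cohort = [Patient("A", icd_fired=True), Patient("B", icd_fired=False)]
labels = {p.patient_id: risk_label(p) for p in cohort}

# Assumed cohort of 1,200 with a 5% first-year firing rate.
n_high = expected_high_risk(1200, 0.05)
print(labels, n_high)
```

Under these assumptions only about 60 patients fall in the high-risk class after one year, which is one way to see why the biomarker discovery problem is a small-sample one.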
8. The CVRG Project
- R24 NHLBI Resource, start date 3/1/07
- 3 development teams
  - Winslow, Geman, Miller, Naiman, Ratnanather, Younes (JHU)
  - Saltz, Kurc (OSU)
  - Ellisman, Grethe (UCSD)
- Aims
  - Develop tools for representing, managing and sharing multi-scale data
    - SNP, genomic and proteomic data (Project 1)
    - Electrophysiological data (Projects 1 & 2)
    - Heart Shape and Motion (Cardiac Computational Anatomy) data (Projects 1, 3 & 4)
  - Use multi-scale data to discover biomarkers that predict need for ICD placement (Project 5)
9. Project 1: The CVRG Core Infrastructure
- Develop and deploy CVRG-Core middleware
- Reuse components and assure interoperability with BIRN and caBIG
- Open-source software stack that instantiates a CVRG node
[Figure: The CVRG-Core (Projects 1-5)]
- BioINTEGRATE (Project 1); BioMANAGE (Project 1); BioPORTAL (Project 1)
- CVRG Data Services: SNP (Project 1); EP Data (Project 2); Gene Expression (Project 1); Imaging (Project 1); Patient/Study (Project 1); Protein Expression (Project 1)
- CVRG Analytic Services: Multiple Analytical Methods (Projects 2, 3, 4 & 5)
10. Project 2: Electrophysiological (EP) Data Management And Dissemination
- Goal
  - Adopt/develop data models to represent cardiovascular EP data
  - Create databases for managing and sharing these data
[Figure labels: EP data; Ontologies; ECG EP Data Analysis Portal; XML Databases]
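As a rough illustration of what an XML data model for EP data might involve, the sketch below serializes one ECG lead to XML and reads it back using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`ecgRecord`, `lead`, `samplingRateHz`, `units`) are invented for this example; the slides do not specify the CVRG schema, and this is not it.

```python
import xml.etree.ElementTree as ET


def ecg_to_xml(subject_id, lead_name, sampling_hz, samples_mv):
    """Serialize one ECG lead to an XML string (hypothetical schema)."""
    root = ET.Element("ecgRecord", subjectId=subject_id)
    lead = ET.SubElement(
        root,
        "lead",
        name=lead_name,
        samplingRateHz=str(sampling_hz),
        units="mV",
    )
    # Store samples as whitespace-separated text, three decimals each.
    lead.text = " ".join(f"{s:.3f}" for s in samples_mv)
    return ET.tostring(root, encoding="unicode")


def xml_to_samples(xml_text):
    """Parse the XML produced above and return the samples as floats."""
    root = ET.fromstring(xml_text)
    lead = root.find("lead")
    return [float(value) for value in lead.text.split()]


samples = [0.0, 0.125, 1.05, -0.2]
xml_text = ecg_to_xml("subject-001", "II", 500, samples)
print(xml_text)
print(xml_to_samples(xml_text))
```

Keeping the sampling rate and units as explicit attributes is one way to make a shared record self-describing; a real data model would also need ontology terms for leads, annotations, and acquisition protocol.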
11. Project 3: Mathematical Characterization Of Cardiac Ventricular Anatomic Shape And Motion
- Goal
  - Develop methods for statistically characterizing variability of heart shape and motion in health and disease
  - Use these methods to discover shape and motion biomarkers for CV disease
- Methods
  - Measure heart shape and motion over time in the Reynolds population using multi-modal imaging (MR, multi-detector CT and Gd contrast-enhanced MR)
  - Model variation of heart shape/motion in both the low- and high-risk Reynolds patients
  - Discover shape and motion parameters that predict who should receive ICD placement
12. Cardiac Computational Anatomy And Shape Analysis
Large Deformation Diffeomorphic Metric Mapping [2]
[Figure labels: Template (smoothed); Targets (Normal Training Set); Targets (Diseased Training Set); ?]
[2] Beg et al. (2004). Mag. Res. Med. 52: 1167
13. Cardiac Computational Anatomy And Shape Analysis
Large Deformation Diffeomorphic Metric Mapping [2]
[Figure labels: Template (smoothed); Targets (Normal Training Set); Targets (Diseased Training Set); Diseased Heart]
14. Project 4: Grid-Tools for Cardiac Computational Anatomy
15. Project 5: Statistical Learning With Multi-Scale Cardiovascular Data
- Goal: predict risk of SCD and identify patients to receive ICDs
- Develop learning methods that work in the small sample regime
  - Patient A: SCD HIGH RISK
  - Patient B: SCD LOW RISK
- Deploy these multi-scale biomarker discovery tools on the CVRG Portal
Algorithms [3-6]
[3] Geman et al. (2004). Stat. Appl. Genet. Mol. Biol. 3(1): Article 19.
[4] Xu et al. (2005). Bioinformatics 21(20): 3905-3911.
[5] Anderson et al. (2007). Proteomics 7(8): 1197.
[6] Price et al. (2007). PNAS 104(9): 3414.
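The slide does not name the algorithms, but the cited Geman et al. (2004) paper describes a rank-based classifier built from pairwise feature comparisons, often called the top-scoring pair (TSP) method, which is designed for settings with few samples and many features. The following is a minimal sketch of that idea on invented toy data, not the CVRG implementation; the function names are hypothetical, and ties are broken by taking the first best pair rather than by the secondary score used in the published method.

```python
from itertools import combinations


def fit_tsp(X, y):
    """Find the feature pair (i, j) whose ordering best separates two classes.

    X is a list of feature vectors, y a list of 0/1 labels.
    Returns (i, j, class_if_less): predict class_if_less when x[i] < x[j].
    """
    n_features = len(X[0])
    best_model, best_score = None, -1.0
    for i, j in combinations(range(n_features), 2):
        # Frequency of the event x[i] < x[j] within each class.
        freq = []
        for label in (0, 1):
            rows = [x for x, lab in zip(X, y) if lab == label]
            freq.append(sum(x[i] < x[j] for x in rows) / len(rows))
        score = abs(freq[0] - freq[1])
        if score > best_score:
            best_score = score
            best_model = (i, j, 1 if freq[1] > freq[0] else 0)
    return best_model


def predict_tsp(model, x):
    """Classify one sample from the ordering of the selected pair."""
    i, j, class_if_less = model
    return class_if_less if x[i] < x[j] else 1 - class_if_less


# Toy data: label 1 stands for high risk, 0 for low risk (values invented).
X = [[1.0, 5.0, 3.0], [2.0, 6.0, 2.5], [5.0, 1.0, 3.0], [6.0, 2.0, 2.0]]
y = [1, 1, 0, 0]
model = fit_tsp(X, y)
print(model, predict_tsp(model, [1.5, 4.0, 0.0]))
```

Because the decision depends only on the relative order of two measurements, a rule of this kind is insensitive to monotone normalization of the data, which is part of its appeal when pooling measurements across studies.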
16. Project 6: Resource Management
- Establish CVRG Working Groups to create a mechanism for community input on design and function of CVRG-Core and the CVRG
  - CV ontologies/data models
  - Testbed Projects (HLB-STAT)
  - New Technologies
  - Data Sharing/IRB
- Undertake outreach efforts to inform, train, and support researchers in use of CVRG tools and resources
17. Acknowledgements
- The CVRG Development Team
  - Johns Hopkins University: Siamak Ardekani, Donald Geman, Stephen Granite, Joe Henessy, David Hopkins, Anthony Kolasny, Aaron Lucas, Michael Miller, Daniel Naiman, Tilak Ratnanather, Kyle Reynolds, Aik Tan, Rai Winslow, Gem Yang
  - Ohio State University: Shannon Hastings, Tahsin Kurc, Stephen Langella, Scott Oster, Tony Pan, Justin Permar, Joel Saltz
  - UCSD: Mark Ellisman, Jeff Grethe, Ramil Manasala
NHLBI (R24 HL085343)
18. Microsoft Research Faculty Summit 2007