Title: Enabling the Molecular Medicine Revolution in Cancer through Biomedical Informatics
1. Enabling the Molecular Medicine Revolution in Cancer through Biomedical Informatics
- Ken Buetow, Ph.D., NCI Associate Director, Bioinformatics and Information Technology
U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES, National Institutes of Health
3. Cancer is a Complex Adaptive System
[Diagram labels: base state; repeated rounds of mutation and selection; malignant state]
4. Advanced Technologies are needed to characterize disease
base state(s)
malignant state(s)
Mutation status
Allele loss
Constitutional variation
RNA expression
Epigenetic variation
5. caBIG Pilot Imperatives
Integrate the biological and clinical silos
Integrate IT infrastructure, software and data
Integrate institutions and people
Address the complexity of cancer
6. caBIG: NCI's Answer to the Infrastructure Challenge
7. Scientific Discovery Opportunities
- Identify the biomarkers that predict efficacy of a new cancer treatment.
- Correlate molecular profiles with clinical outcomes.
- Identify a new cancer subtype.
- Discover indicators that predict disease progression.
8. Molecular Medicine as a Complex Continuum
Molecular Medicine
Imaging
Clinical Research
Pathology
Molecular Biology
9. The People
10. The People
11. The People
12. The People
13. The Activities
Clinical Trial Management
14. The Activities
Image Sharing and Analysis
Cross-reference image archive to improve detection and diagnosis.
15. The Activities
Tissue Banking
Collect, process, annotate, archive and
disseminate tissue samples from patients.
16. The Activities
Molecular Profiling
17. Cancer Center Landscape
- Integrated Systems
  - Homegrown/Commercial
  - Smooth navigation between applications
  - Difficult to expand/extend
  - Large IT staff
  - $10Ms invested
- Heterogeneous Systems
  - Complex mix of commercial and homegrown components (may be composed of dozens of components)
  - No common interfaces
  - Medium size IT staff
  - $1Ms invested
- Informal/no systems
  - Use of productivity applications (e.g. Excel, Access)
  - Complex manual processes
  - Small or no IT staff
  - $100Ks invested
18. caBIG Approach
- Modules that address specific needs
- Electronic/remote data capture
- Adverse Event Reporting
- Regulatory Reporting
- Hospital Information Systems Interfaces
- Trial lifecycle management
- Connect through defined Electronic interfaces
- Use of international data standards
19. Systems Interoperability: Harmonization
One harmonized standard/model: Biomedical Research Integrated Domain Group (BRIDG) v1.0 (6/26/07)
20. Boundaries and Interfaces
- Focus on boundaries, interfaces, and how things fit together, not on the internal details of how they are built
- Once they're built, assume that they will be diverse and changing
21. Standards-based interoperability: caCORE
biomedical objects
- Community driven
- Dynamic implementation
- Built to be upgraded as standards harden, and
domains expand
common data elements
controlled vocabulary
22. Standards infrastructure and services
- Enterprise Vocabulary Services (EVS)
  - Browsers
  - APIs
- cancer Data Standards Repository (caDSR)
  - CDEs
  - Case Report Forms
  - Object models
  - ISO 11179 model
- caGrid
  - Globus
  - Mobius
  - Introduce
  - Grouper
  - Dorian
  - ActiveBPEL
- Developer Toolkits
  - caCORE SDK
23. caGrid 1.0 Conceptual View
24. Grid of Grids
Bilateral Negotiations
NCRI ONIX
NHLBI CVRN
NCI caGrid
25. caBIG Product Suites
- Electronic Clinical Trials Management Applications
- Connecting through caBIG and its biomedical research applications
- Security and Data Sharing
26. CTMS Bundle
- The CTMS Bundle brings together a range of interoperable tools supporting the clinical trials enterprise.
- Functions include:
  - Patient Study Calendar (PSC)
  - Participant Registry (C3PR)
  - Adverse Event Reporting (caAERS)
  - Clinical Source Data Integration (caXchange)
  - Integration with Cancer Central Clinical Database (C3D), or with commercial clinical trials data collection tools at sites
27. CTMS Bundle: Participant Registry (C3PR)
- Tracks subject registrations to clinical trials
- Verifies registration criteria (study open, participant eligible, consent received)
- Stratifies subject into a stratum group, randomizes to the trial
- Tracks participants across sites (handles multi-site trials)
- Manages study personnel
- Reporting (federal/local requirements, supplies NCI Summary 3/4 data)
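The registration checks and stratified randomization listed for C3PR can be illustrated with a minimal Python sketch. It is not C3PR's code or API: the `Registration` fields, the `verify` function and the `BlockRandomizer` class are hypothetical names chosen for illustration, and permuted-block randomization within each stratum is an assumed scheme, since the slide does not say which method C3PR uses.

```python
import random
from dataclasses import dataclass


@dataclass
class Registration:
    """Hypothetical registration request; field names are illustrative only."""
    subject_id: str
    study_open: bool
    eligible: bool
    consent_received: bool
    stratum: str  # e.g. a disease-stage or site grouping


def verify(reg):
    """Return the unmet registration criteria; an empty list means OK."""
    problems = []
    if not reg.study_open:
        problems.append("study is not open")
    if not reg.eligible:
        problems.append("participant is not eligible")
    if not reg.consent_received:
        problems.append("consent not received")
    return problems


class BlockRandomizer:
    """Permuted-block randomization, kept separately for each stratum
    (an assumed scheme, not taken from the slides)."""

    def __init__(self, arms, seed=None):
        self.arms = list(arms)
        self.rng = random.Random(seed)
        self.blocks = {}  # stratum -> assignments left in the current block

    def assign(self, stratum):
        block = self.blocks.get(stratum)
        if not block:  # no block yet, or the current block is used up
            block = self.arms[:]
            self.rng.shuffle(block)
            self.blocks[stratum] = block
        return block.pop()


if __name__ == "__main__":
    reg = Registration("S-001", True, True, True, "stage II")
    if not verify(reg):
        print(reg.subject_id, "->", BlockRandomizer(["A", "B"]).assign(reg.stratum))
```

Within one stratum each block hands out every arm exactly once, which keeps the arms balanced as registrations arrive.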
28. CTMS Bundle: Clinical Source Data Integration (caXchange)
- Enables automatic transfer of clinical data from point-of-care systems in medical centers, e.g., clinical chemistry lab systems
- Accumulates results in a standards-based data warehouse with defined electronic interfaces
- Translates multiple source data formats into standards-compliant data for use in clinical trials
- Incorporates a viewer for viewing and selecting data
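The idea of translating several source formats into one standards-compliant record can be sketched in Python as follows. This is an illustration only, not caXchange's implementation: the two input formats, the `LOCAL_TO_STANDARD` mapping, the placeholder code `STD-HEMOGLOBIN` and the function names are all hypothetical.

```python
import csv
import io

# Hypothetical mapping from site-local test names to one shared code.
LOCAL_TO_STANDARD = {"HGB": "STD-HEMOGLOBIN", "Hemoglobin": "STD-HEMOGLOBIN"}


def to_g_per_dl(value, unit):
    """Convert a hemoglobin result to g/dL, the unit assumed for the shared record."""
    if unit == "g/dL":
        return value
    if unit == "g/L":
        return value / 10.0
    raise ValueError("unsupported unit: " + unit)


def normalize(patient, test, value, unit):
    """Build the common record regardless of which source system sent the result."""
    return {
        "patient": patient,
        "code": LOCAL_TO_STANDARD[test],
        "value": to_g_per_dl(value, unit),
        "unit": "g/dL",
    }


def from_csv(text):
    """Source A (hypothetical): CSV export with a header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [normalize(r["patient"], r["test"], float(r["value"]), r["unit"])
            for r in reader]


def from_pipe(line):
    """Source B (hypothetical): one pipe-delimited result per line."""
    patient, test, value, unit = line.strip().split("|")
    return normalize(patient, test, float(value), unit)


if __name__ == "__main__":
    print(from_csv("patient,test,value,unit\nP1,HGB,13.5,g/dL\n"))
    print(from_pipe("P2|Hemoglobin|140|g/L"))
```

Both sources end up as records with the same keys, code and unit, so downstream consumers only need to understand one shape.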
29. Commercial clinical trials tools are an important part of the caBIG CTMS Bundle concept
- Velos: Comprehensive clinical trials system in use in the extramural Cancer Centers throughout the country.
- PercipEnz: A comprehensive solution for managing all aspects of clinical research: study setup and activation, scientific reviews, subject registration, compliance tracking, visit tracking, data collection, data and safety monitoring, financials management, data extraction, regulatory reporting, and outreach.
- Akaza Research: Web-based, open source software platform for managing multi-site clinical research studies. It facilitates protocol configuration, design of case report forms, electronic data capture, retrieval, and management.
30. Biomedical Informatics Bundle
- The Biomedical Informatics Bundle brings together a range of caGrid-interfaced tools supporting biomedical informatics
- Functions include:
  - Tissue Banking (caTISSUE Suite)
  - Gene Expression Database (caArray)
  - Translational Medicine tools (caIntegrator)
  - Biomedical Image Management (NCIA)
  - Array analysis (geWorkbench)
  - and the supporting caGrid infrastructure
31. Biomedical Informatics Bundle: caTISSUE
Product Description: caTissue Core is caBIG's tissue bank repository tool for biospecimen inventory, tracking, and basic annotation. Version 1.1 of caTissue permits users to track the collection, storage, quality assurance, and distribution of specimens as well as the derivation and aliquotting of new specimens from existing ones (e.g. for DNA analysis). It also allows users to find and request specimens that may then be used in molecular, correlative studies.
Current Version Number: Version 1.1
Release Date of Current Version: February 2007
caBIG Compatibility Level: Silver
Maturity Assessment: Stable Release
32. Biomedical Informatics Bundle: caARRAY
Product Description: caArray is an open source microarray data management system that allows users to submit, annotate and download microarray data. caArray was developed using the caBIG compatibility guidelines, as well as the Microarray Gene Expression Data (MGED) society standards for microarray data. Compatibility with these standards and guidelines will facilitate data sharing and integration of diverse data types including clinical, imaging, tissue and functional genomics data. A number of analytical tools that connect to caArray are already available, including geWorkbench and GenePattern, both of which provide a variety of data analysis, visualization and annotation functions for microarray and other data types.
Current Version Number: Version 1.4
Release Date of Current Version: October 2006
caBIG Compatibility Level: Silver
Maturity Assessment: Stable Release
33. Biomedical Informatics Bundle: NCIA
Product Description: The National Cancer Imaging Archive (NCIA) is a searchable, national repository integrating in vivo cancer images with clinical and genomic data. NCIA provides the cancer research community, industry, and academia with public access to DICOM images, image markup, annotations, and rich metadata.
Current Version Number: Version 2.2
Release Date of Current Version: January 2007
caBIG Compatibility Level: Silver
Maturity Assessment: Mature Product
34. Biomedical Informatics Bundle: geWorkbench
Product Description: geWorkbench provides an innovative, open-source software platform for genomic data integration, bringing together analysis and visualization tools for gene expression, sequences, pathways, and other biomedical data. It gives scientists transparent access to a number of external data sources and algorithmic services, combining these with many built-in tools for analysis and visualization (at present more than 40 distinct analysis and visualization modules are part of the platform).
Current Version Number: Version 1.0.4
Release Date of Current Version: August 2006
caBIG Compatibility Level: In process
Maturity Assessment: Stable Product
35. Biomedical Informatics Bundle: caIntegrator
Product Description: caIntegrator is a novel translational informatics platform that allows researchers and bioinformaticians to access and analyze clinical and experimental data across multiple clinical trials and studies. The caIntegrator framework provides a mechanism for integrating and aggregating biomedical research data and provides access to a variety of data types (e.g. immunohistochemistry (IHC), microarray-based gene expression, SNPs, clinical trials data, etc.) in a cohesive fashion.
Current Version Number:
Release Date of Current Version:
caBIG Compatibility Level: Silver
Maturity Assessment: Stable Release
36. Data Sharing and Intellectual Capital (DSIC)
- The DSIC Bundle provides a critical range of processes, procedures, policies and template agreements that provide a framework for collaboration
- Bundle includes:
  - Master Guidance Document
  - Flow document and questionnaire
  - Decision tree
  - Template agreements for MTA, IRB, etc.
  - Security policies, procedures, and a framework for caGrid-wide authorization
  - and a framework for participating in the DSIC process, refining the structure for data sharing throughout the program
37. Scientific Discovery Realized
- Identify the biomarkers that predict efficacy of a new cancer treatment.
- Correlate molecular profiles with clinical outcomes.
- Identify a new cancer subtype.
- Discover indicators that predict disease progression.
caBIG Enables Molecular Medicine
38. NCI-based caBIG Support
- Application Support
- E-mail
- Phone support
- List Servers
- caBIG boot camps
- Developers
- Application Users
- Online Interactive Training
- Down-loadable User Materials
- Training Sessions at Scientific Meetings
42. Built For the Community, By the Community
2007
- Expanding the Network
  - More caBIG compatible systems and tools
  - Resources for adoption and support
  - New partners from IT and biomedicine
  - Extension to other health categories
2006
- Delivering Software Tools and caGrid
  - 40 software products delivered
  - caGrid 1.0 launched December 18th
  - 900 active participants at 80 institutions
  - New partnerships: private sector, regions, Federal agencies
2005
- Establishing Connectivity
  - Connectivity achieved between pilot nodes of caGrid
  - Pre-existing software retrofitted for caBIG compatibility
  - caBIG compatibility embedded into NCI Advanced Technology programs, Cancer Centers, and external product development activities
2004
- Building Community
  - caBIG pilot launched, February 2004
  - Project plans developed and Working Groups established
  - Standards conventions determined
  - First generation software tools developed
2003
- NCI studies the IT challenges and develops strategic plan for a large-scale bioinformatics network
43. Current caBIG Governance
44. Ongoing Funding: Developers, Adopters and Participants
- Developers
  - Funding continues to support developing or modifying interoperable tools (e.g., software, infrastructure) that meet new community or scientific research needs
45. Ongoing Funding: Developers, Adopters and Participants
- Adopters
  - Funding continues to support the development and adoption of tools and applications for use in settings different from those in which they were developed
46. Ongoing Funding: Developers, Adopters and Participants
- Participants
  - Funding continues to support specific, targeted activities such as mentoring others in data model and tool development, software development, documentation or training activities
  - Additional activities might include contributing to white papers in strategic, policy or technology areas, such as patient privacy or security architecture
48. Facilitating Next Generation Adoption
What
Who
Program Offices
Knowledge Centers
Service Providers
- Services open to all caBIG institutions
- Broad technical service support
- caBIG certified 3rd party support
- Partner with other groups for best customer
service
Ongoing Tool Development, Adoption and
Participation
49. http://pid.nci.nih.gov
50. http://caintegrator.nci.nih.gov/cgems
53. http://caintegrator.nci.nih.gov/rembrandt
54. Glioma Molecular Diagnostic Initiative (GMDI)
The goal of the GMDI is to create a publicly accessible, web-based glioma database and informatics platform consisting of in-depth pathologic, molecular and genetic data with detailed clinical corollary data for hundreds of individual brain tumors.
The database should be invaluable to basic scientists for aiding in tumor lineage determination, gene discovery, and new target identification and validation. The database may be invaluable to clinical investigators for a prognostically more meaningful classification system and for moving toward individual patient-based therapy selection.
55. GMDI: Questions to Be Addressed
- Does gene expression profiling show a biological basis for the current glioma classification schemas? Toward a molecular taxonomy of gliomas.
- Do genetic/molecular determinants help to identify subgroups of gliomas within major standard histological groups that might have biological and/or clinical significance?
- Toward patient-specific tailored therapy
- Do genetic/molecular profiles predict patient survival?
- Do genetic/molecular profiles allow one to stratify patients into more homogeneous groups for more accurate assessment of treatment efficacy in clinical trials?
- Targeted therapeutics (e.g. EGFR, PDGF inhibitors)
- Can we identify genes and pathways involved in gliomagenesis that might serve as novel molecular targets?
56. Data Integration via Rembrandt
57. Workflows
- Quick search
  - Show K-M plot based on EMP3 expression
  - Show gene expression profile plots
- Advanced search
  - Select good survival and poor survival patient groups
  - Perform class comparison analysis using T-Test to identify genes that are differentially expressed in these groups
  - Explore annotations to identify relevance to molecular changes in gliomas
- Survival analysis in a targeted group of patients
  - Select patients with EGFR over-expression and PTEN deletion
  - Display K-M survival chart for this group vs. rest of the patient population
- Plot copy number data from patient DNA samples against physical genomic location
- Display PCA chart for GBM and normal samples
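Two of the analyses named in these workflows, the K-M (Kaplan-Meier) survival curve and the T-Test class comparison, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Rembrandt's code: the function names and the toy data are hypothetical, and the t statistic shown is Welch's unequal-variance variant, since the slide does not say which T-Test is used.

```python
import math
from statistics import mean, variance


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival)] at each observed event time.

    times  : follow-up duration for each patient
    events : True if the event (death) was observed, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]  # True counts as 1, censored as 0
            leaving += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


def welch_t(a, b):
    """Welch's t statistic for one gene's expression in two patient groups."""
    return (mean(a) - mean(b)) / math.sqrt(
        variance(a) / len(a) + variance(b) / len(b)
    )


if __name__ == "__main__":
    # Toy data: four patients, one censored at time 3.
    curve = kaplan_meier([2, 3, 3, 5], [True, True, False, True])
    print([(t, round(s, 3)) for t, s in curve])  # [(2, 0.75), (3, 0.5), (5, 0.0)]
    print(welch_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0]))
```

A class comparison would compute such a statistic for every gene across the good-survival and poor-survival groups and then rank the genes; correction for multiple testing is left out of this sketch.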
58. Survival data on patients with EGFR amplification and PTEN deletion
[Plot legend: EGFR up, PTEN del; rest of the patients]
59. Perform higher-order statistical analysis on genomic and clinical datasets
60. Plot copy number data from patient DNA samples against physical genomic location
61. Rembrandt facts
- Number of unique patient samples in the database: 500
- Gene expression data points: 20 million
- Copy number data points: 35 million
- Registered users: 250
- Average unique users per month: 150
- Average time spent per session: 45 minutes
- Longest visit as of June 2006: 283 minutes
62. Join the effort!!!
- More information
- caBIG.cancer.gov
- Join caBIG effort
- caBIG.nci.nih.gov