Title: MMIM: Molecular Medicine Informatics Model and Australian Cancer Grid
1 MMIM: Molecular Medicine Informatics Model and Australian Cancer Grid
- by Julie Johns and Marienne Hibbert
- MMIM team and researchers
2
- Funding
- Drivers
- Architecture of the 3 phases
- Record linkage
- Data collection example
- Research findings
5 Bio21:MMIM
Molecular Medicine Informatics Model
- Phase 1 (pilot): funded by the Victorian Government (STI) via Bio21 (A$1.6M), 2005 - 2005
  - 5 hospitals
  - 2 research institutes
  - 3 disease types: Oncology, Diabetes, Epilepsy
- Phase 2: funded by the Australian Government (DEST) via the University of Melbourne (A$4.5M), 2005 - 2006
  - 7 hospitals
  - 2 research institutes
  - 4 diseases: Oncology, Neuroscience, Diabetes, Respiratory, plus Images
- Phase 3 (ACG): funded by the Victorian Government (STI) and DIIRD (Dept Industry Innovation Regional Development) via the University of Melbourne (A$11M), 2006 - 2009
  - 7 hospitals
  - 2 research institutes
  - 4 diseases: Oncology, Neuroscience, Diabetes, Respiratory, plus Images
Many researchers and clinicians in collaboration
6 Why create a research data grid?
- Maximise research: link genomic data to clinical / outcome data
- A platform to link state-wide clinical outcome data
- A platform to maximise collaborative research across Australia and internationally
7 Medical Research - the Challenges
- Large amount of data
- Lack of data standards
- Lack of interoperability between databases
- Need a cohesive approach between disciplines
8 FJ Martin-Sanchez. http://bioinfomed.isciii.es/
9 Ref: Bioinfomed.isciii.es, Synergy between MI and BI: facilitating Genomic Medicine for Future Healthcare. 2003.
10 Bio21:MMIM
- A federated research platform of integrated data from multiple institutions, linked to public data sources
- Without forcing institutions to change their existing data models, and
- Allowing a wide range of data analysis and analytic techniques, programs, software, etc. to be used
11 Why link databases?
- Research power
- Increase the sample size
- Draw together specialist databases addressing common diseases
- Genetic predisposition
- Environmental exposures
- Screening activity
- Genomics, proteomics, epigenetics
- Co-morbidities
- Treatment strategies
- Outcomes
12 Why do this?
- Maximise research: link genomic data to clinical / outcome data
  - test multiple hypotheses without collecting / recollecting their own data,
  - identify patient numbers for clinical trials based on clinical information or genetic profile,
  - research suitable pre-symptomatic testing and early intervention based on genotype data,
  - research genetic, genomic and proteomic profiles, factors that may influence treatment outcome, with respect to toxicity and potential benefit,
  - analyse summary/statistical information across institutions and from diverse databases
- A platform to link state-wide clinical outcome data
- A platform to maximise collaborative research across Australia and internationally
13 The Vision
[Diagram labels: Population-based Health Record; Clinical / Research (multiple hospitals); Disease Sub-specialty; Research Project 2; Research Project 1; Gene Expression; Protein Expression; Genotypes; Public Domain Data]
A generic informatics model providing
opportunities for beneficial collaboration across
organisations and expansion to other research
areas.
14 Colorectal Cancer
- Risk Management: outcomes locally and nationally
  - Linkage to genotype
  - Screening: average and high-risk
  - Surveillance
- Improving Treatment Outcomes
  - How are we doing? How can we do better?
  - Tailored treatments: genomics, proteomics
- Managing patients in the real world
  - Do results in clinical trials translate to improved outcomes in the clinic?
  - How do we manage the old and frail patient or one with multiple co-morbidities?
15 Key drivers
- Privacy and Ethics
  - MMIM infrastructure and processes ethically approved (all sites)
  - One patient: Unique Subject Index allocated
  - Health data separate from identifying data
  - Log-ons and passwords
  - Virtual Private Networks for transmission of data
  - Secure Internet access
  - Data available in de-identified (codified) form only
  - Access provided to specific tables on application; research/purpose must be described
- IP
  - Background IP and Project IP recognized.
  - Principal Investigator must be appointed and identify which party will be the Commercialization Lead
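The privacy driver "health data separate from identifying data", with data released only in codified form, can be pictured with a minimal sketch. Everything named here is an assumption for illustration: the field names, the `split_record` helper and the `USI-` code format do not come from the slides, and the exact-key lookup stands in for the real matching process.

```python
import secrets

# Hypothetical field names; the slides do not give a schema.
IDENTIFYING_FIELDS = ("surname", "given_name", "date_of_birth")

def split_record(record, usi_lookup):
    """Split one source record into an identifying part and a health part."""
    key = tuple(record[f] for f in IDENTIFYING_FIELDS)
    # Reuse the subject's existing index, or allocate a new opaque one.
    # (Exact-key lookup is a simplification of real record matching.)
    if key not in usi_lookup:
        usi_lookup[key] = "USI-" + secrets.token_hex(4)
    usi = usi_lookup[key]
    identity = {"usi": usi}
    health = {"usi": usi}
    for field, value in record.items():
        target = identity if field in IDENTIFYING_FIELDS else health
        target[field] = value
    return identity, health

# Made-up records for the same person at two visits.
usi_lookup = {}
visit_1 = {"surname": "SMITH", "given_name": "JOHN", "date_of_birth": "1950-03-12",
           "diagnosis": "colorectal cancer", "stage": "III"}
visit_2 = {"surname": "SMITH", "given_name": "JOHN", "date_of_birth": "1950-03-12",
           "diagnosis": "diabetes", "stage": ""}
identity_1, health_1 = split_record(visit_1, usi_lookup)
identity_2, health_2 = split_record(visit_2, usi_lookup)
```

Only the health part, keyed by the opaque index, would be passed on for research; the identifying part stays with the institution.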
16 How it works: MMIM phase 1 - Dynamic linkage
[Architecture diagram: Molecular Medicine Informatics Model, at Oct 2006]
- Institutions: institute-specific data is loaded by ETL, with nightly updates, into an institute-specific Local Research Repository (PMCC LRR, Austin LRR, MHWH LRR). Site labels shown: PeterMac, Austin, MH, Alfred, Melbourne, Western.
- Source databases and record counts shown: Oncology (n=1,265; n=3,118; n=3,600), Tissue Bank (n=1,764; n=2,515; n=454), uArray (n=80, test), Familial Ca (n=3,758), Surveillance (n=3,541), Epilepsy (n=1,823), Diabetes (n=21,048), Biomarkers (n=50).
- Wrappers link to heterogeneous systems and data sources; the LRRs connect by VPN to the Federated Data Integrator (FDI: de-identified data).
- Public data sources: GenBank, UniProt, LocusLink, PubMed.
- Researchers: authorised researchers and applications query the federated research repository (Internet) and download data for analysis. Researchers contribute data and pose further questions.
- Other label shown: Research and Training.
17 How it works: Unique Subject Index
18 The Unique Subject Index (USI)
- Probabilistic matching
- Based on matches of 6 key demographic data items
  - Surname
  - Given name
  - Middle name or initial
  - Date of birth
  - Gender/sex
  - Digits 5 to 9 of the Medicare Number
- Check new records for a match against existing subjects
- Score assigned on the basis of match / non-match for each data item
- Fuzzy logic for transpositions, soundex matches, common hospital names (e.g. BABE OF, TWIN 1)
- Manual checking of subjects in the grey area between thresholds
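The matching steps listed for the USI can be sketched as follows. This is an illustration under stated assumptions, not the MMIM implementation: the weights, the two thresholds, the half credit for fuzzy matches, the non-match penalty and all function names are hypothetical, and treating placeholder hospital names as "no evidence" is one possible reading of the slide.

```python
# Hypothetical weights and thresholds; the slides do not give the real values.
WEIGHTS = {"surname": 4, "given_name": 3, "middle_name": 1,
           "dob": 4, "sex": 1, "medicare_5_9": 4}
UPPER, LOWER = 13, 7
PLACEHOLDER_NAMES = ("BABE OF", "TWIN 1")  # common hospital names

def soundex(name):
    """Standard four-character Soundex code, e.g. ROBERT -> R163."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit
    name = "".join(ch for ch in name.upper() if ch.isalpha())
    if not name:
        return ""
    result, prev = name[0], codes.get(name[0], "")
    for ch in name[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "HW":  # H and W do not separate repeated codes
            prev = code
    return (result + "000")[:4]

def is_transposition(a, b):
    """True when b is a with exactly one pair of adjacent characters swapped."""
    if len(a) != len(b) or a == b:
        return False
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    return (len(diff) == 2 and diff[1] == diff[0] + 1
            and a[diff[0]] == b[diff[1]] and a[diff[1]] == b[diff[0]])

def field_score(field, a, b):
    a, b = a.strip().upper(), b.strip().upper()
    weight = WEIGHTS[field]
    if not a or not b or a.startswith(PLACEHOLDER_NAMES) or b.startswith(PLACEHOLDER_NAMES):
        return 0.0  # missing or placeholder value: no evidence either way
    if field == "middle_name":
        a, b = a[:1], b[:1]  # compare on the initial only
    if a == b:
        return float(weight)
    fuzzy = is_transposition(a, b)
    if field in ("surname", "given_name"):
        fuzzy = fuzzy or soundex(a) == soundex(b)
    return weight * 0.5 if fuzzy else -weight * 0.5

def match_score(new, existing):
    return sum(field_score(f, new[f], existing[f]) for f in WEIGHTS)

def classify(score):
    if score >= UPPER:
        return "match"
    if score < LOWER:
        return "non-match"
    return "manual review"  # grey area between the thresholds
```

With these assumed values, a record that differs from an existing subject only by a soundex-equivalent surname is accepted, while one that also has transposed digits in the date of birth and Medicare digits falls into the grey area for manual checking.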
19 Bio21:MMIM Phase 2
[Architecture diagram]
- ETL loads institute-specific data, nightly, into the institute-specific Local Research Repository (LRR); the diagram also shows an "LRR at other Hospital".
- Sites and data types shown:
  - South Aust: Oncology; RAH, Q Elizabeth, Flinders MC
  - St Vincents: Neuroscience, Diabetes, Oncology
  - PeterMac: Oncology, Tissue Bank, PET images, Microarray
  - Monash MC: Oncology, Tissue Bank, Respiratory
  - Alfred: Respiratory, Neuroscience, Oncology
  - RCH/MCRI: Respiratory, Diabetes, Crohn's
  - Austin: Oncology, Diabetes, Ludwig - Tissue Bank
  - RHH/Menzies: Respiratory, Diabetes, Oncology
  - Melbourne, Western: Oncology, Diabetes, Neuroscience, MRI Images, Ludwig, Biomarkers, Proteomics, Tissue Bank
  - Box Hill: Neuroscience, Oncology
- Components shown: VPN, Internet, Unique Subject Index, Metadata Repository, Federated Data Integrator.
- Authorised researchers and applications query the Federated Data Repository for analysis (SAS, queries, statistical analysis, reports).
20 ACG: Scope of sites
[Diagram labels: MMIM / ACG; Medicare, ePIN, Mesh block, ABS; De-identified data; Researcher]
- Sites, Vic: Melbourne, Austin, Western, Peter Mac, Alfred, St Vincents, Monash, RCH, Box Hill, Peninsula, Northern, Barwon, Ballarat, Latrobe, Bendigo, Hume
- Sites, other states: Tas (RHH/Menzies); ACT (Canberra); SA (Flinders, R Adelaide H, Queen Elizabeth)
- Sites, international: Malaysia; UK (St Marks); USA (Moffitt, Vanderbilt, Venter); NZ (Christchurch)
21 ACG: Scope of data
[Diagram labels: MMIM / ACG; Medicare, ePIN, Mesh block, ABS; De-identified data; Researcher]
- Tumours: Colorectal, Brain, Breast, Lung, Sarcoma, Prostate, Head and neck, Upper GI, Melanoma, Renal
- Data: Clinical outcome, Treatment, Genetic (Microarray, Biomarker, Proteomic), Image, Tissue Banks
22 ACG: Scope - analysis
[Diagram labels: MMIM / ACG; Medicare, ePIN, Mesh block, ABS; De-identified data; Researcher]
- Tools: Bioinformatics, Statistical, Drug Discovery, Image Analysis
- High Performance Computing
23 End user view: SAS
24 Tissue Banks
[Architecture diagram: Bio21:MMIM Phase 2, Tissue Banks]
- Sites and data types shown:
  - PeterMac: Oncology, Tissue Bank
  - Monash MC: Oncology, Tissue Bank, Respiratory
  - Austin: Oncology, Diabetes, Ludwig - Tissue Bank
  - Melbourne, Western: Oncology, Diabetes, Neuroscience, MRI Images, Tissue Bank, Ludwig, Biomarkers, Proteomics
- Components shown: VPN, Unique Subject Index, Metadata Repository, Federated Data Integrator.
- Authorised researchers and applications query the Federated Data Repository for analysis.
- MMIM Researcher and Tissue Banks: queries, statistical analysis, reports.
25 Tissue Banks
- De-identified data
- (i) Tissue Bank
  - a. Determine what data to go to MMIM: minimum data set
  - b. Define access roles: who has access to what
  - c. Help create reports: what regular reports are wanted
  - d. Use for research
  - e. Access to associated clinical and outcome data
- (ii) Other researchers
  - a. Access to TB data approved by TB
  - b. Request for samples must be to TB; normal ethics processes apply
26 Linking Sample Data
Federated Data Integrator
- Biomarkers
  - USI
  - TB number
  - Study ID
  - Biomarker results
- Tissue Bank
  - USI
  - TB number
  - Sample and storage data
- Accord
  - USI
  - Clinical, pathological and outcome data
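The shared keys listed on this slide suggest how the three sources join: biomarker results to tissue bank samples on USI plus TB number, and both to Accord on USI. A minimal sketch with made-up rows follows; the field names beyond those on the slide, the sample values and the `link_samples` helper are assumptions for illustration only.

```python
# Made-up, de-identified example rows keyed as on the slide.
biomarkers = [
    {"usi": "U001", "tb_number": "TB-17", "study_id": "S1", "biomarker_result": "positive"},
    {"usi": "U002", "tb_number": "TB-40", "study_id": "S1", "biomarker_result": "negative"},
]
tissue_bank = [
    {"usi": "U001", "tb_number": "TB-17", "sample_type": "tumour", "storage": "-80C"},
]
accord = [
    {"usi": "U001", "stage": "III", "outcome": "disease-free"},
    {"usi": "U002", "stage": "II", "outcome": "recurrence"},
]

def link_samples(biomarkers, tissue_bank, accord):
    """Join biomarker results to samples (USI + TB number) and to clinical data (USI)."""
    samples = {(row["usi"], row["tb_number"]): row for row in tissue_bank}
    clinical = {row["usi"]: row for row in accord}
    linked = []
    for result in biomarkers:
        sample = samples.get((result["usi"], result["tb_number"]))
        patient = clinical.get(result["usi"])
        if sample is None or patient is None:
            continue  # keep only results that link to all three sources
        linked.append({**patient, **sample, **result})
    return linked

linked_rows = link_samples(biomarkers, tissue_bank, accord)
```

Here only the U001 result links through all three sources; the U002 result has no matching tissue bank row and is left out of the joined view.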
27 Project challenges
- Linking the data
  - across disparate clinical and biomedical data sources
  - within and across institutions and to other public and private sources
  - ensuring compliance with security and data ownership constraints, and
  - allowing each institution to keep their own data models and data control
- Linking patient/subject records
  - assigning Unique Subject Identifiers (USIs) to data
  - allowing patients and families to be linked across multiple institutions, and
  - observing legal, ethical, privacy and data ownership constraints
- Providing a uniform interface
  - allowing researchers to use a consistent model of data, independent of the owning institution
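The "uniform interface" challenge, where each institution keeps its own data model while researchers see one consistent model, can be illustrated with a small wrapper sketch. The site names, column names and the `make_wrapper` and `federated_query` helpers are hypothetical; the slides name wrappers and a Federated Data Integrator but do not describe their interfaces.

```python
def make_wrapper(site, field_map, fetch_rows):
    """Return a function that yields a site's rows translated to the common model.

    field_map maps common field name -> the site's local column name.
    """
    def wrapper():
        for row in fetch_rows():
            record = {"site": site}
            for common_name, local_name in field_map.items():
                record[common_name] = row[local_name]
            yield record
    return wrapper

def federated_query(wrappers, predicate):
    """Run one query, phrased against the common model, across every site."""
    return [record for wrapper in wrappers for record in wrapper() if predicate(record)]

# Two hypothetical sites that keep different local column names.
site_a_rows = [{"SubjectIdx": "U001", "Dx": "colorectal cancer", "TumourStage": "III"}]
site_b_rows = [{"usi_code": "U002", "diagnosis_text": "colorectal cancer", "stage_at_dx": "II"},
               {"usi_code": "U003", "diagnosis_text": "epilepsy", "stage_at_dx": ""}]

wrappers = [
    make_wrapper("site_a", {"usi": "SubjectIdx", "diagnosis": "Dx", "stage": "TumourStage"},
                 lambda: site_a_rows),
    make_wrapper("site_b", {"usi": "usi_code", "diagnosis": "diagnosis_text", "stage": "stage_at_dx"},
                 lambda: site_b_rows),
]

colorectal = federated_query(wrappers, lambda r: r["diagnosis"] == "colorectal cancer")
```

The query is written once against the common field names; each site's local schema stays untouched behind its wrapper.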
28 Project challenges
- Governance
- Infrastructure (strategic planning, maintenance, growth, costs)
- Use and access
- Collaborative agreements
- Commercial use
- Data issues: preservation of privacy and re-identification
- Consent and ethical issues
- Intellectual property and data ownership
- Standards and quality
- Data types: Clinical, Demographic, Geographic, Images, Sound, Genomic, Socio-economic, other
- Version control / data stamping / archiving of data extracts
29 Project challenges
- Technology
  - Interoperability: ensure all the systems can work/link together
  - Metadata: descriptions
- Tools
  - for searching what data is there (really user friendly)
  - for querying the data itself (really user friendly)
  - for analysis
  - to assist in data cleaning and profiling
  - for tracking and audit
  - for record matching
30 Colon Cancer
[Diagram: data sources shown around Colon Cancer]
- Colon Cancer Screening Database
- Tissue Bank Colon Cancer Database
- Familial Cancer Clinic Database
- MH Colon Cancer Clinical Outcomes Database
- Immunohistochemistry
- Familial Cancer Related Gene Testing Database
- Radiology
- Proteomics Analysis
- WH Colon Cancer Clinical Outcomes Database
- Microarray Analysis
31 Cancer - findings
Diabetes patients with colorectal cancer have a higher risk of the cancer recurring, and hence lower survival.
P. Gibbs, S. McLaughlin, J. Johns, 2005
32 Stage III Colon Cancer at Western Hospital (1998 - 2005)
Outcomes achieved in Australian patients, with and without chemotherapy, are as good as anywhere in the world
[Chart label: Received chemotherapy]
(P. Gibbs, S. McLaughlin, I. Skinner et al.)
33 Bio21:MMIM Publications and abstracts - so far
- Publication of a letter to the Editor of Cancer Epidemiology, Biomarkers & Prevention (CEBP): Subsite-specific Colorectal Cancer in Diabetic and non-Diabetic patients.
- Paper published by the Medical Journal of Australia entitled The Influence of Language Spoken on Colorectal Cancer Diagnosis and Management, by Rodrigues J, Lim E, McLaughlin S, Faragher I, Skinner I, Chao M, Croxford M, Chapman M, Johns J, Gibbs P.
- Presentation to the American Gastroenterology Association as a result of the collaborative work established with CSIRO eHealth on Surveillance of Colorectal Cancer, evaluating the sensitivity and specificity of FOBT compared to colonoscopy.
- Presentation to the American Epilepsy Society and American Clinical Neurophysiology Society Joint Annual Meeting: First Seizure Clinic experience: heterogeneity of patient population and prognosis.
- Presentation with CSIRO to a Federal Parliamentary Breakfast Session: Sensitivity and specificity of FOBT compared with colonoscopy.
- Presentation to the World Congress of Neurology: Syndromic diagnosis of Epilepsy in the First Seizure Clinic population.
- Presentation to the World Congress of Neurology: Pharmacogenetics of neurocognitive side-effects from newly commenced anti-epileptic treatment.
- Abstract submitted to Australian Gastroenterology Week: P2X7: A New Biomarker for Colorectal Neoplasia. Kaur G, Chang WY, Zhiye S, Barden J, Cumming G, Landgren A, Macrae F. Colorectal Medicine and Genetics, and Pathology, Royal Melb Hosp; Biosceptre International Ltd, Sydney; Anatomy and Histology, The University of Sydney.
34 Bio21:MMIM Publications and abstracts - so far (10 in preparation)
- Response letter to a paper in the Journal of the National Cancer Institute, re: Completion of therapy by Medicare patients with stage III colon cancer. Gibbs P (1,2,3), McLaughlin S (2), Skinner I (2), Jones I (1), Hayes I (1), Chapman M (3), Johns J (3), Lim L (1,2,3), Faragher I
- In preparation: A Single Institution Experience Of Adjuvant 5-Fluorouracil Based Chemotherapy For Stage III Colon Cancer. Faragher I, Handolias D, McLaughlin S, Skinner I, Chao M, Chapman M, Johns J, Gibbs P.
35 Website: http://mmim.ssg.org.au/
Project Director: marienne.hibbert_at_mh.org.au
Oncology Coordinator: julie.johns_at_mh.org.au