Title: Fast, Cheap, and Out of Control: A Zero-Curation Model for Ontology Development
1Fast, Cheap, and Out of Control: A Zero-Curation
Model for Ontology Development
- Benjamin Good
- Wilkinson Laboratory
- University of British Columbia
2My Research Question
- Can a mass-collaborative protocol produce useful
bio-ontologies without centralized curatorial
control?
3Mass collaboration
- Creation of a knowledge resource entirely by the
community that will ultimately use it
4Successful mass collaborations
- Wikipedia
- Open Directory Project
- BioMOBY
- Open source software
- The World Wide Web
5Requirements for building an ontology
- Identify people that have the knowledge
- Motivate them to share it
- Provide an interface that
- (a) allows them to share efficiently, and
- (b) captures the knowledge in a well-ordered
manner.
6Expert identification
- Scientific conferences
- very limited amount of time
- severe restrictions on interface design
- The Young Investigators Forum for Research in
Circulatory and Respiratory Health - Winnipeg,
Manitoba, CAN 2005 - (where it was still snowing in May)
7Motivation
- Competition
- Altruism
- Narcissism
- Self-interest
8iCAPTURer Protocol
- Preprocessing
- Identify appropriate upper ontology
- Extract terminology from text sources
- Volunteer ontology engineering
- Filter and extend terminology
- Identify relationships between terms
- Evaluation
http://bioinfo.icapture.ubc.ca:8090/icapturer/login.jsp
9iCAPTURer2.0
iCAPTURer - terminology builder
Automatic term extraction - Text2Onto
smooth muscle cell
immune response
Volunteers filter terms and extend terminology
apoptosis
10iCAPTURer2.0
iCAPTURer - taxonomy builder
11iCAPTURer2.0
iCAPTURer - taxonomy builder
Terms now annotated into UMLS
12iCAPTURer2.0
iCAPTURer - synonym collector
smooth muscle cell is the same as ?
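One way the "is the same as" answers could be stored is as synonym sets. The following is a minimal sketch, not the iCAPTURer implementation: the function name `synonym_sets`, the example answers, and the assumptions that synonymy is transitive and case-insensitive are all illustrative.

```python
def synonym_sets(pairs):
    """Group terms into synonym sets from (term, synonym) pairs.

    Assumes "is the same as" is symmetric and transitive, and that
    terms differing only in letter case are the same term.
    """
    parent = {}

    def find(term):
        # Union-find lookup with path halving.
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]
            term = parent[term]
        return term

    for a, b in pairs:
        root_a, root_b = find(a.lower()), find(b.lower())
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for term in list(parent):
        groups.setdefault(find(term), set()).add(term)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical volunteer answers, for illustration only.
pairs = [
    ("smooth muscle cell", "SMC"),
    ("SMC", "smooth myocyte"),
    ("apoptosis", "programmed cell death"),
]
print(synonym_sets(pairs))
```

Merging answers into sets means two volunteers who supply different synonyms for the same term still contribute to a single entry.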
13Evaluation of volunteer assertions
Is T-cell activation a kind of Immune response?
- I don't know
- Yes
- Sometimes
- No
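The multiple-choice question above lends itself to a simple vote tally. The sketch below is an assumption about how such votes might be aggregated, not the method used in the study: the function names, the `min_informative` and `threshold` values, and the example votes are hypothetical.

```python
from collections import Counter

# The four answer options shown on the evaluation slide.
OPTIONS = ("Yes", "Sometimes", "No", "I don't know")


def tally(votes):
    """Count votes per option for one candidate assertion."""
    unknown = [v for v in votes if v not in OPTIONS]
    if unknown:
        raise ValueError(f"unrecognized answers: {unknown}")
    counts = Counter(votes)
    return {option: counts.get(option, 0) for option in OPTIONS}


def verdict(votes, min_informative=3, threshold=0.6):
    """Hypothetical decision rule: decide only on a clear majority.

    "I don't know" answers are ignored. min_informative and threshold
    are illustrative values, not taken from the talk.
    """
    counts = tally(votes)
    informative = counts["Yes"] + counts["Sometimes"] + counts["No"]
    if informative < min_informative:
        return "undecided"
    if counts["Yes"] / informative >= threshold:
        return "accept"
    if counts["No"] / informative >= threshold:
        return "reject"
    return "undecided"


# Hypothetical votes on "Is T-cell activation a kind of Immune response?"
votes = ["Yes", "Yes", "Sometimes", "Yes", "I don't know"]
print(tally(votes))
print(verdict(votes))
```

Leaving an assertion "undecided" when too few volunteers answer keeps sparse or split votes from entering the ontology by default.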
14Results: Volunteers
15Comments from volunteers
- "It has enzyme! Whooo I like it!"
- "It's amusing me"
- "Woo! Atherosclerosis is in there now!"
- "Science isn't so anal you know..."
- "This a helluva lot more interesting than those
talks were"
16Results: Knowledge
1) Collection: 2 days, 65 participants
17Initial acquisition versus evaluation
[Chart: number of assertions gathered (values shown: 11,000 and 1,000), comparing knowledge capture at the YI forum with evaluation conducted via email request]
18Next Steps
- The votes about the ontology represent the same
knowledge that is needed to build the ontology
- Can we build an ontology using the interface
implemented for ontology evaluation?
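To illustrate the idea that votes alone could drive construction, here is a minimal sketch that builds a taxonomy from "is a kind of" ballots. It is an assumption about one possible approach, not the protocol used in the study: `build_taxonomy`, the `threshold` value, and the example ballots are hypothetical.

```python
from collections import defaultdict


def build_taxonomy(ballots, threshold=0.6):
    """Keep "child is a kind of parent" links backed by enough "Yes" votes.

    ballots maps (child, parent) to a list of answers. "I don't know"
    answers are ignored. threshold is an illustrative value.
    """
    children = defaultdict(list)
    for (child, parent), answers in ballots.items():
        informative = [a for a in answers if a in ("Yes", "Sometimes", "No")]
        if informative and informative.count("Yes") / len(informative) >= threshold:
            children[parent].append(child)
    return {parent: sorted(kids) for parent, kids in children.items()}


# Hypothetical ballots, for illustration only.
ballots = {
    ("T-cell activation", "immune response"): ["Yes", "Yes", "Sometimes"],
    ("apoptosis", "immune response"): ["No", "No", "Yes"],
}
print(build_taxonomy(ballots))
```

In this sketch every candidate link is posed as a question, so the same multiple-choice interface serves for both building and evaluating.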
19Results Summary
- Produced an ontology describing aspects of
circulatory and respiratory health
- in two days
- without a knowledge engineer
- with numerous but detectable flaws
20Conclusions
- Motivation is easy
- Interface design is hard
- A multiple-choice interface seems to produce the
best results
- Mass collaboration shows promise in the domain of
bio-ontology engineering
21Future Work
- Reduce the human interface to EXCLUSIVELY
multiple choice by
- More extensive pre-processing
- More intelligent questioning
22Acknowledgements
- Ivan Berkowitz, Bruce McManus (YI Forum)
- Mark Wilkinson
- Tim Chklovski, Yolanda Gil
- All of the volunteers
- Rodney Brooks, author of "Fast, Cheap, and Out of
Control: A Robot Invasion of the Solar System,"
1989, Journal of the British Interplanetary Society
http://bioinfo.icapture.ubc.ca/bgood
23Hiring!
- 1 or more postdoctoral fellows in the domain
of cardiology and/or ontology development
- 1 lead software developer for the BioMOBY
project: http://biomoby.org
http://bioinfo.icapture.ubc.ca
25Timeline
[Chart: number of assertions gathered over time, February, 2005 through June, 2005. Annotated events: Forced to submit abstract; Knowledge capture at YI forum; Evaluation conducted via email request]
26Outline
- Mass collaboration
- Experiment
- mass collaborative ontology development and
evaluation
- Results
- ontology produced, cost and quality