Title: Watskeburt: heuristic support for hypothesis construction from literature
Adaptive Information Disclosure Application
Marco Roos (a, c), Sophia Katrenko (a), Edgar Meij (a), Willem van Hage (b), Frans Verster (a), Scott Marshall (a), Pieter Adriaans (a)
a) Institute of Informatics, Faculty of Science, University of Amsterdam, The Netherlands
b) TNO Industrie Techniek / Vrije Universiteit Amsterdam, The Netherlands
c) Project or Area Liaison (PAL) for OMII-UK (e.g. Taverna)
Correspondence: roos_at_science.uva.nl
Web site: http://adaptivedisclosure.org
Introduction
Our objective is to provide automated support for hypothesis
construction from literature. We start with a seed of
knowledge, a proto-ontology, that we want to extend, for
instance with information from a review about histones
and disease.
Snapshot of Jambalaya plugin for Protégé/OWL
showing relationships between Enhancer of Zeste
and various diseases.
Results
The workflows produce tagged genomics entities, with which
we enriched the ontology in the ontology tool Protégé/OWL.
Watskeburt workflows
We implemented Watskeburt by connecting our own AIDA services
with other services in the workflow enactor tool Taverna
(http://taverna.sourceforge.net).
(From workflow output)
<protein_molecule>BMI</protein_molecule>
<protein_molecule>Vav</protein_molecule>
<protein_molecule>Mib</protein_molecule>
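The tagged output can be turned into a list of entities before it is added to an ontology. The following is a minimal sketch, assuming the output uses simple, non-nested XML-style tags such as `<protein_molecule>`. The function name `extract_entities` is hypothetical and not part of the AIDA services.

```python
import re

# Matches <tag>text</tag> where the closing tag repeats the opening tag.
# Assumption: tags are flat (not nested) and carry no attributes.
TAG_PATTERN = re.compile(r"<(?P<tag>\w+)>(?P<text>.*?)</(?P=tag)>")


def extract_entities(tagged_text):
    """Return (entity_type, surface_text) pairs in order of appearance."""
    return [(m.group("tag"), m.group("text"))
            for m in TAG_PATTERN.finditer(tagged_text)]


sample = (
    "<protein_molecule>BMI</protein_molecule> "
    "<protein_molecule>Vav</protein_molecule> "
    "<protein_molecule>Mib</protein_molecule>"
)
entities = extract_entities(sample)
```

Each pair keeps the entity type next to its text, so entities of other types could be handled by the same routine.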
The workflow here imports the proto-ontology's
terms from an Excel file, retrieves relevant
papers, and then identifies genomics entities
based on a pre-learned genomics model.
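The three steps of this workflow can be sketched as a small pipeline. This is an illustration only, not the Taverna workflow itself: every function name is hypothetical, the spreadsheet is represented as a list of rows, retrieval is a keyword match over an in-memory corpus, and a lexicon lookup stands in for the pre-learned genomics model. The example data is invented.

```python
import re


def load_terms(rows):
    """Stand-in for the spreadsheet import: first cell of each non-empty row."""
    return [row[0].strip() for row in rows if row and row[0].strip()]


def retrieve_papers(terms, corpus):
    """Stand-in for retrieval: keep documents that mention any term."""
    lowered = [t.lower() for t in terms]
    return {doc_id: text for doc_id, text in corpus.items()
            if any(t in text.lower() for t in lowered)}


def identify_entities(text, lexicon):
    """Stand-in for the learned model: report lexicon tokens, in order, once."""
    found = []
    for token in re.findall(r"[A-Za-z0-9]+", text):
        if token in lexicon and token not in found:
            found.append(token)
    return found


def run_workflow(rows, corpus, lexicon):
    """Chain the steps: import terms, retrieve papers, identify entities."""
    terms = load_terms(rows)
    papers = retrieve_papers(terms, corpus)
    return {doc_id: identify_entities(text, lexicon)
            for doc_id, text in papers.items()}


# Invented toy data, for illustration only.
rows = [["histone"], ["Enhancer of Zeste"]]
corpus = {
    "doc1": "A toy sentence about histone and the made-up gene FOO1.",
    "doc2": "Unrelated text about soil moisture.",
}
result = run_workflow(rows, corpus, lexicon={"FOO1"})
```

Keeping each step behind its own function mirrors the workflow idea: any step could be swapped for a different service without touching the others.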
The AIDA toolbox
An alternative to building special-purpose bioinformatics
tools is to use web services within a general computing
environment such as the workflow tool Taverna.
Watskeburt applies this e-science approach
to text mining. Its components come from our
AIDA toolbox, which contains services that can
be flexibly combined for various applications,
including various forms of text mining.
Discussion
Watskeburt workflows support automated hypothesis
construction by enriching an ontology with hypothetically
related elements automatically discovered from literature.
We can explore various text-mining strategies by adapting
workflows and by adding new services from the AIDA toolbox,
such as machine learning services to make discoveries in
terms of one's own ontology, and services to automate the
handling of input and output in terms of ontologies.
Recall (the fraction of relevant items that the workflow
actually returns) can be improved, for instance, by
incorporating synonym services.
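The effect of synonyms on recall can be illustrated with a toy example. This is a sketch under stated assumptions: the synonym table is a plain dictionary standing in for a synonym service, search is a keyword match, and the corpus and relevance judgements are invented. All function names are hypothetical.

```python
def recall(retrieved, relevant):
    """Fraction of relevant items that were retrieved."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


def expand_query(terms, synonyms):
    """Add known synonyms of each term (dictionary stands in for a service)."""
    expanded = set(terms)
    for term in terms:
        expanded.update(synonyms.get(term, ()))
    return expanded


def search(terms, corpus):
    """Return ids of documents mentioning any term (case-insensitive)."""
    lowered = [t.lower() for t in terms]
    return {doc_id for doc_id, text in corpus.items()
            if any(t in text.lower() for t in lowered)}


# Invented toy corpus; d1 and d2 are both considered relevant.
corpus = {
    "d1": "enhancer of zeste appears here",
    "d2": "E(z) appears here",
    "d3": "unrelated text",
}
relevant = {"d1", "d2"}
synonyms = {"enhancer of zeste": ["E(z)"]}

query = {"enhancer of zeste"}
baseline = recall(search(query, corpus), relevant)
with_synonyms = recall(search(expand_query(query, synonyms), corpus), relevant)
```

In this toy setting the plain query finds one of the two relevant documents and the expanded query finds both. Broader queries can also return more irrelevant documents, which this sketch does not measure.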
Acknowledgements and availability
We thank
Taverna users and developers for their help with
creating workflows in Taverna. Services,
workflows, ontologies, and ontology-based data
are available at ws.adaptivedisclosure.org and
rdf.adaptivedisclosure.org.