1
Quality of databases
  • Ingrid Riphagen
  • Medical Library, VU /
  • Werkgroep Elektronische Bronnen Zoeken (WEBZ
    vh VOGIN BMW), the working group on searching
    electronic resources
  • BMI spring meeting, 8 March 2007

2
Jong-Hofman MW de, Spikman G. Beoordeling van
bestanden. In: Sieverts EG, de Jong-Hofman MW,
editors. Online opsporen van informatie: Theorie
en praktijk van het gebruik van interactieve
informatiesystemen. 6th ed. Den Haag: NBLC
Uitgeverij; 1996. p. 143-9. Edition compiled
under the auspices of VOGIN, Nederlandse
Vereniging van Gebruikers van Online
Informatiesystemen (the Dutch association of
users of online information systems).
3
De Jong-Hofman & Spikman 1996: An objective
assessment of the quality of databases is
difficult, because the degree to which a
database matches a user's specific needs plays
a major role. Objective quality criteria?
4
Criteria as classified by de Jong-Hofman &
Spikman (subdivisions not shown)
  • Coverage, choice of sources, and selection
    within the sources
  • Quality (standardisation/accuracy) and extent
    of the bibliographic data
  • Quality of the abstract / the full text / the
    data given (facts/data)
  • Quality and nature of indexing and access
  • Currency of a database
  • User support / user-friendliness

5
Literature selection (I. Riphagen, 02/2007)
Final selection after deduplication (keeping the
best record):
  • MEDLINE: 49
  • EMBASE: 39
  • INSPEC: 32
  • Web of Science: 14
  • LISTA: 13
  • References from the book chapter (de Jong-Hofman
    & Spikman): 7
  • From own database: 41
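Deduplication that keeps the best record could be sketched as follows. This is a minimal illustration, not the procedure used for the selection above. The matching key (normalised title plus year), the "completeness" measure (number of filled fields) and the sample records are all assumptions made for the example.

```python
import re

def match_key(record):
    """Hypothetical duplicate key: lower-cased title without punctuation, plus year."""
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return (title, record.get("year"))

def completeness(record):
    """Assumed quality measure: the number of non-empty fields in the record."""
    return sum(1 for value in record.values() if value)

def deduplicate(records):
    """Keep one record per key, preferring the most complete one."""
    best = {}
    for rec in records:
        key = match_key(rec)
        if key not in best or completeness(rec) > completeness(best[key]):
            best[key] = rec
    return list(best.values())

# Made-up records from two hypothetical sources, db1 and db2.
records = [
    {"source": "db1", "title": "Searching for rehabilitation articles", "year": 2000, "abstract": ""},
    {"source": "db2", "title": "Searching for Rehabilitation Articles.", "year": 2000, "abstract": "Objective ..."},
    {"source": "db1", "title": "Another title", "year": 1999, "abstract": ""},
]
unique = deduplicate(records)
```

A key this simple misses duplicates whose titles or author names are recorded differently in different databases, which is a data quality problem in its own right.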
6
Bakker S. Subject strengths and weaknesses of four current-awareness services on diskette. Health Libr Rev 1992 Dec;9(4):131-7.
Brand-de Heer DL. A comparison of the coverage of clinical medicine provided by PASCAL BIOMED and MEDLINE. Health Info Libr J 2001 Jun;18(2):110-6.
Brandsma R, Deurenberg-Vos HWJ, Bakker S, Brand-de Heer DL, Otten RHJ, Pinatsis A. A comparison of the coverage of clinical medicine provided by BIOSIS Previews and Medline. Online Review 1990;14(6):367-77.
De Jong-Hofman MW. Comparison of selecting, abstracting and indexing by COMPENDEX, INSPEC and PASCAL and the impact of this on manual and automated retrieval of information. Online Review 1981;5(1):25-36.
De Jong-Hofman MW, Siebers HH. Experiences with online literature searching in a water-related subject field: Aqualine, Biosis, CA search and Pascal, compared using the ESA/information Retrieval System. Online Review 1984;8(1):59-73.
Konings CAG. Comparison and Evaluation of 9 Bibliographies - Bibliographic Databases in the Field of Computer-Science. Online Review 1985;9(2):121-33.
Nederhof AJ. Delimitation of a medical research topic: a comparison of online databases. International Information, Communication and Education 1994;13(1):13-22.
7
Read EJ, Smith RC. Searching for library and
information science literature: a comparison of
coverage in three databases. Library Computing
2000;19(1-2):118-26. Abstract: With Information
Science Abstracts (ISA) under new ownership and
engaged in quality improvement initiatives, a
fresh evaluation of its subject coverage was in
order. A study of the Dialog database versions of
ISA and its close competitors, LISA and Library
Literature & Information Science (Library
Literature), characterized the subject coverage,
overlap among the databases, and types of
documents indexed in 1999 and early 2000. In 20
LIS subject searches, Library Literature
retrieved the most records in 16 of the searches,
LISA had the most in three, and they tied in one.
ISA had the least records for all topics, a
natural result of adding significantly fewer
records overall. For any subject, the maximum
overlap of titles among the three databases was
only 21 percent, indicating the need to search
more than one database for comprehensive results.
Document types were primarily journal/feature
articles
8
Minozzi S, Pistotti V, Forni M. Searching for
rehabilitation articles on MEDLINE and EMBASE. An
example with cross-over design. Arch Phys Med
Rehabil 2000 Jun;81(6):720-2. Abstract:
OBJECTIVE: To analyze the usefulness of MEDLINE
and EMBASE. METHODS: We looked for articles
published since 1990 relating to neurologic,
orthopedic, respiratory, urologic, and
rheumatologic rehabilitation. We looked for all
descriptors and text words pertinent to
rehabilitation and linked them with "cross-over."
RESULTS: We found 165 articles in MEDLINE and 159
in EMBASE with an overlap of only 17% of
articles. Only 32% of the articles in MEDLINE and
35% in EMBASE were relevant. Of the 214
nonoverlapping articles, 84 were published in
journals present in both databases, but were
indexed differently. CONCLUSION: At least two
databases must be used to ensure a comprehensive
literature search. Searching in EMBASE after
MEDLINE we gained 25 articles (32%).
Bibliographic search in rehabilitation is
particularly complex because of the heterogeneity
of the subject matter. Cooperation between an
information professional and a clinician is
essential to ensure a comprehensive search
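The overlap figures in this abstract can be checked with simple arithmetic. The sketch below is a reading of the numbers, not something the abstract states: it assumes that the 214 nonoverlapping articles are those found in exactly one of the two databases, and that the overlap percentage is taken relative to the sum of both result sets. The function name is made up for the example.

```python
def overlap_summary(hits_a: int, hits_b: int, only_in_one: int) -> dict:
    """Derive the shared-article count from two result sets.

    only_in_one is the number of articles found in exactly one database.
    """
    total_hits = hits_a + hits_b
    # Each shared article is counted twice in total_hits, once per database.
    shared = (total_hits - only_in_one) // 2
    unique_articles = only_in_one + shared
    return {
        "shared": shared,
        "unique_articles": unique_articles,
        "shared_pct_of_hits": round(100 * shared / total_hits),
    }

# 165 MEDLINE hits, 159 EMBASE hits, 214 nonoverlapping articles
summary = overlap_summary(165, 159, 214)
print(summary)
```

Under these assumptions the figures are consistent with 55 articles found in both databases and 269 distinct articles in total.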
9
  • The ideal literature database (?)
  • complete coverage of the relevant professional
    literature: all publication types, languages,
    years, etc.
  • content and presentation (order) not influenced
    by commercial interests
  • full text, with images, figures, and captions;
    no stop words
  • complete with respect to searchable fields
    (indexes); metadata at paragraph level
  • images and figures indexed and searchable
  • error-free (spelling, names); 100% consistent
    use of names, addresses, etc.
  • scientific, business, popular/public,
    political/policy/legal, educational
  • excellent subject indexing in several
    languages, sufficiently broad and deep
  • vocabulary with equivalent facets, applied
    equally to all facets, e.g. symptoms indexed
    as thoroughly as diseases
  • perfect term mapping from natural language to
    indexing languages
  • perfect result presentation and reporting
    options: many sorting options, relevance
    ranking, evidence ranking
  • citation links (cited / citing)
  • numeric searching possible (e.g. on the number
    of patients in a study)
  • related records in various ways (ISI method,
    PubMed method, etc.)
  • links to news, patient information, patents,
    substance properties, chemical structures,
    chromosome patterns, gene sequences, etc.
  • easy to use for one's own databases and
    publications (cwyw)
  • can be integrated into electronic patient
    records (EPDs)

10
Scope
  • Which subject field(s)?
  • Aimed at which purpose(s)?
  • Suitable for answering which questions?

11
MEDLINE: MEDLINE is the NLM's premier
bibliographic database covering the fields of
medicine, nursing, dentistry, veterinary
medicine, the health care system, and the
preclinical sciences.  MEDLINE contains
bibliographic citations and author abstracts from
more than 5,000 biomedical journals published in
the United States and 80 other countries. The
database contains over 15 million citations
dating back to the mid-1950's. Coverage is
worldwide, but most records are from
English-language sources or have English
abstracts. See also the MEDLINE/PubMed Resources
Guide.
Medicine, nursing, dentistry, veterinary
medicine, health care system, preclinical
sciences. Biomedical journals.
13
Coverage
  • Which genres of literature?
  • popular/public, professional publications,
    scientific, legal
  • Which publication types?
  • journals, books and book chapters, conference
    proceedings, patents, reports, news, etc.
  • Cover-to-cover or selective? Which selection?
  • Selection by country, language, publisher?
  • Period? How far back, and how current?

17
Other objective quality criteria
  • completeness without duplicates, e.g. coverage
    per journal (missing issues/articles?)
  • correct spelling of words and of names, etc.
  • consistent author names, completeness of
    co-author names
  • standardised author addresses
  • indexing errors: missing index terms, wrong
    index terms, index terms that are too broad or
    too specific
  • speed of inclusion and indexing
18
EMBASE
Topfer LA, Parada A, Menon D, Noorani H, Perras C. Comparison of literature searches on quality and costs for health technology assessment using the MEDLINE and EMBASE databases. Int J Technol Assess Health Care 1999;15(2):297-303. (abstract)
Indexing: Cost-benefit analysis; Data analysis; Information retrieval; Internet; Medical Information

MEDLINE
Topfer LA, Parada A, Menon D, Noorani H, Perras C, Serra-Prat M. Comparison of literature searches on quality and costs for health technology assessment using the MEDLINE and EMBASE databases. Int J Technol Assess Health Care 1999;15(2):297-303. (abstract)
Indexing: Abstracting and Indexing/standards; Costs and Cost Analysis; Databases, Bibliographic/economics; Databases, Bibliographic/standards; Evidence-Based Medicine; Humans; Information Storage and Retrieval/economics; Information Storage and Retrieval/standards; Internet; MEDLINE/economics; MEDLINE/standards; Research Design/standards; Review Literature; Technology Assessment, Biomedical; Time Factors
19
ISI Web of Science, EMBASE
Bac R. Comparative-Study by the Pdr of Toxicology Information-Retrieval from Online Literature Databases. Online 1980;4(2):29-33. No abstract.

LISTA
Bacir K. A comparative study by the pdr of toxicology information retrieval from online literature databases. Online 42, 29-33. 1980. Ref Type: Generic. Abstract: Online searching of various literature databases by 7-11 members of the toxicity data committee of the pdr (pharma documentation ring) for four different toxicological subjects during the period april 1977-february 1979 has been studied. An analysis of references retrieved has been presented using a matrix approach to give an impression per search of the importance of each database with respect to yield and overlap with others. The quality of the references retrieved and efficiency of search strategies are discussed.

MEDLINE
Publication missing.
20
De Jong-Hofman & Spikman 1996: In most cases we
can (now) only assess a database as it is made
available online by the various hosts. For the
user it is sometimes difficult to determine
which factors are the responsibility of the
database producer, and which of the host.
21
Database or provider?
Word visible but not searchable: stop words
  • e.g. "as" (so it cannot be searched as the
    abbreviation of arsenic)
  • e.g. in vitro > vitro, in utero > utero,
    under age > age
Field present but not searched by default
  • e.g. PubMed: the title in the original language
    is in the TT field of MEDLINE, but it is not
    searched by default > the field specification
    TT is needed
  • e.g. Web of Science: the citation field is
    present (cited papers), but it cannot be
    searched with a field specification (e.g. RE)
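The effect of a stop word list on what can be searched can be illustrated with a small sketch. The stop list here contains only the three words implied by the examples on this slide. Real hosts use their own, longer lists, so the list and the function name are assumptions.

```python
# Assumed stop list, limited to the words implied by the slide's examples.
STOP_WORDS = {"as", "in", "under"}

def searchable_terms(phrase: str) -> list:
    """Return the words of a phrase that survive stop word removal."""
    return [word for word in phrase.lower().split() if word not in STOP_WORDS]

for phrase in ("in vitro", "in utero", "under age", "As"):
    print(phrase, ">", searchable_terms(phrase))
```

The word remains visible in the displayed record, but a query for it matches nothing because it never reaches the index.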
22
  • Quality of a database: what is in it, and what
    can be got out of it
  • Content cannot be seen separately from the
    retrieval and presentation options
  • Consider, for example:
  • term mapping, result analysis options,
  • forms for frequently used question types
  • preprocessed searches
  • e.g. Clinical Queries, subsets
  • citation analyses, citation links

25
For example, the Clinical Queries in PubMed.
Preprocessed search, advisory search. A quality
characteristic of the database? Take the
Clinical Query for selecting systematic reviews:
n AND systematic[sb] (A new 2007 version has just
appeared, which is now being discussed in the
clinical librarians mailing group.) It is not
very difficult to find systematic reviews that
are not found with this filter, for instance
with the search (review[ti] OR review[pt]) AND
randomized controlled trial[pt] NOT
systematic[sb]
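The two searches on this slide can be written as query builders. This is a sketch only. It assumes the field tags read [ti], [pt] and [sb], that the parenthesis closes after review[pt], and it uses "asthma" as a made-up topic. The URL targets the NCBI E-utilities ESearch endpoint with its db, term and retmax parameters. Nothing is sent over the network here.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def with_systematic_filter(topic: str) -> str:
    """Topic search limited by the systematic review subset filter."""
    return f"({topic}) AND systematic[sb]"

def reviews_missed_by_filter(topic: str) -> str:
    """Candidate reviews of trials that fall outside the subset filter."""
    return (
        f"({topic}) AND (review[ti] OR review[pt]) "
        "AND randomized controlled trial[pt] NOT systematic[sb]"
    )

def esearch_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL for PubMed; fetching it is left to the caller."""
    return ESEARCH + "?" + urlencode({"db": "pubmed", "term": term, "retmax": retmax})

filtered = with_systematic_filter("asthma")   # hypothetical topic
missed = reviews_missed_by_filter("asthma")
url = esearch_url(filtered)
```

Comparing the result counts of the two queries for a given topic gives a rough impression of how much the filter leaves out.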
28
Not a problem of coverage, but of retrieval in a
colossal, unstructured database such as MEDLINE
29
Development of software/techniques to get more
out of the large databases (mining) versus
development of small, specialised databases
30
Tools to get more out of databases (especially PubMed/Medline)
Becker KG, Hosack DA, Dennis G, Jr., Lempicki RA, Bright TJ, Cheadle C, et al. PubMatrix: a tool for multiplex literature mining. BMC Bioinformatics 2003 Dec 10;4:61.
Corrao S, Colomba D, Arnone S, Argano C, Di CT, Scaglione R, et al. Improving efficacy of PubMed Clinical Queries for retrieving scientifically strong studies on treatment. J Am Med Inform Assoc 2006 Sep;13(5):485-7.
Ding J, Berleant D. MedKit: a helper toolkit for automatic mining of MEDLINE/PubMed citations. Bioinformatics 2005 Mar 1;21(5):694-5.
Ding J, Hughes LM, Berleant D, Fulmer AW, Wurtele ES. PubMed Assistant: a biologist-friendly interface for enhanced PubMed search. Bioinformatics 2006 Feb 1;22(3):378-80.
Divoli A, Attwood TK. BioIE: extracting informative sentences from the biomedical literature. Bioinformatics 2005 May 1;21(9):2138-9.
Doms A, Schroeder M. GoPubMed: exploring PubMed with the Gene Ontology. Nucleic Acids Res 2005 Jul 1;33(Web Server issue):W783-W786.
Eaton AD. HubMed: a web-based biomedical literature search interface. Nucleic Acids Res 2006 Jul 1;34(Web Server issue):W745-W747.
31
Fontelo P, Liu F, Ackerman M. MeSH Speller askMEDLINE: auto-completes MeSH terms then searches MEDLINE/PubMed via free-text, natural language queries. AMIA Annu Symp Proc 2005:957.
Fornaro MT, Gugliada A, Bonacina S, Pinciroli F. PubMiur: a tool for on-line assisted bibliographic researches. AMIA Annu Symp Proc 2005:958.
Grimes GR, Wen TQ, Mewissen M, Baxter RM, Moodie S, Beattie JS, et al. PDQ Wizard: automated prioritization and characterization of gene and protein lists using biomedical literature. Bioinformatics 2006;22(16):2055-7.
Mudunuri U, Stephens R, Bruining D, Liu D, Lebeda FJ. botXminer: mining biomedical literature with a new web-based application. Nucleic Acids Res 2006 Jul 1;34(Web Server issue):W748-W752.
Muin M, Fontelo P, Liu F, Ackerman M. SLIM: an alternative Web interface for MEDLINE/PubMed searches - a preliminary study. BMC Med Inform Decis Mak 2005;5:37.
Rebholz-Schuhmann D, Marcel S, Albert S, Tolle R, Casari G, Kirsch H. Automatic extraction of mutations from Medline and cross-validation with OMIM. Nucleic Acids Research 2004;32(1):135-42.
Saric J, Jensen LJ, Ouzounova R, Rojas I, Bork P. Extraction of regulatory gene/protein networks from Medline. Bioinformatics 2006 Mar 15;22(6):645-50.
32
Databases for a select audience / specific purposes
Albrecht H, van WR, Dittloff S. A new database on basic research in homeopathy. Homeopathy 2002 Jul;91(3):162-5.
Bailey MR, Ansoborlo E, Chazel V, Fritsch P, Hodgson A, Kreyling WG, et al. Radionuclide biokinetics database (RBDATA-EULEP): an update. Radiation Protection Dosimetry 2004;112(4):535-6.
Beiki O, Beiki D. ParsMedline: establishment of a Web-based bibliographic database related to Iranian health and medical research. J Med Libr Assoc 2005 Jul;93(3):400-3.
Brassey J. TRIP Database: identifying high quality medical literature from a range of sources. New Review of Information Networking 2005 Nov;11(2):229-34.
Bult CJ, Krupke DM, Naf D, Sundberg JP, Eppig JT. Web-based access to mouse models of human cancers: the Mouse Tumor Biology (MTB) database. Nucleic Acids Research 2001;29(1):95-7.
Fang X, Shao L, Zhang H, Wang S. CHMIS-C: a comprehensive herbal medicine information system for cancer. Journal of Medicinal Chemistry 2005;48(5):1481-8.
Greenough A. Help from ISABEL for paediatric diagnoses. Lancet 2002 Oct 19;360(9341):1259.
33
Hollegaard MV, Bidwell JL. Cytokine gene polymorphism in human disease: on-line databases, Supplement 3. Genes and Immunity 2006;7(4):269-76.
Nixon J, Stoykova B, Glanville J, Christie J, Drummond M, Kleijnen J. The U.K. NHS Economic Evaluation Database: economic issues in evaluations of health technology. Int J Technol Assess Health Care 2000;16(3):731-42.
Ostermann T, Zillmann H, Matthiessen PF. Literature for complementary medicine in cancer - Search on Internet with the CAMbase database. Deutsche Zeitschrift fur Onkologie 2004;36(4):165-9.
Pagon RA, Tarczy-Hornoch P, Baskin PK, Edwards JE, Covington ML, Espeseth M, et al. GeneTests-GeneClinics: genetic testing information for a growing audience. Hum Mutat 2002 May;19(5):501-9.
Sprague J, Doerry E, Douglas S, Westerfield M. The zebrafish information network (ZFIN): a resource for genetic, genomic and developmental research. Nucleic Acids Research 2001;29(1):87-90.
Tomasulo PA. AgeLine: free and valuable database from AARP. Med Ref Serv Q 2005;24(3):55-65.
Woods SE. OLIO: an osteopathic medicine database. Med Ref Serv Q 1991;10(4):49-58.
34
Bigger is not always better
Small, specialised databases: higher precision
through preselection
  • Gerolit, Ageline (geriatrics, gerontology)
  • MANTIS (Manual Alternative and Natural Therapy
    Index System)
  • Medion Database (diagnostic studies/systematic
    reviews)
  • OSTMED (The Osteopathic Literature Database)
  • OT Seeker (occupational therapy)
  • PEDro (Physiotherapy Evidence Database)
  • PILOTS Database (posttraumatic stress
    literature)
  • POPLINE (POPulation information onLINE)
  • SportDiscus (sport, fitness, sports medicine)
35
POPLINE
Ingrid Riphagen, 02/2007
36
POPLINE
39
  • The TRIP process
  • Content identification
  • The content of the TRIP Database is identified
    using a variety of methods.
  • Netting the Evidence
  • clinical question answering services
  • Directory of Clinical Information Websites
  • Once a site has been identified as potentially
    useful it is then assessed by an in-house team of
    information experts and clinicians and external
    experts to assess quality and clinical
    usefulness. If the site passes this additional
    test we start the process of 'grabbing' content

40
Content grab: The first stage is to identify the
clinically relevant material within the website.
Once this has been identified certain core
information (title, URL and date of publication)
is obtained, either manually or via an automated
system. The next stage is for our in-house
spidering software to visit every URL and grab
the content of that page. This is then
processed by our systems to remove superfluous
material such as HTML coding etc. The material is
then ready for searching and made live. 
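The step that removes superfluous material such as HTML coding might look like the sketch below. It is not TRIP's own software: the class and function names are invented, and it uses only Python's standard html.parser module to keep visible text and drop tags, scripts and style sheets. Fetching the page is left out.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def strip_html(page: str) -> str:
    """Return the text content of an HTML page as one space-separated string."""
    parser = TextExtractor()
    parser.feed(page)
    parser.close()
    return " ".join(parser.parts)

# Made-up page standing in for a grabbed URL.
page = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h1>Asthma</h1><p>Inhaled <b>steroids</b> help.</p></body></html>"
)
text = strip_html(page)
```

The cleaned text would then be stored alongside the core information (title, URL and date of publication) before it is indexed for searching.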
41
  • Results display
  • The search results are returned based on the
    score for each matching record. 
  • This score is based on three main variables:
  • Year of publication. The newer the article the
    higher the score
  • Term position/density.  This is the most
    complicated and is based on two main factors. 
  • Publication.  Each publication is given a score
    based on methodological quality and clinical
    usefulness. So a publication such as Cochrane
    (seen as the pinnacle of evidence based medicine)
    will have a higher score than something like GP
    Notebook. NB: indexing!
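A score built from these three variables could be combined as in the sketch below. The slide does not give TRIP's formula, so everything numeric is an assumption: the ten-year recency window, the equal weights, the title/density term measure (single-word terms only), and the publication scores in the sample data.

```python
from dataclasses import dataclass

@dataclass
class Record:
    title: str
    text: str
    year: int
    publication_score: float  # assumed scale 0..1, set per publication

def term_score(record: Record, term: str) -> float:
    """Assumed position/density measure: title match plus density in the text."""
    term = term.lower()
    words = record.text.lower().split()
    density = words.count(term) / len(words) if words else 0.0
    in_title = 1.0 if term in record.title.lower() else 0.0
    return 0.5 * in_title + 0.5 * min(1.0, density * 10)

def score(record: Record, term: str, current_year: int = 2007) -> float:
    """Equal-weight mean of recency, term score and publication score."""
    recency = max(0.0, 1.0 - (current_year - record.year) / 10)
    return (recency + term_score(record, term) + record.publication_score) / 3

def rank(records, term):
    """Return matching records, highest score first."""
    matching = [r for r in records if term_score(r, term) > 0]
    return sorted(matching, key=lambda r: score(r, term), reverse=True)

# Made-up records; the publication scores are illustrative only.
records = [
    Record("Asthma", "asthma overview", 2006, 0.4),
    Record("Asthma treatment review", "asthma inhaled steroids in asthma", 2006, 0.9),
    Record("Diabetes", "insulin", 2007, 1.0),
]
ranked = rank(records, "asthma")
```

With a publication-level weight in the score, two equally recent and equally matching records are ordered by the rating of their source, so the quality of term matching and indexing still decides which records are retrieved in the first place.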

42
Categorisation method: The content of the TRIP
Database is separated into a number of
categories. This categorisation is based on
Haynes's work on the 4S approach to current best
evidence. Evidence-based synopses.  These are
synopses of individual studies that have been
critically appraised.  This category includes
evidence-based journal reviews, CATs, Clinical
Evidence etc.  Clinical answers.  A number of
services exist to answer clinical questions. 
These services aim to match the best available
evidence to the question. Systematic reviews.
These are explicit, rigorous syntheses of primary
research studies. Guidelines. Clinical
guidelines are central to modern healthcare and
we have gathered collections from around the
globe. Given guidelines' geographic sensitivity
these have been separated according to the
country/area of origin.
47
Do not expect a 100% result from database
searches, however thoroughly you have searched
in many fine databases. This expectation is
encouraged by the protocol-driven way of working
for systematic reviews. There will be additions
from cited papers, citing papers, related
articles, etc., and from the researchers' own
contacts.
Greenhalgh T, Peacock R. Effectiveness and
efficiency of search methods in systematic
reviews of complex evidence: audit of primary
sources. BMJ 2005 Nov 5;331(7524):1064-5.
49
Introduction: In Cochrane reviews of therapeutic
interventions, most high quality primary studies
could be identified by searching four standard
databases: the Cochrane Controlled Trials Register
(which contains 79% of studies listed in Cochrane
systematic reviews), Medline (69%), Embase (65%),
and Science and Social Sciences Citation Indexes
(61%).1 Searching 26 further databases identified
only an extra 2.4% of trials. No comparable
figures have been published for systematic
reviews of complex evidence, which address broad
policy questions and synthesise qualitative and
quantitative evidence, usually from multiple and
disparate sources.2 The aim of our study was to
audit the origin of primary sources in a wide
ranging systematic review of complex evidence.
Method: We reviewed the diffusion of
service-level innovations in healthcare
organisations.