Simplifying Family History Research for the Naïve User: Building an Ontology and Expert Logic for Sea - PowerPoint PPT Presentation

1
Simplifying Family History Research for the Naïve
User: Building an Ontology and Expert Logic for
Searching Danish Genealogical Primary Records
  • By
  • Charla Woodbury
  • June 13, 2005

2
Real User Problem
  • A person decides to do family history research
    for the first time on their Danish family lines.
  • Where do they go?
  • What records do they look for?
  • How do they handle records in Danish?
  • How can they tell when the records they have
    match their search family?

3
Problem
  • Semantic web tools - Expanded to specialized
    domain expertise
  • SMART websites
  • Automatically link to best information
  • Make the user an expert
  • HELP
  • ANTICIPATE
  • GUIDE
  • TRAIN

4
Solution
  • Use an ontology with lexicons and description
    logic to
  • Extract the correct matching primary records
  • Compute calendar dates from feast dates, and
    birth dates from age at death
  • Match names and families

5
Methods
  • Preparing for the records extraction
  • Producing results listing
  • Evaluating the methodology

6
Preparing for Records Extraction
  • Ontology Building at the Entity Level
  • Annotating Primary Record Websites
  • Building Research Tools Inside the Ontology
  • Logic and Reasoning inside the Ontology

7
1 Ontology Entity Level
8
ONTOLOGY ENTITIES
  • FIND and MARK UP relevant web pages by
  • NAME
  • DATE
  • PLACE
  • RELATIONSHIP
  • OCCUPATION
  • RECORD_TYPE
  • SOURCE

9
Danish GIVEN NAME LEXICON: Add synonyms and
thesaurus
  • MALE
  • Anders And.
  • Andreas
  • Christen Kristen
  • Christian Kristian
  • Erik Eric
  • Gregers
  • Hans
  • Ib Jep Jeppe
  • Jacob
  • Jens
  • Johan Johannes Joh.
  • Jorgen Jørgen
  • Knud
  • Lars Laurs Laurids Lauritz
  • Mads Mats
  • FEMALE
  • Ane Anna Anne
  • Birthe Birte
  • Bodil
  • Caroline
  • Dorthe Dorte
  • Ellen Helene Elene
  • Elisabeth Elsbeth Lisbeth
  • Else Ilse
  • Ingeborg
  • Inger
  • Karen
  • Kirsten Christen Kirstine Christine Kirstine
    Chirstine
  • Malene
  • Maren
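
A lexicon like the one above can be read as a lookup table from each variant spelling to one key form. The following is a minimal sketch of that idea, not the presentation's implementation: the rows are a subset copied from the slide, and the names GIVEN_NAME_ROWS, canonical_given_name, and same_given_name are hypothetical.

```python
# Hypothetical subset of the given-name lexicon. Each row lists variant
# spellings from the slide; the first entry is used as the key form.
GIVEN_NAME_ROWS = [
    ["Christian", "Kristian"],
    ["Erik", "Eric"],
    ["Ib", "Jep", "Jeppe"],
    ["Johan", "Johannes", "Joh."],
    ["Jorgen", "Jørgen"],
    ["Lars", "Laurs", "Laurids", "Lauritz"],
    ["Ane", "Anna", "Anne"],
    ["Birthe", "Birte"],
    ["Dorthe", "Dorte"],
]


def build_variant_index(rows):
    """Map every variant (case-insensitive) to the key form of its row."""
    index = {}
    for row in rows:
        canonical = row[0].casefold()
        for variant in row:
            index[variant.casefold()] = canonical
    return index


VARIANTS = build_variant_index(GIVEN_NAME_ROWS)


def canonical_given_name(name):
    """Return the key form for a name, or the name itself if it is unknown."""
    key = name.strip().casefold()
    return VARIANTS.get(key, key)


def same_given_name(a, b):
    """True when both spellings resolve to the same key form."""
    return canonical_given_name(a) == canonical_given_name(b)
```

With this table, same_given_name("Laurids", "Lars") is true, while "Jens" and "Hans" stay distinct because neither is listed as a variant of the other.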

10
DATE Lexicon Adds Thesaurus of Synonyms
  • MONTHS
  • January Jan Januar -11br
  • February Feb Februar -12br
  • March Mar Marts
  • April Apr Apl
  • May Mai
  • June Jun Juni
  • July Jul Juli -5br
  • August Aug Augst -6br
  • September Sep Sept -7br Septembre
  • October Oct -8br Octobre
  • November Nov -9br Novembre
  • December Dec -10br
  • TIME
  • Year yr aar år
  • Month mo maaned m.
  • Week uge ug.
  • FEAST DATES
  • Easter Paaske Påske Paasche Påsche P.
  • Pentecost Pent Pinse -Pin
  • Trinity Tr Trin Trinitatis
  • DAYS OF WEEK
  • Sunday Sun Dominico Dom.
  • Monday Mon Mondag Mond.
  • Tuesday Tue Tirsdag Tirsd.
  • Wednesday Wed Onsdag Onsd.
  • Thursday Thur Tørsdag Tørsd.
  • Friday Fri Fredag Fred.
  • Saturday Sat Lørsdag Lørs
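
The date lexicon can be used the same way: every month token found in a record resolves to a month number. This is a small sketch under assumptions: only some rows of the slide are included, and MONTH_ROWS and month_number are hypothetical names.

```python
# Hypothetical subset of the month lexicon, keyed by month number.
# Tokens such as "7br" are the numeral abbreviations listed on the slide.
MONTH_ROWS = {
    3: ["March", "Mar", "Marts"],
    5: ["May", "Mai"],
    6: ["June", "Jun", "Juni"],
    9: ["September", "Sep", "Sept", "7br", "Septembre"],
    10: ["October", "Oct", "8br", "Octobre"],
    11: ["November", "Nov", "9br", "Novembre"],
    12: ["December", "Dec", "10br"],
}

# Reverse index: lower-cased token -> month number.
MONTH_INDEX = {
    token.casefold(): number
    for number, tokens in MONTH_ROWS.items()
    for token in tokens
}


def month_number(token):
    """Return the month number for a token, or None if it is not in the lexicon."""
    key = token.strip().rstrip(".").casefold()
    return MONTH_INDEX.get(key)
```

A token outside the lexicon, such as a feast name, returns None, so the caller can try the feast-date lexicon next.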

11
2 Annotating Primary Record Websites
  • Colors are used to represent the mark-ups

12
Web Page
  • SOURCE URL -Tvilum Sogne Kirkebog
  • PAGE HEADER Fødde 1751 3
  • BODY Truust Dom. 23 p Trinit laest over
    Niels Baches SØREN fadd. Johannes Michelsens og
    Niels Mollers hustruer af Søebyevad, Peder
    Rasmussen af Søebyevad, Jens Bachis søn Peder og
    Niels Thylkes s. Peder af Truust

13
ONTOLOGY ENTITIES
  • FIND and MARK UP relevant web pages by
  • NAME
  • DATE
  • PLACE
  • RELATIONSHIP
  • OCCUPATION
  • RECORD_TYPE
  • SOURCE

14
Annotated Web Page
  • SOURCE -Tvilum Parish Register
  • PAGE HEADER Fødde 1751 3
  • BODY Truust Dom. 23 p Trinit laest over
    Niels Baches SØREN fadd. Johannes Michelsens og
    Niels Mollers hustruer af Søebyevad, Peder
    Rasmussen af Søebyevad, Jens Bachis søn Peder og
    Niels Thylkes s. Peder af Truust

15
3 Building Research Tools Inside the
Ontology
  • Conversion functions
  • Matching different name forms
  • Matching place names to appropriate records

16
CONVERSION FUNCTIONS inside the ontology
  • Compute birthdate from age at death
  • Death: 22 Mar 1743
  • Age: 23 yr 2 m
  • Result: BIRTH Jan 1720
  • Compute dates from feast dates
  • Sunday 23rd after Trinity 1751
  • Result: 14 Nov 1751
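
Both conversions on this slide are plain date arithmetic. The sketch below reproduces the two examples and is not the presentation's own code. The function names are hypothetical. The Easter routine is the standard anonymous Gregorian algorithm, so it assumes Gregorian Easter rules, which may not hold for every year or for records written before Denmark's calendar change in 1700. Trinity Sunday is taken as eight weeks after Easter.

```python
from datetime import date, timedelta


def birth_from_age_at_death(death, years, months=0):
    """Estimate (year, month) of birth from a death date and an age in years and months."""
    total = death.year * 12 + (death.month - 1) - (years * 12 + months)
    return total // 12, total % 12 + 1


def gregorian_easter(year):
    """Easter Sunday by the anonymous Gregorian algorithm."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


def sunday_after_trinity(year, n):
    """Date of the n-th Sunday after Trinity (Trinity = Easter + 8 weeks)."""
    trinity = gregorian_easter(year) + timedelta(weeks=8)
    return trinity + timedelta(weeks=n)
```

For the slide's inputs, birth_from_age_at_death(date(1743, 3, 22), 23, 2) gives (1720, 1), and sunday_after_trinity(1751, 23) gives 14 November 1751.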

17
Match different name forms as ONE PERSON
  • Uses lexicon to determine different forms of the
    same name
  • JENS PEDERSEN
  • JENS PEDERSEN BACH
  • JENS BACH
  • JENS BACHIS
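
One way to treat these four forms as one person is to store the given name and the set of known bynames, then accept any form whose bynames all come from that set. This is a sketch under that assumption, not the rule the presentation uses. BYNAME_VARIANTS, normalize_tokens, matches_person, and the JENS record are hypothetical.

```python
# Hypothetical byname lexicon: inflected forms seen in records -> base form.
BYNAME_VARIANTS = {"bachis": "bach", "baches": "bach"}


def normalize_tokens(name):
    """Lower-case the name, drop punctuation, and map byname variants to base forms."""
    tokens = [t.strip(".,").casefold() for t in name.split()]
    return [BYNAME_VARIANTS.get(t, t) for t in tokens if t]


def matches_person(name_form, person):
    """True when the given name agrees and every byname is known for the person."""
    tokens = normalize_tokens(name_form)
    if not tokens or tokens[0] != person["given"]:
        return False
    return set(tokens[1:]) <= person["bynames"]


# Hypothetical knowledge-base entry for the person on the slide.
JENS = {"given": "jens", "bynames": {"pedersen", "bach"}}
```

All four forms on the slide match JENS, while "JENS THYLKE" does not. A bare "JENS" also matches under this rule, so a fuller system would probably want place, date, or relationship evidence before accepting it.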

18
PLACES - County Map of DENMARK
19
Parish and District Map of SKANDERBORG
20
Matching Places to Records
21
Logic and Reasoning inside the Ontology
  • Correct family placement of primary records -
    This is a logic and reasoning knowledge base
    which applies rules to determine that
  • Names of the children follow common naming
    practices
  • High percentage of the witnesses match
    individuals in the family knowledge base

22
Naming Practices
  • Male children are named in this order
  • Mother's previous husband (occasionally)
  • Father's father
  • Mother's father
  • Father

23
Knowledge Base Points out deviations from naming
practices
  • Father: LARS Andersen
  • FathersFather: ANDERS Pedersen
  • Mother: Maren Jensen
  • MothersFather: JENS Olesen
  • MothersPrevHusband: HENRICH Sorensen
  • Son1: HENRICH Larsen
  • Son2: ANDERS Larsen
  • Son3: JENS Larsen
  • Son4: LARS Larsen
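
The naming rule can be checked mechanically by listing the expected given names in order and comparing them with the sons' names. The sketch below does this for the family on the slide. The dictionary keys and function names are hypothetical, and a fuller version would compare names through the given-name lexicon instead of by exact spelling.

```python
# Naming order from the slide: mother's previous husband (if any),
# father's father, mother's father, father.
NAMING_ORDER = ["mothers_prev_husband", "fathers_father", "mothers_father", "father"]


def expected_son_names(family):
    """Given names the sons are expected to carry, in birth order."""
    return [family[k].split()[0].casefold() for k in NAMING_ORDER if family.get(k)]


def naming_deviations(family, sons):
    """Return (position, actual, expected) for each son who breaks the pattern."""
    deviations = []
    expected = expected_son_names(family)
    for position, (son, want) in enumerate(zip(sons, expected), start=1):
        got = son.split()[0].casefold()
        if got != want:
            deviations.append((position, got, want))
    return deviations


FAMILY = {
    "father": "LARS Andersen",
    "fathers_father": "ANDERS Pedersen",
    "mothers_father": "JENS Olesen",
    "mothers_prev_husband": "HENRICH Sorensen",
}
SONS = ["HENRICH Larsen", "ANDERS Larsen", "JENS Larsen", "LARS Larsen"]
```

For the family shown, naming_deviations(FAMILY, SONS) is empty. Swapping the first two sons reports positions 1 and 2 as deviations.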

24
Witness Match Knowledge Base
  • PURPOSE -Correct Family Placement
  • Description logic knowledge base
  • CHILD
  • PARENT
  • SPOUSE
  • SIBLING
  • Match christening record to the family where the
    highest percentage of witnesses can be matched to
    the knowledge base load
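
The witness rule amounts to scoring each candidate family by the share of witnesses found among its known members and picking the highest score. The following sketch assumes the witness names have already been resolved to full names (for example "Jens Bachis søn Peder" to "Peder Jensen Bach"). The family groupings and function names are hypothetical and only illustrate the scoring.

```python
def witness_match_ratio(witnesses, family_members):
    """Fraction of witnesses that appear among a family's known members."""
    if not witnesses:
        return 0.0
    known = {member.casefold() for member in family_members}
    hits = sum(1 for witness in witnesses if witness.casefold() in known)
    return hits / len(witnesses)


def best_family(witnesses, families):
    """Name of the family with the highest witness match ratio."""
    return max(families, key=lambda name: witness_match_ratio(witnesses, families[name]))


# Hypothetical, already-resolved witness list and candidate families.
WITNESSES = ["Johannes Michelsen", "Peder Rasmussen", "Peder Jensen Bach", "Peder Nielsen Thylke"]
FAMILIES = {
    "Bach": ["Jens Pedersen Bach", "Peder Jensen Bach", "Johannes Michelsen",
             "Niels Thylke", "Peder Nielsen Thylke"],
    "Other": ["Peder Rasmussen", "Hans Olesen"],
}
```

Here three of the four witnesses are known to the "Bach" family (ratio 0.75) against one of four for "Other", so best_family returns "Bach".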

25
Sample Load
  • Record: Niels Baches SØREN fadd. Johannes
    Michelsens og Niels Mollers hustruer af
    Søebyevad, Peder Rasmussen af Søebyevad, Jens
    Bachis søn Peder og Niels Thylkes s. Peder af
    Truust
  • Individuals in the diagram: Jens Pedersen Bach,
    Inger Nielsen, Michel Jensen, Anna Ibsen, Peder
    Jensen Bach, Anna Michelsen, Niels Thylke, Niels
    Jensen Bach, Abigael Michelsen, Peder Nielsen
    Thylke, Johannes Michelsen, Soren Nielsen Bach
  • Arrow legend: SPOUSE, PARENT, CHILD, SIBLING

26
Producing Results Listing
  • Processing the Input
  • Enough information?
  • Do the names, dates, places, and relationships
    correspond to lexicon values?
  • Using ONTOS to extract records

27
RESULTS LISTING
  • TARGET Jens Pedersen Bach
  • Truust, Tvilum Parish, Gjern District,
    Skanderborg
  • born 1693, died 1778

  • SOURCE -Tvilum Parish Register
  • PAGE HEADER Fødde 1751 3
  • BODY Truust Dom. 23 p Trinit laest over
    Niels Baches SØREN fadd. Johannes Michelsens og
    Niels Mollers hustruer af Søebyevad, Peder
    Rasmussen af Søebyevad, Jens Bachis søn Peder og
    Niels Thylkes s. Peder af Truust

28
Evaluating the Methodology
  • Search Speed
  • User Relevance Feedback
  • Accuracy of the results list
  • Ease or difficulty of use
  • Precision and Recall
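
Precision and recall for a results listing can be computed from two sets: the records the system returned and the records an expert judges relevant. This is a minimal sketch of the standard definitions. The function name and the record identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical record identifiers for one search.
RETRIEVED = ["r1", "r2", "r3", "r4"]
RELEVANT = ["r1", "r2", "r5"]
```

With these sets, two of the four returned records are relevant (precision 0.5) and two of the three relevant records were found (recall about 0.67).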

29
MAJOR CONTRIBUTIONS
  • A portal for family history research that could
    be easily expanded with
  • Maps and gazetteers
  • Look-ups
  • Helps
  • Training
  • Other countries and states
  • The first genealogical primary record extractor
    using semantic web tools which promises
  • Accuracy
  • Fast response
  • Ease of use
  • The first use of logic and reasoning inside an
    ontology to add expert rules for family history
  • A practical demonstration of the superiority of
    semantic web tools for future research