Extracting Paper Titles, Authors and Conferences from Lists on the Web

Slides: 14
Provided by: benjamin101
Learn more at: http://www.cs.cmu.edu

Transcript and Presenter's Notes


1
Extracting Paper Titles, Authors and Conferences
from Lists on the Web
  • Nguyen Bach
  • Sue Ann Hong
  • Ben Lambert

2
We will attempt to extract these predicates and relations
  • isAuthor(X)
  • isPaperTitle(X)
  • isConferenceName(X)
  • publishedAt(<paper>, <conference>)
  • authorOf(<author>, <paper>)
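
As a minimal sketch, the three predicates and two relations could be held as sets of strings and tuples. The `KnowledgeBase` class and its `add_citation` method are hypothetical names chosen for illustration; the slides do not describe a storage format.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Hypothetical container for the predicates and relations listed above."""
    authors: set = field(default_factory=set)           # isAuthor(X)
    paper_titles: set = field(default_factory=set)      # isPaperTitle(X)
    conference_names: set = field(default_factory=set)  # isConferenceName(X)
    published_at: set = field(default_factory=set)      # publishedAt(<paper>, <conference>)
    author_of: set = field(default_factory=set)         # authorOf(<author>, <paper>)

    def add_citation(self, authors, title, conference=None):
        """Record one extracted citation as facts in every applicable set."""
        self.paper_titles.add(title)
        for author in authors:
            self.authors.add(author)
            self.author_of.add((author, title))
        if conference is not None:
            self.conference_names.add(conference)
            self.published_at.add((title, conference))
```

Storing relations as tuples keeps duplicates out automatically, so the same citation found on several pages adds each fact only once.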

3
By looking at these Web pages
  • An author's publication list (on their home page)
  • A conference's accepted-papers list (on the conference Web page)

4
AAAI-05 Paper List

5-9
(No Transcript)

10
Then we look for patterns
  • <author> and <author>, <title>.
  • <author>, <author>, <author> and <author>, <title>.
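
One way to match patterns of this shape is a regular expression. The sketch below is an assumption, not the presenters' implementation: `NAME`, `CITATION` and `parse_citation` are hypothetical, and a name is assumed to be two or more space-separated tokens that each start with an ASCII capital letter.

```python
import re

# Assumed shape of a name: "Q. Yang", "Sue Ann Hong" (two or more capitalized tokens).
NAME = r"[A-Z][\w.'\-]*(?: [A-Z][\w.'\-]*)+"

# <author>(, <author>)* and <author>, <title>.
CITATION = re.compile(
    rf"^\s*(?P<authors>{NAME}(?:\s*,\s*{NAME})*\s+and\s+{NAME})"
    rf"\s*,\s*(?P<title>.+?)\.?\s*$"
)


def parse_citation(line):
    """Return (list of authors, title) if the line fits the pattern, else None."""
    match = CITATION.match(line)
    if match is None:
        return None
    authors = re.split(r"\s*,\s*|\s+and\s+", match.group("authors"))
    return authors, match.group("title")
```

For example, `parse_citation("Nguyen Bach, Sue Ann Hong and Ben Lambert, Extracting Paper Titles from Lists on the Web.")` yields the three author names and the title without its trailing period, while a line such as "AAAI-05 Paper List" yields `None`. Single-author citations and names that begin with a non-ASCII capital are deliberately not covered by this sketch.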

11
But
  • Maybe the patterns are wrong, so look for some
    more evidence
  • Once we have enough evidence we can add it to our
    knowledge base.
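
The "enough evidence" step can be sketched as counting the distinct pages that support a candidate fact and promoting it once a threshold is met. The `EvidenceStore` class and the default threshold of two sources are assumptions for illustration; the slides do not say how much evidence is enough.

```python
from collections import defaultdict


class EvidenceStore:
    """Hypothetical store that promotes a fact once enough distinct pages agree."""

    def __init__(self, min_sources=2):  # threshold is an assumed value
        self.min_sources = min_sources
        self._sources = defaultdict(set)  # candidate fact -> pages that support it
        self.knowledge_base = set()       # facts with enough evidence

    def observe(self, fact, source):
        """Record that `source` supports `fact`; return True if it is now accepted."""
        self._sources[fact].add(source)
        if len(self._sources[fact]) >= self.min_sources:
            self.knowledge_base.add(fact)
            return True
        return False
```

Counting distinct sources rather than raw matches means that a pattern firing twice on the same page does not count as independent confirmation.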

12
And around 100 other citations in roughly the
same format
13
Conclusion
  • Redundancy = # of authors 1
  • Seems to be very precise since titles usually do
    not have many spelling variations.
  • Helps to find alternate name spellings
  • Could there be a Q. Yang and a Qiang Yang with nearly identical
    publications?
  • Works for other fields also (e.g. History)
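
The alternate-spelling idea can be sketched as follows: two names are treated as candidates for the same person when their surnames match, one first name is a prefix of the other, and their publication lists overlap heavily. All function names and the 0.5 overlap threshold are assumptions for illustration, not something stated in the slides.

```python
import re


def normalize_title(title):
    """Lowercase a title and drop punctuation so small variations compare equal."""
    return " ".join(re.findall(r"[a-z0-9]+", title.lower()))


def names_compatible(name_a, name_b):
    """True if surnames match and one first name is a prefix of the other."""
    a = name_a.replace(".", " ").lower().split()
    b = name_b.replace(".", " ").lower().split()
    if not a or not b or a[-1] != b[-1]:
        return False
    return a[0].startswith(b[0]) or b[0].startswith(a[0])


def likely_same_author(name_a, titles_a, name_b, titles_b, min_overlap=0.5):
    """Compare names, then the Jaccard overlap of the normalized title sets."""
    if not names_compatible(name_a, name_b):
        return False
    set_a = {normalize_title(t) for t in titles_a}
    set_b = {normalize_title(t) for t in titles_b}
    union = set_a | set_b
    if not union:
        return False
    return len(set_a & set_b) / len(union) >= min_overlap
```

Under these assumptions, "Q. Yang" and "Qiang Yang" pass the name check, and the pair is accepted only if the two title lists also share at least half of their combined entries.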