Kshitij: A Search and Page Recommendation System for Wikipedia - PowerPoint PPT Presentation


Transcript and Presenter's Notes

1
Kshitij: A Search and Page Recommendation System
for Wikipedia
  • Phanikumar Bhamidipati
  • Kamalakar Karlapalem

2
Agenda
  • Motivation
  • Problem statement
  • Kshitij
  • Graph Model
  • Overview
  • Architecture
  • Core Algorithms
  • CBR, LBR, YBR, AR
  • Results
  • Conclusion & Future Work
  • Demo
  • Q & A

3
Motivation
  • New paradigms in Search
  • Increased interest after PageRank and HITS
    algorithms
  • NAR, CWS, Koru
  • Wikipedia
  • Powerful online collaborative encyclopedia, free
    of cost
  • Vast knowledge, available in structured format
  • Recommendation Systems
  • Information filtering tools
  • Success story in e-commerce

4
Motivation
  • Need for systems that leverage Wikipedia
    knowledge in recommendations
  • Knowledge as Service → Recommendations as Service
  • Empowerment of datasets that are not Semantic Web
    ready (current state of WWW?)
  • Challenges in the protocols and crisp definition
    of knowledge

5
Kshitij
  • Evaluating how effectively semantic information
    from annotated data sets such as Wikipedia can be
    used in recommendation systems
  • A generic Recommendation System based on
    Wikipedia semantics
  • Provides two services
  • Search Recommendations
  • Page Recommendations

6
The Graph Structure
[Figure: graph of Wikipedia pages around a search for "Jaguar". Nodes: Search, Atari 7800, Atari Jaguar, Atari Jaguar II, Jaguar, Felidae, Jaguar Cars, Black Panther, Mammal, William Lyons, Automobile]

7
Kshitij - Overview
  • Leverages the structured model powered by Wikis
  • Categories
  • Links
  • YAGO: an ontology compiled from Wikipedia, used as
    the static source of knowledge
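
One way to picture the three knowledge sources is as a single graph whose edges carry a type. The sketch below is only an illustration: the class name `TypedGraph`, the edge-type labels ("category", "link", "yago"), and the sample edges are assumptions, not taken from the presentation.

```python
from collections import defaultdict


class TypedGraph:
    """Hypothetical page graph with one edge set per knowledge source."""

    def __init__(self):
        # edge_type -> source page -> set of target pages
        self._edges = defaultdict(lambda: defaultdict(set))

    def add_edge(self, source, edge_type, target):
        self._edges[edge_type][source].add(target)

    def neighbors(self, page, edge_type):
        """Pages reachable from `page` along edges of a single type."""
        return sorted(self._edges[edge_type].get(page, ()))


# Illustrative edges only; the real edges in the slide's figure are not known.
GRAPH = TypedGraph()
GRAPH.add_edge("Jaguar", "category", "Felidae")
GRAPH.add_edge("Jaguar", "link", "Black Panther")
GRAPH.add_edge("Jaguar Cars", "yago", "William Lyons")
```

Keeping the edge type as the outer key means each recommender can walk only the edges of its own kind.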

8
Kshitij Architecture
9
Kshitij - Algorithms
  • Three individual recommendations that explore
    different semantics
  • CBR
  • LBR
  • YBR
  • A link-based aggregator (AR) that combines the
    three into a single set of recommendations

10
Category Based Recommendations (CBR)
  • If two pages belong to multiple categories
    together, are they related?
  • Example: London and Berlin both appear in "Capitals in Europe" and
    "Host cities of the Summer Olympic Games"
  • Starts with a set of pages (search output)
  • Explores category structure to obtain candidate
    pages
  • Prunes the list based on similarity values
    calculated from shared categories (T1 and T2)
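
The steps above can be sketched roughly as follows. The slide does not define T1 and T2, so this sketch assumes T1 is a minimum count of shared categories and T2 a minimum Jaccard similarity between category sets. The function name and the sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: page -> set of categories.
PAGE_CATEGORIES = {
    "London": {"Capitals in Europe",
               "Host cities of the Summer Olympic Games",
               "Cities in England"},
    "Berlin": {"Capitals in Europe",
               "Host cities of the Summer Olympic Games",
               "Cities in Germany"},
    "Vienna": {"Capitals in Europe", "Cities in Austria"},
}


def category_based_recommendations(seed_pages, page_categories, t1=2, t2=0.3):
    """Rank candidate pages by category overlap with the seed pages.

    t1: minimum number of shared categories (assumed meaning of T1)
    t2: minimum Jaccard similarity of category sets (assumed meaning of T2)
    """
    # Invert the map so each category can be expanded into its pages.
    category_pages = defaultdict(set)
    for page, cats in page_categories.items():
        for cat in cats:
            category_pages[cat].add(page)

    seeds = set(seed_pages)
    scores = {}
    for seed in seed_pages:
        seed_cats = page_categories.get(seed, set())
        # Candidates share at least one category with the seed.
        candidates = set()
        for cat in seed_cats:
            candidates |= category_pages[cat]
        candidates -= seeds
        for cand in candidates:
            cand_cats = page_categories[cand]
            shared = seed_cats & cand_cats
            jaccard = len(shared) / len(seed_cats | cand_cats)
            # Prune candidates that fail either threshold.
            if len(shared) >= t1 and jaccard >= t2:
                scores[cand] = max(scores.get(cand, 0.0), jaccard)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

With the sample data, a search result of "London" keeps Berlin (two shared categories) and prunes Vienna (one shared category).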

11
Link Based Recommendations (LBR)
  • Each page is a transaction and its links are the items
  • Are two pages related if many transactions
    support them together?
  • Competing sports persons, products
  • Start with search results and output of CBR
  • Identify frequent item sets
  • Support from search results is weighted higher than support from CBR output
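
A minimal sketch of the idea, restricted to item sets of size two. The slide does not say how search results are favored over CBR output, so the weights (2.0 and 1.0), the support threshold, the function name, and the sample transactions are all assumptions.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: (origin of the page, links found on the page).
TRANSACTIONS = [
    ("search", ["Player A", "Player B", "Tournament X"]),
    ("cbr", ["Player A", "Player B"]),
    ("cbr", ["Player B", "Tournament X"]),
]


def frequent_link_pairs(transactions, min_support=3.0,
                        search_weight=2.0, cbr_weight=1.0):
    """Return pairs of linked pages whose weighted support meets min_support."""
    support = Counter()
    for origin, links in transactions:
        # Assumed weighting: a search-result page counts more than a CBR page.
        weight = search_weight if origin == "search" else cbr_weight
        for pair in combinations(sorted(set(links)), 2):
            support[pair] += weight
    return {pair: s for pair, s in support.items() if s >= min_support}
```

A full implementation would use a frequent item set miner such as Apriori to go beyond pairs. The pair-counting version only shows how weighted support is accumulated.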

12
Yago Based Recommendations (YBR)
  • Set of facts in triplet form <E1, R, E2>
  • <New Delhi, Is Capital Of, India>
  • Prune the relation types
  • Start with search output
  • Retrieve entities related to these pages
  • Merge the lists and identify the related pages
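
The steps above could look roughly like this. The triple about New Delhi comes from the slide. The other facts, the relation names, the set of retained relation types, and the merge rule (count how many facts connect an entity to the search output) are assumptions for illustration.

```python
from collections import Counter

# Facts as (E1, R, E2) triples; relation names are illustrative, not YAGO's.
FACTS = [
    ("New Delhi", "Is Capital Of", "India"),
    ("Mumbai", "Is Located In", "India"),
    ("Mumbai", "Is Located In", "Maharashtra"),
    ("New Delhi", "Has Type", "City"),
]


def yago_based_recommendations(seed_pages, facts, allowed_relations):
    """Return entities related to the seed pages, most supported first."""
    seeds = set(seed_pages)
    counts = Counter()
    for e1, relation, e2 in facts:
        if relation not in allowed_relations:
            continue  # pruned relation type
        # Collect the entity on the other side of a fact that touches a seed.
        if e1 in seeds and e2 not in seeds:
            counts[e2] += 1
        elif e2 in seeds and e1 not in seeds:
            counts[e1] += 1
    # Merging the per-seed lists is done here by summing counts.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

For the seeds "New Delhi" and "Mumbai" with the "Has Type" relation pruned, India ranks first because two facts connect it to the search output.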

13
Diversity of the algorithms
  • Each explores a different knowledge space: the
    graph is explored along edges of the same color
  • Recommendations of individual algorithms differ
  • Need for aggregation

14
Aggregated Recommendations (AR)
  • Link-based topic filtering
  • Aggregated list of CBR, LBR and YBR → CL
  • Explore the neighborhood for each search result
    to find how many in CL are reachable
  • Each result page as a point in k-dimensional
    space (each dimension by one page in CL)
  • Run AGNES to obtain clusters of result pages
  • High level topics are distilled
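
A small sketch of the aggregation step, under several assumptions: reachability is checked within a fixed number of link hops, vectors are 0/1, and AGNES is run with average linkage over Euclidean distance until a chosen number of clusters remains. The link data and both function names are hypothetical, and the slide does not state the linkage or the stopping rule.

```python
import math

# Hypothetical inputs; the page names are borrowed from the graph slide.
RESULT_PAGES = ["Jaguar", "Black Panther", "Jaguar Cars", "Atari Jaguar"]
CL = ["Felidae", "Mammal", "Automobile", "William Lyons", "Atari 7800"]
LINKS = {
    "Jaguar": ["Felidae", "Mammal"],
    "Black Panther": ["Felidae", "Mammal"],
    "Jaguar Cars": ["William Lyons", "Automobile"],
    "Atari Jaguar": ["Atari 7800"],
}


def reachability_vectors(result_pages, cl, links, depth=1):
    """Map each result page to a 0/1 vector with one dimension per CL page."""
    vectors = {}
    for page in result_pages:
        seen = {page}
        frontier = {page}
        for _ in range(depth):
            frontier = {t for p in frontier for t in links.get(p, ())} - seen
            seen |= frontier
        reachable = seen - {page}
        vectors[page] = [1 if c in reachable else 0 for c in cl]
    return vectors


def agnes(vectors, num_clusters):
    """Bottom-up clustering: repeatedly merge the two closest clusters."""
    clusters = [[name] for name in sorted(vectors)]

    def distance(a, b):
        # Average linkage: mean pairwise Euclidean distance.
        total = sum(math.dist(vectors[x], vectors[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > num_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

With these inputs and three clusters requested, "Jaguar" and "Black Panther" merge first because they reach the same CL pages, which separates an animal topic from the car and console topics.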

15
Results: Evaluation
  • Mean Absolute Error (MAE) for evaluating the
    recommendations
  • Lower MAE implies higher-quality recommendations
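
For reference, MAE is the average of the absolute differences between predicted and actual values. The sketch below assumes the two lists hold the system's scores and the evaluators' ratings for the same recommendations, in the same order. The slide does not say how those values were collected.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between paired predicted and actual values."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

For example, predictions of 4, 3 and 5 against ratings of 5, 3 and 3 differ by 1, 0 and 2, giving an MAE of 1.0.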

16
Results: Search Recommendations
Keyword: jaguar

17
Results: Search Recommendations
Keyword: amazon

18
Results: Page Recommendations

19
Conclusion
  • Good quality recommendations can be obtained from
    annotated knowledge bases using only semantic
    information
  • (Recent version) Ability to expose the
    recommendations as a service
  • Usable by external clients through HTTP
  • Built on top of MediaWiki 1.11

20
Future Work
  • Query eligibility
  • How to determine if a page/search query should be
    recommended?
  • Recommendation as Service
  • More Wikipedia structures
  • Templates, References, Info-Boxes, History

21
Demo

22
Q & A