Exploring Folksonomy for Personalized Search

Description:

Associations between the users and the web pages, using Vector Space Model (VSM) ... Interest and Topic Adjusting via Bipartite Collaborative Link Structure ...

Slides: 16
Provided by: sle107
Transcript and Presenter's Notes

1
University of Seoul Sungjick Lee
2
Personalized Search with Folksonomy
  • Folksonomy
      • as Category Names
      • as Keywords
      • as Link Structure
  • Personalized Search
      • Associations between the users and the web pages
      • using Vector Space Model (VSM)
          • interest vector of user
          • topic vector of page

3
A Personalized Search Framework
All pages
Ranking pages by the topic matching model (user ↔ page)
  • The topic vector of the web page p_i
  • The interest vector of the user u_j
  • The topic similarity between p_i and u_j
Ranking pages by the text matching model (query ↔ page)

Ranking Aggregation
4
A Personalized Search Framework
In practice
All pages
Ranking pages by the topic matching model (user ↔ page)
  • The topic vector of the web page p_i
  • The interest vector of the user u_j
  • The topic similarity between p_i and u_j
Ranking pages by the text matching model (query ↔ page)

Top 100 pages
Ranking Aggregation
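
The transcript does not say how the two rankings are aggregated. As a rough sketch only, the following assumes a weighted Borda-style rank fusion applied to the top 100 pages of the text ranking. The function name, the weight `gamma`, and the `top_k` parameter are hypothetical and do not come from the slides.

```python
def aggregate_rankings(text_ranking, topic_scores, gamma=0.5, top_k=100):
    """Fuse a text ranking with topic similarities (assumed Borda-style fusion).

    text_ranking: page ids ordered best-first by the text matching model.
    topic_scores: dict mapping page id -> topic similarity with the user.
    gamma: hypothetical weight of the text ranking (1 - gamma goes to topics).
    """
    # Keep only the best pages of the text ranking as candidates.
    candidates = text_ranking[:top_k]
    n = len(candidates)
    # Rank the same candidates by topic similarity; a missing score counts as 0.
    topic_ranking = sorted(candidates,
                           key=lambda p: topic_scores.get(p, 0.0),
                           reverse=True)
    # Borda points: the best page of a ranking gets n points, the worst gets 1.
    text_points = {p: n - i for i, p in enumerate(candidates)}
    topic_points = {p: n - i for i, p in enumerate(topic_ranking)}
    fused = {p: gamma * text_points[p] + (1 - gamma) * topic_points[p]
             for p in candidates}
    # sorted() is stable, so ties keep the order of the text ranking.
    return sorted(candidates, key=lambda p: fused[p], reverse=True)
```

With `gamma=1` the result is the plain text ranking, and with `gamma=0` the candidates are ordered purely by topic similarity.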
5
A Personalized Search Framework
In practice
Ranking pages by the text matching model (query ↔ page)
  • Two state-of-the-art text retrieval models
      • BM25
      • Language Model for IR (LMIR)
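
For readers unfamiliar with BM25, a minimal sketch of the scoring function follows. It uses a commonly used smoothed IDF variant and the conventional defaults `k1=1.2` and `b=0.75`. The slides give no parameter values, so these are assumptions, and the document list is assumed to be non-empty and already tokenized.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1.0" keeps the value non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

A document that does not contain any query term scores 0, and among documents of equal length the one with more occurrences of a query term scores higher.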

6
A Personalized Search Framework
In practice
Ranking pages by the topic matching model (user ↔ page)
  • The topic vector of the web page p_i
  • The interest vector of the user u_j
  • The topic similarity between p_i and u_j
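
The formula for the topic similarity is not preserved in this transcript. Since the deck names the Vector Space Model, the sketch below assumes cosine similarity between the interest vector and the topic vector. The function names are illustrative only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_pages_by_topic(user_interest, page_topics):
    """Order pages by similarity between their topic vector and the user's interest vector.

    page_topics: dict mapping page id -> topic vector in the same topic space.
    Returns a list of (page id, similarity) pairs, best first.
    """
    scored = [(pid, cosine_similarity(user_interest, vec))
              for pid, vec in page_topics.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```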

7
Ranking pages by topic matching model(1/6)
  • Estimating the initial topic vectors and initial interest vectors
      • From Folksonomy (using TFIDF / BM25)
      • From Taxonomy (ODP categories as topics)
  • Interest and Topic Adjusting via Bipartite Collaborative Link Structure

8
Ranking pages by topic matching model(2/6)
  • Estimating the initial topic vectors and initial interest vectors
      • From Folksonomy (using TFIDF / BM25)

Topic space: the social annotations
  • Users: each user is a document whose terms are the social annotations of that user; TFIDF / BM25 weighting gives the interest vector of each user
  • Web pages: each web page is a document whose terms are the social annotations of that page; TFIDF / BM25 weighting gives the topic vector of each page
9
Ranking pages by topic matching model(3/6)
  • Estimating the initial topic vectors and initial interest vectors
      • From Folksonomy (using TFIDF / BM25)

Documents: the users
Terms: the social annotations of each user

Folksonomy
User A   User B   User C
Car      Girl     Car
Book     Girl
Girl     Girl

User A   Book     Car        Girl
TF       1        1          1
IDF      log(3)   log(3/2)   log(3/2)
TFIDF    log(3)   log(3/2)   log(3/2)

The interest vector of User A: r_A = (log(3), log(3/2), log(3/2))
Topic space: (Book, Car, Girl)
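
The worked example on this slide can be reproduced with a few lines of code. This is a sketch under stated assumptions: TF is the raw tag count, IDF is log(N / df) with the natural logarithm, and the tags of User B and User C are a reading of the flattened table that is consistent with the IDF values shown. Only User A's tags and the resulting vector are confirmed by the slide.

```python
import math
from collections import Counter

def tfidf_vectors(annotations, topic_space):
    """Build one TF-IDF vector per owner (user or page) over `topic_space`.

    annotations: dict mapping owner -> list of tags (repeats allowed).
    """
    n_docs = len(annotations)
    # Document frequency: number of owners that used each tag at least once.
    df = Counter(tag for tags in annotations.values() for tag in set(tags))
    vectors = {}
    for owner, tags in annotations.items():
        tf = Counter(tags)
        vectors[owner] = [
            tf[tag] * math.log(n_docs / df[tag]) if df[tag] else 0.0
            for tag in topic_space
        ]
    return vectors

annotations = {
    "User A": ["Car", "Book", "Girl"],
    "User B": ["Girl", "Girl", "Girl"],  # assumed reading of the table
    "User C": ["Car"],                   # assumed reading of the table
}
topic_space = ["Book", "Car", "Girl"]
r_A = tfidf_vectors(annotations, topic_space)["User A"]
# r_A is [log(3), log(3/2), log(3/2)], matching the slide.
```

The same function applies to web pages by passing the annotations of each page instead of each user.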
10
Ranking pages by topic matching model(4/6)
  • Estimating the initial topic vectors and initial interest vectors (cont.)
      • From Taxonomy (ODP categories as topics)

Topic space: the ODP categories
  • All the descriptions of the web pages under a category give the term vector of the category
  • Cosine similarity between the category term vectors and the social annotations owned by each page gives the topic vector of each page
  • Cosine similarity between the category term vectors and the social annotations owned by each user gives the interest vector of each user
11
Ranking pages by topic matching model(5/6)
  • Estimating the initial topic vectors and initial interest vectors (cont.)
      • From Taxonomy (ODP categories as topics)

  • Category 1: cosine similarity between the term vector of category 1 and the social annotations owned by user A gives r_{1,A}
  • Category 2: cosine similarity between the term vector of category 2 and the social annotations owned by user A gives r_{2,A}
  • Category 3: cosine similarity between the term vector of category 3 and the social annotations owned by user A gives r_{3,A}

The interest vector of user A: (r_{1,A}, r_{2,A}, r_{3,A})
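
A minimal sketch of the taxonomy-based estimate follows. It assumes raw term counts as weights for both the category term vectors and the user's annotations, because the slides do not say which weighting is used. The function names are hypothetical.

```python
import math
from collections import Counter

def sparse_cosine(a, b):
    """Cosine similarity between two term -> weight mappings."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def interest_over_categories(owner_tags, category_terms):
    """One similarity per category, in the order the categories are given.

    owner_tags: list of annotations owned by a user (or a page).
    category_terms: list of term lists, one per category, taken from the
        descriptions of the pages filed under that category.
    """
    owner_vec = Counter(owner_tags)
    return [sparse_cosine(owner_vec, Counter(terms)) for terms in category_terms]
```

Passing the annotations of a page instead of a user yields the topic vector of that page over the same categories.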
12
Ranking pages by topic matching model(6/6)
  • Interest and Topic Adjusting via Bipartite Collaborative Link Structure

  • W: the adjacency matrix, in which the rows represent the users and the columns represent the web pages; W_{i,j} is set to the number of annotations that u_i gives to p_j
  • r_{i,j}: the jth normalized interest of the ith user
  • t_{i,j}: the jth normalized topic of the ith web page
  • α: the weight of the initial estimated user interest
  • β: the weight of the initial estimated web page topic

Adjusting
  • Input: the initial interest vectors of each user and the initial topic vectors of each page
  • User interest adjusting by related web pages
  • Web page topic adjusting by related users
  • Output: the adjusted interest vectors of each user and the adjusted topic vectors of each page
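
The update formulas themselves did not survive in this transcript. The sketch below is therefore an assumption, not the presenter's exact rule: each user's interest is taken as α times the initial estimate plus (1 − α) times the W-weighted average of the topics of the pages that user annotated, and symmetrically for pages with β. The function name and the fixed iteration count are hypothetical, and re-normalization of the vectors is not handled.

```python
def adjust_vectors(W, r0, t0, alpha=0.5, beta=0.5, iterations=10):
    """Alternately adjust user interests and page topics over the bipartite graph.

    W[i][k]: number of annotations user i gave to page k.
    r0[i]: initial interest vector of user i.
    t0[k]: initial topic vector of page k (same topic space as r0).
    """
    n_users, n_pages, dim = len(r0), len(t0), len(r0[0])
    r = [vec[:] for vec in r0]
    t = [vec[:] for vec in t0]
    for _ in range(iterations):
        # User interest adjusting by related web pages.
        for i in range(n_users):
            weight = sum(W[i])
            if weight == 0:
                continue  # user without annotations keeps the current estimate
            r[i] = [alpha * r0[i][j]
                    + (1 - alpha) * sum(W[i][k] * t[k][j] for k in range(n_pages)) / weight
                    for j in range(dim)]
        # Web page topic adjusting by related users.
        for k in range(n_pages):
            weight = sum(W[i][k] for i in range(n_users))
            if weight == 0:
                continue  # page without annotations keeps the current estimate
            t[k] = [beta * t0[k][j]
                    + (1 - beta) * sum(W[i][k] * r[i][j] for i in range(n_users)) / weight
                    for j in range(dim)]
    return r, t
```

Setting `alpha=1` and `beta=1` leaves the initial vectors unchanged, which matches the description of α and β as the weights of the initial estimates.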
13
Data Set
  • Folksonomy
  • Two heterogeneous data sets

Data set      Web pages   Annotations   Users
Del.icio.us   90,300      65,080        9,813
Dogear        179,835     47,993        5,192

  • From each data set, three test beds according to the number of bookmarks owned by the users
      • gt500: all users who own more than 500 bookmarks
      • 80-100: 100 random users who own 80-100 bookmarks
      • 5-10: 100 randomly selected users who own 5-10 bookmarks

Data Set     Num. users   Max. Tags   Min. Tags   Avg. Tags   Max. Pages   Min. Pages   Avg. Pages
DEL.gt500    31           1133        74          464.42      1790         506          727.55
DEL.80-100   100          456         2           107.51      100          80           88.43
DEL.5-10     100          64          1           18.53       10           5            7.44
DOG.gt500    92           2147        42          543.87      4578         500          999.04
DOG.80-100   85           295         9           126.96      100          80           89.32
DOG.5-10     100          41          2           16.11       10           5            6.99
14
Evaluation metric
  • Mean Average Precision (MAP)
      • The mean, over a user's queries, of the average precision for each query
  • Mean Mean Average Precision (MMAP)
      • The mean of all the MAP values
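
The two metrics can be sketched as follows, assuming binary relevance judgments and one MAP value per user, with MMAP taken as the mean over users. The function names and the input layout are illustrative only.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked result list against a set of relevant pages."""
    hits = 0
    total = 0.0
    for rank, page in enumerate(ranked, start=1):
        if page in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant hit
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """MAP for one user: `runs` is a non-empty list of (ranked, relevant) pairs, one per query."""
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)

def mean_map(runs_per_user):
    """MMAP: the mean of the per-user MAP values (`runs_per_user` maps user -> runs)."""
    maps = [mean_average_precision(runs) for runs in runs_per_user.values()]
    return sum(maps) / len(maps)
```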

15
Performance