POLYPHONET: An Advanced Social Network Extraction System from the Web

1
POLYPHONET: An Advanced Social Network Extraction System from the Web
  • Yutaka Matsuo, National Institute of Advanced
    Industrial Science and Technology
    (y.matsuo_at_aist.go.jp)
  • Junichiro Mori, University of Tokyo, Hongo 7-3-1,
    Tokyo 113-8656, Japan
    (jmori_at_mi.ci.i.u-tokyo.ac.jp)
  • Masahiro Hamasaki, National Institute of Advanced
    Industrial Science and Technology
    (hamasaki_at_ni.aist.go.jp)
  • (WWW2006)
  • Finding Social Network for Trust Calculation
    (ECAI 2004)
  • Yutaka Matsuo, Hironori Tomobe, Koiti Hasida and
    Mitsuru Ishizuka

2
ABSTRACT
  • Social networks in Semantic Web
  • Knowledge management,
  • Information retrieval,
  • Ubiquitous computing
  • POLYPHONET
  • Extract relations of persons
  • Detect groups of persons
  • Obtain keywords for a person.

3
Introduction and Related work 1/3
  • Social Network
  • Typically collected by questionnaire, e.g. "Please
    indicate which persons you would regard as your
    friend."
  • Social networking services (SNSs)
  • Friendster: http://www.friendster.com/
  • Orkut: http://www.orkut.com/
  • Imeem: http://www.imeem.com/
  • Yahoo! 360°: http://360.yahoo.com/
  • Web of trust
  • Ontology construction

4
Introduction and Related work 2/3
  • Referral Web (1995)
  • social network extraction system from the Web
  • Measures the relation between two persons X and Y
    by putting the query "X and Y" to a search engine.
  • Flink
  • online social networks for a Semantic Web
    community
  • Given a set of names as input, the component uses
    a search engine to obtain hit counts
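
The pairwise querying described on this slide can be sketched as follows. This is a minimal illustration, not the Referral Web or Flink implementation: the names are made up, and the exact query syntax (quoted names joined with AND) is an assumption, since the slide only says the query is "X and Y".

```python
from itertools import combinations


def pair_queries(names):
    """Return (x, y, query) for every unordered pair of names.

    The query format below is an assumed syntax; a real search
    engine may expect something different.
    """
    return [(x, y, f'"{x}" AND "{y}"') for x, y in combinations(names, 2)]


# Hypothetical input list, for illustration only.
names = ["Alice Smith", "Bob Jones", "Carol Lee"]
queries = pair_queries(names)
for x, y, query in queries:
    print(query)
```

Each query would then be sent to a search engine and its hit count recorded. Note that n names produce n(n-1)/2 queries, which is what makes scalability a concern for larger lists.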

5
Introduction and Related work 3/3
  • Name disambiguation probability model
  • Co-occurrence information
  • provided by a search engine
  • to detect the proof of relations
  • Google-Hacks book
  • PageRank, HITS
  • Web graphs
  • Link structure of Web pages is seen as a social
    network.

6
Social Network Extraction 1/4
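  • Nodes and Edges
  • Nodes: a list of persons is given beforehand
  • JSAI2003, JSAI2004, JSAI2005 and UbiComp2005
  • Edges between nodes are added using a search
    engine.
  • Co-occurrence, measured from hit counts $n_X$,
    $n_Y$ (each name), $n_{X \wedge Y}$ (both names)
    and $n_{X \vee Y}$ (either name)
  • matching coefficient, $n_{X \wedge Y}$
  • mutual information, $\log\left(n_{X \wedge Y} / (n_X n_Y)\right)$
  • Dice coefficient, $2 n_{X \wedge Y} / (n_X + n_Y)$
  • Jaccard coefficient, $n_{X \wedge Y} / n_{X \vee Y}$
  • overlap coefficient, $n_{X \wedge Y} / \min(n_X, n_Y)$
    (ECAI 2004)
  • cosine, $n_{X \wedge Y} / \sqrt{n_X n_Y}$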
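
As a rough sketch, the co-occurrence measures listed on this slide can be computed from three hit counts using their standard definitions. The function name and the sample counts are made up for illustration, and the count for "X or Y" is approximated by inclusion-exclusion, which is an assumption (a real search engine's OR query may return a different number).

```python
import math


def cooccurrence_measures(n_x, n_y, n_xy):
    """Compute co-occurrence measures from hit counts.

    n_x, n_y: hit counts for each name alone (assumed > 0).
    n_xy:     hit count for both names together.
    """
    # Assumption: estimate the "X or Y" count by inclusion-exclusion.
    n_x_or_y = n_x + n_y - n_xy
    mutual_information = (
        math.log(n_xy / (n_x * n_y)) if n_xy > 0 else float("-inf")
    )
    return {
        "matching": n_xy,
        "mutual_information": mutual_information,
        "dice": 2 * n_xy / (n_x + n_y),
        "jaccard": n_xy / n_x_or_y,
        "overlap": n_xy / min(n_x, n_y),
        "cosine": n_xy / math.sqrt(n_x * n_y),
    }


# Made-up hit counts, for illustration only.
measures = cooccurrence_measures(n_x=100, n_y=400, n_xy=50)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

With these counts the overlap coefficient (0.5) is much higher than the Jaccard coefficient (about 0.11), because overlap normalizes by the smaller count only. That makes it less sensitive to one person having far more pages than the other.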

7
Social Network Extraction 2/4
  • k = 30
8
Social Network Extraction 3/4
9
Social Network Extraction 4/4
10
Advanced Extraction
  • Relationship
  • Relationships between people
  • 30 kinds of relationships
  • http://vocab.org/relationship
  • POLYPHONET
  • Co-author: co-authors of a technical paper
  • Lab: members of the same laboratory or research
    institute
  • Proj: members of the same project or committee
  • Conf: participants in the same conference or
    workshop

11
Advanced Extraction - Class of Relation 1/2
  • GoogleTop("X Y", 5)
  • C4.5
  • Five-fold cross validation (JSAI Case)

High tf-idf terms; manually categorized data set.
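
The five-fold cross validation mentioned on this slide can be illustrated with a small index-splitting sketch. It shows only how a labelled data set might be partitioned into five train/test splits. It does not train C4.5, and the function name and sample size are hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k (train, test) index lists.

    Folds are contiguous and unshuffled, which keeps the sketch
    deterministic; a real experiment would usually shuffle first.
    """
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


# 12 hypothetical labelled name pairs, split into 5 folds.
folds = k_fold_indices(12, k=5)
for train, test in folds:
    print(f"train={len(train)} test={test}")
```

Each sample appears in exactly one test fold, so a classifier trained five times is evaluated once on every labelled pair.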
12
Advanced Extraction - Class of Relation 2/2
13
Advanced Extraction Scalability 1/3
  • For example:
  • The network density of the JSAI2003 social
    network is 0.0196 with a 0.2 threshold.
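
For reference, network density is commonly defined as the number of edges divided by the number of possible edges. The sketch below assumes an undirected graph without self-loops, which is an assumption about how the slide's figure was computed. The toy edge list is made up and is not the JSAI2003 data.

```python
def network_density(num_nodes, edges):
    """Density of an undirected simple graph: edges / possible edges."""
    # Treat (a, b) and (b, a) as the same edge and ignore self-loops.
    unique_edges = {frozenset(e) for e in edges if e[0] != e[1]}
    possible = num_nodes * (num_nodes - 1) / 2
    return len(unique_edges) / possible if possible else 0.0


# Toy graph with 4 nodes; the third pair duplicates the first.
edges = [("a", "b"), ("b", "c"), ("b", "a")]
density = network_density(4, edges)
print(f"density = {density:.4f}")
```

A density of 0.0196 means that only about 2% of all possible pairs are connected, so most pairwise queries to the search engine find no edge.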

14
Advanced Extraction Scalability 2/3
15
Advanced Extraction Scalability 3/3
16
Advanced Extraction Intellectual link 1/6
  • Intellectual link
  • A relation between a pair of persons with similar
    interests or citations
  • Evaluation
  • They plot the probability that the two persons
    will attend the same session at a JSAI
    conference.
  • Idea
  • If two persons are researchers of very similar
    topics, the distribution of word co-occurrences
    will be similar.
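
One way to make the idea on this slide concrete is to represent each person as a vector of word co-occurrence weights and compare two vectors. The sketch below uses cosine similarity, which is an assumed choice since the slide does not name the comparison measure. The people, words and weights are made up.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two sparse word-weight vectors (dicts)."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical rows of a person-word matrix.
person_a = {"ontology": 3.0, "semantic web": 4.0}
person_b = {"ontology": 6.0, "semantic web": 8.0}
person_c = {"robotics": 5.0}

sim_ab = cosine_similarity(person_a, person_b)
sim_ac = cosine_similarity(person_a, person_c)
print(f"a~b: {sim_ab:.2f}, a~c: {sim_ac:.2f}")
```

Persons a and b have the same word distribution up to scale and score 1.0, while a and c share no words and score 0.0. Under the slide's hypothesis, a pair like a and b would be more likely to attend the same session.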

17
Advanced Extraction Intellectual link 2/6
  • Keyword extraction

Termex [37]
18
Advanced Extraction Intellectual link 3/6
  • Keyword extraction
  • 567 researchers with 3981 pages
  • They gave questionnaires to 10 researchers and
    defined the correct set of keywords.

19
Advanced Extraction Intellectual link 4/6
  • $\chi^2$
  • idf
  • hit
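
As an illustration of the first score in this list, the chi-square ($\chi^2$) statistic for a person and a word can be computed from a 2×2 contingency table of hit counts. This is the textbook formula, not necessarily the exact formulation used in the slides. The function name is hypothetical, and `n_total` (the total number of indexed pages) is an assumed input that a search engine does not usually report.

```python
def chi_square(n_x, n_w, n_xw, n_total):
    """Chi-square score for person X and word W from hit counts.

    n_x:     pages mentioning the person
    n_w:     pages mentioning the word
    n_xw:    pages mentioning both
    n_total: total pages in the index (assumed known)
    """
    a = n_xw                          # person and word
    b = n_x - n_xw                    # person without word
    c = n_w - n_xw                    # word without person
    d = n_total - n_x - n_w + n_xw    # neither
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n_total * (a * d - b * c) ** 2 / denom


# Made-up counts: the first word is independent of the person,
# the second appears only on the person's pages.
independent = chi_square(n_x=100, n_w=200, n_xw=20, n_total=1000)
associated = chi_square(n_x=100, n_w=100, n_xw=100, n_total=1000)
print(independent, associated)
```

A word whose co-occurrence with the person matches what independence predicts scores 0, while a word found only on that person's pages scores highest, which is why the statistic can rank candidate keywords.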

20
Advanced Extraction Intellectual link 5/6
21
Advanced Extraction Intellectual link 6/6
22
POLYPHONET 1/4
23
POLYPHONET 2/4
24
POLYPHONET 3/4
25
POLYPHONET 4/4
26
Conclusion
  • This paper describes a social network mining
    approach using the Web and organizes the methods
    into small pseudocode modules.
  • New aspects of social networks are investigated:
    classes of relations, scalability, and a
    person-word matrix.
  • Every algorithm is implemented in POLYPHONET.
