Title: POLYPHONET: An Advanced Social Network Extraction System from the Web
1 POLYPHONET: An Advanced Social Network Extraction System from the Web
- Yutaka Matsuo, National Institute of Advanced Industrial Science and Technology, y.matsuo_at_aist.go.jp
- Junichiro Mori, University of Tokyo, Hongo 7-3-1, Tokyo 113-8656, Japan, jmori_at_mi.ci.i.u-tokyo.ac.jp
- Masahiro Hamasaki, National Institute of Advanced Industrial Science and Technology, hamasaki_at_ni.aist.go.jp
- (WWW2006)
- Finding Social Network for Trust Calculation (ECAI 2004): Yutaka Matsuo, Hironori Tomobe, Koiti Hasida and Mitsuru Ishizuka
2 ABSTRACT
- Social networks play important roles in the Semantic Web:
- Knowledge management
- Information retrieval
- Ubiquitous computing
- POLYPHONET is a social network extraction system that can:
- Extract relations of persons
- Detect groups of persons
- Obtain keywords for a person
3 Introduction and Related work 1/3
- Social network
- Traditional questionnaire: "Please indicate which persons you would regard as your friend."
- Social networking services (SNSs)
- Friendster http://www.friendster.com/
- Orkut http://www.orkut.com/
- Imeem http://www.imeem.com/
- Yahoo! 360° http://360.yahoo.com/
- Web of trust
- Ontology construction
4 Introduction and Related work 2/3
- Referral Web (1995)
- A social network extraction system from the Web
- Estimates the relation between two persons X and Y by putting the query "X and Y" to a search engine
- Flink
- Extracts online social networks for a Semantic Web community
- Given a set of names as input, the component uses a search engine to obtain hit counts
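The hit-count idea above can be sketched in a few lines. This is only an illustration: the `PAGES` list is a hypothetical stand-in for a search engine index, and the function names `hit_count` and `cooccurrence_counts` are assumptions of this sketch, not part of Referral Web, Flink, or POLYPHONET.

```python
from itertools import combinations

# Hypothetical stand-in for a search engine index: each string is one "Web page".
PAGES = [
    "Alice Smith and Bob Jones presented a joint paper",
    "Bob Jones gave a keynote",
    "Alice Smith, Bob Jones and Carol Wu organised the workshop",
    "Carol Wu joined the committee",
]

def hit_count(*names, pages=PAGES):
    """Number of pages that mention every given name (an AND query)."""
    return sum(all(name in page for name in names) for page in pages)

def cooccurrence_counts(names, pages=PAGES):
    """Hit counts for each single name and for each unordered pair of names."""
    single = {name: hit_count(name, pages=pages) for name in names}
    pair = {
        (x, y): hit_count(x, y, pages=pages)
        for x, y in combinations(names, 2)
    }
    return single, pair

single, pair = cooccurrence_counts(["Alice Smith", "Bob Jones", "Carol Wu"])
```

With a real search engine, `hit_count` would issue one query per name and one per pair, so the number of queries grows quadratically with the number of names.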
5 Introduction and Related work 3/3
- Name disambiguation: probability model
- Co-occurrence information
- Provided by a search engine
- Used to detect the proof of relations
- Google-Hacks book
- PageRank, HITS
- Web graphs
- The link structure of Web pages is seen as a social network.
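6 Social Network Extraction 1/4
- Nodes and edges
- Nodes: a list of persons is given beforehand
- JSAI2003, JSAI2004, JSAI2005 and UbiComp2005
- Edges between nodes are added using a search engine.
- Co-occurrence measures, where $n_X$ and $n_Y$ are the hit counts for names X and Y, $n_{X \wedge Y}$ is the hit count for "X AND Y", and $n_{X \vee Y}$ is the hit count for "X OR Y"
- matching coefficient, $n_{X \wedge Y}$
- mutual information, $\log(n_{X \wedge Y} / (n_X n_Y))$
- Dice coefficient, $2 n_{X \wedge Y} / (n_X + n_Y)$
- Jaccard coefficient, $n_{X \wedge Y} / n_{X \vee Y}$
- overlap coefficient, $n_{X \wedge Y} / \min(n_X, n_Y)$ (ECAI 2004)
- cosine, $n_{X \wedge Y} / \sqrt{n_X n_Y}$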
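The co-occurrence measures listed above all derive from three hit counts, which the following sketch makes concrete. It uses the standard definitions of the Dice coefficient (sum in the denominator) and cosine (square root in the denominator). Two points are assumptions of this sketch rather than statements from the slides: the count for "X OR Y" is derived by inclusion-exclusion instead of a separate query, and the mutual information is computed in the slide's form without a corpus-size factor. The function name `cooccurrence_measures` is hypothetical.

```python
import math

def cooccurrence_measures(n_x, n_y, n_xy):
    """Co-occurrence measures from hit counts for X, Y, and 'X AND Y'."""
    if n_x <= 0 or n_y <= 0:
        raise ValueError("hit counts for single names must be positive")
    # Assumption: derive the 'X OR Y' count by inclusion-exclusion.
    n_x_or_y = n_x + n_y - n_xy
    return {
        "matching": n_xy,
        # Slide form, without a corpus-size factor; undefined at n_xy == 0.
        "mutual_information": (
            math.log(n_xy / (n_x * n_y)) if n_xy > 0 else float("-inf")
        ),
        "dice": 2 * n_xy / (n_x + n_y),
        "jaccard": n_xy / n_x_or_y,
        "overlap": n_xy / min(n_x, n_y),
        "cosine": n_xy / math.sqrt(n_x * n_y),
    }

scores = cooccurrence_measures(n_x=20, n_y=10, n_xy=5)
```

For these counts the overlap coefficient is 0.5 and the Jaccard coefficient is 0.2, which shows how the overlap coefficient stays high when one person has far fewer pages than the other.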
7 Social Network Extraction 2/4
8 Social Network Extraction 3/4
9 Social Network Extraction 4/4
10 Advanced Extraction
- RELATIONSHIP vocabulary
- Describes relationships between people
- 30 kinds of relationships
- http://vocab.org/relationship
- POLYPHONET uses four relation classes:
- Co-author: co-authors of a technical paper
- Lab: members of the same laboratory or research institute
- Proj: members of the same project or committee
- Conf: participants in the same conference or workshop
11 Advanced Extraction - Class of Relation 1/2
- GoogleTop("X Y", 5): fetch the top five pages for the query "X Y"
- Classifier: C4.5
- Five-fold cross validation (JSAI case)
- High tf-idf terms in the data set are manually categorized.
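The five-fold cross validation mentioned above can be sketched as follows. To keep the example short and dependency-free, a majority-class baseline stands in for the C4.5 classifier, so the accuracy it reports says nothing about the paper's results. The function names and the toy labels are assumptions of this sketch.

```python
from collections import Counter

def k_fold_indices(n_items, k=5):
    """Split range(n_items) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_items, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(labels, k=5):
    """Mean accuracy of a majority-class baseline under k-fold cross validation."""
    accuracies = []
    for test_idx in k_fold_indices(len(labels), k):
        held_out = set(test_idx)
        # Train on everything outside the held-out fold.
        train = [lab for i, lab in enumerate(labels) if i not in held_out]
        majority = Counter(train).most_common(1)[0][0]
        # Score the baseline on the held-out fold only.
        correct = sum(labels[i] == majority for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k

# Toy relation labels for ten name pairs (hypothetical data).
labels = ["Lab"] * 8 + ["Conf"] * 2
mean_accuracy = cross_validate(labels, k=5)
```

In practice the folds would be shuffled first, and the baseline would be replaced by a decision-tree learner trained on attributes extracted from the retrieved pages.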
12 Advanced Extraction - Class of Relation 2/2
13 Advanced Extraction - Scalability 1/3
- For example:
- The network density of the JSAI2003 social network is 0.0196 with a threshold of 0.2.
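Network density here is the fraction of possible person pairs that are actually connected once weak edges are dropped. A minimal sketch, assuming an undirected network and that an edge is kept when its weight is strictly above the threshold (the slide does not say whether the comparison is strict); the function name and toy weights are hypothetical:

```python
def network_density(num_nodes, weighted_edges, threshold):
    """Density of the undirected graph that keeps edges with weight > threshold."""
    kept = sum(1 for weight in weighted_edges.values() if weight > threshold)
    possible = num_nodes * (num_nodes - 1) / 2  # number of unordered pairs
    return kept / possible if possible else 0.0

# Toy edge weights between five persons (hypothetical data).
edges = {(0, 1): 0.5, (1, 2): 0.1, (2, 3): 0.3, (3, 4): 0.25}
density = network_density(num_nodes=5, weighted_edges=edges, threshold=0.2)
```

A density as low as 0.0196 means that about 98% of all pairs end up without an edge, which is why querying every pair is wasteful for large node sets.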
14 Advanced Extraction - Scalability 2/3
15 Advanced Extraction - Scalability 3/3
16 Advanced Extraction - Intellectual link 1/6
- Intellectual link
- A relation between a pair of persons with similar interests or citations
- Evaluation
- The authors plot the probability that the two persons will attend the same session at a JSAI conference.
- Idea
- If two persons are researchers of very similar topics, the distribution of word co-occurrences will be similar.
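The idea above can be illustrated by comparing rows of a person-word matrix. The sketch below uses cosine similarity between word co-occurrence counts; the choice of cosine is an assumption made for illustration and may differ from the measure used in the paper, and the matrix values are invented.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two sparse word-count vectors (dicts)."""
    dot = sum(count * v.get(word, 0) for word, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical person-word matrix: co-occurrence counts of each person with each word.
person_word = {
    "A": {"ontology": 3, "web": 4},
    "B": {"ontology": 3, "web": 4},
    "C": {"robot": 5},
    "D": {"web": 1, "agent": 1},
}

sim_ab = cosine_similarity(person_word["A"], person_word["B"])
sim_ac = cosine_similarity(person_word["A"], person_word["C"])
sim_ad = cosine_similarity(person_word["A"], person_word["D"])
```

Persons A and B share the same word distribution and so score highest, while A and C share no words and score zero; an intellectual link would be added between pairs whose similarity is high enough.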
17 Advanced Extraction - Intellectual link 2/6
- Term extraction: Termex [37]
18 Advanced Extraction - Intellectual link 3/6
- Keyword extraction
- 567 researchers with 3981 pages
- The authors gave questionnaires to 10 researchers and defined the correct set of keywords.
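Given a correct keyword set from a questionnaire, extracted keywords can be scored against it. The sketch below uses precision and recall, a common choice that is assumed here for illustration; the keywords and the function name are hypothetical.

```python
def precision_recall(extracted, correct):
    """Precision and recall of extracted keywords against a correct set."""
    extracted, correct = set(extracted), set(correct)
    hits = len(extracted & correct)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(correct) if correct else 0.0
    return precision, recall

# Hypothetical keywords for one researcher.
extracted = ["semantic web", "ontology", "agent", "robot"]
correct = ["ontology", "semantic web", "social network"]
precision, recall = precision_recall(extracted, correct)
```

Here two of the four extracted keywords are correct, and two of the three correct keywords are found.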
19 Advanced Extraction - Intellectual link 4/6
20 Advanced Extraction - Intellectual link 5/6
21 Advanced Extraction - Intellectual link 6/6
22 POLYPHONET 1/4
23 POLYPHONET 2/4
24 POLYPHONET 3/4
25 POLYPHONET 4/4
26 Conclusion
- This paper describes a social network mining approach using the Web and organizes those methods into small pseudocodes.
- New aspects of social networks are investigated: classes of relations, scalability, and a person-word matrix.
- Every algorithm is implemented in POLYPHONET.