Title: Semantic Analytics on Social Networks: Experiences in Addressing the Problem of Conflict of Interest Detection
1Semantic Analytics on Social Networks
Experiences in Addressing the Problem of Conflict
of Interest Detection
- Boanerges Aleman-Meza, Meenakshi Nagarajan, Cartic Ramakrishnan, Amit P. Sheth, I. Budak Arpinar
- LSDIS Lab, Dept. of Computer Science, University of Georgia, Athens
- (boanerg, bala, cartic, amit, budak)_at_cs.uga.edu
- Li Ding, Pranam Kolari, Anupam Joshi, Tim Finin
- Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD 21250
- (dingli1, kolari1, joshi, finin)_at_cs.umbc.edu
WWW 2006
2Conflict of Interest (COI) Detection Problem
- The NIH (National Institutes of Health) defines COI in the context of the grant review process as follows: "A Conflict Of Interest (COI) in scientific peer review exists when a reviewer has an interest in a grant or cooperative agreement application or an R&D contract proposal that is likely to bias his or her evaluation of it. A reviewer who has a real conflict of interest with an application or proposal may not participate in its review."
3Abstract
- A Semantic Web application
- It detects Conflict of Interest (COI) relationships among potential reviewers and authors of scientific papers.
- It discovers various semantic associations between the reviewers and authors.
- Integrated entities and relationships from two social networks
- knows: FOAF (Friend-of-a-Friend) social network
- co-author: DBLP bibliography
4Introduction
- Social Network on the Web
- Friendship or personal ties
- LinkedIn.com
- MySpace.com
- Friendster
- Hi5
- College student
- Facebook.com
- Club Nexus (Stanford students)
- Social Network application
- Yahoo! 360°
- Dodgeball.com (by Google)
5Introduction
- COI detection systems
- EDAS
- edas.info/doc
- Microsoft Research CMT tools
- msrcmt.research.microsoft.com/cmt/
- Confious
- www.confious.com
6Introduction
- Open resources
- Real-world examples
- Addressing the problem of integrating different social networks
- Two open resources for evaluations
- co-author relationship: DBLP bibliography
- dblp.uni-trier.de
- knows relationship: FOAF (Friend-of-a-Friend) social network
- Swoogle
7Motivation and Background
Reviewer vs. Author
Semantic Association
Obtaining high quality data
8Integration of Two Social Networks
- FOAF
- The dataset includes 207,000 person entities from 49,750 FOAF documents collected during the first three months of 2005.
- DBLP
- It is one of the best formatted and organized bibliography datasets.
- DBLP covers approximately 400,000 researchers who have publications in major Computer Science publication venues.
91. Metadata Extraction
102.Cleaning FOAF and DBLP Datasets 1/2
- DBLP-SW (Semantic Web): 38,027 person entities
112.Cleaning FOAF and DBLP Datasets 2/2
- FOAF-EDU: 21,308 person entities
123.Entity Disambiguation - Algorithm
- Name-reconciliation algorithm
- Dong, X., Halevy, A. and Madhavan, J., Reference Reconciliation in Complex Information Spaces. In ACM SIGMOD Conference, (Baltimore, Maryland, 2005).
- Atomic attributes: similarity of their names and affiliations
- Association attributes: common co-author relationships
- Weights are manually assigned
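The combination of atomic and association attributes described above can be illustrated with a minimal Python sketch. It is not the algorithm of Dong et al. or the authors' implementation: the weights, the threshold, the record layout, and every function name are hypothetical, and plain string similarity stands in for whatever name and affiliation matching was actually used.

```python
from difflib import SequenceMatcher

# Hypothetical values: the slides only state that weights were assigned manually.
WEIGHTS = {"name": 0.5, "affiliation": 0.2, "coauthors": 0.3}
SAME_ENTITY_THRESHOLD = 0.75


def text_similarity(a, b):
    """String similarity in [0, 1]; 0.0 when either value is missing."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def coauthor_overlap(a, b):
    """Jaccard overlap of two sets of co-author names."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def same_person_score(p, q):
    """Weighted sum of atomic (name, affiliation) and association (co-author) evidence."""
    return (
        WEIGHTS["name"] * text_similarity(p.get("name"), q.get("name"))
        + WEIGHTS["affiliation"]
        * text_similarity(p.get("affiliation"), q.get("affiliation"))
        + WEIGHTS["coauthors"]
        * coauthor_overlap(p.get("coauthors", set()), q.get("coauthors", set()))
    )


def is_same_person(p, q):
    """Decide whether a FOAF record and a DBLP record denote the same person."""
    return same_person_score(p, q) >= SAME_ENTITY_THRESHOLD
```

A record here is a plain dictionary with optional "name", "affiliation", and "coauthors" keys. Missing attributes contribute nothing to the score, so sparse FOAF profiles tend toward false negatives rather than false positives.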
143.Entity Disambiguation - Results
- Entity Disambiguation Results
- 6 random samples, each having 50 entity pairs
- 1 false positive, 16 false negatives
153.Entity Disambiguation - Analysis
16Semantic Analysis for COI Detection
- Levels of Conflict of Interest
- An algorithm for COI detection
- quantity and strength of relationships
- distance between a reviewer and an author.
17Weighting Relationships for COI Detection
- foaf:knows from A to B
- Potential positive bias from A to B
- Does not necessarily imply a reciprocal relationship from B to A.
- We assigned a weight of 0.5 to all 34,824 foaf:knows relationships in the FOAF-EDU dataset.
- co-author relationship
- It is a good indicator of collaboration and/or social interactions among authors.
18Weighting Relationships for COI Detection
- For any two co-authors, a and b,
- let $coauthor(a, b)$ represent the set of relationships where a co-authors a publication with b
- let $P_a$ represent the set of papers published by a
- We define the weight of the co-authorship relationship from a to b as follows: $weight(a, b) = \frac{|coauthor(a, b)|}{|P_a|}$
19Detection of Conflict of Interest 1/5
- Anyanwu, K. and Sheth, A.P., ρ-Queries: Enabling Querying for Semantic Associations on the Semantic Web. In Twelfth International World Wide Web Conference, (Budapest, Hungary, 2003), 690-699.
20Detection of Conflict of Interest 2/5
- Algorithm for COI detection works as follows
- First, it finds all semantic associations between two entities.
- Second, each of the semantic associations found is analyzed by looking at the weights of its individual relationships.
- Thresholds were required to decide what weight values are indicative of strong and weak collaborations.
- The following cases are considered
- Reviewer and author are directly related
- Reviewer and author are not directly related but they are directly related to (at least) one common person.
- Reviewer and author are indirectly related
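The first step, finding the semantic associations between a reviewer and an author, can be sketched as a bounded path search over the integrated network. This is a simplified stand-in, not the ρ-query machinery cited earlier: the edge tuple layout, the function name, the choice to ignore edge direction while searching, and the default limit of three relationships are all assumptions of this sketch.

```python
from collections import defaultdict


def find_associations(edges, reviewer, author, max_len=3):
    """Return all simple paths of at most max_len relationships between two people.

    edges: iterable of (source, label, target, weight) tuples.
    Each returned path is a list of those edge tuples, in traversal order.
    """
    neighbors = defaultdict(list)
    for edge in edges:
        source, _label, target, _weight = edge
        # Direction is ignored for connectivity (an assumption of this sketch).
        neighbors[source].append((target, edge))
        neighbors[target].append((source, edge))

    found = []

    def walk(node, visited, path):
        if node == author:
            found.append(list(path))
            return
        if len(path) == max_len:
            return
        for nxt, edge in neighbors[node]:
            if nxt not in visited:
                visited.add(nxt)
                path.append(edge)
                walk(nxt, visited, path)
                path.pop()
                visited.remove(nxt)

    walk(reviewer, {reviewer}, [])
    return found
```

Paths of length one, two, and three correspond to the three cases listed above: a direct relationship, a shared intermediary, and an indirect relationship.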
21Detection of Conflict of Interest 3/5
- (i) Reviewer and author are directly related
- Through foaf:knows and/or co-author
- The assessment is high
- At least one relationship has weight in the range medium-to-high (i.e., weight ≥ 0.3)
- The assessment is medium
- At least one relationship has weight in the range low-to-medium (i.e., 0.1 ≤ weight < 0.3)
- The assessment is low
- At least one relationship has low weight (i.e., weight < 0.1)
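The threshold rules for directly related reviewers and authors can be expressed as a small function. The sketch assumes that the strongest direct relationship decides the level and that the boundaries are inclusive at 0.3 and 0.1; the constant and function names are hypothetical.

```python
MEDIUM_TO_HIGH = 0.3  # weights at or above this count as medium-to-high
LOW_TO_MEDIUM = 0.1   # weights at or above this (and below 0.3) count as low-to-medium


def assess_direct(weights):
    """Assess potential COI from the weights of all direct relationships.

    weights: weights of the foaf:knows and/or co-author relationships
    that directly connect the reviewer and the author.
    Returns "high", "medium", "low", or None when there is no direct relationship.
    """
    if not weights:
        return None
    strongest = max(weights)
    if strongest >= MEDIUM_TO_HIGH:
        return "high"
    if strongest >= LOW_TO_MEDIUM:
        return "medium"
    return "low"
```

With the fixed weight of 0.5 assigned to foaf:knows, any direct foaf:knows relationship yields a high assessment under this reading.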
22Detection of Conflict of Interest 4/5
- (ii) Reviewer and author are not directly related but they are directly related to (at least) one common person.
- The common person is an intermediary.
- The assessment is medium
- Case 1: 10 intermediaries in common.
- Case 2: The relationships connecting to the intermediary (i.e., one from the reviewer and another from the author) have weight in the range medium-to-high (i.e., weight ≥ 0.3).
- If neither of these two cases holds, then the assessment is low.
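The intermediary case can be sketched in the same style. Two readings are assumed here and may differ from the original system: Case 1 is taken to mean at least 10 shared intermediaries, and Case 2 is taken to require that both relationships through a single intermediary reach the medium-to-high range. The function name and input layout are hypothetical.

```python
def assess_via_intermediaries(links, min_common=10, strong=0.3):
    """Assess potential COI when reviewer and author share intermediaries.

    links: {intermediary: (weight_on_reviewer_side, weight_on_author_side)}
    Returns "medium", "low", or None when there is no common intermediary.
    """
    if not links:
        return None
    # Case 1 (assumed reading): many intermediaries in common.
    if len(links) >= min_common:
        return "medium"
    # Case 2 (assumed reading): both hops through one intermediary are strong.
    if any(w_r >= strong and w_a >= strong for w_r, w_a in links.values()):
        return "medium"
    return "low"
```

For example, a reviewer and an author who each co-author heavily with the same advisor would be flagged as medium even though they never published together.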
23Detection of Conflict of Interest 5/5
- (iii) Reviewer and author are indirectly related
- Through a semantic association containing three relationships.
- In this case, the assessment is a low level of potential COI.
- The assessment is medium when the relationships have weight in the range low-to-medium (i.e., 0.1 ≤ weight < 0.3)
24Experimental Results
25Conclusion
- Conflict of Interest detection fits in a multi-step process of a class of Semantic Web applications.
- Identified some major stumbling blocks
- Metadata extraction
- Data integration algorithms and techniques
- Entity disambiguation
- Metadata and ontology representation
- COI detection is based on semantic technologies and techniques
- Integrated social network from the FOAF social network and the DBLP co-authorship network.
26Conclusion
- A demo of the application is available
(lsdis.cs.uga.edu/projects/semdis/coi/).