
Algorithms and Incentives for Robust Ranking

- Rajat Bhattacharjee
- Ashish Goel
- Stanford University

Algorithms and incentives for robust ranking. ACM-SIAM Symposium on Discrete Algorithms (SODA), 2007.

Incentive-based ranking mechanisms. EC Workshop, Economics of Networked Systems, 2006.

Outline

- Motivation
- Model
- Incentive Structure
- Ranking Algorithm

Content then and now

- Traditional
- Content generation was centralized (book publishers, movie production companies, newspapers)
- Content distribution was subject to editorial control (paid professional reviewers, editors)
- Internet
- Content generation is mostly decentralized (individuals create webpages, blogs)
- No central editorial control on content distribution (instead there are ranking and recommendation systems like Google, Yahoo)

Heuristics Race

- PageRank (uses the link structure of the web)
- Spammers try to game the system by creating fraudulent link structures
- Heuristics race: search engines and spammers have implemented increasingly sophisticated heuristics to counteract each other
- New strategies to counter the heuristics [Gyongyi, Garcia-Molina]
- Detecting PageRank-amplifying structures → the sparsest cut problem (NP-hard) [Zhang et al.]

Amplification Ratio [Zhang, Goel]

- Consider a set S, which is a subset of V
- In(S): total weight of edges from V−S to S
- Local(S): total weight of edges from S to S
- w(S) = Local(S) + In(S); Amp(S) = w(S)/In(S)
- High Amp(S) → S is dishonest; low Amp(S) → S is honest
- Collusion-free: a graph in which all sets are honest
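The quantities above are easy to compute directly. The sketch below is illustrative (not the paper's code); the graph representation and function name are chosen for the example, and w(S) is taken as Local(S) + In(S).

```python
# Illustrative sketch: compute In(S), Local(S), and Amp(S) for a
# weighted directed graph given as {(u, v): weight}.

def amplification_ratio(edges, S):
    """Amp(S) = w(S) / In(S), where w(S) = Local(S) + In(S)."""
    in_w = sum(w for (u, v), w in edges.items() if u not in S and v in S)
    local = sum(w for (u, v), w in edges.items() if u in S and v in S)
    return (local + in_w) / in_w if in_w else float("inf")

# A colluding set: heavy internal links, one weak incoming link.
edges = {("x", "a"): 1, ("a", "b"): 10, ("b", "c"): 10, ("c", "a"): 10}
print(amplification_ratio(edges, {"a", "b", "c"}))  # → 31.0
```

A set with no internal reinforcement, such as {"a"} alone, gets a much lower ratio.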

Heuristics Race

- Then why do search engines work so well?
- Our belief: because the heuristics are not in the public domain
- Is this the solution?
- Feedback/click analysis [Anupam et al.; Metwally et al.]
- Suffers from click spam
- Problem of entities with little feedback
- There are too many web pages; they can't all be put in top slots to gather feedback

Ranking reversal

- Ranking reversal
- Entity A is better than entity B, but B is ranked higher than A

Keyword Search Engine

Our result

- Theorem we would have liked to prove
- Here is a reputation system, and it is robust, i.e., it has no ranking reversals even in the presence of malicious behavior
- Theorem we prove
- Here is a ranking algorithm and an incentive structure which, when applied together, imply an arbitrage opportunity for the users of the system whenever there is a ranking reversal (even in the presence of malicious behavior)

Where is the money?

- Examples
- Amazon.com: better recommendations → more purchases → more revenue
- Netflix: better recommendations → increased customer satisfaction → increased registrations → more revenue
- Google/Yahoo: better ranking → more eyeballs → more revenue through ads
- Revenue per entity
- Simple for Amazon.com and Netflix
- For Google/Yahoo, we can distribute the revenue from a user over the web pages he looks at (other approaches are possible)

Why share?

Because they will take it anyway!!!

Less compelling reasons

- Difficulty of eliciting honest feedback is well known [Resnick et al.; Dellarocas]
- Search engine rankings are self-reinforcing [Cho, Roy]
- Strong incentive for players to game the system
- Ballot stuffing and bad-mouthing in reputation systems [Bhattacharjee, Goel; Dellarocas]
- Click spam in web rankings based on clicks [Anupam et al.]
- Web structures have been devised to game PageRank [Gyongyi, Garcia-Molina]
- Problem of new entities
- How should the system discover high-quality new entities?
- How should the system discover a web page whose relevance has suddenly changed (perhaps due to some current event)?

Outline

- Motivation
- Model
- Incentive Structure
- Ranking Algorithm

I-U Model

- Inspect (I)
- User reads a snippet attached to a search result (Google/Yahoo)
- Looks at a recommendation for a book (Amazon.com)
- Utilize (U)
- User goes to the actual web page (Google/Yahoo)
- Buys the book (Amazon.com)

I-U Model

- Entities
- Web pages (Google/Yahoo), books (Amazon.com)
- Each entity i has an inherent quality q_i (think of it as the probability that a user utilizes entity i, conditioned on the entity having been inspected by the user)
- The qualities q_i are unknown, but we wish to rank the entities according to their qualities
- Feedback
- Tokens (positive and negative) placed on an entity by users
- Ranking is a function of the relative number of tokens received by the entities
- Slots
- Placeholders for the results of a query

Sheep and Connoisseurs

- Sheep can appreciate a high-quality entity when shown one
- But wouldn't go looking for a high-quality entity
- Most users are sheep
- Connoisseurs will dig for a high-quality entity that is not ranked high enough
- The goal of this scheme is to aggregate the information that the connoisseurs have

User response

I-U Model

- User response to a typical query
- Chooses to inspect the top j positions
- User chooses j at random from an unknown but fixed distribution
- A utility generation event for e_i occurs if the user utilizes entity e_i (assuming e_i is placed among the top j slots)
- Formally
- The utility generation event is captured by the random variable G_i = I_{r(i)} · U_i
- r(i): rank of entity e_i
- I_{r(i)}, U_i: independent Bernoulli random variables
- E[U_i] = q_i (unknown)
- E[I_1] ≥ E[I_2] ≥ … ≥ E[I_k] (known)
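As a sanity check on the model, here is a small simulation of the utility generation event G_i = I_{r(i)} · U_i. The inspection probabilities below are made-up values (decreasing in rank, as the model requires); the function name is mine.

```python
import random

def utility_event(rank, inspect_prob, quality, rng):
    """G_i = I_{r(i)} * U_i: inspect the slot at `rank` with probability
    inspect_prob[rank], utilize with probability `quality` (independent)."""
    inspected = rng.random() < inspect_prob[rank]   # Bernoulli I_{r(i)}
    utilized = rng.random() < quality               # Bernoulli U_i
    return int(inspected and utilized)

inspect_prob = [0.9, 0.5, 0.2]   # E[I_1] >= E[I_2] >= E[I_3] (assumed values)
rng = random.Random(0)
trials = 100_000
est = sum(utility_event(1, inspect_prob, 0.4, rng) for _ in range(trials)) / trials
# E[G_i] = E[I_{r(i)}] * q_i = 0.5 * 0.4 = 0.2; `est` should be close to that
```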

Outline

- Motivation
- Model
- Incentive Structure
- Ranking Algorithm

Information Markets

- View the problem as an information aggregation problem
- Float shares of entities and let the market decide their value (ranking) [Hanson; Pennock]
- Rank according to the price set by the market
- Works best for predicting outcomes which are objective
- Elections (Iowa Electronic Markets)
- Distinguishing features of the ranking problem
- Fundamental problem: the outcome is not objective
- Is revenue because of more eyeballs or because of better quality?
- Eyeballs in turn depend on the price set by the market
- However, we have an additional lever: the ranking algorithm

Game theoretic approaches

- Example: Miller et al.
- Framework to incentivize honest feedback
- Counters the lack of objective outcomes by comparing a user's reviews to those of his peers
- Selfish interests of a user should be in line with the desirable properties of the system
- Doesn't address malicious users
- Benefits from the system may come from outside the system as well
- Revenue from the outcome of these systems might overwhelm the revenue from the system itself

Ranking mechanism overview

- Overview
- Users place tokens (positive and negative) on the entities
- Ranking is computed based on the number of tokens on the entities
- Whenever a revenue generation event takes place, the revenue is shared among the users
- Ranking algorithm
- Input: feedback scores of the entities
- Output: probability distribution over rankings of the entities
- Ensures that the number of inspections an entity gets is proportional to the fraction of tokens on it

Incentive structure

- A token is a three-tuple (p, u, e)
- p: +1 or −1, depending on whether the token is a positive token or a negative token
- u: the user who placed the token
- e: the entity on which the token was placed
- The net weight of the tokens a user can place is bounded, that is, Σ_i p_i is bounded
- A user cannot keep placing positive tokens without placing negative tokens, and vice versa

User account

- Each user has an account
- Revenue shares are added to or deducted from a user's account
- Withdrawal is permitted but deposits are not
- Users can make profits from the system but cannot gain control by paying
- If a user's share goes negative, remove the user from the system for some pre-defined time
- Let α ≤ 1 and s be pre-defined system parameters
- α: the fraction of revenue that the system distributes as incentives to the users
- Parameter s will be set later

Revenue share

- Suppose a revenue generation event takes place for an entity e at time t
- R: the revenue generated
- For each token i placed on entity e
- a_i: the net weight (positive − negative) of tokens placed on entity e before token i was placed on e
- The revenue shared by the system with the user who placed token i is proportional to p_i · α · R / a_i^s
- The shares add up to at most αR
- Negative token: the revenue share is negative; deduct it from the user's account
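A minimal sketch of this sharing rule. The function and parameter names are mine, and clamping a_i to at least 1 for the first token is an assumption (the slide leaves that case unspecified).

```python
def revenue_shares(placements, R, alpha=0.1, s=2.0):
    """placements: p-values (+1 or -1) in the order the tokens were placed
    on the entity.  Returns one revenue share per token.
    Share_i is proportional to p_i * alpha * R / a_i**s, where a_i is the
    net token weight already on the entity (clamped to >= 1, an assumption);
    positive shares are scaled so they add up to at most alpha * R."""
    raw, net = [], 0
    for p in placements:
        raw.append(p * R / max(net, 1) ** s)
        net += p
    pos = sum(x for x in raw if x > 0)
    scale = min(1.0, alpha * R / pos) if pos > 0 else 0.0
    return [x * scale for x in raw]

shares = revenue_shares([+1, +1, +1, -1], R=100.0)
# earlier positive tokens earn more; the negative token's share is a deduction
```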

Revenue share

- Some features
- Parameter s controls the relative importance of tokens placed earlier
- Tokens placed after token i have no bearing on the revenue share of the user who placed token i
- Hence s is strictly greater than 1
- Incentive for discovery of high-quality entities
- Hence the choice of diminishing rewards
- Emphasis is on making the process as implicit as possible
- Resistance to racing
- The system shouldn't allow a repeated cycle of actions which pushes A above B, then B above A, and so on
- We can add a more explicit feature by multiplying any negative revenue by (1+ε), where ε is an arbitrarily small positive number

Ranking by quality

- Either the entities are ranked by quality, or there exists a profitable arbitrage opportunity for the users in correcting the ranking
- Ranking reversal: a pair of entities (i, k) such that q_i < q_k but ν_i > ν_k
- q_i, q_k: qualities of entities i and k, resp.
- ν_i, ν_k: number of tokens on entities i and k, resp.
- Revenue/utility generated by an entity: f(r, q)
- r: relative number of tokens placed on the entity
- q: quality of the entity
- For the I-U Model, our ranking algorithm ensures that f(r, q) is proportional to q·r
- Objective: a ranking reversal should present a profitable arbitrage opportunity

Arbitrage

- There exists a pair of entities A and B
- Placing a positive token on A and a negative token on B
- The expected profit from A is more than the expected loss from B


Proof (for separable revenue functions)

- Suppose, for contradiction, that f(r_i, q_i) ν_i^(−s) ≥ f(r_k, q_k) ν_k^(−s)
- r_i = ν_i (Σ_l ν_l)^(−1), r_k = ν_k (Σ_l ν_l)^(−1)
- Claim: it is profitable to put a negative token on entity i and a positive token on entity k
- Assumption: f is separable, that is, f(r, q) = q·r^β
- Choose parameter s greater than β
- Since f is increasing in q: f(r_i, q_i) ν_i^(−s) < f(r_i, q_k) ν_i^(−s) = q_k r_i^β ν_i^(−s) = q_k ν_i^(β−s) (Σ_l ν_l)^(−β)
- Similarly, f(r_k, q_k) ν_k^(−s) = q_k r_k^β ν_k^(−s) = q_k ν_k^(β−s) (Σ_l ν_l)^(−β)
- However, q_k ν_i^(β−s) (Σ_l ν_l)^(−β) < q_k ν_k^(β−s) (Σ_l ν_l)^(−β), since ν_i > ν_k and s > β
- Hence f(r_i, q_i) ν_i^(−s) < f(r_k, q_k) ν_k^(−s), a contradiction

Proof (I-U Model)

- The rate at which revenue is generated by entity i (resp. k) is proportional to q_i ν_i (resp. q_k ν_k); this is ensured by our ranking algorithm
- Rate at which incentives accrue from placing a positive token on entity k: q_k ν_k / ν_k^s = q_k ν_k^(1−s)
- Loss due to placing a negative token on entity i: q_i ν_i / ν_i^s = q_i ν_i^(1−s)
- If s > 1, then q_k ν_k^(1−s) > q_i ν_i^(1−s)
- q_k > q_i and ν_i > ν_k (ranking reversal)
- Thus a profitable arbitrage opportunity exists in correcting the system
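A numeric illustration of the final inequality, with made-up quality and token values that satisfy a ranking reversal:

```python
# Made-up instance of a ranking reversal: entity k has higher quality,
# entity i holds more tokens.
q_i, q_k = 0.2, 0.6
n_i, n_k = 900, 100          # token counts (the nu's above)
s = 2.0                      # incentive parameter, s > 1

gain = q_k * n_k ** (1 - s)  # rate earned by a positive token on k
loss = q_i * n_i ** (1 - s)  # rate lost to a negative token on i
print(gain > loss)           # → True: correcting the reversal pays
```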

Outline

- Motivation
- Model
- Incentive Structure
- Ranking Algorithm

Naive approach

- Order the entities by the net number of tokens they have
- Problem?
- Incentive for manipulation
- Example
- Slot 1: 1,000,000 inspections
- Slot 2: 500,000 inspections
- Entity 1: 1000 tokens
- Entity 2: 999 tokens
- A single extra token decides which entity gets twice as many inspections

Ranking Algorithm

- Proper ranking
- If entity e1 has more positive feedback than entity e2, then, for any t, if the user chooses to inspect the top t slots, the probability that e1 shows up among the top t slots should be higher than the probability that e2 does
- Random variable X_e gives the position of entity e
- Entity e1 dominates e2 if, for all t, Pr[X_e1 ≤ t] ≥ Pr[X_e2 ≤ t]
- Proper ranking: if the feedback score of e1 is more than the feedback score of e2, then e1 dominates e2
- The distribution returned by our algorithm is a proper ranking

Majorized case

- p: vector giving the normalized expected inspections of the slots
- S = E[I_1] + E[I_2] + … + E[I_k]; p = (E[I_1]/S, E[I_2]/S, …, E[I_k]/S)
- ν: vector giving the normalized number of tokens on the entities
- Special case: p majorizes ν
- For all i, the sum of the i largest components of p is more than the sum of the i largest components of ν
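The majorization test is easy to state in code. A small sketch, with invented vector values (fewer slots than entities, as in web ranking):

```python
def majorizes(p, v, tol=1e-12):
    """True if p majorizes v: every prefix sum of p's components sorted
    in descending order is at least the corresponding prefix sum of v's."""
    n = max(len(p), len(v))
    ps = sorted(p, reverse=True) + [0.0] * (n - len(p))  # pad the shorter
    vs = sorted(v, reverse=True) + [0.0] * (n - len(v))  # vector with zeros
    tp = tv = 0.0
    for a, b in zip(ps, vs):
        tp += a
        tv += b
        if tp < tv - tol:
            return False
    return True

p = [0.6, 0.3, 0.1]               # normalized expected slot inspections
v = [0.3, 0.25, 0.2, 0.15, 0.1]   # normalized token counts per entity
print(majorizes(p, v))  # → True: rapidly decaying slots vs. heavy tail
```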

Majorized case

- Typically, the importance of the top slots in a ranking system is far higher than that of the lower slots
- Rapidly decaying tail
- The number of entities is orders of magnitude more than the number of significant slots
- Heavy tail
- Hence, for web ranking, p majorizes ν
- We believe that for most applications p majorizes ν
- We restrict ourselves to the majorized case here
- The details of the general case are in the paper

Hardy, Littlewood, Pólya

- Theorem [Hardy, Littlewood, Pólya]
- The following two statements are equivalent: (1) the vector x is majorized by the vector y; (2) there exists a doubly stochastic matrix D such that x = Dy
- Interpret D_ij as the probability that entity i shows up at position j
- This ensures that the number of inspections that an entity gets is directly proportional to its feedback score
- Doubly stochastic matrix: D_ij ≥ 0, Σ_i D_ij = 1, Σ_j D_ij = 1

Birkhoff von Neumann Theorem

- The Hardy-Littlewood-Pólya theorem on majorization doesn't guarantee that the ranking we obtain is proper
- We present a version of the theorem which takes care of this
- Theorem [Birkhoff; von Neumann]
- An n×n matrix is doubly stochastic if and only if it is a convex combination of permutation matrices
- Convex combination of permutation matrices → distribution over rankings
- Algorithms for computing the Birkhoff-von Neumann distribution
- O(m²) [Gonzalez, Sahni]
- O(mn log K) [Gabow, Kariv]
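A brute-force sketch of such a decomposition, workable only for tiny n (the algorithms cited above do this efficiently). The matrix below is an invented doubly stochastic example, and the function name is mine:

```python
from itertools import permutations

def birkhoff_decompose(D, tol=1e-9):
    """Greedily decompose a doubly stochastic matrix D into a convex
    combination of permutations; perm[i] = slot assigned to entity i."""
    n = len(D)
    D = [row[:] for row in D]          # work on a copy
    parts, remaining = [], 1.0
    while remaining > tol:
        # a permutation supported on strictly positive entries always
        # exists for a (scaled) doubly stochastic matrix
        perm = next(p for p in permutations(range(n))
                    if all(D[i][p[i]] > tol for i in range(n)))
        w = min(D[i][perm[i]] for i in range(n))
        parts.append((w, perm))
        for i in range(n):
            D[i][perm[i]] -= w
        remaining -= w
    return parts

D = [[0.5, 0.5, 0.0],
     [0.5, 0.25, 0.25],
     [0.0, 0.25, 0.75]]
parts = birkhoff_decompose(D)
# weights sum to 1; sampling a permutation with probability equal to its
# weight yields a distribution over rankings matching D's marginals
```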

Conclusion

- Theorem
- Here is a ranking algorithm and an incentive structure which, when applied together, imply an arbitrage opportunity for the users of the system whenever there is a ranking reversal
- Resistance to gaming
- We don't make any assumptions about the source of the error in the ranking, benign or malicious
- So, by the same argument, the system is resistant to gaming as well
- Resistance to racing

Thank You