Advanced Topics in Search Theory

- 1 - Introduction

In Today's Class

- Course procedures
- What is economic search?
- Characteristics of economic search
- Classical models in Search Theory
- One Sided
- Two-Sided
- Mediated Search
- Reservation-Value based search

Goal

- Get familiar with the concept of economic search
- Learn and master the main principles of economic search
- One-sided
- Two-sided

Course Procedures

- Course web-site can be found here:
- http://www.cs.biu.ac.il/sarned/Courses/search/
- Teacher: David Sarne (sarned_at_cs.biu.ac.il)
- Office hours: Thu 15:00-16:00 (building 216, room 2)
- Course exercises: 20%
- Course final exam: 80%

Course Plan

Week  Topic                                                     Readings
1     Introduction to Search Theory
2     Pandora's Problem
3     One-Sided Search: principles and optimal strategy
4     One-Sided Search with unknown distribution
5     Concurrent Search
6     Cooperative Search
7     The Secretary Problem
8     Market throughput in one-sided search
9     Two-Sided Search with no search costs
10    Two-Sided Search with search costs: multi-type
11    Two-Sided Search with search costs: one and two types
12    Throughput in two-sided search
13    Two-Sided Search with mediators

Disclaimer

- Search in AI deals with finding nodes having certain properties in a graph (find an optimal path from the initial node to a goal node, if one exists)
- Branch and bound
- A*
- Hill climbing
- This is not what we are interested in (at least in this course)
- We deal with economic search

Have you searched for something lately?

- Can you give examples of what you've searched for?

Searching What?

- Everything!
- Searching for a partner
- Searching for a job
- Searching for a product
- Searching for a parking space
- Searching for a java class (reuse)
- Search for a thesis advisor

The goal here is to optimize the process rather than to end up with the optimal search object

How about the secretary problem? (also known as the marriage problem, the sultan's dowry problem, the fussy suitor problem)

- There is a single secretarial position to fill.
- There are n applicants for the position, and the value of n is known.
- The applicants can be ranked from best to worst with no ties.
- The applicants are interviewed sequentially in a random order, with each order being equally likely.
- After each interview, the applicant is accepted or rejected.
- The decision to accept or reject an applicant can be based only on the relative ranks of the applicants interviewed so far.
- Rejected applicants cannot be recalled.
- The object is to select the best applicant. The payoff is 1 for the best applicant and zero otherwise.
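The rules above can be simulated directly. The sketch below assumes a cutoff strategy (observe and reject the first k applicants, then accept the first one who beats all of them), which is the classical way to approach this problem. The function names and the choice of k near n/e are illustrative assumptions, not taken from the slides.

```python
import math
import random

def run_cutoff_rule(n, k, rng):
    """Simulate one round; return True if the best applicant is selected."""
    ranks = list(range(n))      # 0 is the best applicant, n - 1 the worst
    rng.shuffle(ranks)          # random interview order, all orders equally likely
    best_seen = min(ranks[:k]) if k > 0 else n   # best applicant in the observation phase
    for r in ranks[k:]:
        if r < best_seen:       # first applicant better than everyone seen so far
            return r == 0       # payoff 1 only if this is the overall best
    return False                # the best was among the first k, so the payoff is 0

def success_rate(n, k, trials=20000, seed=0):
    """Fraction of simulated rounds in which the best applicant is hired."""
    rng = random.Random(seed)
    wins = sum(run_cutoff_rule(n, k, rng) for _ in range(trials))
    return wins / trials

n = 50
k = round(n / math.e)           # assumed cutoff: skip roughly n/e applicants
print(k, success_rate(n, k))
```

Comparing `success_rate(n, k)` across several values of k shows how sensitive the payoff is to the length of the observation phase.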

Example - Marriage Market: legacy domain (search pioneers)

[Figure: density f(x) over lifetime utility]

Statistics Reminder

- Given a continuous random variable X, we denote:
- The probability density function (pdf) as f(x) (sometimes loosely called the probability distribution function; its discrete analogue is the probability mass function)
- The cumulative distribution function (cdf) as F(x)
- The pdf and cdf give a complete description of the probability distribution of a random variable
- The pdf of X is a function f(x) such that for two numbers a and b with a ≤ b: $P(a \le X \le b) = \int_a^b f(x)\,dx$
- That is, the probability that X takes on a value in the interval [a, b] is the area under the density function from a to b.

CDF

- The cdf is a function F(x), defined for a number x by $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
- That is, for a given value x, F(x) is the probability that the observed value of X will be at most x.
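The relationship between the pdf and the cdf can be checked numerically. The following is a minimal sketch that assumes a uniform distribution on [200, 300] (so f(x) = 0.01 inside the interval); the function names and the interval [220, 250] are chosen only for illustration.

```python
def uniform_pdf(x, lo=200.0, hi=300.0):
    """Density of a uniform distribution on [lo, hi]."""
    return 1.0 / (hi - lo) if lo <= x <= hi else 0.0

def uniform_cdf(x, lo=200.0, hi=300.0):
    """F(x) = P(X <= x) for the same distribution."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def area_under_pdf(a, b, steps=10000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    width = (b - a) / steps
    return sum(uniform_pdf(a + (i + 0.5) * width) for i in range(steps)) * width

a, b = 220.0, 250.0
print(area_under_pdf(a, b))               # area under the density on [a, b]
print(uniform_cdf(b) - uniform_cdf(a))    # the same probability obtained from the cdf
```

Both printed values approximate P(220 ≤ X ≤ 250), once as an area under f and once as F(b) - F(a).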

[Figure: uniform density f(x) = 0.01 between 200 and 300]

[Slide text not recoverable from the extraction; surviving fragments: f(x), P(x), P(2) = 1/6]

Sampling from the distribution

[Figure: density f(t) with labels f1 to f4 and P1 to P4, over axis points x1 to x5]

- Draw a random value from a uniform distribution on [0, 1]
- Take the value for which the CDF equals the value drawn
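The two steps above can be sketched as follows. This is an illustrative implementation that inverts the CDF numerically by bisection; the exponential target distribution, its rate, and the search interval [0, 100] are assumptions made for the demo.

```python
import math
import random

def sample_by_inversion(cdf, lo, hi, rng, tol=1e-9):
    """Draw u ~ Uniform(0, 1) and return the x in [lo, hi] with cdf(x) = u."""
    u = rng.random()
    while hi - lo > tol:            # bisection works because a cdf is non-decreasing
        mid = (lo + hi) / 2.0
        if cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def exponential_cdf(x, rate=0.5):
    """Example target distribution (an assumption for the demo)."""
    return 1.0 - math.exp(-rate * x)

rng = random.Random(1)
draws = [sample_by_inversion(exponential_cdf, 0.0, 100.0, rng) for _ in range(20000)]
print(sum(draws) / len(draws))      # should be close to the exponential mean 1 / rate = 2
```

When the CDF has a closed-form inverse, that inverse can replace the bisection loop; the numeric version is shown because it also works for fitted or empirical distributions.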

Fitting a Distribution

- Visualize the observed data (decide how to divide the data into bins)
- Come up with possible theoretical distributions
- Test goodness-of-fit and p-values based on the empirical distribution function (EDF)
- Kolmogorov-Smirnov
- Chi-Square
- Anderson-Darling

These are measures of discrepancy between the empirical distribution function and the cumulative distribution function of a specified distribution.
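As an illustration of such a discrepancy measure, the sketch below computes the Kolmogorov-Smirnov statistic by hand for two candidate distributions. The price sample is synthetic (generated, not collected), and the candidate ranges are assumptions for the demo. The sketch does not compute a p-value; a library routine such as SciPy's `scipy.stats.kstest` would report one.

```python
import random

def ks_statistic(sample, cdf):
    """Largest gap between the empirical distribution function and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The EDF jumps from i/n to (i+1)/n at x; check both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def uniform_cdf(x, lo=200.0, hi=300.0):
    """CDF of a uniform distribution on [lo, hi]."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

rng = random.Random(7)
prices = [rng.uniform(200.0, 300.0) for _ in range(50)]   # synthetic stand-in for collected prices

d_good = ks_statistic(prices, uniform_cdf)
d_bad = ks_statistic(prices, lambda x: uniform_cdf(x, 200.0, 400.0))
print(d_good, d_bad)   # the mis-specified candidate shows the larger discrepancy
```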

Comparison Shopping Agents (CSAs)

- Shopbots and comparison-shopping agents
- Automatically query multiple vendors for price information
- Growing market, growing interest

Comparison Shopping Agents (CSAs)

Offline: central DB of prices (updated daily)

Real-time: querying upon receiving a request

Real-Time Querying (CSAs)

- Ever-increasing frequency of price updates
- Dynamic pricing theories (based on competitors' prices) [Greenwald and Kephart, 1999]
- Hit-and-run sales strategies (short-term price promotions at unpredictable intervals) [Baye et al., 2004]

Assumption: future CSAs will use real-time (costly) querying

Exercise

- Select 5 different products (preferably electronics, computers, etc.)
- Collect prices for these products over the internet and build their empirical distribution (at least 50 prices for each)
- Fit to a known distribution or describe the empirical distribution obtained
- Calculate the optimal search rule
- Send all the data with your file

Example - Marriage Market: legacy domain (search pioneers)

[Figure: density f(x) over lifetime utility]

Should I try to do better?

Can we do better?

- Yes we can!
- However, it has a cost
- Thus a search strategy is needed

Strategy: (opportunities, time, cost) -> (terminate, resume)

Search Characteristics

- A distribution of plausible opportunities
- The searcher is interested in exploiting one opportunity
- Unknown value of specific opportunities
- Search costs

Searching What?

Application                    Cost                         Opportunity
Marriage Market                Time / money / loneliness    Better partner
Job Market                     Time / money / confidence    Better job
Product                        Time / money                 Better price / performance
Parking                        Time                         Closer parking space
Looking for a thesis advisor   Working with him a little    More interesting thesis

Anyone searched for an apartment in her life?

What made you take the one you are living in?

Anyone sold an apartment in her life? What made you accept the winning bid?

The key concept: don't attempt to find the best opportunity; instead, find the best policy

The search strategy

- After each draw, the searcher has a choice:
- Keep what he has
- Draw another opportunity from the distribution F(), at a cost c
- Notice that the net profit is a random variable whose value depends both on the actual draws and on his decisions to accept or reject particular opportunities

The Goal

- Maximize the expected value of the net profit

The optimal strategy

- Let V be the expected profit when following the optimal strategy
- Clearly the searcher should never accept an opportunity with a value less than V
- If he rejects the opportunity, he is in the same situation as a searcher who is starting anew: expected profit V
- Therefore: $V = E[\max(x, V)] - c$
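The argument above defines V through itself, which suggests solving for it by repeated substitution. The sketch below assumes the standard form of that recursion, V = E[max(x, V)] - c, together with opportunities drawn uniformly from [0, 100] and a cost of c = 2 per draw. The distribution, the numbers, and the function names are illustrative assumptions.

```python
def expected_max_uniform(v, lo, hi):
    """E[max(X, v)] for X uniform on [lo, hi]."""
    if v <= lo:
        return (lo + hi) / 2.0
    if v >= hi:
        return v
    # X <= v contributes v * F(v); X > v contributes the integral of x * f(x).
    return v * (v - lo) / (hi - lo) + (hi * hi - v * v) / (2.0 * (hi - lo))

def optimal_value(lo, hi, c, tol=1e-10, max_iter=100000):
    """Iterate V <- E[max(X, V)] - c until it stops changing."""
    v = lo
    for _ in range(max_iter):
        new_v = expected_max_uniform(v, lo, hi) - c
        if abs(new_v - v) < tol:
            return new_v
        v = new_v
    return v

print(optimal_value(0.0, 100.0, 2.0))
```

For the uniform case the fixed point also has a closed form, V = hi - sqrt(2 c (hi - lo)), which equals 80 for these numbers and can be used to check the iteration.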

Example - Marriage Market

[Figure: density f(x) over lifetime utility, with the reservation value x marked]

Should I try to do better?

In a simple infinite-horizon model, the decision doesn't depend on history

What is a reservation value?

- It's a threshold for decision making!
- Example: Krovim Krovim
- The reservation property of the optimal search rule is a consequence of the stationarity of the search problem (a searcher discarding an opportunity is in exactly the same position as before starting the search)

Example - Marriage Market

[Figure: density f(x) over lifetime utility, split at the reservation value x; above x: terminate search; below x: resume search and sample one more]

The optimal Reservation Value

[Figure: density f(x) over lifetime utility, split at x; above x: terminate search; below x: resume search and sample one more]

- Distribution of utilities in the environment (p.d.f. / c.d.f.)
- Expected utility when using reservation value x
- Search cost

The Reservation Value Concept

What is x that maximizes V(x)?

The Reservation Value Concept

Example - Marriage Market

Accepting only partners better than the optimal reservation value yields an expected overall utility equal to the utility of the lowest partner I'm willing to accept (that is, V(x) = x at the optimal reservation value x)
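This property can be checked numerically. The sketch below assumes the usual expression for the expected utility of a reservation rule, V(x) = (integral from x of y f(y) dy - c) / (1 - F(x)), which is not present in the extracted text. It further assumes utilities uniform on [0, 100] and a search cost of c = 2; the function name and grid are illustrative.

```python
def expected_utility(x, lo=0.0, hi=100.0, c=2.0):
    """V(x) for utilities uniform on [lo, hi], per-draw cost c, lo <= x < hi."""
    p_accept = (hi - x) / (hi - lo)                  # 1 - F(x)
    gain = (hi * hi - x * x) / (2.0 * (hi - lo))     # integral of y * f(y) from x to hi
    return (gain - c) / p_accept

grid = [i / 10.0 for i in range(0, 1000)]            # candidate reservation values 0.0 .. 99.9
best_x = max(grid, key=expected_utility)
print(best_x, expected_utility(best_x))
```

On this grid the maximizing reservation value and the expected utility it yields coincide (both are about 80), matching the statement that the searcher's expected utility equals the lowest value he is willing to accept.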

Some more interesting interpretations

Some more interesting interpretations (2)

Stop searching and keeping x

Searching exactly one more time
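The two labels above compare stopping with x in hand against paying c for exactly one more draw. The following is a hedged reconstruction of that comparison at the optimal reservation value, written with the symbols used elsewhere in the slides (f for the density, c for the search cost); x^{*} for the optimal reservation value and y for the value of the next draw are notation introduced here. The slide's own equations did not survive extraction, so this is the standard form rather than a transcription.

```latex
% Indifference at the optimal reservation value x^{*}:
% keeping x^{*} is worth the same as paying c, drawing y once more, and keeping the better of the two.
x^{*} = -c + \int_{-\infty}^{\infty} \max(x^{*}, y)\, f(y)\, dy

% Rearranged: the search cost equals the expected improvement from one more draw.
c = \int_{x^{*}}^{\infty} (y - x^{*})\, f(y)\, dy
```

The second line follows from the first by splitting the integral at x^{*} and moving x^{*} F(x^{*}) to the left-hand side.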

Myopic rule

- Important property of the optimal search rule: it is myopic
- The searcher will never decide to accept an opportunity he has rejected beforehand
- The searcher cares only about whether or not he wants the opportunity now
- Therefore, we don't care about the recall option

Also notice that

- and

A Bernoulli trial is an experiment whose outcome is random and can be either of two possible outcomes, "success" and "failure".

Calculating the optimal RV

Notice that

Calculating the optimal RV

Therefore

CS economic search domains

- CSAs
- Job scheduling
- Searching for free space in disks
- Searching for media in P2P
- Classical tradeoff: the time it takes to process vs. the time it takes to find a strong processor

The Scheduling Problem

[Figure: a scheduling process contacts Processor 1 through Processor N via a proxy; querying processor i costs c_i and returns a price quote (q)]

WorkFlow

- Receive a job
- Contact proxy to learn about available processors
- Query processors by using the proxy
- Each query delays you by c_i seconds
- Each query returns the current load on the server (this value will not change as long as the current job is not scheduled)
- Keep on querying until you are ready to schedule your job

The Goal is

- To schedule the job in a way that minimizes the EXPECTED overall delay
- Overall delay = all delays due to queries + the time the job waits in the queue of the selected processor
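A simplified version of this objective can be explored by simulation. The sketch below makes several assumptions that are not in the slides: every query costs the same delay c (rather than a processor-specific c_i), queue waits are independent and uniform on [0, max_wait], and there is no limit on the number of processors that can be queried. Under these assumptions a threshold rule (schedule on the first processor whose wait is at most r) is evaluated by Monte Carlo.

```python
import math
import random

def threshold(c, max_wait):
    """Wait-time threshold r solving c = integral_0^r (r - w) / max_wait dw."""
    return math.sqrt(2.0 * c * max_wait)

def simulate_delay(r, c, max_wait, rng):
    """Query processors until one reports a wait of at most r; return total delay."""
    delay = 0.0
    while True:
        delay += c                          # each query delays the job by c seconds
        wait = rng.uniform(0.0, max_wait)   # queue wait reported by this processor
        if wait <= r:
            return delay + wait             # query delays plus time spent in the queue

def mean_delay(r, c=2.0, max_wait=100.0, trials=50000, seed=3):
    """Average overall delay of the threshold rule over many simulated jobs."""
    rng = random.Random(seed)
    return sum(simulate_delay(r, c, max_wait, rng) for _ in range(trials)) / trials

r = threshold(2.0, 100.0)
print(r, mean_delay(r), mean_delay(60.0))
```

With c = 2 and max_wait = 100 the threshold is r = 20, and the simulated mean delay at that threshold should come out lower than at a looser threshold such as 60.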

Problem 1

- You are about to purchase an iPod touch over the internet
- You estimate the price distribution of the product over the different sellers to be uniform between 200 and 300 dollars
- You can search by yourself, by visiting different web-sites; the cost of time for obtaining a price quote is $1
- How will you search? What will be your expected cost? What's the mean number of merchants you'll visit?

Solution

[Figure: uniform density f(x) = 0.01 between 200 and 300]

- Sequential search

Find the minimum cost

Verification

- V(x) = x?
- Mean number of merchants visited: 1 / F(x) = 1 / 0.1414 ≈ 7.07
- Mean payment to merchant: 214.14 - 7.07 = 207.07
- (notice it's less than the expected minimum of sampling 7 merchants)
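The numbers for Problem 1 can be reproduced with a few lines. This is a minimal sketch that assumes the reservation price x is set where the expected saving from one more quote equals its cost, c = integral from 200 to x of (x - y) f(y) dy, with prices uniform on [200, 300] and c = 1 as stated in the problem. The variable and function names are illustrative.

```python
import math

def reservation_price(lo, hi, c):
    """Price x solving c = integral_lo^x (x - y) / (hi - lo) dy for uniform prices."""
    return lo + math.sqrt(2.0 * c * (hi - lo))

lo, hi, c = 200.0, 300.0, 1.0
x = reservation_price(lo, hi, c)
p_accept = (x - lo) / (hi - lo)          # F(x): chance a quote is at or below x
mean_visits = 1.0 / p_accept             # visits are geometric with success probability F(x)
mean_payment = (lo + x) / 2.0            # mean of a uniform price conditional on being <= x
expected_cost = mean_payment + c * mean_visits

print(round(x, 2), round(mean_visits, 2), round(mean_payment, 2), round(expected_cost, 2))
```

The expected overall cost (payment plus search costs) comes out equal to the reservation price itself, about 214.14, which is the V(x) = x check listed under Verification.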

Alternative Solution

[Figure: uniform density f(x) = 0.01 between 200 and 300]

- Sequential search

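Choose the reservation price x at which the marginal benefit of one more search equals the cost of search:

$\int_{200}^{x} (x - y)\, f(y)\,dy = c$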