Multilayer Filtering or The Dangerous Economics of Spam Control, 2008 MIT Spam Conference

1
Multilayer Filtering or The Dangerous Economics
of Spam Control, 2008 MIT Spam Conference
  • By Alena Kimakova and Reza Rajabiun
  • York University and COMDOM Software
  • Toronto, Canada

2
I.1 Spam as an empirical problem
  • Two historical observations (2002-2008):
  • A) Spam ratio in 2002: 20-30% of all email
    messages
  • Spam ratio in 2008: 70-90% of all email
    messages
  • Increased sophistication of spam (PDF, image,
    search engine, etc.)
  • B) Increased sophistication and accuracy of
    statistical content filters: 98% accuracy, 0.1%
    false positives
  • (Cormack and Lynam, 2007)
  • Empirical puzzle:
  • Why is there more spam after the adoption of
    technical and regulatory countermeasures?

3
I.2 Methodology: Positive analysis
  • How can we explain the growth and sophistication
    of spam?
  • Hypothesis: A technological trade-off between
    speed and accuracy facing network owners and
    operators.
  • Approach combines:
  • A) Game-theoretical models: Large volumes of
    spam because of asymmetries in the distribution
    of filter quality across the Internet.
  • B) Evolution of the technological possibilities
    frontier facing ISPs and other operators from
    the early 2000s.
  • Problem with existing studies in economics and
    computer science:
  • They do not account for the incentives of
    spammers and ISPs.
  • General point: Importance of interdisciplinary
    cooperation between economists and computer
    scientists in designing spam filtering bundles
    and regulatory countermeasures.

4
II.1 Technological Choice
  • Advances in content filtering accuracy →
    Constrained sensory threat
  • However:
  • High noise/signal ratio → Network costs of spam
    rise
  • Of particular concern in developing countries
    with relatively lower:
  • a) Bandwidth
  • b) Processing capacity
  • c) Administrative capacity
  • Spam and the Digital Divide (Rajabiun, 2007)
  • The literature in computer science and economics
    focuses almost exclusively on the false
    negative/positive problem
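
To make the noise/signal point concrete, here is a minimal sketch that converts a spam share of total mail into spam messages per legitimate message. The function name is an assumption made for illustration; the input shares are the ranges quoted earlier in the talk (roughly 20-30 percent in 2002 and 70-90 percent in 2008).

```python
def noise_to_signal(spam_share: float) -> float:
    """Spam messages per legitimate message, given spam's share of all mail."""
    if not 0.0 <= spam_share < 1.0:
        raise ValueError("spam_share must be in [0, 1)")
    return spam_share / (1.0 - spam_share)


# Spam shares quoted in the talk for 2002 and 2008 (lower and upper bounds).
for year, share in [(2002, 0.20), (2002, 0.30), (2008, 0.70), (2008, 0.90)]:
    ratio = noise_to_signal(share)
    print(f"{year}: spam share {share:.0%} -> {ratio:.2f} spam per legitimate message")
```

Because the ratio grows nonlinearly with the share, a move from a 30 percent to a 90 percent spam share multiplies the load per legitimate message many times over, which is why network costs rise even when end users see little spam.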

5
II.2 End user and network costs
  • More realistic assumption: End user (E) and
    network costs of spam (N) are likely to be
    closely linked.
  • General problem facing an ISP (server-level
    problem):
  • Costs of spam = C(E(E1, E2), N(E1, E2, S))
  • E1: Expected false negative rate
  • E2: Expected false positive rate
  • S: Number of servers
  • Theory: Little is known about the relationships,
    but they are not static.
  • Practice: Can be estimated for individual ISPs
    based on:
  • a) Accounting information
  • b) Features of antispam systems available at a
    point in time
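
The slide leaves the functional form of C, E and N open. The sketch below is one possible parameterization, assuming linear costs and an additive C = E + N; every price and volume in it is a hypothetical placeholder rather than an estimate from the talk. It is meant only to show how an ISP could plug its own accounting figures into the E1, E2, S structure.

```python
from dataclasses import dataclass


@dataclass
class CostParams:
    # All values are hypothetical placeholders, not figures from the talk.
    cost_per_missed_spam: float = 0.01      # end-user cost of one false negative
    cost_per_lost_ham: float = 1.00         # end-user cost of one false positive
    cost_per_server: float = 500.0          # network cost of running one server
    cost_per_delivered_msg: float = 0.001   # network cost of storing one message


def end_user_cost(e1, e2, spam, ham, p):
    """E(E1, E2): cost of spam that gets through plus legitimate mail lost."""
    return e1 * spam * p.cost_per_missed_spam + e2 * ham * p.cost_per_lost_ham


def network_cost(e1, e2, s, spam, ham, p):
    """N(E1, E2, S): server cost plus the cost of every message delivered."""
    delivered = e1 * spam + (1.0 - e2) * ham
    return s * p.cost_per_server + delivered * p.cost_per_delivered_msg


def total_cost(e1, e2, s, spam, ham, p=None):
    """C(E, N), assumed additive here; the slide does not specify the form."""
    p = p or CostParams()
    return end_user_cost(e1, e2, spam, ham, p) + network_cost(e1, e2, s, spam, ham, p)


# Example: 2% false negatives, 0.1% false positives, 4 servers,
# 800,000 spam and 200,000 legitimate messages per period.
print(total_cost(0.02, 0.001, 4, 800_000, 200_000))
```

Under these assumptions E1 and E2 enter both the end-user and the network term, which is the linkage between E and N that the slide emphasizes.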

6
II.3 Antispam technology
  • Basic filtering methods available since the late
    1990s
  • Server level:
  • Adoption of (fuzzy) fingerprinting (2001-2005)
    and reputation-based systems (2004-2006) upstream
    (fast, but not accurate)
  • End user level:
  • Statistical (Bayesian) content filters
    (accurate, but not fast)
  • Other technical and public policy measures: Aiming
    to increase the costs of sending spam (Hashcash,
    civil/criminal law, do-not-call registries)
  • Optimal choice of filter depends on the identity
    of the end user/ISP
  • Upstream ISPs are more sensitive to speed,
    downstream ISPs to accuracy
  • Divergence between (socially) optimal and actual
    technological choice
7
III.1 The long tail
  • Distribution of taste for spam for each
    sub-network is not normal
  • Khong (2004): Mechanisms that connect spammers
    and those with a taste for spam → first-best
    solution (open channel argument)
  • Blocking and filters: second best
  • Empirically: More spam after widespread adoption
    of open channels, rather than less.
  • Loder et al. (2004): Attention Bond Mechanism
    (ABM) is first best because it allows for price
    negotiations between senders and receivers.
  • Basic economic assumption: The subjective theory
    of value
  • Ex.: Search for affordable drugs for the
    uninsured in the U.S.
  • The long tail in natural sciences: Phase
    transition/multiple equilibria
  • Game theory: Strategic complementarities
8
III.2 Sender side countermeasures
  • In microeconomic theory:
  • Long-tailed distributions are associated with
    markets where markups are invariant to the number
    of sellers (e.g. mutual funds)
  • Margins for spammers, or expected response rates,
    are invariant to the number of spammers at play
  • Implications: Legal sanctions and IP reputation
    systems increase the costs of spamming and drive
    some spammers out of the market, but do not thin
    out the market.
  • Intuition: As in wars against prostitution and
    drugs, "hang them all" strategies are ineffective
    and increase social costs (Becker-Friedman).

9
IV.1 Strategic conflicts
  • Trivial model of spam: Tragedy of the commons
  • → Generic solution is to increase costs on
    spammers, but this results in escalating spam wars.
  • Empirically: Increasing sender costs since the
    early 2000s, but more spam.
  • Escalation → Development and adoption of new
    spamming techniques
  • Androutsopoulos et al. (2005) → 2-player game
    between senders and receivers has a single Nash
    equilibrium
  • → Settles in infinitely repeated games, unless
    the underlying technologies or the taste for
    spam change
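
As a rough illustration of what a single equilibrium in a sender/receiver game can look like, the sketch below solves a toy 2x2 game. It is not the model of Androutsopoulos et al. (2005): the strategies and every payoff number are hypothetical, chosen only so that the game has no pure-strategy equilibrium and exactly one mixed one.

```python
from fractions import Fraction

# Rows: sender plays SPAM (0) or ABSTAIN (1).
# Columns: receiver plays FILTER (0) or READ (1).
# All payoffs are hypothetical and chosen for illustration only.
SENDER = [[-1, 3],
          [0, 0]]
RECEIVER = [[-1, -4],
            [-1, 0]]


def pure_nash(a, b):
    """Return all cells where neither player gains by deviating alone."""
    equilibria = []
    for r in range(2):
        for c in range(2):
            row_ok = a[r][c] >= a[1 - r][c]
            col_ok = b[r][c] >= b[r][1 - c]
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria


def mixed_nash(a, b):
    """Interior mixed equilibrium of a 2x2 game via indifference conditions.

    Returns (p, q): p = probability the row player picks row 0,
    q = probability the column player picks column 0, or None.
    """
    p_den = b[0][0] - b[1][0] - b[0][1] + b[1][1]
    q_den = a[0][0] - a[0][1] - a[1][0] + a[1][1]
    if p_den == 0 or q_den == 0:
        return None
    p = Fraction(b[1][1] - b[1][0], p_den)
    q = Fraction(a[1][1] - a[0][1], q_den)
    if 0 < p < 1 and 0 < q < 1:
        return p, q
    return None


print(pure_nash(SENDER, RECEIVER))
print(mixed_nash(SENDER, RECEIVER))
```

In this toy game the equilibrium mixing probabilities depend only on the payoff entries, so they shift only when the payoffs shift, for example through a change in filtering technology or in the taste for spam.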

10
IV.2 Spam growth and filter quality
  • Reshef and Solan (2006): Blame filters for the
    growth of spam, due to differences in filter
    quality.
  • When costs of sending messages are not too high
  • → Effect of improved filter quality on total
    volume of spam is ambiguous
  • Eaton et al. (2008): Complementarities between
    filters and sender-side countermeasures.
    Improving filtering alone results in more spam.
    Given ineffective sender-side countermeasures,
    they suggest receiver-side payments (as in SMS
    systems).
  • Kearns (2005): Spam as a source of both costs and
    revenues for ISPs → economic incentive to adopt
    inefficient filters

11
V.1 Speed versus accuracy
  • Existing literature:
  • Even if they could read end user preferences
    accurately, upstream backbone providers do not
    have sufficient financial incentives to adopt
    the right technological countermeasures.
  • Argument here: Not necessarily because of
    financial factors alone.
  • Hypothesis: ISPs faced technological trade-offs
    in terms of speed vs. accuracy
  • Coordination failure is not between senders and
    receivers, as in the tragedy of the commons or
    Khong (2004), but between upstream and downstream
    entities/servers.
  • Downstream is better off with less incoming spam,
    but cannot force upstream to do the optimal
    filtering for them.

12
V.2 Bundles and layers
  • Bundles of countermeasures facing spammers:
  • a) Ad hoc feature selection rules (late 1990s):
    centralized
  • b) Fingerprinting/checksum filters (2000-2005):
    centralized
  • c) IP reputation/authentication mechanisms
    (2004-2006): centralized
  • d) Statistical content filters (since late
    1990s): distributed
  • Asymmetric filter quality (2000-2006):
  • (b and c) fast relative to 1st generation
    statistical content filters (5x), but less
    accurate (-5% and -30% respectively).
  • Response by income-smoothing spammers: higher
    noise/signal ratio, more variants, one-shot BGP
    spectrum agility
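
The layering described above can be sketched as a short pipeline in which cheap centralized checks run first and the content filter runs last. This is an assumed arrangement for illustration, not the architecture of any product named in the talk: the exact-hash fingerprint stands in for the fuzzier checksums used in practice, the blocked address comes from a documentation-only range, and the keyword test is a placeholder for a real statistical filter.

```python
import hashlib
import re


def fingerprint(body: str) -> str:
    """Exact checksum of a whitespace- and case-normalized message body."""
    normalized = re.sub(r"\s+", " ", body.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical centralized lists, for illustration only.
BLOCKED_IPS = {"203.0.113.7"}
KNOWN_SPAM_FINGERPRINTS = {fingerprint("Buy cheap pills now")}


def keyword_filter(body: str) -> bool:
    """Placeholder for a statistical content filter."""
    return "cheap" in body.lower()


def classify(sender_ip: str, body: str, content_filter=keyword_filter) -> str:
    # Layer 1: IP reputation (fast, centralized).
    if sender_ip in BLOCKED_IPS:
        return "rejected: ip reputation"
    # Layer 2: fingerprint/checksum (fast, centralized).
    if fingerprint(body) in KNOWN_SPAM_FINGERPRINTS:
        return "rejected: fingerprint"
    # Layer 3: content filter (slower, distributed).
    if content_filter(body):
        return "rejected: content"
    return "delivered"


print(classify("203.0.113.7", "hello"))
print(classify("198.51.100.2", "buy  cheap pills NOW"))
print(classify("198.51.100.2", "Buy cheap p1lls now xqz"))
print(classify("198.51.100.2", "Agenda for Monday"))
```

The third call shows the spammers' response in miniature: a trivially altered variant from a fresh address passes both centralized layers, so the work falls on the slower content layer.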

13
VI.1 The response
  • A) Coordination by operators to strengthen
    authentication protocols (SPF, DKIM)
  • Problem: A wide range of techniques is available
    to bypass the protocols, and even to use them as
    an instrument for sending more spam!
  • B) Closing the gap between fast and accurate
    filters: Further optimization of the methods for
    distributed content scanning, learning, and
    classification
  • 1st versus 2nd generation Bayesian filters
    (CRM114, Bogofilter, COMDOM)

14
VI.2 Evolution of Bayesian content filters

15
VI.3 Findings
  • Technological trade-off between speed and
    accuracy now closed with distributed 2nd
    generation Bayesian filters (at least 30x
    differential in throughput relative to 1st
    generation)
  • Note: Fingerprinting was 5x faster than 1st
    generation Bayesian filters in terms of
    throughput
  • Fixed versus variable costs of message processing
  • → Substantial reductions in variable costs of
    scanning, minor improvements in fixed costs of
    classification
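
To see what these throughput multiples mean for server counts, here is a small worked calculation. The multipliers (1x, 5x, 30x) are the ones quoted on the slides; the message load and the baseline per-server throughput are hypothetical numbers picked only to make the arithmetic easy to follow.

```python
import math

# Throughput relative to 1st generation Bayesian filters, as quoted in the talk.
RELATIVE_THROUGHPUT = {
    "1st-gen Bayesian": 1,
    "fingerprinting": 5,
    "2nd-gen Bayesian": 30,  # the slides say "at least 30x"
}

LOAD_MSGS_PER_SEC = 3000      # hypothetical incoming load
BASELINE_MSGS_PER_SEC = 10    # hypothetical 1st-gen throughput per server


def servers_needed(load, baseline, multiplier):
    """Smallest whole number of servers that can keep up with the load."""
    return math.ceil(load / (baseline * multiplier))


for name, multiplier in RELATIVE_THROUGHPUT.items():
    count = servers_needed(LOAD_MSGS_PER_SEC, BASELINE_MSGS_PER_SEC, multiplier)
    print(f"{name}: {count} servers")
```

Taken together, the two quoted multiples imply that a 2nd generation Bayesian filter is at least 6x faster than fingerprinting, so under these assumptions the speed argument for keeping the less accurate layer no longer applies.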

16
VII. Summary
  • More spam is an instrument for:
  • a) Evading filters
  • b) Searching for people with a taste for spam
  • Normative question for policy makers: Should
    spamming be illegal?
  • Legal sanctions may induce a moral hazard problem
    and potentially exacerbate spam at the aggregate
    level by inducing the adoption of more costly
    strategies/technologies (especially important for
    developing countries).
  • For designers of antispam systems/bundles: Should
    we retain layers that aim to increase the costs
    of spamming through ad hoc centralized control
    (e.g. IP reputation, fingerprinting)?