563.10.3 CAPTCHA - PowerPoint PPT Presentation

Title: Advanced Computer Security CAPTCHA
Author: Sari Louis
Last modified by: Munawar Hafiz
Created Date: 5/2/2006 2:14:19 AM

1
563.10.3 CAPTCHA
  • Presented by Sari Louis
  • SPAM Group: Marc Gagnon, Sari Louis, Steve White
  • University of Illinois
  • Spring 2006

2
Agenda
  • Definition
  • Background
  • Applications
  • Types of CAPTCHAs
  • Breaking CAPTCHAs
  • Proposed Approach
  • Conclusion

3
Definition
  • CAPTCHA stands for Completely Automated Public
    Turing test to tell Computers and Humans Apart
  • A.K.A. Reverse Turing Test, Human Interaction
    Proof
  • The challenge: develop a software program that
    can create and grade challenges most humans can
    pass but computers cannot

4
Background
  • First used by AltaVista in 1997
      • Reduced spam add-URL submissions by over 95%
  • CMU/Yahoo!
      • Automated the creation and grading of challenges
  • PARC
      • Relies on document image degradation to prevent
        successful OCR
      • Conducted user-focused studies to assess the
        effectiveness of CAPTCHAs

5
Background
  • CAPTCHAs are based on open AI problems
  • Breaking CAPTCHAs helps advance AI by solving
    these open problems
  • Improving CAPTCHAs helps tell computers and
    humans apart
  • Win-win situation

6
Background - Papers
  • Pessimal Print: A Reverse Turing Test. Allison L.
    Coates, Henry S. Baird, Richard J. Fateman
  • Telling Humans and Computers Apart Automatically.
    Luis von Ahn, Manuel Blum, and John Langford
  • CAPTCHA: Using Hard AI Problems for Security. Luis
    von Ahn, Manuel Blum, Nicholas J. Hopper, and
    John Langford
  • Using Machine Learning to Break Visual Human
    Interaction Proofs (HIPs). Kumar Chellapilla,
    Patrice Y. Simard

7
Applications
  • Free email services
  • Online polls
  • Protection against dictionary attacks
  • Newsgroups, blogs, etc.
  • Spam prevention

8
Types of CAPTCHAs
  • Text-based
      • Gimpy, ez-gimpy
      • Gimpy-r, Google CAPTCHA
      • Simard's HIP (MSN)
  • Graphic-based
      • Bongo
      • PIX
  • Audio-based

9
Text Based CAPTCHAs
  • Gimpy, ez-gimpy
      • Pick a word or words from a small dictionary
      • Distort them and add noise and background
  • Gimpy-r, Google's CAPTCHA
      • Pick random letters
      • Distort them, add noise and background
  • Simard's HIP
      • Pick random letters and numbers
      • Distort them and add arcs
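
The three ways of picking the challenge text listed above can be sketched as follows. This is a minimal illustration, not the code of any of the named systems: the word list, the text lengths, and all function names are assumptions, and the distortion, noise, and arc rendering steps are left out because they need an imaging library.

```python
import random
import string

# Hypothetical small dictionary; the real Gimpy word list is not given in the slides.
SMALL_DICTIONARY = ["apple", "bridge", "candle", "garden", "river", "window"]


def dictionary_word_text(rng, dictionary=SMALL_DICTIONARY):
    """Pick one word from a small dictionary (Gimpy / ez-gimpy style)."""
    return rng.choice(dictionary)


def random_letters_text(rng, length=4):
    """Pick random letters (Gimpy-r style). The length is an assumption."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def random_alphanumeric_text(rng, length=8):
    """Pick random letters and digits (Simard's HIP style). The length is an assumption."""
    alphabet = string.ascii_uppercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def grade(expected, response):
    """Accept a response that matches, ignoring case and surrounding whitespace."""
    return response.strip().lower() == expected.lower()


rng = random.Random(0)
challenge = random_alphanumeric_text(rng)
print(challenge, grade(challenge, challenge.lower()))
```

Grading is deliberately forgiving about case, since distorted text makes upper and lower case hard to tell apart; a stricter grader is equally possible.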

10
Text Based CAPTCHAs

11
Graphic Based CAPTCHAs
  • Bongo
      • Display two series of blocks
      • User must find the characteristic that sets the
        two series apart
      • User is asked to determine which series each of
        four single blocks belongs to
      • Example difference: thick vs. thin lines
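
The Bongo steps above can be sketched with blocks reduced to a few attributes. This is only an illustration under stated assumptions: the real test draws pictures, the attribute names and the fixed thick-vs-thin rule are made up here, and four probe blocks with two possible answers each are taken from the slide.

```python
import random


def make_series(rng, thick, count=4):
    """Build one series of blocks that all share the same line thickness."""
    # "sides" is an irrelevant attribute added as a distractor (assumption).
    return [{"thick": thick, "sides": rng.randint(3, 8)} for _ in range(count)]


def make_bongo_challenge(rng):
    """Return the two series, four probe blocks, and the expected answers."""
    left = make_series(rng, thick=True)
    right = make_series(rng, thick=False)
    probes = [{"thick": rng.random() < 0.5, "sides": rng.randint(3, 8)}
              for _ in range(4)]
    answers = ["left" if probe["thick"] else "right" for probe in probes]
    return left, right, probes, answers


def grade(answers, responses):
    """Pass only if every probe block was assigned to the correct series."""
    return list(responses) == list(answers)


left, right, probes, answers = make_bongo_challenge(random.Random(0))
print(answers, grade(answers, answers))

# With two choices per block, blind guessing passes all four with
# probability 0.5 ** 4.
print(0.5 ** 4)
```

Because each probe has only two possible answers, a guessing program passes fairly often, so a test of this shape would need several rounds to be useful.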

12
Graphic Based CAPTCHAs
  • PIX
      • Create a large database of labeled images
      • Pick a concrete object
      • Pick four images of the object from the image
        database
      • Distort the images
      • Ask the user to pick the object from a list of
        words
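
A minimal sketch of the PIX steps above, assuming the labeled database is a plain mapping from object name to image file names. The database contents, file names, and function names are hypothetical, and the distortion step is omitted.

```python
import random

# Hypothetical labeled image database: object name -> image file names.
IMAGE_DB = {
    "dog": ["dog_1.jpg", "dog_2.jpg", "dog_3.jpg", "dog_4.jpg", "dog_5.jpg"],
    "pool": ["pool_1.jpg", "pool_2.jpg", "pool_3.jpg", "pool_4.jpg", "pool_5.jpg"],
    "tree": ["tree_1.jpg", "tree_2.jpg", "tree_3.jpg", "tree_4.jpg", "tree_5.jpg"],
}


def make_pix_challenge(rng, db=IMAGE_DB, images_per_challenge=4):
    """Pick a concrete object and four distinct images that show it."""
    labels = sorted(db)
    target = rng.choice(labels)
    images = rng.sample(db[target], images_per_challenge)
    # The user picks the object from the full list of labels.
    return {"answer": target, "images": images, "choices": labels}


def grade(challenge, response):
    """Accept the response if it names the object shown in the images."""
    return response.strip().lower() == challenge["answer"]


challenge = make_pix_challenge(random.Random(0))
print(challenge["images"], challenge["choices"])
```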

13
Graphic Based CAPTCHAs
  • Example objects: Pool, Dog

14
Audio Based CAPTCHAs
  • Pick a word or a sequence of numbers at random
  • Render them into an audio clip using
    text-to-speech (TTS) software
  • Distort the audio clip
  • Ask the user to identify and type the word or
    numbers
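
The audio pipeline above can be sketched without a speech engine. In this illustration the TTS step is replaced by a placeholder that emits one sine tone per digit, which is not speech and is only there so the distortion step has samples to work on. The sample rate, tone frequencies, noise level, and all names are assumptions.

```python
import math
import random

SAMPLE_RATE = 8000  # samples per second (assumption)


def pick_digits(rng, count=5):
    """Pick a sequence of digits at random."""
    return "".join(rng.choice("0123456789") for _ in range(count))


def placeholder_render(digits, samples_per_digit=800):
    """Stand-in for TTS: one sine tone per digit, as floats in [-1, 1]."""
    samples = []
    for digit in digits:
        freq = 300 + 100 * int(digit)
        samples.extend(
            math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
            for i in range(samples_per_digit)
        )
    return samples


def distort(samples, rng, noise_level=0.3):
    """Add uniform noise to every sample and clip the result to [-1, 1]."""
    return [
        max(-1.0, min(1.0, s + rng.uniform(-noise_level, noise_level)))
        for s in samples
    ]


def grade(expected, response):
    """Accept the typed response if it matches the digits exactly."""
    return response.strip() == expected


rng = random.Random(0)
digits = pick_digits(rng)
clip = distort(placeholder_render(digits), rng)
print(digits, len(clip))
```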

15
Breaking CAPTCHAs
  • Most text-based CAPTCHAs have been broken by
    software
      • OCR
      • Segmentation
  • Other CAPTCHAs were broken by streaming the tests
    to unsuspecting users to solve
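
To illustrate what segmentation means here, the sketch below splits a tiny black-and-white bitmap into characters by looking for empty columns (a vertical projection). This is one simple classical technique chosen for illustration; the slides do not say which segmentation method the actual attacks used, and the bitmap and function name are made up.

```python
def segment_columns(bitmap):
    """Return (start, end) column ranges that contain ink; end is exclusive."""
    width = len(bitmap[0])
    # A column has ink if any row has a '#' in that column.
    ink = [any(row[x] == "#" for row in bitmap) for x in range(width)]
    segments = []
    start = None
    for x, has_ink in enumerate(ink):
        if has_ink and start is None:
            start = x                      # a new character begins
        elif not has_ink and start is not None:
            segments.append((start, x))    # the character ended at column x
            start = None
    if start is not None:
        segments.append((start, width))    # character touches the right edge
    return segments


# Three glyph-like blobs separated by blank columns.
BITMAP = [
    "##..#.#..###",
    "#...#.#..#..",
    "##..###..###",
]

print(segment_columns(BITMAP))
```

A line or arc drawn across the text leaves no empty columns, so this simple approach returns one large segment; that is one way to see why some CAPTCHAs add arcs and background clutter.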

16
Proposed Approach
  • Very similar to PIX
  • Pick a concrete object
  • Get 6 images at random from images.google.com
    that match the object
  • Distort the images
  • Build a list of 100 words: 90 from a full
    dictionary, 10 from the objects dictionary
  • Prompt the user to pick the object from the list
    of words
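
The word-list step above might look like the sketch below. The slide does not say whether the correct object counts as one of the 10 object words; this sketch assumes it does. The dictionaries are synthetic stand-ins and the function name is hypothetical.

```python
import random


def build_word_list(rng, target, full_dictionary, objects_dictionary,
                    n_full=90, n_objects=10):
    """Return a shuffled list of n_full + n_objects words containing target."""
    # Assumption: the target is one of the n_objects object words.
    other_objects = [w for w in objects_dictionary if w != target]
    decoy_objects = rng.sample(other_objects, n_objects - 1)
    taken = set(decoy_objects) | {target}
    # Fill the rest from the full dictionary, avoiding duplicates.
    fillers = rng.sample([w for w in full_dictionary if w not in taken], n_full)
    words = decoy_objects + [target] + fillers
    rng.shuffle(words)
    return words


# Synthetic stand-in dictionaries, for illustration only.
full_dictionary = [f"word{i}" for i in range(500)]
objects_dictionary = [f"object{i}" for i in range(40)]

words = build_word_list(random.Random(0), "object7",
                        full_dictionary, objects_dictionary)
print(len(words), "object7" in words)
```

Mixing in other object words keeps an attacker from simply picking the one concrete noun in the list.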

17
Proposed Approach - Technical
  • Make an HTTP call to images.google.com and search
    for the object
  • Screen-scrape 2-3 pages of results to get the
    list of images
  • Pick 6 images at random
  • Randomly distort both the images and their URLs
    before displaying them
  • Expire the CAPTCHA in 30-45 seconds
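
Two of the steps above, picking 6 images at random and expiring the challenge, can be sketched as follows. The search request and screen scraping are not shown, so the candidate URLs are placeholders. Drawing the lifetime uniformly from 30-45 seconds is one reading of the slide, and the class and function names are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Challenge:
    answer: str
    image_urls: list
    issued_at: float  # seconds, from whatever clock the server uses
    ttl: float        # lifetime in seconds


def pick_images(candidate_urls, rng, count=6):
    """Pick `count` distinct images at random from the scraped candidates."""
    return rng.sample(candidate_urls, count)


def issue(answer, candidate_urls, now, rng):
    """Create a challenge that expires 30-45 seconds after `now`."""
    return Challenge(answer=answer,
                     image_urls=pick_images(candidate_urls, rng),
                     issued_at=now,
                     ttl=rng.uniform(30, 45))


def grade(challenge, response, now):
    """Reject expired challenges first, then compare the answer."""
    if now - challenge.issued_at > challenge.ttl:
        return False
    return response.strip().lower() == challenge.answer.lower()


# Placeholder URLs standing in for scraped search results.
candidates = [f"https://example.com/img{i}.jpg" for i in range(40)]
challenge = issue("dog", candidates, now=1000.0, rng=random.Random(0))
print(grade(challenge, "dog", now=1010.0), grade(challenge, "dog", now=1046.0))
```

Passing the current time in as an argument keeps the expiry check easy to test; a server would pass its own clock reading.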

18
Proposed Approach - Benefits
  • The database already exists and is public
  • The database is constantly being updated and
    maintained
  • Adding concrete objects to the dictionary is
    virtually instantaneous
  • Distortion prevents caching hacks
  • Quick expiration limits streaming hacks

19
Proposed Approach - Drawbacks
  • Not accessible to people with disabilities (which
    is the case of most CAPTCHAs)
  • Relies on Google's infrastructure
  • Unlike CAPTCHAs using random letters and numbers,
    the number of challenge words is limited
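
A rough size comparison for the last drawback. The 6-character length and the 36-symbol alphabet for the random-text CAPTCHA are assumptions chosen for illustration; the 100-word list is the one from the proposed approach.

```python
import string


def random_text_space(alphabet_size, length):
    """Number of distinct challenge strings of the given length."""
    return alphabet_size ** length


alphabet_size = len(string.ascii_uppercase + string.digits)  # 26 letters + 10 digits
text_space = random_text_space(alphabet_size, 6)             # assumed length of 6
word_list_size = 100                                         # from the proposed approach

print("random-text challenges:", text_space)
print("blind-guess chance, random text:", 1 / text_space)
print("blind-guess chance, 100-word list:", 1 / word_list_size)
```

A blind guess against the word list succeeds about once in 100 tries, which is why the short expiration and the changing image set matter for this design.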