Identification Numbers and Error Detection - PowerPoint PPT Presentation

Provided by: Mered95
Learn more at: http://www.math.wm.edu

Transcript and Presenter's Notes



1
Identification Numbers and Error Detection
  • Meredith Wachs

2
Where do you find identification numbers?
  • Checks
  • Credit cards
  • Driver's licenses
  • VIN Numbers
  • Zip Codes
  • SSN
  • ISBN Numbers
  • Bar Codes/UPCs (Universal Product Codes)

3
Check Digits
  • Used to reduce error
  • Grocery items, credit cards, overnight mail,
    magazines, personal checks, traveler's checks,
    soft drink cans, and automobiles have check
    digits (COMAP)

4
Congruence mod n
  • a is congruent to b, mod n, if n divides
    a - b (Stillwell, Elements of Number Theory)
  • For example, 27 is congruent to 6 mod 3 because
    $27 - 6 = 21$, which is divisible by three.
  • Similarly, a number is congruent (mod n) to its
    remainder after being divided by n; for example,
    $20 / 3 = 6$ R 2, so $20 \equiv 2 \pmod{3}$.

5
Money Order
  • COMAP example: The identification number on a
    money order is 63024383845.
  • The last digit is the check digit, so the number
    formed by the first 10 digits should be congruent
    to 5 mod 9.
  • How do we divide such a large number by 9?
  • How do we know this?

6
Calculating Divisibility Rules
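  • Take a number $z = abcdefg$, where a through g are its digits
  • This could also be written as
    $z = g \cdot 10^0 + f \cdot 10^1 + e \cdot 10^2 + d \cdot 10^3 + c \cdot 10^4 + b \cdot 10^5 + a \cdot 10^6$
  • Let's replace each $10^n$ by its congruence mod 9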

7
Divisibility by 9 continued
  • $10^0 = 1 \equiv 1 \pmod{9}$
  • $10^1 = 10 \equiv 1 \pmod{9}$
  • $10^2 = 10 \cdot 10 \equiv 1 \cdot 1 = 1 \pmod{9}$
  • By analogy, we see that every place will have a
    weight of 1, so $a(1) + b(1) + c(1) + d(1) + e(1) + f(1) + g(1)$
    leaves the same remainder (mod 9) as abcdefg, so
    this is our divisibility rule.
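As a quick illustration of the rule, the short Python sketch below compares the digit-sum remainder with ordinary division for the money order number from the earlier slide. The function names are made up for this example.

```python
def digit_sum(number: int) -> int:
    """Add up the decimal digits of a non-negative integer."""
    return sum(int(ch) for ch in str(number))


def remainder_mod_9(number: int) -> int:
    """Remainder mod 9, computed from the digit sum instead of long division."""
    return digit_sum(number) % 9


money_order = "63024383845"
body = int(money_order[:-1])        # first 10 digits
check_digit = int(money_order[-1])  # last digit

print(digit_sum(body))                       # 41
print(remainder_mod_9(body))                 # 5
print(remainder_mod_9(body) == body % 9)     # True: both methods agree
print(remainder_mod_9(body) == check_digit)  # True: the money order number is consistent
```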

8
Divisibility by 11
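  • We'll proceed in the same way
  • $10^0 \equiv 1 \pmod{11}$
  • $10^1 \equiv -1 \pmod{11}$
  • $10^2 \equiv (-1)(-1) = 1 \pmod{11}$
  • $10^3 = 10^2 \cdot 10^1 \equiv 1 \cdot (-1) = -1 \pmod{11}$
  • In this way, we see that
    $abcdefg \equiv (g + e + c + a) - (f + d + b) \pmod{11}$
  • Ex. $3458679 \equiv (9 + 6 + 5 + 3) - (7 + 8 + 4) = 4 \pmod{11}$, so 3458679 is
    not divisible by 11.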
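The alternating-sum rule can be checked the same way. This is a minimal Python sketch with a hypothetical helper name; it adds digits in even positions (counted from the right, starting at 0) and subtracts the others.

```python
def alternating_remainder_mod_11(number: int) -> int:
    """Remainder mod 11 via the alternating digit sum, starting from the units digit."""
    total = 0
    for position, ch in enumerate(reversed(str(number))):
        digit = int(ch)
        # Even powers of 10 are congruent to +1 mod 11, odd powers to -1.
        total += digit if position % 2 == 0 else -digit
    return total % 11


print(alternating_remainder_mod_11(3458679))  # 4
print(3458679 % 11)                           # 4, the same remainder by direct division
```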

9
Money Order Revisited
  • If the check digit only checks the divisibility
    of the sum, what can be overlooked?

10
Potential Problems with the Check Digit
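  • Any digit can be replaced with another digit that
    is congruent to it mod n without being detected
    (so 0 can be replaced with 9 in this example)
  • Also, any two digits can be transposed without
    changing the digit sum, so transpositions go
    undetected.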
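Both weaknesses can be seen on the money order number. In this sketch (hypothetical function name, and the two altered numbers are constructed here for illustration), each damaged number still produces the original check digit 5.

```python
def money_order_check_digit(body: str) -> int:
    """Check digit of the mod-9 scheme: remainder of the digit sum mod 9."""
    return sum(int(ch) for ch in body) % 9


original = "6302438384"    # first 10 digits of the money order number
zero_to_nine = "6392438384"  # the 0 in the third position replaced by 9
transposed = "6032438384"    # second and third digits swapped

for body in (original, zero_to_nine, transposed):
    print(body, money_order_check_digit(body))  # each line ends in 5
```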

11
Other Uses of the Check Digit
  • Traveler's checks and Euro banknotes also use a
    check digit, but there the entire number, including
    the check digit, should be divisible by 9.
  • If you know the divisor rule, you can easily
    figure out what the check digit should be. For
    example, if a traveler's check ID number is
    3487956321 and it has a divisor rule of 9, what
    should the check digit be?

12
Answer
  • The sum of the digits is 48, and $48 + 6 = 54$, which
    is divisible by 9, so the check digit should be 6.
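A minimal Python sketch of this calculation follows. The function name is hypothetical, and returning 0 when the digit sum is already a multiple of 9 is a choice made here (9 would satisfy the rule equally well).

```python
def travelers_check_digit(body: str) -> int:
    """Digit that makes the total digit sum a multiple of 9."""
    total = sum(int(ch) for ch in body)
    # (-total) % 9 is the amount needed to reach the next multiple of 9.
    return (-total) % 9


print(sum(int(ch) for ch in "3487956321"))  # 48
print(travelers_check_digit("3487956321"))  # 6
```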

13
More Sophisticated- UPCs
  • A 12-digit number that is found along the bottom
    of a barcode
  • A BBBBB CCCCC D
  • A is the type of good, B is the
    manufacturer's code, C is the product code, D
    is the check digit (COMAP)

14
UPC Error Prevention
  • Take the code A BCDEF GHIJK L. Going from left
    to right, calculate
    $3A + B + 3C + D + 3E + F + 3G + H + 3I + J + 3K + L$.
    If it isn't divisible by 10, it is an incorrect
    UPC.
  • This detects all single-position errors and
    about 89% of other errors (COMAP 326).
  • Try this on your own!
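One way to try it is the Python sketch below. The function name is hypothetical, and the 12-digit sample code is an illustrative value chosen for this sketch, not one taken from the slides.

```python
def upc_is_valid(code: str) -> bool:
    """Apply the 3-1-3-1... weighting to a 12-digit UPC and test divisibility by 10."""
    digits = [int(ch) for ch in code if ch.isdigit()]
    if len(digits) != 12:
        return False
    # Positions 1, 3, 5, ... (index 0, 2, 4, ...) get weight 3; the rest get weight 1.
    weighted = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(digits))
    return weighted % 10 == 0


sample = "0 36000 29145 2"               # illustrative sample, weighted sum is 60
print(upc_is_valid(sample))              # True
print(upc_is_valid("0 36000 29145 3"))   # False: one digit changed
print(upc_is_valid("0 36000 29142 5"))   # False: last two digits transposed
```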

15
The U.S. Banking System
  • To identify a certain bank, it has an
    identification number (the first string on the
    bottom of the check). The string is ABCDEFGH,
    and the check digit is I. I must be the last
    digit of the resulting sum
    $7A + 3B + 9C + 7D + 3E + 9F + 7G + 3H$.

16
Here, we can see that $7(1) + 3(2) + 9(1) + 0 + 0 + 0 + 7(2) + 3(4) = 48$.
Since 8 is the check digit, this is a valid number.
http://blog.wellsfargo.com/GuidedByHistory/images/Earth_Check_large.jpg
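The same computation as a short Python sketch. The function name is hypothetical, and the eight digits 12100024 are read off from the arithmetic on this slide (A=1, B=2, C=1, D=E=F=0, G=2, H=4), since the check itself appears only as an image.

```python
BANK_WEIGHTS = (7, 3, 9, 7, 3, 9, 7, 3)


def bank_check_digit(first_eight: str) -> int:
    """Last digit of the 7-3-9 weighted sum of the first eight digits."""
    total = sum(weight * int(ch) for weight, ch in zip(BANK_WEIGHTS, first_eight))
    return total % 10


first_eight = "12100024"  # assumed from the slide's arithmetic
print(sum(w * int(ch) for w, ch in zip(BANK_WEIGHTS, first_eight)))  # 48
print(bank_check_digit(first_eight))                                 # 8
```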
17
Credit Card Numbers
  • A more effective algorithm, which is used for
    credit cards, is called Codabar, and is also used
    by libraries, blood banks, photofinishing
    companies, German banks, and the South Dakota
    driver's license department (COMAP 327).
  • Codabar allows computers to detect 100% of
    single-position errors and about 98% of other
    common errors (COMAP 328).
  • The Codabar algorithm was developed by Hans Peter
    Luhn (1896-1964) and was patented in 1960
    (http://www.merriampark.com/anatomycc.htm).

18
The Algorithm
  • Assume a 16-digit card number, with the final
    digit being the check digit. Add every digit in
    the odd-numbered positions (going from left to
    right) and multiply the sum by two.
  • Then add the number of digits in odd-numbered
    positions that are greater than 4.
  • Finally, add all the digits in the even-numbered
    positions (except for the check digit).
  • The check digit will need to be whatever will
    make the total divisible by 10.
  • Ex. What is the check digit for a card whose
    first 15 digits are 124785943967210?

19
Solution
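  • Ex. What is the check digit for a card whose
    first 15 digits are 124785943967210?
  • $1 + 4 + 8 + 9 + 3 + 6 + 2 + 0 = 33$, and $33 \cdot 2 = 66$
  • $66 + 3 = 69$ (three odd-position digits, 8, 9, and 6, are greater than 4)
  • $69 + 2 + 7 + 5 + 4 + 9 + 7 + 1 = 104$
  • $104 + 6 = 110$, so the check digit is 6.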
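The steps of the algorithm slide can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes exactly 15 digits are passed in.

```python
def card_check_digit(first_fifteen: str) -> int:
    """Check digit for a 16-digit card number, following the slide's three steps."""
    digits = [int(ch) for ch in first_fifteen]
    odd_positions = digits[0::2]   # 1st, 3rd, 5th, ... digit from the left
    even_positions = digits[1::2]  # 2nd, 4th, 6th, ... digit from the left

    total = 2 * sum(odd_positions)                   # step 1: double the odd-position sum
    total += sum(1 for d in odd_positions if d > 4)  # step 2: count odd-position digits > 4
    total += sum(even_positions)                     # step 3: add the even-position digits
    return (-total) % 10                             # digit that reaches a multiple of 10


print(card_check_digit("124785943967210"))  # 6
```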

20
ISBN Numbers
  • The 10-digit International Standard Book Number
    detects 100% of single errors and 100% of
    transposition errors (COMAP 328)
  • An ISBN of A-BCDE-FGHI-J, with J as the check
    digit, is valid if
    $10A + 9B + 8C + 7D + 6E + 5F + 4G + 3H + 2I + J$
    is divisible by 11.

21
ISBNs continued
  • Example: 0-387-95587-9
  • $10(0) + 9(3) + 8(8) + 7(7) + 6(9) + 5(5) + 4(5) + 3(8) + 2(7) + 9 = 286$
  • 286 is divisible by 11 because $(6 + 2) - 8 = 0$, which
    is divisible by 11.
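A minimal Python sketch of the ISBN test, with hypothetical function names. It handles only the digits 0-9, which is all this example needs.

```python
def isbn10_weighted_sum(isbn: str) -> int:
    """Weighted sum 10A + 9B + ... + 2I + J of a 10-digit ISBN (hyphens ignored)."""
    digits = [int(ch) for ch in isbn if ch.isdigit()]
    return sum(weight * d for weight, d in zip(range(10, 0, -1), digits))


def isbn10_is_valid(isbn: str) -> bool:
    """Valid when there are 10 digits and the weighted sum is a multiple of 11."""
    digit_count = sum(ch.isdigit() for ch in isbn)
    return digit_count == 10 and isbn10_weighted_sum(isbn) % 11 == 0


print(isbn10_weighted_sum("0-387-95587-9"))  # 286
print(isbn10_is_valid("0-387-95587-9"))      # True
print(isbn10_is_valid("0-387-95578-9"))      # False: two adjacent digits transposed
```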

22
Why does this work every time?
  • COMAP proof
  • Let there be an error in the B slot, so that the
    digit B is recorded as B'.
  • Then both calculations (with and without the error)
    must be divisible by 11 in order for the error to
    go undetected. Then the difference between the
    calculations must be divisible by 11.

23
Proof continued
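  • So $(10A + 9B + 8C + 7D + 6E + 5F + 4G + 3H + 2I + J) - (10A + 9B' + 8C + 7D + 6E + 5F + 4G + 3H + 2I + J) = 9(B - B')$.
  • Since $0 \le B, B' \le 9$, the difference $B - B'$ lies
    between -9 and 9 and cannot be a nonzero multiple of 11.
    Because 11 is prime and does not divide 9, $9(B - B')$
    cannot be divisible by 11 unless $B = B'$.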
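The same argument applies to every slot, not just B, because each weight from 1 to 10 is smaller than 11. As a sanity check, this brute-force Python sketch (not part of the COMAP proof) tries every weight and every pair of distinct digits and looks for a single-digit error that would slip through.

```python
# Collect every (weight, correct digit, wrong digit) whose error would go undetected,
# i.e. where the change in the weighted sum is a multiple of 11.
undetected = [
    (weight, digit, wrong_digit)
    for weight in range(1, 11)
    for digit in range(10)
    for wrong_digit in range(10)
    if digit != wrong_digit and (weight * (digit - wrong_digit)) % 11 == 0
]

print(undetected)  # []: no single-digit error escapes the mod 11 test
```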

24
Another Example
  • What is the check digit for the ISBN 0-7167-1910?

25
Solution
  • $10(0) + 9(7) + 8(1) + 7(6) + 6(7) + 5(1) + 4(9) + 3(1) + 2(0) = 199$
  • The next number divisible by 11 is $199 + 10 = 209$,
    so the check digit must stand for 10.
  • To represent 10, ISBN numbers have an X.
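A minimal Python sketch of the check-digit calculation, including the X case. The function name is hypothetical.

```python
def isbn10_check_character(first_nine: str) -> str:
    """Check character for the first nine ISBN digits (hyphens ignored)."""
    digits = [int(ch) for ch in first_nine if ch.isdigit()]
    # Weights 10, 9, ..., 2 for the nine known digits; the check digit has weight 1.
    partial = sum(weight * d for weight, d in zip(range(10, 1, -1), digits))
    check = (-partial) % 11
    return "X" if check == 10 else str(check)


print(isbn10_check_character("0-7167-1910"))  # X
print(isbn10_check_character("0-387-95587"))  # 9
```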

26
Code 39
  • This uses the digits 0-9 and letters A-Z (which
    correspond to 10-35)
  • Code 39 is used by the DoD, automotive companies,
    and the health industry (COMAP 329).
  • A 15-character string is validated by whether
    $15a + 14b + 13c + \dots + 1o$ is divisible by 36 (with o as
    the check digit).
  • The VIN system is a more complicated alphanumeric
    system.

27
Bar Codes
  • To decode the information in a bar code, a beam
    of light is passed over the bars and spaces via a
    scanning device, such as a handheld wand or a
    fixed-beam device. The dark bars reflect very
    little light back to the scanner, whereas the
    light spaces reflect much light. The differences
    in reflection intensities are detected by the
    scanner and converted to strings of 0s and 1s
    that represent specific numbers and letters.
    (COMAP 334)

28
Postnet Codes
  • A bar code for a ZIP+4 code (plus 1 check digit)
  • There are 52 long or short bars, with one guard
    bar on either side and the remaining 50 bars
    grouped into 10 groups of 5, with 2 long bars and
    3 short bars each.
  • The check digit makes the sum of all ten numbers
    divisible by 10.
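The check digit rule can be sketched in Python as below. The function name is hypothetical and the ZIP+4 value is a made-up example, not one from the slides.

```python
def postnet_check_digit(zip_plus_4: str) -> int:
    """Digit that makes the sum of the nine ZIP+4 digits plus itself a multiple of 10."""
    digits = [int(ch) for ch in zip_plus_4 if ch.isdigit()]
    return (-sum(digits)) % 10


example = "12345-6789"               # hypothetical ZIP+4, digit sum 45
print(postnet_check_digit(example))  # 5, giving a total of 50
```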

29
http://www-math.cudenver.edu/wcherowi/jcorner/barcodes.html
30
UPC Bar Codes
  • Each digit is made up of seven modules
  • There are guard bars, a center division, and a
    difference between manufacturer and product
    numbers to make reading the codes as accurate as
    possible.
  • Bar codes for UPCs have been in use since a pack
    of Wrigley Juicy Fruit gum was scanned on June
    26, 1974 in Marsh's Supermarket in Troy, Ohio.
    The first barcodes were made by National Cash
    Register, but smearing ink problems soon made IBM
    the top contender in the market.
    (http://en.wikipedia.org/wiki/Barcode)

31
Digit Manufacturer Product
0 0001101 1110010
1 0011001 1100110
2 0010011 1101100
3 0111101 1000010
4 0100011 1011100
5 0110001 1001110
6 0101111 1010000
7 0111011 1000100
8 0110111 1001000
9 0001011 1110100
http://en.wikipedia.org/wiki/Universal_Product_Code
Is the UPC above valid? ($3(0) + 1(3) + 3(6) + \dots + 3(5) + 1(2) = 60 \equiv 0 \pmod{10}$, so yes, it is valid.)
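The digit table above works as a lookup in both directions. The Python sketch below (hypothetical names) stores the two columns, decodes a seven-module pattern, and confirms that each product pattern is the bit-by-bit complement of the manufacturer pattern for the same digit.

```python
# Seven-module patterns from the table; list index = digit.
MANUFACTURER = ["0001101", "0011001", "0010011", "0111101", "0100011",
                "0110001", "0101111", "0111011", "0110111", "0001011"]
PRODUCT = ["1110010", "1100110", "1101100", "1000010", "1011100",
           "1001110", "1010000", "1000100", "1001000", "1110100"]


def decode_module(pattern: str):
    """Return (side, digit) for a seven-module pattern, or raise ValueError."""
    if pattern in MANUFACTURER:
        return ("manufacturer", MANUFACTURER.index(pattern))
    if pattern in PRODUCT:
        return ("product", PRODUCT.index(pattern))
    raise ValueError(f"not a valid UPC digit pattern: {pattern}")


def complement(pattern: str) -> str:
    """Swap every 0 and 1 in the pattern."""
    return "".join("1" if bit == "0" else "0" for bit in pattern)


print(decode_module("0111101"))  # ('manufacturer', 3)
print(decode_module("1001110"))  # ('product', 5)
print(all(complement(m) == p for m, p in zip(MANUFACTURER, PRODUCT)))  # True
```

Because no pattern appears in both columns, a scanner can tell which half of the code it is reading from the pattern alone.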
32
Illinois Drivers License Numbers
  • In contrast to Social Security numbers, an
    Illinois driver's license number can help
    reconstruct a person's surname (by sound, not by
    spelling), first and middle initials, date of
    birth, and gender.
  • These forms of ID numbers are also used in the
    National Archives, the Library of Congress, and
    in genealogy research. (COMAP 341-2).

33
Soundex Coding System for Surnames
  1. Delete all occurrences of h and w. (so Wachs
    becomes acs)
  2. Assign numbers as follows: a,e,i,o,u,y = 0;
    b,f,p,v = 1; c,g,j,k,q,s,x,z = 2; d,t = 3; l = 4;
    m,n = 5; r = 6 (so acs = 022)
  3. If two or more letters with the same numeric
    value are adjacent, omit all but the first (so
    we're left with ac)

34
Soundex continued
  4. Delete the first character of the original
    name if still present.
  5. Delete all occurrences of a,e,i,o,u,y. (so
    we're left with c)
  6. Retain only the first three digits
    corresponding to the remaining letters; append
    trailing 0s if fewer than three letters remain;
    precede the three digits with the first letter of
    the surname (so we have W200)
  • Because of the way this is coded, many errors in
    spelling are taken into account.
  • Steps taken directly from COMAP 342
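The six steps can be followed literally in code. The Python sketch below is one possible reading of them, with hypothetical names; it assumes the surname consists of plain English letters and silently drops anything else.

```python
SOUNDEX_CODES = {}
for letters, code in [("aeiouy", "0"), ("bfpv", "1"), ("cgjkqsxz", "2"),
                      ("dt", "3"), ("l", "4"), ("mn", "5"), ("r", "6")]:
    for letter in letters:
        SOUNDEX_CODES[letter] = code


def soundex(surname: str) -> str:
    """Soundex code of a surname, following the six steps on the slides."""
    name = [ch for ch in surname.lower() if ch in SOUNDEX_CODES or ch in "hw"]
    first_letter = name[0]

    # Step 1: delete h and w, remembering each letter's original position.
    kept = [(i, ch) for i, ch in enumerate(name) if ch not in "hw"]

    # Steps 2 and 3: among adjacent letters with the same code, keep only the first.
    collapsed = []
    for i, ch in kept:
        if collapsed and SOUNDEX_CODES[collapsed[-1][1]] == SOUNDEX_CODES[ch]:
            continue
        collapsed.append((i, ch))

    # Step 4: delete the first character of the original name if it is still present.
    collapsed = [(i, ch) for i, ch in collapsed if i != 0]

    # Step 5: delete the vowels and y (the letters coded 0).
    digits = [SOUNDEX_CODES[ch] for _, ch in collapsed if SOUNDEX_CODES[ch] != "0"]

    # Step 6: keep three digits, pad with 0s, and put the first letter in front.
    return first_letter.upper() + "".join(digits[:3]).ljust(3, "0")


print(soundex("Wachs"))  # W200
```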

35
The Middle Digits-First Initial
  Initial  Code   Initial  Code   Initial  Code   Initial  Code
  A        0      H        320    O        640    V        860
  B        60     I        400    P        660    W        880
  C        100    J        420    Q        700    X        940
  D        160    K        500    R        720    Y        960
  E        200    L        520    S        780    Z        980
  F        240    M        540    T        800
  G        280    N        620    U        840

36
The Middle Digits-Middle Initial
  Initial  Code   Initial  Code   Initial  Code   Initial  Code
  A        1      H        8      O        14     V        18
  B        2      I        9      P        15     W        19
  C        3      J        10     Q        15     X        19
  D        4      K        11     R        16     Y        19
  E        5      L        12     S        17     Z        19
  F        6      M        13     T        18
  G        7      N        14     U        18

37
Calculating the Middle Digits
  • Add the code for the first initial to the code
    for the middle initial. For my initials, MJ, you
    have $540 + 10 = 550$.
  • So far, my number is W200-550.
  • All middle digits information taken from
    http://www.highprogrammer.com/alan/numbers/dl_us_shared.html
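The two tables and the addition can be sketched in Python as follows. The names are hypothetical, and padding the result to three digits is an assumption made here so that small sums still fill the three middle positions.

```python
FIRST_INITIAL_CODE = {
    "A": 0, "B": 60, "C": 100, "D": 160, "E": 200, "F": 240, "G": 280,
    "H": 320, "I": 400, "J": 420, "K": 500, "L": 520, "M": 540, "N": 620,
    "O": 640, "P": 660, "Q": 700, "R": 720, "S": 780, "T": 800, "U": 840,
    "V": 860, "W": 880, "X": 940, "Y": 960, "Z": 980,
}
MIDDLE_INITIAL_CODE = {
    "A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7,
    "H": 8, "I": 9, "J": 10, "K": 11, "L": 12, "M": 13, "N": 14,
    "O": 14, "P": 15, "Q": 15, "R": 16, "S": 17, "T": 18, "U": 18,
    "V": 18, "W": 19, "X": 19, "Y": 19, "Z": 19,
}


def middle_digits(first_initial: str, middle_initial: str) -> str:
    """Three middle digits: first-initial code plus middle-initial code."""
    total = FIRST_INITIAL_CODE[first_initial.upper()] + MIDDLE_INITIAL_CODE[middle_initial.upper()]
    return f"{total:03d}"  # zero-padding to three digits is an assumption


print(middle_digits("M", "J"))  # 550
```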

38
The Last Five Digits
  • In Illinois, the last five digits encode the
    birth date and sex of the person.
  • Each month is assumed to have 31 days (starting
    with January 1 as 001). My birthday, March 2, is
    therefore $2(31) + 2 = 62 + 2 = 064$. If male, you are done.
    If female, add 600 to this number (so I am 664).
  • Put the last two digits of your birth year before
    this number. Thus, the last five digits of my
    Illinois driver's license are 8-8664. My complete
    number is W200-5508-8664. As the website author
    points out, because Illinois has no overflow numbers,
    multiple people are likely to have the same
    number printed on their license.
  • COMAP 343
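Putting the pieces together, here is a Python sketch of the last five digits and the final grouping. The function names are hypothetical, and the 4-4-4 hyphen grouping is inferred from the complete number shown on this slide.

```python
def illinois_last_five(year_two_digits: int, month: int, day: int, female: bool) -> str:
    """Two-digit birth year followed by the three-digit day code."""
    day_code = (month - 1) * 31 + day  # every month is treated as having 31 days
    if female:
        day_code += 600
    return f"{year_two_digits:02d}{day_code:03d}"


def format_license(soundex_code: str, middle: str, last_five: str) -> str:
    """Group the twelve characters as SSSS-MMMY-YDDD."""
    return f"{soundex_code}-{middle}{last_five[0]}-{last_five[1:]}"


last_five = illinois_last_five(88, 3, 2, female=True)
print(last_five)                                 # 88664
print(format_license("W200", "550", last_five))  # W200-5508-8664
```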

39
Conclusions
  • We have seen several ways of identifying objects
    or people with numbers, the likelihood of
    error, and the ways error is caught with each one.
    As COMAP points out on p. 329, "Like many
    practices in the real world, historical
    accident and lack of knowledge about existing
    methods seem to be the explanation for having so
    many means of identification." We see this
    especially in the difference between driver's
    license numbers and SSNs, which were assigned
    prior to computers and the development of many of
    the systems used (like Soundex).