Entropy and programs using random numbers - PowerPoint PPT Presentation

Slides: 29
Provided by: richar219

1
Entropy and programs using random numbers
  • Introduction to entropy
  • Entropy and data compression
  • Predictability of random number generation
  • Entropy and system security
  • An easy but predictable method
  • Shuffling an array
  • Linux random bit generation devices
  • A programmed set of randomness tests
  • A program to generate random passwords

2
Introduction to entropy
  • Randomness is the property attributed to
    behaviour, activity or a sequence of numbers
    that lacks any apparent order. A system is
    deterministic if its subsequent behaviour can be
    derived from knowledge of its starting state
    and of the physical laws and programmed rules
    governing changes of state.
  • Programmed computers are inherently
    deterministic, because a program is a sequence of
    instructions with an intended (i.e.
    prespecified) result from given input. Test
    planning generally assumes a program which is
    deterministic.

3
Some quotations
  • "God doesn't play dice with the universe"
  • Albert Einstein.
  • "Random numbers should not be generated with a
    method chosen at random."
  • Donald Knuth.
  • "The total entropy of any isolated thermodynamic
    system tends to increase over time, approaching a
    maximum value."
  • The second law of thermodynamics.

4
Thermodynamic and Shannon entropy
  • Thermodynamic entropy is a measure of the amount
    of energy in a physical system that cannot be
    used to do work.
  • Entropy in information systems is related to
    entropy in thermodynamic systems, but the two are
    different concepts. The entropy rate of an
    information source (Shannon or information
    entropy) is the average number of bits needed to
    encode a symbol from this source.
  • If the information is very predictable, it is
    also very compressible, so fewer bits are needed.
    One practical way to estimate information entropy
    is to see how far a sequence of symbols can be
    compressed into a smaller file.
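  • For reference, the usual definition of Shannon
    entropy for a source whose symbols occur with
    probabilities p_i, with two worked cases. The
    formula is standard but is not given on the
    slide, and the two example distributions are
    chosen here purely for illustration.

```latex
H = -\sum_{i} p_i \log_2 p_i \quad \text{bits per symbol}

% 256 equally likely byte values, p_i = 1/256
H = -\sum_{i=1}^{256} \tfrac{1}{256} \log_2 \tfrac{1}{256} = 8 \ \text{bits per byte}

% two symbols with probabilities 0.9 and 0.1
H = -(0.9 \log_2 0.9 + 0.1 \log_2 0.1) \approx 0.469 \ \text{bits per symbol}
```

  • The first case is the most a byte stream can
    carry, so such data cannot be compressed. The
    second shows how a predictable source needs far
    less than one bit per symbol.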

5
Entropy and data compression
  • rich@saturn compress$ ls -l
  • total 36
  • -rwxr-xr-x 1 rich rich 9220 Apr 5 17:17 avltree
  • -rw-r--r-- 1 rich rich 6905 Apr 5 17:15 avltree.c
  • rich@saturn compress$ gzip *
  • rich@saturn compress$ ls -l
  • total 16
  • -rw-r--r-- 1 rich rich 2123 Apr 5 17:15 avltree.c.gz
  • -rwxr-xr-x 1 rich rich 4001 Apr 5 17:17 avltree.gz
  • In the above example:
  • 1. A machine code executable was reduced from
    9220 to 4001 bytes, to 43.39% of its original
    size.
  • 2. A 'C' source file used to compile the
    executable was reduced from 6905 to 2123 bytes,
    to 30.74% of its original size.

6
Entropy and system security
  • One of the most interesting uses of entropy
    within computing systems is for security
    purposes, because it is important that passwords
    and encryption keys should be unpredictable.
  • Some systems used to generate random numbers are
    themselves inherently deterministic. Before using
    a random number generator to generate keys or
    passwords, a security evaluation should be
    carried out:
  • a. Can an attacker predict anything about future
    states of the generator based upon knowledge of
    previous system states?
  • b. Does an attacker have any ability to influence
    these states?

7
Does true randomness exist?
  • We don't know. Some systems used to generate
    random numbers are inherently deterministic, yet
    are clearly good enough for security purposes,
    because minor changes in the input, which cannot
    be controlled beyond a known precision, result
    in big enough changes in the output. Suppose
    enough were known about:
  • the exact starting position and velocity of a
    six-sided die,
  • its aerodynamics and the resistance of the air,
  • the weight distribution of the die and its other
    properties,
  • the inclination and other properties of the
    surface on which the die lands.
  • Then it would be theoretically possible to
    compute which number would be on top when the die
    comes to rest.

8
How much entropy is needed?
  • The minimum amount will depend upon the kind of
    attacks on the system to be secured. Having much
    more than what is needed can improve system
    lifetime but in some cases reduces usability.
  • At one time IBM consultants recommended that
    master system keys and passwords be generated by
    the system manager using dice. The reason for
    this is that using a simple system meant that the
    manager could be as certain as possible about the
    means by which these keys were created and the
    conditions surrounding this event.
  • However, the increased performance of computers
    now requires more entropy within cryptographic
    keys than can easily be generated using dice.
    Systems that lock out attackers after a few wrong
    tries or which use multiple security factors are
    thought to need less entropy.
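  • The arithmetic behind the dice remark can be
    worked through as follows. The key sizes are
    illustrative assumptions, not figures from the
    slide: 56 bits is the DES key size, and 128 bits
    is a common modern symmetric key size.

```latex
\log_2 6 \approx 2.585 \ \text{bits of entropy per roll of a fair six-sided die}

\left\lceil \frac{56}{\log_2 6} \right\rceil = \lceil 21.66 \rceil = 22 \ \text{rolls for a 56-bit key}

\left\lceil \frac{128}{\log_2 6} \right\rceil = \lceil 49.52 \rceil = 50 \ \text{rolls for a 128-bit key}
```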

9
An easy but predictable method 1
  • The ANSI 'C' standard library header <stdlib.h>,
    also specified by POSIX, defines 2 functions and
    a constant for this purpose.
  • Function void srand(unsigned int seed) is used
    to seed the pseudorandom number generator.
  • Function int rand(void) generates a
    pseudorandom sequence of numbers in the range
    0 to RAND_MAX.
  • RAND_MAX is a system constant. The standard
    requires it to be at least 32767; with glibc on
    Linux it is 2^31 - 1.
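  • A minimal sketch of why this method is
    predictable: seeding with the same value always
    reproduces the same sequence. The helper name
    fill_sequence is hypothetical and used only for
    this illustration; srand() and rand() are the
    standard library functions described above.

```c
#include <stdlib.h>
#include <stddef.h>

/* Seed the library generator, then store the next n values of rand()
 * in out. fill_sequence is a hypothetical helper for this sketch. */
void fill_sequence(unsigned int seed, int *out, size_t n)
{
    srand(seed);                /* same seed => same starting state */
    for (size_t i = 0; i < n; i++)
        out[i] = rand();        /* each value lies in 0..RAND_MAX */
}
```

  • Calling fill_sequence() twice with the same seed
    fills both arrays with identical values, so
    anyone who learns or guesses the seed can
    reproduce every "random" number that follows.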

10
An easy but predictable method 2
11
An easy but predictable method 3
  • The POSIX 1003.1-2003 standard gives example
    implementations of rand() and srand() that keep
    the generator state in a single variable, next.
    From this simple implementation it is clear that
    the sequence generated will repeat whenever a
    value for next is repeated. If a 32 bit integer
    is used for next, the maximum possible sequence
    length is 2^32.
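  • The listing itself is not part of this
    transcript. The sketch below follows the
    well-known example from the standard as closely
    as it can be reproduced here, with two labelled
    changes: the functions are renamed my_rand and
    my_srand so they do not clash with the library,
    and next is declared as uint32_t rather than
    unsigned long so that the state is exactly 32
    bits wide, as the text assumes.

```c
#include <stdint.h>

/* Generator state. The standard's example uses unsigned long; uint32_t
 * is an assumption of this sketch to make the state exactly 32 bits. */
static uint32_t next = 1;

/* Linear congruential step; returns a value in 0..32767
 * (RAND_MAX is assumed to be 32767 in this example). */
int my_rand(void)
{
    next = next * 1103515245u + 12345u;     /* wraps modulo 2^32 */
    return (int)((next / 65536u) % 32768u); /* drop the weak low 16 bits */
}

/* Reset the state; the same seed always restarts the same sequence. */
void my_srand(unsigned int seed)
{
    next = (uint32_t)seed;
}
```

  • Because each output depends only on next, an
    attacker who recovers the 32-bit state once can
    compute every later value.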

12
Shuffling an array 1
  • Binary tree and quick sort algorithms are known
    to degrade to O(N^2) operations if the input is
    already sorted. But data can be shuffled with
    O(N) operations. So the risk of severely reduced
    performance can be traded for a small performance
    loss by shuffling the data before sorting it. The
    following program uses rand() and srand() to swap
    each element with a pseudo-randomly selected
    element. The time in seconds since the epoch
    (1 January 1970, as returned by time()) is used
    to seed the pseudo-random generator.
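  • The original program is not included in this
    transcript. As a stand-in, here is a minimal
    sketch of an O(N) shuffle using rand(). It uses
    the Fisher-Yates variant, which picks each swap
    partner only from the part of the array not yet
    fixed; that is a substitution made here, and the
    original may have chosen partners differently.
    The function name shuffle is hypothetical.

```c
#include <stdlib.h>
#include <stddef.h>

/* Fisher-Yates shuffle of an int array in O(n) swaps.
 * The caller is expected to seed the generator first, for example
 * with srand((unsigned int)time(NULL)). */
void shuffle(int *a, size_t n)
{
    if (n < 2)
        return;                 /* nothing to shuffle */
    for (size_t i = n - 1; i > 0; i--) {
        /* 0 <= j <= i; the modulo introduces a slight bias,
         * which is acceptable for this performance use case */
        size_t j = (size_t)rand() % (i + 1);
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

  • A shuffle seeded from the clock is fine for
    avoiding worst-case sort behaviour, but not for
    security: the seed can be guessed to within a
    few seconds.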

13
Shuffling an array 2
14
Shuffling an array 3
15
Linux random devices 1
  • To improve upon the obvious limitations of
    predictable pseudorandom number generators, the
    Linux operating system kernel provides 2 device
    files from which random bytes can be read. One
    of these, /dev/urandom, is fast and less
    cautious; the other, /dev/random, is slow and
    cautious.
  • The faster device draws on the same entropy pool
    as the slower device to reseed a pseudorandom
    sequence. The following description is from
    Linux documentation.

16
Linux random devices 2
  • RANDOM(4) Linux Programmer's Manual
  • NAME
  • random, urandom - kernel random number source
    devices
  • DESCRIPTION
  • The character special files /dev/random and
    /dev/urandom (present since Linux 1.3.30) provide
    an interface to the kernel's random number
    generator. File /dev/random has major device
    number 1 and minor device number 8. File
    /dev/urandom has major device number 1 and minor
    device number 9.
  • The random number generator gathers
    environmental noise from device drivers and other
    sources into an entropy pool. The generator also
    keeps an estimate of the number of bits of noise
    in the entropy pool. From this entropy pool
    random numbers are created.

17
Linux random devices 3
  • When read, the /dev/random device will only
    return random bytes within the estimated number
    of bits of noise in the entropy pool. /dev/random
    should be suitable for uses that need very high
    quality randomness such as one-time pad or key
    generation. When the entropy pool is empty, reads
    from /dev/random will block until additional
    environmental noise is gathered.
  • When read, /dev/urandom device will return as
    many bytes as are requested. As a result, if
    there is not sufficient entropy in the entropy
    pool, the returned values are theoretically
    vulnerable to a cryptographic attack on the
    algorithms used by the driver. Knowledge of how
    to do this is not available in the current
    non-classified literature, but it is
    theoretically possible that such an attack may
    exist. If this is a concern in your application,
    use /dev/random instead.

18
Linux random devices 4
  • rich@saturn random$ cat /dev/random > random
  • (CTRL-C was pressed after counting to 30 seconds)
  • rich@saturn random$ cat /dev/urandom > urandom
  • (CTRL-C was pressed after counting to 30 seconds)
  • rich@saturn pwgen$ ls -l
  • total 593736
  • -rw-r--r-- 1 rich rich 520 Apr 5 18:19 random
  • -rw-r--r-- 1 rich rich 94748672 Apr 5 18:19 urandom
  • 520 bytes were generated by the cautious entropy
    device in about the same time as 94 MBytes were
    generated by the fast device.
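  • A program can read these devices like ordinary
    files. The following is a minimal sketch, not
    code from the presentation; the function name
    read_urandom is hypothetical, and only standard
    stdio calls are used.

```c
#include <stdio.h>
#include <stddef.h>

/* Fill buf with n bytes read from /dev/urandom.
 * Returns 0 on success, -1 if the device cannot be opened or read. */
int read_urandom(unsigned char *buf, size_t n)
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return -1;
    size_t got = fread(buf, 1, n, f);   /* number of bytes actually read */
    fclose(f);
    return got == n ? 0 : -1;
}
```

  • Note that stdio may read more bytes from the
    device than were asked for, because it fills its
    own buffer. Substituting /dev/random for the path
    would give the cautious behaviour shown above, at
    the cost of reads that may block.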

19
A randomness test program 1
  • Programs can be downloaded from the Internet
    which will test a source of random numbers using
    various methods to determine their statistical
    properties. The following site provides
    information about some of these methods and a
    program called ent, which provides a set of these
    tests
  • http://www.fourmilab.ch/random/

20
A randomness test program 2
  • The ent program is also available as an Ubuntu
    package, so it was installed using the command
  • sudo aptitude install ent
  • About 38 MB of random data was then generated
    using
  • cat /dev/urandom > randata
  • and pressing <ctrl> and <c> after about 10
    seconds.

21
A randomness test program 3
  • rich@saturn:~/devel/ent$ ent randata
  • Entropy = 7.999995 bits per byte.
  • Optimum compression would reduce the size
    of this 37867520 byte file by 0 percent.
  • Chi square distribution for 37867520 samples is
    259.05, and randomly would exceed this value
    50.00 percent of the times.
  • Arithmetic mean value of data bytes is 127.5217
    (127.5 = random).
  • Monte Carlo value for Pi is 3.141374304 (error
    0.01 percent).
  • Serial correlation coefficient is -0.000110
    (totally uncorrelated = 0.0).
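  • Two of these figures are simple to compute by
    hand. The sketch below is not ent's source code;
    it is an illustration of the arithmetic mean and
    of the chi-square statistic against a uniform
    distribution over the 256 byte values. The
    function names are hypothetical.

```c
#include <stddef.h>

/* Arithmetic mean of the byte values; uniformly random bytes
 * should give a value close to 127.5. */
double byte_mean(const unsigned char *data, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    return n ? sum / (double)n : 0.0;
}

/* Chi-square statistic: sum over all 256 byte values of
 * (observed - expected)^2 / expected, where expected = n / 256. */
double byte_chi_square(const unsigned char *data, size_t n)
{
    size_t count[256] = {0};
    double chi = 0.0;
    if (n == 0)
        return 0.0;
    double expected = (double)n / 256.0;
    for (size_t i = 0; i < n; i++)
        count[data[i]]++;               /* histogram of byte values */
    for (int v = 0; v < 256; v++) {
        double d = (double)count[v] - expected;
        chi += d * d / expected;
    }
    return chi;
}
```

  • With 256 possible values there are 255 degrees
    of freedom, so a statistic near 255 is what
    uniformly random bytes are expected to produce.
    A value thousands of times larger indicates a
    strongly non-uniform byte distribution.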

22
A randomness test program 4
  • rich@saturn:~/devel/ent$ ent /usr/share/dict/words
  • Entropy = 4.422962 bits per byte.
  • Optimum compression would reduce the size
    of this 931467 byte file by 44 percent.
  • Chi square distribution for 931467 samples is
    13250024.68, and randomly would exceed this value
    0.01 percent of the times.
  • Arithmetic mean value of data bytes is 95.1313
    (127.5 = random).
  • Monte Carlo value for Pi is 3.999098194 (error
    27.30 percent).
  • Serial correlation coefficient is -0.136842
    (totally uncorrelated = 0.0).
  • Clearly, the spelling dictionary wasn't as random
    as the output of /dev/urandom. Source code for a
    program which computes the Monte Carlo value for
    Pi is available in the older HTML notes.
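  • The program from the older notes is not
    reproduced here. As an illustration only, the
    sketch below follows the approach described in
    the ent documentation, as far as it is recalled
    here: each group of six bytes is treated as a
    pair of 24-bit X and Y coordinates, and the
    fraction of points falling inside a quarter
    circle estimates Pi/4. The function name is
    hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Estimate Pi from a byte stream: every 6 bytes give one point (x, y)
 * with 24-bit coordinates; points with x^2 + y^2 <= r^2 are inside the
 * quarter circle of radius r = 2^24 - 1. */
double monte_carlo_pi(const unsigned char *data, size_t n)
{
    const uint64_t r = 0xFFFFFFu;       /* largest 24-bit value */
    size_t points = n / 6;              /* trailing bytes are ignored */
    size_t inside = 0;
    for (size_t p = 0; p < points; p++) {
        const unsigned char *b = data + 6 * p;
        uint64_t x = ((uint64_t)b[0] << 16) | ((uint64_t)b[1] << 8) | b[2];
        uint64_t y = ((uint64_t)b[3] << 16) | ((uint64_t)b[4] << 8) | b[5];
        if (x * x + y * y <= r * r)
            inside++;
    }
    return points ? 4.0 * (double)inside / (double)points : 0.0;
}
```

  • This also suggests why the dictionary scores
    almost exactly 4: plain ASCII text has byte
    values below 128, so nearly every point lands
    close to the origin and inside the circle.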

23
Program generated passwords 1
  • Passwords chosen by humans often have too little
    entropy. When an individual is required to choose
    a password, one is often selected which an
    attacker would find very easy to guess. The
    advantage of getting the user to choose a
    password is that there is a better chance that
    the individual won't have to write it down. If
    the risk mitigated by use of a password is from a
    remote system attacker rather than a local one,
    it is better for a strong randomly-generated
    password to be written down than for a weak
    password to be used.
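  • The password program from the original slides is
    not included in this transcript. The following
    is a minimal sketch of one way to do it, under
    stated assumptions: bytes are taken from
    /dev/urandom, the alphabet is the 62 letters and
    digits (the original alphabet is not known), and
    the names ALPHABET and make_password are
    hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

/* Assumed alphabet for this sketch: 26 + 26 letters and 10 digits. */
static const char ALPHABET[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789";

/* Write len random characters plus a terminating '\0' into out,
 * which must hold at least len + 1 bytes.
 * Returns 0 on success, -1 if /dev/urandom cannot be read. */
int make_password(char *out, size_t len)
{
    const unsigned int k = (unsigned int)(sizeof ALPHABET - 1); /* 62 */
    const unsigned int limit = 256u - (256u % k);               /* 248 */
    FILE *f = fopen("/dev/urandom", "rb");
    size_t i = 0;

    if (f == NULL)
        return -1;
    while (i < len) {
        int c = fgetc(f);
        if (c == EOF) {
            fclose(f);
            return -1;
        }
        /* Reject bytes 248..255 so every character is equally likely
         * (avoids the bias of a plain modulo). */
        if ((unsigned int)c < limit)
            out[i++] = ALPHABET[(unsigned int)c % k];
    }
    out[len] = '\0';
    fclose(f);
    return 0;
}
```

  • With this 62-character alphabet each character
    carries log2(62), about 5.95 bits, so an
    8-character password has roughly 47.6 bits of
    entropy.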

24
Program generated passwords 2
25
Program generated passwords 3
26
Program generated passwords 4
27
Program generated passwords 5
28
Program output
  • How many passwords ?
  • 8
  • Gdh6sWH3
  • cpUXETpc
  • zVpj6an8
  • fjQhfM9S
  • VagGz3rF
  • tJhAgJGm
  • 6XVqReQ2
  • WxQA5mxx