1
IT-101
Introduction to Information Technology
  • Week 4

2
Overview
  • Chapter 3.5.3: Introduction to error detection
    and correction
  • Parity checking
  • Repetition coding
  • Redundancy checking
  • Chapter 4: Protocols
  • Chapter 7: Compressing Information
  • Why can information be compressed?
  • Variable length coding
  • Universal coding

3
Error Detection and Correction
  • Many factors can lead to errors during
    transmission or storage of bits.
  • When binary information is transmitted across a
    physical channel such as wire, coax cable,
    optical fiber, or air, there is always the
    possibility that some bits will be received in
    error, due to interference from other sources,
    signal attenuation over long distances, or
    events like rain or snow.
  • When binary information is stored on some form of
    media such as magnetic disks or CDs, there is the
    possibility that some bits will be read in error
    due to smudged or scratched disks.

4
  • Clever code construction or additional
    information added to the data can increase the
    odds of the information being retrieved
    correctly. This is called error control coding.
    We will discuss three methods:
  • Parity checking (error detection only)
  • Repetition coding (error detection and
    correction)
  • Redundancy code word checking (error detection
    and correction)
  • Adding redundancy to a code increases the number
    of bits that need to be transmitted/stored, but
    leads to the detection of errors and a
    significant decrease in errors during retrieval
    of information.

5
  • Just about any system that uses digital
    information employs some form of error detection
    and/or correction. Recall that a major advantage
    of digital was the ability to detect/correct
    errors.
  • For example, information on a CD is encoded to
    allow the CD player to detect and correct errors
    that might arise due to smudged or scratched
    disks.
  • Almost all digital communication systems employ
    some form of error control coding to correctly
    decode information that has been corrupted by the
    channel during transmission.

6
Parity Checking
  • A simple method for error detection can be
    accomplished by appending an extra bit called a
    parity bit at the end of the code.
  • Parity checking allows for the detection of
    errors, but not correction.
  • Even parity is when the parity bit is set so that
    the total number of 1s in the word is even
  • 11 → 110
  • 10 → 101
  • Odd parity is when the parity bit is set so that
    the total number of 1s in the word is odd
  • 11 → 111
  • 10 → 100
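
A minimal Python sketch of parity generation (the
function add_parity is ours, not from the text):

```python
def add_parity(bits: str, even: bool = True) -> str:
    """Append a parity bit so the total number of 1s is even (or odd)."""
    ones = bits.count("1")
    if even:
        parity = "0" if ones % 2 == 0 else "1"
    else:
        parity = "1" if ones % 2 == 0 else "0"
    return bits + parity

print(add_parity("11"))              # even parity: 110
print(add_parity("10"))              # even parity: 101
print(add_parity("11", even=False))  # odd parity:  111
print(add_parity("10", even=False))  # odd parity:  100
```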

7
Parity Checking
  • Even parity is set, and 10 0 is received
  • An error is present, but we don't know which bit
    is in error
  • Parity checking can detect errors, but can't
    correct them
  • Odd parity is set; 1111011 0 is received.
    Error?
  • Even parity is set; 1111011 0 is received.
    Error?
  • Parity checking has a major disadvantage: it
    cannot detect an even number of errors. E.g.,
    0011 1001 would go undetected even if the
    original transmission was 0011 1111, since there
    are 2 errors.

8
Repetition coding
  • Repetition coding is the provision of extra bits
    which are repetitions of the original data to
    ensure the information is received correctly.
  • Each bit in the original data is repeated a
    certain number of times.
  • Repetition coding allows for the detection and
    correction of errors
  • For example, in 3-bit repetition coding:
  • Original data: 1 0 0 1
  • Transmitted: 111 000 000 111 (each bit is
    repeated 3 times)
  • Received: 011 000 001 111
  • Errors in the first and third groups are detected
  • Errors in the first and third groups can be
    corrected by taking the majority vote within each
    group

9
Redundancy Checking
  • Redundancy checking is a more complex error
    control coding scheme than the previous two
    methods.
  • It uses parity checking in an interleaved way.
  • Redundancy checking allows for the detection and
    correction of errors
  • Redundancy check coding may be accomplished in a
    number of ways; only one approach will be
    discussed.

10
Redundancy Check
  • Symbols are given a parity bit
  • Total word is given a redundancy check
  • The first bit of each symbol in the word is given
    parity, then the second bit of each symbol in the
    word is given parity
  • The additional parity symbol is given its own
    parity and then appended to the transmitted
    information

11
Redundancy check coding
Symbols
  • Information to be sent: 00 01 10 11
  • First, each 2-bit symbol is given an even parity
    bit:
  • 00 → 000   01 → 011   10 → 101   11 → 110
  • Next, the first bits of each symbol are given odd
    parity:
  • 0 0 1 1 → odd parity bit 1

12
  • Next, the second bits of each symbol are given
    odd parity:
  • 0 1 0 1 → odd parity bit 1
  • Next, the parity bits are given even parity:
  • Parity bits 1 1 → even parity bit 0
  • The resulting redundancy check codeword (110) is
    appended at the end of the bit stream.
  • Transmitted: 000 011 101 110 110 (the final 110
    is the redundancy check codeword)
13
Redundancy Check Error
  • Original bit stream: 00 01 10 11
  • Coded and transmitted bit stream: 000 011 101
    110 110
  • Received bit stream: 000 111 101 110 110
  • Even parity tells us that the second symbol has
    an error
  • Comparing odd parity with the first bit in each
    symbol shows us that the first bit in the second
    symbol should be a 0
  • Comparing odd parity with the second bit in each
    symbol shows us that everything is OK
  • So, the error is detected and can be corrected.
    Since we know that the first bit in the second
    symbol should be 0 instead of 1, it is
    corrected, resulting in the decoded bit stream at
    the receiver:
  • 00 01 10 11
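
A minimal Python sketch of the encoding side
(helper names like encode_redundancy are ours,
assuming the parity conventions used above):

```python
def even_parity(bits: str) -> str:
    return str(bits.count("1") % 2)

def odd_parity(bits: str) -> str:
    return str(1 - bits.count("1") % 2)

def encode_redundancy(symbols: list[str]) -> list[str]:
    """Encode 2-bit symbols: even parity per symbol, then a check
    codeword built from odd parity over each bit position."""
    coded = [s + even_parity(s) for s in symbols]        # per-symbol parity
    p1 = odd_parity("".join(s[0] for s in symbols))      # first bits
    p2 = odd_parity("".join(s[1] for s in symbols))      # second bits
    return coded + [p1 + p2 + even_parity(p1 + p2)]      # check codeword

print(encode_redundancy(["00", "01", "10", "11"]))
# ['000', '011', '101', '110', '110']
```

Decoding compares the received parities the same
way to locate, and then flip, the bad bit.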

14
In-class examples
  • The following binary stream uses 2-bit repetition
    coding to help detect errors. Find the erroneous
    2-bit symbol:
  • 001100100011
  • If an even-parity bit has been added to the end
    of each 3-bit symbol for error detection, which
    of the following symbols has been received in
    error?
  • 0110
  • 0010
  • 0011
  • Generate the redundancy check codeword for the
    following stream of 2-bit symbols:
  • 10 00 11 10
  • Find and correct the error in the following bit
    stream that is terminated in a redundancy check
    codeword:
  • 000 101 010 101

15
Next topic: Ch. 4 Protocols
  • Objectives:
  • Protocols are common to human as well as digital
    worlds
  • Sample protocols for transmission, storage, data
    processing

16
Protocols
Agreed upon sets of rules that provide order to
different systems and situations.
  • Believe it or not, you use protocols every day
  • When you see a red, octagonal sign, you _____
  • When you pick up the telephone when it rings, you
    say _______
  • You know to wait in line at the DMV
  • You understand how to mail a letter
  • Protocols give structure and provide rules, but
    they aren't based on anything other than human
    convention, agreement, and understanding

17
Protocols are a vital component of IT
Interoperability requires sets of rules for
communicating between various devices.
  • What type of connector or voltage level should be
    used by a device?
  • How can information be formatted in a standard
    manner?
  • Where in a bit stream do you begin?
  • Which bits comprise the destination address?
  • How can a document include bold, italics,
    different font sizes, etc.?

Without agreed-upon formats, we'd be drowning in
a sea of 1s and 0s.
010100100101001011110101010110010101010101010
18
IT Protocols
To achieve interoperability, digital systems must
organize, manage, and interpret information
according to a set of widely accepted standards.
  • We've already discussed many IT standards.
  • Internet addresses have 32 bits.
  • The byte is commonly accepted as the smallest
    division of information for storage and
    manipulation.
  • ASCII is the standard protocol for alphanumeric
    data
  • IT is built on protocols
  • Hypertext Markup Language (HTML)
  • Hypertext Transfer Protocol (HTTP)
  • Simple Mail Transfer Protocol (SMTP)
  • Internet Protocol (IP)
  • And just about everything else we will talk about

19
Who Sets Standards and Protocols?
  • Technology Consortiums
  • Internet Engineering Task Force (IETF)
  • World Wide Web Consortium (W3C)
  • International Organizations
  • International Telecommunication Union (ITU)
  • International Organization for Standardization
    (ISO)
  • National Organizations
  • American National Standards Institute (ANSI)
  • Professional Organizations
  • Institute of Electrical and Electronic Engineers
    (IEEE)
  • Companies: Microsoft, Cisco, 3Com, and others

20
Protocol example
  • How do we know where the bit stream begins?
  • Start bits and stop bits (these are protocols)
  • Flags, as used in Ethernet and High-Level Data
    Link Control (HDLC)

21
HDLC Protocol Challenges
  • A protocol procedure like the HDLC flag byte
    would fail if that byte also occurred somewhere
    in the content bit stream.
  • Recall that the flag byte is 01111110 ('~' in
    ASCII)
  • Furthermore, because the bit stream is read on a
    bit-by-bit basis, this pattern could appear under
    other circumstances.
  • To fix this problem, a rule had to be
    incorporated into the protocol:
  • Transmitter: whenever you have five 1s in a row,
    insert an extra 0
  • Receiver: whenever you receive five 1s in a row,
    followed by a 0, discard the 0
  • Note that this only happens in the data (or
    content) field AFTER the flag byte is
    transmitted.
  • This procedure is known as bit stuffing, or zero
    bit insertion

22
Bit Stuffing Example
Original data: 0110 1111 1111 1111 1111 0010
Transmitted data: 0110 1111 1011 1110 1111 1010 010
(a 0 is inserted after each run of five 1s)
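
A minimal Python sketch of both directions
(function names are ours):

```python
def stuff(bits: str) -> str:
    """Insert a 0 after every run of five consecutive 1s."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            out.append("0")
            run = 0
    return "".join(out)

def unstuff(bits: str) -> str:
    """Discard the 0 that follows every run of five consecutive 1s."""
    out, run, skip = [], 0, False
    for b in bits:
        if skip:            # this bit is the stuffed 0 -- drop it
            skip, run = False, 0
            continue
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            skip = True
    return "".join(out)

data = "011011111111111111110010"      # the slide's original data
sent = stuff(data)
print(sent)                            # 011011111011111011111010010
assert unstuff(sent) == data
```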
23
Next Topic
  • Chapter 7: Compressing Information
  • Why can information be compressed?
  • Variable length coding
  • Universal coding
  • Huffman coding

24
Compressing Information
"World Wide Web", not "World Wide Wait"
  • Compression techniques can significantly reduce
    the bandwidth and memory required for sending,
    receiving, and storing data.
  • Most computers are equipped with modems that
    compress or decompress all information leaving or
    entering via the phone line.
  • With a mutually recognized system (e.g., WinZip),
    the amount of data can be significantly
    diminished.
  • Examples of compression techniques:
  • Compressing BINARY DATA STREAMS:
  • Variable length coding (e.g., Huffman coding)
  • Universal coding (e.g., WinZip)
  • IMAGE-SPECIFIC COMPRESSION:
  • GIF and JPEG
  • VIDEO COMPRESSION:
  • MPEG

25
Why can we compress information?
REDUNDANCY
  • Compression is possible because information
    usually contains redundancies, or information
    that is often repeated.
  • For example, two still images from a video
    sequence of images are often similar. This fact
    can be exploited by transmitting only the changes
    from one image to the next.
  • For example, a line of data often contains
    redundancies (see the sentence below).
  • File compression programs remove this redundancy.

Ask not what your country can do for you - ask
what you can do for your country.
26
FREQUENCY
  • Some characters/events occur more frequently than
    others.
  • It's possible to represent frequently occurring
    characters with a smaller number of bits during
    transmission.
  • This may be accomplished by a variable length
    code, as opposed to a fixed length code like
    ASCII.
  • An example of a simple variable length code is
    Morse code.
  • E occurs more frequently than Z, so we
    represent E with a shorter code:

E = .    T = -    Z = --..    Q = --.-
27
Information Theory
  • Variable length coding exploits the fact that
    some symbols occur more frequently than
    others.
  • The mathematical theory behind this concept is
    known as INFORMATION THEORY
  • Claude E. Shannon developed modern Information
    Theory at Bell Labs in 1948.
  • He saw the relationship between the probability
    of appearance of a transmitted signal and its
    information content.
  • This realization enabled the development of
    techniques that could achieve compression.

28
A Little Probability
  • Shannon found that information can be related to
    probability.
  • An event has a probability of 1 (or 100%) if we
    believe it will occur.
  • An event has a probability of 0 (or 0%) if we
    believe it will not occur.
  • The probability that an event will occur takes on
    values anywhere from 0 to 1.
  • Consider a coin toss: heads and tails each have a
    probability of .50
  • In two tosses, the probability of tossing two
    heads is
  • 1/2 x 1/2 = 1/4, or .25
  • In three tosses, the probability of tossing all
    tails is
  • 1/2 x 1/2 x 1/2 = 1/8, or .125
  • We compute probability this way because the
    result of each toss is independent of the
    results of the other tosses.
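
The same arithmetic as a snippet:

```python
# Independent events multiply: P(A and B) = P(A) * P(B)
p = 0.5            # fair coin: P(heads) = P(tails) = .50
print(p * p)       # two heads in two tosses  -> 0.25
print(p * p * p)   # three tails in three tosses -> 0.125
```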

29
Example from the text:
A MEN'S SPECIALTY STORE
  • The probability of male patrons is .8
  • The probability of female patrons is .2
  • Assume for this example that patrons enter the
    store in groups of two. Calculate the
    probabilities of the different pairings:
  • Event A, Male-Male: P(MM) = .8 x .8 = .64
  • Event B, Male-Female: P(MF) = .8 x .2 = .16
  • Event C, Female-Male: P(FM) = .2 x .8 = .16
  • Event D, Female-Female: P(FF) = .2 x .2 = .04
  • We could assign the longest codes to the most
    infrequent events while maintaining unique
    decodability.

30
Example (cont..)
  • Let's assign a unique string of bits to each
    event based on the probability of that event
    occurring:
  • Event   Name            Code
    A       Male-Male       0
    B       Male-Female     10
    C       Female-Male     110
    D       Female-Female   111
  • Given a received code of 01010110100, determine
    the events (a decoding sketch follows below)
  • The above example uses a variable length code.
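
A sketch of decoding this prefix code greedily, bit
by bit (the decode helper is ours):

```python
CODE = {"0": "A", "10": "B", "110": "C", "111": "D"}  # code -> event

def decode(bits: str) -> str:
    """Greedy prefix-code decoding: grow a buffer until it matches a
    code, emit the event, and start over. Works because no code is a
    prefix of another."""
    events, buf = [], ""
    for b in bits:
        buf += b
        if buf in CODE:
            events.append(CODE[buf])
            buf = ""
    return "".join(events)

print(decode("01010110100"))  # ABBCBA
```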

31
Variable Length Coding
Takes advantage of the probabilistic nature of
information.
  • Unlike fixed length codes, variable length codes
  • Assign the longest codes to the most infrequent
    events.
  • Assign the shortest codes to the most frequent
    events.
  • Each code must be uniquely identifiable
    regardless of length.
  • Examples of Variable Length Coding
  • Morse Code
  • Huffman Coding

If we have total uncertainty about the
information we are conveying, fixed length codes
are preferred.
32
Morse Code
  • Characters represented by patterns of dots and
    dashes.
  • More frequently used letters use short code
    symbols.
  • Short pauses are used to separate the letters.
  • Represent "Hello" using Morse code:
  • H = ....
  • E = .
  • L = .-..
  • L = .-..
  • O = ---
  • Hello = .... . .-.. .-.. ---
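
The same lookup as a two-line snippet (the MORSE
table here covers only the letters needed):

```python
MORSE = {"H": "....", "E": ".", "L": ".-..", "O": "---"}  # subset for this example
print(" ".join(MORSE[c] for c in "HELLO"))  # .... . .-.. .-.. ---
```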

33
Huffman Coding
A procedure for finding the optimum, uniquely
decodable, variable length code associated with a
set of events, given their probabilities of
occurrence.
  • Huffman coding is a variable length coding scheme
    because it assigns the shortest codes to the most
    frequently occurring events, and longest codes to
    the least frequent events
  • You must know the probability of occurrence of
    events beforehand
  • To determine the assignment of codes to events, a
    Huffman code tree needs to be constructed
  • To decode a Huffman encoded bit stream, you would
    need to have the code table that was generated by
    the Huffman code tree

34
Constructing a Huffman Code Tree
  • First, list all events in descending order of
    probability.
  • Pair the two events with the lowest probabilities
    and add their probabilities.

[Tree diagram: the events in descending order of
probability are A (.3), B (.3), C (.13), D (.12),
E (.1), F (.05). E and F, the two least probable,
are paired into a node with probability 0.15.]
35
Constructing a Huffman Code Tree
  • Repeat for the pair with the next lowest
    probabilities.

[Tree diagram: C (.13) and D (.12) are paired into
a node with probability 0.25. Remaining nodes:
A (.3), B (.3), 0.25, 0.15.]
36
Constructing a Huffman Code Tree
  • Repeat for the pair with the next lowest
    probabilities.

[Tree diagram: the 0.25 and 0.15 nodes are paired
into a node with probability 0.4. Remaining nodes:
A (.3), B (.3), 0.4.]
37
Constructing a Huffman Code Tree
  • Repeat for the pair with the next lowest
    probabilities.

[Tree diagram: A (.3) and B (.3) are paired into a
node with probability 0.6. Remaining nodes: 0.6,
0.4.]
38
Constructing a Huffman Code Tree
  • Repeat for the last pair and add 0s to the left
    branches and 1s to the right branches.

[Tree diagram: the 0.6 and 0.4 nodes are joined at
the root (probability 1.0). With 0s on the left
branches and 1s on the right branches, the
resulting codes are A = 00, B = 01, C = 100,
D = 101, E = 110, F = 111.]
39
  • Given the tree we just constructed, we can assign
    a unique code to each event (this is the Huffman
    code table):
  • Event A: 00    Event B: 01
  • Event C: 100   Event D: 101
  • Event E: 110   Event F: 111
  • How can you decode the string
    0000111010110001000000111?
  • Starting from the leftmost bit, find the shortest
    bit pattern that matches one of the codes in the
    table. The first bit is 0, but we don't have an
    event represented by 0. We do have one
    represented by 00, which is event A. Continue
    applying this procedure.
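
For reference, here is a minimal Python sketch (our
own, not from the text) of the same merge procedure
using a heap. Tie-breaking may yield different bit
patterns than the hand-drawn tree, but the code
lengths come out the same:

```python
import heapq
from itertools import count

def huffman_codes(probs: dict[str, float]) -> dict[str, str]:
    """Merge the two least probable nodes repeatedly, prepending a
    branch bit at each merge, until one tree remains."""
    ties = count()  # tie-breaker so the heap never compares dicts
    heap = [(p, next(ties), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)   # least probable node
        p2, _, b = heapq.heappop(heap)   # next least probable node
        merged = {s: "0" + c for s, c in a.items()}
        merged.update({s: "1" + c for s, c in b.items()})
        heapq.heappush(heap, (p1 + p2, next(ties), merged))
    return heap[0][2]

table = huffman_codes({"A": .3, "B": .3, "C": .13, "D": .12, "E": .1, "F": .05})
print(table)
# {'F': '000', 'E': '001', 'D': '010', 'C': '011', 'A': '10', 'B': '11'}
# Different bits than the slides, but the same lengths: 2, 2, 3, 3, 3, 3.
```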

40
Exercise
  • Construct a Huffman code tree and code table
    given events A through H with the following
    probabilities:
  • A: 0.3, B: 0.2, C: 0.15, D: 0.10, E: 0.08,
    F: 0.07, G: 0.06, H: 0.04
  • Decode the following bit stream using this
    Huffman code tree:
  • 110100000000100101000010
  • (A worked solution appears at the end of these
    notes.)
41
Universal Coding
  • Huffman coding has its limits:
  • You must know a priori (beforehand) the
    probability of the characters or symbols you are
    encoding.
  • What if a document is one of a kind, i.e., you
    don't know the probabilities of its events?
  • Universal coding schemes do not require
    knowledge of the statistics (probabilities) of
    the events to be coded.
  • Universal coding is based on the realization that
    any stream of data contains some repetition.
  • Lempel-Ziv coding is one form of universal coding
    presented in the text.
  • Compression results from reusing frequently
    occurring strings.
  • It works better for long data streams and is
    inefficient for short strings.
  • It is used by WinZip to compress information. (A
    sketch follows below.)
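
As an illustration only: a simplified LZ78-style
sketch (our own, not necessarily the textbook's
exact formulation). Each output pair points back to
the longest previously seen string and appends one
new character:

```python
def lz78_compress(text: str) -> list[tuple[int, str]]:
    """Emit (dictionary index, next char) pairs; index 0 is the empty
    string. Longer repeats compress into single pairs."""
    dictionary = {"": 0}
    out, w = [], ""
    for ch in text:
        if w + ch in dictionary:
            w += ch                          # keep extending the match
        else:
            out.append((dictionary[w], ch))  # longest match + new char
            dictionary[w + ch] = len(dictionary)
            w = ""
    if w:                                    # flush a trailing match
        out.append((dictionary[w[:-1]], w[-1]))
    return out

print(lz78_compress("abababab"))
# [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'a'), (0, 'b')]
```

Eight characters become five pairs here; the gain
grows as longer strings repeat.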

42
An important distinction!
  • If, for example, you have 4 events/characters/
    symbols/things, such as A, B, C, and D, and you
    want to convert each of them to binary,
    traditionally what you would do is determine the
    number of bits required to encode 4 different
    things.
  • Since you have 4 different things, you would need
    to use 2 bits (2^2 = 4)
  • The assignment of codes could be A = 00, B = 01,
    C = 10, D = 11
  • This is a fixed length coding approach (just like
    ASCII), since 2 bits are used per code
  • Then, for example, if you wanted to send
    information such as AABCDBC, you would need to
    send 00 00 01 10 11 01 10 (see the snippet
    below)
  • Before sending this, however, you could apply
    compression techniques such as universal coding
    (WinZip) to reduce the amount of data that needs
    to be sent
  • These types of compression techniques are applied
    after the information has been converted to
    binary
  • Note that Huffman coding, by contrast, achieves
    compression during the binary conversion process
    itself
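
The fixed length encoding above as a snippet:

```python
FIXED = {"A": "00", "B": "01", "C": "10", "D": "11"}  # 2 bits since 2**2 = 4
print(" ".join(FIXED[ch] for ch in "AABCDBC"))
# 00 00 01 10 11 01 10
```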

43
Comments for next class
  • Please start reading chapters 10, 11 and 12 for
    next class

44
Exercise Solution
[Tree diagram: G (0.06) and H (0.04) pair to 0.10;
E (0.08) and F (0.07) pair to 0.15; D (0.10) and
the 0.10 node pair to 0.20; C (0.15) and the 0.15
node pair to 0.30; B (0.2) and the 0.20 node pair
to 0.40; A (0.3) and the 0.30 node pair to 0.60;
the 0.60 and 0.40 nodes join at the root (1.0).]
  • Resulting code table:
  • A = 00    B = 10    C = 010   D = 110
  • E = 0110  F = 0111  G = 1110  H = 1111
  • Decoding 110100000000100101000010 gives
    DBAAACCBAC