NATO Consultation, Command and Control Agency

About This Presentation

Slides: 66

Provided by: DY

Transcript and Presenter's Notes

1
NATO Consultation, Command and Control Agency

COMMUNICATIONS INFORMATION
SYSTEMS
Decreasing Bit Pollution through Sequence
Reduction

Dr. Davras Yavuz yavuz_at_nc3a.nato.int
2

You will find this presentation and the
accompanying paper at
www.nc3a.info/MCC2006
from where both can be viewed and/or
downloaded
(the four other NC3A presentations can also
be found at the above URL)

3
Terminology

Sequence Reduction
Originates with Peribit (2000), founder's
Ph.D. on genome mapping; uses the term
Molecular Sequence Reduction (MSR) -
Biomedical Informatics, Stanford University
Bit Pollution
Link/network pollution: repetition of
redundant digital sequences over transmission
media (especially significant for
mobile/deployed networks/links)
Other related terms: WAN optimizer,
Application Accelerator/Optimizer or
Application Controller-Optimizer, Performance
Enhancement Proxies (PEP), WAN Expanders,
Latency (delay) removers/compensators/mitigators,
etc.
New, dynamic field: many terms will
continue to appear and coalesce; some will
catch on, others will disappear

4
Terminology

Next Generation Compression, Bit Pollution
Reduction, Sequence Reduction (latter
Peribit/Dr. Amit Singh)
WAN Expander (WX), WAN Optimizer, WAN
Optimization Controller (WOC)
(Juniper/Peribit)
Application Accelerator/Optimizer/Controller-Optimizer
Latency Remover/Optimizer (replace Latency by
Delay)
Especially for networks with SATCOM links
In general: use of a priori knowledge of the data
comms protocols required by the application to
optimize the data input/output
Combinations of the above
Unfortunately all present implementations are
proprietary
Unrealistic to expect standards soon;
technology too new and lucrative

5
Why Bit Pollution?

Most of us deal daily with various electronic
files/information
Taking MS Office as an example: Word, PPT,
Excel, Project, HTML, Access, ... files
and/or many other electronic files,
databases, forms, etc.
On many occasions we make small changes and
send them back and/or forward to others

Repetitive traffic over communication links can,
in general, be classified broadly into 3
categories:
1) Application protocol overheads
2) Commonly used words, phrases, strings,
objects (logos, images, audio clips, etc.)
3) Process flows (database updates/views,
forms, templates, etc. going back and forth)
6
SEQUENCE REDUCTION: Next Generation Compression
- Examples

256 Kbps satellite link
20 Mbytes PPT file (48 slides) sent 1st
time: 12 minutes (700 secs)
6 of the slides modified, file size change
< 0.5 Mbytes
Modified file sent 6 hours later: time
taken 8 secs
Same modified file sent 24 hours later: 18
secs
Sent 7 days later: 24 secs
Original file sent 7 days later: 14 secs
Similar results for Word, Excel files and
web pages
Less but still significant improvement for PDF
files
Smallest improvement for zipped files
(reduction by 2.5 to 3)
Amount of new files in between repetitions and
SR RAM/HD capacities have a strong effect
on the duration of repeat transmissions
(dynamic library updates)
Above results based on Peribit SRs:
German MOD, Syracuse University Real World
Labs (Network Computing, Nov 2004) and NC3A
GE MOD results based on operational
traffic, others on test traffic
Ref 6 of paper: "Record for throughput was
60Mbps through a T1. It came about when copying
1.5GB file twice!"
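
The timings on this slide can be put in perspective with a little arithmetic. The sketch below is not from the presentation. It assumes 1 Mbyte = 10**6 bytes, 1 Kbps = 1000 bps, no protocol overhead, and that the modified file is still about 20 Mbytes; the function names are made up for illustration.

```python
def transfer_seconds(size_bytes: float, link_bps: float) -> float:
    """Ideal time to push size_bytes over a link of link_bps (no overhead)."""
    return size_bytes * 8 / link_bps


def effective_throughput_bps(size_bytes: float, elapsed_s: float) -> float:
    """Apparent throughput seen by the user when a file arrives in elapsed_s."""
    return size_bytes * 8 / elapsed_s


LINK_BPS = 256_000        # 256 Kbps satellite link (from the slide)
FILE_BYTES = 20_000_000   # 20 Mbytes PPT file (assumed decimal megabytes)

# Ideal first transmission: 625 s, close to the ~700 s reported on the slide.
print("ideal first send:", transfer_seconds(FILE_BYTES, LINK_BPS), "s")

# Repeat transmissions reported on the slide, in seconds.
for label, secs in (("6 hours later", 8), ("24 hours later", 18),
                    ("7 days later", 24), ("original, 7 days later", 14)):
    mbps = effective_throughput_bps(FILE_BYTES, secs) / 1e6
    print(f"{label}: {secs} s, apparent throughput {mbps:.1f} Mbps, "
          f"{700 / secs:.0f}x faster than the first send")
```

The apparent throughput exceeds the physical link rate because most of the file is never retransmitted; only references to previously seen data cross the link.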

7
Mobile/Tactical Comms Divergence

Fixed communications: WANs with all users/nodes
fixed
Fiber-optic/photonic revolution: essentially
unlimited capacity is now possible/available
if/when a cable can be installed
Mobile comms: networks with mobile/deployable
users
No technological revolution similar to
photonic foreseen
Radio propagation will be the limiting factor
Mainstay will be radio: tactical LOS
tens/hundreds of Kbps, BLOS (rough terrain,
long distances) a few Kbps
Star-wars scenarios: moving laser beams ???
LEO satellites will provide some 100s of
Kbps, at a cost
Divergence will continue
Another factor: input into the five senses is
on the order of 100 Shannon/entropy bps
For transmission: redundancy x 10, i.e. on the
order of 1 Kbps

Therefore we must treat mobile/tactical comms
differently
NATO UNCLASSIFIED
8
Deployable, Mobile, On-the-Move Communications

At least one end of a link moving/deployed
Networks which have nodes/users
moving/deployed
Such links/networks essential for
survivability and rapid reaction
Will be taking on increasingly more critical
tasks
Present approach: use applications developed
for fixed links/networks for deployed/mobile
units
Must consider the very different characteristics
of such networks when choosing applications
Can we measure information, so we can
determine performance of links/networks in
terms of information transported, not just
bits/bytes?

9
Can we measure information? Yes we can!

Shannon defined the concept of entropy,
a logarithmic measure, in the 1940s (while
working on cryptography); it has stood the test
of time
First suggestion of a log measure was Hartley's
(base 10), but Shannon used the idea to
develop a complete theory of information and
communication
Shannon preferred log2 and called the
unit bits
Base e is also sometimes used (nats)
The smaller the probability of occurrence of
an event, the higher the information delivered
when it occurs

10
[Figure labels: Si, Rj; discrete, countable. C. E. Shannon (BSTJ 1948)]
11
Entropy
Entropy (H) in the case of two
possibilities/events/symbols: prob. of one is
p, the other q = 1 - p
$H = -(p \log p + q \log q)$
[Figure: H versus p plotted]
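
The two-symbol entropy formula is easy to evaluate. The sketch below is not part of the slides; it uses base-2 logs so that H is in bits, and the function name is illustrative. It tabulates the points a plot of H versus p would pass through.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a two-symbol source with probabilities p and 1 - p."""
    if p in (0.0, 1.0):
        return 0.0  # limit of x*log(x) as x -> 0 is 0: a certain event carries no information
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))


# H is symmetric about p = 0.5, where it peaks at 1 bit.
for p in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"p = {p:4.2f}   H = {binary_entropy(p):.3f} bits")
```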
12

Let us take a Natural Language, English,
as an example
English has 26 letters (characters)
Space as a delimiter
TOTAL 27 characters (symbols)
One could include punctuation, special
characters, etc., for example we could use
the full 256 ASCII symbol set -
methodology is the same
Extension to other natural languages readily
made
Extension to images also possible (same
methodology)

Structure of a Natural Language - English
Defined by many characteristics: grammar,
semantics, etymology, usage, ..., historical
developments, ...
Until the early 70s there was substantial
belief that Natural Languages and
computer programming languages (finite
automata instructions) had similarities
Noam Chomsky's work (Professor at MIT)
completely destroyed those expectations
Natural Languages can be studied through
probabilistic (Markov) models
Shannon's approach (1940s, no computers; Bell
Labs staff flipped through many pages of
books to get the probabilities)
He was actually working on cryptography and
made important contributions in that area
also

Various Markov model examples belong here;
skipped for continuity, they may be found at
the end

Zipf's Law: Principle of Least Effort
George Kingsley Zipf, Professor of Linguistics,
Harvard (1902-1950)
If the words in a language are ordered
(ranked) from the most frequently used
down, the probability $P_n$ of the nth word
in this list is $P_n \approx 0.1 / n$
Implies a maximum vocabulary size of 12366
words, since $\sum 1/n$ is not finite
when summed from 1 to $\infty$
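
The vocabulary bound follows from requiring the word probabilities to sum to at most 1. The sketch below is not from the slides and the function name is made up. It takes the Zipf constant 0.1 from the slide and adds up 0.1/n until the next word would push the total probability past 1.

```python
def zipf_max_vocabulary(c: float = 0.1) -> int:
    """Largest N such that sum_{n=1..N} c/n does not exceed 1."""
    total = 0.0
    n = 0
    while True:
        p_next = c / (n + 1)          # probability of the next-ranked word
        if total + p_next > 1.0:      # adding it would exceed total probability 1
            return n
        total += p_next
        n += 1


print(zipf_max_vocabulary())  # 12366
```

Equivalently, 12366 is the largest N for which the harmonic sum 1 + 1/2 + ... + 1/N stays below 10.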

For details of the above see DY, IEEE
Transactions on Information Theory, September
1974. Many other applications of Zipf's
Law exist; if interested, just make a
Google/Internet search
16
Zipf's Law (Principle of Least Effort)
million words, various texts
From Symbols, Signals and Noise, J. R. Pierce
17
Entropy bits/character - English
Amazingly it turns out to be about the same
for most Natural Languages for which the
analysis has been done (Arabic, French,
German, Hebrew, Latin, Spanish, Turkish, ...).
These languages also follow Zipf's Law.
18
Entropy of Natural Languages

Between 1 and 2 bits per letter/character
1.5 bits per letter is commonly used
English has 4.5 letters per word on
average
4.5 x 1.5 = 6.75, or 7 bits per word on
average

Normal speech: 1 - 2 words per second. Hence
information per second: 5 bits
19
Extension to Images

Same concept and definitions
Letters replaced by pixels/groups of pixels,
etc.
Words could be analogous to sets of pixels,
objects
The numbers are much larger
E.g. 400 x 600 = 240000 pixel image with
each pixel capable of taking on one of 16
brightness levels
$16^{240000}$ possible images
Assume all these images are equally likely (*)
Probability of one of these images is
$1/16^{240000}$ and the information provided by
that image is $240000 \log_2 16 = 0.96 \times 10^6$ bits
A real image contains much less
information: adjacent/nearby pixels are not
independent of each other
Movies: frame to frame, only small/incremental
changes

(*) equally likely assumption clearly not
realistic
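
The image figure can be reproduced directly. The sketch below is not from the slides and the function name is illustrative. Under the slide's (unrealistic) assumption that every image is equally likely, the information is just the pixel count times the bits needed per pixel.

```python
import math


def uniform_image_bits(width: int, height: int, levels: int) -> float:
    """Information in bits of one image when all levels**(width*height) images are equally likely."""
    return width * height * math.log2(levels)


bits = uniform_image_bits(400, 600, 16)
print(bits)            # 960000.0, i.e. 0.96 x 10^6 bits
print(bits / 8 / 1e6)  # the same quantity in Mbytes: 0.12
```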
20
Speech Coding
5 b/s is the irreducible information content; x
by 10 to introduce redundancy - therefore
we should be able to communicate speech
information at ~50 bps
Examples of speech coding we use:
64000 bps, 32000 bps PC
16000 bps CVSD
2400 bps LPC, MELP
1200, 600 bps MELP
All of the above are waveform codecs; they will
also convey non-measurable (intangible) information
Speech codec technology (recognition at transmitter
and synthesis at receiver) could
conceivably go lower than 600 bps but
would not contain the intangible component!
21

A QUICK REFRESHER ON CONVENTIONAL
COMPRESSION
May be found at the end

22
SEQUENCE REDUCTION: Next Generation Compression

Dictionary based: implements a learning
algorithm
Dynamically learns the language of the
communications traffic and translates it into
short-hand
Continuously updates/improves knowledge of
link language
Frequent patterns move up in the dictionary,
infrequent patterns move down and eventually
can age out
No fixed packet or window boundaries
Unlike e.g. LZ, which generally uses a 2048
byte window
Once a pattern is learned and put in the
dictionary it will be compressed wherever it
appears
Data compression is based on previously seen
data
Performance improves with time as learning
increases
Very quickly at first (10-20 minutes) and
then slowly
When a new application comes in, SR adapts
to its language
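
The idea of a dictionary that persists across transmissions can be sketched with a toy model. This is not Peribit's MSR algorithm, which is proprietary. The class `SequenceReducer`, the fixed 64-byte chunk size and the 4-byte token cost are all assumptions made for illustration. Real sequence reducers detect patterns of any size at any offset and age out old entries, which this sketch does not attempt.

```python
CHUNK = 64  # bytes per dictionary entry; fixed-size chunks are a simplification


class SequenceReducer:
    """Toy link endpoint holding a pattern dictionary that survives between sends."""

    def __init__(self):
        self.ids = {}      # chunk bytes -> token id
        self.chunks = []   # token id -> chunk bytes

    def _learn(self, chunk):
        self.ids[chunk] = len(self.chunks)
        self.chunks.append(chunk)

    def encode(self, data):
        """Replace every chunk seen before with a short reference token."""
        tokens = []
        for i in range(0, len(data), CHUNK):
            chunk = data[i:i + CHUNK]
            if chunk in self.ids:
                tokens.append(("ref", self.ids[chunk]))
            else:
                tokens.append(("raw", chunk))
                self._learn(chunk)
        return tokens

    def decode(self, tokens):
        """Mirror of encode: learns raw chunks in the same order as the sender."""
        parts = []
        for kind, value in tokens:
            if kind == "ref":
                parts.append(self.chunks[value])
            else:
                parts.append(value)
                self._learn(value)
        return b"".join(parts)


def wire_bytes(tokens, ref_cost=4):
    """Bytes crossing the link, assuming each reference costs ref_cost bytes."""
    return sum(ref_cost if kind == "ref" else len(value) for kind, value in tokens)


# A 64000-byte "file" of 1000 distinct chunks, then a copy with 100 bytes changed.
original = b"".join(i.to_bytes(4, "big") * 16 for i in range(1000))
edited = bytearray(original)
edited[1000:1100] = b"\xff" * 100
modified = bytes(edited)

sender, receiver = SequenceReducer(), SequenceReducer()
for label, payload in (("first send", original), ("modified resend", modified)):
    tokens = sender.encode(payload)
    assert receiver.decode(tokens) == payload   # loss-less round trip
    print(label, wire_bytes(tokens), "bytes on the wire")
```

The first send learns the file and costs its full size; the modified resend costs only the changed chunks plus one small token per unchanged chunk. An in-place overwrite is used here because fixed chunk boundaries would not survive an insertion, one reason real products avoid fixed boundaries.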

23
MOLECULAR SEQUENCE REDUCTION
Relative positioning of statistical and
substitutional compression algorithms (from
Peribit, A. P. Singh)
24
Molecular Sequence reduction
www.Peribit.com
25
MSR Technology

Real time, high speed, low latency
Continuously learns and updates dictionary
Transparently operates on all traffic (optimized
for IP)
Eliminates patterns of any size, anywhere in
stream
Patent-pending technology

26
MSR Molecular Sequence Reduction

Next-gen dictionary-based compression

www.peribit.com
27
Government/Military use examples

Many thousands of units in use in USA
(mostly corporate but also government
agencies)
GE MOD using Peribit SRs (for the past 2 years)
INMARSAT German Navy WAN (encrypted)
Links to GE Navy ships in/around South
Africa
Satellite links to GE units in Afghanistan
Plans for some 64 Kbps landlines
GE MOD total 300 units
also other nations
Some with initial trials

28
29
From German MOD
30
From German MOD
Startup behavior example
31
From German MOD
32
From German MOD
33
From Peribit.com (not GE MOD data)
34
Peribit (screen capture) NC3A WAN (NL BE)

EFFECTIVE WAN CAPACITY INCREASED BY 2.80x; DATA
REDUCTION BY 64.34%
NO DATA COMPRESSION: NO REDUCTION
WITH DATA COMPRESSION: REDUCTION !!!
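
The two numbers in this screen capture are consistent with each other if the data reduction figure is read as a percentage (an assumption, since the unit did not survive in the transcript). If a fraction r of the traffic is removed, the same link carries 1 / (1 - r) times as much user data. A minimal check, with an illustrative function name:

```python
def capacity_multiplier(reduction_fraction: float) -> float:
    """Effective capacity gain when reduction_fraction of the traffic is removed."""
    if not 0.0 <= reduction_fraction < 1.0:
        raise ValueError("reduction_fraction must be in [0, 1)")
    return 1.0 / (1.0 - reduction_fraction)


print(round(capacity_multiplier(0.6434), 2))  # 2.8
```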
35
36
Peribit Sequence Reducers
www.peribit.com
37
NC3A TEST RESULT SUMMARY: Expand Model 4800
WAN Link Accelerators
512 kbps satellite link Multiplexed TCP/IP
Link with SCPS-TP acceleration
Link with application accelerator IP data
compressor
Un-accelerated link
38
NC3A TEST RESULT SUMMARY
512 kbps satellite link Multiplexed TCP/IP
Link with SCPS-TP acceleration
Link with application accelerator IP data
compressor
Un-accelerated link
39
512 Kbps satellite link, 10 multiplexed
TCP/IP sessions
Link with SCPS-TP acceleration
Link with application accelerator IP data
compressor
Un-accelerated link
40
Packeteer
41
Industry

New area, but an increasing number of
companies
Peribit.com (now Juniper Networks)
Expand.com (Expand Networks)
Packeteer.com
Riverbed.com
Silver-peak.com
..
National authorities (e.g. USA, GE) are also
working with industry to incorporate SR/WX
technology into national crypto devices

42
SEQUENCE REDUCTION: Next Generation
Compression - Summary (1)

WANs will form the backbone of Network Enabled
Operation
This technology provides significant
improvements in capacity
Dictionary based: implements a learning
algorithm
Dynamically learns the language of the
communications traffic and translates it into
short-hand
Continuously updates/improves knowledge of
link language
Frequent patterns move up in the dictionary,
infrequent patterns move down and eventually
can age out
No fixed packet or window boundaries
Unlike conventional compression, which operates
over 1-2 Kbytes
Once a pattern is learned and put in the
dictionary it will be compressed wherever it
appears
Data compression is based on previously seen
data
Performance improves with time as learning
increases
Very quickly at first (10-20 minutes) and
then slowly
When a new application comes in, SR adapts
to its language

43
SEQUENCE REDUCTION: Next Generation
Compression - Summary (2)

Significant advantages for WANs where
capacity is an issue (i.e.
deployed/mobile/tactical)
Removes redundant/repetitive transmissions
Packet-flow acceleration (latency removal)
can be easily added
Quality of Service and Policy Based Multipath
can also be implemented
Does not impact security implementations
(cryptos between SRs)
However:
Presently available only from a few sources, each
with its own proprietary technology

44
Conclusions

Shannon's Information Theory provides tools
for measuring information as entropy
Has formed the basis for most of the
coding and data transmission/detection results
since the 1950s
DNA/genome mapping process has also
apparently benefited from it
In the 90s the estimate for the human genome was
20-30 years; it took 2-3 years with the
computational developments of the late 90s
A new form of compression, Sequence Reduction,
provides significant reductions by reducing
redundancies in transmitted data
Will provide important advantages for
mobile/deployable/moving WAN link applications

Questions
Comments

This presentation and the associated paper can be
found at www.nc3a.info/MCC2006
46
NC3A
47

Markov model examples

48
Zeroth approximation to English (zero
memory): zero order Markov, equally likely
letters, 27 numbers

AZEWRTZYNSADXESYJRQY_WGECIJJ_OB
_KRBQPOZB_YMBUAWVLBTQCNIKFMP_KMVUUGBSAXHLHSIE_MAUL
EXJ_NATSKI

All logs base 2. Entropy
$H = \sum_{i=1}^{27} p_i \log (1/p_i) = \log 27 = 4.75$
bits / letter (or symbol)
49
First approximation to English (zero
memory): zero order Markov, letter
probabilities, 27 numbers

AI_NGAE__ITF__NR_ASAEV_OIE_BAINTHHHYROO_POER_SETRY
GAIETRWCO__ EHDUARU_ EU_C_FT_NSREM_DIY_EESE_
F_O_SRIS_R __UNNASHOR_CIE_AT_XEOIT_UTKLOOUL_E

Entropy $H = \sum_{i=1}^{27} p_i \log (1/p_i) \approx 4$
bits / letter
50
Second approximation to English (memory):
first order Markov, e.g. prob(aa),
prob(ba), prob(ca), ...; 27 x 27 = 729
numbers, some zero
URTESHETHING_AD_E AT_FOULE_ ITHALIORT_WACT_D_STE_M
INTSAN_OLINS__TWID_OULY_TE_THIGHE_CO_YS_TH_HR_
UPAVIDE_PAD_CTAVED_QUES_E
Entropy $H = \sum_{i,k} p_{i,k} \log (1/p_{i|k})$,
summed over the 729 (= 27 x 27) pairs,
$\approx 3.3$ bits / letter
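
The zeroth, first and second approximations can be reproduced on any sample text. The sketch below is not Shannon's procedure or data. The sample string is an arbitrary stand-in taken from this presentation's own wording, all function names are made up, and a sample this short gives entropy estimates far below the values quoted on the slides. It shows the three entropy formulas and a first-order Markov text generator over the 27-symbol alphabet.

```python
import math
import random
from collections import Counter, defaultdict


def normalize(text):
    """Map text onto 27 symbols: A-Z plus '_' standing for the space delimiter."""
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            out.append(ch)
        elif out and out[-1] != "_":
            out.append("_")   # collapse any run of non-letters into one delimiter
    return "".join(out)


def letter_entropy(text):
    """Zero-memory estimate: sum of p_i * log2(1 / p_i) over observed symbols."""
    counts, n = Counter(text), len(text)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def conditional_entropy(text):
    """First-order Markov estimate: sum of p(k,i) * log2(1 / p(i|k)), i following k."""
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    n = len(text) - 1
    return sum((c / n) * math.log2(firsts[k] / c) for (k, _i), c in pairs.items())


def generate(text, length, seed=0):
    """Sample symbols so that each one depends only on its predecessor."""
    rng = random.Random(seed)
    followers = defaultdict(list)
    for k, i in zip(text, text[1:]):
        followers[k].append(i)
    current = rng.choice(text)
    out = [current]
    while len(out) < length:
        options = followers.get(current)
        current = rng.choice(options) if options else rng.choice(text)
        out.append(current)
    return "".join(out)


SAMPLE = normalize(
    "Dynamically learns the language of the communications traffic and "
    "translates it into short-hand. Frequent patterns move up in the "
    "dictionary, infrequent patterns move down and eventually can age out."
)

print("equally likely letters:", round(math.log2(27), 2), "bits / letter")
print("letter probabilities:  ", round(letter_entropy(SAMPLE), 2), "bits / letter")
print("first order Markov:    ", round(conditional_entropy(SAMPLE), 2), "bits / letter")
print(generate(SAMPLE, 80))
```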
51
Third approximation to English (memory):
second order Markov, e.g. prob(aaa),
prob(aab), prob(aac), ...,
prob(zzy), prob(zzz); 27 x 27 x 27 =
19683 numbers, 75% zero (Shannon calls these
di-gram probabilities)
IANKS _CAN_OU_ANG_RLER_THATTED _OF_TO_SHOR_OF_TO_H
AVEMEM_A_I_MAND_AND_BUT_WHISSITABLY_THERVEREER_EIG
HTS_TAKILLIS_TA_KIND_AL
Entropy $\approx 3$ bits / letter
52
Third approximation to French
JOU_MOUPLAS_DE_MONNERNAISSAINS_DEME_US_VREH_BRETU_
DE_TOUCHEUR_DIMMERE_LLES_MAR_ELAME_RE_A_VER_IL_DOU
VENTS_SO_FUITE
N. Abramson, Information Theory and Coding
53
Third approximation to ????
ET_LIGERCUM_SITECI_LIBEMUS_ACERELEN_TE_VICAESCERUM
_PE_NON_SUM_MINUS_UTERNE_UT_IN_ARION_POPOMIN_SE_IN
QUENEQUE_IRA
N. Abramson, Information Theory and Coding
54
WE COULD CONTINUE THIS WITH CONDITIONAL
PROBABILITIES GIVEN TRIPLETS (tri-grams),
QUADRUPLETS (tetra-grams), n-grams, etc.
(i.e. mth ORDER MARKOV SOURCES, $m \geq 3$).
HOWEVER, THIS BECOMES IMPRACTICAL AS THE
NUMBER OF JOINT PROBABILITIES BECOMES TOO LARGE
- SO SHANNON JUMPED TO MARKOV SOURCES WITH
WORDS AS SYMBOLS - symbol set no longer 27
characters, but thousands of words. However, an
m = 1, 2 Markov model gives much better results than
n-gram analysis as n is increased
55
Fourth approximation to English: zero order
Markov with words, i.e. probability of
words, zero memory
REPRESENTING AND SPEEDILY IS AN GOOD APT OR COME
CAN DIFFERENT NATURAL HERE HE THE A IN CAME THE
TO OF TO EXPERT GRAY COME TO FURNISHES THE LINE
MESSAGE HAD BE THESE
Entropy $\approx 2.2$ bits / letter (using Zipf's
Law)
(Shannon 1948)
56
Fifth approximation to English (memory):
first order Markov with words, e.g.
Probability(word_i | word_j)
THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH
WRITER THAT THE CHARACTER OF THIS POINT IS
THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE
TIME OF WHO EVER TOLD THE PROBLEM FOR AN

(Shannon 1948)
57
Fifth approximation to Turkish (memory):
first order Markov with words, e.g.
Probability(word_i | word_j)
BIR ANLATTIKLARINA GULMECE YAZDI YAPITLARININ
SARAP BIÇIMLERI BELA GÖRUNUMU GIBI AMA BIR ETMEK
YOK TUTULDU GELEN GIDEN YER KALMADI ...

58

A QUICK REFRESHER ON CONVENTIONAL
COMPRESSION

59
Conventional Compression

Lossy Compression
Not necessarily a copy of the input: most
audio, image, video compression algorithms are
lossy, since our ears and eyes have resolution
thresholds
Loss-less Compression
Data integrity is essential in digital data
communications: network compression must be
loss-less
Two basic approaches:
Statistical compression algorithms
Substitutional compression algorithms

Statistical compression: probabilities of
characters in the input data are calculated (or
given) - frequently occurring characters are
encoded into fewer bits, e.g. Huffman code,
Morse code
Static coding: once the coding is
determined in accordance with the probabilities
of occurrence it does not change
Dynamic coding: coding changes with
context - for example, the occurrence of
q in English increases the probability of
occurrence of u to 1; similarly the
occurrence of th significantly increases
the probability of occurrence of e, etc.
As the amount of historical context
information increases, dynamic coding
techniques can approach the Shannon limit;
however, computational requirements increase
exponentially, making them impractical for
real-time/on-line applications
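
As an illustration of static statistical coding, the sketch below builds a Huffman code from symbol counts, so that frequent symbols get shorter bit strings. It is a minimal textbook construction, not taken from the presentation; the function name and the sample string are illustrative.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Return {symbol: bit string} for a static Huffman code built from counts in text."""
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}   # a lone symbol still needs one bit
    # Heap entries: (weight, unique tie-breaker, {symbol: code built so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)   # two least probable subtrees
        w2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


message = "AAAABBC"
code = huffman_code(message)
encoded = "".join(code[s] for s in message)
print(code)
print(len(encoded), "bits instead of", 8 * len(message), "bits as 8-bit characters")
```

The code is static in the sense of the slide: once derived from the counts it does not change with context.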

Substitutional compression: identifies
repeated strings of characters (the longer the
better) and replaces them with reference
identifiers or tokens (the shorter the better) -
at the receiver the tokens are de-referenced
and the reverse substitution is performed
Essentially a form of pattern recognition and
classification
Pattern detection/recognition is generally much
faster than the computations needed for dynamic
coding algorithms
Most network compression techniques in use today
use substitutional compression
Compression techniques can also be combined,
for example substitution based compression
followed by static coding, etc.

Substitution based compression is the basis
of almost all network compression implementations
Principle of all: replace repeated patterns
with shorter tokens
Different techniques for detecting/encoding
repeated patterns
Two basic approaches:
Lempel-Ziv (LZ) stateless window
compression
e.g. v.42bis, fax compression, LZS (STAC)
Predictor compression
Tries to predict the next input byte: the
matching algorithm looks for the most recent
match of any pattern rather than the best and
longest match - higher speed but misses many
significant pattern repetitions, therefore
lower data reduction (not much used)

63
Lempel-Ziv (LZ) stateless window compression

Published in 1977 (hence LZ77)
Basis of all loss-less data compression
implementations today
Repeated strings replaced by pointers to
the previous location where the string had
occurred
Buffer or window required for the
historical information to be available for
reference: typically 1000-2000 bytes
(mostly 2048 bytes)
All previous data outside the buffer/window is
lost or forgotten, hence the name stateless
or memory-less
Can find and compress only patterns that are
repeated within the window: repetitions
separated by more than the window size are
ignored
Poor scalability: for compression efficiency a
large window size is required, but this increases
pattern search computation significantly
Good for file compression type applications
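
The window limitation can be demonstrated with a deliberately naive LZ77-style coder. The sketch below is a brute-force illustration, not any product's implementation; the function names, token format and 3-byte minimum match are assumptions. A repeated string is replaced by a (distance, length) pointer only if its earlier occurrence is still inside the window.

```python
def lz77_compress(data: bytes, window: int = 2048, min_match: int = 3):
    """Return tokens: ('lit', byte) or ('ref', distance, length)."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        for j in range(max(0, i - window), i):   # search only inside the window
            length = 0
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            tokens.append(("ref", best_dist, best_len))
            i += best_len
        else:
            tokens.append(("lit", data[i]))
            i += 1
    return tokens


def lz77_decompress(tokens) -> bytes:
    out = bytearray()
    for token in tokens:
        if token[0] == "lit":
            out.append(token[1])
        else:
            _, dist, length = token
            for _ in range(length):
                out.append(out[-dist])   # copy byte by byte so overlapping matches work
    return bytes(out)


pattern = b"HELLOWORLD"
filler = bytes(range(256))            # 256 bytes with no repeats of length >= 3
data = pattern + filler + pattern     # the repeat is 266 bytes after the original

for window in (2048, 64):
    tokens = lz77_compress(data, window=window)
    assert lz77_decompress(tokens) == data
    refs = sum(1 for t in tokens if t[0] == "ref")
    print(f"window {window}: {len(tokens)} tokens, {refs} back-reference(s)")
```

With the 2048-byte window the second copy of the pattern becomes a single pointer; with the 64-byte window the earlier copy has already been forgotten and the pattern is sent again as literals.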

64
65
Nov 1978, University of Pennsylvania, Museum
Hall, Banquet in honor of Claude E. Shannon
receiving the H. Pender award (Prof. F. Haber
and DY)
