Title: NATO Consultation, Command and Control Agency
1NATO Consultation, Command and Control Agency
- COMMUNICATIONS AND INFORMATION SYSTEMS
- Decreasing Bit Pollution through Sequence Reduction
Dr. Davras Yavuz, yavuz_at_nc3a.nato.int
2- You will find this presentation and the accompanying paper at www.nc3a.info/MCC2006, from where both can be viewed and/or downloaded
- (the four other NC3A presentations can also be found at the above URL)
3Terminology
- Sequence Reduction
- Originates with Peribit (2000); the founder's Ph.D. on genome mapping (Biomedical Informatics, Stanford University) uses the term Molecular Sequence Reduction (MSR)
- Bit Pollution
- Link/network pollution: repetition of redundant digital sequences over transmission media (especially significant for mobile/deployed networks/links)
- Other related terms: WAN Optimizer, Application Accelerator/Optimizer or Application Controller-Optimizer, Performance Enhancement Proxies (PEP), WAN Expanders, latency (delay) removers/compensators/mitigators, etc.
- New, dynamic field: many terms will continue to appear and coalesce; some will catch on, others will disappear
4Terminology
- Next Generation Compression, Bit Pollution Reduction, Sequence Reduction (the latter from Peribit/Dr. Amit Singh)
- WAN Expander (WX), WAN Optimizer, WAN Optimization Controller (WOC) (Juniper/Peribit)
- Application Accelerator/Optimizer/Controller-Optimizer
- Latency Remover/Optimizer (or replace "Latency" by "Delay")
- Especially for networks with SATCOM links
- In general: use of a-priori knowledge of the data comms protocols required by the application to optimize the data input/output
- Combinations of the above
- Unfortunately all present implementations are proprietary
- Unrealistic to expect standards soon; the technology is too new and lucrative
5Why Bit Pollution?
- Most of us deal daily with various electronic files/information
- Taking MS Office as an example: Word, PPT, Excel, Project, HTML, Access, ... files
- and/or many other electronic files, databases, forms, etc.
- On many occasions we make small changes and send them back and/or forward to others
Repetitive traffic over communication links can, in general, be classified broadly into 3 categories:
1) Application protocol overheads
2) Commonly used words, phrases, strings, objects (logos, images, audio clips, etc.)
3) Process flows (database updates/views, forms, templates, etc. going back and forth)
6SEQUENCE REDUCTION: Next Generation Compression
- Examples
- 256 Kbps satellite link
- 20 Mbytes PPT file (48 slides) sent 1st time: 12 minutes (700 secs)
- 6 of the slides modified, file size change < 0.5 Mbytes
- Modified file sent 6 hours later: time taken 8 secs
- Same modified file sent 24 hours later: 18 secs
- Sent 7 days later: 24 secs
- Original file sent 7 days later: 14 secs
- Similar results for Word, Excel files and web pages
- Less but still significant improvement for PDF files
- Smallest improvement for zipped files (reduction by 2.5 to 3)
- The amount of new files in between repetitions and the SR RAM/HD capacities have a strong effect on the duration of repeat transmissions (dynamic library updates)
- Above results based on Peribit SRs: German MOD, Syracuse University Real World Labs (Network Computing, Nov 2004) and NC3A
- GE MOD results based on operational traffic, others on test traffic
- Ref 6 of paper: "Record for throughput was 60Mbps through a T1. It came about when copying 1.5GB file twice!"
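The timings on this slide can be turned into effective throughput figures. The sketch below is only back-of-the-envelope arithmetic under stated assumptions: 1 Mbyte is taken as 10**6 bytes, 1 Kbps as 1000 bit/s, and the modified file is treated as still being about 20 Mbytes. The names `LINK_BPS`, `FILE_BITS`, `effective_bps` and `speedup` are illustrative and do not come from the presentation.

```python
# Assumptions: 1 Mbyte = 10**6 bytes, 1 Kbps = 1000 bit/s; the modified file is
# treated as still about 20 Mbytes (the slide says it changed by < 0.5 Mbytes).
LINK_BPS = 256_000
FILE_BITS = 20 * 10**6 * 8

# Raw link time for the file with no protocol overhead at all.
ideal_seconds = FILE_BITS / LINK_BPS

# Transfer times quoted on the slide, in seconds.
observed_seconds = {
    "first send": 700,
    "modified file, 6 hours later": 8,
    "modified file, 24 hours later": 18,
    "modified file, 7 days later": 24,
    "original file, 7 days later": 14,
}


def effective_bps(seconds):
    """Application-level throughput implied by a measured transfer time."""
    return FILE_BITS / seconds


def speedup(seconds):
    """How many times faster a transfer was than the first send."""
    return observed_seconds["first send"] / seconds


print(f"ideal first-send time: {ideal_seconds:.0f} s")
for label, secs in observed_seconds.items():
    print(f"{label}: {effective_bps(secs) / 1e6:.2f} Mbps effective, "
          f"{speedup(secs):.1f}x faster than the first send")
```

Under these assumptions the first send is close to the raw link limit, while the repeat sends correspond to effective rates far above the 256 Kbps physical link, which is the point the slide is making.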
7Mobile/Tactical Comms Divergence
- Fixed communications: WANs with all users/nodes fixed
- Fiber-optic/photonic revolution: essentially unlimited capacity is now possible/available if/when a cable can be installed
- Mobile comms: networks with mobile/deployable users
- No technological revolution similar to photonic foreseen
- Radio propagation will be the limiting factor
- Mainstay will be radio: tactical LOS tens/hundreds of Kbps, BLOS (rough terrain, long distances) few Kbps
- Star-wars scenarios: moving laser beams ???
- LEO satellites will provide some 100s of Kbps at a cost
- Divergence will continue
- Another factor: input into the five senses is about 100 Shannon/entropy bps
- For transmission, redundancy x 10 gives about 1 Kbps
Therefore we must treat mobile/tactical comms differently
NATO UNCLASSIFIED
8Deployable, Mobile, On-the-Move Communications
- At least one end of a link moving/deployed
- Networks which have nodes/users moving/deployed
- Such links/networks essential for survivability and rapid reaction
- Will be taking on increasingly more critical tasks
- Present approach: use applications developed for fixed links/networks for deployed/mobile units
- Must consider the very different characteristics of such networks when choosing applications
- Can we measure information, so we can determine performance of links/networks in terms of information transported, not just bits/bytes?
9Can we measure information? Yes we can!
- Shannon defined the concept of entropy, a logarithmic measure, in the 1940s (while working on cryptography); it has stood the test of time
- First suggestion of a log measure was by Hartley (base 10), but Shannon used the idea to develop a complete theory of information and communication
- Shannon preferred log base 2 and called the unit bits
- Base e is also sometimes used (nats)
- The smaller the probability of occurrence of an event, the higher the information delivered when it occurs
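The logarithmic measure described on this slide can be illustrated with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the three functions differ only in the base of the logarithm (2 for bits, e for nats, 10 for Hartley's measure).

```python
import math


def information_bits(p):
    """Information, in bits, delivered by an event of probability p (0 < p <= 1)."""
    return -math.log2(p)


def information_nats(p):
    """Same quantity using the natural logarithm (nats)."""
    return -math.log(p)


def information_hartleys(p):
    """Same quantity using base 10, as in Hartley's original suggestion."""
    return -math.log10(p)


# Rarer events carry more information when they occur.
for p in (0.5, 0.125, 0.01):
    print(f"p = {p}: {information_bits(p):.3f} bits, "
          f"{information_nats(p):.3f} nats, "
          f"{information_hartleys(p):.3f} hartleys")
```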
10(Figure: symbols Si and Rj; discrete, countable. C. E. Shannon, BSTJ 1948)
11Entropy
Entropy (H) in the case of two possibilities/events/symbols: probability of one is $p$, of the other $q = 1 - p$
$H = -(p \log p + q \log q)$
(Figure: H versus p)
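The two-symbol entropy on this slide is easy to tabulate. The sketch below assumes base-2 logarithms (so H is in bits) and uses the usual convention that a term with probability zero contributes nothing; the function name `binary_entropy` is illustrative.

```python
import math


def binary_entropy(p):
    """H = -(p log2 p + q log2 q) with q = 1 - p, in bits per symbol."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))


# Tabulate H versus p; the maximum of 1 bit occurs at p = 0.5.
for p in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"p = {p:4.2f}  H = {binary_entropy(p):.4f} bits")
```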
12- Let us take a natural language, English, as an example
- English has 26 letters (characters)
- Space as a delimiter
- TOTAL: 27 characters (symbols)
- One could include punctuation, special characters, etc.; for example, we could use the full 256-symbol extended ASCII set; the methodology is the same
- Extension to other natural languages readily made
- Extension to images also possible (same methodology)
13- Structure of a natural language: English
- Defined by many characteristics: grammar, semantics, etymology, usage, ..., historical developments, ...
- Until the early 70s there was substantial belief that natural languages and computer programming languages (finite automata instructions) had similarities
- Noam Chomsky's work (Professor at MIT) completely destroyed those expectations
- Natural languages can be studied through probabilistic (Markov) models
- Shannon's approach (1940s, no computers; Bell Labs staff flipped through many pages of books to get the probabilities)
- He was actually working on cryptography and made important contributions in that area also
14- Various Markov model examples, skipped here for continuity, may be found at the end
15- Zipf's Law: Principle of Least Effort
- George Kingsley Zipf, Professor of Linguistics, Harvard (1902-1950)
- If the words in a language are ordered (ranked) from the most frequently used down, the probability $P_n$ of the $n$th word in this list is $P_n \approx 0.1 / n$
- Implies a maximum vocabulary size of 12366 words, since $\sum 1/n$ is not finite when summed from 1 to $\infty$, so the probabilities $0.1/n$ can add up to 1 only over a finite list
For details of the above see DY, IEEE Transactions on Information Theory, September 1974. Many other applications of Zipf's Law exist; if interested, just make a Google/Internet search
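The vocabulary limit quoted on this slide follows directly from requiring the Zipf probabilities 0.1/n to sum to no more than 1. The sketch below simply accumulates the terms until the next one would push the total past 1; the function name and the default constant `c=0.1` (taken from the slide's formula) are the only ingredients.

```python
def zipf_max_vocabulary(c=0.1):
    """Largest N such that the sum of c/n for n = 1..N does not exceed 1."""
    total = 0.0
    n = 0
    while True:
        next_total = total + c / (n + 1)
        if next_total > 1.0:
            return n
        total = next_total
        n += 1


print(zipf_max_vocabulary())  # vocabulary size at which the probabilities are used up
```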
16Zipf's Law (Principle of Least Effort)
(Figure: Zipf's Law plot; million words, various texts. From Symbols, Signals and Noise, J. R. Pierce)
17Entropy bits/character - English
Amazingly, it turns out to be about the same for most natural languages for which the analysis has been done (Arabic, French, German, Hebrew, Latin, Spanish, Turkish, ...). These languages also follow Zipf's Law.
18Entropy of Natural Languages
- Between 1 and 2 bits per letter/character
- 1.5 bits per letter is commonly used
- English has 4.5 letters per word on average
- 4.5 x 1.5 = 6.75, or about 7 bits per word on average
Normal speech: 1 - 2 words per second. Hence information per second: 5 bits
19Extension to Images
- Same concept and definitions
- Letters replaced by pixels/groups of pixels, etc.
- Words could be analogous to sets of pixels, objects
- The numbers are much larger
- E.g. a 400 x 600 = 240000 pixel image with each pixel capable of taking on one of 16 brightness levels
- $16^{240000}$ possible images
- Assume all these images are equally likely (*): the probability of one of these images is $1/16^{240000}$ and the information provided by that image is $240000 \log_2 16 = 0.96 \times 10^6$ bits
- A real image contains much less information: adjacent/nearby pixels are not independent of each other
- Movies: frame to frame only small/incremental changes
(*) equally likely assumption clearly not realistic
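The image figure on this slide is a one-line calculation. The sketch below reproduces it under the slide's own (admittedly unrealistic) assumption that every pixel is independent and every brightness level equally likely; the variable names are illustrative.

```python
import math

width, height = 400, 600
levels = 16  # brightness levels per pixel

pixels = width * height
bits_per_pixel = math.log2(levels)

# Upper bound on the information in one image: independent, equally likely pixels.
max_information_bits = pixels * bits_per_pixel

print(f"{pixels} pixels x {bits_per_pixel:.0f} bits = "
      f"{max_information_bits:.0f} bits ({max_information_bits / 1e6:.2f} x 10^6)")
```

Any dependence between neighbouring pixels lowers the true entropy below this bound, which is what image compression exploits.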
20Speech Coding
5 b/s is the irreducible information content; multiply by 10 to introduce redundancy. Therefore we should be able to communicate speech information at 50 bps.
Examples of speech coding we use: 64000 bps, 32000 bps PCM; 16000 bps CVSD; 2400 bps LPC, MELP; 1200, 600 bps MELP.
All of the above are waveform codecs; they will also convey non-measurable (intangible) information.
Speech codecs (recognition at transmitter and synthesis at receiver) could conceivably go lower than 600 bps but would not contain the intangible component!
21- A QUICK REFRESHER ON CONVENTIONAL COMPRESSION
- May be found at the end
22SEQUENCE REDUCTION: Next Generation Compression
- Dictionary based; implements a learning algorithm
- Dynamically learns the language of the communications traffic and translates it into short-hand
- Continuously updates/improves knowledge of link language
- Frequent patterns move up in the dictionary; infrequent patterns move down and eventually can age out
- No fixed packet or window boundaries
- Unlike e.g. LZ, which generally uses a 2048 byte window
- Once a pattern is learned and put in the dictionary, it will be compressed wherever it appears
- Data compression is based on previously seen data
- Performance improves with time as learning increases
- Very quickly at first (10 - 20 minutes) and then slowly
- When a new application comes in, SR adapts to its language
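The general idea of a persistent, learned dictionary shared by both ends of a link can be sketched in a few lines. This is a toy illustration only and not Peribit's proprietary algorithm: it uses fixed 64-byte chunks (the slide stresses that real sequence reduction has no fixed boundaries), never ages entries out, and charges a hypothetical 4 bytes per token plus 1 byte per item. `ToyReducer`, `CHUNK` and `wire_bytes` are invented names.

```python
import random

CHUNK = 64  # hypothetical fixed chunk size; real products are not chunk-aligned


class ToyReducer:
    """One end of the link; sender and receiver each keep an identical dictionary."""

    def __init__(self):
        self.token_of = {}  # chunk bytes -> token number
        self.chunk_of = []  # token number -> chunk bytes

    def _learn(self, chunk):
        self.token_of[chunk] = len(self.chunk_of)
        self.chunk_of.append(chunk)

    def encode(self, data):
        """Replace previously seen chunks by tokens, learn the new ones."""
        items = []
        for i in range(0, len(data), CHUNK):
            chunk = data[i:i + CHUNK]
            if chunk in self.token_of:
                items.append(("ref", self.token_of[chunk]))
            else:
                self._learn(chunk)
                items.append(("raw", chunk))
        return items

    def decode(self, items):
        """Rebuild the data, learning raw chunks in the same order as the sender."""
        parts = []
        for kind, value in items:
            if kind == "raw":
                self._learn(value)
                parts.append(value)
            else:
                parts.append(self.chunk_of[value])
        return b"".join(parts)


def wire_bytes(items):
    """Assumed cost on the link: 1 tag byte plus the chunk, or plus a 4-byte token."""
    return sum(1 + (len(value) if kind == "raw" else 4) for kind, value in items)


rng = random.Random(1)
original = bytes(rng.getrandbits(8) for _ in range(20_000))
modified = original[:6400] + b"X" * 64 + original[6464:]  # one chunk changed

sender, receiver = ToyReducer(), ToyReducer()
first = sender.encode(original)
first_ok = receiver.decode(first) == original
second = sender.encode(modified)
second_ok = receiver.decode(second) == modified

print("first send :", wire_bytes(first), "bytes on the link")
print("second send:", wire_bytes(second), "bytes on the link")
```

Because the dictionary persists between transfers, the second send costs only tokens plus the one changed chunk, which mirrors the repeat-transfer behaviour reported earlier in the presentation.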
23MOLECULAR SEQUENCE REDUCTION
Relative positioning of statistical and
substitutional compression algorithms (from
Peribit, A. P. Singh)
24Molecular Sequence reduction
www.Peribit.com
25MSR Technology
- Real time, high speed, low latency
- Continuously learns and updates dictionary
- Transparently operates on all traffic (optimized for IP)
- Eliminates patterns of any size, anywhere in stream
- Patent-pending technology
26MSR Molecular Sequence Reduction
- Next-gen dictionary-based compression
www.peribit.com
27Government/Military use examples
- Many thousands of units in use in USA (mostly corporate but also government agencies)
- GE MOD using Peribit SRs (for the past 2 years)
- INMARSAT German Navy WAN (encrypted)
- Links to GE Navy ships in/around South Africa
- Satellite links to GE units in Afghanistan
- Plans for some 64 Kbps landlines
- GE MOD total 300 units
- also other nations
- Some with initial trials
2893
29From German MOD
30From German MOD
Startup behavior example
31From German MOD
32From German MOD
33From Peribit.com (not GE MOD data)
34Peribit (screen capture) NC3A WAN (NL - BE)
EFFECTIVE WAN CAPACITY INCREASED BY 2.80x
DATA REDUCTION BY 64.34%
NO DATA COMPRESSION: NO REDUCTION
WITH DATA COMPRESSION: REDUCTION !!!
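The two figures in this screen capture are related by simple arithmetic, reading them as a 64.34 percent data reduction and a 2.80-fold capacity increase (an interpretation, but one the numbers support). If only a fraction 1 - r of the original bytes crosses the link, the same link carries 1 / (1 - r) times as much application data. The function names below are illustrative.

```python
def capacity_factor(reduction_percent):
    """Effective capacity multiplier implied by a given data reduction in percent."""
    return 1.0 / (1.0 - reduction_percent / 100.0)


def reduction_percent(factor):
    """Data reduction in percent implied by a given capacity multiplier."""
    return 100.0 * (1.0 - 1.0 / factor)


print(f"64.34 percent reduction -> {capacity_factor(64.34):.2f}x effective capacity")
print(f"2.80x effective capacity -> {reduction_percent(2.80):.2f} percent reduction")
```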
36Peribit Sequence Reducers
www.peribit.com
37NC3A TEST RESULT SUMMARY: Expand Model 4800 WAN Link Accelerators
512 kbps satellite link Multiplexed TCP/IP
Link with SCPS-TP acceleration
Link with application accelerator IP data
compressor
Un-accelerated link
38NC3A TEST RESULT SUMMARY
512 kbps satellite link Multiplexed TCP/IP
Link with SCPS-TP acceleration
Link with application accelerator IP data
compressor
Un-accelerated link
39 512 Kbps satellite link, 10 multiplexed TCP/IP sessions
Link with SCPS-TP acceleration
Link with application accelerator IP data
compressor
Un-accelerated link
40Packeteer
41Industry
- New area, but a growing number of companies
- Peribit.com (now Juniper Networks)
- Expand.com (Expand Networks)
- Packeteer.com
- Riverbed.com
- Silver-peak.com
- ...
- National authorities (e.g. USA, GE) also working with industry to incorporate SR/WX technology into national crypto devices
42SEQUENCE REDUCTION: Next Generation Compression, Summary (1)
- WANs will form the backbone of Network Enabled Operation
- This technology provides significant improvements in capacity
- Dictionary based; implements a learning algorithm
- Dynamically learns the language of the communications traffic and translates it into short-hand
- Continuously updates/improves knowledge of link language
- Frequent patterns move up in the dictionary; infrequent patterns move down and eventually can age out
- No fixed packet or window boundaries
- Unlike conventional compression, which operates over 1-2 Kbytes
- Once a pattern is learned and put in the dictionary, it will be compressed wherever it appears
- Data compression is based on previously seen data
- Performance improves with time as learning increases
- Very quickly at first (10 - 20 minutes) and then slowly
- When a new application comes in, SR adapts to its language
43SEQUENCE REDUCTION: Next Generation Compression, Summary (2)
- Significant advantages for WANs where capacity is an issue (i.e. deployed/mobile/tactical)
- Removes redundant/repetitive transmissions
- Packet-flow acceleration (latency removal) can be easily added
- Quality of Service and Policy Based Multipath can also be implemented
- Does not impact security implementations (cryptos between SRs)
- However
- Presently available from a few sources, each with its proprietary technology
44Conclusions
- Shannon Information Theory provides tools for measuring information as entropy
- Has formed the basis for most of the coding and data transmission/detection results since the 1950s
- DNA/genome mapping process has also apparently benefited from it
- In the 90s the estimate for the human genome was 20-30 years; it took 2-3 years with the computational developments of the late 90s
- A new form of compression, Sequence Reduction, provides significant reductions by reducing redundancies in transmitted data
- Will provide important advantages for mobile/deployable/moving WAN link applications
45This presentation and the associated paper can be found at www.nc3a.info/MCC2006
46NC3A
47 48Zeroth approximation to English (zero memory)
Zero order Markov: equally likely letters, 27 numbers
- AZEWRTZYNSADXESYJRQY_WGECIJJ_OB
_KRBQPOZB_YMBUAWVLBTQCNIKFMP_KMVUUGBSAXHLHSIE_MAUL
EXJ_NATSKI
All logs base 2. Entropy $= \sum_{i=1}^{27} p_i \log (1/p_i) = \log 27 \approx 4.75$ bits / letter (or symbol)
49First approximation to English (zero memory)
Zero order Markov: letter probabilities, 27 numbers
- AI_NGAE__ITF__NR_ASAEV_OIE_BAINTHHHYROO_POER_SETRY
GAIETRWCO__ EHDUARU_ EU_C_FT_NSREM_DIY_EESE_
F_O_SRIS_R __UNNASHOR_CIE_AT_XEOIT_UTKLOOUL_E
Entropy $= \sum_{i=1}^{27} p_i \log (1/p_i) \approx 4$ bits / letter
50Second approximation to English (memory)
First order Markov, e.g. prob(aa), prob(ba), prob(ca), ...; 27 x 27 = 729 numbers, some zero
URTESHETHING_AD_E AT_FOULE_ ITHALIORT_WACT_D_STE_M
INTSAN_OLINS__TWID_OULY_TE_THIGHE_CO_YS_TH_HR_
UPAVIDE_PAD_CTAVED_QUES_E
Entropy $= \sum_{i,k} p_{i,k} \log (1/p_{i|k}) \approx 3.3$ bits / letter, summed over the 729 (= 27 x 27) pairs
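The zero-memory and first-order (conditional) entropy formulas used on these slides can be estimated from any text sample. The sketch below is an illustration with invented function names; it uses base-2 logs and plain relative frequencies, and a short sample will give far less reliable numbers than the large counts behind the slide's figures.

```python
import math
from collections import Counter


def zero_memory_entropy(text):
    """Sum of p_i * log2(1/p_i) over the symbols of text, in bits per symbol."""
    counts = Counter(text)
    n = len(text)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def first_order_entropy(text):
    """Sum of p(k, i) * log2(1/p(i|k)) over adjacent symbol pairs (k followed by i)."""
    pair_counts = Counter(zip(text, text[1:]))
    predecessor_counts = Counter(text[:-1])
    total_pairs = len(text) - 1
    return sum(
        (c / total_pairs) * math.log2(predecessor_counts[k] / c)
        for (k, _i), c in pair_counts.items()
    )


# Equally likely 27-symbol alphabet: reproduces log2(27).
alphabet = "abcdefghijklmnopqrstuvwxyz_"
print(f"uniform 27 symbols: {zero_memory_entropy(alphabet):.2f} bits / symbol")

# A small sample with "_" standing for the space, as on the slides.
sample = "the_quick_brown_fox_jumps_over_the_lazy_dog_and_the_cat"
print(f"sample, zero memory: {zero_memory_entropy(sample):.2f} bits / symbol")
print(f"sample, first order: {first_order_entropy(sample):.2f} bits / symbol")
```

Conditioning on the previous symbol can only lower the estimate, which is the trend the successive approximations on these slides show.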
51Third approximation to English (memory)
Second order Markov, e.g. prob(aaa), prob(aab), prob(aac), ..., prob(zzy), prob(zzz); 27 x 27 x 27 = 19683 numbers, 75% zero (Shannon calls these di-gram probabilities)
IANKS _CAN_OU_ANG_RLER_THATTED _OF_TO_SHOR_OF_TO_H
AVEMEM_A_I_MAND_AND_BUT_WHISSITABLY_THERVEREER_EIG
HTS_TAKILLIS_TA_KIND_AL
Entropy 3 bits / letter
52Third approximation to French
JOU_MOUPLAS_DE_MONNERNAISSAINS_DEME_US_VREH_BRETU_
DE_TOUCHEUR_DIMMERE_LLES_MAR_ELAME_RE_A_VER_IL_DOU
VENTS_SO_FUITE
N. Abramson, Information Theory and Coding
53Third approximation to ????
ET_LIGERCUM_SITECI_LIBEMUS_ACERELEN_TE_VICAESCERUM
_PE_NON_SUM_MINUS_UTERNE_UT_IN_ARION_POPOMIN_SE_IN
QUENEQUE_IRA
N. Abramson, Information Theory and Coding
54WE COULD CONTINUE THIS WITH CONDITIONAL PROBABILITIES GIVEN TRIPLETS (tri-grams), QUADRUPLETS (tetra-grams), n-grams, etc. (i.e. mth ORDER MARKOV SOURCES, m ≥ 3)
HOWEVER, THIS BECOMES IMPRACTICAL AS THE NUMBER OF JOINT PROBABILITIES BECOMES TOO LARGE
- SO SHANNON JUMPED TO MARKOV SOURCES WITH WORDS AS SYMBOLS
- symbol set no longer 27 characters, but thousands of words. However, an m = 1, 2 Markov model with words gives much better results than n-gram analysis as n is increased
55Fourth approximation to English Zero order
Markov with words e.g. Probability of
words, zero memory
REPRESENTING AND SPEEDILY IS AN GOOD APT OR COME
CAN DIFFERENT NATURAL HERE HE THE A IN CAME THE
TO OF TO EXPERT GRAY COME TO FURNISHES THE LINE
MESSAGE HAD BE THESE
Entropy 2.2 bits / letter (using Zipf's Law)
(Shannon 1948)
56Fifth approximation to English (memory)
First order Markov with words e.g.
Probability (wordi wordj)
THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH
WRITER THAT THE CHARACTER OF THIS POINT IS
THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE
TIME OF WHO EVER TOLD THE PROBLEM FOR AN
(Shannon 1948)
57Fifth approximation to Turkish (memory)
First order Markov with words e.g.
Probability (wordi wordj)
BIR ANLATTIKLARINA GULMECE YAZDI YAPITLARININ
SARAP BIÇIMLERI BELA GÖRUNUMU GIBI AMA BIR ETMEK
YOK TUTULDU GELEN GIDEN YER KALMADI ...
58- A QUICK REFRESHER ON CONVENTIONAL
COMPRESSION
59Conventional Compression
- Lossy Compression
- Not necessarily a copy of the input: most audio, image, video compression algorithms are lossy; our ears and eyes have resolution thresholds
- Loss-less Compression
- Data integrity essential in digital data communications: network compression must be loss-less
- Two basic approaches
- Statistical compression algorithms
- Substitutional compression algorithms
60- Statistical compression: probabilities of characters in the input data are calculated (or given); frequently occurring characters are encoded into fewer bits, e.g. Huffman code, Morse code
- Static coding: once the coding is determined in accordance with the probabilities of occurrence, it does not change
- Dynamic coding: coding changes with context; for example, the occurrence of "q" in English increases the probability of occurrence of "u" to nearly 1; similarly the occurrence of "th" significantly increases the probability of occurrence of "e", etc.
- As the amount of historical context information increases, dynamic coding techniques can approach the Shannon limit; however, computational requirements increase exponentially, making them impractical for real-time/on-line applications
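Static statistical coding can be illustrated with a small Huffman coder. The sketch below is a textbook construction, not code from the presentation; the four-symbol probability table is a made-up example chosen so that the average code length meets the entropy exactly.

```python
import heapq
import itertools
import math


def huffman_code(probabilities):
    """Build a prefix code: frequent symbols get short codewords."""
    tie_breaker = itertools.count()  # keeps heap entries comparable on equal weights
    heap = [(w, next(tie_breaker), {sym: ""}) for sym, w in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, low = heapq.heappop(heap)
        w2, _, high = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in low.items()}
        merged.update({sym: "1" + code for sym, code in high.items()})
        heapq.heappush(heap, (w1 + w2, next(tie_breaker), merged))
    return heap[0][2]


# Hypothetical symbol probabilities (powers of 1/2, so the code is exactly optimal).
probs = {"e": 0.5, "t": 0.25, "a": 0.125, "q": 0.125}
code = huffman_code(probs)

average_length = sum(p * len(code[sym]) for sym, p in probs.items())
entropy = sum(p * math.log2(1 / p) for p in probs.values())

for sym in probs:
    print(f"{sym}: p = {probs[sym]:<5} code = {code[sym]}")
print(f"average length = {average_length} bits, entropy = {entropy} bits")
```

The code table is fixed once the probabilities are known, which is what the slide calls static coding; a dynamic coder would change the table as the context changes.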
61- Substitutional compression: identifies repeated strings of characters (the longer the better) and replaces them with reference identifiers or tokens (the shorter the better)
- At the receiver the tokens are de-referenced and the reverse substitution performed
- Essentially a form of pattern recognition and classification
- Pattern detection/recognition is generally much faster than the computations needed for dynamic coding algorithms
- Most network compression techniques in use today use substitutional compression
- Compression techniques can also be combined, for example substitution based compression followed by static coding, etc.
62- Substitution based compression is the basis of almost all network compression implementations
- Principle of all: replace repeated patterns with shorter tokens
- Different techniques for detecting/encoding repeated patterns
- Two basic approaches
- Lempel-Ziv (LZ) stateless window compression
- e.g. v.42bis, fax compression, LZS (STAC)
- Predictor compression
- Tries to predict the next input byte; the matching algorithm looks for the most recent match of any pattern rather than the best and longest match
- Higher speed but misses many significant pattern repetitions, therefore lower data reduction (not much used)
63Lempel-Ziv (LZ) stateless window compression
- Published in 1977 (hence LZ77)
- Basis of all loss-less data compression implementations today
- Repeated strings replaced by pointers to the previous location where the string had occurred
- Buffer or window required for the historical information to be available for reference, typically 1000 - 2000 bytes (mostly 2048 bytes)
- All previous data outside the buffer/window is lost or forgotten, hence the name stateless or memory-less
- Can find and compress only patterns that are repeated within the window; repetitions separated by more than the window size are ignored
- Poor scalability: for compression efficiency a large window size is required, but this increases pattern search computation significantly
- Good for file compression type applications
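The window limitation can be demonstrated with Python's standard `zlib` module. DEFLATE is used here only as a convenient stand-in for an LZ77-family compressor (it combines LZ77 matching with Huffman coding), and `wbits=11` selects a 2048-byte window to match the figure on the slide. The block sizes and helper names are arbitrary choices for the demonstration.

```python
import random
import zlib


def deflate_size(data, window_bits):
    """Compressed size using DEFLATE with a 2**window_bits byte history window."""
    compressor = zlib.compressobj(level=9, wbits=window_bits)
    return len(compressor.compress(data) + compressor.flush())


rng = random.Random(0)


def random_block(n):
    """Incompressible filler, so any gain must come from the repetition itself."""
    return bytes(rng.getrandbits(8) for _ in range(n))


near = random_block(1000) * 2  # repeat lies 1000 bytes back: inside a 2 KB window
far = random_block(4000) * 2   # repeat lies 4000 bytes back: outside a 2 KB window

near_ratio = deflate_size(near, 11) / len(near)
far_ratio = deflate_size(far, 11) / len(far)
far_ratio_big_window = deflate_size(far, 15) / len(far)  # 32 KB window

print(f"repeat inside 2 KB window : compressed to {near_ratio:.0%} of input")
print(f"repeat outside 2 KB window: compressed to {far_ratio:.0%} of input")
print(f"same data, 32 KB window   : compressed to {far_ratio_big_window:.0%} of input")
```

The second case shows the effect described above: the repetition is simply invisible to the compressor once it falls outside the window, and enlarging the window recovers it at the cost of more history to search.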
65Nov 1978, University of Pennsylvania, Museum Hall: banquet in honor of Claude E. Shannon receiving the H. Pender award (Prof. F. Haber and DY)