Title: Data Hiding (3 of 3)
1Data Hiding (3 of 3)
Courtesy of Professor Min Wu, Electrical & Computer Engineering, Univ. of Maryland, College Park
2Watermark-Based Authentication
3Document Authentication
- Embed pre-determined pattern or content features beforehand
- Verify hidden data's integrity to decide on authenticity
4Image/Video Authentication via Watermarking
- Motivation
- "Picture never lies"? Easy to edit digital media (e.g., Photoshop)
- Important to detect tampering evidence in litigation, insurance, and government archives
- Original true image cannot be used to convince the judge
- Basic idea for detecting tampering
- Recall the authentication problem in crypto
- Embed some data in the image so that a certain relationship/property gets changed upon tampering
- Rely on
- fragility of embedding scheme, and/or
- embedding content features of original true image
- Two issues to address
- how to embed data?
- what data to embed?
5Useful Crypto Tools/Building-Blocks
- Cryptographically strong hash or digest function H( )
- One-way compression function
- M-bit input to N-bit output, often with fixed N and M >> N
- Often used to produce a short ID for identifying the input
- Properties to be satisfied
- 1) Given a message m, H(m) can be calculated very quickly
- 2) Given a digest y, it is computationally infeasible to find a message m s.t. H(m) = y (i.e., H is one-way)
- 3) It is computationally infeasible to find messages m1 ≠ m2 s.t. H(m1) = H(m2) (i.e., H is strongly collision-free)
- Keyed Hash
- H( k, m ) = Hash( concatenated string derived from k & m )
- Commonly used crypto hash
- 160-bit SHA (Secure Hash Algorithm) by NIST
- 128-bit MD4 and MD5 by Rivest
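A minimal sketch of the hash tools on this slide, using Python's standard `hashlib`. The message, the keys, and the helper name `keyed_hash` are illustrative assumptions. SHA-1 and MD5 appear only because the slide names them. The keyed hash below is the plain concatenation form H(k, m) = Hash(k || m); in practice a standard construction such as HMAC is usually preferred, because simple concatenation with this family of hashes is open to length-extension attacks.

```python
import hashlib

msg = b"Data hiding lecture, part 3"  # illustrative message

# M-bit input compressed to a fixed N-bit digest.
sha1_digest = hashlib.sha1(msg).digest()  # 160-bit SHA-1
md5_digest = hashlib.md5(msg).digest()    # 128-bit MD5

def keyed_hash(key: bytes, message: bytes) -> str:
    """H(k, m) = Hash(k || m): hash of the key concatenated with the message.

    Hypothetical helper for illustration only; not a standard MAC.
    """
    return hashlib.sha1(key + message).hexdigest()

if __name__ == "__main__":
    print(len(sha1_digest) * 8, "bit SHA-1 digest")
    print(len(md5_digest) * 8, "bit MD5 digest")
    print(keyed_hash(b"key-A", msg))
    print(keyed_hash(b"key-B", msg))  # a different key gives a different digest
```

The digest length stays fixed however long the message is, which is what makes it usable as a short ID for the input.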
6Data Integrity Verification (data authentication)
- Authentication is always relative
- with respect to a reference
- How to establish and use a reference
- Method-1: Give a genuine copy to a trusted 3rd party
- Method-2: Append check bits
- Want it to be hard to find a different meaningful msg. with the same check bits => use a cryptographically strong hash
- Want tamper-proof check bits even if the hash func. is public
- Encrypt concatenated version of message and hash
- Keyed Hash (Message Authentication Code): no extra encryption needed
- Digital signature algorithm (using public-key crypto)
- Signed message hash, i.e., encrypt by private key s.t. others can't forge
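The "append check bits" idea with a keyed hash can be sketched as follows. This is an illustration under stated assumptions, not the scheme from the lecture: it uses HMAC with SHA-256 from Python's standard library as the message authentication code, and the function names `protect` and `verify` are hypothetical.

```python
import hashlib
import hmac

TAG_LEN = 32  # bytes in an HMAC-SHA256 tag

def protect(key: bytes, message: bytes) -> bytes:
    """Append keyed check bits (a MAC tag) to the message."""
    tag = hmac.new(key, message, hashlib.sha256).digest()
    return message + tag

def verify(key: bytes, blob: bytes) -> bool:
    """Recompute the tag over the message part and compare with the appended tag."""
    message, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)

if __name__ == "__main__":
    key = b"shared secret"  # illustrative key
    blob = protect(key, b"pay 100 to Alice")
    print(verify(key, blob))                        # genuine copy
    tampered = blob.replace(b"100", b"900")
    print(verify(key, tampered))                    # altered message
```

Because the tag depends on a secret key, an attacker who knows the hash function still cannot recompute valid check bits for an altered message.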
7Extension to Grayscale/Color Images
- Semi-fragile watermarking
- Want to distinguish content-preserving changes (e.g., moderate compression) vs. content tampering
- Achieve controlled robustness, often via quantization
- How to embed
- One approach: enforce pre-quantized DCT coefficients using a look-up table
- What to embed
- A visually meaningful pattern and/or a pre-selected one
- facilitate quick visual check and locate alteration
- Content features to avoid malicious counterfeiting attack
- limited precision (e.g., most significant bits)
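The look-up-table embedding mentioned above can be sketched in a few lines. This is a simplified illustration, not the lecture's exact scheme. It assumes a secret pseudo-random binary table indexed by the quantization index, with runs of identical entries limited so that a nearby index carrying the desired bit always exists. The names `make_lut`, `embed_bit`, and `extract_bit`, the step size, and the table radius are all hypothetical choices.

```python
import math
import random

def make_lut(radius, seed, max_run=2):
    """Binary table for indices -radius..radius, with runs of equal bits capped."""
    rng = random.Random(seed)
    lut = []
    for _ in range(2 * radius + 1):
        bit = rng.randint(0, 1)
        run = lut[-max_run:]
        if len(run) == max_run and all(b == bit for b in run):
            bit = 1 - bit  # break a run that would exceed max_run
        lut.append(bit)
    return lut

def quant_index(coeff, step):
    """Index of the nearest multiple of the quantization step."""
    return math.floor(coeff / step + 0.5)

def embed_bit(coeff, bit, step, lut):
    """Move coeff to the nearest quantization point whose table entry equals bit."""
    radius = len(lut) // 2
    k0 = quant_index(coeff, step)
    for d in range(len(lut)):
        best = None
        for k in (k0 - d, k0 + d):
            if -radius <= k <= radius and lut[k + radius] == bit:
                if best is None or abs(k * step - coeff) < abs(best * step - coeff):
                    best = k
        if best is not None:
            return best * step
    raise ValueError("no table entry carries the requested bit")

def extract_bit(coeff, step, lut):
    """Re-quantize the (possibly distorted) coefficient and read the table."""
    return lut[quant_index(coeff, step) + len(lut) // 2]

if __name__ == "__main__":
    lut = make_lut(radius=64, seed=7)
    marked = embed_bit(37.2, 1, 8.0, lut)
    print(marked, extract_bit(marked + 3.0, 8.0, lut))
```

Distortion smaller than half the quantization step leaves the extracted bit unchanged, which is the controlled robustness to moderate compression; larger changes tend to move the coefficient to a different table entry and so reveal tampering.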
8Watermark-based Authentication
- Embed patterns and content features using a lookup-table
- High embedding capacity/security via shuffling
- locate alteration
- differentiate content vs. non-content change (compression)
9Issues Beyond Embedding Mechanism
10Issues and Challenges
- Tradeoff among conflicting requirements
- Imperceptibility
- Robustness and security
- Capacity
- Key elements of data hiding
- Perceptual model
- Embedding one bit
- Multiple bits
- Uneven embedding capacity
- Robustness and security
- What data to embed
11Techniques For Multi-bit Embedding
- Amplitude modulation
- Use M different amplitudes of watermark to represent log2(M) bits
- Amplitude levels: { -J, -(M-3)J/(M-1), ..., (M-3)J/(M-1), J }, where J is the JND
- accurate detection requires clear distinction in received amplitudes
- use modulo-M operation for enforcement embedding
- Orthogonal and Biorthogonal
- Embed one of M orth. patterns representing log2(M) or log2(2M) bits
- TDMA-type (temporal or spatial or both)
- Embed each bit in a different non-overlapped region or frame
- Unevenness in embedding capacity due to non-stationarity
- CDMA-type (Coded Modulation)
- Use plus vs. minus a pattern to embed one bit
- detector needs to know the mutually orthogonal patterns
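The sketch below illustrates three of the techniques listed on this slide in their simplest additive form. It makes several assumptions that are not in the slides: the CDMA patterns are rows of a Sylvester-type Hadamard matrix, the TDMA pattern is a constant offset inside each segment, and detection is non-blind (the host is subtracted before correlating). All function names are hypothetical.

```python
def amplitude_levels(m, jnd):
    """M equally spaced amplitudes from -J to +J, one per log2(M)-bit symbol."""
    return [-jnd + 2.0 * jnd * i / (m - 1) for i in range(m)]

def hadamard(n):
    """Rows are mutually orthogonal +/-1 patterns (n must be a power of two)."""
    rows = [[1]]
    while len(rows) < n:
        rows = [r + r for r in rows] + [r + [-v for v in r] for r in rows]
    return rows

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cdma_embed(host, bits, patterns, alpha):
    """Add +alpha*u_i for bit 1 and -alpha*u_i for bit 0, all over the same samples."""
    marked = list(host)
    for bit, u in zip(bits, patterns):
        sign = 1 if bit else -1
        for n, un in enumerate(u):
            marked[n] += alpha * sign * un
    return marked

def cdma_detect(diff, patterns):
    """Correlate the watermark residue with each known pattern."""
    return [1 if dot(diff, u) > 0 else 0 for u in patterns]

def tdma_embed(host, bits, alpha):
    """Give each bit its own non-overlapped segment of the host."""
    seg = len(host) // len(bits)
    marked = list(host)
    for i, bit in enumerate(bits):
        sign = 1 if bit else -1
        for n in range(i * seg, (i + 1) * seg):
            marked[n] += alpha * sign
    return marked

def tdma_detect(diff, n_bits):
    seg = len(diff) // n_bits
    return [1 if sum(diff[i * seg:(i + 1) * seg]) > 0 else 0 for i in range(n_bits)]

if __name__ == "__main__":
    host = [float(v) for v in range(16)]
    bits = [1, 0, 0, 1]
    patterns = hadamard(16)[1:5]
    cdma = cdma_embed(host, bits, patterns, alpha=2.0)
    tdma = tdma_embed(host, bits, alpha=4.0)
    print(amplitude_levels(4, 3.0))
    print(cdma_detect([c - h for c, h in zip(cdma, host)], patterns))
    print(tdma_detect([t - h for t, h in zip(tdma, host)], len(bits)))
```

With B bits over N samples, a CDMA amplitude of alpha per pattern and a TDMA amplitude of alpha times the square root of B spend the same total watermark energy, which is one way to read the energy-allocation equivalence of the two approaches.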
12Comparison (brief)
- TDMA vs. CDMA
- Equivalent in terms of watermark energy allocation
- Need to handle uneven embedding capacity for TDMA
- Need to set up and store orthogonal vectors for CDMA
- Orthogonal vs. TDMA/CDMA
- Orthogonal modulation has higher energy efficiency
- To explore further, see Section V and the references therein of
- M. Wu, B. Liu, "Data Hiding in Image and Video: Part-I -- Fundamental Issues and Solutions", submitted to IEEE Trans. on Image Proc., Jan. 2002
13Comparison (1)
- Applicable Media Types
- not always easy to find many CDMA orthogonal directions (e.g., binary image)
- Amplitude is applicable to most features
- TDMA can be applied temporally and spatially
- TDMA vs. CDMA
- Equivalent in terms of watermark energy allocation
- Need to handle uneven embedding capacity for TDMA
- Variable Embedding Rate (need to embed some side info.)
- Constant Embedding Rate (shuffling helps increase embed. rate)
- Need to set up and store orthogonal vectors for CDMA
14Comparison (2)
- TDMA / CDMA vs. Orthogonal Modulation
- Constant minimum separation for orthogonal modulation as the number of embedded bits B increases while the total wmk energy E stays unchanged
- Orthogonal modulation requires book-keeping of more orthogonal vectors and more computation in classic detection
- Combining the two improves the embedding rate with a small increase in computation and storage
15Comparison (3)
- Amplitude Modulation vs. Other Techniques
- Amplitude modulation can embed multiple bits on a single feature/direction
- Without the need of many orthogonal vectors
- Minimum separation for same avg. wmk energy E and embedding bits B
- O( 2^(-B) E^(1/2) ) for amplitude modulation
- O( B^(-1/2) E^(1/2) ) for TDMA/CDMA
- O( E^(1/2) ) for orthogonal modulation
- Modulation techniques for communications [Proakis]
- Bandwidth-efficient techniques vs. energy-efficient techniques
- Non-trivial amplitude modulation is not good when signal energy is limited (very low SNR), esp. for blind detection
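To make the three orders of growth concrete, the sketch below evaluates minimum separations under textbook assumptions that are not stated on the slide: equiprobable symbols, equally spaced amplitude levels, antipodal signalling per bit for TDMA/CDMA, and one of 2^B equal-energy orthogonal patterns for orthogonal modulation. The function name and the constants are illustrative.

```python
import math

def min_separation(total_energy, n_bits):
    """Return (amplitude, tdma_cdma, orthogonal) minimum separations.

    amplitude : M = 2^B equally spaced levels with average energy E,
                spacing d satisfies E = d^2 (M^2 - 1) / 12
    tdma_cdma : each bit gets energy E / B and is sent as +/- a pattern
    orthogonal: two distinct orthogonal patterns of energy E each
    """
    m = 2 ** n_bits
    amplitude = math.sqrt(12.0 * total_energy / (m * m - 1))
    tdma_cdma = 2.0 * math.sqrt(total_energy / n_bits)
    orthogonal = math.sqrt(2.0 * total_energy)
    return amplitude, tdma_cdma, orthogonal

if __name__ == "__main__":
    for b in (1, 2, 4, 8):
        amp, tc, orth = min_separation(1.0, b)
        print(f"B={b}: amplitude={amp:.4f} tdma/cdma={tc:.4f} orthogonal={orth:.4f}")
```

As B grows, the amplitude-modulation separation shrinks roughly like 2^(-B), the TDMA/CDMA separation like B^(-1/2), and the orthogonal separation stays constant, matching the orders on the slide.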
16What Data to Embed?
- Recall: it is important to determine what data to embed in authentication applications
17Tracing Traitors
- Robustly embed digital fingerprint
- Insert ID or fingerprint to identify each customer
- Prevent improper redistribution of multimedia content
- Collusion: a cost-effective attack
- Users with same content but different fingerprints come together to produce a new copy with diminished or attenuated fingerprints
- Anti-collusion fingerprinting
- Trace traitors and colluders to actively deter collusion/redistribution
- Rely on joint fingerprint encoding & embedding
18Embedded Fingerprinting for Multimedia
19Collusion Scenarios
- Result of collusion: fingerprint energy decreases
- Jointly design encoding and embedding of fingerprints
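A small numerical illustration of why collusion lowers fingerprint energy. It assumes an averaging attack on additively embedded, mutually orthogonal fingerprints (rows of a Hadamard matrix); the slides do not specify the collusion model, so the attack, the host values, and the function names are assumptions.

```python
def hadamard(n):
    """Rows are mutually orthogonal +/-1 fingerprints (n must be a power of two)."""
    rows = [[1]]
    while len(rows) < n:
        rows = [r + r for r in rows] + [r + [-v for v in r] for r in rows]
    return rows

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def average_collusion(copies):
    """Colluders average their differently fingerprinted copies sample by sample."""
    k = len(copies)
    return [sum(samples) / k for samples in zip(*copies)]

if __name__ == "__main__":
    n = 16
    fingerprints = hadamard(n)
    host = [float(10 * i) for i in range(n)]
    colluders = [1, 4, 7, 10]
    copies = [[h + w for h, w in zip(host, fingerprints[i])] for i in colluders]
    colluded = average_collusion(copies)
    residue = [c - h for c, h in zip(colluded, host)]
    print("single-user fingerprint energy:", dot(fingerprints[1], fingerprints[1]))
    print("residual energy after collusion:", dot(residue, residue))
    print("correlation with colluder 1:", dot(residue, fingerprints[1]))
    print("correlation with innocent user 2:", dot(residue, fingerprints[2]))
```

With K colluders, each colluder's correlation drops from N to N/K, so detection gets harder as more users collude.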
20 16-bit ACC Example for Detecting 3 Colluders
21Anti-Collusion Fingerprint Codes
- Simplified assumption
- Assume fingerprint codes follow logic-AND op in colluded images
- K-resilient AND ACC code
- A binary code C = {c1, c2, ..., cn} such that the logical AND of any subset of K or fewer codevectors is non-zero and distinct from the logical AND of any other subset of K or fewer codevectors
- Example: {(1110), (1101), (1011), (0111)}
- ACC code via combinatorial design
- Balanced Incomplete Block Design (BIBD)
Simple Example: ACC code via (7,3,1) BIBD for handling up to 2 colluders among 7 users
To explore further, see Trappe-Wu-Liu paper (2001).
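The K-resilient AND-ACC definition above can be checked by brute force for small codes. The sketch below represents each codevector as an integer bit mask and tests the four-vector example from the slide; the function name `is_and_acc` is hypothetical.

```python
from itertools import combinations

def is_and_acc(codevectors, k):
    """True if the AND of every subset of k or fewer codevectors is non-zero
    and differs from the AND of every other such subset."""
    seen = set()
    for size in range(1, k + 1):
        for subset in combinations(codevectors, size):
            value = subset[0]
            for vector in subset[1:]:
                value &= vector
            if value == 0 or value in seen:
                return False
            seen.add(value)
    return True

if __name__ == "__main__":
    code = [0b1110, 0b1101, 0b1011, 0b0111]
    print(is_and_acc(code, 3))  # subsets of up to 3 colluders are identifiable
    print(is_and_acc(code, 4))  # all four together AND to zero
```

Because every subset of up to K colluders yields a distinct non-zero AND, the detected bit pattern identifies exactly who took part.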
22Anti-Collusion Fingerprint Codes (contd)
- (v,k,λ)-BIBD code is a (k-1)-resilient ACC
- Defined as a pair (X,A)
- X is a set of v points
- A is a collection of blocks of X, each with k points
- Every pair of distinct points is in exactly λ blocks
- Number of blocks: λ(v^2 - v)/(k^2 - k)
- Example: (7,3,1) BIBD code
- X = {1,2,3,4,5,6,7}
- A = {123, 145, 167, 246, 257, 347, 356}
- Code length for n = 1000 users
- This code: O( n^0.5 ), dozens of bits
- Prior art by Boneh-Shaw: O( (log n)^4 ), thousands of bits
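The (7,3,1) example can be verified directly. The sketch below checks the BIBD pair property and then builds one codevector per block. It assumes each user's codevector is the bit-complement of a column of the incidence matrix (0 where the point lies in the user's block); this is an assumption here, since the slide does not spell out the construction. The helper names are hypothetical.

```python
from itertools import combinations

POINTS = range(1, 8)
BLOCKS = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]

def pair_counts(points, blocks):
    """Number of blocks containing each pair of distinct points."""
    return {pair: sum(1 for block in blocks if set(pair) <= block)
            for pair in combinations(points, 2)}

def codevector(block, points):
    """Bit-complement of the block's incidence column."""
    return tuple(0 if p in block else 1 for p in points)

def is_and_acc(codes, k):
    """Brute-force check of the k-resilient AND-ACC property."""
    seen = set()
    for size in range(1, k + 1):
        for subset in combinations(codes, size):
            value = tuple(min(bits) for bits in zip(*subset))  # bitwise AND
            if not any(value) or value in seen:
                return False
            seen.add(value)
    return True

if __name__ == "__main__":
    codes = [codevector(block, POINTS) for block in BLOCKS]
    print(set(pair_counts(POINTS, BLOCKS).values()))  # every pair count equals lambda
    print(codes[0])
    print(is_and_acc(codes, 2), is_and_acc(codes, 3))
```

Here v = 7 and k = 3, so the code uses 7 bits for 7 users and resists k - 1 = 2 colluders; with three colluders some block unions cover every point and the AND becomes all-zero.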
23Efficient Collusion Detection for Orth. Mod.
- Amount of correlations needed: considerable reductions in computation!
- EffDet(y, S)
- Break S into S0 and S1
- for each Sj: if [...] then
- if |Sj| = 1 then output Sj,
- else EffDet(y, Sj)
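The recursive structure of EffDet can be sketched as below. The test applied to each half is not recoverable from the slide, so this sketch assumes one: correlate y with the sum of the fingerprints in Sj and compare against a threshold `tau`. It also assumes orthogonal Hadamard fingerprints, an averaging collusion, host removal before detection, and no noise. All names and values are illustrative.

```python
def hadamard(n):
    """Rows are mutually orthogonal +/-1 fingerprints (n must be a power of two)."""
    rows = [[1]]
    while len(rows) < n:
        rows = [r + r for r in rows] + [r + [-v for v in r] for r in rows]
    return rows

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def eff_det(y, group, fingerprints, tau, stats):
    """Return the users in `group` flagged as colluders, halving the group each call."""
    found = []
    half = len(group) // 2
    for sub in (group[:half], group[half:]):  # S0 and S1
        combined = [sum(col) for col in zip(*(fingerprints[i] for i in sub))]
        stats["correlations"] += 1
        if dot(y, combined) > tau:            # assumed test on Sj
            if len(sub) == 1:
                found.append(sub[0])
            else:
                found.extend(eff_det(y, sub, fingerprints, tau, stats))
    return found

if __name__ == "__main__":
    n_users = 64
    fingerprints = hadamard(n_users)
    colluders = [5, 40]
    # Watermark residue after an averaging collusion (host already removed).
    y = [sum(fingerprints[i][n] for i in colluders) / len(colluders)
         for n in range(n_users)]
    stats = {"correlations": 0}
    suspects = eff_det(y, list(range(n_users)), fingerprints, tau=16.0, stats=stats)
    print(suspects, stats["correlations"], "correlations instead of", n_users)
```

Only the halves that test positive are explored further, so when colluders are few the number of correlations grows with the logarithm of the number of users rather than linearly.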