Title: Wireless Sensor Network Security: The State of the Art
1 Wireless Sensor Network Security: The State of the Art
- ASCI Springschool on Wireless Sensor Networks
Yee Wei Law The University of Melbourne
2 Prelude
- In the beginning, the security objectives for civilian applications were unclear
- But communication with the industry confirms our suspicion about the security requirements
- Endless challenges: every component of WSNs has its corresponding security issues
3 Roadmap
- Primer to cryptography and WSNs
- Secure data aggregation
- Key management
- Other areas
- secure remote reprogramming
- secure localization
- energy-efficient jamming attacks
Information Assurance
Protection
Detection
Reaction
4Part Zero
- Primer to cryptography and WSNs
5 Introduction to security
- Security threats: either somebody wants to steal something from you or sabotage you
- Information assurance (IA) is a set of measures that protect and defend information and information systems by ensuring their availability, integrity, authentication, confidentiality, and non-repudiation. These measures include providing for restoration of information systems by incorporating protection, detection, and reaction capabilities.
Information assurance
Information security
Operation security
6 Primitives
- Security objectives
- Confidentiality
- Integrity
- Authentication
- Non-repudiation
- Encryption / decryption
- Symmetric-key: E(K, M) / D(K, M)
- Asymmetric-key: E(PK, M) / D(SK, M)
- Signature / verification
- Symmetric-key: message authentication code (MAC), denoted MAC(K, M)
- Asymmetric-key: digital signature, denoted Sign(SK, M), Ver(PK, M)
Notation: public key PK, private key SK
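7 Common usage
Use different keys for encryption and authentication
- E(K1, M) || MAC(K2, E(K1, M)): the ciphertext provides confidentiality; the MAC provides integrity and authentication
- E(K1, M) || Sign(SK, h(E(K1, M))): the ciphertext provides confidentiality; the signature provides integrity, authentication and non-repudiation
- Signing the hash is more efficient than signing the full ciphertext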
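The first pattern above, E(K1, M) || MAC(K2, E(K1, M)), can be sketched as follows. This is a minimal illustration, not a recommended cipher: `keystream_xor` is a toy stream cipher invented for this example, and the nonce handling (including the nonce under the MAC) is an assumption that the slide does not specify.

```python
import hashlib
import hmac

def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode); illustration only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + offset.to_bytes(4, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, pad))
    return bytes(out)

def protect(k1: bytes, k2: bytes, nonce: bytes, message: bytes):
    """Encrypt with K1, then MAC the ciphertext with a different key K2."""
    ciphertext = keystream_xor(k1, nonce, message)
    tag = hmac.new(k2, nonce + ciphertext, hashlib.sha256).digest()
    return ciphertext, tag

def unprotect(k1: bytes, k2: bytes, nonce: bytes, ciphertext: bytes, tag: bytes):
    """Check the MAC first; decrypt only if the ciphertext is authentic."""
    expected = hmac.new(k2, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None
    return keystream_xor(k1, nonce, ciphertext)
```

Verifying the tag before decrypting means a forged or corrupted packet is dropped without spending any decryption effort on it.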
8 Birthday threshold
- Example: 23 people (q), birthdays (n)
- Collision probability: C(N, q)
- Birthday attack on CBC-MAC [Bellare et al. 00], expressed in terms of the number of queries and the running time
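A small sketch of the collision probability C(N, q), assuming q values drawn uniformly and independently from N possibilities. The figure of 365 equally likely birthdays is the classic textbook assumption and is not stated on the slide.

```python
def collision_probability(n: int, q: int) -> float:
    """Probability that at least two of q uniform draws from n values coincide."""
    p_all_distinct = 1.0
    for i in range(q):
        p_all_distinct *= (n - i) / n
    return 1.0 - p_all_distinct

# With 365 possible birthdays, 23 people are enough to pass the 50% mark.
p23 = collision_probability(365, 23)
p22 = collision_probability(365, 22)
```

The threshold grows roughly like the square root of N, which is why a k-bit MAC or key starts to see collisions after about 2^(k/2) uses.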
9 Security notions (PKC)
- Semantic security / indistinguishability: the ciphertext doesn't reveal anything about the plaintext except the length
- Non-malleability: new ciphertexts cannot be created based on known ciphertexts
- A cryptosystem satisfies a security notion if an attacker loses the corresponding game, e.g., the chosen-plaintext attack (CPA) game
10 Challenges in WSNs
Constraints | Implications
Sensor node hardware, resource constraints | Algos must be energy- and storage-efficient
Nodes operate unattended | Adversary can compromise any node
Nodes not tamper-resistant | Adversary can compromise any node's keys
No fixed infrastructure | Cannot assume any special-function node in vicinity
No pre-configured topology | Nodes don't know neighbours in advance
Communicate in an open medium | Communications are world-readable and world-writeable by default
11 Security design principles
- Favour computation over communication
- Communication is 1000 times more energy-consuming than computation
- Minimal public-key crypto
- Tate pairing costs 5 s (54 mJ) on a Tmote Sky (fastest recorded by Szczechowiak et al. 08)
- Favour resilience (tolerance) over absolute security
- Strength in numbers
12Part One
13 Data aggregation
[Figure: network tree with an "aggregate" step at the intermediate nodes]
- Purposes
- Save bandwidth (limited data rate)
- Save energy (limited energy)
This is the reason why we put a processor on every node in the first place
14 Phase 1: Query dissemination
Sample query: SELECT AVERAGE(temperature) FROM sensors WHERE floor = 6 EPOCH DURATION 30s
15 Phase 2: Data aggregation
[Figure: network tree with an "aggregate" step at the intermediate nodes]
Types of aggregation: (1) basic aggregation, (2) data compression, (3) parameter estimation
16 Phase 3: Result verification (optional)
[Figure: nodes are challenged with "Did you really report this?"]
17 Security goals of data aggregation
- Robustness: Byzantine corruption of data should not make the aggregation result totally meaningless
- Confidentiality: other than the sink and the sources, no intermediate node should have knowledge of the raw data or the aggregation result
[Figure, robustness: the sources report 1, 1000, 3, 2 and the aggregator performs averaging: "So the average is 251.5... Oh wait a minute"]
[Figure, confidentiality: between the sources and the sink, one intermediate node asks "What the hell am I aggregating?" and another "What the hell am I forwarding?"]
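The robustness example above is easy to reproduce. The sketch below uses the four readings from the slide and compares the average with the median, one of the standard robust estimators (the choice of the median here is for illustration).

```python
from statistics import mean, median

readings = [1, 1000, 3, 2]  # one Byzantine reading (1000) among honest ones

corrupted_average = mean(readings)   # dragged far away by a single outlier
robust_estimate = median(readings)   # stays close to the honest readings

print(corrupted_average)  # 251.5
print(robust_estimate)    # 2.5
```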
18Securing data aggregation multipronged defence
4
1
2
3
19 Resilient aggregation
- Objective: to bound the effect of data corruption
- Corruption can be arbitrary (Byzantine)
- By convention, we denote the number of corruptions as k
- Methods
- Robust statistics (1-hop networks)
- RANBAR (1-hop networks)
- Quantiles aggregation (multi-hop networks)
20 Robust statistics
- Say an aggregation function is actually an estimator
- Say we are estimating a parameter T and there are k rogue nodes
- An aggregation function is (k, α)-resilient if rms(y*) ≤ α · rms(y), where y is the estimate without corruption and y* is the estimate when k samples are corrupted
- That is, the RMS error as a result of k-corruption must be bounded by a constant factor of the original RMS error
- We win if we can limit α. The attacker wins if he manages to unbound α
21 Examples of (k, α)-resilient aggregation functions
Non-resilient, example: Average, for which rms(y*) > α · rms(y)
[Figure: AVG over x1, x2, x3, x4 yields y; AVG over x1, x2, x3 and a corrupted x4 yields y*]
Resilient, examples
22 RANBAR
- Based on RANdom SAmple Consensus (RANSAC), which originates in computer vision; hence the name RANBAR: RANsac-Based AggRegation [Buttyán et al. 06]
- Step 1: Use as few samples as possible to determine a preliminary model
- Step 2: Use the preliminary model to identify samples that are consistent with the model
- Step 3: Refine the model with all the samples that are found to be consistent
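The three steps can be sketched for the simplest possible model, a single constant value. This is a generic RANSAC-style loop written for illustration, not the exact algorithm of Buttyán et al.; the function name, the tolerance parameter and the number of trials are all assumptions.

```python
import random

def ransac_average(samples, tolerance, trials=50, seed=0):
    """RANSAC-style estimate of a constant from possibly corrupted samples."""
    rng = random.Random(seed)
    best_consistent = []
    for _ in range(trials):
        # Step 1: a preliminary model from as few samples as possible (one).
        model = rng.choice(samples)
        # Step 2: collect the samples that are consistent with that model.
        consistent = [x for x in samples if abs(x - model) <= tolerance]
        if len(consistent) > len(best_consistent):
            best_consistent = consistent
    # Step 3: refine the model using all consistent samples.
    return sum(best_consistent) / len(best_consistent)

readings = [20.1, 19.9, 20.0, 20.2, 19.8, 100.0, 100.0]  # two rogue readings
estimate = ransac_average(readings, tolerance=1.0)
```

The estimate is computed only from the largest consistent group, so the two rogue readings have no influence as long as honest nodes form the majority cluster.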
23 Quantiles aggregation (extending resilient aggregation to multihop)
[Figure: medians computed hop by hop over the samples 1, 2, 3, 4, 16 give a result that differs from the actual median, 3]
This approach suggests that instead of taking a median at every hop on the way, we should compress the data judiciously at each hop
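A quick check of why a median of medians is not the median. The samples are the ones on the slide; how they are grouped across hops below is a hypothetical split chosen for this example, not the tree drawn in the figure.

```python
from statistics import median

# Hypothetical grouping of the slide's samples across three aggregators.
groups = [[1, 2, 3], [4], [16]]

per_hop_medians = [median(g) for g in groups]            # [2, 4, 16]
median_of_medians = median(per_hop_medians)              # what the sink sees
actual_median = median([x for g in groups for x in g])   # over all raw samples
```

Here the sink would report 4 although the true median is 3, which is the loss of information that quantile summaries such as q-digest are meant to limit.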
24 Quantiles aggregation
[Figure: binary tree over the data range; tree nodes are numbered and each carries a count]
Rules for deriving a q-digest:
- Rule (A): count(node) + count(parent) + count(siblings) ≥ ⌊n/k⌋ + 1
- Rule (B): count(node) ≤ ⌊n/k⌋
q-digest in this example: {<8,2>, <9,2>, <1,1>}
25 Quantiles aggregation
[Figure: the same numbered tree with counts]
Derived median: the data value represented by node 9, i.e. 3.5. Actual median: 3
26Resilient aggregation guidelines
- Two approaches actually
- estimate by minimizing effects of outliers
- detect outliers and estimate without outliers
 | 1-hop | multihop
Data distribution known | Robust statistics, RANBAR | Quantiles aggregation
Data distribution unknown | Robust statistics | Quantiles aggregation
27Progress so far
4
1
2
3
28 Voting
[Figure: five nodes report 3, 300, 2, 1, 1, and some nodes are malicious. Asked "Is mean 61.4 reasonable?", the nodes vote No, Yes, No, No, No, and the conclusion is "Alright, 61.4 is not reasonable!"]
Resource-intensive, only good for mission-critical, small-scale networks
29Progress so far
4
1
2
3
30 Result verification
- The single-aggregator case
- The multi-aggregator case
- Chan et al.'s hierarchical in-network aggregation
- Yang et al.'s SDAP
31 Interactive proof algo
- By Przydatek et al. [2003]: an algo for proving probabilistically that a given figure is indeed the median of the samples
- Example for the sake of intuition
(1) Prover must have the samples sorted first: 1, 2, 3, 4, 5, 6
(2) Prover tells the verifier that the median is 3.5 and the no. of samples is 6
(3) Verifier asks for the 3rd sample; prover tells the 3rd sample is 3 < 3.5; verifier is happy but still suspicious
(4) Verifier asks for the 4th sample; prover tells the 4th sample is 4 > 3.5; verifier is happy but still suspicious
(5) Verifier asks for the 1st and 6th samples; prover tells the 1st is 1 < 3.5 and the 6th is 6 > 3.5; verifier says "Alright, I've sampled enough, median should be 3.5 with high probability."
Relies on the trustworthiness of the samples, but how do we make sure?
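The intuition of the example can be sketched as a spot check on a sorted list. This is a simplification written for illustration, not the protocol of Przydatek et al.: the prover is modelled as a plain list, and `verify_median` and its arguments are hypothetical names.

```python
def verify_median(sorted_samples, claimed_median, positions):
    """Spot-check a claimed median at the given 0-based positions."""
    n = len(sorted_samples)
    for i in positions:
        value = sorted_samples[i]
        # Samples in the lower half must not exceed the claimed median.
        if i < n // 2 and value > claimed_median:
            return False
        # Samples in the upper half must not fall below it.
        if i >= (n + 1) // 2 and value < claimed_median:
            return False
    return True

samples = [1, 2, 3, 4, 5, 6]
# The verifier asks for the 3rd, 4th, 1st and 6th samples, as in the slide.
queried = [2, 3, 0, 5]
honest = verify_median(samples, 3.5, queried)
lying = verify_median(samples, 5.5, queried)
```

In practice the verifier picks positions at random, so a false claim is caught only with some probability that grows with the number of queries.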
32 Result verification: single aggregator
- (a) The information S requires from A in the data aggregation phase (the previous slide shows these are necessary)
- aggregation result f(x1, ..., xn)
- the number of data samples n
- a commitment of the data samples hA, which forces the prover to commit to the sample values
- (b) Commitment tree based on Merkle hash tree saves bandwidth
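A minimal sketch of a Merkle hash tree commitment, assuming SHA-256 as the hash and duplication of the last hash on odd-sized levels (both are choices made for this example). The root plays the role of the commitment hA; to open one sample, the prover sends only a logarithmic number of sibling hashes.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root_and_proof(leaves, index):
    """Return (root, proof); proof lists (sibling_hash, sibling_is_left)."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:               # odd level: duplicate the last hash
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof

def verify(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the path from a leaf to the root and compare."""
    digest = h(leaf)
    for sibling, sibling_is_left in proof:
        digest = h(sibling + digest) if sibling_is_left else h(digest + sibling)
    return digest == root

samples = [b"x1=3", b"x2=7", b"x3=5", b"x4=4"]
root, proof = merkle_root_and_proof(samples, 2)
```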
33 Result verification: single aggregator
A returns the following when interrogated by S: M || MAC(KAS, M), where M = q || ID(1) || x1 || MAC1S || ID(2) || x2 || MAC2S || h1,1
The per-source MACs prevent source nodes from lying
34 Result verification: multi-aggregator
- Chan et al.'s hierarchical in-network aggregation
- Every sensor sends a message of the following format to its parent: query ID || value || complement || count || commitment || MAC
- Uses two primitives: COMB and AGG
- AGG(msg1, msg2): let msg1 = q || v1 || c1 and msg2 = q || v2 || c2, then AGG(msg1, msg2) = q || f(v1, v2) || c1 + c2
- COMB(msg1, msg2): let msg1 = q || v1 || c1 and msg2 = q || v2 || c2, then COMB(msg1, msg2) = q || v1 || c1 || v2 || c2
35 Aggregation phase [Chan et al.]
- Aggregate only trees of the same size to create balanced binary trees
- The advantage of creating only balanced binary trees is that edge congestion (congestion on a link) is only O(log^2 n), where n is the number of samples
36 Verification phase [Chan et al.]
- S broadcasts COMB(AGG(B2, H2), G1) to the network, for example, using µTESLA. Next, the following transmissions take place
- A → B: H2
- A → E: COMB(B2, G1)
- B → C: COMB(H2, D1)
- B → D: COMB(H2, C1)
- E → G: COMB(B2, G1)
- G → H: B2
- H → I: COMB(B2, J1)
- H → J: COMB(B2, I1)
- A source node that successfully reconstructs the commitment will send a confirmation message to the sink: q || nodeID || OK || MAC(K, q || nodeID || OK)
- Problem: the commitment is reconstructed at the source nodes themselves instead of at the sink, so an attacker can forge negative confirmations
37 Result verification: SDAP
- Better than the previous approach, because the commitment is reconstructed at the sink, not the source nodes
- We divide the sub-network into groups; we only need to check the groups which look suspicious
- A sensor decides whether it would become a group leader by checking whether h(q || nodeID) < Fg(c), where Fg(c) is a function that increases with the data count c
- The role of a group leader is to set a boolean flag in a message to NAGG, to indicate that the message needs only be forwarded, not aggregated
38 SDAP's aggregation phase
- S tests if h(q || leader's nodeID) < Fg(c). If false, S discards the group aggregate. Otherwise, S proceeds with the next test.
- S tests if the group aggregate represents an outlier
39 SDAP's verification phase
- S → A: G || q || qa
- G → S: qa || G || xG || 3 || MACGS
- H → S: qa || H || xH || 2 || MACHS
- J → S: qa || J || xJ || 1 || MACJS
- I → S: qa || I || xI || 1 || MACIS
- S performs the following checks
- xG is correctly derived from f(xG, f(xJ, xI))
- MACGS is correctly reconstructed in the following steps
- MACIS = MAC(KIS, q || I || xI || 1)
- MACJS = MAC(KJS, q || J || xJ || 1)
- MACHS = MAC(KHS, q || H || f(xJ, xI) || 2 || MACIS ⊕ MACJS)
- MACGS = MAC(KGS, q || G || f(xG, f(xJ, xI)) || 3 || MACHS)
40Progress so far
4
1
2
3
41 Privacy homomorphism (PH)
- First proposed by Rivest et al. in 1978 to process encrypted data without decrypting the data first
- A function f is (⊗, ⊕)-homomorphic if f(x) ⊗ f(y) = f(x ⊕ y), where ⊗ is an operator in the range and ⊕ is an operator in the domain
- If f is an encryption function and the inverse function f^-1 is the corresponding decryption function, then f is a PH
42 Types of PHs
- There are three main approaches to PHs in WSNs so far
- PHs that are based on polynomial rings, e.g., Domingo-Ferrer's scheme. Insecure under known-plaintext attacks; the attacks involve only computation of gcd and linear algebra [Wagner 03]
- PHs that are based on one-time pads
- Homomorphic public-key cryptosystems
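43 PHs based on one-time pads
- Encryption: source i adds its one-time pad ki to its message mi and sends mi + ki
- Decryption by sink: the sink receives m1 + m2 + m3 + m4 + k1 + k2 + k3 + k4 and removes the pads
- Drawbacks
- Use of the addition operator in place of the XOR operator in the plaintext space is unproven in terms of security
- Synchronization of keys causes a scalability problem
[Figure: sources send m1 + k1, m2 + k2, m3 + k3, m4 + k4; the partial aggregates m1 + m2 + k1 + k2 and m1 + m2 + m3 + k1 + k2 + k3 are formed on the way; the sink receives m1 + m2 + m3 + m4 + k1 + k2 + k3 + k4]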
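A minimal sketch of the additive one-time-pad idea, assuming arithmetic modulo a fixed M that is larger than any possible sum (the modulus, the function names and the sample readings are assumptions made for this example). Intermediate nodes add ciphertexts without ever seeing a plaintext.

```python
import secrets

M = 2 ** 32  # assumed modulus; must exceed the largest possible sum

def encrypt(message: int, pad: int) -> int:
    return (message + pad) % M

def aggregate(c1: int, c2: int) -> int:
    """Performed by intermediate nodes on ciphertexts only."""
    return (c1 + c2) % M

def decrypt_sum(ciphertext: int, pads) -> int:
    """Performed by the sink, which knows every source's pad."""
    return (ciphertext - sum(pads)) % M

messages = [21, 22, 19, 23]
pads = [secrets.randbelow(M) for _ in messages]
ciphertexts = [encrypt(m, k) for m, k in zip(messages, pads)]

total = 0
for c in ciphertexts:
    total = aggregate(total, c)
recovered = decrypt_sum(total, pads)
```

The sink must know exactly which pads went into the aggregate, which is the key synchronization problem listed among the drawbacks.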
44 Security of homomorphic public-key cryptosystems
- PHs are different from conventional ciphers in the sense that the highest attainable security for PHs is semantic security under non-adaptive chosen-ciphertext attacks (IND-CCA1)
- PHs are also by definition malleable, so they fail all the non-malleability notions
- In practice, we only look for PHs that are semantically secure against chosen-plaintext attacks (IND-CPA)
45 Candidate cryptosystems
- ElGamal on elliptic curves (EG-EC)
- Semantic security depends on the discrete logarithm problem on elliptic curves
- (+, +)-homomorphic
- Okamoto-Uchiyama
- Semantic security depends on the intractability of factoring p^2 q
- (×, +)-homomorphic
46 Guideline [Mykletun et al. 06]
EG-EC becomes increasingly costly with larger ciphertexts
EG-EC requires too much storage here
(real-time)
(intermediate nodes might want to decrypt the
intermediate values)
47 Part One: Conclusion
- Among the techniques introduced so far, voting, result verification and PH all require a lot of resources.
- Resilient aggregation is the most practical.
- If all data are only aggregated once, then RANBAR or a simple resilient aggregation function can be used.
- For multi-aggregation scenarios, quantiles aggregation can be used at each aggregation point to compress the data.
- Instead of PH, encrypted data are decrypted, then aggregated and re-encrypted: no true end-to-end confidentiality.
48 Part Two
Key management
[Figure: network tree with an "aggregate" step at the intermediate nodes, generalized to arbitrary traffic]
In Secure Data Aggregation, we secure one-way traffic.
In Key Management, we secure generic traffic.
49Components
Protocol verification
1
Key management
Key establishment
2
Key refreshment
3
Key revocation
4
50 Protocol verification
- Verification gives us indication and confidence of security
- If we simulate unbounded sessions, verification of secrecy and authentication is undecidable
- If we limit the number of parallel sessions, we can use constraint solving for verification
- Model: strand space model
- Tool: CoProVe implements the strand space model using constraint solving (Prolog)
51Strand space model
52 Node-to-node key establishment
- A wants to establish a secure channel with B via a common trusted node S
- A → B: NA || A
- B → S: NA || NB || A || B || MAC(KBS, NA || NB || A || B)
- S → A: E(KAS, KAB) || MAC(KAS, NA || B || E(KAS, KAB))
- S → B: E(KBS, KAB) || MAC(KBS, NB || A || E(KBS, KAB))
- A → B: Ack || MAC(KAB, Ack)
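53 Node-to-node key establishment
[Figure: message flow between A, B and S, with MAC inputs abbreviated]
- A → B: NA || A
- B → S: NA || NB || A || B || MAC(KBS, ...)
- S → A: E(KAS, KAB) || MAC(KAS, NA || B || ...)
- S → B: E(KBS, KAB) || MAC(KBS, NB || A || ...)
- A → B: Ack || MAC(KAB, Ack)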
54 Verification using CoProVe
- Strand space model: each role (Role 1, ..., Role n) is a strand, i.e. a sequence of send and recv events; strands combine into a bundle
- Scenario: instantiate Role 1, ..., instantiate Role n, instantiate Outcome
- Outcome: e.g., attacker learns key
- has_to_finish(Outcome)
- Security is disproved if there exists a bundle that satisfies these constraints
55 Verification using CoProVe: the code itself
initiator(A, S, B, Na, Ns, Ka, Kb, Kab, Ack,
  recv(A, S, B),
  send(Na, BKa),
  recv(NsKb, Kab, Na, BKa),
  send(A, NsKb),
  recv(AckKab)
).
server(A, B, Na, Ns, Nb, Ka, Kb, Kab,
  recv(Na, BKa),
  send(NsKb, Kab, Na, BKa),
  recv(B, Nb, A, NsKb),
  send(Kab, Nb, AKb)
).
responder(A, B, Nb, Ns, Kb, Kab, Ack,
  recv(A, NsKb),
  send(B, Nb, A, NsKb),
  recv(Kab, Nb, AKb),
  send(AckKab)
).
secrecy(N, recv(N)).
scenario(a, Init1, b, Resp1, s, Serv1, sec, Secr1) :-
  initiator(a, s, B, na, Ns, ka, Kb, Kab, ack, Init1),
  server(a, b, Na, ns, Nb, ka, kb, kab, Serv1),
  responder(A, b, nb, Ns1, kb, Kab1, ack, Resp1),
  secrecy(kab, Secr1).
has_to_finish(sec).
56Components
Protocol verification
1
Key management
Key establishment
2
Key refreshment
3
Key revocation
4
57 Key establishment
- Definition: a process or protocol whereby a shared secret key becomes available to two or more parties, for subsequent cryptographic use
- Types
- Key pre-distribution: a key agreement protocol whereby the resulting established keys are completely determined a priori by initial keying material
58Protocol design by communication modes
- Global broadcasts
- Authenticated broadcast using µTESLA
- Local broadcasts
- Passive participation
- Unicast
- Only consider neighbour-to-neighbour
- Multihop can be secured hop by hop
- Random key pre-distribution schemes
- LEAP
- EBS
59 Global broadcast: µTESLA
- Micro version of the Timed, Efficient, Streaming, Loss-tolerant Authentication protocol (TESLA), used for authenticated broadcast
- Keys are generated in reverse order: K(i-1) = h(Ki)
- Keys K1, K2, K3, K4, ..., Kn are released in forward order
60 µTESLA example (1)
(1) Generate a one-way reverse key chain on the base station: K4, then K3 = h(K4), K2 = h(K3), K1 = h(K2)
(2) Give K1 to everybody
(3) The base station broadcasts a packet M || K2 || MAC(K3, ...)
61 µTESLA example (2)
(4) K2 is genuine because h(K2) = K1, but the packet tagged with MAC(K3, M || K2) still needs to be authenticated
(5) The base station later sends K3, which can be used to authenticate message M
[Figure: the packet M || K2 || MAC(K3, ...) is followed by the packet M2 || K3 || MAC(K4, ...)]
Authentication steps: (a) K3 is genuine because K2 = h(K3); (b) M is genuine because K3 is genuine and K3 authenticates M
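The key chain and the delayed authentication step can be sketched as follows. SHA-256 and HMAC are stand-ins chosen for this example, and the function names are hypothetical; time synchronization and disclosure delays, which µTESLA depends on, are left out.

```python
import hashlib
import hmac

def h(key: bytes) -> bytes:
    return hashlib.sha256(key).digest()

def mac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()

def make_chain(seed: bytes, n: int):
    """Return [K1, ..., Kn] with K(i-1) = h(Ki); Kn is the seed."""
    chain = [seed]
    for _ in range(n - 1):
        chain.append(h(chain[-1]))   # generated in reverse order
    chain.reverse()                  # released in forward order
    return chain

def authenticate_later(message, disclosed_key, tag, later_key) -> bool:
    """Run by a receiver once the next key of the chain is disclosed."""
    key_ok = h(later_key) == disclosed_key
    tag_ok = hmac.compare_digest(mac(later_key, message + disclosed_key), tag)
    return key_ok and tag_ok

k1, k2, k3, k4 = make_chain(b"base station secret", 4)
packet_tag = mac(k3, b"M" + k2)      # the packet carries M, K2 and this tag
```

A receiver that only holds K1 accepts K2 because h(K2) = K1, buffers the packet, and checks the tag when K3 arrives.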
62 Local broadcast: Passive participation
[Figure: node A transmits; neighbours B, C, D, E think "A is transmitting data similar to what I have, so I shall not transmit."]
- Passive participation: nodes B, C, D, E suppress their transmissions when they find A transmitting about the same data
- To secure passive participation, A uses a cluster key and a one-way key chain to achieve encrypted and authenticated local broadcast
63 Local broadcast: Passive participation
- If only the key chain is used, the keys in the key chain would have to be broadcast in the clear, and in the absence of time interval differentiation, a cluster-outsider would be able to forge messages using these keys
- If only the cluster key is used, authentication of the sender cannot be achieved
- But if used together, the cluster key can be used to encrypt messages as well as to hide the key chain keys from cluster-outsiders, and at the same time the key chain keys can be used for authentication
64Securing unicast
- Random key pre-distribution schemes
- LEAP
- EBS
65 Random key pre-distribution (RKP)
[Figure: each node draws keying material at random from a pool; can two nodes establish a session key?]
P: pool size (4 in this example); K: key ring size (1 in this example)
66 Random key pre-distribution (RKP)
Type | Keying material
Type 1 | Symmetric key [Eschenauer and Gligor 02]
Type 2 | Symmetric bivariate polynomial [Liu et al. 05]
Type 3 | Part of a matrix [Du et al. 05]
67 Symmetric-key-based RKP
[Figure: one node says "I've got keys 1, 2, 3, 4", the other says "I've got keys 1, 5, 6, 7"; both conclude "OK, so our session key can be derived from key 1"]
Although not all neighbouring pairs of nodes can establish a session key (aka pairwise key), the network will remain connected with a suitable choice of K and P.
K: key ring size (4 in this example); P: key pool size (7 in this example)
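How likely two nodes are to share at least one key follows from a standard counting argument, assuming each key ring is a uniformly random K-subset of the pool. The function below is a sketch of that formula; it says nothing about network-wide connectivity, which also depends on node density.

```python
from math import comb

def share_probability(pool_size: int, ring_size: int) -> float:
    """Probability that two random key rings have at least one common key."""
    disjoint = comb(pool_size - ring_size, ring_size) / comb(pool_size, ring_size)
    return 1.0 - disjoint

p_slide = share_probability(7, 4)    # K = 4, P = 7 as in the example above
p_larger = share_probability(15, 4)  # same ring, larger pool
```

With K = 4 and P = 7 two rings cannot be disjoint, so the probability is 1; enlarging the pool to 15 lowers it to about 0.76 while making each captured node reveal a smaller fraction of the pool.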
68 Symmetric-key-based RKP
[Plot: probability of connectivity k versus k, compared with the expected connectivity; K = 4, P = 15 gives RMSE 0.0427 and K = 4, P = 30 gives RMSE 0.0436]
Derived from results of random geometric graphs [Law et al. 07]
69 Polynomial-based RKP
Pool:
f1(x, y) = 1 + 2y + 3y^2 + 2x + xy + 4xy^2 + 3x^2 + 4x^2 y + x^2 y^2
f2(x, y) = 2 + 3y + 5y^2 + 3x + 2xy + 7xy^2 + 5x^2 + 7x^2 y + 2x^2 y^2
f3(x, y) = 3 + 4y + 5y^2 + 4x + 3xy + 6xy^2 + 5x^2 + 6x^2 y + 3x^2 y^2
Node 1 ("I've got f1(), f2()"):
f1(1, y) = 6 + 7y + 8y^2
f2(1, y) = 10 + 12y + 14y^2
Node 2 ("I've got f2(), f3()"):
f2(2, y) = 28 + 35y + 27y^2
f3(2, y) = 31 + 34y + 29y^2
Both nodes: "OK, so our session key can be derived from f2()"
In this example, t = 2, K = 2, P = 3. The pairwise key is f2(1, 2) = f2(2, 1): 10 + 24 + 56 = 28 + 35 + 27 = 90. In reality, the value must of course be as large as normal crypto keys.
Storage requirement: K(t + 1) coefficients, where t is the threshold
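The worked example for f2 can be checked mechanically. The coefficient matrix below encodes f2 from the slide (entry [i][j] is the coefficient of x^i y^j); the helper names are hypothetical, and a real deployment would work over a large finite field rather than small integers.

```python
# f2(x, y): entry [i][j] is the coefficient of x^i * y^j (symmetric matrix).
F2 = [
    [2, 3, 5],
    [3, 2, 7],
    [5, 7, 2],
]

def share(coeffs, node_id):
    """Coefficients of the univariate polynomial f(node_id, y)."""
    degree = len(coeffs)
    return [sum(coeffs[i][j] * node_id ** i for i in range(degree))
            for j in range(degree)]

def pairwise_key(own_share, peer_id):
    """Evaluate the stored share at the peer's ID."""
    return sum(c * peer_id ** j for j, c in enumerate(own_share))

share_1 = share(F2, 1)   # what node 1 stores: 10 + 12y + 14y^2
share_2 = share(F2, 2)   # what node 2 stores: 28 + 35y + 27y^2
key_at_1 = pairwise_key(share_1, 2)
key_at_2 = pairwise_key(share_2, 1)
```

Each node stores t + 1 coefficients per polynomial, and the symmetry of the coefficient matrix is what makes both evaluations agree.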
70 Matrix-based RKP
- G: Vandermonde-like generator matrix; N = number of nodes = number of columns; the seed of a column can be used as an ID
- D1, D2, D3, D4: random symmetric matrices
- M1 = (D1 G)^T, and likewise for M2, M3, M4
71 Matrix-based RKP
Pool: M1, M2, M3, M4
Node 1 ("I've got M1, M2") stores M1(1), M2(1) and G(1)
Node 2 ("I've got M2, M3") stores M2(2), M3(2) and G(2)
The nodes exchange "Here's G(1)" and "Here's G(2)"; both conclude "OK, so our session key can be derived from M2"
- Pairwise key: M2(1) G(2) = M2(2) G(1)
- Storage requirement: K(t + 1) + 1 coefficients, where t is the threshold
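A small sketch of why M(1) G(2) = M(2) G(1) holds, assuming a Vandermonde generator column G(s) = (1, s, s^2, ..., s^t) and arithmetic modulo a prime. The prime, the matrix D and the seeds are made-up values for illustration.

```python
P = 2_147_483_647  # assumed prime modulus (2^31 - 1)

def g_column(seed: int, t: int):
    """Column of the Vandermonde-like matrix G generated from a seed."""
    return [pow(seed, i, P) for i in range(t + 1)]

def private_row(d_matrix, seed: int):
    """Row of M = (D G)^T belonging to the node with this seed."""
    size = len(d_matrix)
    g = g_column(seed, size - 1)
    return [sum(d_matrix[r][c] * g[c] for c in range(size)) % P
            for r in range(size)]

def pairwise_key(own_row, peer_seed: int):
    """Dot product of the node's private row with the peer's public column."""
    g = g_column(peer_seed, len(own_row) - 1)
    return sum(a * b for a, b in zip(own_row, g)) % P

D = [[1, 2, 3],
     [2, 4, 5],
     [3, 5, 6]]           # random symmetric matrix (kept secret)
row_a = private_row(D, 2)  # node with seed 2
row_b = private_row(D, 3)  # node with seed 3
key_a = pairwise_key(row_a, 3)
key_b = pairwise_key(row_b, 2)
```

Because D is symmetric, g_a^T D g_b equals g_b^T D g_a, so the two nodes obtain the same value after exchanging only their public columns (or just the seeds).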
72 Node-to-node key establishment
- RKP schemes are only good for keying two neighbouring nodes with common key(s). What about neighbours without any common key? Use a common trusted node
- A wants to establish a secure channel with B via a common trusted node S
- A → B: NA || A
- B → S: NA || NB || A || B || MAC(KBS, NA || NB || A || B)
- S → A: E(KAS, KAB) || MAC(KAS, NA || B || E(KAS, KAB))
- S → B: E(KBS, KAB) || MAC(KBS, NB || A || E(KBS, KAB))
- A → B: Ack || MAC(KAB, Ack)
73 LEAP
- LEAP is a key pre-distribution scheme but not random
- Every node is pre-distributed with Kin
- Node A: initial key Kin
- Node B: node key KB = PRF(Kin, B); Kin already deleted
(0) A sets timer
(1) A: "Hello, I'm A"
(2) B: "I'm B"
(3) A and B compute the pairwise key PRF(PRF(Kin, B), A)
(4) Timer fires, A deletes Kin
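The pairwise key computation in step (3) can be sketched with HMAC-SHA256 standing in for the PRF (the choice of PRF and the byte encodings of the node IDs are assumptions for this example).

```python
import hashlib
import hmac

def prf(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()

k_in = b"initial network key"

# Node B was deployed earlier: it kept only its node key and erased Kin.
k_b = prf(k_in, b"B")

# Node A is newly deployed and still holds Kin until its timer fires.
key_at_a = prf(prf(k_in, b"B"), b"A")

# Node B derives the same key from its stored node key alone.
key_at_b = prf(k_b, b"A")
```

Once A erases Kin it can no longer derive other nodes' keys, so capturing A later exposes only the pairwise keys A has already established.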
74 EBS (Exclusion Basis System)
[Figure: assignment of keys to nodes]
- Pro: two nodes always share at least 2K-P keys.
- Con: when a node is compromised, more than half of the keys in the key pool are compromised.
75Components
Protocol verification
1
Key management
Key establishment
2
Key refreshment
3
Key revocation
4
76 Key refreshment
- Why? The more a key is used, the more it is open to cryptanalytic attacks, birthday attacks etc.
Parallel re-keying
- Lose the key K, then all past and future keys are exposed
- Not suitable for WSNs
77 Key refreshment
Serial re-keying: preferable because of forward security
- Only need to store the current key
- Lose the current key, then all future keys are compromised
- But past keys are intact
78 [Abdalla et al. 2000]
- Without this scheme, the birthday threshold is O(2^(k/2))
- With this scheme, a session key can be refreshed O(2^(k/3)) times
- Each time, a session key has a birthday threshold of O(2^(k/3))
- The final birthday threshold is O(2^(k/3)) × O(2^(k/3)) = O(2^(2k/3))
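The arithmetic above, written out as a worked equation. The concrete key length k = 128 is an assumption used only to make the gain tangible.

```latex
\[
\underbrace{O\!\left(2^{k/3}\right)}_{\text{number of refreshes}}
\times
\underbrace{O\!\left(2^{k/3}\right)}_{\text{uses per session key}}
= O\!\left(2^{2k/3}\right),
\qquad
k = 128:\quad 2^{k/2} = 2^{64}
\;\longrightarrow\;
2^{2k/3} \approx 2^{85.3}.
\]
```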
79Components
Protocol verification
1
Key management
Key establishment
2
Key refreshment
3
Key revocation
4
80 Which keys to revoke?
Big picture
- When A is compromised
- Global broadcast keys: B, C, D, E need to have their copies of KSglobal replaced
- Local broadcast keys: B, C, D, E need to purge KAcluster and KAchain; B needs to re-generate and re-distribute KBcluster and KBchain; similarly for C, D, E
81Strategy
Gateway
82 Re-keying unicast keys
Big picture
- If using polynomial-based or matrix-based RKP or LEAP, do nothing
- If using symmetric key-based RKP, re-keying is desirable but can be done without
- If using EBS, re-keying is a must
83Re-keying local broadcast keys
84 Re-keying global broadcast keys
- New global key is propagated from the base station in two stages
- The hash of the key is propagated
- Then the key itself
- Over each hop, the key is protected by a cluster key and a cluster key chain
85 Part Two: Conclusion
- Securing local broadcasts is generally too expensive for the current generation of nodes
- The priority is to secure query broadcasts, data convergecasts and neighbour-to-neighbour unicasts. This means a node should minimally store
- a unique key shared with the base station
- a µTESLA commitment distributed by the base station
- a global key
- a set of pairwise keys, each of which is shared with a different neighbour
- Periodic key refreshment should be made a standard practice
- the global key is used most often
- Always verify protocols
86
- Thank y'all
- Dank u
- Danke
- Grazie
- Mulțumesc
- Dziękuję
- Köszönöm
- Teşekkürler
- Shukran