Title: Technological Networks: From Empirical Laws to Theory Stephanie Forrest Dept. of Computer Science Un
1. Technological Networks: From Empirical Laws to Theory
Stephanie Forrest
Dept. of Computer Science, University of New Mexico
http://cs.unm.edu/forrest
forrest_at_cs.unm.edu
2. UNM CS Dept. Profile
- 18 Faculty
- 9 tenured (5 full, 4 assoc), 5 assistant, 1 lecturer
- 3 openings (2 junior, 1 senior)
- Prince of Asturias Endowed Chair in Information Technology
- External faculty appointments from other departments and national labs.
- Close collaborations with SFI and national labs (SNL and LANL)
- Students
- Degrees: BS 40, MS 35, PhD 5
- Undergraduate: 200 majors (BS degrees)
- 20% Female, 35% Minorities
- Graduate: 120 MS, 80 Ph.D.
- 20% Female, 40-60% Foreign
- Funding (2004-2005)
- Total: $3.5M
- NSF, DARPA, DOE, NIH, Sandia and Los Alamos
3. UNM CS Dept. Research
- Strongly Interdisciplinary
- Adaptive computation
- New paradigms of computing (molecular and quantum computation)
- Computational biology and bioinformatics (phylogenetic tree reconstruction, radiology, RNAi)
- Graphics and visualization
- High-performance computing
- Light-weight distributed operating systems
- Automated reasoning and machine learning
- Otter
- POMDP
- Complex networks
- Provably robust scalable algorithms for P2P networks
- Phase transitions in NP-complete problems
4. Themes of Talk
- The real world isn't exactly scale free.
- Understanding and predicting network structure is important for engineering
- Network properties can be exploited to enhance computer security (computer epidemiology / border gateway protocol)
- We lack theory to explain/predict the structure of technological networks
- Preferential attachment isn't good enough.
- Initial steps toward theory.
5. Technological Networks
- Distribute resources
- Energy
- Materials
- Information
- Energy distribution
- Power grids
- Gas pipelines
- Transportation
- Highways
- Airline routes
- The Internet
- Physical connectivity
- Autonomous systems (AS)
- World-wide web
- Social contacts, e.g., email
- Microarchitecture
6. Network Structure
- Network topology affects network properties
- Shortest distance between two nodes
- Bisection width
- Rate and extent of contagion
- Analysis
- Epidemiological models.
- The epidemic threshold.
- Degree distribution of network
- Scale-free (power law) networks
- $P(k) \sim k^{-c}$
- Controlling infections on scale-free networks
- Random vaccination is ineffective (e.g., anti-virus software).
- Targeted vaccination of high-connectivity nodes.
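A minimal Python sketch of the random versus targeted vaccination comparison above, under stated assumptions: the graph is grown by preferential attachment as a stand-in for a scale-free network (it is not one of the measured networks), vaccinated nodes are simply removed, and the size of the largest connected component among the remaining nodes serves as a rough proxy for how far an infection could spread. All function names and parameters are illustrative.

```python
import random
from collections import deque


def preferential_attachment_graph(n, m, seed=0):
    """Grow a graph in which each new node links to m distinct existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    endpoints = []  # each node appears once per incident edge
    for i in range(m + 1):  # small fully connected seed
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            endpoints += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]
    return adj


def vaccinate(adj, fraction, targeted, seed=0):
    """Pick the nodes to vaccinate: highest degree first, or uniformly at random."""
    k = int(fraction * len(adj))
    if targeted:
        return set(sorted(adj, key=lambda u: len(adj[u]), reverse=True)[:k])
    return set(random.Random(seed).sample(sorted(adj), k))


def largest_component(adj, removed):
    """Size of the largest connected component once `removed` nodes are deleted."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over unvaccinated nodes
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best


if __name__ == "__main__":
    graph = preferential_attachment_graph(2000, m=2, seed=1)
    for targeted in (False, True):
        chosen = vaccinate(graph, 0.10, targeted, seed=1)
        label = "targeted" if targeted else "random"
        print(label, largest_component(graph, chosen))
```

On a graph like this, removing the highest-degree 10% of nodes should leave a smaller largest component than removing a random 10%, which is the intuition behind targeted vaccination. The slides that follow argue that the measured networks do not share this structure.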
7. Example 1: Computer Epidemiology
Justin Balthrop, Mark Newman, Matt Williamson
- Viruses and worms spread over networks of contacts between computers
- Email address books.
- URL links.
- Different types of networks are exploited by different types of infections.
UNM CS Dept. Network of Address Books
8. Degree Distributions of Four Networks Relevant to Computer Security
Science 304:527-529 (2004)
- Not scale-free.
- Targeted vaccination unlikely to be effective
- 10% of nodes required for address book data
- 87% of nodes required for email traffic data
- Computer infections can choose their own topology, so network topology is not static.
- Viruses spread faster than repairs.
9. Throttling: Generic Control of Epidemics
Matt Williamson and Justin Balthrop
- Control network topology in time rather than space.
- Limit the rate at which a computer can make new connections
- Limits spreading rates rather than stopping.
- Assumes that virus traffic is significantly different from normal traffic
- Nimda infects up to 400 new machines per second compared to normal rate of connections to new Web servers of about one connection per second or slower (Williamson, 2002).
- Throttling Nimda would have increased the epidemic time from one day to over one year.
- Advantages
- Effective when the form of infection is not known in advance.
- Reduces amount of traffic generated during an epidemic.
- Limitations
- Effectiveness decreased if not all nodes are throttled (altruism), stealth attacks.
- Implementations
- RIOT
- HP's Virus Throttle
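The rate limit described above can be sketched as follows. This is a simplified illustration, not the RIOT or HP implementation: connections to recently contacted hosts pass immediately, while connections to new hosts wait in a delay queue that is drained at a fixed rate. The class name, the working-set size of 5, and the one-release-per-tick rate are assumptions chosen for the example.

```python
from collections import deque


class Throttle:
    """Hypothetical rate limiter for connections to *new* hosts."""

    def __init__(self, working_set_size=5, releases_per_tick=1):
        self.recent = deque(maxlen=working_set_size)  # recently contacted hosts
        self.delayed = deque()                        # new hosts waiting their turn
        self.releases_per_tick = releases_per_tick

    def request(self, host):
        """Return True if the connection may proceed now, False if it is queued."""
        if host in self.recent:
            return True
        if host not in self.delayed:
            self.delayed.append(host)
        return False

    def tick(self):
        """Advance one time unit (say, one second) and release queued hosts."""
        released = []
        for _ in range(min(self.releases_per_tick, len(self.delayed))):
            host = self.delayed.popleft()
            self.recent.append(host)  # the oldest entry is evicted automatically
            released.append(host)
        return released


if __name__ == "__main__":
    throttle = Throttle()
    released = 0
    for second in range(10):
        for i in range(400):  # worm-like behaviour: 400 new hosts per second
            throttle.request(second * 400 + i)
        released += len(throttle.tick())
    print(released, len(throttle.delayed))  # 10 released, 3990 still queued
```

A machine that mostly revisits a handful of hosts is barely affected, whereas a scanner is held to one new host per tick. Cutting 400 new hosts per second to about one is a 400-fold slowdown, roughly consistent with stretching a one-day epidemic to over a year.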
10. Responsive I/O Throttle (RIOT)
Justin Balthrop and Matt Williamson
- Vision: An adaptive desktop firewall
- Hamper worms, viruses, DoS, misconfigurations, etc.
- Graduated automated response (throttling) on all connections.
- Adaptive (learn normal).
- Robust to false positives
- Personal desktop firewalls
- Coarse-grained
- Static and preprogrammed
- Generic rate limit (throttle)
- Delay network connections that occur at an anomalous rate.
- Detector activation triggers delay and packets are dropped.
- How to detect anomalous connections?
- Set of detectors (lymphocytes) that observe TCP connections
- Data Path (below)
- Meta-information (direction, TCP flags)
- Each detector matches some portion of IP space.
- Each detector has its own normal activity level.
11. Learning and Throttling Connections
- A new connection is initiated (SYN Packet)
- The connection information is translated to a bitstring
- The bitstring is shown to all detectors
- An immature detector exceeds its activation threshold
- The detector's activation threshold is raised
- Detectors with stable activation thresholds become mature
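The learning steps above could look roughly like the sketch below. The slides do not give RIOT's actual encoding or matching rule, so everything specific here is an assumption: the bitstring is just the 32-bit destination address, a detector matches by bit prefix, activity is counted per fixed window, and a detector matures after a hypothetical three windows without a threshold raise.

```python
import ipaddress


def to_bitstring(dst):
    """Encode an IPv4 destination address as a 32-character bitstring (assumed encoding)."""
    return format(int(ipaddress.IPv4Address(dst)), "032b")


class Detector:
    """Hypothetical detector covering one portion of IP space."""

    def __init__(self, prefix_bits, threshold=1, windows_to_mature=3):
        self.prefix_bits = prefix_bits      # portion of IP space this detector matches
        self.threshold = threshold          # activation threshold (connections per window)
        self.windows_to_mature = windows_to_mature
        self.count = 0                      # matching connections in the current window
        self.stable = 0                     # consecutive windows without a raise
        self.raised = False
        self.mature = False

    def observe(self, bits):
        """Show one new connection to the detector; True means 'throttle it'."""
        if not bits.startswith(self.prefix_bits):
            return False
        self.count += 1
        if self.count <= self.threshold:
            return False
        if self.mature:
            return True                     # anomalous rate for a mature detector
        self.threshold = self.count         # immature: learn by raising the threshold
        self.raised = True
        return False

    def end_window(self):
        """Close the current counting window and update maturity."""
        if not self.mature:
            self.stable = 0 if self.raised else self.stable + 1
            self.mature = self.stable >= self.windows_to_mature
        self.count = 0
        self.raised = False


if __name__ == "__main__":
    detector = Detector(to_bitstring("10.0.0.0")[:8])   # matches 10.x.x.x
    for _ in range(4):                                  # four quiet training windows
        for _ in range(3):
            detector.observe(to_bitstring("10.1.2.3"))
        detector.end_window()
    burst = [detector.observe(to_bitstring("10.9.9.9")) for _ in range(5)]
    print(detector.mature, detector.threshold, burst)
```

After training on three connections per window, the mature detector tolerates three connections and flags the fourth and fifth in the burst, which is where a delay would be applied.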
12. Autonomic Responses: A Repertoire
13. Example 2: Inter-Domain Routing
Josh Karlin, Stephanie Forrest, and Jennifer Rexford
- 20,000 Autonomous Systems (ASs) connected via the Border Gateway Protocol (BGP)
- ASs route blocks of IPs, known as prefixes
- 64.106.0.0/17 (Owned by UNM)
- 170,000 prefixes owned by ASs today.
- Border gateway protocol (BGP)
- Tell neighbors about new routes (Announcements).
- Tell neighbors about old routes gone bad (Withdrawals).
- That's it.
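To make the announce/withdraw model concrete, here is a toy routing table in Python. It is a sketch under simplifying assumptions, not a BGP implementation: route selection is reduced to longest-prefix match followed by shortest AS path, and the neighbor names and AS numbers (taken from the range reserved for documentation) are hypothetical.

```python
import ipaddress


class RoutingTable:
    """Toy table of routes learned from neighbors (illustrative only)."""

    def __init__(self):
        self.routes = {}  # network -> {neighbor: AS path}

    def announce(self, neighbor, prefix, as_path):
        """A neighbor tells us about a new route to `prefix`."""
        net = ipaddress.ip_network(prefix)
        self.routes.setdefault(net, {})[neighbor] = list(as_path)

    def withdraw(self, neighbor, prefix):
        """A neighbor tells us its route to `prefix` has gone bad."""
        net = ipaddress.ip_network(prefix)
        self.routes.get(net, {}).pop(neighbor, None)
        if not self.routes.get(net):
            self.routes.pop(net, None)

    def lookup(self, address):
        """Return (network, neighbor, AS path) for the most specific matching prefix."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        neighbor, path = min(self.routes[best].items(), key=lambda kv: len(kv[1]))
        return best, neighbor, path


if __name__ == "__main__":
    table = RoutingTable()
    table.announce("A", "64.106.0.0/17", [64500])
    table.announce("B", "64.106.0.0/17", [64510, 64500])
    print(ipaddress.ip_network("64.106.0.0/17").num_addresses)  # a /17 covers 32768 addresses
    print(table.lookup("64.106.1.1"))   # route via A (shorter path)
    table.withdraw("A", "64.106.0.0/17")
    print(table.lookup("64.106.1.1"))   # falls back to B
```

Nothing in this exchange checks that an announcement is legitimate.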
14. BGP Networks are Interesting and Important
- Distributed
- Nodes are AUTONOMOUS systems.
- No centralized routing information (routes stored and maintained locally).
- No authentication of new nodes or routes.
- Dynamic
- Network connectivity changes routinely and continually.
- Network updates are spread through local contact (BGP).
- Confluence of technological and economic constraints
- A policy network as well as a routing network.
- All inter-domain Internet traffic relies upon BGP.
- Vulnerable
- Trivial to inject false routing information into network.
- Man-in-the-Middle attacks.
- Pretty Good BGP (PGBGP) and Internet Alert Registry (IAR).
- Throttle the adoption of new routes.
15. PGBGP Algorithm
- Main Idea: Delay Suspicious Routes
- Lower the preference of suspicious routes (24hr)
- Detection
- Monitor BGP update messages
- Treat origin ASs for a prefix seen within the past few days as normal
- Treat new origin ASs as suspicious for 24 hours, then accept as normal (possible prefix hijack)
- Treat new sub-prefixes as suspicious for 24 hours, then accept as normal (possible sub-prefix hijack)
- Response
- Suspicious origin AS routes are temporarily given low local preference
- Suspicious sub-prefixes are temporarily ignored (not forwarded to)
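The detection and response rules above can be sketched in a few lines of Python. This is an illustration of the idea rather than the PGBGP implementation: the history never expires (the "past few days" window is omitted), AS paths are ignored, and the class name, verdict labels, and AS numbers are hypothetical.

```python
import ipaddress

SUSPICIOUS_SECONDS = 24 * 3600  # the 24-hour suspicious period from the slide


class RouteMonitor:
    """Hypothetical classifier for (prefix, origin AS) pairs seen in BGP updates."""

    def __init__(self):
        # (network, origin AS) -> (time first seen, verdict assigned at first sight)
        self.history = {}

    def _first_verdict(self, net):
        known = {n for n, _ in self.history}
        if net in known:
            return "depref"   # known prefix, new origin AS: possible prefix hijack
        if any(k.version == net.version and net.subnet_of(k) for k in known):
            return "ignore"   # new sub-prefix of a known prefix: possible sub-prefix hijack
        return "normal"       # nothing in the history conflicts with this route

    def classify(self, prefix, origin_as, now):
        """Return 'normal', 'depref' (low local preference), or 'ignore'."""
        net = ipaddress.ip_network(prefix)
        key = (net, origin_as)
        if key not in self.history:
            self.history[key] = (now, self._first_verdict(net))
        first_seen, verdict = self.history[key]
        if now - first_seen >= SUSPICIOUS_SECONDS:
            return "normal"   # survived the suspicious period, so accept it
        return verdict


if __name__ == "__main__":
    monitor = RouteMonitor()
    print(monitor.classify("64.106.0.0/17", 64500, now=0))            # normal
    print(monitor.classify("64.106.0.0/17", 64501, now=100))          # depref
    print(monitor.classify("64.106.5.0/24", 64501, now=200))          # ignore
    print(monitor.classify("64.106.0.0/17", 64501, now=100 + 86400))  # normal
```

Because a suspicious route is only delayed, a false positive costs at most a day of reduced preference rather than lost reachability.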
16. PGBGP Advantages
- Incremental deployability
- No change to BGP protocol, just to path selection
- Immediate benefits to adopting AS and customers
- Automated and immediate response
- Avoid using and propagating the bogus route
- Network has chance to stop the attack before it spreads
- Robust to false positives
- Lowering preference for suspicious routes
- No loss of reachability
- Accidental short-term delays do no harm
- Offline investigation of suspicious route
- Internet Alert Registry, active probing,
- Adaptive, simple
17. Incremental Deployment
Subprefix Hijack
Prefix Hijack
ICNP (2006)
- Limitations
- Doesn't address path spoofing, redistribution attacks.
- Negligence
18. BGP Network Structure
Petter Holme and Josh Karlin
- Barabasi-Albert (BA) model (Barabasi and Albert, 1999)
- Vertices and edges added iteratively
- Probability of attaching to vertex i is proportional to k(i)
- Inet model (Winick and Jamin, 2000)
- No simple growth principle
- Generate random graph with known degree distribution
- Augment to mimic additional known correlations (e.g., connecting all high-degree nodes)
Proc. Royal Acad. A (in press)
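For reference, the BA growth rule can be written compactly. The sketch below is a plain-Python illustration with assumed parameter choices (seed clique of m+1 vertices, m edges per new vertex); it covers only the BA model, not Inet, and is not the code used in the study.

```python
import random
from collections import Counter


def barabasi_albert(n, m=2, seed=0):
    """Return the edge list of an n-vertex graph grown by preferential attachment."""
    rng = random.Random(seed)
    edges = [(i, j) for i in range(m + 1) for j in range(i)]  # small seed clique
    # Each vertex appears in `stubs` once per incident edge, so a uniform draw
    # from `stubs` selects vertex i with probability proportional to k(i).
    stubs = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges


def degree_counts(edges):
    """Map each degree k to the number of vertices that have that degree."""
    degree = Counter(v for edge in edges for v in edge)
    return Counter(degree.values())


if __name__ == "__main__":
    counts = degree_counts(barabasi_albert(5000, m=2, seed=1))
    for k in sorted(counts)[:5]:
        print(k, counts[k])          # most vertices have small degree
    print("max degree:", max(counts))  # a few hubs have very large degree
```

The resulting degree counts are heavy-tailed, which is the property the model is usually judged on. It does not, by itself, reproduce other structure in a graph.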
19. Tentative Conclusions
- Real AS graph is more heterogeneous than can be expected from degree distribution alone
- Core providers in the low-d tail
- Peak at d=3 (vertices directly connected to the core)
- Second peak at d=4 (vertices directly connected to d=3 nodes)
- More structure in periphery than predicted by earlier models.
- Preferential attachment is a poor model for network growth.
- What constraints determine network architecture and growth?
20. Allometric Scaling in Biology
- Dominant design constraint
- Distribute resources to every cell in the organism
- Internal, space-filling, hierarchical networks (vascular system)
- Invariant terminal units (capillaries)
- Optimality (minimize transport time, maximize metabolism)
21. Examples: Scaling in Software
H. Inoue
LimeWire Behavior
HelloWorld Unique Function Calls vs. Invocation Freq
22. Example: Scaling in Software
Dave Ackley and Terry Van Belle
- How to measure evolvability?
- Likelihood (how likely is a location to change).
- Impact (change in one location affects other locations).
- Work = Likelihood x Impact
- Software maintenance costs.
- Expected time to evolve.
- Study long-lived Java code bases
- Code change sizes and frequencies are power-law-ish.
Van Belle, 2004
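As a small worked example of the Work = Likelihood x Impact measure, the sketch below sums the product over code locations. The aggregation by summation, the location names, and all numbers are assumptions made for illustration; they are not taken from Van Belle (2004).

```python
def expected_work(locations):
    """Sum likelihood * impact over all locations.

    `locations` maps a location name to (likelihood, impact), where likelihood
    is the assumed probability that the location changes and impact is the
    assumed number of other locations a change there affects.
    """
    return sum(likelihood * impact for likelihood, impact in locations.values())


# Hypothetical code base, for illustration only.
locations = {
    "parser": (0.5, 4),    # changes often, moderate ripple
    "logger": (0.25, 20),  # changes rarely, but touches many callers
    "utils": (0.75, 1),    # changes often, nearly isolated
}

if __name__ == "__main__":
    print(expected_work(locations))  # 2.0 + 5.0 + 0.75 = 7.75
```

In this toy case the rarely changed but widely used location dominates the total, which is the kind of trade-off the measure is meant to expose.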
23. A Theory of Network Scaling: Conclusions
- Why it's important
- Networking infrastructure continues to expand by orders of magnitude
- NSF GENI project: Redesign the Internet from the ground up.
- Interplanetary networking.
- Architecture
- Move from performance-oriented to power-aware designs.
- The end of silicon.
- Scaling problems in software, security problems everywhere
- Why it's hard
- Terminal units aren't necessarily invariant sized
- Dimensionality and geometry are not obvious
- May require new mathematics