System Administration: Drowning in Management Complexity



System Administration: Drowning in Management Complexity
  • Chad Verbowski
  • Microsoft Research, Redmond

  • Problem Space
  • Complexity Grows Faster than We Can Handle
  • System Management Approaches
  • A New Approach: Data Driven Management
  • Examples/Results Using Data Driven Mgmt

  • Systems Management Complexity Scale
  • The amount of Energy we put into Maintaining our
  • Energy: Software, Hardware, People, Resource
  • Complexity is Constantly Growing
  • Advances Reducing Development Complexity
  • Simplified Development Enables Complex Systems
  • Growing Number of Devices, Apps, Users
  • Advances Needed in Managing Complexity
  • To Avoid Drowning!!!

Systems Management Problem
  • Complexity AND Scale
  • Persistent-State Size: O(10^5)
  • Persistent-State Access Trace Size: O(10^8) per
  • Number of Programs Interacting: O(10^2)
  • Globally
  • Number of Machines: O(10^9)
  • each runs a different combination of
  • CyberSecurity (Anti-Malware)
  • Systems Management Adversaries
  • Digital Rights (DRM), Protecting Data
  • Cybersecurity: Untrusted Users

Complexity Comparison: With Humans
  • Human DNA: 1 Billion Base Pairs = 1 GB
  • 0.25% Unique Pairs: 1.2 MB
  • 6 Billion People: 7.2 TB
  • Encoding of Relatives (3.6:1): 2 TB
  • Lempel-Ziv Compress (10:1): 200 GB
  • All Living People's DNA Fits on a Laptop!
  • Equivalent to 72 Gmail Accounts OR 4 Blu-Ray Discs
  • People Ever Born (106,456,367,669): O(10^11)
  • Storage Required for All Human DNA: 3.5 TB
  • Cost (Using $14k for 18 TB): $2,725
  • Backup ($100 for 1 TB Tape): $400
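
The arithmetic behind this slide can be replayed in a few lines. This is a sketch that starts from the slide's own 7.2 TB figure for 6 billion people and applies the two ratios as stated (3.6:1 and 10:1). The function names and the rounding choices are assumptions made for illustration; the results land close to the slide's numbers, not exactly on them.

```python
import math

# Figures as stated on the slide.
LIVING_PEOPLE = 6_000_000_000
PEOPLE_EVER_BORN = 106_456_367_669
RAW_LIVING_TB = 7.2              # slide's figure for 6 billion people
RELATIVE_ENCODING_RATIO = 3.6    # "Encoding of Relatives"
LZ_RATIO = 10.0                  # "Lempel-Ziv Compress"


def living_dna_gb():
    """Compressed size of all living people's DNA, in GB."""
    tb = RAW_LIVING_TB / RELATIVE_ENCODING_RATIO / LZ_RATIO
    return tb * 1000


def all_humans_tb():
    """Scale the living-population figure up to everyone ever born, in TB."""
    scale = PEOPLE_EVER_BORN / LIVING_PEOPLE
    return living_dna_gb() / 1000 * scale


def disk_cost_usd(tb, price_usd=14_000, capacity_tb=18):
    """Pro-rated disk cost, using the slide's $14k for 18 TB."""
    return tb * price_usd / capacity_tb


def tape_cost_usd(tb, price_per_tape_usd=100, tape_tb=1):
    """Whole tapes needed, using the slide's $100 per 1 TB tape."""
    return math.ceil(tb / tape_tb) * price_per_tape_usd
```

With these inputs, `living_dna_gb()` gives about 200 GB and `all_humans_tb()` gives roughly 3.5 TB, which needs four 1 TB tapes. Pro-rating the disk price for 3.5 TB gives a figure near the slide's $2,725.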

Growth in Software Complexity
  • Rate a Developer Can Code
  • OO/CORBA/COM Enables Componentization
  • Libraries Enable Sharing Code
  • New Languages: Less Coding
  • Better Tools: Easier to Debug, Build, Annotate
  • Density of Developer Collaboration
  • Source Control Systems
  • Improved Communication (Email, IM)
  • Enforceable Software Development Processes
  • Hardware Advancements
  • Software is a gas that expands to fill its container

Developers' Role in Manageability
  • Rely on Them for Manageable Software?
  • Probably Not, At Least Not for a While
  • Time to Adopt new platforms, applications, APIs
  • Third Party, In House, Legacy Software
  • Can They Completely Solve This?
  • Not Feeling the Pain of System Administrators
  • Manageability is the Top Priority, Right After
  • Easy to Manage Hardware?
  • Very Few Advancements in this area
  • State-of-the-Art: SNMP v1, Circa 1988

The Software LifeCycle
The Management LifeCycle
  • Software LifeCycle != Management LifeCycle
  • ONGOING Cost That Starts With Deployment
  • Configuration, Provisioning
  • Monitoring, Troubleshooting
  • Upgrades, Patching
  • Integration with Other Components
  • Accumulation of Stuff To Manage
  • Applications, Hardware, Devices, Users, and Data
  • Ops Cost >> Software Cost
  • BIG Trouble Unless Significant Improvement

Who Is Going to Save Us?
  • Sys-Admins Are Ultimately Responsible
  • They Understand the Symptoms Best
  • Limited Time and Toolset for Fixing Manageability
  • Need Better Management Tools (Obviously)
  • Their Net Effect Should Not Be More Complexity!
  • They Need to Take Virtually No Input or
  • They Should Not Rely On Application Participation

Motivation From Albert Einstein
  • Any fool can make things bigger, more complex,
    and more violent. It takes a touch of genius-and
    a lot of courage-to move in the opposite
    direction.
System Management Techniques: Bad System
Management Can Make Things Worse
  • Software Development Design Choices
  • Componentization is Good
  • But Don't Make Every Class a Component!
  • Security Checks and Locks Are Good
  • But Don't Unnecessarily Check/Lock At Every
  • System Management Technique Choices
  • No Single Technique Solves All Problems
  • Be Aware of the Capability and Limitations
  • Use Them Appropriately!

1. Prescriptive Management: The First Line of
  • Limit the Hardware and Software Used
  • You can only buy THESE Server/Laptop/Desktop
  • Only THESE Versions of App X Are Supported.
  • Benefits
  • Less Stuff to Manage!
  • Challenges
  • Ongoing Cost to Maintain the List
  • Measuring Compliance is Hard
  • Difficult to Clean Up Existing Environments
  • User Happiness ?
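
Measuring compliance against a prescribed list can be sketched as a simple lookup. The approved-list contents and the `compliance_report` name below are hypothetical, chosen only to illustrate the idea; a real environment would also have to gather the installed inventory, which is where the slide says the difficulty lies.

```python
# Hypothetical approved list: application name -> supported versions.
APPROVED = {
    "AppX": {"2.1", "2.2"},
    "Editor": {"5.0"},
}


def compliance_report(installed, approved=APPROVED):
    """Split a machine's (name, version) pairs into compliant and non-compliant."""
    compliant, violations = [], []
    for name, version in installed:
        if version in approved.get(name, ()):
            compliant.append((name, version))
        else:
            # Either an unsupported version or software that is not on the list.
            violations.append((name, version))
    return compliant, violations
```

For example, a machine with `AppX 2.1`, `AppX 1.0`, and an unlisted `Game 3` would report the first as compliant and the other two as violations.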

2. Signature Based: Avoid Solving the Same Problem
Over and Over and
  • Create Rules/Fingerprints for Known Problems
  • (AV/AS) Manual Sample Collection and Signature
  • (Mgmt) Manual Events Rules for Well Known
  • Benefits
  • Minimal Troubleshooting Time
  • Early Problem Detection
  • Challenges
  • Costly, Hard to Identify Root Cause
  • The Most Costly Issues Frequently Repeat
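
A rule base for known problems might look like the sketch below. The `Signature` type, the two example rules, and the use of regular expressions over event lines are all assumptions made for illustration; actual AV/AS and management products use their own signature formats.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    name: str      # label for the known problem (hypothetical)
    pattern: str   # regular expression matched against one event line
    remedy: str    # what fixed it the last time it was diagnosed


# Hypothetical rules, written once after a manual root-cause analysis.
SIGNATURES = [
    Signature("disk-full", r"No space left on device", "Free space or extend the volume"),
    Signature("cert-expired", r"certificate (has )?expired", "Renew the certificate"),
]


def match_known_problems(event_lines, signatures=SIGNATURES):
    """Return (signature name, remedy, line) for every line matching a rule."""
    hits = []
    for line in event_lines:
        for sig in signatures:
            if re.search(sig.pattern, line, flags=re.IGNORECASE):
                hits.append((sig.name, sig.remedy, line))
    return hits
```

Matching is cheap once a rule exists, which is the benefit listed above. The cost sits in writing each rule, and events that match no rule are silently ignored.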

3. Manifest: Deep System Understanding Enables
Policy Based Management
  • Complete Description of Environment State
  • Each Item's Function is Documented with
    Dependencies, Valid Values
  • Benefits
  • Policy Constraints Can be Created and Enforced
  • Wide and Deep Knowledge Minimizes Troubleshooting
  • Challenges
  • Determining What the Policy Should Be
  • Virtually Impossible to Create for ALL items
  • Third Party, In-House, and Legacy Applications
  • Difficulty Resolving Late-Bound dependencies,
    Canonicalization Issues
  • Costly to Create a Manifest for Large
  • Keeping the Manifest Current is Challenging
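
As a rough illustration of manifest-driven policy checking, the sketch below documents each item with valid values and dependencies and then validates an observed state against it. The `ManifestEntry` structure and the two example settings are hypothetical, not taken from any real manifest format.

```python
from dataclasses import dataclass, field


@dataclass
class ManifestEntry:
    description: str                     # documented function of the item
    valid_values: set                    # values the policy allows
    depends_on: list = field(default_factory=list)


# Hypothetical manifest covering two settings.
MANIFEST = {
    "web.port": ManifestEntry("Listening port", {"80", "443"}),
    "web.tls_cert": ManifestEntry("Certificate path", {"/etc/certs/site.pem"}, ["web.port"]),
}


def check_state(state, manifest=MANIFEST):
    """Return a list of policy problems found in an observed {setting: value} state."""
    problems = []
    for key, value in state.items():
        entry = manifest.get(key)
        if entry is None:
            # The "virtually impossible to create for ALL items" case.
            problems.append(f"{key}: not described in manifest")
            continue
        if value not in entry.valid_values:
            problems.append(f"{key}: invalid value {value!r}")
        for dep in entry.depends_on:
            if dep not in state:
                problems.append(f"{key}: missing dependency {dep}")
    return problems
```

Any setting absent from the manifest is reported as undescribed, which shows why incomplete coverage of third party, in-house, and legacy applications limits this technique.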

4. Simplified Management Model: Reduce Complexity
by Creating a Simpler Management Abstraction
  • Manage a Simplified Logical View
  • Complexity is Encapsulated in Components Forming
    a Logical View
  • e.g. A Service Description, and a Service
    Level Agreement
  • Benefits
  • The Management Space is Less Complex
  • Challenges
  • Hard to Define the Right Abstraction for
  • Creating the Model Definition
  • Mapping to New Model is Hard
  • Equivalence Across Vendor/Application/Version
  • Keeping the Real-World and Logical View in Sync

Motivation For a New Technique: Hard to Solve
Real-World Change Management Problems
  • My application worked yesterday, but it's not
    working today. What's the problem?
  • My system has been acting weird lately. What has
  • If I apply this patch, which of the 3,000
    applications in my company may break?
  • Was this change consistently applied to all 850
    of my servers?
  • Some spyware program is hijacking my home page.
    How can I get rid of it, all of it?
  • Are there any Trojan programs hiding on my
    machine and stealing my bank account passwords?

Change Management Struggle
(Figures: App Popularity, App Versions)

Insights: A Pragmatic Look at Change Management
  • Cross Machine State CAN'T Be That Different
  • Most of the O(10^9) Systems Are Working Correctly
  • Most Environments Have Small Variation in
  • System Workloads are Highly Repetitive
  • We Only Care About The State That Is Used
  • Only 10% of Files and Settings Are Actually Used
  • Process / State Interactions Provide Context
  • For Understanding Process Dependencies and the
  • We Only Care About New System Changes
  • Only 1% of Files and Settings Typically Change

5. Data Driven Management: Reduce Complexity
using Automated Monitoring and Analysis
  • Manage Only Globally Distinct Differences
  • Instrument the OS to Auto Track Process/State
  • Identify New Process Patterns and State
  • Benefits
  • Simplifies the Troubleshooting Problem Space
  • Reduces the Problem Space for Other Techniques
  • Leverage Existing Machine Learning Work
  • Challenges
  • Scalable Low Overhead Data Collection and
  • Determining Cross Machine Equivalence
  • False Positives
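
One way to read "manage only globally distinct differences" is as an outlier search across machines. The sketch below is an illustration under that assumption: the `distinct_differences` name, the `min_share` threshold, and the flat `{machine: {setting: value}}` input are hypothetical, and the sketch ignores the cross-machine equivalence problem listed under Challenges.

```python
from collections import Counter


def distinct_differences(fleet_state, min_share=0.9):
    """Report values that deviate from a value shared by most machines.

    fleet_state: {machine: {setting: value}}
    Returns {machine: {setting: deviating value}}.
    """
    # Count how often each value of each setting occurs across the fleet.
    counts = {}
    for state in fleet_state.values():
        for key, value in state.items():
            counts.setdefault(key, Counter())[value] += 1

    total = len(fleet_state)
    outliers = {}
    for machine, state in fleet_state.items():
        for key, value in state.items():
            common_value, n = counts[key].most_common(1)[0]
            # Only flag a setting when the fleet largely agrees on its value.
            if n / total >= min_share and value != common_value:
                outliers.setdefault(machine, {})[key] = value
    return outliers
```

Settings with no dominant value are never flagged, which is one crude way to limit false positives. A setting that legitimately differs on one machine would still be reported.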

System Building Challenges

Data Driven Examples
  • Troubleshooting: Strider PeerPressure
  • Spyware Detection: GateKeeper
  • Patch Impact Analysis
  • Root Kit Detection: GhostBuster
  • Exploit Site Discovery: HoneyMonkey
  • Closing the Change Mgmt Loop: LiveOps

Strider Troubleshooter
  • My application worked yesterday, but it's not
    working today. What's the problem?
  • Cross-time Diff: O(10^5) → O(10^3)
  • Windows XP System Restore Registry snapshot
  • Trace the app: O(10^5) → O(10^3)
  • Registry read/write operations
  • Diff-Trace Intersection: O(10^3) → O(10^1)
  • Inverse Change Frequency Ranking
  • GeneBank PeerPressure Ranking
  • Mostly good Registry snapshots from the Mass for
    detecting anomalies
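
The narrowing steps on this slide can be sketched as set operations followed by a ranking. This is a minimal illustration, not Strider's actual implementation: snapshots are modeled as plain dictionaries, the trace as a list of key names, and `change_counts` is a hypothetical table of how often each entry has been seen to change.

```python
def strider_candidates(snapshot_good, snapshot_bad, traced_keys, change_counts):
    """Rank Registry entries that changed between two snapshots AND were
    touched by the failing application, least frequently changed first."""
    # Cross-time diff: keys whose value differs between the two snapshots.
    all_keys = snapshot_good.keys() | snapshot_bad.keys()
    changed = {k for k in all_keys if snapshot_good.get(k) != snapshot_bad.get(k)}

    # Diff-trace intersection: keep only changed keys the app read or wrote.
    suspects = changed & set(traced_keys)

    # Inverse change frequency: entries that rarely change rank as more suspicious.
    return sorted(suspects, key=lambda k: (change_counts.get(k, 0), k))
```

An entry that changes all the time, such as a counter or timestamp, sinks to the bottom of the list, while a rarely touched setting that both changed and was read by the application rises to the top.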

O(10^1) → O(10^0)

Experimental Results

AskStrider Auto-Scanner
  • My system has been acting weird lately. What has
  • Running-module Snapshot: O(10^5) → O(10^3)
  • Earliest-Latest Diff: O(10^5) → O(10^3)
  • Diff-Snapshot Intersection: O(10^3) → O(10^2)
  • Last-Update Timestamp Ranking: O(10^2) → O(10^1)
  • Patch Filtering: O(10^1) → O(10^0)
  • During patch troubleshooting focus on files from
  • During malware troubleshooting filter out files
    from patches as noise
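
The AskStrider pipeline follows the same pattern and can be sketched as below. The data shapes (file name to content hash, file name to timestamp) and the `mode` switch are assumptions for illustration. The "patch" mode also assumes the truncated bullet above means focusing on files that came from patches.

```python
def askstrider_scan(running_files, earliest, latest, mtimes, patch_files, mode="malware"):
    """Return recently changed files that are currently loaded, newest first.

    earliest / latest: {file: content hash} from the two snapshots
    mtimes:            {file: last-update timestamp}
    patch_files:       files known to come from patches
    """
    # Earliest-latest diff: files that are new or whose content changed.
    changed = {f for f in latest if earliest.get(f) != latest[f]}

    # Diff-snapshot intersection: only files in use by running processes.
    suspects = changed & set(running_files)

    # Patch filtering, in the direction that suits the task at hand.
    if mode == "malware":
        suspects -= set(patch_files)   # treat patch files as noise
    elif mode == "patch":
        suspects &= set(patch_files)   # assumed: keep only patch files

    # Last-update timestamp ranking: most recently updated first.
    return sorted(suspects, key=lambda f: mtimes.get(f, 0), reverse=True)
```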

Patch Impact Analysis
  • If I apply this patch, which of the 3,000
    applications in my company may break?
  • Trace patch installation: O(10^5) → O(10^1)
  • Black-box patch manifest
  • For each of the O(10^3) apps
  • Trace it: O(10^5) → O(10^3)
  • Black-box persistent-state app manifest
  • Diff-Trace Intersection: O(10^3) → O(10^0) or 0
  • Test prioritization: O(10^3) → 0 to O(10^1)
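
Given a black-box manifest for the patch and one per application, the impact analysis reduces to intersecting sets. The sketch below assumes each manifest is simply a collection of file and setting names obtained from tracing; `patch_impact` and the ordering by overlap size are illustrative choices, not the published method.

```python
def patch_impact(patch_manifest, app_manifests):
    """Return [(app, shared items)] for apps touching state the patch modifies.

    patch_manifest: items written during the traced patch installation
    app_manifests:  {app name: items the app was traced reading or writing}
    Apps with no overlap are omitted; larger overlaps come first.
    """
    patch_items = set(patch_manifest)
    impacted = []
    for app, items in app_manifests.items():
        shared = patch_items & set(items)
        if shared:
            impacted.append((app, sorted(shared)))
    # Test prioritization: most shared state first, then by name for stability.
    impacted.sort(key=lambda pair: (-len(pair[1]), pair[0]))
    return impacted
```

Applications that share nothing with the patch drop out of the test plan, which is how the list of candidates shrinks from thousands to a handful.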

Improving OS Design: What Extensibility Points
Exist in the OS?
  • Extensibility Point: Configuration Setting
    Containing the File Name of Code To Be Loaded At
    Application Runtime
  • Used by Malware to Automatically Start After
  • Solution: For Each Module Load, Identify
    Previously Read Settings Containing the Module
  • Results
  • 364 Classes of EPs with 7227 EP Instances
  • 44% of EP Instances were never modified
  • Recommendation: Lock Down
  • 70% of EP Instances were used by a single
  • Recommendation: Removal
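
The detection idea can be sketched over an ordered trace of setting reads and module loads. The event tuples below are a hypothetical trace format, and matching on the module's file name is an assumption about what the truncated bullet refers to.

```python
def find_extensibility_points(trace):
    """Map each setting to the modules that were loaded after it was read
    and whose file name appears in the setting's value.

    trace: ordered events, ("read", setting, value) or ("load", module_path)
    """
    seen_reads = []   # (setting, value) pairs read so far, in order
    eps = {}          # setting -> set of module paths loaded through it
    for event in trace:
        if event[0] == "read":
            seen_reads.append((event[1], str(event[2])))
        elif event[0] == "load":
            module = event[1]
            # Compare on the bare file name, case-insensitively.
            name = module.replace("\\", "/").rsplit("/", 1)[-1].lower()
            for setting, value in seen_reads:
                if name in value.lower():
                    eps.setdefault(setting, set()).add(module)
    return eps
```

Settings that never precede a matching module load are not reported, so ordinary configuration values such as a theme name do not show up as extensibility points.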

Ghostware: The Ultimate Challenge to Trustworthy
  • Ghostware
  • Malware programs that patch the OS to hide their
    files, Registry entries, processes, loaded
    modules, network ports, etc. from other
    applications and OS utility programs
  • Bad things they can do
  • Install keyloggers to steal information
  • Use the disks as free storage
  • Use the machines to send spam emails
  • Release viruses and worms

CWS Spyware Detected by Ad-aware

CWS Spyware Hidden by Hacker Defender

GhostBuster ScanDiff

Strider GhostBuster: Ghostware Detector
  • Are there any Trojan programs hiding on my
    machine and stealing my bank account passwords?
  • File System and Registry Snapshot: O(10^5)
  • Snapshot from a WinPE CD: O(10^5)
  • Diff of the two snapshots: O(10^5) → O(10^1)
  • Content-Diff Noise Filtering: O(10^1) → O(10^0)
  • Only care about files and Registry entries that
    exist in the second snapshot, but not the first
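
The cross-view comparison described above can be sketched as a set difference between what the running OS reports and what a clean boot sees. The prefix-based noise filter here is a simplified stand-in for the content-diff filtering named on the slide, and all names are hypothetical.

```python
def ghostbuster_diff(inside_view, outside_view, noise_prefixes=()):
    """Return entries visible from the clean (WinPE) scan but hidden from
    the scan taken inside the possibly infected OS.

    inside_view:    paths or Registry entries listed by the running OS
    outside_view:   the same enumeration taken from a clean boot
    noise_prefixes: entries expected to differ for benign reasons (assumed)
    """
    # Only care about items in the second snapshot that are missing from the first.
    hidden = set(outside_view) - set(inside_view)
    prefixes = tuple(noise_prefixes)
    return sorted(p for p in hidden if not p.startswith(prefixes))
```

Anything that survives the filter exists on disk but was concealed from the running system, which is the behavior that defines ghostware.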

LiveOps: Closing The Change Management Loop
(Diagram labels: Person or Automation; Is The Change Approved?; Change Request; Change Tools; Change Detected; OS Applications Platform)

  • Think of New Ways to Avoid Complexity
  • Not just accept, find better ways to manage it
  • Invest in Deep Thinking to Advance Ops
  • Not just fighting the fires!
  • (The Nearest Way to the Exit May be Behind You!)
  • To raise new questions, new possibilities, to
    regard old problems from a new angle, requires
    creative imagination and marks real advance in
    science.