Central topic: scalable, self-adaptive analytic systems with real-life impact

1
Artur Dubrawski: Research
  • Central topic: scalable, self-adaptive analytic
    systems with real-life impact
  • Approaches: statistical data mining, smart data
    representations, efficient algorithms, machine
    learning, working closely with end users
  • Director of the Auton Lab (a part of the CMU
    Robotics Institute), a team of 15 people
  • 3 regular + 3 affiliated faculty, 4 programmers
    and analysts, 6 PhD students, and 2 interns, led by
    Artur Dubrawski and Jeff Schneider
  • Currently working on 10 sponsored projects
  • Current and past funding from NSF, DARPA,
    DHS/HSARPA, ONR, AFRL, NASA, USDA, FDA, CDC, a
    few Fortune 100 companies, and a number of smaller
    corporate and academic sponsors and partners
  • Deliverables:
  • Algorithms for fast and scalable statistical
    machine learning
  • Software for embedding in production systems
  • Software available for download (current avg. 200
    new downloads per month)
  • www.autonlab.org

2
Bio-Surveillance
  • Rapid and reliable detection of potential threats
    to public health
  • Maintenance and improvement of situational
    awareness
  • Multiple sources of early-warning data (ER, OTC,
    news, ...)
  • Autonomous learning of probabilistic models of
    detection
  • Multivariate frequentist and Bayesian approaches
  • Scalability to large quantities of
    spatio-temporal data
  • Rapid detection of even small clusters of
    illness, sensitivity to signals supported by
    little evidence, management of false positives
  • Funding from CDC (BioSense), NSF, DHS (NBIS 2)
  • Previously also from DARPA, DHS (incl. BioWatch),
    and the State of PA
  • Cooperation with the University of Pittsburgh
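The trade-off named above, sensitivity to small clusters versus management of false positives, can be illustrated with a minimal sketch. This is not the Auton Lab's method: it assumes that each region has a known expected daily case count, that counts are Poisson distributed, and that a simple Bonferroni correction is an acceptable way to limit false alarms. The names `poisson_tail`, `flag_regions`, and `alpha` are hypothetical.

```python
import math


def poisson_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    # Accumulate P(X <= observed - 1) term by term, then take the complement.
    term = math.exp(-expected)
    cdf = term
    for k in range(1, observed):
        term *= expected / k
        cdf += term
    return max(0.0, 1.0 - cdf)


def flag_regions(counts, baselines, alpha=0.01):
    """Return (region, p_value) pairs for unusually high counts, most surprising first.

    counts and baselines are dicts keyed by region (an assumed data layout).
    """
    if not counts:
        return []
    # Bonferroni correction: split the false-alarm budget across all regions tested.
    threshold = alpha / len(counts)
    alerts = []
    for region, observed in counts.items():
        p_value = poisson_tail(observed, baselines[region])
        if p_value < threshold:
            alerts.append((region, p_value))
    return sorted(alerts, key=lambda item: item[1])
```

Lowering `alpha` reduces false alarms at the cost of missing weaker signals; the multivariate and spatio-temporal approaches listed on the slide go well beyond this per-region test.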

Monitoring 2002 Olympics
3
Safety of Food Supply and Agriculture
[Diagram: data sources along the food supply chain, with stages labeled Farm, Import, and Fork]
  • Systems: eLEXNET (FDA), CCMS II (USDA), CZDSS (USDA),
    FABIS/PAM, RBI, PHICP (USDA)
  • Data: food sample test results (microbes, toxins);
    consumer complaints; multiple streams (condemnations,
    lab tests, inspection results, recalls, ...);
    slaughterhouses (condemnations, results of lab
    tests for pathogens, residues and microbes
    causing zoonotic diseases)
  • Common denominator: quickly detect emerging
    patterns and rank them according to expected
    risk, so that investigative resources can be
    allocated efficiently
4
USAF Fleet Health Surveillance
  • Systematic failures in maintenance and logistic
    support, operational processes or
    hardware/software quality can lead to missed
    availability targets if not identified early
  • Idea: use existing data from the USAF data
    warehouse to provide early identification of
    emerging logistics crises
  • Encouraging initial results: the prototype system is
    able to detect some known events weeks before
    they were originally recognized.

5
Nuclear Threat Assessment (work with LLNL)
  • Contextually-Aware Expert-System for Automated
    Threat Assessment
  • We are developing an automated system that
    combines:
  • Radiation portal monitor scans
  • Gamma-ray spectra and neutron measurements
  • Contextual information: source distance, manifest
    information, etc.
  • It uses machine learning to:
  • Improve threat/non-threat discrimination
  • Classify radiation alarms
  • Lower false alarm rates
  • Improve detection probability by allowing lower
    operation thresholds
  • Improve confidence in alarm resolutions
  • Reduce duplicative data entry, resolution time
    and operating costs.

6
Link-Entity Data Analysis
  • Group Detection Algorithms
  • Provide methods for automatic group detection
    from noisy co-occurrence data
  • Interpretable groups with definite memberships
  • Increasingly efficient/robust solutions.
  • Link Prediction and Activity Rating Algorithms
  • Model the data using an (unseen) underlying
    social network and base future predictions on
    this network (cGraph)
  • Learn a probabilistic model of entity activity
    directly from the data. Use the co-occurrence
    and demographic data to predict additional active
    entities (AFDL).
  • Efficient Learning of Massive-scale Bayesian
    Network Models
  • It is now possible to learn very large Bayesian
    networks from data, allowing the user to look for
    structure/anomalies/etc. in data sets that may
    not previously have been tractable to consider.
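As a point of reference for group detection from noisy co-occurrence data, here is a deliberately naive baseline. It is not cGraph, AFDL, or any Auton Lab algorithm: it assumes each record is a list of entity names seen together, treats a pair as linked only if it co-occurs at least `min_count` times (a hypothetical noise filter), and reports connected components as groups with definite memberships.

```python
from collections import Counter, defaultdict
from itertools import combinations


def detect_groups(records, min_count=2):
    """Group entities that are linked by repeated co-occurrence.

    records: iterable of lists of entity names observed together (assumed layout).
    Returns a list of sorted groups; entities with no strong link are omitted.
    """
    # Count how often each unordered pair of entities appears in the same record.
    pair_counts = Counter()
    for record in records:
        for a, b in combinations(sorted(set(record)), 2):
            pair_counts[(a, b)] += 1

    # Keep only pairs seen often enough to be unlikely to be noise.
    neighbors = defaultdict(set)
    for (a, b), n in pair_counts.items():
        if n >= min_count:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Each connected component of the filtered graph is one group.
    groups, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(neighbors[node] - seen)
        groups.append(sorted(group))
    return groups
```

A hard threshold like this is brittle on real data, since a single spurious strong link merges two groups; that brittleness is one reason more robust, probabilistic methods are of interest.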

7
Research: What's Going On and What's Ahead
  • We do a lot of fundamental research into:
  • Clever data structures
  • Smart, computationally efficient and numerically
    accurate algorithms
  • Practical autonomy and complementing evidence
    available in data with human expertise
  • Our research is motivated by the actual needs of
    the end users.
  • Today, the data available to analysts is too
    voluminous for them to internalize
  • In 10-20 years, the complexity of the concepts
    will be too great to internalize
  • At the same time, the nature and types of
    required analyses will be changing too quickly to
    permit off-line processing
  • Efficient (computationally and statistically)
    machine learning and data mining will make the
    analyses feasible.

8
Artur Dubrawski: Teaching at MISM
  • 95-791 Data Mining
  • Mini 1 this Fall, then Mini 3 in S08
  • 95-852 Analytics and Business Intelligence
  • Mini 2 this Fall, then (I think) Mini 4 in S08
  • 95-911 Independent Studies on Data Mining
  • Mutually interesting topics (in the areas of data
    mining and applied/business analytics)
  • Systems Projects
  • Past: Allegheny County DHS, Mellon Financial, USDA,
    MicroStrategy

9
Contact Information