Open Source Machine Learning - PowerPoint PPT Presentation

1
Open Source Machine Learning
  • Open Source Probabilistic Network Library
  • Gary Bradski
  • Program Manager
  • Systems Technology Labs - Intel

2
What are we announcing today?
  • Intel is releasing a library of Open Source
    Software for Machine Learning
  • The first library is the Probabilistic Network
    Library (PNL), which contains code for inference
    and learning using Bayesian Networks
  • Research and Development was conducted in Intel
    research labs in US, Russia and China
  • Software is released as part of Intel Open
    Research Program
  • Tool for research in many application areas
  • Open Source under a BSD license
  • The code is free for academic and commercial use
  • More info: http://www.intel.com/research/mrl/pnl

3
Why is Intel involved?
  • Statistical Computing and Machine Learning can
    considerably change computing applications
  • Machine Learning requires high-powered processors
  • Ties into Intel's research in other areas such as
    wireless networking, sensor networks and
    Proactive Health

4
What is Machine Learning?
  • Machine Learning allows computers to learn from
    their experiences and from gathered data
  • We've known for more than 200 years that
    probability theory is the right tool to model
    systems, but it has always been too hard to
    compute. Recent advances in computing allow
    calculation of complex models
  • Machines are good at gathering data and
    performing complex analysis
  • Machine Learning is a sea change in development
    of applications since it allows computers to be
    more proactive and predictive

5
Applications of Machine Learning
  • Interface: Audio Visual Speech Recognition
    (AVSR), natural language processing, etc.
  • AI: robotics, computer games, entertainment,
    etc.
  • Data Analysis: information retrieval, data
    mining, etc.
  • Biological: gene sequencing, genomics,
    computational pharmacology
  • Computer: run-time optimization
  • Industrial: fault diagnosis
  • Applications of machine learning cover a broad
    range
  • Genomics - matching of protein strands
  • Collaborative Filtering - personal Google
  • Drug Discovery - shortening of the drug
    discovery cycle
  • Patient and elder care - wireless camera and
    sensor networks help monitor patients

6
Open ML Components Plan
[Diagram; the placement and status of each component
are not recoverable from the transcript.]
  • Key: Optimized, Implemented, Not implemented
  • Diagram labels: OpenML; Bayesian Networks
    (OpenPNL-2003); Supervised, Unsupervised;
    Modeless, Model based
  • Components: Boosted decision trees, Influence
    diagrams, SVM, BayesNets Classification,
    Decision trees, K-NN, Bayesnet structure
    learning, K-means, Dependency Nets, BayesNets
    Parameter fitting, Spectral clustering,
    Agglomerative clustering, PCA

7
Model Based Machine Learning
  • Machine Learning can be based on Models
    (model-based) or it could be Model-less
  • In version 1.0 of OpenML, Intel is focusing on
    Bayesian Networks and Probabilistic Networks,
    which fall under the model-based category
  • The Bayesian approach provides a mathematical
    rule explaining how one should change existing
    beliefs in the light of new evidence
  • Model-less approaches are used for clustering and
    classification
  • Intel will release libraries using model-less
    approaches next year
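
The belief-update rule mentioned on this slide is Bayes' rule, P(H|E) = P(E|H) P(H) / P(E). A minimal Python sketch follows; it is not PNL code, and the fault-alarm scenario and every number in it are made up for illustration.

```python
def bayes_update(prior, likelihood):
    """Return the posterior P(H | E) for each hypothesis H.

    prior:      dict mapping hypothesis -> P(H)
    likelihood: dict mapping hypothesis -> P(E | H) for the observed evidence E
    """
    # Unnormalized posterior: P(E | H) * P(H)
    unnormalized = {h: likelihood[h] * prior[h] for h in prior}
    # P(E), the normalizing constant, is the sum over all hypotheses
    evidence = sum(unnormalized.values())
    return {h: p / evidence for h, p in unnormalized.items()}


# Illustrative numbers only: a machine is faulty 1% of the time, and an
# alarm fires for 90% of faulty machines and 5% of healthy ones.
prior = {"faulty": 0.01, "healthy": 0.99}
alarm_likelihood = {"faulty": 0.90, "healthy": 0.05}
posterior = bayes_update(prior, alarm_likelihood)
print(posterior)
```

With these assumed numbers, the alarm raises the belief that the machine is faulty from 1% to roughly 15%: the existing belief is revised by the evidence rather than replaced by it.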

8
Applications of Model-less ML
Example: Machine 18, Fab 11 - tolerance goes out
when temperature > 87°
  • Suitable for applications such as Fault
    Diagnosis
  • The system does not have a model
  • It collects data and clusters and classifies
    them
  • Recognition is derived from these clusters
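
The collect, cluster, and classify loop described above can be sketched in a few lines of Python. This sketch assumes plain k-means on a single sensor reading as the clustering step; the slide does not say which algorithm is used, and the temperature values and function names are hypothetical.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Cluster 1-D readings with plain k-means (k >= 2); returns sorted centroids."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if empty)
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return sorted(centroids)


def nearest_cluster(value, centroids):
    """Index of the centroid closest to a new reading."""
    return min(range(len(centroids)), key=lambda c: abs(value - centroids[c]))


# Hypothetical temperature log: most readings near 70, a few near 90
readings = [68, 70, 71, 69, 72, 70, 89, 91, 90, 88]
centroids = kmeans_1d(readings, k=2)
print(centroids)
print(nearest_cluster(92, centroids))
```

No model of the machine is written down anywhere: the two groups emerge from the data, and a new reading is recognized by the cluster it falls closest to.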

9
Applications of Model-based ML
  • Our research has focused on Bayesian Networks
  • Hidden Markov Models (HMMs), a type of Bayesian
    Net, are widely used in speech recognition;
    coupled Hidden Markov Models are used in Audio
    Visual Speech Recognition (the use of visual
    data in speech recognition)
  • Open Source PNL is an optimized infrastructure
    for research and development in Model Based
    Machine Learning

[Images: Audio Visual Speech Recognition; Face
Recognition and Tracking]
10
Example Vision Applications
Image super resolution - Use a Bayesian method to
produce a clear image from a low-resolution
picture
11
Intel Systems Technology Lab
Santa Clara, CA, USA: Graphics Lab Machine
Learning Architecture Lab
Hillsboro, OR, USA: Wireless Systems Media 3D
Graphics Tech. Management
Beijing, PR China: China Research Center Speech
and Machine Learning
Nizhny Novgorod, Russia: Architecture for Machine
Learning, Media, 3D Graphics, Computer Vision
  • One of three major labs of Intel Corporate
    Technology Group
  • 300 researchers worldwide
  • Focus on impact on Intel Architecture
  • Drive university and industry initiatives

12
Why Open Source?
  • Expands our research base
  • Allows Intel researchers to collaborate easily
    with thousands of colleagues worldwide
  • Remove barriers, speed up collaboration
  • Tap into a very large innovative community
  • Ability to get feedback from a large number of
    developers to design future microprocessors
  • Chance to explore innovative usage models
  • Diffuse new technologies and usage models to a
    wide group of early adopters

13
Open Research Program
  • Currently four open source projects:
    http://www.intel.com/software/products/opensource/index.htm
  • OpenCV - Computer Vision Library:
    http://www.intel.com/research/mrl/research/opencv/
  • OpenRC - Open Research Compiler:
    http://ipf-orc.sourceforge.net/ORC-overview.htm
  • OpenLF - Open Light Fields:
    http://www.intel.com/research/mrl/research/lfm/
  • OpenAVSR - Audio Visual Speech Recognition:
    http://www.intel.com/research/mrl/research/avcsr.htm

14
Example: OpenCV
  • Released in June 2000
  • A library of 500 computer vision algorithms,
    including applications such as Face Recognition,
    Face Tracking, Stereo Vision, Camera Calibration
  • Highly tuned for IA
  • Windows and Linux Versions
  • Over 500,000 Downloads
  • Broad use in academia (450) and Industry (360)

15
More Information
Visit the Open Source ML web page; download
at http://www.intel.com/research/mrl/pnl
16
Backup
17
Modeless and Model Based ML
We'll use an example application from our current
research to describe two basic approaches to
machine learning
  • Model Based: Bayesian Networks, Function
    fitters, Regression, Filters
  • Modeless: Classifiers, Clustering, Kernel
    estimators

[Figure: tree of labeled data groups: A; AACACB,
CBABBC; AAA, CCB, ABBC, CB; B, C, B, C]
18
Quick view of Bayesian networks
19
What is a Bayesian Network?
  • A Bayesian network, or a belief network, is a
    graph in which the following holds:
  • A set of random variables makes up nodes of the
    network.
  • A set of directed links connects pairs of nodes
    to denote causality relations between variables.
  • Each node has a conditional probability
    distribution (CPD) that quantifies the effects
    that the parents have on the node
  • Graphical Models are more general, allowing
    undirected links, mixed directed/undirected
    connections, and loops within the graph
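
The three ingredients listed above (nodes, directed links, and a CPD per node) are enough to compute any probability in the network. The following Python sketch illustrates this with a hypothetical three-node network; it does not use the PNL API, and the variable names and probabilities are invented for the example.

```python
from itertools import product

# Hypothetical three-node network: Rain -> WetGrass <- Sprinkler.
# Each node lists its parents (the directed links into it).
parents = {"Rain": (), "Sprinkler": (), "WetGrass": ("Rain", "Sprinkler")}

# CPD per node: maps a tuple of parent values to P(node = True | parents).
# All probabilities are made up for illustration.
cpd = {
    "Rain": {(): 0.2},
    "Sprinkler": {(): 0.1},
    "WetGrass": {
        (True, True): 0.99,
        (True, False): 0.90,
        (False, True): 0.80,
        (False, False): 0.05,
    },
}


def joint(assignment):
    """P(assignment) as the product of each node's CPD entry."""
    prob = 1.0
    for node, node_parents in parents.items():
        parent_values = tuple(assignment[p] for p in node_parents)
        p_true = cpd[node][parent_values]
        prob *= p_true if assignment[node] else 1.0 - p_true
    return prob


def query(target, evidence):
    """P(target = True | evidence) by summing the joint over all worlds."""
    nodes = list(parents)
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(nodes)):
        world = dict(zip(nodes, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue  # skip worlds that contradict the evidence
        p = joint(world)
        denominator += p
        if world[target]:
            numerator += p
    return numerator / denominator


print(query("Rain", {"WetGrass": True}))
```

Brute-force enumeration is used here only because the network is tiny; it grows exponentially with the number of nodes, which is why dedicated inference algorithms matter for larger networks.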

20
Computational Advantages of Bayesian Networks
  • Bayesian Networks graphically express conditional
    independence of probability distributions.
  • Independencies can be exploited for large
    computational savings.
  • EXAMPLE

Joint probability of a system of 3 discrete
variables (A, B, C) with 5 possible values each:
P(A,B,C) as a full 5x5x5 table: 125 parameters
But a graphical model factors the probabilities,
taking advantage of the independencies:
P(A,B,C) = P(A) P(B|A) P(C|A):
5 + 5x5 + 5x5 = 55 parameters
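
The two parameter counts can be checked with a short Python sketch. It assumes the factorization P(A) P(B|A) P(C|A), in which A is the only parent of both B and C, and it counts table entries as the slide does; the helper name is hypothetical.

```python
from math import prod


def table_size(cardinality, parent_cardinalities=()):
    """Number of entries in a CPD table for one node given its parents."""
    return cardinality * prod(parent_cardinalities)


values = 5  # each of A, B, C takes 5 values

# Full joint table P(A, B, C)
full_joint = values ** 3

# Factored model, assuming A is the only parent of B and of C:
# P(A) + P(B | A) + P(C | A)
factored = (
    table_size(values)
    + table_size(values, (values,))
    + table_size(values, (values,))
)

print(full_joint, factored)  # 125 55
```

The gap widens quickly: the full table grows exponentially with the number of variables, while the factored form grows with the size of each node's parent set.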
21
Causality and Bayesian Nets
  • Think of Bayesian Networks as a Circuit Diagram
    of Probability Models
  • The Links indicate causal effect, not direction
    of information flow.
  • Just as we can predict effects of changes on the
    circuit diagram, we can predict consequences of
    operating on our probability model diagram.

[Diagram: circuit with components Mains, Transf.,
Diode, Diode, Capac., Battery and Ammeter; key:
Observed, Un-Observed]
22
Quick view of Decision Trees and Statistical
Boosting
23
Statistical Classification
  • Cluster data to infer or predict properties
  • Example: Decision trees

Find splits that most purify the labeled data, all
the way down
Prune the tree to minimize complexity
The split rules are used to classify future data
[Figure: decision tree over labeled data. Root:
AACBAABBCBCC; first split: AACACB, CBABBC; second
split: AAA, CCB, ABBC, CB; further splits: B, CC, B,
C, A, BBC, C, BB]
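
Finding the split that most purifies the labels can be sketched as follows. This Python example assumes Gini impurity as the purity measure and a single numeric feature; the slide names neither, and the feature values paired with the slide's labels are made up. Pruning is not shown.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure group, higher for mixed groups."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(points):
    """Find the threshold on a 1-D feature that most purifies the labels.

    points: list of (feature_value, label) pairs.
    Returns (threshold, weighted_impurity); left group is value <= threshold.
    """
    ordered = sorted(points)
    best = (None, gini([label for _, label in ordered]))
    for i in range(1, len(ordered)):
        lo, hi = ordered[i - 1][0], ordered[i][0]
        if lo == hi:
            continue  # cannot split between identical feature values
        left = [label for _, label in ordered[:i]]
        right = [label for _, label in ordered[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(ordered)
        if impurity < best[1]:
            best = ((lo + hi) / 2, impurity)
    return best


# Hypothetical data: the slide's 12 labels paired with made-up feature values
labels = "AACBAABBCBCC"
data = [(float(i), label) for i, label in enumerate(labels)]
print(best_split(data))
```

A full tree applies `best_split` recursively to each resulting group until the groups are pure or too small to split, then prunes branches that add complexity without improving accuracy.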
24
Statistical Classification: Boosting
Use a weak classifier such as a 1-level tree
Re-weight the error cases and classify again.
Record weight factor Wi for the ith case.
Repeat until you have a forest
Use the error-weighted forest to vote on the
classification of new data
[Figure: AACBAABBCBCC split into AACACB, CBABBC;
after re-weighting, AACBAABBCBCC split into AAAACB,
CCBBBC]
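
The boosting loop on this slide can be sketched in Python using AdaBoost with 1-level trees (stumps). AdaBoost is one well-known instance of the scheme, chosen here as an assumption; the slide does not name a specific algorithm, and the data and function names are hypothetical.

```python
import math


def train_stump(xs, ys, weights):
    """Pick the 1-level tree (threshold, polarity) with the lowest weighted error.

    xs: 1-D feature values; ys: labels in {-1, +1}; weights: sum to 1.
    """
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x <= threshold else -polarity for x in xs]
            error = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds=3):
    """Return a list of (alpha, threshold, polarity) weighted stumps."""
    n = len(xs)
    weights = [1.0 / n] * n
    forest = []
    for _ in range(rounds):
        error, threshold, polarity = train_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # this stump's vote weight
        forest.append((alpha, threshold, polarity))
        # Re-weight: misclassified cases gain weight, correct ones lose it
        preds = [polarity if x <= threshold else -polarity for x in xs]
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return forest


def predict(forest, x):
    """Weighted vote of all stumps."""
    score = sum(a * (pol if x <= thr else -pol) for a, thr, pol in forest)
    return 1 if score >= 0 else -1


# Hypothetical 1-D data that no single threshold separates
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 1, -1, -1, 1, 1, -1, -1]
forest = adaboost(xs, ys, rounds=3)
print([predict(forest, x) for x in xs])
```

Each stump alone misclassifies at least a quarter of this data, but the weighted vote can combine several thresholds into a decision rule that no single stump can express.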
25
Application areas and libraries
26
Applications of ML
[Diagram; the status of each item is not recoverable
from the transcript.]
  • Key: Actively working on, Ramping, Past work,
    External activity
  • Application areas: Interface, Data Analysis, AI,
    Industrial, Biologic, Computer
  • TOOLS: Trees, Boosting, Random forest;
    Statistical Regression, ANOVA; Neural Nets, SVM;
    Stochastic Discrimination; Genetic Algorithms;
    Reinforcement Learning

27
Probabilistic Network Library
[Diagram labels; the layout is not recoverable from
the transcript.]
  • Intel, Universities
  • Application Driven: Data Mining, AI, Industrial,
    Interface
  • Trace Compression, Cognitive Modeling,
    Lips + Speech AVSR, Gene Sequencing, Vision
    Models, Learned Control, Epidemiology, Workload
    Analysis
  • Drive into Future Hardware: Modify Existing
    Architectures, Create New Architectures
  • Chipset, Platform, CPU Instructions, cache

28
Open Source Computer Vision (OpenCV)
29
Machine Learning Library (OpenMLL)
  • CLASSIFICATION / REGRESSION: CART, Statistical
    Boosting, MART, Random Forests, Stochastic
    Discrimination, Logistic, SVM, K-NN
  • CLUSTERING: K-Means, Spectral Clustering,
    Agglomerative Clustering, LDA, SVD, Fisher
    Discriminant
  • TUNING/VALIDATION: Cross validation,
    Bootstrapping, Sampling methods
  • Alpha Q1'04, Beta Q4'04

30
Optimization (Lib ?)
[Diagram: taxonomy of optimization problems and
methods; the layout is not recoverable from the
transcript.]
  • Large-scale Optimizations, Combinatorial
    Optimizations
  • Continuous, Mixed, Discrete
  • Constrained, Unconstrained
  • Linear, Nonlinear
  • NLP, LP, QP
  • Sim. Annealing, Genetic Alg, Stoch. Search,
    Network Programming, Dynamic Programming
  • Interior Point, Active Set, Branch and Bound
  • Conjugate Gradient, Newton, SQP
  • Domain Reduction, Constraints Propagation,
    Simplex
  • Problems looking at: Circuit layout, Device
    geometry, Chemical binding synthesis