Cross Domain Distribution Adaptation via Kernel Mapping - PowerPoint PPT Presentation

1
Cross Domain Distribution Adaptation via Kernel
Mapping
  • Erheng Zhong, Wei Fan, Jing Peng, Kun Zhang,
    Jiangtao Ren, Deepak Turaga, Olivier Verscheure
  • Sun Yat-Sen University
  • IBM T. J. Watson Research Center
  • Montclair State University
  • Xavier University of Louisiana

2
Can We?
3
Standard Supervised Learning
  • Train on labeled New York Times articles, test on
    unlabeled New York Times articles
  • Classifier accuracy: 85.5%
4
In Reality
  • Labeled training data from the target domain (New
    York Times) is not available!
  • Train on labeled Reuters articles, test on unlabeled
    New York Times articles
  • Classifier accuracy: 64.1%
5
Domain Difference → Performance Drop
  • Ideal setting: train on New York Times, test on New
    York Times → 85.5% accuracy
  • Realistic setting: train on Reuters, test on New
    York Times → 64.1% accuracy
6
Synthetic Example
7
Synthetic Example
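Slides 6 and 7 show the synthetic example only as figures, which are not reproduced in this transcript. As a stand-in, here is a minimal numpy sketch (our own construction, not the paper's exact data) of a source/target pair whose marginal and conditional distributions both differ in the original space:

```python
import numpy as np

def make_synthetic_domains(n=200, seed=0):
    """Toy source/target pair with different marginal AND conditional
    distributions, in the spirit of the slides' synthetic example."""
    rng = np.random.default_rng(seed)
    # Source: two Gaussian classes centred at (-2, 0) and (2, 0).
    Xs = np.vstack([rng.normal([-2, 0], 1.0, size=(n, 2)),
                    rng.normal([2, 0], 1.0, size=(n, 2))])
    ys = np.array([0] * n + [1] * n)
    # Target: same two classes, but shifted and rotated, so both
    # P(x) and P(y|x) differ from the source in the input space.
    theta = np.pi / 6
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta), np.cos(theta)]])
    Xt = (np.vstack([rng.normal([-2, 0], 1.0, size=(n, 2)),
                     rng.normal([2, 0], 1.0, size=(n, 2))]) + [1.5, 1.0]) @ R.T
    yt = np.array([0] * n + [1] * n)
    return Xs, ys, Xt, yt

Xs, ys, Xt, yt = make_synthetic_domains()
print(Xs.mean(axis=0), Xt.mean(axis=0))  # the two marginals clearly differ
```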
8
Main Challenge → Motivation
  • Both the marginal and conditional distributions
    between the target domain and the source domain can be
    significantly different in the original space!

Could we remove the useless source-domain data?
Could we find another feature space?
How can we get rid of these differences?
9
Main Flow
Kernel Discriminant Analysis
10
Kernel Mapping
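The kernel-mapping step is shown on the slide only as a figure. The main flow names kernel discriminant analysis; as a rough, simplified stand-in (not the paper's exact formulation), here is a two-class kernel Fisher discriminant in numpy that maps both domains onto a shared discriminant axis. The `gamma` and `reg` values are illustrative choices of ours:

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kfd_fit(X, y, gamma=0.5, reg=1e-3):
    """Two-class kernel Fisher discriminant: returns expansion
    coefficients alpha so that z(x) = sum_j alpha_j k(x_j, x)."""
    K = rbf_kernel(X, X, gamma)
    idx0, idx1 = np.where(y == 0)[0], np.where(y == 1)[0]
    M0 = K[:, idx0].mean(axis=1)           # class mean in feature space
    M1 = K[:, idx1].mean(axis=1)
    N = np.zeros_like(K)                   # within-class scatter
    for idx in (idx0, idx1):
        Ki, l = K[:, idx], len(idx)
        N += Ki @ (np.eye(l) - np.full((l, l), 1.0 / l)) @ Ki.T
    N += reg * np.eye(len(X))              # regularise the scatter
    return np.linalg.solve(N, M1 - M0)

def kfd_map(alpha, Xtrain, Xnew, gamma=0.5):
    """Map new points (source or target) onto the discriminant axis."""
    return rbf_kernel(Xnew, Xtrain, gamma) @ alpha
```

Fitting on labeled source data and then calling `kfd_map` on both `Xs` and `Xt` places the two domains in the same one-dimensional discriminant space.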
11
Instance Selection
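This slide, too, is a figure. Based on the properties slide (cluster assumption, Theorem 1), the idea is to keep only the source instances whose conditional probabilities resemble the target's. A minimal sketch of one rough reading of that step, assuming our own k-means clustering and a `purity` threshold that is not from the paper:

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Tiny Lloyd's k-means with deterministic farthest-point init."""
    C = X[[0]].astype(float)
    for _ in range(k - 1):                 # pick spread-out initial centres
        d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).min(axis=1)
        C = np.vstack([C, X[d.argmax()]])
    for _ in range(iters):
        lab = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(axis=1)
        C = np.vstack([X[lab == j].mean(axis=0) if np.any(lab == j) else C[j]
                       for j in range(k)])
    return lab

def select_source_instances(Xs, ys, Xt, k=3, purity=0.8):
    """Keep source points that (a) share a cluster with target points and
    (b) sit in a cluster whose source labels are nearly pure."""
    lab = kmeans(np.vstack([Xs, Xt]), k)
    ls, lt = lab[:len(Xs)], lab[len(Xs):]
    keep = np.zeros(len(Xs), dtype=bool)
    for j in range(k):
        src = np.where(ls == j)[0]
        if len(src) == 0 or not np.any(lt == j):
            continue                       # no target mass in this cluster
        if np.bincount(ys[src]).max() / len(src) >= purity:
            keep[src] = True
    return keep
```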
12
Ensemble
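The ensemble slide is also a figure. The paper combines models built from different kernel mappings; as a self-contained, generic stand-in, here is a majority-vote ensemble over base learners trained on bootstrap resamples (the nearest-centroid base learner and `n_models` are our own illustrative choices):

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Store one centroid per class."""
    classes = np.unique(y)
    cents = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, cents

def nearest_centroid_predict(model, X):
    """Assign each point to the class of its nearest centroid."""
    classes, cents = model
    d = ((X[:, None, :] - cents[None, :, :]) ** 2).sum(-1)
    return classes[d.argmin(axis=1)]

def ensemble_predict(Xs, ys, Xt, n_models=7, seed=0):
    """Majority vote over base learners trained on bootstrap resamples
    of the (selected) source data."""
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_models):
        idx = rng.choice(len(Xs), len(Xs), replace=True)
        m = nearest_centroid_fit(Xs[idx], ys[idx])
        votes.append(nearest_centroid_predict(m, Xt))
    votes = np.stack(votes)                # shape (n_models, n_target)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```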
13
Properties
  • Kernel mapping can reduce the difference between the
    marginal distributions of the source and target
    domains (Theorem 2).
  • Both the source and target domains are approximately
    Gaussian after kernel mapping.
  • Cluster-based instance selection can pick out the
    source-domain data whose conditional probabilities
    are similar to the target's (cluster assumption,
    Theorem 1).
  • The error rate of the proposed approach can be
    bounded (Theorem 3).
  • The ensemble can further reduce the transfer risk
    (Theorem 4).
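The first property (kernel mapping shrinks the marginal gap, Theorem 2) can be checked empirically with a distribution-distance measure. The slide does not name the measure; Maximum Mean Discrepancy is a common choice for this and is sketched below, with an RBF kernel and a `gamma` of our own choosing:

```python
import numpy as np

def mmd2(X, Y, gamma=0.5):
    """Biased empirical squared Maximum Mean Discrepancy between the
    samples X and Y, using an RBF kernel. Zero iff the empirical kernel
    mean embeddings coincide; larger means a bigger marginal gap."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(4)
src = rng.normal(0.0, 1.0, (100, 2))
tgt = rng.normal(1.5, 1.0, (100, 2))   # shifted marginal
same = rng.normal(0.0, 1.0, (100, 2))  # fresh sample, same distribution
print(mmd2(src, tgt), mmd2(src, same)) # gap to tgt is far larger
```

Computing `mmd2` on the two domains before and after the kernel mapping is one way to see whether the mapping has in fact brought the marginals closer.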

14
Experiment -- Data Sets
  • Reuters
  • 21,578 Reuters news articles (the Reuters-21578
    collection)
  • 20 Newsgroups
  • 20,000 newsgroup articles
  • SyskillWebert
  • HTML source of web pages, plus one user's ratings of
    those pages, from 4 different subjects
  • All of them are high-dimensional (>1,000 features)!

15
Experiment -- Baseline methods
  • Non-transfer single classifiers
  • Transfer learning algorithm: TrAdaBoost
  • Base classifiers:
  • K-NN
  • SVM
  • Naive Bayes

16
Experiment -- Overall Performance
  • kMapEnsemble → 24 wins, 3 losses!

17
Conclusion
  • Domain transfer when the marginal and conditional
    distributions differ between the two domains.
  • Flow
  • Step 1: Kernel mapping -- bring the two domains'
    marginal distributions closer
  • Step 2: Cluster-based instance selection -- make the
    conditional distributions transferable
  • Step 3: Ensemble -- further reduce the transfer
    risk
  • Code and data available from the authors.
