Cross Domain Distribution Adaptation via Kernel Mapping - PowerPoint PPT Presentation

1
Cross Domain Distribution Adaptation via Kernel
Mapping
  • Erheng Zhong, Wei Fan, Jing Peng, Kun Zhang,
    Jiangtao Ren, Deepak Turaga, Olivier Verscheure
  • Sun Yat-Sen University
  • IBM T. J. Watson Research Center
  • Montclair State University
  • Xavier University of Louisiana

2
Can We?
3
Standard Supervised Learning
  • Train on labeled New York Times articles, test on
    unlabeled New York Times articles
  • Classifier accuracy: 85.5%
4
In Reality
  • Labeled training data from the target domain (New
    York Times) is not available!
  • Train on labeled Reuters articles, test on unlabeled
    New York Times articles
  • Classifier accuracy: 64.1%
5
Domain Difference → Performance Drop
  • Ideal setting: train on New York Times, test on New
    York Times → 85.5% accuracy
  • Realistic setting: train on Reuters, test on New
    York Times → 64.1% accuracy
6
Synthetic Example
7
Synthetic Example
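Slides 6 and 7 show the synthetic example only as figures, which are not reproduced in this transcript. As a stand-in, here is a minimal numpy sketch (our own construction, not the paper's exact data) of a source/target pair whose marginal and conditional distributions both differ in the original space:

```python
import numpy as np

def make_synthetic_domains(n=200, seed=0):
    """Toy source/target pair with different marginal AND conditional
    distributions, in the spirit of the slides' synthetic example."""
    rng = np.random.default_rng(seed)
    # Source: two Gaussian classes centred at (-2, 0) and (2, 0).
    Xs = np.vstack([rng.normal([-2, 0], 1.0, size=(n, 2)),
                    rng.normal([2, 0], 1.0, size=(n, 2))])
    ys = np.array([0] * n + [1] * n)
    # Target: same two classes, but shifted and rotated, so both
    # P(x) and P(y|x) differ from the source in the input space.
    theta = np.pi / 6
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta), np.cos(theta)]])
    Xt = (np.vstack([rng.normal([-2, 0], 1.0, size=(n, 2)),
                     rng.normal([2, 0], 1.0, size=(n, 2))]) + [1.5, 1.0]) @ R.T
    yt = np.array([0] * n + [1] * n)
    return Xs, ys, Xt, yt

Xs, ys, Xt, yt = make_synthetic_domains()
print(Xs.mean(axis=0), Xt.mean(axis=0))  # the two marginals clearly differ
```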
8
Main Challenge → Motivation
  • Both the marginal and conditional distributions
    between the target domain and the source domain can be
    significantly different in the original space!

Could we remove the useless source-domain data?
Could we find another feature space?
How can we get rid of these differences?
9
Main Flow
Kernel Discriminant Analysis
10
Kernel Mapping
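The kernel-mapping step is shown on the slide only as a figure. The main flow names kernel discriminant analysis; as a rough, simplified stand-in (not the paper's exact formulation), here is a two-class kernel Fisher discriminant in numpy that maps both domains onto a shared discriminant axis. The `gamma` and `reg` values are illustrative choices of ours:

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kfd_fit(X, y, gamma=0.5, reg=1e-3):
    """Two-class kernel Fisher discriminant: returns expansion
    coefficients alpha so that z(x) = sum_j alpha_j k(x_j, x)."""
    K = rbf_kernel(X, X, gamma)
    idx0, idx1 = np.where(y == 0)[0], np.where(y == 1)[0]
    M0 = K[:, idx0].mean(axis=1)           # class mean in feature space
    M1 = K[:, idx1].mean(axis=1)
    N = np.zeros_like(K)                   # within-class scatter
    for idx in (idx0, idx1):
        Ki, l = K[:, idx], len(idx)
        N += Ki @ (np.eye(l) - np.full((l, l), 1.0 / l)) @ Ki.T
    N += reg * np.eye(len(X))              # regularise the scatter
    return np.linalg.solve(N, M1 - M0)

def kfd_map(alpha, Xtrain, Xnew, gamma=0.5):
    """Map new points (source or target) onto the discriminant axis."""
    return rbf_kernel(Xnew, Xtrain, gamma) @ alpha
```

Fitting on labeled source data and then calling `kfd_map` on both `Xs` and `Xt` places the two domains in the same one-dimensional discriminant space.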
11
Instance Selection
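This slide, too, is a figure. Based on the properties slide (cluster assumption, Theorem 1), the idea is to keep only the source instances whose conditional probabilities resemble the target's. A minimal sketch of one rough reading of that step, assuming our own k-means clustering and a `purity` threshold that is not from the paper:

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Tiny Lloyd's k-means with deterministic farthest-point init."""
    C = X[[0]].astype(float)
    for _ in range(k - 1):                 # pick spread-out initial centres
        d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).min(axis=1)
        C = np.vstack([C, X[d.argmax()]])
    for _ in range(iters):
        lab = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(axis=1)
        C = np.vstack([X[lab == j].mean(axis=0) if np.any(lab == j) else C[j]
                       for j in range(k)])
    return lab

def select_source_instances(Xs, ys, Xt, k=3, purity=0.8):
    """Keep source points that (a) share a cluster with target points and
    (b) sit in a cluster whose source labels are nearly pure."""
    lab = kmeans(np.vstack([Xs, Xt]), k)
    ls, lt = lab[:len(Xs)], lab[len(Xs):]
    keep = np.zeros(len(Xs), dtype=bool)
    for j in range(k):
        src = np.where(ls == j)[0]
        if len(src) == 0 or not np.any(lt == j):
            continue                       # no target mass in this cluster
        if np.bincount(ys[src]).max() / len(src) >= purity:
            keep[src] = True
    return keep
```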
12
Ensemble
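The ensemble slide is also a figure. The paper combines models built from different kernel mappings; as a self-contained, generic stand-in, here is a majority-vote ensemble over base learners trained on bootstrap resamples (the nearest-centroid base learner and `n_models` are our own illustrative choices):

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Store one centroid per class."""
    classes = np.unique(y)
    cents = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, cents

def nearest_centroid_predict(model, X):
    """Assign each point to the class of its nearest centroid."""
    classes, cents = model
    d = ((X[:, None, :] - cents[None, :, :]) ** 2).sum(-1)
    return classes[d.argmin(axis=1)]

def ensemble_predict(Xs, ys, Xt, n_models=7, seed=0):
    """Majority vote over base learners trained on bootstrap resamples
    of the (selected) source data."""
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_models):
        idx = rng.choice(len(Xs), len(Xs), replace=True)
        m = nearest_centroid_fit(Xs[idx], ys[idx])
        votes.append(nearest_centroid_predict(m, Xt))
    votes = np.stack(votes)                # shape (n_models, n_target)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```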
13
Properties
  • Kernel mapping can reduce the difference between the
    marginal distributions of the source and target
    domains (Theorem 2).
  • Both the source and target domains are approximately
    Gaussian after kernel mapping.
  • Cluster-based instance selection can pick out the
    source-domain data whose conditional probabilities
    are similar to the target's (cluster assumption,
    Theorem 1).
  • The error rate of the proposed approach can be
    bounded (Theorem 3).
  • The ensemble can further reduce the transfer risk
    (Theorem 4).
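The first property (kernel mapping shrinks the marginal gap, Theorem 2) can be checked empirically with a distribution-distance measure. The slide does not name the measure; Maximum Mean Discrepancy is a common choice for this and is sketched below, with an RBF kernel and a `gamma` of our own choosing:

```python
import numpy as np

def mmd2(X, Y, gamma=0.5):
    """Biased empirical squared Maximum Mean Discrepancy between the
    samples X and Y, using an RBF kernel. Zero iff the empirical kernel
    mean embeddings coincide; larger means a bigger marginal gap."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(4)
src = rng.normal(0.0, 1.0, (100, 2))
tgt = rng.normal(1.5, 1.0, (100, 2))   # shifted marginal
same = rng.normal(0.0, 1.0, (100, 2))  # fresh sample, same distribution
print(mmd2(src, tgt), mmd2(src, same)) # gap to tgt is far larger
```

Computing `mmd2` on the two domains before and after the kernel mapping is one way to see whether the mapping has in fact brought the marginals closer.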

14
Experiment -- Data Sets
  • Reuters
  • 21,578 Reuters news articles (the Reuters-21578
    collection)
  • 20 Newsgroups
  • 20,000 newsgroup articles
  • SyskillWebert
  • HTML source of web pages, plus one user's ratings of
    those pages, from 4 different subjects
  • All of them are high-dimensional (>1,000 features)!

15
Experiment -- Baseline methods
  • Non-transfer single classifiers
  • Transfer learning algorithm: TrAdaBoost
  • Base classifiers:
  • K-NN
  • SVM
  • Naive Bayes

16
Experiment -- Overall Performance
  • kMapEnsemble → 24 wins, 3 losses!

17
Conclusion
  • Domain transfer when the marginal and conditional
    distributions differ between the two domains.
  • Flow
  • Step 1: Kernel mapping -- bring the two domains'
    marginal distributions closer
  • Step 2: Cluster-based instance selection -- make the
    conditional distributions transferable
  • Step 3: Ensemble -- further reduce the transfer
    risk
  • Code and data available from the authors.
