Selected Applications of Transfer Learning - PowerPoint PPT Presentation

Selected Applications of Transfer Learning. Qiang Yang, Department of Computer Science and Engineering, The Hong Kong University of Science and Technology.

1
Selected Applications of Transfer Learning
  • Qiang Yang
  • Department of Computer Science and Engineering
  • The Hong Kong University of Science and
    Technology
  • Hong Kong
  • http://www.cse.ust.hk/qyang

2
Case 1
  • Target class changes → target-transfer learning
  • Training: 2-class problem
  • Testing: 10-class problem
  • Traditional methods fail
  • Solution: find out what has not changed between
    training and testing

3
Our Work
  • Cross-Domain Learning
      • TrAdaBoost (ICML 2007)
      • Co-Clustering based Classification (SIGKDD 2007)
      • TPLSA (SIGIR 2008)
      • NBTC (AAAI 2007)
  • Translated Learning
      • Cross-lingual classification (WWW 2008)
      • Cross-media classification (NIPS 2008)
  • Unsupervised Transfer Learning
      • Self-taught clustering (ICML 2008)

4
Our Work (cont.)
  • Wenyuan Dai, Yuqiang Chen, Gui-Rong Xue, Qiang
    Yang, and Yong Yu. Translated Learning. In
    Proceedings of the Twenty-Second Annual
    Conference on Neural Information Processing
    Systems (NIPS 2008), Vancouver, British
    Columbia, Canada, December 8, 2008.
  • Xiao Ling, Wenyuan Dai, Gui-Rong Xue, Qiang Yang,
    and Yong Yu. Cross-Domain Spectral Learning. In
    Proceedings of the Fourteenth ACM SIGKDD
    International Conference on Knowledge Discovery
    and Data Mining (ACM KDD 2008), Las Vegas,
    Nevada, USA, August 24-27, 2008. Pages 488-496.
  • Wenyuan Dai, Qiang Yang, Gui-Rong Xue, and Yong
    Yu. Self-taught Clustering. In Proceedings of the
    25th International Conference on Machine Learning
    (ICML 2008), Helsinki, Finland, July 5-9, 2008.
    Pages 200-207.
  • Wenyuan Dai, Qiang Yang, Gui-Rong Xue, and Yong
    Yu. Boosting for Transfer Learning. In
    Proceedings of the 24th Annual International
    Conference on Machine Learning (ICML 2007),
    Corvallis, Oregon, USA, June 20-24, 2007. Pages
    193-200.
  • Wenyuan Dai, Gui-Rong Xue, Qiang Yang, and Yong
    Yu. Co-clustering based Classification for
    Out-of-domain Documents. In Proceedings of the
    Thirteenth ACM SIGKDD International Conference on
    Knowledge Discovery and Data Mining (ACM KDD
    2007), San Jose, California, USA, August 12-15,
    2007. Pages 210-219.
  • Dou Shen, Jian-Tao Sun, Qiang Yang, and Zheng
    Chen. Building Bridges for Web Query
    Classification. In Proceedings of the 29th ACM
    International Conference on Research and
    Development in Information Retrieval (ACM SIGIR
    2006), Seattle, USA, August 6-11, 2006. Pages
    131-138.

5
Query Classification and Online Advertisement
  • ACM KDDCUP 05 Winner
  • SIGIR 06
  • ACM Transactions on Information Systems (journal),
    2006
  • Joint work with Dou Shen, Jian-Tao Sun, and Zheng
    Chen

6
QC as Machine Learning
  • Inspired by the KDDCUP05 competition
  • Classify a query into a ranked list of categories
  • Queries are collected from real search engines
  • Target categories are organized in a tree with
    each node being a category

7
Related Work
  • Query classification/clustering
      • Classifying Web queries by geographical
        locality [Gravano 2003]
      • Classifying queries according to their
        functional types [Kang 2003]
      • Beitzel et al. studied topical classification,
        as we do, but they relied on manually
        classified data [Beitzel 2005]
      • Beeferman and Wen each worked on query
        clustering using clickthrough data
        [Beeferman 2000; Wen 2001]
  • Document/query expansion
      • Borrowing text from an extra data source
          • Using hyperlinks [Glover 2002]
          • Using implicit links from the query log
            [Shen 2006]
          • Using existing taxonomies [Gabrilovich 2005]
      • Query expansion [Manning 2007]
          • Global methods: independent of the queries
          • Local methods: using relevance feedback or
            pseudo-relevance feedback

8
Target-transfer Learning in QC
  • A classifier, once trained, stays constant
  • Target classes before
      • Sports, Politics (European, US, China)
  • Target classes now
      • Sports (Olympics, Football, NBA), Stock Market
        (Asian, Dow, Nasdaq), History (Chinese, World)
  • How can we allow the target classes to change?
  • Application
      • Advertisements come and go,
      • but our query → target mapping need not be
        retrained!
  • We call this the target-transfer learning problem

9
Solutions: Query Enrichment and Staged
Classification
10
Step 1: Query enrichment
  • Textual information

11
Step 2: Bridging Classifier
  • Wish to avoid
      • Repeating the training whenever the target is
        changed
  • Solution
      • Connect the target taxonomy and queries by
        taking an intermediate taxonomy as a bridge

12
Bridging Classifier (Cont.)
  • How to connect?

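  • The relation between a target category and an
    intermediate category
  • The relation between the query and an
    intermediate category
  • Prior prob. of an intermediate category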
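
The three annotated terms on this slide suggest a score that sums over intermediate categories. The sketch below is a minimal illustration, assuming the score of a target category is the sum over intermediate categories of (target-to-intermediate relation) × (query-to-intermediate relation) × (intermediate prior). The function name `bridge_scores`, the category names, and every probability value are hypothetical and chosen only for illustration.

```python
def bridge_scores(p_target_given_inter, p_query_given_inter, p_inter):
    """Score each target category for one query via intermediate categories.

    score(target) = sum over intermediate categories c of
        p(target | c) * p(query | c) * p(c)
    Scores are only proportional to p(target | query); they are not normalized.
    """
    scores = {}
    for target, relation in p_target_given_inter.items():
        scores[target] = sum(
            relation.get(c, 0.0) * p_query_given_inter.get(c, 0.0) * p_inter[c]
            for c in p_inter
        )
    return scores


# Hypothetical intermediate taxonomy with made-up probabilities.
p_inter = {"Sports/Basketball": 0.5, "Finance/Stocks": 0.5}
p_query_given_inter = {"Sports/Basketball": 0.08, "Finance/Stocks": 0.01}

# Made-up relation between each target category and each intermediate category.
p_target_given_inter = {
    "NBA": {"Sports/Basketball": 0.9, "Finance/Stocks": 0.0},
    "Nasdaq": {"Sports/Basketball": 0.0, "Finance/Stocks": 0.8},
}

scores = bridge_scores(p_target_given_inter, p_query_given_inter, p_inter)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Under this assumption, switching to a new target taxonomy only requires a new `p_target_given_inter` table; the query-to-intermediate part is reused unchanged.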
13
Category Selection for Intermediate Taxonomy
  • Category selection for reducing complexity
      • Total Probability (TP)
      • Mutual Information

14
Experiment: Data Sets and Evaluation
  • ACM KDDCUP
      • Held since 1997, ACM KDDCup is the leading data
        mining and knowledge discovery competition in
        the world, organized by ACM SIGKDD.
  • ACM KDDCUP 2005
      • Task: categorize 800K search queries into 67
        categories
      • Three awards: (1) Performance Award,
        (2) Precision Award, (3) Creativity Award
      • Participation
          • 142 registered groups
          • 37 solutions submitted from 32 teams
      • Evaluation data
          • 800 queries randomly selected from the 800K
            query set
          • 3 human labelers labeled the entire
            evaluation query set
      • Evaluation measurements: Precision and
        Performance (F1)
      • We won all three awards.
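
The two evaluation measurements can be made concrete with a small sketch. It assumes micro-averaging over (query, category) pairs, which may differ in detail from the official KDDCUP 2005 scoring; the function name `precision_recall_f1` and the toy labels are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Micro-averaged precision, recall and F1 over (query, category) pairs.

    predicted and gold map each query to a set of category labels.
    """
    correct = sum(len(predicted[q] & gold.get(q, set())) for q in predicted)
    n_predicted = sum(len(cats) for cats in predicted.values())
    n_gold = sum(len(cats) for cats in gold.values())
    precision = correct / n_predicted if n_predicted else 0.0
    recall = correct / n_gold if n_gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy labels for two queries (made up for illustration).
predicted = {"q1": {"Sports", "News"}, "q2": {"Finance"}}
gold = {"q1": {"Sports"}, "q2": {"Finance", "Shopping"}}
p, r, f1 = precision_recall_f1(predicted, gold)
print(p, r, f1)
```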



15
Result of Bridging Classifiers
  • Performance of the Bridging Classifier with
    Different Granularity of Intermediate Taxonomy
  • Using the bridging classifier allows the target
    classes to change freely
  • No need to retrain the classifier!

16
Summary: Target-Transfer Learning
[Diagram: Query → (classify to) → Intermediate Class → (similarity) → Target class]
17
Cross-Domain Learning
18
Case 1
  • Source
      • Many labeled instances
  • Target
      • Few labeled instances
  • Target and source domains
      • Same feature representation
      • Same classes Y (binary classes)
      • Different P(X,Y) distribution

19
TrAdaBoost: Transfer AdaBoost (cont.)
  • Given
      • Insufficient labeled data from the target domain
        (primary data)
      • Labeled data following a different distribution
        (auxiliary data)
  • The auxiliary data are weaker evidence for
    building the classifier

20
TrAdaBoost: Transfer AdaBoost (cont.)
  • Misclassified examples
      • Increase the weights of the misclassified target
        data
      • Decrease the weights of the misclassified source
        data
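
The two weight-update directions above can be sketched as one boosting round. The slide states only the direction of each change; the multipliers used here follow the commonly described TrAdaBoost rule and should be treated as an assumption. The function name `tradaboost_reweight` and the toy weights are hypothetical.

```python
import math


def tradaboost_reweight(w_source, w_target, miss_source, miss_target, n_rounds):
    """One TrAdaBoost-style weight update.

    miss_source / miss_target are lists of 0/1 flags: 1 if the current weak
    learner misclassified that example, 0 otherwise.
    """
    # Weighted error of the weak learner, measured on target data only.
    eps = sum(w * m for w, m in zip(w_target, miss_target)) / sum(w_target)
    if not 0.0 < eps < 0.5:
        raise ValueError("weak learner error on target data must be in (0, 0.5)")

    beta_target = eps / (1.0 - eps)  # less than 1 because eps < 0.5
    beta_source = 1.0 / (
        1.0 + math.sqrt(2.0 * math.log(len(w_source)) / n_rounds)
    )  # at most 1

    # Misclassified source examples shrink (multiplied by beta_source).
    new_source = [w * beta_source ** m for w, m in zip(w_source, miss_source)]
    # Misclassified target examples grow (divided by beta_target).
    new_target = [w * beta_target ** (-m) for w, m in zip(w_target, miss_target)]
    return new_source, new_target


# Toy round: four source and four target examples, one mistake in each set.
new_source, new_target = tradaboost_reweight(
    w_source=[1.0, 1.0, 1.0, 1.0],
    w_target=[1.0, 1.0, 1.0, 1.0],
    miss_source=[1, 0, 0, 0],
    miss_target=[1, 0, 0, 0],
    n_rounds=10,
)
print(new_source, new_target)
```

Correctly classified examples keep their weights in this sketch; in a full boosting loop the weights would be renormalized before training the next weak learner.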

21
TrAdaBoost: Transfer AdaBoost (cont.)
  • Performance

22
Transfer Learning in Sensor Network Tracking
  • Received-Signal-Strength (RSS) based localization
    in an Indoor WiFi environment.

[Figure: a mobile device at an unknown position (location_x, location_y) receives signals from access points 1, 2, and 3; example signal strengths shown are -40 dBm, -70 dBm, and -30 dBm. Where is the mobile device?]
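
RSS-based localization maps a vector of signal strengths to a position. The sketch below uses nearest-neighbour fingerprint matching as a stand-in; the slides do not say which model is used, so this is an assumption. The function name `locate`, the survey points, and the assignment of readings to access points are all hypothetical.

```python
import math


def locate(rss, fingerprints, k=3):
    """Estimate (x, y) as the mean location of the k nearest fingerprints."""
    nearest = sorted(fingerprints, key=lambda fp: math.dist(rss, fp[0]))[:k]
    x = sum(loc[0] for _, loc in nearest) / len(nearest)
    y = sum(loc[1] for _, loc in nearest) / len(nearest)
    return x, y


# Hypothetical offline survey: (RSS from AP1, AP2, AP3 in dBm) -> (x, y).
fingerprints = [
    ((-30.0, -40.0, -70.0), (1.0, 1.0)),
    ((-35.0, -45.0, -65.0), (2.0, 1.0)),
    ((-70.0, -60.0, -30.0), (9.0, 8.0)),
    ((-65.0, -55.0, -35.0), (8.0, 8.0)),
]

# Online phase: a new reading is matched against the survey.
estimate = locate((-32.0, -41.0, -69.0), fingerprints, k=2)
print(estimate)
```

A model built this way depends entirely on the offline survey, which is why a shift in the signal distribution makes it stale.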
23
Distribution Changes
  • The mapping function f learned in the offline
    phase can be out of date.
  • Recollecting the WiFi data is very expensive.
  • How to adapt the model?

[Figure: time axis contrasting a night-time period with a day-time period]
24
Transfer Learning in Wireless Sensor Networks
  • Transfer across time
  • Transfer across space
  • Transfer across device

25
Latent Space based Transfer Learning (Spatial
Transfer): Transfer Localization Models across
Space [Pan, Yang et al., AAAI 08]
  • Some labeled data collected in Area A and
    unlabeled data collected in Area B
  • Only a few labeled data collected in Area B
  • Want to
      • Construct a localization model of the whole area
        (Area A and Area B)

26
Transfer across time
  • LeMan: static mapping function learnt from
    offline data
  • LeMan2: relearn the mapping function from a few
    online data
  • LeMan3: combine offline and online data into one
    training set to learn the mapping function
  • Area: 30 × 40 (81 grids)
  • Six time periods
      • 12:30am-01:30am
      • 08:30am-09:30am
      • 12:30pm-01:30pm
      • 04:30pm-05:30pm
      • 08:30pm-09:30pm
      • 10:30pm-11:30pm

27
Transfer knowledge via latent manifold learning
[Diagram: two sets of labeled WiFi data connected through a latent manifold, with knowledge propagation between them]
28
VIP Recommendation in Tencent Weibo
  • Knowledge transfer from friendship relations in
    Tencent QQ, which is the largest instant
    messaging network
  • Properties
      1. Data sparsity: limited neighbors for most
         users
      2. Heterogeneous links: symmetric friendship vs.
         asymmetric following
      3. Large data: 1 billion users and tens of
         billions of links
29
Social Relation based Transfer (SORT)
VIP recommendation based on one's:
  1. X: friendship on QQ
  2. S1: user following relations on Tencent Weibo
  3. S2: VIP following relations on Tencent Weibo
30
Other Applications
  • Social app recommendation in Tencent Qzone
      • Qzone (http://qzone.qq.com) is the largest
        social network in China.
  • Video recommendation in Tencent Video
      • Four types of auxiliary data: (1) binary
        ratings, (2) social networks, (3) context,
        (4) video content
      • Rating prediction
31
Activity Recognition
  • With sensor data collected on mobile devices
      • Location: GPS, WiFi, RFID
      • Context: location, weather, etc., from GPS,
        RFID, Bluetooth, etc.
  • Various models can be used
      • Non-sequential models: Naïve Bayes, SVM
      • Sequential models: HMM, CRF

32
Activity Recognition: Input and Output (Vincent
Zheng, A Sg)
  • Input
      • Context and locations: time, history,
        current/previous locations, duration, speed
      • Object usage information
  • Trained AR model
      • Training data from calibration
      • Calibration tool: VTrack
  • Output
      • Predicted activity labels
          • Running?
          • Walking?
          • Tooth brushing?
          • Having lunch?

http://www.cse.ust.hk/vincentz/Vtrack.html
33
Datasets: MIT PlaceLab
(http://architecture.mit.edu/house_n/placelab.html)
  • MIT PlaceLab Dataset (PLIA2) [Intille et al.,
    Pervasive 2005]
  • Activities: common household activities

34
Cross-Domain Activity Recognition [Zheng, Hu,
Yang, Ubicomp 2009]
  • Challenges
      • A new domain of activities without labeled data
  • Cross-domain activity recognition
      • Transfer some available labeled data from source
        activities to help train the recognizer for
        the target activities.

[Figure labels: Cleaning Indoor, Laundry, Dishwashing]
35
How to use the similarities?
  • Similarity measure, mined from the Web
      • Example: sim(Make Coffee, Make Tea) = 0.6
  • Source domain labeled data
      • <Sensor Reading, Activity Name>
      • Example: <SS, Make Coffee>
  • Target domain pseudo-labeled data
  • Weighted SVM classifier
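
One way to read this slide is that each labeled source example is relabeled with similar target activities and weighted by the activity similarity. The sketch below illustrates that reading only; the slide does not specify how the weights are built, so the function name `pseudo_label`, the `min_sim` threshold, and the use of raw similarity as the weight are assumptions.

```python
def pseudo_label(source_data, similarity, target_activities, min_sim=0.0):
    """Relabel source examples with target activities, weighted by similarity.

    Returns (sensor_reading, target_activity, weight) triples that a
    weighted classifier could consume as training data.
    """
    weighted = []
    for reading, source_activity in source_data:
        for target in target_activities:
            sim = similarity.get((source_activity, target), 0.0)
            if sim > min_sim:
                weighted.append((reading, target, sim))
    return weighted


# The slide's example: <SS, Make Coffee> and sim(Make Coffee, Make Tea) = 0.6.
source_data = [("SS", "Make Coffee")]
similarity = {("Make Coffee", "Make Tea"): 0.6}
pseudo = pseudo_label(source_data, similarity, ["Make Tea", "Gardening"])
print(pseudo)
```

With a library such as scikit-learn, weights like these could be passed to an SVM through the `sample_weight` argument of `SVC.fit`.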
36
Calculating Activity Similarities
  • How similar are two activities?
      • Use Web search results
      • TF-IDF: traditional IR similarity metrics
        (cosine similarity)
  • Example
      • Mined similarity between the activity
        "sweeping" and the activities "vacuuming",
        "making the bed", and "gardening"
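
TF-IDF weighting with cosine similarity can be sketched in a few lines. The toy token lists below stand in for text retrieved from Web search results, and the particular TF and IDF variants are one common choice rather than the ones used in the paper; the names `tfidf_vectors` and `cosine` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) from a mapping name -> token list."""
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))
    vectors = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        vectors[name] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Made-up "search result" text for three activities.
docs = {
    "sweeping": "broom floor dust clean room".split(),
    "vacuuming": "vacuum floor dust clean carpet".split(),
    "gardening": "plant soil water garden flowers".split(),
}
vecs = tfidf_vectors(docs)
sim_vacuum = cosine(vecs["sweeping"], vecs["vacuuming"])
sim_garden = cosine(vecs["sweeping"], vecs["gardening"])
print(sim_vacuum, sim_garden)
```

On this toy data, "sweeping" comes out closer to "vacuuming" than to "gardening" because the first two share terms and the last shares none.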
