Using Text Mining to Infer Semantic Attributes for Retail Data Mining

Slides: 27

Provided by: csG6

Learn more at: https://cs.gmu.edu

1
Using Text Mining to Infer Semantic Attributes
for Retail Data Mining

Authors: Rayid Ghani, Andrew E. Fano
Presenter: Vishal Mahajan
INFS795

2
Agenda

Drawbacks in Current Data Mining Techniques.
Purpose.
Assumptions and Constraints.
Methodology or Approach.
Extraction of Feature Set.
Labeling.
Classification Techniques.
Naïve Bayes
EM
Experimental Results.
Recommender System.

3
Drawbacks in Current Data Mining Techniques

Semantic Features not automatically considered.
Transactional Data analyzed without analyzing the
customer.
Trending is partial.
Retail Items treated as objects with no
associated semantics.
Data Mining Techniques (association rules,
decision trees, neural networks) ignore the
meaning of items and semantics associated with
them.

4
Purpose of the Presentation

Describe a system that extracts semantic
features.
Populate the knowledge base with the semantic
features.
Use of text mining in retailing to extract
semantic features from retailers' websites.
How profiles of customers or groups of customers
can be built using text mining.

5
Assumptions and Constraints

Focus on Apparel Retail segment only.
Results focus on extracting those semantic
features that are deemed important by CRM or
Retail experts.
Data extracted from retailers' websites.
Models generated can be extended beyond the
Apparel Retail segment.

6
Approach

Collect Information about products.
Define set of features to be extracted.
Label the data with values of the features.
Train a classifier/extractor on the labeled
training data to extract features from unseen data.
Extract semantic features from new products by
using the trained classifier.
Populate a knowledge base with the products and
their corresponding features.

7
Data Collection Methodology

Use of a web crawler to extract the following from
large retailers' websites:
Names
URLs
Description
Prices
Categories of all Products Available
Use of wrappers.
Extracted Information stored in a database and a
subset chosen.

8
Extraction of Feature Set

Feature selection based on Expert Systems.
Use of extensive domain knowledge.
Feature selection done with the retail apparel
segment in mind.
Features selected for the project:
Age Group
Functionality
Price
Formality
Degree of Conservativeness
Degree of Sportiness
Degree of Trendiness
Degree of Brand Appeal

9
Labeling Training Data

Database created with data collected from
retailer websites.
Subset of 600 products chosen and labeled.
Labeling guidelines provided.

10
Details of Features extracted from each Product
Description
11
Verifying Training Data

Data split into disjoint subsets, each labeled by
a different individual.
Association rules (between features) used to
check consistency in the labeled data.
Apriori algorithm implemented with single- and
two-feature antecedents and consequents.
Desired consistency in labeling achieved by
applying association rules.

12
Apriori Algorithm

Find the frequent itemsets: the sets of items
that have minimum support.
A subset of a frequent itemset must also be a
frequent itemset,
i.e., if {A, B} is a frequent itemset, both {A} and
{B} must be frequent itemsets.
Use the frequent itemsets to generate association
rules.
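
The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the project: the function names `apriori` and `rules`, the toy database, and the thresholds are all hypothetical. Support is counted as a number of transactions, and candidates are pruned with the subset property stated above.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset itemset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # First scan: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        # Join frequent (k-1)-itemsets into k-itemset candidates, pruning any
        # candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Scan the database again to count the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent

def rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    found = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                # Every subset of a frequent itemset is itself in `frequent`.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((antecedent, itemset - antecedent, confidence))
    return found

# Hypothetical toy database of four transactions over items A to E.
database = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
frequent = apriori(database, min_support=2)
strong_rules = rules(frequent, min_confidence=1.0)
```

In the labeling setting, each "transaction" would be the set of feature values assigned to one product, so a high-confidence rule between two feature values can flag products whose labels break the pattern.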

13
The Apriori Algorithm Example
Figure: database D is scanned repeatedly to build candidate sets C1, C2, C3 and frequent itemsets L1, L2, L3.
14
Training from Labeled Data

Learning problem treated as a text classification
problem.
One text classifier is trained for each semantic
feature.
E.g., the price of a product is classified as
discount, average, or luxury.
Age group is classified as Juniors, Teens,
GenX, Mature, or All Ages.
Classification was performed using Naïve Bayes
classification.

15
Sample Association Rules
16
Naïve Bayes

Simple but effective text classification method.
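A class is first selected according to the class
prior probabilities.
The model assumes each word in a document is
generated independently of the others, given the
class.

Equation 1: $\Pr(w_t \mid c_j) = \dfrac{1 + \sum_{i=1}^{|D|} N(w_t, d_i)\,\Pr(c_j \mid d_i)}{|V| + \sum_{s=1}^{|V|} \sum_{i=1}^{|D|} N(w_s, d_i)\,\Pr(c_j \mid d_i)}$

where $N(w_t, d_i)$ is the count of times word $w_t$ occurs in
document $d_i$, $\Pr(c_j \mid d_i) \in \{0, 1\}$ is given by the class label, $|D|$ is the number of training documents, and $|V|$ is the vocabulary size.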
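
A minimal sketch of such a classifier follows, assuming the standard multinomial Naïve Bayes model with add-one (Laplace) smoothing of the word probabilities. The function names, the tokenized descriptions, and the labels are hypothetical and are not taken from the project data.

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """docs: list of token lists; labels: one class name per document."""
    vocab = {w for d in docs for w in d}
    classes = sorted(set(labels))
    # Class priors estimated from label frequencies (unsmoothed in this sketch).
    prior = {c: labels.count(c) / len(labels) for c in classes}
    word_counts = {c: Counter() for c in classes}
    for tokens, c in zip(docs, labels):
        word_counts[c].update(tokens)
    # Smoothed Pr(w | c) = (1 + count of w in class c) / (|V| + words in class c).
    cond = {}
    for c in classes:
        total = sum(word_counts[c].values())
        cond[c] = {w: (1 + word_counts[c][w]) / (len(vocab) + total) for w in vocab}
    return prior, cond

def classify_nb(model, tokens):
    """Return the class with the highest posterior score for the tokens."""
    prior, cond = model
    scores = {}
    for c in prior:
        # Summing logs avoids underflow; words outside the vocabulary are skipped.
        scores[c] = math.log(prior[c]) + sum(
            math.log(cond[c][w]) for w in tokens if w in cond[c]
        )
    return max(scores, key=scores.get)

# Hypothetical product descriptions labeled for the Price feature.
docs = [
    ["silk", "evening", "gown"],
    ["cashmere", "designer", "coat"],
    ["basic", "cotton", "tee"],
    ["value", "pack", "socks"],
]
labels = ["luxury", "luxury", "discount", "discount"]
price_model = train_nb(docs, labels)
prediction = classify_nb(price_model, ["designer", "silk", "coat"])
```

One such model would be trained per semantic feature, each with its own label set.
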
17
Incorporating Unlabeled Data

Initial sample was for 600 products only.
Need to take care of unlabeled products to make
any meaningful predictions.
Use of semi-supervised learning algorithms.
These algorithms have been shown to reduce the
classification error considerably.
Use of Expectation-Maximization (EM) Algorithm as
the semi-supervised technique.

18
Expectation-Maximization (EM) Method

EM is an iterative statistical technique for
maximum likelihood estimation for incomplete
data.
In the retail classification problem, unlabeled
data is considered as incomplete data.
EM:
Locally maximizes the likelihood of the
parameters.
Gives estimates for missing values.

19
Expectation-Maximization (EM) Method- cont

EM method is a 2-step process.
Initial Parameters are set using naïve Bayes from
just the labeled documents.
Subsequent iteration of E- and M-Steps.
E-Step
Calculates probabilistically weighted class labels
$\Pr(c_j \mid d_i)$ for every unlabeled document.
M-Step
Estimates new classifier parameters using all
documents (Equation 1).
E- and M-steps are iterated until the classifier
converges.
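
The loop above can be sketched as follows. This is an illustrative sketch, not the project's code: it assumes a multinomial Naïve Bayes model with add-one smoothing, runs a fixed number of iterations instead of testing for convergence, and uses hypothetical function names and toy data.

```python
import math
from collections import defaultdict

def m_step(docs, weights, classes, vocab):
    """Re-estimate priors and smoothed word probabilities from soft labels.

    weights[i][c] is Pr(c | doc i): 0 or 1 for labeled documents,
    fractional for unlabeled ones.
    """
    prior, cond = {}, {}
    for c in classes:
        prior[c] = (1 + sum(w[c] for w in weights)) / (len(classes) + len(docs))
        counts = defaultdict(float)
        for tokens, w in zip(docs, weights):
            for t in tokens:
                counts[t] += w[c]
        total = sum(counts.values())
        cond[c] = {t: (1 + counts[t]) / (len(vocab) + total) for t in vocab}
    return prior, cond

def e_step(model, tokens, classes):
    """Return Pr(c | doc) for each class under the current model."""
    prior, cond = model
    log_p = {
        c: math.log(prior[c])
        + sum(math.log(cond[c][t]) for t in tokens if t in cond[c])
        for c in classes
    }
    top = max(log_p.values())  # subtract the max before exponentiating
    unnorm = {c: math.exp(lp - top) for c, lp in log_p.items()}
    z = sum(unnorm.values())
    return {c: p / z for c, p in unnorm.items()}

def em_nb(labeled, unlabeled, iterations=10):
    classes = sorted({c for _, c in labeled})
    docs_l = [d for d, _ in labeled]
    vocab = {t for d in docs_l + unlabeled for t in d}
    hard = [{c: 1.0 if c == y else 0.0 for c in classes} for _, y in labeled]
    # Initial parameters come from the labeled documents only.
    model = m_step(docs_l, hard, classes, vocab)
    for _ in range(iterations):
        soft = [e_step(model, d, classes) for d in unlabeled]            # E-step
        model = m_step(docs_l + unlabeled, hard + soft, classes, vocab)  # M-step
    return model, classes

def predict(model, classes, tokens):
    posterior = e_step(model, tokens, classes)
    return max(posterior, key=posterior.get)

# Hypothetical data: two labeled and four unlabeled descriptions.
labeled = [(["silk", "gown"], "luxury"), (["cotton", "tee"], "discount")]
unlabeled = [["silk", "cashmere"], ["cashmere", "coat"],
             ["cotton", "socks"], ["socks", "pack"]]
model, classes = em_nb(labeled, unlabeled)
```

In this toy run, words such as "cashmere" and "coat" never appear in a labeled document; they pick up a class association only through co-occurrence with labeled words in the unlabeled descriptions.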

20
Experimental Results
21
Experimental Results
22
Results on new data set

The subset of data that was used earlier was from
a single retailer.
Another sample of data was collected from a variety
of retailers. The results are as follows.
Results are consistently better.

23
Recommender System

Creation of customer profiles (real time) is
feasible by analyzing the text associated with
products and by mapping it to pre-defined
semantic features.
Identity of customer is not known and prior
transaction history is unknown.
Semantic features are inferred from the browsing
pattern of the customer.
Helps in suggesting new products to the customers.

24
Recommender System

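Mathematically:
$\Pr(A_{i,j} \mid \text{Product})$ is calculated,
where $A_{i,j}$ is the j-th value of the i-th attribute
(i ranges over the semantic attributes, j over the possible values).
User profile is constructed as follows:
$\Pr(U_{i,j} \mid \text{Past } N \text{ Items}) = \frac{1}{N} \sum_{k=1}^{N} \Pr(A_{i,j} \mid \text{Item}_k)$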
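
A short sketch of the profile computation, assuming the profile is the average of the per-item attribute-value probabilities over the past N items. The function name, the nested-dictionary layout, and the probabilities are hypothetical.

```python
def build_profile(item_distributions):
    """Average attribute-value probabilities over the browsed items.

    item_distributions: one {attribute: {value: probability}} dict per item,
    for example the output of the per-attribute classifiers.
    """
    n = len(item_distributions)
    profile = {}
    for dist in item_distributions:
        for attribute, values in dist.items():
            slot = profile.setdefault(attribute, {})
            for value, p in values.items():
                # Each item contributes 1/N of its probability mass.
                slot[value] = slot.get(value, 0.0) + p / n
    return profile

# Hypothetical classifier outputs for two browsed products.
browsed = [
    {"price": {"luxury": 0.8, "average": 0.2}},
    {"price": {"luxury": 0.4, "average": 0.6}},
]
profile = build_profile(browsed)
```

Because the profile depends only on the items viewed in the current session, it can be built without knowing the customer's identity or transaction history.
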
25
Types of Recommender Systems

Two Types of Recommender Systems.
Collaborative Filtering.
Collect user feedback in terms of ratings.
Exploit similarities and differences of customers
to recommend items.
Issues
Sparsity Problem.
New Items.
Content Filtering
Compares the contents
Issues
Narrow in scope
Recommends similar products only
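
The collaborative filtering issues listed above show up even in a very small sketch. The code below is one simple user-based variant, assuming explicit positive ratings and cosine similarity over co-rated items; the function names, users, and ratings are hypothetical.

```python
import math

def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity over items both users rated; 0.0 if none overlap."""
    shared = set(ratings_a) & set(ratings_b)
    if not shared:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in shared)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def recommend(target, others, min_rating=4):
    """Suggest items the most similar user rated highly and the target has not rated."""
    neighbor = max(others, key=lambda r: cosine_similarity(target, r))
    return sorted(
        item for item, rating in neighbor.items()
        if rating >= min_rating and item not in target
    )

# Hypothetical ratings on a 1 to 5 scale.
alice = {"jeans": 5, "tee": 4}
bob = {"jeans": 5, "tee": 4, "robe": 5}
carol = {"gown": 5}
suggestions = recommend(alice, [bob, carol])
```

Here `carol` shares no rated items with `alice`, so her similarity is zero (the sparsity problem), and an item that nobody has rated yet can never be suggested (the new-item problem).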

26
Conclusions

The system learns from the use of supervised and
semi-supervised techniques.
Major assumption: product descriptions accurately
convey the semantic attributes. This is questionable.
Small sample of data used to infer results.
Practical applications not verified.
System bootstrapped from a small number of
labeled training examples.
Interesting application which could be evolved to
generate trends for retail marketers.