Opinion Analysis - PowerPoint PPT Presentation

1
Opinion Analysis
  • Sudeshna Sarkar
  • IIT Kharagpur

2
Introduction: facts and opinions
  • Two main types of information on the Web.
  • Facts and Opinions
  • Current search engines search for facts (assume
    they are true)
  • Facts can be expressed with topic keywords.
  • Search engines do not search for opinions
  • Opinions are hard to express with a few keywords
  • How do people think of Motorola Cell phones?
  • Current search ranking strategy is not
    appropriate for opinion retrieval/search.

3
Overview
  • Motivation
  • Definitions
  • Coarse grained vs Fine grained opinion analysis
  • Opinion Lexicons
  • Approaches to document level opinion analysis
  • Lexicon based
  • Supervised learning approaches
  • Mixed approaches
  • Approaches to fine-grained opinion analysis
  • Rule based
  • Learning
  • Opinion mining work at IIT Kharagpur

4
Opinion Mining
  • Search for and aggregate opinions from online
    sources
  • Many reviews have both positive and negative
    sentences
  • Many products are liked by some and disliked by
    others; there must be different reasons
  • Identify different features/ aspects of the
    target and the opinion on these separately

5
Why do opinion analysis?
  • Opinion search
  • to extract examples of particular types of
    positive or negative statements on some topic.
  • Opinion question answering
  • What is the reaction to the Left Front's stand on
    the nuclear deal?
  • Is support diminishing for the UPA government?
  • Product review mining
  • What features of the Mr Coffee programmable coffee
    maker do users like, and what do they dislike?
    (Microsoft Live)
  • Review classification
  • Tracking sentiment toward topics over time
  • to track the ups and downs of aggregate attitudes
    to a brand or product

6
Introduction: Applications
  • Businesses and organizations: product and service
    benchmarking, market intelligence.
  • Businesses spend a huge amount of money to find
    consumer sentiments and opinions.
  • Consultants, surveys, focus groups, etc.
  • Individuals: interested in others' opinions when
  • Purchasing a product or using a service,
  • Finding opinions on political topics,
  • Many other decision making tasks.
  • Ad placement: placing ads in user-generated
    content
  • Place an ad when one praises a product.
  • Place an ad from a competitor if one criticizes
    a product.
  • Opinion retrieval/search: providing general
    search for opinions.

7
Question Answering
  • Opinion question answering

Q: What is the international reaction to the
reelection of Robert Mugabe as President of
Zimbabwe?
A: African observers generally approved of his
victory while Western governments denounced it.

8
Opinion search (Liu, Web Data Mining book, 2007)
  • Can you search for opinions as conveniently as
    general Web search?
  • Whenever you need to make a decision, you may
    want some opinions from others,
  • Wouldn't it be nice if you could find them on a
    search system instantly, by issuing queries such
    as
  • Opinions: "Motorola cell phones"
  • Comparisons: "Motorola vs. Nokia"
  • Cannot be done yet!

9
Typical opinion search queries
  • Find the opinion of a person or organization
    (opinion holder) on a particular object or a
    feature of an object.
  • E.g., what is Bill Clinton's opinion on abortion?
  • Find positive and/or negative opinions on a
    particular object (or some features of the
    object), e.g.,
  • customer opinions on a digital camera,
  • public opinions on a political topic.
  • Find how opinions on an object change with time.
  • Find how object A compares with object B.
  • Gmail vs. Yahoo mail

10
Find the opinion of a person on X
  • In some cases, the general search engine can
    handle it, i.e., using suitable keywords.
  • Bill Clinton's opinion on abortion
  • Reason:
  • One person or organization usually has only one
    opinion on a particular topic.
  • The opinion is likely contained in a single
    document.
  • Thus, a good keyword query may be sufficient.

11
Find opinions on an object X
  • We use product reviews as an example
  • Searching for opinions in product reviews is
    different from general Web search.
  • E.g., search for opinions on Motorola RAZR V3
  • General Web search (for a fact): rank pages
    according to some authority and relevance scores.
  • The user views the first page (if the search is
    perfect).
  • One fact = Multiple facts
  • Opinion search: ranking is desirable; however,
  • reading only the review ranked at the top is
    dangerous because it is only the opinion of one
    person.
  • One opinion ≠ Multiple opinions

12
Search opinions (contd)
  • Ranking
  • produce two rankings
  • Positive opinions and negative opinions
  • Some kind of summary of both, e.g., the number of each
  • Or, one ranking but
  • The top (say 30) reviews should reflect the
    natural distribution of all reviews (assume that
    there is no spam), i.e., with the right balance
    of positive and negative reviews.
  • Questions
  • Should the user read all the top reviews? OR
  • Should the system prepare a summary of the
    reviews?
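
The single-ranking option on this slide can be sketched roughly as below. This is only an illustration under stated assumptions, not a method taken from the cited sources: the function name `proportional_top`, the tuple layout, and the relevance scores are all hypothetical, and every review is assumed to already carry a polarity label.

```python
def proportional_top(reviews, k):
    """Pick k reviews whose positive/negative mix mirrors the full set.

    reviews: non-empty list of (relevance, polarity, text) tuples,
    where polarity is "pos" or "neg" (assumed to be known already).
    """
    pos = sorted((r for r in reviews if r[1] == "pos"), reverse=True)
    neg = sorted((r for r in reviews if r[1] == "neg"), reverse=True)
    # Share of the k slots given to positive reviews follows their
    # share of all reviews.
    n_pos = round(k * len(pos) / len(reviews))
    chosen = pos[:n_pos] + neg[:k - n_pos]
    # Present the chosen reviews in order of relevance.
    return sorted(chosen, reverse=True)

# Hypothetical data: six positive and two negative reviews.
reviews = [
    (0.9, "pos", "a"), (0.8, "pos", "b"), (0.7, "pos", "c"),
    (0.6, "pos", "d"), (0.5, "pos", "e"), (0.4, "pos", "f"),
    (0.3, "neg", "g"), (0.2, "neg", "h"),
]
top = proportional_top(reviews, k=4)
```

A pure relevance ranking would return four positive reviews here; the proportional version keeps one negative review so that the 3:1 ratio of the full set is preserved.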

13
User generated content
  • Word of mouth on the web.
  • Review sites
  • Blogs
  • Online forums
  • Shopping comparison sites
  • User reviews
  • Mine opinions expressed in the user-generated
    content
  • Challenging task
  • Useful to individual consumers and companies.

14
Motivation for Consumer
  • I want to buy a camera.
  • Which model should I pick?
  • Ask my friends
  • Use the internet
  • CEA-CNET study: "Tech-Savvy Consumers Use Internet
    to Research Products Before Buying Them"
  • Wireless News, November 2007
  • "Seventy Percent of Consumers Use Internet to
    Research Consumer Packaged Goods, According to
    Prospectiv Survey"
  • Market Wire, January 2008

15
Businesses
  • Identifying opinions about products helps to
    position/adapt products
  • Much product feedback is web-based
  • provided by customers/critics online through
    websites, discussion boards, mailing lists,
    blogs, and CRM portals.
  • Market research is becoming unwieldy
  • Sources are heterogeneous and multilingual in
    nature

16
Facts vs Opinions
  • An opinion is a person's ideas and thoughts
    towards something. It is an assessment, judgment
    or evaluation of something. An opinion is not a
    fact, because opinions are either not
    falsifiable, or the opinion has not been proven
    or verified. (en.wikipedia.org/wiki/Opinion)
  • Subjectivity: the linguistic expression of
    somebody's emotions, sentiments, evaluations,
    opinions, beliefs, speculations, etc.
  • Polarity: positive and negative
  • This camera is awesome.
  • The movie is too long and boring.
  • Strength of opinion

17
Levels of opinion analysis
  • Coarse to fine grained opinion analysis
  • Document level: at the document (or review) level
  • Subjective vs objective
  • Sentiment classification: positive, negative or
    neutral
  • Sentence level, expression level
  • Task 1: identifying subjective/opinionated
    sentences (or clauses/phrases)
  • Classes: objective and subjective (opinionated)
  • Task 2: sentiment classification of sentences
  • Classes: positive, negative and neutral.
  • But a document/sentence may contain multiple
    opinions on more than one topic from one or more
    opinion holders

18
Lexicon Development
  • Manual
  • Semi-automatic
  • Fully automatic
  • Find relevant words, phrases, patterns that can
    be used to express subjectivity
  • Determine the polarity of subjective expressions

19
Opinion Words
  • An opinion lexicon containing lists of positive
    and negative phrases is very useful for the
    opinion mining task at different levels
  • Positive: beautiful, wonderful, good, amazing
  • Negative: bad, poor, terrible, cost someone an
    arm and a leg
  • How to compile such a list?
  • Dictionary-based approaches
  • Corpus-based approaches
  • Supervised
  • Semi-supervised
  • BUT
  • Some opinion words are context independent (e.g.,
    good).
  • Some are context dependent (e.g., long).

20
Hand created lists
  • Create lists of opinion words appropriate for the
    domain manually
  • Sentiment term
  • Polarity
  • Strength
  • These approaches, while being interesting, are
    labor intensive and can be vulnerable to error
    and high maintenance costs

21
Dictionary-based approaches
  • Start from a set of seed opinion words
  • Use WordNet's synsets and hierarchies to acquire
    opinion words
  • Use the seeds to search for synonyms and antonyms
    in WordNet (e.g., Hu and Liu, 2004).
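
The seed-expansion idea can be sketched as follows. This is a minimal illustration that uses a tiny hand-made dictionary as a stand-in for WordNet; the tables `SYNONYMS` and `ANTONYMS` and the function names are assumptions made for the example, not part of any cited system.

```python
from collections import deque

# Toy stand-in for WordNet: hypothetical synonym and antonym links.
SYNONYMS = {
    "good": {"fine", "nice"},
    "nice": {"pleasant"},
    "bad": {"poor", "awful"},
}
ANTONYMS = {
    "good": {"bad"},
    "pleasant": {"unpleasant"},
}

def neighbours(word, table):
    """Return words linked to `word`, treating links as symmetric."""
    linked = set(table.get(word, ()))
    linked.update(w for w, others in table.items() if word in others)
    return linked

def expand_seeds(seeds):
    """Propagate polarity (+1 / -1) from seed words through the dictionary.

    Synonyms inherit the polarity of the word they were reached from;
    antonyms receive the opposite polarity. The first label assigned wins.
    """
    polarity = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        for syn in neighbours(word, SYNONYMS):
            if syn not in polarity:
                polarity[syn] = polarity[word]
                queue.append(syn)
        for ant in neighbours(word, ANTONYMS):
            if ant not in polarity:
                polarity[ant] = -polarity[word]
                queue.append(ant)
    return polarity

# A single positive seed is enough to label the whole toy dictionary.
lexicon = expand_seeds({"good": 1})
```

With a real lexical database the same loop would follow synset and antonym links instead of the toy tables, and a cap on the number of iterations would usually be needed to limit drift away from the seeds.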

22
Dictionary-based approaches
  • Use additional information (e.g., glosses) and
    learning from WordNet (Andreevskaia and Bergler,
    2006) (Esuli and Sebastiani, 2005).

23
Dictionary-based approaches
  • Advantage: good for finding a large number of such
    words
  • Weakness: does not find context dependent opinion
    words, e.g., small, long, fast.

24
Corpus-based approaches
  • Rely on syntactic rules and co-occurrence
    patterns to extract from large corpora
  • Use a list of seed words
  • A large domain corpus
  • Machine learning
  • Advantage: this approach can find domain
    (corpus) dependent opinions.

25
How to identify subjective terms?
  • Assume that contexts are coherent
  • Statistical association: if words of the same
    orientation tend to co-occur, then the
    presence of one makes the other more probable
  • Use statistical measures of association to
    capture this interdependence
  • Assume that alternatives are similarly subjective

26
Corpus-based approaches (contd)
  • Conjunctions: conjoined adjectives usually have
    the same orientation (Hatzivassiloglou and McKeown
    1997).
  • E.g., This car is beautiful and
    spacious. (conjunction)
  • Start with seed words
  • Use conjunctions to find adjectives with similar
    orientations
  • Use log-linear regression to aggregate
    information from various conjunctions
  • Use hierarchical clustering on a
    graph representation of adjective similarities to
    find two groups of same orientation
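
A much-simplified sketch of the conjunction idea is shown below. It swaps the regression and clustering steps for plain label propagation over a graph of "and"/"but" links, so it illustrates the intuition only and does not reproduce the cited method. The adjective list, the sentences, and the function names are assumptions made for the example.

```python
import re
from collections import defaultdict, deque

def conjunction_links(sentences, adjectives):
    """Collect (word, word, sign) links: +1 for 'and', -1 for 'but'."""
    links = []
    for sentence in sentences:
        tokens = re.findall(r"[a-z]+", sentence.lower())
        for i in range(1, len(tokens) - 1):
            left, conj, right = tokens[i - 1], tokens[i], tokens[i + 1]
            if conj in ("and", "but") and left in adjectives and right in adjectives:
                links.append((left, right, 1 if conj == "and" else -1))
    return links

def propagate(seeds, links):
    """Spread seed orientations along the links (breadth-first)."""
    graph = defaultdict(list)
    for a, b, sign in links:
        graph[a].append((b, sign))
        graph[b].append((a, sign))
    orientation = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        for other, sign in graph[word]:
            if other not in orientation:
                # 'and' keeps the orientation, 'but' flips it.
                orientation[other] = orientation[word] * sign
                queue.append(other)
    return orientation

# Hypothetical adjective list and review sentences.
ADJECTIVES = {"beautiful", "spacious", "noisy", "cheap", "reliable"}
sentences = [
    "This car is beautiful and spacious.",
    "The engine is reliable but noisy.",
    "It is spacious and reliable.",
]
links = conjunction_links(sentences, ADJECTIVES)
orientation = propagate({"beautiful": 1}, links)
```

Starting from the single seed "beautiful", the links label "spacious" and "reliable" as positive and "noisy" as negative; "cheap" stays unlabeled because it never appears in a conjunction.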

27
(No Transcript)
28
Growing contextual opinion words
  • Ding, Liu, Yu
  • Intra-sentence conjunction rule: opinions on both
    sides of "and" (or in two consecutive sentences)
    tend to be the same
  • E.g., This camera takes great pictures and has a
    long battery life.
  • But with a "but"-like clause, the opinions
    tend to be of opposite polarity.
  • Context is important
  • "Long battery life" vs "long time to focus"
  • Growing
  • by applying various conjunctive rules
  • Verifying the results as the system sees more
    reviews by those conjunctive rules
  • Only keep those opinions which the system is
    confident about, controlled by a confidence
    limit.

29
Semantic Orientation by Association
  • Labeled semantic orientation of words
  • Pwords: good, nice, excellent, positive,
    fortunate, correct, superior
  • Nwords: bad, nasty, poor, negative,
    unfortunate, wrong, inferior.
  • Various approaches to calculate the semantic
    association of two words
  • Pointwise Mutual Information (PMI) (Church and
    Hanks 1989)
  • Latent Semantic Indexing (LSI) (Dumais et al.
    1990)
  • Likelihood Ratios (Dunning 1993)

30
Turney 2002; Turney and Littman 2003
  • Determine the semantic orientation of each
    extracted phrase based on their association with
    seven positive and seven negative seed words
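
A rough sketch of an association-based orientation score is given below. It follows the general shape of a PMI difference between the positive and negative seed sets, but several details are assumptions made for illustration: co-occurrence is counted per document rather than with a proximity operator, the corpus is a hypothetical four-document toy, and the smoothing constant and function names are choices made for the example.

```python
import math

PWORDS = {"good", "nice", "excellent", "positive",
          "fortunate", "correct", "superior"}
NWORDS = {"bad", "nasty", "poor", "negative",
          "unfortunate", "wrong", "inferior"}

def hits(docs, *groups):
    """Number of documents containing at least one word from every group."""
    return sum(all(doc & group for group in groups) for doc in docs)

def so_pmi(word, docs, smoothing=0.01):
    """Orientation of `word`: positive if it co-occurs more with PWORDS.

    Assumes both seed sets occur somewhere in `docs`; the smoothing
    term only protects the co-occurrence counts from being zero.
    """
    target = {word}
    near_pos = hits(docs, target, PWORDS) + smoothing
    near_neg = hits(docs, target, NWORDS) + smoothing
    return math.log2((near_pos * hits(docs, NWORDS)) /
                     (near_neg * hits(docs, PWORDS)))

# Hypothetical corpus: each document is a set of tokens.
docs = [
    {"excellent", "sharp", "lens"},
    {"good", "sharp", "pictures"},
    {"poor", "blurry", "pictures"},
    {"bad", "blurry", "screen"},
]
sharp_score = so_pmi("sharp", docs)
blurry_score = so_pmi("blurry", docs)
pictures_score = so_pmi("pictures", docs)
```

In this toy corpus "sharp" receives a positive score, "blurry" a negative one, and "pictures", which appears equally often with both seed sets, scores zero.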

31
Weakly supervised learning
  • Gamon and Aue 2005
  • Given a list of seed words (seed words 1)
  • Get more seed words (seed words 2): words with
    low PMI at sentence level
  • Get semantic orientation of (seed words 2) by PMI
    at document level
  • Get Semantic orientation of all words by PMI with
    all seed words

32
Document level opinion analysis
  • Polarity classification: classify documents
    (e.g., reviews) based on the overall sentiments
    expressed by authors.
  • Approaches
  • Use opinion lexicon
  • Knowledge Engineering
  • Supervised learning techniques
  • Classifying using the Web as a corpus
  • Semi-supervised

33
Knowledge Engineering
  • Make use of lists of sentiment terms
  • Manually create analysis components based on
    cognitive linguistic theory: parser, feature
    structure representation, etc.

34
Supervised polarity classifier
  • Requirements: a labeled database of opinions
  • Download ratings from Amazon.com, epinions.com
    etc.
  • Build a binary opinion classifier
  • From positive and negative ratings
  • Merge 1 and 2 stars to negative and 3, 4 and 5 to
    positive
  • Use thresholded SVM, maximum entropy, naïve
    Bayes, etc.
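
As one possible illustration of this pipeline, the sketch below trains a small naive Bayes classifier with add-one smoothing on star-rated reviews, merging the stars into two classes as described on the slide. The review texts, the whitespace tokenizer, and the function names are assumptions; a real system would use a much larger corpus and an established library.

```python
import math
from collections import Counter

def star_to_label(stars):
    """Merge 1-2 stars into negative and 3-5 stars into positive."""
    return "negative" if stars <= 2 else "positive"

def train_naive_bayes(reviews):
    """reviews: list of (text, stars). Returns counts needed for scoring."""
    doc_counts = Counter()
    word_counts = {"positive": Counter(), "negative": Counter()}
    for text, stars in reviews:
        label = star_to_label(stars)
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["positive"]) | set(word_counts["negative"])
    return doc_counts, word_counts, vocab

def classify(text, model):
    """Return the label with the higher log-probability (add-one smoothing).

    Assumes the training data contained at least one review per class.
    """
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in ("positive", "negative"):
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocab:  # words unseen in training are ignored
                score += math.log((word_counts[label][word] + 1) /
                                  (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical star-rated reviews.
reviews = [
    ("great camera excellent pictures", 5),
    ("good value works great", 4),
    ("decent camera good battery", 3),
    ("terrible battery poor pictures", 1),
    ("poor quality broke quickly", 2),
]
model = train_naive_bayes(reviews)
first = classify("excellent battery and great pictures", model)
second = classify("poor quality", model)
```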

35
Supervised Training
  • Obtain labeled sentences: positive, neutral,
    negative
  • Extract features: words, n-grams, multi-word
    expressions, feature generalization (Kim and Hovy
    2007)
  • Feature values: binary/frequency
  • Run a training algorithm on the features to give a
    classifier
  • Optional: do feature selection (use
    log-likelihood ratio)
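
The optional feature selection step could look roughly like the sketch below, which scores each word with the log-likelihood ratio statistic (G²) of a 2x2 table of word presence against class and keeps the highest-scoring words. The document frequencies are made-up numbers and the function names are assumptions for the example.

```python
import math

def g2(k_pos, n_pos, k_neg, n_neg):
    """Log-likelihood ratio statistic (G^2) for a 2x2 contingency table.

    k_pos / k_neg: documents of each class that contain the word.
    n_pos / n_neg: total documents of each class.
    """
    observed = [
        [k_pos, n_pos - k_pos],
        [k_neg, n_neg - k_neg],
    ]
    total = n_pos + n_neg
    row_totals = [n_pos, n_neg]
    col_totals = [k_pos + k_neg, total - k_pos - k_neg]
    score = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            e = row_totals[i] * col_totals[j] / total
            if o > 0:  # a zero cell contributes nothing to the sum
                score += o * math.log(o / e)
    return 2.0 * score

def select_features(doc_freq, n_pos, n_neg, top_k):
    """doc_freq maps word -> (positive docs with word, negative docs with word)."""
    scored = {w: g2(kp, n_pos, kn, n_neg) for w, (kp, kn) in doc_freq.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_k]

# Hypothetical document frequencies over 50 positive and 50 negative reviews.
doc_freq = {"great": (40, 5), "the": (50, 50), "poor": (3, 35), "camera": (30, 28)}
selected = select_features(doc_freq, n_pos=50, n_neg=50, top_k=2)
```

Words that are spread evenly across both classes, such as "the" here, score zero and are dropped, while words concentrated in one class are kept.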

36
Semi-supervised approaches
  • Fully supervised techniques require
  • large amount of labeled data for the given domain
  • Semi-supervised systems
  • Use small amount of domain knowledge
  • From a small set of seed words use domain corpus
    to get domain relevant opinion words as discussed
    earlier

37
Semi-supervised approach
  • Gamon and Aue 2005
  • Obtain opinion words by semi-supervised approach
  • Given a domain corpus, label data using average
    semantic orientation
  • Train classifier on labeled data
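
The labeling step can be sketched as below. This is only an illustration of the idea of labeling by average semantic orientation: the orientation scores, the confidence threshold, and the function names are assumptions, and the slides do not say how undecided documents are handled in the cited work.

```python
def average_orientation(text, lexicon):
    """Mean orientation of the words in `text` that the lexicon covers."""
    scores = [lexicon[w] for w in text.lower().split() if w in lexicon]
    return sum(scores) / len(scores) if scores else 0.0

def auto_label(corpus, lexicon, threshold=0.5):
    """Label documents whose average orientation is clearly non-neutral.

    Documents with an average between -threshold and +threshold
    are left out of the training data.
    """
    labeled = []
    for text in corpus:
        score = average_orientation(text, lexicon)
        if score >= threshold:
            labeled.append((text, "positive"))
        elif score <= -threshold:
            labeled.append((text, "negative"))
    return labeled

# Hypothetical orientation scores, e.g. obtained from seed words and PMI.
lexicon = {"sharp": 2.0, "great": 3.0, "blurry": -2.5, "slow": -1.0, "heavy": -2.0}
corpus = [
    "great lens very sharp",
    "slow and blurry",
    "heavy but sharp",
    "arrived on monday",
]
training_data = auto_label(corpus, lexicon)
```

The resulting (text, label) pairs can then be passed to any supervised learner; the two documents with a neutral average are skipped rather than given a guessed label.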