Integrating Techniques for Event-based Business Intelligence Gathering

1
Integrating Techniques for Event-based Business
Intelligence Gathering
Kareem S. Aggour, John Interrante, Ibrahim Gokcen
July 16, 2006
2
Business Problems
  • Manual search of existing news sources/aggregators
  • Emergence of novel news sources
  • Dealing with information explosion vs. keeping
    abreast of important developments
  • Distributed data collection across marketing,
    sales

3
Motivation
  • Identify sales/risk leads on 8 topics
  • Risk: Bankruptcy, Management Succession,
    Litigation, Change in Auditors, Rating Change
  • Sales: Bankruptcy, Outsourcing, Mergers &
    Acquisitions, Facility Expansions
  • Provide actionable and focused content to risk
    and sales reps in financial services businesses
  • Automate extraction and integration of events
    from multiple providers
  • Reduce repetitious work by centralizing event
    collection

4
Anticipated Results (M&A examples)
  • First Financial Management Corp said it has
    offered to acquire Comdata Network Inc for $18
    per share in cash and stock, or a total of about
    $342.7 million.
  • Delta announced last September that it was
    purchasing Western.
  • Nerco Inc said its oil and gas unit closed the
    acquisition of a 47% working interest in the
    Broussard oil and gas field from Davis Oil Co for
    about $22.5 million in cash.

Extract key information from articles efficiently
and with good precision/recall for all topics
5
How? EBIG Agent Architecture
  • Ontology generation
  • Named entity extraction
  • Targeted phrase extraction using a dependency
    grammar
  • Query generation & expansion
  • Data visualization
  • Text classification

6
Integrating Techniques
Query expansion
Text classification
Ontology generation
Data visualization
Event extraction
Named entity extraction
7
Extraction Pipeline
Articles
Sentences
Events
Text classification
Ontology patterns
Named entity extraction
Phrase extraction
8
Query Generation & Expansion
  • Store queries (to a news source for a given date
    range) to prevent duplicate retrieval
  • If articles exist in the DB, retrieve from DB
  • Expand queries based on previously retrieved
    articles
  • Word frequency analysis on bag of words
  • Present frequent words in relevant articles for
    review
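
The word-frequency step above could be sketched roughly as follows. This is only an illustration, not the EBIG implementation: the function names, the tokenizer, and the tiny stop-word list are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "for",
              "it", "its", "said", "that", "was", "has"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (bag of words)."""
    return re.findall(r"[a-z]+", text.lower())

def suggest_expansion_terms(relevant_articles, query_terms, top_n=5):
    """Return the most frequent words in relevant articles that are not
    already part of the query, as candidates for analyst review."""
    known = {t.lower() for t in query_terms} | STOP_WORDS
    counts = Counter()
    for article in relevant_articles:
        counts.update(tok for tok in tokenize(article) if tok not in known)
    return [word for word, _ in counts.most_common(top_n)]

articles = [
    "Delta announced last September that it was purchasing Western.",
    "Gander Mountain Inc said it acquired the privately held "
    "Western Ranchman Outfitters.",
]
print(suggest_expansion_terms(articles, ["acquired"], top_n=1))
```

The suggestions are only presented for review; deciding whether a frequent word actually belongs in the expanded query is left to the analyst, as on the slide.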

9
Text Classification with SVMs
  • Linear Support Vector Machines (SVMlight)
  • High-dimensionality enables good class separation
  • One-vs-all for 8 topics
  • Amenable to incremental learning
  • Label corrections by research analysts
  • Incoming new articles
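
A rough sketch of the one-vs-all idea with incremental updates is shown below. It does not use SVMlight: it swaps in a plain hinge-loss subgradient update with L2 shrinkage and no bias term, and the class name, learning rate, regularization constant, and training snippets are all assumptions made for illustration.

```python
from collections import Counter

def featurize(text):
    """Bag-of-words term counts for one article."""
    return Counter(text.lower().split())

class OneVsAllLinear:
    """One binary linear classifier per topic (hypothetical helper class)."""

    def __init__(self, topics, lr=0.1, lam=0.01):
        self.topics = list(topics)
        self.lr = lr
        self.lam = lam
        self.weights = {t: {} for t in self.topics}

    def score(self, topic, x):
        w = self.weights[topic]
        return sum(w.get(f, 0.0) * v for f, v in x.items())

    def update(self, text, label):
        """One incremental step per topic; can be called again when an
        analyst corrects a label or a new article arrives."""
        x = featurize(text)
        for topic in self.topics:
            y = 1.0 if topic == label else -1.0
            w = self.weights[topic]
            violated = y * self.score(topic, x) < 1.0
            for f in w:                       # L2 shrinkage of existing weights
                w[f] *= 1.0 - self.lr * self.lam
            if violated:                      # hinge-loss subgradient step
                for f, v in x.items():
                    w[f] = w.get(f, 0.0) + self.lr * y * v

    def predict(self, text):
        x = featurize(text)
        return max(self.topics, key=lambda t: self.score(t, x))

# Toy training snippets (invented for the example, not from the EBIG data).
TRAIN = [
    ("acquire merger takeover", "M&A"),
    ("bankruptcy creditors chapter", "Bankruptcy"),
    ("facility plant expansion", "Facility Expansions"),
]
clf = OneVsAllLinear(["M&A", "Bankruptcy", "Facility Expansions"])
for _ in range(20):
    for text, label in TRAIN:
        clf.update(text, label)
print(clf.predict("company files for bankruptcy"))
```

Because each call to `update` touches only the current example, the same method serves both for label corrections and for incoming articles, which is the property the slide refers to as incremental learning.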

10
Data Visualization
  • Centroid algorithm for cluster-preserving
    dimension reduction (Kim et al. 2005)
  • Compute a p-dimensional representation $\hat{q}$ of an
    n-dimensional vector $q$ ($p < n$)
  • Compute two centroids $c_1$, $c_2$
  • $C = [c_1, c_2]$
  • Solve $\min_{\hat{q}} \lVert C\hat{q} - q \rVert_2$

Rating change articles
Used primarily for article label validation and
finding anomalies
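
For the two-centroid case, the least-squares projection described on this slide could be sketched as below. Solving through the 2x2 normal equations is an assumption chosen to keep the example dependency-free (a QR-based solver would be the more usual numerical choice), and the function names and toy vectors are invented for illustration.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(n)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def reduce_to_2d(q, c1, c2):
    """Least-squares solution of min ||C q_hat - q||_2 with C = [c1, c2],
    obtained from the normal equations (C^T C) q_hat = C^T q."""
    a11, a12, a22 = dot(c1, c1), dot(c1, c2), dot(c2, c2)
    b1, b2 = dot(c1, q), dot(c2, q)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("centroids are linearly dependent")
    # Cramer's rule for the symmetric 2x2 system.
    return [(b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]

# Two toy clusters of 3-dimensional term vectors (invented data).
cluster_1 = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
cluster_2 = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
c1, c2 = centroid(cluster_1), centroid(cluster_2)
q_hat = reduce_to_2d([3.0, 4.0, 5.0], c1, c2)
print(q_hat)
```

Each article then becomes a point in two dimensions that can be plotted directly, which is what makes mislabeled or anomalous articles easy to spot by eye.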
11
Ontology Generation
  • Topic patterns filter sentences
  • Key nouns and key verbs combined symmetrically
    (e.g. accept/offer, agree/acqui)
  • Refined after precision/recall analysis
  • Topic keywords are used to extract events
  • Key nouns, verbs themselves
  • Phrases are extracted around them
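
One way to read the topic patterns above is as unordered pairs of word stems that must both occur in a sentence. The sketch below follows that reading; the pair representation, the prefix-based stem test, and the names `MA_PATTERNS` and `matches_topic` are assumptions, not the ontology format used by EBIG.

```python
import re

# Hypothetical M&A patterns as (stem, stem) pairs, modeled on the slide's examples.
MA_PATTERNS = [("accept", "offer"), ("agree", "acqui")]

def matches_topic(sentence, patterns):
    """True if, for some pattern, both stems occur in the sentence.
    Order does not matter, which is how 'symmetrically' is read here."""
    tokens = re.findall(r"[a-z]+", sentence.lower())

    def has_stem(stem):
        return any(tok.startswith(stem) for tok in tokens)

    return any(has_stem(a) and has_stem(b) for a, b in patterns)

print(matches_topic("Company X agreed to the acquisition of Company Y", MA_PATTERNS))
print(matches_topic("The acquisition was agreed last week", MA_PATTERNS))
print(matches_topic("Delta announced quarterly earnings", MA_PATTERNS))
```

Sentences that pass this filter would then be handed on for phrase extraction around the matched key nouns and verbs.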

12
Named Entity Extraction
  • Existence of an entity (company, organization) in
    a sentence indicates an event
  • Entities become a part of extraction rules
  • Sentences with at least one entity are sent to
    the event extractor
  • No anaphora resolution
  • Commercial and Open Source tools available
  • Connexor's MEX, GATE
  • Ability to add custom lexicons in both
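
The entity gate described above can be illustrated with a plain lexicon lookup. This is a stand-in for a real named entity extractor and does not reproduce the MEX or GATE APIs; the function names and the lexicon contents are assumptions for the example.

```python
import re

def find_entities(sentence, lexicon):
    """Return the lexicon entries that occur as whole words in the sentence."""
    return [name for name in lexicon
            if re.search(r"\b" + re.escape(name) + r"\b", sentence)]

def sentences_for_event_extraction(sentences, lexicon):
    """Keep only sentences that contain at least one known entity."""
    return [s for s in sentences if find_entities(s, lexicon)]

base_lexicon = ["Delta", "Nerco Inc", "Davis Oil Co"]
sentence = "Delta announced last September that it was purchasing Western."

print(find_entities(sentence, base_lexicon))
# Adding a custom lexicon entry lets the lookup pick up a name it missed.
print(find_entities(sentence, base_lexicon + ["Western"]))
print(sentences_for_event_extraction(
    [sentence, "Shares rose sharply on Tuesday."], base_lexicon))
```

Since there is no anaphora resolution, a sentence that refers to a company only as "it" or "the firm" would not pass this gate.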

13
Targeted Phrase Extraction (TPE)
  • Originates from Functional Dependency Grammar
    (Tapanainen et al.)
  • The syntax tree of a sentence has a unique root,
    which is the main verb of the sentence
  • All other verbs are also roots of subtrees

Delta announced last September that it was
purchasing Western
14
Targeted Phrase Extraction (TPE)
  • Given a target string S (key noun, verb or
    company name) compute its subtree
  • If S is the main verb, output the entire parse
    tree (except tmp)
  • If S is a subject or an object in the sentence
    output the corresponding parse subtree
  • If S is a modifier of a subject or object, output
    the corresponding parse subtree
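
The subtree rules above might be sketched as follows on the example sentence from the previous slide. The dependency parse is hand-written rather than produced by a parser, the role and part-of-speech labels are illustrative, and reading "the corresponding parse subtree" as the clause rooted at the nearest governing verb is an assumption of this sketch.

```python
from collections import namedtuple

Token = namedtuple("Token", "idx word head role pos")

# Hand-built parse (an assumption); head=None marks the main verb.
PARSE = [
    Token(0, "Delta", 1, "subj", "N"),
    Token(1, "announced", None, "main", "V"),
    Token(2, "last", 3, "mod", "A"),
    Token(3, "September", 1, "tmp", "N"),
    Token(4, "that", 7, "pm", "CS"),
    Token(5, "it", 7, "subj", "PRON"),
    Token(6, "was", 7, "aux", "V"),
    Token(7, "purchasing", 1, "obj", "V"),
    Token(8, "Western", 7, "obj", "N"),
]

def subtree(parse, root_idx, skip_roles=()):
    """Indices of the subtree under root_idx, pruning any child (and its
    descendants) whose role is listed in skip_roles."""
    children = {}
    for tok in parse:
        children.setdefault(tok.head, []).append(tok)
    keep, stack = [], [root_idx]
    while stack:
        node = stack.pop()
        keep.append(node)
        stack.extend(c.idx for c in children.get(node, [])
                     if c.role not in skip_roles)
    return sorted(keep)

def extract_phrase(parse, target):
    """Climb from the target to its governing verb and output that clause.
    Temporal modifiers are dropped only for the main verb. Assumes the
    root of the parse is a verb and token idx equals list position."""
    tok = next(t for t in parse if t.word == target)
    while tok.pos != "V":
        tok = parse[tok.head]
    skip = ("tmp",) if tok.head is None else ()
    return " ".join(parse[i].word for i in subtree(parse, tok.idx, skip))

print(extract_phrase(PARSE, "announced"))
print(extract_phrase(PARSE, "Western"))
```

With the main verb as target the whole sentence comes back minus the temporal phrase, while an object of the embedded verb yields only the embedded clause.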

15
Targeted Phrase Extraction (TPE)
  • Simple TPE rules become predicate-argument pairs
    (word/concept, role)
  • (C-Company, SUBJ): Extract all clauses where a
    company name is a subject
  • Company X acquired Company Y
  • (C-Company, OBJ): Extract all clauses where a
    company name is an object
  • Company X acquired Company Y
  • ((C-Company, ?), (takeover, MOD_OBJ)): Extract
    all clauses where a company name is present and
    the word takeover is an object modifier
  • Company X rebuffed a takeover proposal from
    Company Y
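
Matching such (word/concept, role) rules against a parsed clause could look roughly like this. The clause representation as (word, concept, role) triples, the role label `PCOMP`, and the function names are assumptions made for the sketch; only the rule notation and the "?" wildcard come from the slide.

```python
def pair_matches(pair, token):
    """A (target, role) pair matches a token if the target equals the
    token's word or concept and the role is equal or the wildcard '?'."""
    target, role = pair
    word, concept, tok_role = token
    return target in (word, concept) and role in ("?", tok_role)

def rule_matches(rule, clause):
    """A rule (list of pairs) matches if every pair is satisfied by some token."""
    return all(any(pair_matches(p, t) for t in clause) for p in rule)

# Hand-labeled clauses as (word, concept, role) triples (illustrative only).
ACQUIRE = [
    ("Company X", "C-Company", "SUBJ"),
    ("acquired", None, "MAIN"),
    ("Company Y", "C-Company", "OBJ"),
]
REBUFF = [
    ("Company X", "C-Company", "SUBJ"),
    ("rebuffed", None, "MAIN"),
    ("takeover", None, "MOD_OBJ"),
    ("proposal", None, "OBJ"),
    ("Company Y", "C-Company", "PCOMP"),
]

SUBJ_RULE = [("C-Company", "SUBJ")]
TAKEOVER_RULE = [("C-Company", "?"), ("takeover", "MOD_OBJ")]

print(rule_matches(SUBJ_RULE, ACQUIRE))
print(rule_matches(TAKEOVER_RULE, REBUFF))
print(rule_matches(TAKEOVER_RULE, ACQUIRE))
```

Keeping rules as plain data means new topics can be covered by adding pairs rather than writing new extraction code.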

16
Experimental Results
  • Reuters M&A: Reuters-21578, Apte-90 split, ACQ
    category
  • WSJ: The Wall Street Journal articles on M&A,
    Bankruptcy, Facility Expansions

17
Extraction Results
  • First Financial Management Corp said it has
    offered to acquire Comdata Network Inc for $18
    per share in cash and stock, or a total of about
    $342.7 million
  • Named entities: [First Financial Management Corp,
    Comdata Network Inc]
  • Delta announced last September that it was
    purchasing Western
  • Named entities: [Delta]
  • Nerco Inc said its oil and gas unit closed the
    acquisition of a 47% working interest in the
    Broussard oil and gas field from Davis Oil Co for
    about $22.5 million in cash.
  • Named entities: [Nerco Inc, Davis Oil Co]
  • Gander Mountain Inc said it acquired the
    privately held Western Ranchman Outfitters, a
    catalog and point-of-purchase retailer of western
    apparel based in Cheyenne, WY.
  • Named entities: [Gander Mountain Inc, Western
    Ranchman Outfitters]

18
Using EBIG
19
Company Searching
20
Industry Searching
21
Event Reports
22
Heatmap Event Visualization
23
Conclusions
  • Illustrated an end-to-end business application of
    event extraction
  • Demonstrated the applicability of a multi-agent
    system integrating ML and NLP techniques to
    collection of focal events
  • Analyst relevance feedback will be critical in
    filtering content
  • Learning costs and benefits of news sources will
    improve information quality and system efficiency
  • Deliberative learning

24
Q & A