Intelligent Internet Agents for Distributed Data Mining

Slides: 24
Provided by: dro2
Learn more at: http://tinman.cs.gsu.edu

1
Intelligent Internet Agents for Distributed Data Mining
{yzhang, sowen, sprasad, raj}@cs.gsu.edu, gjv@ece.gatech.edu
  • Yanqing Zhang, Scott Owen, Sushil Prasad and Raj
    Sunderraman
  • Department of Computer Science
  • Georgia State University
  • George Vachtsevanos
  • School of Electrical and Computer Engineering
  • Georgia Institute of Technology

2
Outline
  • Motivation
  • Architecture of Intelligent Internet Agents
  • Program Libraries of Intelligent Middleware
  • Smart Web Search Agents
  • Intelligent Soft Computing Agents
  • Benefits
  • Deliverables
  • Conclusion

3
Motivation
  • Distributed Web KDD: Useful information and
    knowledge mined in distributed Web databases
  • QoS (Efficiency, Web Speed, User Time): Huge
    amounts of useless data flow on the Internet
  • From Data Web to Information Web: Upgrade the
    current data-flow-oriented Internet to a future
    information-flow-oriented Internet
  • Intelligent Web Middleware with reusable,
    portable and scalable intelligent functionality
  • Smart E-Business: Use intelligent Web agents to
    do better E-Business on the Internet

4
Architecture of Intelligent Internet Agents
Application Layer: E-Commerce, E-Education, other E-B
Intelligent Layer: Data Mining, Soft Computing, ES, etc.
Network Layer: Backbone, gigaPoPs, other hardware

5
Program Libraries of Intelligent Middleware
  • Binary Association Rule Generator
  • Fuzzy Association Rule Generator
  • Neural-Net-based Data Classifier and Pattern
    Generator
  • Fuzzy c-means Program for Data Clustering
  • Genetic Algorithms for Data Refinement and
    Optimization
  • Granular Neural Nets for Linguistic Data Mining
  • XML-based Smart Web Search Sub-Programs
  • Connection Programs between Database and Middle
    Layer
  • Local Cache Database Manager
  • Local Cache Informationbase Manager
  • Basic GUI Programs
  • Client-Server Creation and Communication Programs
  • Distributed Operation Manager
  • Distributed Data Mining Synchronization
  • Web Customer Log Miner, ..., and so on.
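
As an illustration of what a binary association rule generator from the list above might compute, here is a minimal Python sketch. It is not the authors' library: the function name `pair_rules`, the thresholds, and the sample baskets are all assumptions made for this example, and only rules between pairs of items are considered.

```python
from itertools import combinations

def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for t in transactions:
        items = sorted(set(t))
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for pair in combinations(items, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # fraction of transactions containing both items
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_count[x]  # estimate of P(y | x)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules

# Hypothetical market baskets, used only for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rules = pair_rules(baskets)
```

With these baskets the rule butter -> bread has support 0.5 and confidence 1.0, because every basket that contains butter also contains bread.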

6
Smart Web Search Agents
  • Data Search Engines >> Information Search Agents
  • - Traditional searching on the Web is done using
    one of the following three:
  • - Directories (Yahoo, Lycos, etc.)
  • - Search Engines (AltaVista, NorthernLight,
    etc.)
  • - Metasearch Engines (MetaCrawler, SavvySearch,
    AskJeeves, etc.)
  • All of these involve keyword searches.
    Drawbacks: not easily personalized,
    too many results (although many give
    relevancy factors)

7
  • Smart Search Agents will provide
  • - more personalized searches
  • - domain-based searches
  • - more efficient searches

8
  • Smart Search Agents will employ
  • - local cache databases (containing frequently
    asked queries and results, possibly updated
    periodically, e.g., nightly)
  • - local cache information bases (containing mined
    information and discovered knowledge for
    efficient personal use)
  • - domain-based agents (e.g., Job Search;
    Sports-NBA Stats; Bibliography-Digital Libraries)
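
The local cache database idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `QueryCache`, the `fetch` callback, and the stand-in `fake_remote_search` are hypothetical names, and entries expire on read after `max_age_seconds` rather than being refreshed by the nightly batch update the slide mentions.

```python
import time

class QueryCache:
    """Cache of frequently asked queries and their results (illustrative only)."""

    def __init__(self, fetch, max_age_seconds=24 * 60 * 60, clock=time.time):
        self.fetch = fetch              # function that runs the real (slow) search
        self.max_age = max_age_seconds  # how long a cached answer stays fresh
        self.clock = clock              # injectable clock, handy for testing
        self.entries = {}               # normalized query -> (timestamp, results)

    def search(self, query):
        key = query.strip().lower()     # normalize so near-identical queries share an entry
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and now - hit[0] < self.max_age:
            return hit[1]               # fresh cached answer, no remote call
        results = self.fetch(key)       # missing or stale: query the remote source
        self.entries[key] = (now, results)
        return results

# Hypothetical remote search, standing in for a real search engine call.
calls = []
def fake_remote_search(query):
    calls.append(query)
    return [f"result for {query}"]

fake_now = [0.0]
cache = QueryCache(fake_remote_search, max_age_seconds=86400, clock=lambda: fake_now[0])
first = cache.search("NBA stats")
second = cache.search("nba stats ")   # same normalized query, served from the cache
fake_now[0] = 90000.0                 # more than one day later
third = cache.search("NBA stats")     # entry is stale, so it is fetched again
```

Only two remote calls are made for the three searches, which is the kind of saving the cache is meant to deliver.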

9
  • Some initial results
  • M. Nagarajan, Metagenie - A metasearch engine
    for multi-databases, M.S. thesis, GSU (July 1999)
  • Domains: Jobs, Books
  • S. Ahmed, EXACT-FINDER: A cache-based
    meta-search engine, M.S. thesis, GSU (May 2000)
  • Local cache database storing personalized
    frequently asked queries and results, updated
    periodically
  • R. Sunderraman, ReQueSS: Relational Querying of
    semi-structured data, ICDE 2000 (demo session),
    San Diego, CA, March 2000.
  • X. Li, Querying unified sources of Web data,
    M.S. thesis, GSU (July 1999)
  • Data wrappers for Web sources (NBA
    stats/box scores, DBLP Bibliography database)

10
Intelligent Tools for E-Business
  • Computational Intelligence, Neural Networks,
    Fuzzy Logic, Genetic Algorithms, Hybrid Systems
  • Learning Algorithms, Heuristic Searching
  • Data Analysis and Modeling, Data Fusion and
    Mining, Knowledge Discovery
  • Prediction, Time Series Analysis
  • Information Retrieval, Intelligent User Interface
  • Intelligent Agents, Distributed IA and
    Multi-Agents, Cooperative Knowledge-based Systems

11
Enhancing E-Business Process Through Data Mining
  • Traditional Data Mining Tools
  • Simple query and reporting
  • Visualization driven data exploration tools, OLAP
  • Discovery process is user driven
  • Quality of discovered knowledge
  • Having right data
  • Having appropriate data mining tools!!!

12
Intelligent Data Mining Tools
  • Automate the process of discovering
    patterns/knowledge in data
  • Require hypothesis, exploration
  • Derive business knowledge (patterns) from data
  • Combine business knowledge of users with results
    of discovery algorithms

13
Intelligent Information Agents
  • The Data Mining Problem
  • Clustering/ Classification
  • Association
  • Sequencing
  • Viewed as an Optimization Problem
  • Tools: Genetic Algorithms

14
Fuzzy Rule Discovery
  • Rule discovery: the discovery of associations
    between business events, i.e., which items are
    purchased together
  • To support flexible querying and intelligent
    searching, fuzzy queries are developed to uncover
    potentially valuable knowledge
  • A fuzzy query uses fuzzy terms like tall, small,
    and near to define linguistic concepts and
    formulate a query
  • Automated search for fuzzy rules is carried out
    by discovering fuzzy clusters or
    segmentation in data
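
To make the idea of a fuzzy query concrete, the following Python sketch defines the linguistic term "tall" as a membership function and ranks records by their degree of match. The breakpoints (160 cm and 190 cm), the threshold, the names `ramp_up`, `tall`, and `fuzzy_query`, and the sample data are assumptions chosen for illustration, not values from the presentation.

```python
def ramp_up(x, low, high):
    """Membership that rises linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

def tall(height_cm):
    # Assumed definition of the fuzzy term "tall".
    return ramp_up(height_cm, 160.0, 190.0)

def fuzzy_query(rows, membership, threshold=0.5):
    """Return (name, degree) pairs with degree >= threshold, best match first."""
    scored = [(name, membership(value)) for name, value in rows]
    matches = [(name, degree) for name, degree in scored if degree >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)

# Hypothetical records: (name, height in cm).
people = [("Ann", 150.0), ("Bob", 175.0), ("Cy", 192.0), ("Di", 181.0)]
result = fuzzy_query(people, tall)
```

Unlike a crisp query such as "height > 180", the fuzzy query returns Cy, Di, and Bob in decreasing order of how well each fits "tall", and leaves out Ann, whose degree is 0.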

15
Fuzzy Decision Making: Match Users with Dynamic Products, Services, and Pricing
Low Risk, High Response, High Retention -> Customer Preferred Pricing according to Life-time Value
Cross-Selling: Bundle Extra Liability Insurance
[Figure: fuzzy inputs Loss Ratio (Risk), Persistency (Retention), and Response, each with levels Low, Medium, High]
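
The rule on this slide (low risk, high response, and high retention lead to customer preferred pricing) can be evaluated with a standard fuzzy AND, taken here as the minimum of the membership degrees. The sketch below is an assumption-laden illustration: inputs are taken to be scores normalized to [0, 1], and the `low` and `high` membership shapes and the function name `preferred_pricing_strength` are hypothetical.

```python
def low(x):
    """Degree to which a normalized score in [0, 1] is 'Low' (assumed shape)."""
    return max(0.0, min(1.0, (0.5 - x) / 0.5))

def high(x):
    """Degree to which a normalized score in [0, 1] is 'High' (assumed shape)."""
    return max(0.0, min(1.0, (x - 0.5) / 0.5))

def preferred_pricing_strength(loss_ratio, response, persistency):
    """Firing strength of: IF risk is Low AND response is High AND retention is High."""
    # Fuzzy AND is modeled as the minimum of the three membership degrees.
    return min(low(loss_ratio), high(response), high(persistency))

good_customer = preferred_pricing_strength(0.1, 0.9, 0.8)   # roughly 0.6
risky_customer = preferred_pricing_strength(0.9, 0.9, 0.9)  # 0.0, since risk is not Low
```

The firing strength could then scale how strongly preferred pricing is offered; how the authors map it to an actual price is not stated in the slides.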
16
Measuring Performance of Intelligent Agents
  • Accuracy: distance or variance measure of the IAs'
    performance from their goal, i.e., Fuzzy
    Entropy
  • Speed: latency of response
  • Cost: resources consumed, consequences of
    failures
  • Benefit: payoff for goals achieved
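
The slides do not say which fuzzy entropy is meant. One common choice is the De Luca-Termini entropy, sketched below in Python as an assumption: it is 0 when every membership degree is crisp (0 or 1) and reaches 1, after normalization, when every degree is 0.5. The function name `fuzzy_entropy` is hypothetical.

```python
import math

def fuzzy_entropy(memberships):
    """Normalized De Luca-Termini fuzzy entropy of membership degrees in [0, 1]."""
    if not memberships:
        raise ValueError("need at least one membership degree")

    def term(p):
        # By convention p * ln(p) is taken as 0 at p = 0.
        return 0.0 if p <= 0.0 else -p * math.log(p)

    total = sum(term(m) + term(1.0 - m) for m in memberships)
    return total / (len(memberships) * math.log(2))  # scale so the maximum is 1

crisp = fuzzy_entropy([0.0, 1.0, 1.0, 0.0])   # fully decided: entropy 0
vague = fuzzy_entropy([0.5, 0.5])             # maximally undecided: entropy 1
mixed = fuzzy_entropy([0.9, 0.2, 0.6])        # somewhere in between
```

Under this reading, an agent whose outputs are closer to crisp decisions scores a lower entropy.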

17
Performance Assessment, Learning and Optimization
[Diagram: Goals/Objectives, Performance Evaluation Module, Learning/Adaptation]
18
Examples
  • Product Information Clustering
  • Use a GA as the Heuristic Search Engine
  • Apply the GA selection and inversion operators
  • Evaluate information content
  • Estimate system entropy
  • Apply reinforcement learning strategy
  • Dynamic Pricing
  • In addition to above steps, explore association
    and sequencing relations
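
A minimal Python sketch of a GA that uses only selection and inversion operators is given below. It is not the authors' system: the toy fitness (ordering products so that similar prices sit next to each other), the price list, the tournament selection scheme, and the elitism step are all assumptions, and the entropy estimation and reinforcement learning steps from the slide are not modeled.

```python
import random

def inversion(chromosome, rng):
    """Inversion operator: reverse the genes between two random cut points."""
    i, j = sorted(rng.sample(range(len(chromosome) + 1), 2))
    return chromosome[:i] + chromosome[i:j][::-1] + chromosome[j:]

def tournament(population, fitness, rng, size=2):
    """Selection operator: keep the fittest of `size` randomly drawn individuals."""
    return max(rng.sample(population, size), key=fitness)

def evolve(fitness, genes, pop_size=20, generations=100, seed=0):
    rng = random.Random(seed)  # seeded so runs are repeatable
    population = [rng.sample(genes, len(genes)) for _ in range(pop_size)]
    for _ in range(generations):
        best = max(population, key=fitness)
        children = [inversion(tournament(population, fitness, rng), rng)
                    for _ in range(pop_size - 1)]
        population = [best] + children  # elitism: never lose the best ordering
    return max(population, key=fitness)

# Hypothetical product prices; a good ordering groups similar prices together.
prices = [5, 50, 7, 52, 6, 49]

def fitness(order):
    # Higher is better: penalize price jumps between neighbouring products.
    return -sum(abs(prices[a] - prices[b]) for a, b in zip(order, order[1:]))

best = evolve(fitness, list(range(len(prices))))
```

Because inversion only reverses a segment, every chromosome remains a valid permutation of the products, so no repair step is needed.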

19
The New Technology Paradigm
[Figure: Internet Related Technologies over Time: Euphoria/Optimism, Reality, Back to Basics]
20
INFORMATION IS SELLING NOW!
Intelligent Agents will give your information
product bargaining power
21
Benefits
  • Better QoS
  • - Web users get information (not raw data)
  • - Smart agents can make decisions for users
  • - Smart agents can save users surfing time
  • Faster Internet
  • - Information flows on the Internet quickly
    (e.g., 1k information << 100k raw data)
  • - Reduce data redundancy on the Internet
  • - Reduce Web communication congestion

22
Deliverables
  • Intelligent Middle Layer
  • - Data Mining Program Libraries
  • - Soft Computing Program Libraries (e.g.,
    Neural Networks, Fuzzy Logic, Genetic Algorithms,
    Neuro-fuzzy Systems)
  • Application Layer
  • - Smart Web Search Agents
  • - Intelligent Soft Computing Agents

23
Conclusion
  • To make the future Internet more intelligent and
    more efficient, it is necessary to design
    relevant "Intelligent Middleware" between network
    hardware and high-level Web application systems.
  • We will first design basic intelligent middle
    layer with basic intelligent functionality, and
    then implement two Web application systems for
    distributed data mining and E-Business.