Web Intelligence (WI) - PowerPoint PPT Presentation

About This Presentation
Title:

Web Intelligence (WI)

Slides: 41
Provided by: yche6
Learn more at: http://coitweb.uncc.edu
Transcript and Presenter's Notes

Title: Web Intelligence (WI)


1
Web Intelligence (WI)
  • Definition, Research Challenges and Major Tools

Yang Chen UNC Charlotte
2
Outline
  • A brief history of Web Intelligence
  • Motivations for WI
  • Definition and Perspectives of WI
  • Research Agenda
  • Major Web Intelligence Tools
  • Conclusion

3
A Brief History of WI
  • 1999: Collaborative research initiatives
  • Ning Zhong: data mining and knowledge systems
  • Jiming Liu: intelligent agents and multi-agent
    systems
  • Yiyu Yao: information retrieval and intelligent
    information systems
  • Combined research efforts with a common goal:
    create a new sub-discipline covering theories and
    techniques related to Web information.

4
A Brief History of WI
  • 2000: Publication of a two-page position paper on
    WI (Zhong, Liu, Yao, Ohsuga; COMPSAC 2000)

5
A Brief History of WI
  • 2001: First Asia-Pacific Conference on Web
    Intelligence
  • 2002: Publication of the first special issue on
    WI in IEEE Computer
  • 2002: Web Intelligence Consortium (WIC)
  • 2003: First edited book on WI
  • 2005: The international WIC Institute

6
Outline
  • A brief history of Web Intelligence
  • Motivations for WI
  • Definition and Perspectives of WI
  • Trends and Research Agenda
  • Major Web Intelligence Tools
  • Conclusion

7
Motivation
  • The sheer size of the Web
  • Difficulties in storage, management, and
    efficient and effective retrieval
  • The complexity of the Web
  • A heterogeneous collection of structured,
    unstructured, semi-structured, interrelated, and
    distributed Web documents
  • Documents consist of text, images, and sound

8
Motivation
Web Intelligence on the Web
9
Industrial Interests in WI
  • Web Intelligence: kis-lab.com/wi01/
  • Web-Intelligence Home Page:
    www.web-intelligence.com/
  • Intelligence on the Web:
    www.fas.org/irp/intelwww.html
  • WIN home, Web Intelligence Network:
    smarter.net/
  • CatchTheWeb - Web Research, Web Intelligence
    Collaboration: www.catchtheweb.com/
  • Infonoia, Web Intelligence In Your Hands:
    www.infonoia.com/myagent/en/baseframe.html

10
Motivations
  • Data production on the Web is growing
    exponentially.
  • Industrial interest in WI is growing fast.
  • Only a few academic papers address it.
  • We need to narrow the gap between industry needs
    and academic research.

11
Outline
  • A brief history of Web Intelligence
  • Motivations for WI
  • Definition and Perspectives of WI
  • Research Agenda
  • Major Web Intelligence Tools
  • Conclusion

12
What is Web Intelligence?
  • Web Intelligence (WI) exploits the fundamental
    and practical impact that advanced Information
    Technology (IT) and innovative Artificial
    Intelligence (AI) will have on the Web
  • Integration of IT with AI
  • Applications of AI on the Web

13
Web Intelligence System
Based on Zhong's AWIC'03 keynote talk
14
An Example
15
Advanced Questions
  • How does the customer enter the VIP portal, so
    that products can be targeted and promotions and
    marketing campaigns managed?
  • What is the semantic association between the
    pages the customer visited?
  • Is the visitor familiar with the Web structure?
    Or is he or she a new user or a random one?
  • Is the visitor a Web robot or some other kind of
    user?

16
Advanced WI System
  • Making a dynamic recommendation to a Web user
    based on the user profile and usage behavior
  • Automatic modification of a website's content
    and organization
  • Combining Web usage data with marketing data to
    give information about how visitors used a
    website.
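
A dynamic recommendation of the kind listed above can be sketched with page co-occurrence counts taken from usage sessions. This is only an illustration under stated assumptions, not the system described in the talk: the session data, page names, and the functions `build_cooccurrence` and `recommend` are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

def build_cooccurrence(sessions):
    """Count how often each ordered pair of pages appears in the same session."""
    co = defaultdict(Counter)
    for pages in sessions:
        for a, b in permutations(set(pages), 2):
            co[a][b] += 1
    return co

def recommend(co, visited, k=2):
    """Rank unvisited pages by total co-occurrence with the pages visited so far."""
    scores = Counter()
    for page in visited:
        for other, count in co[page].items():
            if other not in visited:
                scores[other] += count
    # Sort by score (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [page for page, _ in ranked[:k]]

# Hypothetical usage logs: each list is one visitor session.
sessions = [
    ["home", "laptops", "cart"],
    ["home", "laptops", "reviews"],
    ["home", "phones", "cart"],
    ["laptops", "reviews", "cart"],
]
co = build_cooccurrence(sessions)
print(recommend(co, visited={"home", "laptops"}))  # ['cart', 'reviews']
```

A real system would also weigh the user profile and marketing data mentioned above; co-occurrence alone captures only usage behavior.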

17
Advanced WI System
18
Perspectives of WI
  • WI can be classified into four categories (based
    on Russell and Norvig's scheme)

19
Outline
  • A brief history of Web Intelligence
  • Motivations for WI
  • Definition and Perspectives of WI
  • Research Agenda
  • Major Web Intelligence Tools
  • Conclusion

20
Research Agenda of WI
  • Semantic Web mining and automatic
    construction of ontologies
  • Social network intelligence

21
The Semantic Web
  • The Semantic Web is based on languages that make
    more of the semantic content of the page
    available in machine-readable formats for
    agent-based computing.
  • A semantic language ties the information on a
    page to machine-readable semantics (an
    ontology).

22
Components of Semantic Web
  • A unifying data model such as RDF.
  • Languages with defined semantics, built on RDF,
    such as OWL (DAML+OIL).
  • Ontologies of standardized terminology for
    marking up Web resources.
  • Tools that assist the generation and processing
    of semantic markup.
  • Ontologies provide the semantic backbone for
    Semantic Web applications.
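
The RDF data model mentioned above represents every statement as a (subject, predicate, object) triple. The sketch below mimics that idea with plain Python tuples and a wildcard matcher. It is not an RDF library, and the `ex:` identifiers and the `match` helper are made up for illustration.

```python
# Each statement is a (subject, predicate, object) triple, as in RDF.
# The "ex:" names are made-up identifiers, not a published vocabulary.
triples = {
    ("ex:WebIntelligence", "rdf:type", "ex:ResearchField"),
    ("ex:NingZhong", "ex:worksOn", "ex:WebIntelligence"),
    ("ex:JimingLiu", "ex:worksOn", "ex:WebIntelligence"),
    ("ex:NingZhong", "ex:interestedIn", "ex:DataMining"),
}

def match(store, s=None, p=None, o=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# Who works on Web Intelligence?
people = [s for s, _, _ in match(triples, p="ex:worksOn", o="ex:WebIntelligence")]
print(people)  # ['ex:JimingLiu', 'ex:NingZhong']
```

Because every fact has the same three-part shape, an agent can query data from different sources with one generic pattern matcher, which is the point of a unifying data model.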

23
Ontologies offer
  • Communication: normative models, networks of
    relationships
  • Sharing and reuse: specifications, reliability
  • Control: classification, and finding, sharing,
    and discovering relationships

24
Categories of Ontologies
  • A domain-specific ontology describes a
    well-defined technical or business domain.
  • A task ontology might be either domain-specific
    or reconstructed from a set of domain-specific
    ontologies for meeting the requirement of a task.
  • A universal ontology describes knowledge at
    higher levels.

25
Research Agenda of WI
  • Semantic Web mining and automatic
    construction of ontologies
  • Social network intelligence

26
The Web as a Graph
  • We can view the Web as a directed social network
    that connects people (organizations or social
    entities).
  • Research Questions
  • How big is the graph? (outdegree and indegree)
  • Can we browse from any page to any other?
    (clicks)
  • Can we exploit the structure of the Web?
    (searching and mining)
  • How can we discover and manage Web communities?
  • What does the Web graph reveal about social
    dynamics?
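
Two of the research questions above (degree statistics and the number of clicks between pages) can be made concrete on a toy link graph. This is a minimal sketch: the four-page graph and the helper names `degrees` and `clicks` are hypothetical.

```python
from collections import deque

# Hypothetical link graph: page -> pages it links to.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

def degrees(graph):
    """Return (outdegree, indegree) dictionaries for a directed graph."""
    out_deg = {page: len(targets) for page, targets in graph.items()}
    in_deg = {page: 0 for page in graph}
    for targets in graph.values():
        for target in targets:
            in_deg[target] = in_deg.get(target, 0) + 1
    return out_deg, in_deg

def clicks(graph, start, goal):
    """Fewest link clicks from start to goal (breadth-first search), or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, dist = queue.popleft()
        if page == goal:
            return dist
        for nxt in graph.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

out_deg, in_deg = degrees(links)
print(in_deg["C"])             # 3: pages A, B, and D link to C
print(clicks(links, "D", "B")) # 3: D -> C -> A -> B
print(clicks(links, "A", "D")) # None: no page links to D
```

The last call shows why "can we browse from any page to any other?" is a real question: links are directed, so reachability is not symmetric.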

27
Social Network Intelligence
28
Social Network
29
Outline
  • A brief history of Web Intelligence
  • Motivations for WI
  • Definition and Perspectives of WI
  • Trends and Research Agenda
  • Major Web Intelligence Tools
  • Conclusion

30
Major Web Intelligence Tools
  • I. Collection
  • Offline Explorer
  • SpidersRUs (AI Lab)
  • Google Scholar
  • II. Analysis (Data and Text Mining)
  • Google APIs
  • Google Translation
  • GATE
  • Arizona Noun Phraser (AI Lab)
  • Self-Organizing Map, SOM (AI Lab)
  • Weka
  • III. Visualization
  • NetDraw
  • JUNG
  • Analyst's Notebook and Starlight

31
Collection: Offline Explorer
[Screenshots: the project list and the project
properties setup window, with callouts for download
URLs, download level, file modification check, and
file filters, URL filters, and other advanced
properties.]
32
Analysis: Google APIs
  • Google provides many APIs to help you quickly
    develop your own applications.
  • http://code.google.com/more/
  • Examples of Google APIs:
  • Google API for Inlink: discovers which pages link
    to your website.
  • Google Data APIs: provide a simple, standard
    protocol for reading and writing data on the Web.
    Several Google services provide a Google Data
    API, including Google Base, Blogger, Google
    Calendar, Google Spreadsheets, and Picasa Web
    Albums.
  • Google AJAX Search API: uses JavaScript to embed
    a simple, dynamic Google search box and display
    search results in your own Web pages.
  • Google Analytics: allows users to gather, view,
    and analyze data about their website traffic.
    Users can see which content gets the most visits,
    average page views, and time on site for visits.
  • Google Safe Browsing APIs: allow client
    applications to check URLs against Google's
    constantly updated blacklists of suspected
    phishing and malware pages.
  • YouTube Data API: integrates online videos from
    YouTube into your applications.

33
GATE
  • Information Extraction tasks:
  • Named Entity Recognition (NE): finds names,
    places, dates, etc.
  • Co-reference Resolution (CO): identifies identity
    relations between entities in texts.
  • Template Element Construction (TE): adds
    descriptive information to NE results (using CO).
  • Template Relation Construction (TR): finds
    relations between TE entities.
  • Scenario Template Production (ST): fits TE and TR
    results into specified event scenarios.
  • GATE also includes:
  • Parsers, stemmers, and Information Retrieval
    tools;
  • Tools for visualizing and manipulating
    ontologies; and
  • Evaluation and benchmarking tools.
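
To give a feel for the Named Entity Recognition task listed above, here is a toy recognizer that combines a lookup list with a regular expression for years. It does not use GATE or its API. The `GAZETTEER` entries, the `YEAR` pattern, and `find_entities` are hypothetical, and real NE systems are far more robust.

```python
import re

# Hypothetical lookup list: surface string -> entity type.
GAZETTEER = {
    "Ning Zhong": "Person",
    "Jiming Liu": "Person",
    "University of Arizona": "Organization",
}
# Four-digit years from 1900 to 2099, as a crude stand-in for date detection.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")

def find_entities(text):
    """Return (start, end, type, string) annotations sorted by position."""
    found = []
    for name, kind in GAZETTEER.items():
        for m in re.finditer(re.escape(name), text):
            found.append((m.start(), m.end(), kind, m.group()))
    for m in YEAR.finditer(text):
        found.append((m.start(), m.end(), "Date", m.group()))
    return sorted(found)

text = "Ning Zhong and Jiming Liu co-authored a position paper on WI in 2000."
for start, end, kind, string in find_entities(text):
    print(kind, string)
```

The later tasks (CO, TE, TR, ST) build on annotations like these, for example by linking "he" to a recognized Person or relating a Person to an Organization.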

34
GATE
[Screenshot: the GATE interface, with callouts for
attributes, project information, and the results
display.]
35
SOM
  • The multi-level self-organizing map neural
    network algorithm was developed by the Artificial
    Intelligence Lab at the University of Arizona.
  • Using a 2D map display, similar topics are
    positioned closer together according to their
    co-occurrence patterns; more important topics
    occupy larger regions.
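
The idea of placing similar items near each other on a 2D map can be illustrated with a basic single-level Kohonen self-organizing map. This is not the AI Lab's multi-level algorithm. The grid size, learning schedule, document vectors, and function names below are assumptions chosen to keep the sketch small.

```python
import math
import random

def best_matching_unit(weights, vec):
    """Return the grid cell whose weight vector is closest to vec."""
    return min(
        weights,
        key=lambda cell: sum((w - v) ** 2 for w, v in zip(weights[cell], vec)),
    )

def train_som(data, rows=3, cols=3, epochs=100, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)  # neighbourhood shrinks over time
        for vec in data:
            bmu = best_matching_unit(weights, vec)
            for cell, w in weights.items():
                grid_dist2 = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
                influence = math.exp(-grid_dist2 / (2.0 * radius ** 2))
                # Pull this cell's weights toward the input, scaled by influence.
                for i in range(dim):
                    w[i] += lr * influence * (vec[i] - w[i])
    return weights

# Hypothetical term-weight vectors for four documents on two topics.
docs = {
    "d1": [1.0, 0.9, 0.0, 0.1],
    "d2": [0.9, 1.0, 0.1, 0.0],
    "d3": [0.0, 0.1, 1.0, 0.9],
    "d4": [0.1, 0.0, 0.9, 1.0],
}
weights = train_som(list(docs.values()))
for name, vec in docs.items():
    print(name, best_matching_unit(weights, vec))
```

Each document is assigned to its best-matching grid cell; after training, documents with similar vectors tend to land in the same or neighbouring cells, which is what the 2D display exploits.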

36
SOM
Different Topics
Warm colors represent new topics.
37
Visualization: JUNG
  • The Java Universal Network/Graph Framework (JUNG)
    is a software library for the modeling, analysis,
    and visualization of data that can be represented
    as a graph or network. It was developed by the
    School of Information and Computer Science at the
    University of California, Irvine.
  • http://jung.sourceforge.net/index.html
  • The current distribution of JUNG includes
    implementations of a number of algorithms from
    graph theory, data mining, and social network
    analysis:
  • Clustering
  • Decomposition
  • Optimization
  • Random Graph Generation
  • Statistical Analysis
  • Calculation of Network Distances and Flows and
    Importance Measures (Centrality, PageRank, HITS,
    etc.).
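
As an example of the importance measures named above, PageRank can be computed by simple power iteration. The sketch below is plain Python for illustration and does not use JUNG or reflect its API. The graph, the damping factor of 0.85, and the iteration count are assumptions.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank for a dict {node: [successor nodes]}."""
    nodes = sorted(set(graph) | {t for targets in graph.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same small "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for t in nodes:
                    new_rank[t] += damping * rank[node] / n
        rank = new_rank
    return rank

# Hypothetical link graph: page -> pages it links to.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # C
```

Page C ranks highest because three of the four pages link to it, while D ranks lowest because nothing links to it.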

38
JUNG
Examples of visualization types
39
Conclusion
  • The marriage of hypertext and the Internet led to
    a revolution: the Web.
  • The marriage of Artificial Intelligence and
    advanced Information Technology, on the platform
    of the Web, will lead to another paradigm shift:
    the Intelligent and Wisdom Web.

40
Thank You
Any questions?