Knowledge Management Systems: Development and Applications Part I: Overview and Related Fields

1
Knowledge Management Systems: Development and Applications
Part I: Overview and Related Fields
Hsinchun Chen, Ph.D., McClelland Professor, Director, Artificial Intelligence Lab,
The University of Arizona; Founder, Knowledge Computing Corporation
Acknowledgement: NSF DLI1, DLI2, NSDL, DG, ITR, IDM, CSS, NIH/NLM, NCI, NIJ, CIA, DHS, NCSA, HP, SAP
2
  • My Background (A Mixed Bag!)
  • BS, NCTU, Management Science, 1981
  • MBA, SUNY Buffalo, Finance, MS, MIS
  • Ph.D., NYU, Information System, Minor CS, 1989
  • Dissertation: An AI Approach to the Design of
    Online Information Retrieval Systems (GEAC
    Online Cataloging System)
  • Assistant/Associate/Full/Chair Professor,
    University of Arizona, MIS Department
  • Scientific Counselor, National Library of
    Medicine (USA), National Library of China,
    Academia Sinica

3
  • My Background (A Mixed Bag!)
  • Founder/Director, Artificial Intelligence Lab,
    1990
  • Founder/Director, Hoffman eCommerce Lab, 2000
  • PI: NSF CISE DLI-1, DLI-2, NSDL, DG, DARPA, NIJ,
    NIH, CIA, DHS
  • Associate Editor: JASIST, DSS, ACM TOIS, IEEE
    SMC, IEEE ITS
  • Conference/Program Co-chair: ICADL 1998-2004,
    China DL 2002/2004, NSF/NIJ ISI 2003-2006, JCDL
    2004
  • Industry Consulting: HP, IBM, AT&T, SGI,
    Microsoft, SAP
  • Founder, Knowledge Computing Corporation, 2000

4
Knowledge Management Overview
5
  • Knowledge Management Overview
  • What is Knowledge Management?
  • Data, Information, and Knowledge
  • Why Knowledge Management?
  • Knowledge Management Processes

6
Unit of Analysis
  • Data (1980s)
  • Factual
  • Structured, numeric
  • Examples: Oracle, Sybase, DB2
  • Information (1990s)
  • Factual
  • Unstructured, textual
  • Examples: Yahoo!, Excalibur, Verity, Documentum
  • Knowledge (2000s)
  • Inferential, sensemaking, decision making
  • Multimedia
  • Examples: ???

7
Data, Information and Knowledge
  • According to Alter (1996), Tobin (1996), and
    Beckman (1999):
  • Data: Facts, images, or sounds (+ interpretation
    + meaning =)
  • Information: Formatted, filtered, and summarized
    data (+ action + application =)
  • Knowledge: Instincts, ideas, rules, and
    procedures that guide actions and decisions

8
Application and Societal Relevance
  • Ontologies, hierarchies, and subject headings
  • Knowledge management systems and practices:
    knowledge maps
  • Digital libraries, search engines, web mining,
    text mining, data mining, CRM, eCommerce
  • Semantic web, multilingual web, multimedia web,
    and wireless web

9
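The Third Wave of Net Evolution
[Figure: timeline of three waves of network evolution; axis ticks at 1965, 1975, 1985, 1995, 2000, 2010]
  • ARPANET: Function: Server Access; Unit: Server;
    Example: Email; Company: IBM
  • Internet: Function: Info Access; Unit:
    File/Homepage; Example: WWW ("World Wide Wait");
    Company: Microsoft/Netscape
  • Semantic Web: Function: Knowledge Access; Unit:
    Concepts; Example: Concept Protocols; Company: ???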
10
Knowledge Management Definition
The system and managerial approach to
collecting, processing, and organizing
enterprise-specific knowledge assets for business
functions and decision making.
11
Knowledge Management Challenges
  • making high-value corporate information and
    knowledge easily available to support decision
    making at the lowest, broadest possible levels
  • Personnel Turn-over
  • Organizational Resistance
  • Manual Top-down Knowledge Creation
  • Information Overload

12
Knowledge Management Landscape
  • Research Community
  • NSF / DARPA / NASA, Digital Library Initiative I
    & II, NSDL ($120M)
  • NSF, Digital Government Initiative ($60M)
  • NSF, Knowledge Networking Initiative ($50M)
  • NSF, Information Technology Research ($300M)
  • Business Community
  • Intellectual Capital, Corporate Memory,
  • Knowledge Chain, Competitive Intelligence

13
Knowledge Management Foundations
  • Enabling Technologies
  • Information Retrieval (Excalibur, Verity, Oracle
    Context)
  • Electronic Document Management (Documentum, PC
    DOCS)
  • Internet/Intranet (Yahoo!, Google)
  • Groupware (Lotus Notes, MS Exchange)
  • Consulting and System Integration
  • Best practices, human resources, organizational
    development, performance metrics, methodology,
    framework, ontology (Delphi, E&Y, Arthur
    Andersen, AMS, KPMG)

14
Knowledge Management Perspectives
  • Process perspective (management and behavior):
    consulting practices, methodology, best
    practices, e-learning, culture/reward, existing
    IT → new information, old IT, new but manual
    process
  • Information perspective (information and library
    sciences): content management, manual ontologies
    → new information, manual process
  • Knowledge Computing perspective (text mining,
    artificial intelligence): automated knowledge
    extraction, thesauri, knowledge maps → new IT,
    new knowledge, automated process

15
KM Perspectives
16
KM, Emergence of a Discipline (Ponzi, 2004)
  • Influences from three disciplines: Management and
    Policy (40%), Computer Science (30%),
    Information/Library Science (20%)
  • Continuous, steady growth since 1990 in academic
    publications and industry articles; not a fad
    (unlike BPR, TQM)
  • Seminal books and articles in Knowledge
    Management (e.g., Drucker, Davenport, Nonaka);
    the 50 most-cited KM articles

17
KM Thoughts and Thinkers
  • Future organizations are information-based
    organizations of knowledge workers:
    specialization, cross-discipline task teams,
    disappearance of middle managers (Drucker, The
    Coming of the New Organization)
  • The Japanese management style: tacit knowledge,
    redundancy, slogans, metaphors; the "Ba"; the
    SECI Model: Socialization, Externalization,
    Combination, and Internalization (Nonaka, The
    Knowledge-Creating Company)

18
KM Thoughts and Thinkers (cont'd)
  • Knowledge generation (acquisition, dedicated
    resources, fusion, adaptation, knowledge
    networking); knowledge codification (mapping and
    modeling knowledge); knowledge transfer;
    technologies for KM; learning from experiments
    (Davenport, Working Knowledge)
  • Deep smarts: seeing the big picture and knowing
    the skills; learning from experience (Leonard,
    Deep Smarts)

19
KM Thoughts and Thinkers (cont'd)
  • Teaching smart people how to learn: defensive
    reasoning and the doom loop; learning how to
    reason productively (Argyris, Teaching Smart
    People How to Learn)
  • Technology gets in the way; research on work
    practices; harvesting local innovation and
    innovating with the customer; PARC
    anthropologists (John Seely Brown, Research that
    Reinvents the Corporation)
  • Inverting organizations (individual professionals
    leading); creating intellectual webs (Quinn,
    Managing Professional Intellect)

20
Knowledge Management: The Industry and Status
21
  • Andersen Consulting (Accenture)
  • (1) Acquire
  • (2) Create
  • (3) Synthesize
  • (4) Share
  • (5) Use to Achieve Organizational Goals
  • (6) Environment Conducive to Knowledge Sharing

22
  • Ernst & Young
  • (1) Knowledge Generation
  • (2) Knowledge Representation
  • (3) Knowledge Codification
  • (4) Knowledge Application

23
Reason for Adopting KM
  • Retain expertise of personnel: 51.9%
  • Increase customer satisfaction: 43.1%
  • Improve profits, grow revenues: 37.5%
  • Support e-business initiatives: 24.7%
  • Shorten product development cycles: 23%
  • Provide project workspace: 11.7%
Source: Knowledge Management and IDC, May 2001
24
Business Uses Of KM Initiative
  • Capture and share best practices: 77.7%
  • Provide training, corporate learning: 62.4%
  • Manage customer relationships: 58%
  • Deliver competitive intelligence: 55.7%
  • Provide project workspace: 31.4%
  • Manage legal, intellectual property: 31.4%
25
Leader Of KM Initiative
Knowledge Management and IDC May 2001
26
Implementation Challenges
  • Employees have no time for KM: 41%
  • Current culture does not encourage sharing: 36.6%
  • Lack of understanding of KM and benefits: 29.5%
  • Inability to measure financial benefits of KM: 24.5%
  • Lack of skill in KM techniques: 22.7%
  • Organization's processes are not designed for KM: 22.2%
27
Implementation Challenges (cont'd)
  • Lack of funding for KM: 21.8%
  • Lack of incentives, rewards to share: 19.9%
  • Have not yet begun implementing KM: 18.7%
  • Lack of appropriate technology: 17.4%
  • Lack of commitment from senior management: 13.9%
  • No challenges encountered: 4.3%
Source: Knowledge Management and IDC, May 2001
28
Types of Software Purchased
  • Messaging, e-mail: 44.7%
  • Knowledge base, repository: 40.7%
  • Document management: 39.2%
  • Data warehousing: 34.6%
  • Groupware: 33.1%
  • Search engines: 32.3%
29
Types of Software Purchased (cont'd)
  • Web-based training: 23.8%
  • Workflow: 23.8%
  • Enterprise information portal: 23.2%
  • Business rules management: 11.6%
Source: Knowledge Management and IDC, May 2001
30
Spending On IT Services For KM
  • Training: 15.3%
  • Consulting, planning: 27.8%
  • Maintenance: 13.7%
  • Implementation: 27%
  • Operations, outsourcing: 15.3%
Source: Knowledge Management and IDC, May 2001
31
Software Budget Allotments
  • Enterprise information portal: 35.6%
  • Document management: 26.2%
  • Groupware: 24.4%
  • Workflow: 22.9%
  • Data warehousing: 19.3%
  • Search engines: 13.0%
32
Software Budget Allotments (cont'd)
  • Web-based training: 11.4%
  • Messaging, e-mail: 10.8%
  • Other: 29.2%
Source: Knowledge Management and IDC, May 2001
33
Knowledge Management Systems Overview
34
  • Knowledge Management Systems (KMS)
  • Characteristics of KMS
  • The Industry and the Market
  • Major Vendors and Systems

35
Knowledge Management Systems: Definition
  • KMSs are computer-based information systems that:
  • can help an enterprise acquire, manage, retain,
    analyze, and retrieve mission-critical
    information and help turn enterprise information
    into well-organized, abstract, and actionable
    knowledge; and
  • can help an enterprise identify and
    inter-connect experts, managers, and knowledge
    workers and help extract, retain, and
    disseminate their knowledge in an organization.

36
KM Architecture (Source: GartnerGroup)
[Figure: layered KM architecture diagram. Labels: Web UI (Web Browser); Knowledge Maps (Enterprise Knowledge Architecture); Knowledge Retrieval (Conceptual and Physical KR Functions); Text and Database Drivers; Application Index, Database Indexes, Text Indexes; Workgroup Applications, Databases, Applications; Distributed Object Models; Intranet and Extranet; Network Services; Platform Services]
37
Knowledge Retrieval Level (Source: GartnerGroup)
[Figure: retrieved knowledge, organized into four levels]
  • Concept ("Yellow Pages"): clustering and
    categorization ("table of contents"), semantic
    networks ("index")
  • Semantic: dictionaries, thesauri, linguistic
    analysis, data extraction
  • Collaboration: collaborative filters, communities
  • Value ("Recommendation"): trusted advisor, expert
    identification
38
Knowledge Retrieval Vendor Direction (Source: GartnerGroup)
[Figure: vendor chart for the Knowledge Retrieval market target. Axes: Technology Innovation and Content Experience. Groups: Newbies, IR Leaders, Niche Players; some offerings marked "Not yet marketed"]
Vendors shown:
  • grapeVINE
  • Sovereign Hill
  • CompassWare
  • Intraspect
  • KnowledgeX
  • WiseWire
  • Lycos
  • Autonomy
  • Perspecta
  • Verity
  • Fulcrum
  • Excalibur
  • Dataware
  • IDI
  • Oracle
  • Open Text
  • Folio
  • IBM
  • InText
  • PCDOCS
  • Documentum
  • Lotus
  • Netscape
  • Microsoft
39
  • KM Software Vendors

[Figure: quadrant chart. Axes: Ability to Execute and Completeness of Vision. Quadrants: Challengers, Leaders, Visionaries, Niche Players]
Vendors shown: Lotus, Microsoft, Dataware, Autonomy, Verity, IBM, Excalibur, Netscape, Documentum, PCDOCS/Fulcrum, IDI, Inference, OpenText, Lycos/InMagic, CompassWare, GrapeVINE, KnowledgeX, InXight, WiseWire, SovereignHill, Semio, Intraspect
40
Two Approaches to Codify Knowledge
Top-Down Approach
  • Structured
  • Manual
  • Human-driven

Bottom-Up Approach
  • Unstructured
  • System-aided
  • Data/Info-driven

41
  • Sample KMS
  • Search Engine and Web Portal
  • Data Mining
  • Text Mining
  • Web Mining

42
Managing Information: Search Engine and Web
Portal (Source: Jan Peterson and William
Chang, Excite)
43
Basic Architectures: Search
[Figure: search engine architecture. Components: Web, Spider, Index, search engine (SE) servers, Log, Browser. Annotations: 20M queries/day, spam, freshness, 24x7, quality results, 800M pages?]
44
Basic Architectures: Directory
[Figure: directory architecture. Components: Web, URL submission, surfing, ontology, reviewed URLs, search engine (SE) servers, Browser]
45
Spidering
  • Web: HTML data
  • Hyperlinked
  • Directed, disconnected graph
  • Dynamic and static data
  • Estimated 2 billion indexable pages
  • Freshness
  • How often are pages revisited?
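
As a rough illustration of spidering over a directed, disconnected graph, the sketch below runs a breadth-first crawl over a small in-memory link graph. The `crawl` function, the `toy_web` data, and the `max_pages` parameter are all hypothetical; a real spider would fetch pages over HTTP, parse hyperlinks, and schedule revisits for freshness, none of which is modeled here.

```python
from collections import deque


def crawl(link_graph, seeds, max_pages=100):
    """Breadth-first crawl over an in-memory link graph.

    link_graph maps a URL to the list of URLs it links to. It stands in
    for fetching a page and extracting its hyperlinks (an assumption).
    """
    visited = []              # pages in the order they were crawled
    seen = set(seeds)         # pages already queued, to avoid repeats
    queue = deque(seeds)
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for target in link_graph.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


# Hypothetical four-page web: "d" links in, but nothing links to "d".
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["a"],
}

print(crawl(toy_web, ["a"]))
print(crawl(toy_web, ["d"]))
```

Because links are directed, a crawl seeded at "a" never discovers "d", which is one reason coverage depends on the choice of seed pages.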

46
Indexing
  • Size
  • from 50M to 150M to 3B URLs
  • 50% to 100% indexing overhead
  • 200 to 400GB indices
  • Representation
  • Fields, meta-tags and content
  • NLP: stemming?
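
A minimal sketch of a field-aware inverted index, assuming documents arrive as dictionaries of field name to text. The `build_index` function and the sample `docs` are hypothetical, and tokenization is plain lowercase whitespace splitting with no stemming.

```python
from collections import defaultdict


def build_index(docs):
    """Map (field, term) pairs to the set of document ids containing them."""
    index = defaultdict(set)
    for doc_id, fields in docs.items():
        for field, text in fields.items():
            for term in text.lower().split():
                index[(field, term)].add(doc_id)
    return index


# Hypothetical two-document collection with title and content fields.
docs = {
    1: {"title": "Knowledge Management", "content": "managing knowledge assets"},
    2: {"title": "Data Mining", "content": "mining knowledge from data"},
}
index = build_index(docs)
print(sorted(index[("content", "knowledge")]))
print(sorted(index[("title", "mining")]))
```

Keeping the field in the key lets a query treat a title match differently from a body match, at the cost of a larger index.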

47
Search
  • Augmented Vector-space
  • Ranked results with Boolean filtering
  • Quality-based re-ranking
  • Based on hyperlink data
  • or user behavior
  • Spam
  • Manipulation of content to improve placement
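
The combination of ranked results with Boolean filtering can be sketched as follows. This is a plain TF-IDF cosine ranking, not the augmented vector-space model the slide refers to; the function names, the `must_have` parameter, and the sample documents are assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return (vectors, idf) for docs given as {doc_id: text}."""
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    n_docs = len(tokenized)
    doc_freq = Counter(t for toks in tokenized.values() for t in set(toks))
    idf = {t: math.log(n_docs / df) + 1.0 for t, df in doc_freq.items()}
    vectors = {
        d: {t: tf * idf[t] for t, tf in Counter(toks).items()}
        for d, toks in tokenized.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(docs, query, must_have=()):
    """Rank docs by cosine similarity, keeping only docs with all must_have terms."""
    vectors, idf = tf_idf_vectors(docs)
    q_counts = Counter(query.lower().split())
    q_vec = {t: tf * idf.get(t, 0.0) for t, tf in q_counts.items()}
    scored = [
        (cosine(q_vec, vec), d)
        for d, vec in vectors.items()
        if all(term in vec for term in must_have)  # Boolean AND filter
    ]
    return [d for score, d in sorted(scored, key=lambda x: (-x[0], x[1]))]


# Hypothetical three-document collection.
docs = {
    "d1": "knowledge management systems",
    "d2": "data mining and knowledge discovery",
    "d3": "web mining and text mining",
}
print(search(docs, "knowledge mining"))
print(search(docs, "knowledge mining", must_have=("knowledge",)))
```

A quality-based re-ranking step, for example one driven by hyperlink data, would reorder the list this function returns.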

48
Queries
  • Short expressions of information need
  • 2.3 words on average
  • Relevance overload is a key issue
  • Users typically only view top results
  • Search is a high volume business
  • Yahoo!: 50M queries/day
  • Excite: 30M queries/day
  • Infoseek: 15M queries/day

49
AltaVista: within-site search, machine
translation
50
Directory
  • Manual categorization and rating
  • Labor intensive
  • 20 to 50 editors
  • High quality, but low coverage
  • 200-500K URLs
  • Browsable ontology
  • Open Directory is a distributed solution

51
Yahoo!: manual ontology (200 ontologists)
52
Special Collections
  • Newswire
  • Newsgroups
  • Specialized services (Deja)
  • Information extraction
  • Shopping catalog
  • Events, recipes, etc.

53
The Hidden Web
  • Non-indexable content
  • Behind passwords, firewalls
  • Dynamic content
  • Often searchable through local interface
  • Network of distributed search resources
  • How to access?
  • Ask Jeeves!

54
The Role of NLP
  • Many Search Engines do not stem
  • Precision bias suggests conservative term
    treatment
  • What about non-English documents?
  • N-grams are popular for Chinese
  • Language ID anyone?
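
One reason n-grams suit Chinese is that the text has no spaces between words, so overlapping character n-grams can serve as index terms without a word segmenter. The sketch below shows character bigrams; `char_ngrams` and the sample string are hypothetical.

```python
def char_ngrams(text, n=2):
    """Return overlapping character n-grams, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


# "知识管理系统" means "knowledge management system".
print(char_ngrams("知识管理系统"))
```

Some bigrams straddle word boundaries (for example "识管"), which adds noise to the index but avoids segmentation errors.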

55
Link Analysis
  • Authors vote via links
  • Pages with more inlinks are higher quality
  • Not all links are equal
  • Links from higher quality sites are better
  • Links in context are better
  • Resistant to Spam
  • Only cross-site links considered

56
PageRank (Page98)
  • Limiting distribution of a random walk
  • Jump to a random page with Prob. $\epsilon$
  • Follow a link with Prob. $1 - \epsilon$
  • Probability of landing at a page D:
  • $P(D) = \epsilon/T + (1 - \epsilon) \sum_{C \to D} P(C)/L(C)$
  • Sum over pages C leading to D
  • L(C): number of links on page C
  • T: total number of pages (not defined on the slide)
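
The random-walk description can be sketched as a simple power iteration. This is an illustrative version under stated assumptions, not the production algorithm of any search engine: `epsilon` is the random-jump probability, `T` is taken to be the total number of pages, dangling pages spread their rank evenly, and the `pagerank` function and `links` data are hypothetical.

```python
def pagerank(links, epsilon=0.15, iterations=50):
    """Approximate the limiting distribution of the random walk.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    total = len(pages)                       # T in the formula
    rank = {p: 1.0 / total for p in pages}
    for _ in range(iterations):
        # Every page receives the random-jump share epsilon / T.
        new_rank = {p: epsilon / total for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Follow a link: split (1 - epsilon) * P(C) over L(C) links.
                share = (1.0 - epsilon) * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank over all pages (assumption).
                share = (1.0 - epsilon) * rank[page] / total
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical three-page web.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
print({p: round(r, 3) for p, r in ranks.items()})
```

Page "c" collects links from both "a" and "b", so it ends up ranked above "b", which is linked only from "a".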

57
Who asks What?
  • Query logs revisited
  • Query-based indexing: why index things people
    don't ask for?
  • If they ask for A, give them B
  • From atomic concepts to query extensions
  • Structure of questions and answers
  • Shyam Kapur's chunks

58
Futures
  • Vertical markets: healthcare, real estate, jobs
    and resumes, etc.
  • Localized search
  • Search as embedded app
  • Shopping 'bots
  • Open Problems
  • Has the bubble burst?

59
From SE to Web Portal
  • Spidering: Intranet and Internet crawling
  • Integration: legacy systems and databases
  • Content: aggregation and conversion
  • Process: collaboration, chat, workflow
    management, calendaring, and such
  • Analysis: data and text mining, agent/alert, web
    mining

60
Discovering Knowledge: Data Mining (Source:
Michael Welge, Automated Learning Group, NCSA)
61
Why Data Mining? -- Potential Applications
  • Database analysis, decision support, and
    automation
  • Market and Sales Analysis
  • Fraud Detection
  • Manufacturing Process Analysis
  • Risk Analysis and Management
  • Experimental Results Analysis
  • Scientific Data Analysis
  • Text Document Analysis

62
Data Mining: Confluence of Multiple Disciplines
  • Database Systems, Data Warehouses, and OLAP
  • Machine Learning
  • Statistics
  • Mathematical Programming
  • Visualization
  • High Performance Computing

63
Data Mining: A KDD Process
64
Required Effort for Each KDD Step
65
Data Mining Models and Methods
66
Deviation Detection
  • Identify outliers in a dataset.
  • Typical techniques: OLAP charting, probability
    distribution contrasts, regression analysis,
    discriminant analysis
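
As a simple stand-in for the techniques listed, the sketch below flags outliers by z-score, that is, by distance from the mean in units of standard deviation. The z-score rule is a substitute chosen for brevity and is not one of the techniques named on the slide; the `zscore_outliers` function, the threshold of 2.0, and the sample values are assumptions.

```python
from statistics import mean, stdev


def zscore_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)          # sample standard deviation
    if sigma == 0:
        return []                  # all values identical, nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical measurements with one obvious deviation.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
print(zscore_outliers(readings))
```

A single extreme value inflates both the mean and the standard deviation, so robust alternatives based on the median are often preferred for small samples.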

67
Link Analysis (Rule Association)
  • Given a database, find all associations of the
    form:
  • IF <LHS> THEN <RHS>
  • Prevalence: frequency of the LHS and RHS
    occurring together
  • Predictability: fraction of the RHS out of all
    items with the LHS
  • e.g., beer and diapers
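
The two measures can be computed directly from a list of transactions. In the sketch below, prevalence corresponds to what is commonly called support and predictability to confidence; the `rule_stats` function and the `baskets` data are hypothetical.

```python
def rule_stats(transactions, lhs, rhs):
    """Return (prevalence, predictability) for the rule IF lhs THEN rhs."""
    lhs, rhs = set(lhs), set(rhs)
    n_lhs = sum(1 for t in transactions if lhs <= set(t))
    n_both = sum(1 for t in transactions if (lhs | rhs) <= set(t))
    prevalence = n_both / len(transactions)             # LHS and RHS together
    predictability = n_both / n_lhs if n_lhs else 0.0   # RHS given LHS
    return prevalence, predictability


# Hypothetical market baskets.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"beer", "milk"},
    {"diapers", "milk"},
    {"bread", "milk"},
]
prevalence, predictability = rule_stats(baskets, {"beer"}, {"diapers"})
print(prevalence, predictability)
```

Here beer and diapers appear together in 2 of 5 baskets, and 2 of the 3 baskets containing beer also contain diapers.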

68
Database Segmentation
  • Regroup datasets into clusters that share common
    characteristics.
  • Typical techniques: hierarchical clustering,
    neural network clustering (SOM), k-means
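
Of the techniques listed, k-means is the shortest to sketch. The version below is a bare-bones illustration: it initializes centroids from the first k points so the result is deterministic (real implementations usually randomize), and the `kmeans` function and sample `points` are hypothetical.

```python
def kmeans(points, k, iterations=20):
    """Cluster tuples of numbers into k groups; return (assignments, centroids)."""
    centroids = [points[i] for i in range(k)]   # deterministic init (assumption)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignments = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids


# Hypothetical customer records as (feature_1, feature_2) pairs.
points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.2), (0.9, 1.1), (9.1, 8.9)]
assignments, centroids = kmeans(points, k=2)
print(assignments)
```

The two groups of points are well separated, so the alternating assignment and update steps settle after the first pass.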

69
Predictive Modeling
  • Use past data to predict future response and
    behavior.
  • Typical technique: supervised learning (Neural
    Networks, Decision Trees, Naïve Bayesian)
  • E.g., Who is most likely to respond to a direct
    mailing?
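
To make the direct-mailing example concrete, the sketch below trains a small Naïve Bayes classifier on categorical features and predicts whether a customer responds. The feature names, the training rows, and the functions `train_naive_bayes` and `predict` are all hypothetical, and add-one smoothing is an assumption made to handle unseen values.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)   # (feature index, label) -> value counts
    values_seen = defaultdict(set)          # feature index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, feature_counts, values_seen


def predict(model, row):
    """Return the label with the highest log-probability for this row."""
    class_counts, feature_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in class_counts.items():
        score = math.log(count / total)     # log prior
        for i, value in enumerate(row):
            # Add-one smoothing keeps unseen values from zeroing the product.
            numerator = feature_counts[(i, label)][value] + 1
            denominator = count + len(values_seen[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical past mailings: (age_group, past_customer) -> outcome.
rows = [("young", "yes"), ("young", "no"), ("senior", "yes"),
        ("senior", "yes"), ("young", "no"), ("senior", "no")]
labels = ["respond", "ignore", "respond", "respond", "ignore", "ignore"]
model = train_naive_bayes(rows, labels)
print(predict(model, ("senior", "yes")))
print(predict(model, ("young", "no")))
```

In practice the model would be evaluated on held-out mailings before being used to pick recipients.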

70
Data/Information Visualization
  • Gain insight into the contents and complexity of
    the database being analyzed
  • Vast amounts of underutilized data
  • Time-critical decisions hampered
  • Key information difficult to find
  • Results presentation
  • Reduced perceptual, interpretative, cognitive
    burden

71
Rule Association - Basket Analysis
72
Text Mining Visualization
This data is considered to be confidential and
proprietary to Caterpillar and may only be used
with prior written consent from Caterpillar.
73
Decision Tree Visualizer
74
From Data Mining to Text Mining
  • Techniques: linguistic analysis, clustering,
    unsupervised learning, case-based reasoning
  • Ontologies: XML/RDF, content management
  • P1000: A picture is worth 1000 words
  • Formats/types: email, reports, web pages, etc.
  • Integration: KMS and IT infrastructure
  • Cultural: rewards and unintended consequences