Concept Switching in the Interspace: Networking Infrastructure for Community Knowledge PowerPoint PPT Presentation

presentation player overlay
1 / 61
About This Presentation
Transcript and Presenter's Notes

Title: Concept Switching in the Interspace: Networking Infrastructure for Community Knowledge


1
Concept Switching in the InterspaceNetworking
Infrastructure for Community Knowledge
Bruce Schatz CANIS LaboratoryGraduate School of
Library and Information ScienceUniversity of
Illinois at Urbana-Champaign Graduate School of
Informatics, Kyoto University schatz_at_kuis.kyoto-u.
ac.jp, www.canis.uiuc.edu
IEEE Knowledge Media Networking KMN02 Keynote
Address, CRL, Kyoto Japan, July 11, 2002
2
THE THIRD WAVE OF NET EVOLUTION
CONCEPTS
OBJECTS
PACKETS
3
CONCEPT SPACES
  • from Objects to Concepts
  • from Syntax to Semantics
  • Infrastructure is Interaction with Abstraction

Internet is packet transmission across
computers Interspace is concept navigation
across repositories
4
LEVELS OF INDEXES
5
THE DISTRIBUTED WORLD
  • Community Repositories in the Interspace
  • Peer to Peer Networking Infrastructure
  • Every Person performs Every Role

USER request LIBRARIAN reference INDEXER class
ify PUBLISHER quality AUTHOR generate
6
Meta Data
  • How to Represent the
  • Community Knowledge
  • Automatic and Interactive
  • Representation Techniques
  • for Capturing the
  • Fundamental Structure

7
Meta Maps
  • How to Locate the
  • Community Knowledge
  • Automatic and Interactive
  • Location Techniques
  • for Capturing the
  • Fundamental Landscape

8
CONCEPTS ACROSS THE INTERSPACE
9
SCALABLE SEMANTICS
  • Automatic indexing
  • Domain-Independent indexing
  • Statistical clustering
  • Compute Context of
  • concepts within documents
  • documents within repositories

10
CROSS-OVERS IN SEMANTIC INDEXING
11
COMPUTING CONCEPTS
92 4,000 (molecular biology) 93 40,000
(molecular biology) 95 400,000 (electrical
engineering) 96 4,000,000 (engineering) 98
40,000,000 (medicine)
12
SIMULATING A NEW WORLD
  • Obtain discipline-scale collection
  • MEDLINE from NLM, 10M bibliographic abstracts
  • human classification Medical Subject Headings
  • Partition discipline into Community Repositories
  • 4 core terms per abstract for MeSH classification
  • 32K nodes with core terms (classification tree)
  • Community is all abstracts classified by core
    term
  • 40M abstracts containing 280M concepts
  • concept spaces took 2 days on NCSA Origin 2000
  • Simulating World of Medical Communities
  • 10K repositories with gt 1K abstracts (1K w/ gt
    10K)

13
COMMUNITY PROCESSING
14
Semantic Indexing
  • Extracting Concepts (AI)
  • Canonical noun phrases
  • Generic statistical parser
  • Computing Context (IR)
  • Co-occurrence frequency, in collection
  • Useful interactively, not strict ordering

15
System Side Infrastructure
  • Classification Technologies for Multimedia
    Documents
  • Phrases (multi-word nouns)
  • Concepts (generic phrases)
  • Types (identified concepts)
  • Clusters (grouped types)
  • Structures (semantic universals)

16
INTERSPACE NAVIGATION
  • Semantic Indexes for Community Repositories
  • Navigating Abstractions within Repository
  • concept space category map
  • Interactive browsing by Community experts
  • www.canis.uiuc.edu/interspace-prototype

17
Interspace Remote Access Client
18
Navigation in MEDSPACE
  • For a patient with Rheumatoid Arthritis
  • Find a drug that reduces the pain (analgesic)
  • but does not cause stomach (gastrointestinal)
    bleeding

Choose Domain
19
Concept Search
20
Concept Navigation
21
Retrieve Document
22
Navigate Document
23
Retrieve Document
24
(No Transcript)
25
Category Map
26
Category Navigation
27
Concept Navigation
28
User Side Infrastructure
  • Navigation Technologies for
  • Search Interfaces
  • Exact Match (noun phrases)
  • Relationship List (concept suggestions)
  • Cluster Comparison (groups to groups)
  • Spreading Activation (group intersections)
  • Artificial Landscapes (semantic distances)

29
SWITCHING
  • In the Interspace
  • each Community maintains its own repository
  • Switching is navigating Across repositories
  • use your vocabulary to search
    another specialty

30
Medicine Session
31
Categories and Concepts
32
Concept Switching
33
Document Retrieval
34
CONCEPT SWITCHING
  • Concept versus Term
  • set of semantically equivalent terms
  • Concept switching
  • region to region (set to set) match

35
ENGINEERING SESSION
36
Engineering Categories Concepts
37
Further Concept Navigation
38
Searching via Concept Suggestion
39
Switching Across Repositories
40
Future Technologies
  • Concept Switching
  • Spreading activation, type tagging
  • Dynamic Indexing
  • On-the-fly collections, during session
  • Path Matching
  • Aggregating indexes, many repositories

41
Semantic Analysis of Multimedia
  • Collections of Objects containing Units
  • Text community repository (topic proximity)
  • document abstracts containing noun phrases
  • Image aerial photograph (spatial proximity)
  • feature regions containing texture tiles
  • Units -- media-dependent (statistical parsers)
  • Indexes -- media-independent (statistical
    clusters)

42
Media Interoperability Model
  • text concept space category map (geoscience)
  • 1M phrases in 500K abstracts from Georef
  • and Petroleum Abstracts
  • image concept category maps in aerial photos
  • visual thesaurus maps for 200K regions in 800
    images (6M tiles)
  • geographic map (where) v. semantic map (what)
  • spatial gazetteer as bridge imageltgttextltgtnumber

43
Text and Number Interoperability
Integrated Result Within the bounding geography
location, 2 documents and 88 AVHRR records
related to the integrated query are retrieved.
Text and AVHRR Query Show me information about
Santa Barbara area with mild temperature and high
vegetation density.
44
Image Concept Switching
Image Query By browsing a texture (tile)
catalog, show me information about residential
and farm land areas.
Result A set of related images are retrieved and
shown in the Results Frame. The full-size image
368 is displayed with its place names and tile
locations.
45
INFORMATION SPACEFLIGHT
  • Landscape as category map visualization
  • Valleys are semantic clusters
  • Hills are semantic distances
  • Traversal across multiple levels of abstraction

46
Category Maps
47
SELF-ORGANIZING MAPS (SOMs)
48
INFORMATION SPACEFLIGHT
49
INFORMATION SPACEFLIGHT
Flying through Cyberspace
50
THE NET OF THE 21st CENTURY
  • Beyond Objects to Concepts
  • Beyond Search to Analysis
  • Problem Solving via Cross-Correlating Multimedia
    Information across the Net
  • Every community has its own special library
  • Every community does semantic indexing
  • The Interspace is true Cyberspace

51
Subject Assignment
  • Improved Search by
  • Identifying Subjects
  • Human Indexers classify Documents
  • From Subject Thesaurus and Knowledge
  • Interactive Support for Community Curators
  • (Subject Experts but Classification Amateurs)
  • Use Concept Spaces to Suggest Subjects
  • From Related Documents in the Collection
  • See Best Paper Nominee at ACM DL 98

52
Structure Assignment
  • Improved Search by
  • Identifying Structures
  • Human Indexers classify Clusters
  • From Generic Structures beyond Subjects
  • Universal Structures Cross-Cultural
  • Interactive Support for Community Curators
  • (Subject Experts but Classification Amateurs)
  • Necessary for Peer-Peer Infrastructure
  • When Ordinary Persons form Communities

53
(No Transcript)
54
The Structures of Everyday Life
  • Bodies (individuals)
  • Food and Clothes
  • Buildings (groups)
  • Houses and Cities
  • Transportation (physical interactions)
  • Rails (trains) and Roads (cars)
  • Communication (logical interactions)
  • Phones (talking) and Computers (retrieving)

55
Navigating Universal Structures
  • A planet for every kids local environment
  • Federating the planets into a universe
  • Ordering all planets from kids Point Of View
  • Flying through the Kids Universe
  • Finding similar kids from different POVs
  • Connecting historically through museums

56
(No Transcript)
57
(No Transcript)
58
(No Transcript)
59
(No Transcript)
60
(No Transcript)
61
(No Transcript)
Write a Comment
User Comments (0)
About PowerShow.com