Semantics, Syndication and Social Networks: Mechanisms for Future Structured Information Spaces

1
Semantics, Syndication and Social Networks:
Mechanisms for Future Structured Information Spaces
  • Hamish Cunningham (University of Sheffield)
  • Werner Haas (Joanneum Research)
  • Ant Miller (BBC)
  • Libby Miller (University of Bristol)
  • Ralph Traphoener (Empolis / Bertelsmann)
  • Paul Warren (British Telecom)
2
What's the difference between Mother Theresa and
Tony Bliar?
http://gate.ac.uk/ http://nlp.shef.ac.uk/
Hamish Cunningham, Dept. of Computer Science,
University of Sheffield
3
Why semantic metadata?
  • Different types of metadata allow different types
    of search (but also incur different costs and
    have different limits):
  • full text: "find me Nevsky in Bulgaria"
  • taxonomy / thesaurus / semantic annotation /
    ontology: "find me churches in Eastern Europe"
  • E.g. BBC's INFAX taxonomic system: 66% of
    searches would fail if only full text were available
  • The web promotes diversity but also
    fragmentation: there's too much of it, and less and
    less impact for curated data
  • In the face of this, cultural memory institutions need:
  • Syndication and mediation (to pool outlets and
    multiply impact): this means presentation-independent,
    multipurpose content
  • Users as assistants (to cut the cost of
    metadata): this can mean shared
    conceptualisations of content
  • How do we get there?
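
The contrast between full-text and taxonomy-based search on this slide can be sketched in a few lines of Python. Everything here is an illustrative assumption: the two catalogue records, the hand-assigned terms, and the `BROADER` table are hypothetical and are not taken from INFAX or any real system.

```python
# Hypothetical catalogue records: free text plus hand-assigned taxonomy terms.
RECORDS = [
    {"text": "Alexander Nevsky Cathedral, Sofia, Bulgaria",
     "terms": {"cathedral", "Bulgaria"}},
    {"text": "Old harbour warehouse, Bristol",
     "terms": {"warehouse", "England"}},
]

# Hypothetical taxonomy: each term maps to its broader term.
BROADER = {
    "cathedral": "church",
    "Bulgaria": "Eastern Europe",
    "England": "Western Europe",
}


def expand(term):
    """Return the term together with all of its broader terms."""
    expanded = {term}
    while term in BROADER:
        term = BROADER[term]
        expanded.add(term)
    return expanded


def full_text_search(query):
    """Match records whose text contains every query word (case-insensitive)."""
    words = query.lower().split()
    return [r["text"] for r in RECORDS
            if all(w in r["text"].lower() for w in words)]


def taxonomy_search(*wanted):
    """Match records whose terms, expanded through BROADER, cover every wanted term."""
    hits = []
    for r in RECORDS:
        covered = set()
        for term in r["terms"]:
            covered |= expand(term)
        if all(w in covered for w in wanted):
            hits.append(r["text"])
    return hits


nevsky = full_text_search("Nevsky Bulgaria")
churches_full_text = full_text_search("churches Eastern Europe")
churches_taxonomy = taxonomy_search("church", "Eastern Europe")
```

In this toy setup the full-text query for "churches Eastern Europe" finds nothing, because neither word appears in any record, while the taxonomy query finds the cathedral through its broader terms. The cost is that someone has to build the taxonomy and assign the terms.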

4
The semantic web and why you can't have it (yet)
  • The semantic web is about a semantic layer for
    interoperability, machine-readability, inference:
    ideal for semantic libraries?
  • Problems:
  • Construction and maintenance of shared
    taxonomies, terminologies and ontologies is
    expensive
  • Annotation of content relative to them is very
    expensive
  • How does a machine tell the difference between
    "Mother Theresa is a Saint" and "Tony Blair is a
    Saint"? (Beyond the shallow and the general we
    get into typical AI problems, the contextual and
    shifting nature of meaning, etc.)

5
Four promising directions
  1. Use recommender systems to make the users into
    curators' assistants (who tells Google which page
    is important? other web users do, by linking;
    also Amazon)
  2. Allow curators and users to DIY simple specific
    ontologies and KBs (targeted adjuncts to general
    models like CIDOC)
  3. Use Information Extraction (IE) to populate
    semantic models
  4. Ride the next wave of social software and on-line
    communities (wikis, blogs, OSN, file sharing /
    P2P, RSS/Atom)

6
IT context: the Knowledge Economy and Human
Language
  • Gartner, December 2002:
  • taxonomic and hierarchical knowledge mapping and
    indexing will be prevalent in almost all
    information-rich applications
  • through 2012 more than 95% of human-to-computer
    information input will involve textual language
  • A contradiction:
  • to deal with the information deluge we need
    formal knowledge in semantics-based systems
  • our archived history is in informal and ambiguous
    natural language
  • The challenge: to reconcile these two phenomena

7
HLT: Closing the Loop
  • Key: MNLG = Multilingual Natural Language
    Generation; OIE = Ontology-aware Information
    Extraction; AIE = Adaptive IE; CLIE = Controlled
    Language IE
  • [Diagram: a loop between Human Language and Formal
    Knowledge (ontologies and instance bases), with
    links labelled (M)NLG, OIE, (A)IE and CLIE (via
    Controlled Language); also labelled Semantic Web,
    Semantic Grid, Semantic Web Services]
8
Information Extraction
  • Information Extraction (IE) pulls facts and
    structured information from the content of large
    text collections.
  • Contrast IE and Information Retrieval
  • NLP history: from NLU to IE
  • Progress driven by quantitative measures
  • MUC: Message Understanding Conferences
  • ACE: Automatic Content Extraction
  • General Architecture for Text Engineering (GATE):
    http://gate.ac.uk/

9
IE Example
  • The shiny red rocket was fired on Tuesday. It is
    the brainchild of Dr. Big Head. Dr. Head is a
    staff scientist at We Build Rockets Inc.
  • NE (named entities): "rocket", "Tuesday",
    "Dr. Head", "We Build Rockets"
  • CO (coreference): "it" = rocket; "Dr. Head" =
    "Dr. Big Head"
  • TE (template elements): the rocket is "shiny red"
    and Head's "brainchild"
  • TR (template relations): Dr. Head works for We
    Build Rockets Inc.
  • ST (scenario template): rocket launch event with
    various participants
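
As a rough illustration of the NE and CO steps on the example text, here is a minimal Python sketch. It is not how GATE or any MUC system works: the regular expressions in `PATTERNS`, the type names, and the surname-matching rule are hypothetical and were hand-written for this one passage.

```python
import re

TEXT = ("The shiny red rocket was fired on Tuesday. It is the brainchild of "
        "Dr. Big Head. Dr. Head is a staff scientist at We Build Rockets Inc.")

# Hypothetical hand-written patterns, one per entity type.
PATTERNS = {
    "Artefact": r"\brocket\b",
    "Date": r"\b(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day\b",
    "Person": r"\bDr\. [A-Z][a-z]+(?: [A-Z][a-z]+)*",
    "Organization": r"\b(?:[A-Z][a-z]+ )+Inc\.",
}


def find_entities(text):
    """Return (start offset, type, surface string) for each match, in text order."""
    found = []
    for etype, pattern in PATTERNS.items():
        for match in re.finditer(pattern, text):
            found.append((match.start(), etype, match.group()))
    return sorted(found)


def corefer_persons(entities):
    """Group Person mentions that share a final token (a crude surname match)."""
    chains = {}
    for _, etype, surface in entities:
        if etype == "Person":
            chains.setdefault(surface.split()[-1], []).append(surface)
    return chains


entities = find_entities(TEXT)
chains = corefer_persons(entities)

for _, etype, surface in entities:
    print(f"{etype:13} {surface}")
print(chains)
```

Patterns like these break as soon as the text changes. Resolving "It" to the rocket, or filling the TE, TR and ST templates, needs considerably more than string matching.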

10
Ontology-based IE
  • Example text: "XYZ was established on 03 November
    1978 in London. It opened a plant in Bulgaria in ..."
  • [Diagram: the text linked to an ontology and KB.
    Classes: Location, Company, City, Country.
    Relations: HQ, partOf, type, establOn.
    Value: 03/11/1978]
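
One plausible reading of this slide is that the extracted facts are stored as subject/predicate/object triples typed against a small ontology. The Python sketch below is an assumption throughout: the original diagram's arrows are not recoverable from the transcript, so the particular triples, the `SUBCLASS_OF` table, and the helper names are hypothetical. Only the class and relation labels come from the slide.

```python
from datetime import date

# Assumed class hierarchy: City and Country are kinds of Location.
SUBCLASS_OF = {"City": "Location", "Country": "Location"}

# Assumed instance facts, read off the example sentence.
TRIPLES = {
    ("XYZ", "type", "Company"),
    ("London", "type", "City"),
    ("Bulgaria", "type", "Country"),
    ("XYZ", "HQ", "London"),
    ("XYZ", "establOn", date(1978, 11, 3)),
}


def is_a(instance, cls):
    """True if instance has type cls, directly or through SUBCLASS_OF."""
    for subject, predicate, obj in TRIPLES:
        if subject == instance and predicate == "type":
            current = obj
            while current is not None:
                if current == cls:
                    return True
                current = SUBCLASS_OF.get(current)
    return False


def objects(subject, predicate):
    """Return every object recorded for the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


print(is_a("London", "Location"))
print(objects("XYZ", "establOn"))
```

Typing the instances against classes is what lets a query for locations return London even though the text never uses that word.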
11
A Necessary Trade-Off: Domain specificity vs.
task complexity
  • [Chart: "acceptable accuracy" plotted against two
    axes. Specificity axis: general to domain-specific.
    Complexity axis: simple to complex, with labels
    bag-of-words, entities, relations, events]
12
Open information, defended communities
  • Trend 1: seconds out, round 5: file sharing is
    about to go social
  • Trend 2: the living room is about to be
    computerised
  • What will happen when all your living room
    devices fold into a single PC?
  • Bill Gates hopes you'll be running Windoze, but
    Consumer Electronics firms bet on Linux: stable
    hardware (no viruses, no crashes, cheap, ...)
  • What if these two trends combine? Ubiquitous
    on-line communities centred on shared content,
    with a model of trust
  • What if memory institutions provide means of
    organising, explaining, interlinking the
    cross-over between modern popular culture and the
    curated memory?
  • Important because DRM is the beginning of the end
    of civilisation as we know it (it controls how you
    consume media you buy, and has the potential to be
    linked with censorship and with invasive
    behaviour logging)
  • you can't make digital objects behave like
    physical objects, unless you totally control the
    hardware and the operating system
  • if someone has control, then we may end up
    finding that someone has given the contract for
    preserving our culture to Halliburton

13
Memory is not a luxury
  • C21st: all the C20th mistakes, but bigger and
    better?
  • If you don't know where you've been, how can you
    know where you're going?
  • Libraries, museums, archives: ammunition in the
    war on ignorance (more dangerous than terror?)
  • Ammunition is useless if you can't find it: new
    technology must make our history accessible to
    all, for all our futures

14
Summary
  • Cultural memory can benefit from semantic
    metadata, presentation-independence and
    repurposing
  • Semantic web technology:
  • no, it won't make machines intelligent
  • perhaps simple specific models can work
  • Four ways to cross the AI bridge: DIY models,
    recommenders, IE, OSN / P2P
  • This talk: http://gate.ac.uk/talks/ecdl-sept-2004.ppt
  • More: http://gate.ac.uk/
  • Related projects: ...