Information Integration the Web Way - PowerPoint PPT Presentation

Transcript and Presenter's Notes


1
Information Integration the Web Way
  • Andrew Schain, Kendall Clark

2
Background
  • It all started innocently enough: I was at a
    conference 8 months ago and heard about a problem
    Jeanne was having.
  • It is an interesting problem.

3
The Problem
  • It is virtually impossible to discover critical
    information that you did not know existed and
    extremely difficult to find relevant information
    you are aware of.
  • Our data problem exists within at least 5
    dimensions: size, complexity, diversity, rate of
    growth, and trust.
  • Use-case scenarios and requirements change all
    the time.
  • We cannot anticipate in advance what the next
    collection of information elements needs to be or
    for what purpose!!

4
The challenge
  • Integrate information from disjoint data sources,
    ad hoc, to solve customer needs
  • Without upsetting delicate info-ecologies (data
    owners, curators, extant policies and procedures)
  • Without requiring major investment in time or

5
The inspiration...
6
The Goal
  • Alleviate NASA's data management problem by
    making information discoverable with machine
    assistance, and retrieved as an integrated
    response across different databases,
    repositories, sources and systems.

7
The design principles
8
  • Aggregate and federate information
  • Deploy a service that makes the whole
    infrastructure smarter
  • Leverage public standards
  • Innovate in the user interface
  • Formalize information models

9
Info Federation
  • Must leave data in situ, close to those who know
    it best
  • Must not upset delicate info-niches by alienating
    curators or owners, or by violating policies and
    politics
  • Use a sufficiently expressive federation
    technology (in this case, W3C's RDF)

10
Deploy a service
  • POPS is an expertise locator app, but...
  • Also a service, deployed in the fabric of NASA
    application infrastructure
  • Thus, the POPS data is reusable by other apps
  • Lower barriers to ad hoc reuse

11
Leverage public standards
  • Why? Humility, laziness, tiny budget!
  • Promotes reuse, cohesion with existing
    technologies
  • Open Source software is our friend
  • Return on Investment
  • They work! :)

12
Some candidate public standards
  • HTTP, SOAP, WSDL, SPARQL Protocol for RDF
  • XML, RDF, RDFS, JSON, OWL
  • SPARQL Query Language for RDF
  • FOAF, DOAP
  • Atom Syndication Format

13
Innovative UI
  • Be different? Look-feel-and-act different.
  • JSpace, a Polyarchical Visual Query Builder for
    Federated RDF Stores
  • Social Network visualizations
  • What the hell is a polyarchy?

14
JSpace
  • A polyarchy is a means of interacting with
    multiple intersecting hierarchies
  • Which is precisely what many information
    integration problems are (people, orgs,
    projects, skills)
  • The backend is only half the problem

15
Visual Query Builder?
  • Folks can learn a QL, but why?
  • Get the machine to build queries based on regular
    and customary user input and browsing
  • Browsing better than searching
  • Add query-by-example (find another thing like
    this thing except with this difference...)
  • Propinquity!!!

16
Formalizing Information
  • Use OWL (Web Ontology Language) to formalize the
    problem domain(s)
  • Why?
  • Correctness, create shared understanding,
    regulatory compliance (DRM)
  • To prepare for the eventual semantic upshift

17
Other stuff we probably need
  • Model Libraries
  • Data Access agreements
  • Data assurance
  • Model assurance
  • Good go-to application models
  • Desire and commitment
  • Lots more

18
What's next for POPS?
  • We have buy-in and a project plan with the OCE.
    We will validate our agreement (plan) for
    implementation within the NEN and have it done
    before I go on vacation in August
  • Continue working with Clark & Parsia, Kendall,
    Bijan, Jen Golbeck, Mike Grove, Chris Shenton, and
    others, including folks on my SAIC team, to build
    a similar service at HQ for EA as-builts.

19
What about the rest of us?
  • Let's throw a party!
  • For our comrades who are current practitioners
  • Give them a blank piece of paper and have them
    write down stuff that would make things easier and
    stuff that makes things really hard
  • Invite some folks we want to make friends with
    and work the list together

20
Examples
21
Models in Federated Libraries
  • Domain specific references that can be used by
    developers
  • Domain specific information representations
    (complete with logic, cardinality, etc.) that can
    be used to form queryable information that cuts
    across sources
  • Code repositories
  • Web Services repositories so that task-orientated
    computing services can be discovered, assessed,
    choreographed, and orchestrated

22
Data access agreements
  • Between who and who (and who is keeping track?)
  • Valid?
  • Has it gone through a validity checker (y/n)?
  • Current?
  • Is it fresh? (It may not need to be, but we need
    to know.)
  • Provenance?
  • Who is the responsible person for the system and
    for the data?
  • Access Permissions?
  • Given the set of data required, does the access
    permission change?

23
But really to
  • Articulate the goal
  • Develop the planning, including gathering
    requirements, prioritizing tasks, identifying
    resources, and setting up a road map for the next
    few years.
  • Some of you will be invited, if not to the first
    one (April), then to the lollapalooza InterOp

24
Backups
25
Mathematics of the who-knows-who relationship visualization
Given a set of people, $P$, and a set of
relationships, $R$, that connect people and
entities, we define five types of relationships:
1) same facility, 2) same department, 3) same
skill and department, 4) same skill and project,
5) same skill, project, and facility. Call these
$r^{1}$ through $r^{5}$. $r^{i}_{x,y}$ indicates a
relationship of type $i$ between person $x$
($p_x$) and person $y$ ($p_y$).

There is a direct connection between users $p_u$
and $p_s$ if there exists an $r^{m}_{u,s}$. If
there is not a direct connection, we search for a
path from $p_u$ to $p_s$ by finding $p_a$ such
that there exist $r^{m}_{u,a}$ and $r^{n}_{a,s}$.
Then, we add
$(p_u, p_s, p_a, r^{m}_{u,a}, r^{n}_{a,s})$ to
the graph.

For example, if Alice is the user and Bob is the
selected person, we will look for a direct
relationship between them, such as if Alice and
Bob both work in the same department (i.e. find
$r^{m}_{alice,bob}$). If the direct relationship
does not exist, we look at all the people Alice
has relationships with, and check to see if any of
them also have relationships with Bob. For
example, Alice may work in the same facility as
Chuck ($r^{1}_{alice,chuck}$). Chuck, in turn, may
have the same skill and work on the same project
as Bob ($r^{4}_{chuck,bob}$). Chuck then becomes a
connection between Alice and Bob. All three
people and their relationships are added to the
graph.
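The search described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the presenters' implementation: relationships are given as (type, person, person) tuples, they are treated as symmetric, and only the first relationship type seen for a pair is kept. The function names are hypothetical.

```python
# Hypothetical sketch of the direct / one-intermediary connection search.

def build_index(relationships):
    """Index symmetric relationships as person -> {other person: type}."""
    index = {}
    for rel_type, x, y in relationships:
        # Keep only the first relationship type seen for each pair.
        index.setdefault(x, {}).setdefault(y, rel_type)
        index.setdefault(y, {}).setdefault(x, rel_type)
    return index


def connect(index, user, selected):
    """Return graph entries linking user to selected.

    A direct link yields [(user, selected, m)].
    Otherwise each intermediary a yields (user, selected, a, m, n),
    where m is the user-a type and n is the a-selected type.
    """
    direct = index.get(user, {}).get(selected)
    if direct is not None:
        return [(user, selected, direct)]
    entries = []
    for via, m in index.get(user, {}).items():
        n = index.get(via, {}).get(selected)
        if n is not None:
            entries.append((user, selected, via, m, n))
    return entries


# Alice and Chuck share a facility (type 1);
# Chuck and Bob share a skill and project (type 4).
index = build_index([(1, "alice", "chuck"), (4, "chuck", "bob")])
path = connect(index, "alice", "bob")
```

Here `path` holds a single entry naming Chuck as the connection between Alice and Bob, matching the example in the text.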