XML Working Group, Emerging Technology Committee, U.S. Federal CIO Council

1
XML Working Group, Emerging Technology
Committee, U.S. Federal CIO Council
XML, XFML, and the Other Xs: From Data
Aggregation to Faceted Search-and-Discovery to
Business Value
Joint presentation
Washington, DC, July 23, 2003
Brett Stein, Solutions Director, XAware, Inc.
Iqbal Talib, Co-Founder & CEO, i411, Inc.
2
Presentation Outline (Road Map)
Presenters: Brett, Iqbal
1. Objectives
2. Rationale
3. Context
4. Impetus
5-13. Data aggregation
14-16. Context for Demos 1 & 2
18. Demo 1
19-30. Faceted search & discovery
31. Demo 2
32-33. Conclusions, Q&A
3
1. Objectives (what? bottom line?)
  • Demystify some of the Xs, e.g.
  • - XML, XFML, XTM, and XQuery
  • - XSLT, ebXML, DASL
  • Show the power of creating real-time read/write
    access through unified XML views of disparate,
    dispersed, and (un)structured data sources among
    Federal agencies
  • - Demo 1: Convert OMB data to XML
  • Show human-friendly, category-based navigation
    via real-time, faceted search-and-discovery
    technology
  • - Demo 2: Conduct faceted search of OMB data
  • Illustrate how others will benefit, e.g.,
    unlock the full social, economic, and
    intelligence value of information assets of a
    government agency

4
2. Rationale (basis for joint presentation?)
  • Vital need among Federal agencies for data
    integration, e.g., treat multiple data
    repositories as one logical source and create a
    single view
  • Urgency to access, search, retrieve, share, and
    exchange critical information across sister
    agencies, often in real time
  • Leverage innovative (but proven) better, faster
    mousetraps to attain technology and business
    goals, including ROI
  • XML underpins the common format for data
    exchange, and is the basis upon which faceted
    searching can be enabled
  • XFML allows the sharing of hierarchical faceted
    metadata and indexing efforts
  • Relevance to the xmlWG and the CIO Council, and
    education and outreach to government
  • "Show me" demos suggested by the xmlWG (OMB
    Exhibit 53)

5
3. Context (big picture? external forces at play?)

[Diagram: external forces acting on the Federal government]
Federal government: civilian agencies, defense agencies,
intelligence agencies, security agencies
External forces (numbered 1-10 in the original diagram):
citizens/CRM; a new world; new technologies; Congress;
GPRA/NPR 1993 and ITMRA 1996; ERM and GPEA 1998;
9/11 and the USA PATRIOT Act; FEA 2001;
e-Gov/FirstGov 2002; U.S. DHS 2003
6
4. Impetus (example issues?)
  • Government agencies face significant challenges
    and opportunities with tagging, sharing,
    searching, and exchanging information, e.g.:
  • Of the 28 lines of business found in the Federal
    government, 19 Executive Departments and agencies
    (on average) are performing the same line of
    business (E-Government Strategy, OMB)
  • Each agency typically has invested in traditional
    approaches, regardless of other departments'
    redundant efforts (E-Government Strategy, OMB)
  • Agencies may be wasting at least 20% of the
    approx. $60 billion (FY 2004) allocation for IT
    on redundant systems and services
  • 40% of all application development effort is
    spent on accessing existing data (IDC)
  • Myriad issues, trends, and mandates
  • - typhoons of data
  • - silos, stovepipes
  • - interoperability, interconnectivity
  • - net-centricity
  • - horizontal fusion
  • - EA, EAI, ERM

7
5. XML and the Other Xs (brief definitions?)
  • XML (eXtensible Markup Language): A standard,
    simple, self-describing way of encoding both text
    and data so that content can be processed with
    relatively little human intervention and
    exchanged across diverse hardware, operating
    systems, and applications.
  • XFML (eXchangeable Faceted Metadata Language): An
    XML model to express topics, organized in
    hierarchies or trees within mutually exclusive
    containers called facets. It also allows the
    expression of indexing efforts (metadata
    assigned to pages of data).
  • XTM (XML Topic Maps): An XML specification that
    provides a model and grammar for representing the
    structure of information resources used to define
    topics, and the associations (relationships)
    between topics.
  • XQuery (XML Query Language): A query language
    that uses the structure of XML to intelligently
    express queries across all these kinds of data,
    whether physically stored in XML or viewed as XML
    via middleware.
  • XSLT (eXtensible Stylesheet Language
    Transformations): A language for transforming XML
    documents into other XML documents (or HTML).
  • ebXML (Electronic Business using eXtensible
    Markup Language): A modular suite of
    specifications that enables enterprises of any
    size and in any geographical location to conduct
    business over the Internet.
  • XFDL (eXtensible Forms Description Language): The
    purpose of XFDL is to solve the body of problems
    associated with digitally representing complex
    forms such as those found in business and
    government.


8
6. Standards-Driven Enterprise Info. Integration
  • Virtual views into multiple data sources
    (Source: Giga Information Group)
  • A view represents a business entity in a
    metadata-based description, e.g.
  • - a customer
  • - a sales pipeline
  • - the performance of a manufacturer's production
    floor
  • Applications access a view as if the data were
    physically located in a single database, even
    though individual data may reside in a different
    source system
  • When an application accesses a view, the EII
    platform transparently handles connectivity with
    back-end databases and applications, along with
    related functions, e.g., security, data
    integrity, and query optimization
  • XML is the ideal conduit for enabling such
    on-demand views
  • XML Views facilitate the continued utilization of
    current IT investments, and the eventual
    intelligent migration to "new world" technologies
    according to business rules
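
The idea of a virtual view over several back-end sources can be sketched roughly as follows. This is a minimal illustration in Python, not XAware's API or that of any EII product: the SQLite table, the CSV data, and the function name `customer_view` are all hypothetical, and real platforms would add the security, integrity, and query-optimization functions the slide mentions.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET

# Source 1 (hypothetical): a relational table held in an in-memory SQLite database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO customer VALUES (1, 'Acme')")

# Source 2 (hypothetical): a flat file kept by a different system.
ORDERS_CSV = (
    "customer_id,order_id,total\n"
    "1,A-100,250.00\n"
    "1,A-101,75.50\n"
    "2,B-200,10.00\n"
)


def customer_view(customer_id):
    """Assemble one XML view of a customer on demand from both sources.

    Nothing is copied into a central store; each call reads the sources
    directly and returns an ElementTree element, or None if the customer
    is unknown.
    """
    row = db.execute(
        "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()
    if row is None:
        return None
    view = ET.Element("customer", id=str(row[0]))
    ET.SubElement(view, "name").text = row[1]
    orders = ET.SubElement(view, "orders")
    for rec in csv.DictReader(io.StringIO(ORDERS_CSV)):
        if int(rec["customer_id"]) == customer_id:
            ET.SubElement(orders, "order", id=rec["order_id"], total=rec["total"])
    return view
```

A caller would use `ET.tostring(customer_view(1), encoding="unicode")` to obtain the XML text; the application never needs to know that the name and the orders live in different systems.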

9
7. Single View of Information
Across all levels and functions of government
10
8. Solving the Integration Challenge
12
9. Information On-Demand
  • Aggregation
  • Data chaining
  • Inbound XML
  • Decomposition
  • Synchronization
  • Conditional logic

- Unique standards/XML-based approach -
13
10. Using XML for Federal Enterprise Architecture
  • BRM is a function-driven framework for
    describing the business operations of the Federal
    Government independent of the agencies that
    perform them
  • XAware uses XML to bi-directionally supply data
    to applications independent of the physical
    agency location of the data source(s) in a
    many-to-many relationship
  • DRM will describe the data and information
    that support program and business line operations
    (Cross-Agency Exchange)
  • XAware utilizes a standards-based approach (XML)
    to accomplish cross-agency information exchanges
    in real-time, leveraging current IT investments.

14
11. Using XML for FEA (cont'd)
  • The massive duplication of efforts that the
    FEA is intended to resolve is indicative of the
    many individual agency data sources leveraged to
    accomplish line of business activities
  • XAware allows many data sources to appear as a
    single logical source, with an XML interface

15
12. Integrated Information Sharing
Single XML views across all levels and functions
of government
16
13. XAware Example Applications
  • Civilian government
  • - U.S. DOI, Bureau of Land Management
  • - U.S. Department of State
  • - Justice Information System - Alaska
  • State & local government
  • - Nebraska Department of Environment
  • - Colorado Department of Health
  • - California EPA
  • Defense and DoD
  • - Raytheon
  • - Northrop Grumman
  • - Mitre

17
14. Context for Demos 1 & 2: OMB and the Federal
Budget

Report on Information Technology (IT) Spending
for the Federal Government for Fiscal Years 2002,
2003, and 2004
18
15. Context for Demos 1 & 2: The Federal Budget
Process
  • Year-long, multi-step process with multiple
    reviews, refinements, and touch points
  • Every office in every agency has to develop its
    budget each FY
  • Plans, programs, and budgets start at the lowest
    level, then trickle up to the highest levels,
    e.g., Department Secretary
  • Proposed budget is then sent to OMB, Executive
    Office of the President
  • The President submits the final budget request to
    Congress (FY 2004: > $2 trillion, approx.
    20% of U.S. GDP)
  • OMB Exhibit 300 is a superset of OMB Exhibit 53
  • As such, as soon as budget data are accumulated
    via the Exhibit 300 form, they become a candidate
    for a more complex implementation of the
    functionality to be described as Demo 1

19
16. Demo 1 & Demo 2: Flowchart of Steps
20
17. Demo 1: Conversion of OMB Exhibit 53 to XFML
  • OMB Exhibit 53 is represented as an Excel
    spreadsheet containing IT Investment Details for
    every project in every agency of the Federal
    government, with total, development, and
    steady-state costs for past, current, and next
    fiscal year
  • This demonstration will use XAware's XA-Suite to
    convert the Exhibit 53 spreadsheet to an XFML
    representation
  • Each row in the spreadsheet is represented by a
    page in the XFML file
  • Each page has a title which uniquely identifies
    a row in the spreadsheet
  • Each page in the XFML file contains
    occurrences, where each occurrence indicates
    the existence of a topic on the page
  • Each topic belongs to a facet, defining the
    organizational structure of the XFML file
  • Five facets have been defined for this
    demonstration:
    Department, Investment Type, Budget Entry Year,
    Project Type, and Investment
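
The row-to-page mapping described on this slide can be sketched as follows. This is a rough Python illustration, not XA-Suite itself: the sample rows are invented, only three of the five facets are shown, and the element and attribute names (`facet`, `topic`, `page`, `occurrence`, `facetid`, `topicid`) follow my understanding of XFML Core and should be checked against the specification before use.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows standing in for the Exhibit 53 spreadsheet.
ROWS = [
    {"title": "Agency A / Project 1", "Department": "Agency A",
     "Investment Type": "01 Projects by Mission Area", "Project Type": "Major"},
    {"title": "Agency B / Project 2", "Department": "Agency B",
     "Investment Type": "04 Grants Management", "Project Type": "Minor"},
]
FACETS = ["Department", "Investment Type", "Project Type"]


def slug(text):
    """Turn a label into an id-safe token (lowercase, hyphen-separated)."""
    return "".join(c.lower() if c.isalnum() else "-" for c in text).strip("-")


def rows_to_xfml(rows, facets):
    """Build an XFML-style tree: facets, then topics, then one page per row."""
    root = ET.Element("xfml", version="1.0")
    for facet in facets:
        ET.SubElement(root, "facet", id=slug(facet)).text = facet

    seen_topics = set()
    pages = []
    for row in rows:
        # The page title uniquely identifies the spreadsheet row.
        page = ET.Element("page", url="row:" + slug(row["title"]))
        ET.SubElement(page, "title").text = row["title"]
        for facet in facets:
            topic_id = slug(facet) + "--" + slug(row[facet])
            if topic_id not in seen_topics:
                # First time this value is seen: declare it as a topic of the facet.
                seen_topics.add(topic_id)
                topic = ET.SubElement(root, "topic", id=topic_id, facetid=slug(facet))
                ET.SubElement(topic, "name").text = row[facet]
            # Each occurrence records that the topic applies to this page.
            ET.SubElement(page, "occurrence", topicid=topic_id)
        pages.append(page)

    root.extend(pages)
    return root
```

Calling `ET.tostring(rows_to_xfml(ROWS, FACETS), encoding="unicode")` yields the serialized document. Reading the real spreadsheet would need an Excel library, which is left out here to keep the sketch self-contained.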

21
18. Demo 1: Conversion of OMB Exhibit 53 to XFML
  • Topics have been defined for each of the facets
  • Department
  • - Each agency name is a topic
  • Investment Type topics are
  • - 01 Projects by Mission Area
  • - 02 Office Automation and Infrastructure
  • - 03 Enterprise Architecture and Planning
  • - 04 Grants Management
  • - 05 Intramural / Grants to States
  • Budget Entry Year topics are
  • - 2000, 2001, 2002, and 2003
  • - 24 (representing one of the 24 E-Gov initiatives)
  • Project Type topics are
  • - Major and Minor
  • Investment topics have been established for
  • - Dollar amount ranges for each of Total,
    Development, and Steady-State expenditures
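
Turning a dollar figure into a range topic for the Investment facet might look like the sketch below. The slides do not give the actual ranges used in the demo, so the boundaries, labels, and function names here are assumptions chosen only for illustration.

```python
import bisect

# Hypothetical range boundaries, in millions of dollars.
BOUNDS = [1, 10, 100]
LABELS = [
    "under $1M",
    "$1M to under $10M",
    "$10M to under $100M",
    "$100M and over",
]


def range_topic(amount_millions):
    """Return the label of the range that contains the amount.

    bisect_right counts how many boundaries are <= the amount, which is
    exactly the index of the matching label.
    """
    return LABELS[bisect.bisect_right(BOUNDS, amount_millions)]


def investment_topics(costs):
    """Map each cost category (e.g. Total, Development, Steady-State) to its range topic."""
    return {kind: range_topic(amount) for kind, amount in costs.items()}
```

For a row with `{"Total": 12.0, "Development": 0.4, "Steady-State": 11.6}`, `investment_topics` assigns one range topic per cost category, so the same project can be found under each of the three expenditure views.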

22
19. The Big Issue with Search and Discovery (S&D)
  • How do you make sense of the information
    contained in one or more very large data
    repositories? By having the ability to
  • - Get an aerial view of all the information,
    neatly organized by categories and distributed by
    counts of the documents found
  • - See all of this information along hierarchical
    categories and as different views (facets)
  • - Conduct search and browse in tandem, and by
    categories within any facet
  • - Slice all or any part of the repository along any
    combination of category axes, similarly organized
    and distributed
  • - Bring back the searched result subset(s) in real
    time, similarly organized and distributed

23
20. Evolution of Apps./User Interfaces and Data
Relns.
[Figure: applications and user interfaces, and data
relationships, from the 1960s to now]
24
21. Reqs. of Data Stakeholders that are Driving
Standards
25
22. Specific Problems with S&D w.r.t. Stakeholders
  • Stakeholder 1: End-user
  • - Search around (Google type)
  • - Browse one or more taxonomies
  • - Find what s/he knows exists, even if s/he may
    not know how to ask for it
  • - Discover, serendipitously and deductively
  • Stakeholder 2: Author
  • - Expect target audience to have access
  • - Possess the means of controlling discovery by
    end-users
  • - Have the confidence that a user looking for a
    document will find it
  • Stakeholder 3: Database mgr.
  • - Ensure minimum intrusion
  • - Use help in identifying errors in their databases
  • Stakeholder 4: Repository mgr.
  • - Enable a single point of search
  • - Provide simultaneous, real-time update of all new
    info
  • - Synchronize data across all classes of users
  • - Win consensus of DB managers
  • - Address security concerns

26
23. Specific Problems with S&D w.r.t. Business
Value
Economic value, social value, intelligence value
Better, faster interaction between end-users
and textual information
  • Pillar 1: Virtual aggregation
  • Pillar 2: Faceted search and browse
  • Pillar 3: Greater visibility
  • Pillar 4: Virtual syndication
  • Pillar 5: Real-time customer response
27
24. Solving the S&D Problems: What Do We Need?
  • XML to help define as much as possible of the
    common structure between disparate data sources,
    without disrupting the existing maintenance
    infrastructure and processes
  • XFML to help metatag the facets and headings to
    resources (documents)
  • - From attributes (fields) in a document (e.g.,
    source, date, author)
  • - Manual effort
  • - Automated heuristic classifiers
    (e.g., Applied Semantics, Entrevo, Inxight,
    NStein)
  • - WebDAV for collaborative tagging
  • Other technologies, e.g.
  • - XTM (topic maps), and XTM vs. XFML

Better, faster interaction between end-users
and textual information

28
25. Toward a Better Mousetrap for Search and
Discovery
  • What must the faceted S&D technology do?
  • - Bin-sort, with counts, any searched subset of
    the data repository by one or more underlying
    taxonomies (and facets), in real time.
  • What's the big deal?
  • - The big deal is the performance. Retrieving
    and sorting result sets that may contain
    hundreds of thousands or millions of documents in
    real time is a huge performance challenge.
  • So, how could it be done?
  • - By employing certain indexing techniques,
    search algorithms, and parallel processing.
  • And how could we achieve search/research
    interoperability across dispersed, disparate
    databases?
  • - With a search API and hardware that sit
    independently of the database hardware and its
    maintenance infrastructure.
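
Bin-sorting a searched subset with counts can be illustrated with a small inverted index. The slide does not say which indexing techniques, search algorithms, or parallel processing are meant, so this Python sketch is only one simple possibility and is not the i411 engine; the records and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: document id -> one topic per facet.
DOCS = {
    1: {"Department": "Commerce", "Project Type": "Major"},
    2: {"Department": "Commerce", "Project Type": "Minor"},
    3: {"Department": "State", "Project Type": "Major"},
    4: {"Department": "State", "Project Type": "Major"},
}


def build_index(docs):
    """Map each (facet, topic) pair to the set of document ids carrying it.

    The index is built once, ahead of any query.
    """
    index = defaultdict(set)
    for doc_id, topics in docs.items():
        for facet, topic in topics.items():
            index[(facet, topic)].add(doc_id)
    return index


def bin_counts(index, subset):
    """Count how many documents of `subset` fall under each topic of each facet."""
    counts = defaultdict(dict)
    for (facet, topic), ids in index.items():
        n = len(ids & subset)  # set intersection instead of rescanning documents
        if n:
            counts[facet][topic] = n
    return dict(counts)


INDEX = build_index(DOCS)
```

`bin_counts(INDEX, set(DOCS))` gives the aerial view of the whole collection, while `bin_counts(INDEX, INDEX[("Department", "State")])` gives the same breakdown for one slice. At the scale the slide describes, the set intersections are the part that would need compact set representations and parallel evaluation.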

29
26. Search, Browse, Slice, Dice, Discover, by
Facets
Visual example of user-controlled search logic
with the faceted S&D technology
View 1: Get aerial layout of the entire database
View 2: See subset along Axis 1
View 3: See subset along Axis 2
View 4: Search by keyword
30
27. Conventional vs. Faceted Search & Discovery
31
28. Search via Google
  • Searches only Web-based information, not RDBMS
    data.
  • Results are ranked by relevance, which may not be
    relevant to the user.
  • Long lists are not organized by categories;
    therefore, valuable information may not be
    visible to the user.
  • No opportunity for the user to refine search
    results, perform drill-downs, and back-track
    without starting over.

32
29. Search via Faceted Search & Discovery
  • Search results are always in context and presented
    in structured (sub)categories.
  • All returned items are visible (with counts of the
    documents found).
  • User can view results in any category tree, and
    switch views from one tree to another at any
    time.
  • User controls searching, browsing, and
    back-tracking rapidly and interactively.
  • Documents achieve maximum exposure and visibility
    by unlocking, organizing, and amplifying
    critical/relevant data across multiple data
    repositories through a single point of search.

[Screenshot callouts: relevant views of the data;
simultaneous free-text Boolean search; categories of
selected views shown at all times]
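
The interaction pattern on this slide (search, switch category trees, drill down, back-track without starting over) can be sketched as a small session object. This is an illustrative Python sketch under invented data, not the i411 Discovery Engine; the class name `FacetedSession` and its methods are hypothetical.

```python
# Hypothetical records with a free-text title and two facets.
DOCS = [
    {"id": 1, "title": "grants portal", "Department": "HHS", "Project Type": "Major"},
    {"id": 2, "title": "grants archive", "Department": "Commerce", "Project Type": "Minor"},
    {"id": 3, "title": "trade statistics", "Department": "Commerce", "Project Type": "Major"},
]


class FacetedSession:
    """Keeps the user's refinements on a stack so any step can be undone."""

    def __init__(self, docs):
        self.docs = docs
        self.steps = []  # each step is ("keyword", word) or ("facet", facet, topic)

    def keyword(self, word):
        """Narrow by a free-text keyword in the title."""
        self.steps.append(("keyword", word.lower()))
        return self

    def select(self, facet, topic):
        """Drill down into one category of one facet."""
        self.steps.append(("facet", facet, topic))
        return self

    def back(self):
        """Undo the most recent refinement, if any."""
        if self.steps:
            self.steps.pop()
        return self

    def results(self):
        """Apply every refinement on the stack, in order."""
        hits = self.docs
        for step in self.steps:
            if step[0] == "keyword":
                hits = [d for d in hits if step[1] in d["title"].lower()]
            else:
                hits = [d for d in hits if d[step[1]] == step[2]]
        return hits

    def view(self, facet):
        """Show the current results as counts per category of the chosen facet."""
        counts = {}
        for d in self.results():
            counts[d[facet]] = counts.get(d[facet], 0) + 1
        return counts
```

After `FacetedSession(DOCS).keyword("grants")`, calling `view("Department")` and then `view("Project Type")` switches between category trees over the same result set, and `select(...)` followed by `back()` returns to the earlier results without repeating the search.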
33
30. i411 Discovery Engine Example Applications
Government applications
  • CRISP Grants Database: Office of Extramural
    Research, National Institutes of Health, U.S.
    Department of Health and Human Services,
    http://crisp.i411.com/
  • AIDS Projects Query System (with Altum, Inc.):
    Office of AIDS Research, National Institutes of
    Health, U.S. Department of Health and Human
    Services, http://demo.altum.com
  • Trade and Economic Archives/KM: STAT-USA,
    Economics and Statistics Administration, U.S.
    Department of Commerce, http://statusa.i411.com
34
31. Demo 2: Multifaceted S&D of OMB Exhibit 53
Demo 2 URL: http://demoweb01.i411.com/budget/index.html
35
32. Conclusions and Q&A
  • W.r.t. the databases of Federal government
    agencies, there are many acute, urgent issues
    that have to do with
  • - The underlying data (and documents)
  • - Bringing data from different sources
  • - Ensuring the integrity of data
  • - Generating value for the end-user at the
    front line
  • But these problems present huge opportunities
    that can be tapped with new, human-friendly
    technologies for data aggregation, search, and
    discovery
  • Creating real-time read/write access to
    disparate, dispersed, and (un)structured data
    sources among Federal agencies through unified
    XML views is very powerful
  • Unlocking the full social, economic, and
    intelligence value of information assets within
    an agency is a long-term pursuit, rapidly aided
    by the simplification of data exchange and
    integration (realize benefits quickly)
  • The creation of information on-demand through XML
    views is an FEA-compliant solution to many
    significant IT challenges in government

36
33. Contact Information
i411, Inc.
13655 Dulles Technology Drive, Suite 250, Herndon, Virginia 80920
www.i411.com
  • Iqbal Talib, Co-Founder & CEO
    Email: italib@i411.com, Phone: 703.793.3270 x105
  • Amin Hassam, V. President, Gov. Solutions
    Email: ahassam@i411.com, Phone: 703.793.3270 x140
XAware, Inc.
2060 Briargate Parkway, Suite 150, Colorado Springs, Colorado 80920
www.xaware.com
  • Brett Stein, Solutions Director
    Email: bstein@xaware.com, Phone: 719.884.5420
  • Steve Horneman, Director of Marketing
    Email: shorneman@xaware.com, Phone: 719.884.5424