Title: XML Web Services: Superfund Data Integration, Work Flow, and Exchange Network
- David Eng and Brand Niemann
- XML Environmental Measurement Workgroup (XEMG), US EPA
- February 21 and 25, 2002
Disclaimer: Any reference to or depiction of the commercial product of any vendor is for illustrative purposes only and does not constitute an endorsement by EPA or the trainer.
Overview
- 1. Executive Summary
- 2. OIC Mission Statement
- 3. The Basics with Examples
- 4. Work Flow and Integration
- 5. Exchange Network
- 6. Questions and Answers (February 21st)
- 7. Contact Information
1. Executive Summary
- The XML Environmental Measurement Workgroup (XEMG) has focused on Superfund data for its first pilot project:
  - New content: collect in XML and register the data elements and XML Schema.
  - Legacy content (selected): re-purpose and re-publish to XML.
  - Integration and analysis: build a distributed XML registry and repository and link to statistical and other analytical tools.
  - Exchange network: simulate on a laptop!
- Once proven, we can apply this to any document and data.
- Part of the How to Get Started with XML Web Services Training.
2. OIC Mission Statement
- The mission of OIC is to protect human health and
the environment by collecting and distributing
quality environmental data in the most efficient
manner possible. OIC is building the Agency's
portal in the National Environmental Information
Exchange Network and developing mechanisms for
integrating facility, geospatial and other data
types. OIC also helps EPA succeed by providing
customer services such as data sets of common
interest, records management, and data
standardization tools. (Mark Luttner, 2/15/2002)
2. OIC Mission Statement
- Some comments:
  - Collect and distribute information in the most efficient manner: XML Web Services.
  - Integrate facility, geospatial, and other data types: XML Web Services Pilot Projects.
  - Provide customer services: XML Web Services Pilot Projects.
3. The Basics with Examples
- The Basics
  - 3.1 What are XML Web Services?
  - 3.2 Future National Environmental Information Exchange Network.
- Examples of XML in Action
  - 3.3 Data Tables.
  - 3.4 Simple Node.
  - 3.5 Databases.
  - 3.6 VoiceXML.
  - 3.7 Geospatial.
  - 3.8 Content Authoring and Management.
  - 3.9 Data Integration and Coordination.
3.1 What are XML Web Services?
- XML is a standard for preserving and communicating information (encoding, tagging, and internationalizing) that will be everywhere.
- Web Services provide communication between applications running on different Web servers and will bring the Internet to a new level.
- XML Web Services is the technology underlying the new programs you have heard about:
  - Exchange Network (NEIEN)
  - Central Data Exchange (CDX)
  - Alpha and Beta Nodes
- XML Web Services is what everyone in EPA and the States should know something about.
3.2 Future National Environmental Information Exchange Network
- The Blueprint Team recommends the exclusive use of XML as the common basic interchange language for data flows. (page 24)
- The Blueprint Team believes that simplified versions of the tools (e-commerce servers), technologies (XML), and security levels being developed and rapidly embraced by the private sector can be applied to the business of environmental agencies.
- The Knowledge Transfer Action Team of the Interim Network Steering Group has developed an XML Training package for managers and technical staff.
- Blueprint for a National Environmental Exchange Network, National Blueprint Team, document amended June 2001.
3.2 Blank Data Exchange Template
3.2 Central Data Exchange and the Network
3.3 Data Tables
3.3 Data Tables: Data Binding
- Data Binding
  - Link an XML document to an HTML page and then bind standard HTML elements to individual XML elements (saves time and money on delivering small Web databases).
  - See Unit 3. Introduction to XML for the Web Step by Step (updated to Second Edition) and Unit 14. XML Web Services Toxics Release Inventory.
- XML separates the data itself from the presentation of the data.
  - Data: tri99table1.xml
  - Presentation: tri99table1.htm
  - Both files can be viewed separately by IE 5.x and 6.
- The XML file has many other uses and future-proofs your data against periodic technology changes.
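The separation of data from presentation can be sketched in a few lines of Python. This is only an illustration of the idea, not the browser data-binding mechanism the slide refers to: the sample records and the element names (`releases`, `facility`, `name`, `state`, `pounds`) are hypothetical, since the layout of tri99table1.xml is not shown here.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample data; element names are assumptions for illustration.
SAMPLE_XML = """<?xml version="1.0"?>
<releases>
  <facility><name>Plant A</name><state>VA</state><pounds>1200</pounds></facility>
  <facility><name>Plant B &amp; Co.</name><state>MD</state><pounds>850</pounds></facility>
</releases>"""


def xml_to_html_table(xml_text, record_tag, fields):
    """Render each <record_tag> element as one row of an HTML table."""
    root = ET.fromstring(xml_text)
    # Header row built from the requested field names.
    rows = ["<tr>" + "".join(f"<th>{escape(f)}</th>" for f in fields) + "</tr>"]
    for record in root.iter(record_tag):
        cells = [record.findtext(f, default="") for f in fields]
        rows.append("<tr>" + "".join(f"<td>{escape(c)}</td>" for c in cells) + "</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


if __name__ == "__main__":
    print(xml_to_html_table(SAMPLE_XML, "facility", ["name", "state", "pounds"]))
```

The data file stays untouched; a different presentation (another column order, a different output format) only requires a different rendering function.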
3.3 Parts of a Well-Formed XML Document
- Prolog
  - XML Declaration
  - Comment
  - White Space
  - Processing Instruction (href="Inventory01.css")
  - End of Prolog
- White Space
- Document Element (Root Element), containing book records:
  - The Adventures of Huckleberry Finn; Mark Twain; mass market paperback; 298; 5.49
  - The Turn of the Screw; Henry James
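As a rough illustration of the parts named on this slide, the sketch below builds a small document with a prolog (XML declaration, comment, processing instruction) followed by a document element, then checks that it is well-formed. The element names (`INVENTORY`, `BOOK`, `TITLE`, and so on) are assumptions made for this example; only the text values and the stylesheet file name come from the slide.

```python
from xml.dom import Node, minidom
from xml.parsers.expat import ExpatError

# Element names are assumed for illustration; the slide's own markup is not shown.
DOC = """<?xml version="1.0"?>
<!-- Comment: part of the prolog -->
<?xml-stylesheet type="text/css" href="Inventory01.css"?>

<INVENTORY>
  <BOOK>
    <TITLE>The Adventures of Huckleberry Finn</TITLE>
    <AUTHOR>Mark Twain</AUTHOR>
    <BINDING>mass market paperback</BINDING>
    <PAGES>298</PAGES>
    <PRICE>5.49</PRICE>
  </BOOK>
</INVENTORY>"""


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        minidom.parseString(text)
        return True
    except ExpatError:
        return False


def describe_parts(text):
    """List the top-level nodes of the document in order, as (kind, name) pairs."""
    labels = {
        Node.COMMENT_NODE: "comment",
        Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
        Node.ELEMENT_NODE: "document element",
    }
    doc = minidom.parseString(text)
    return [(labels.get(n.nodeType, "other"), n.nodeName) for n in doc.childNodes]


if __name__ == "__main__":
    print(is_well_formed(DOC))
    for kind, name in describe_parts(DOC):
        print(kind, name)
```

A document with mismatched tags, such as `<A><B></A>`, fails the same check, which is the practical meaning of "well-formed".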
3.4 Simple Node
- FileMaker 5.5 (http://www.filemaker.com)
  - Low cost (e.g., $150 at Virginia MicroCenter).
  - Needs only a PC with Windows 95 or a Mac (not a Web server).
- Dedicated IP Address.
  - My EPA Windows 98 Desktop PC.
  - http://161.80.87.87/xmlpilot.htm
- Interface Customization and Portal Features.
- Third party developer resources:
  - Macromedia Dreamweaver.
  - Adobe GoLive.
  - Allaire ColdFusion.
3.4 FileMaker 5.5 Database-to-XML
3.5 Databases
- Database-to-XML Conversion: save time and money on delivering and integrating large Web databases.
- Integrated Taxonomic Information System (ITIS).
  - XML output option viewable with IE 5 and 6.
- EPA Local Emergency Planning Committee Database.
  - Query by ZIP Code, State, and/or Name and Address.
- Schematics on Integration
  - Six databases need 30 filters.
  - Six databases and an XML hub only need 12 filters.
  - XML for interchange between applications.
- See Unit 7. XML Web Services XML-ization of Databases and Metadata.
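The filter counts quoted above follow from simple arithmetic, sketched here under the assumption that a "filter" is a one-way converter from one format to another. With direct exchange, each of n databases needs a filter to each of the other n - 1; with a hub, each database needs only one filter to XML and one from XML.

```python
def point_to_point_filters(n):
    """One-way filters needed when every database converts directly to every other."""
    return n * (n - 1)


def hub_filters(n):
    """One-way filters needed when each database converts only to and from the XML hub."""
    return 2 * n


if __name__ == "__main__":
    for n in (2, 4, 6, 10):
        print(n, point_to_point_filters(n), hub_filters(n))
```

For six databases this gives 30 and 12, matching the slide. The gap widens quickly, since the direct count grows with the square of n while the hub count grows linearly.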
3.5 XML Output Option
3.5 Local Emergency Committee Database
3.5 Database-to-XML Conversion: Six Databases Need 30 Filters
[Diagram: Oracle, PostgreSQL, Sybase, MySQL, Informix, Access]
3.5 Database-to-XML Conversion: Six Databases and an XML Hub Only Need 12 Filters
[Diagram: Oracle, PostgreSQL, Sybase, MySQL, Informix, Access, with an XML Hub]
3.5 Database-to-XML Conversion: XML for Interchange Between Applications
[Diagram: Database, GIS, Spreadsheet, XML Repository, XML, OLAP Data Warehouse, 3D Visualization]
3.6 VoiceXML
- Delivery of XML to Multiple End-points
- The phone remains the ubiquitous communications device and can be used to bridge the digital divide with non-PC users and to meet the new Section 508 requirements: if you can access content via the Web from a browser, you can access it using VoiceXML from the telephone.
- Develop and run VoiceXML applications on the Tellme.com server for free; these can be paid for by consumers via per-minute fees.
- See Unit VoiceXML.
- Pilot Project: 1-866-745-7735
3.6 VoiceXML Schematic
3.6 VoiceXML
3.7 Geospatial
- Geography Markup Language
- Geo-spatial Web (integrate spatial and non-spatial XML databases).
- Open GIS Consortium (EPA became a member in 2001)
- http://www.opengis.org/
- See Unit 20 Geography Markup Language.
- See Unit 7. XML Web Services XML-ization of Databases and Metadata.
- LandView Census 2000 Population Estimator (XML and Java).
- USA Counties Spatial Statistics (SQL and XSLT).
3.7 Geo-spatial Web
3.7 Census TIGER/GML (Geography Markup Language) SVG (Scalable Vector Graphics) Viewer
3.7 LandView Census 2000 Population Estimator
[Diagram: End User with web browser; Web server (HTML w/JavaScript); FileMaker with Population database. Labeled flows: lat/long/radius request, XML request, XML file. Numbered steps 1-5:]
- 1. End user enters URL for HTML page.
- 2. End user enters latitude/longitude and radius, presses Get Population button.
- 3. JavaScript in web page issues URL to FileMaker for the census block records.
- 4. FileMaker sends XML file back to the web page.
- 5. JavaScript reads the XML file, performs calculations, updates the web page.
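The last step of this flow (read the XML, compute, report) might look like the following sketch. Everything specific here is an assumption: the slides do not show the XML layout returned by FileMaker or the calculation the page performs, so the `block` elements, their `lat`/`lon`/`pop` attributes, and the rule "sum the population of blocks whose point falls within the radius" are hypothetical. The pilot itself does this in JavaScript in the browser; Python is used here only to show the logic.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical response; element and attribute names are assumptions.
SAMPLE = """<blocks>
  <block lat="38.90" lon="-77.03" pop="120"/>
  <block lat="38.91" lon="-77.04" pop="80"/>
  <block lat="39.50" lon="-76.50" pop="500"/>
</blocks>"""

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, by the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def population_within(xml_text, lat, lon, radius_miles):
    """Sum the population of all blocks located within radius_miles of (lat, lon)."""
    total = 0
    for block in ET.fromstring(xml_text).iter("block"):
        d = distance_miles(lat, lon, float(block.get("lat")), float(block.get("lon")))
        if d <= radius_miles:
            total += int(block.get("pop"))
    return total


if __name__ == "__main__":
    print(population_within(SAMPLE, 38.90, -77.03, 2.0))
```

Because the server returns plain XML, the same response could feed a spreadsheet, a map, or a statistical tool without changing the database side.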
3.7 USA Counties Spatial Statistics
3.8 Content Authoring and Management
- Content Authoring (Re-design and Re-publishing): Metadata, Geo-spatial Index and Baseline Assessment, etc. (improved navigation and integrated searching).
- Content (document) Management: Hierarchical organization of EPA and other environmental content by topics, themes, etc. (same benefit as above).
- See Unit 15. Web Content Authoring and Management Tools with XML.
  - Creating Good Content.
  - Managing Content as a Collection and Network.
  - Web Content Management Tools.
3.8 Re-design and Re-publishing
3.8 Content Management
3.9 Data Integration and Coordination
- Virtual Centralization with XML Messaging: FedGov, Region 3, 5, 7, states, etc. (virtual centralization of diverse and distributed content).
- See Unit 8. XML Web Services Building at the National, Regional, and State Level.
- See Unit 9. XML Web Services Standard by Standard.
  - Evolution of Web Services.
  - SOAP Basics.
  - The Web Services Description Language (WSDL).
  - Universal Description, Discovery, and Integration (UDDI).
  - Web Services Web Agent Demonstration (Section 2).
3.9 Virtual Centralization of Diverse and Distributed Content
[Diagram: NXT 3 Interface (Search, Personalization, Document Management, Metadata, etc.); Content Network of Hierarchical Folders (each can be a Portal on a different Web server!); Portal(s) and Portlets]
3.9 NXT 3 e-Content P2P Platform Concepts
- Folders can contain files, databases, and Web resources.
- Folders can/should be on different Web servers, but look and function as though they are on the same Web server.
- This is accomplished by two new XML-based standards that send lean XML messages between the Web servers:
  - Content Network Protocol (CNP)
  - eXtensible Indexing Language (XIL)
- Distributed folders and nodes can be managed both centrally and locally by the Content Network Manager and the Manage Content Administration Tools.
3.9 Virtual Centralization with XML Messaging
3.9 XML Web Services Standards Stack
4. Integration and Work Flow
- 4.1 Content, Integration and Work Flow.
  - Folder structure, indexing, and interlinking (see Part 1).
  - See Tax Analysts Federal Research Library for the results of more time spent with more content collections, structuring, interlinking, and special queries across content collections.
- 4.2 LivePublish Uses of XML.
- 4.3 NXT 3 Indexing and Management of PDF File Collections.
- 4.4 LivePublish Demo Script.
- 4.5 NXT 3 Demo Script.
4.1 Tax Analysts Federal Research Library
4.2 LivePublish Uses of XML
- Serve up native XML.
- Convert XML to HTML with CSS or XSL at run time, using the Display Filter API.
- Convert XML to HTML at build time.
- Uses an XML-based file to define site look and feel.
- The build Makefiles are XML files that define the structure and contents of the information collections.
- XML-based legacy conversion tools simplify the conversion of existing content into HTML.
- Indexsheets (XIL) define and control the indexing of content, just as stylesheets (XSL) define and control the formatting (see separate handout).
4.2 LivePublish
- Extensible Indexing Language (XIL)
  - Leverages the W3C XSLT/XPath standard. The ability of XIL to separate search fields and table of contents structure from specific elements plays an important role in bringing the sites together into one. See Bennett Cookson, NextPage, XML 2000 Conference presentation (Word).
  - A rule-based mechanism that looks for a particular tag within the document and assigns the content between the opening and closing tags to a particular field, which is fully searchable (e.g., index author names in a field called Author Name). It can also be used to present user-unfriendly tags in more easily understood language.
  - Example (markup not preserved): the author value Stephen King.
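The rule-based idea described above (find a tag, assign its content to a named, searchable field) can be sketched as follows. This is not XIL syntax and not the NextPage implementation; the rule table, the short tag names `au` and `ti`, the document names, and the placeholder titles are all hypothetical and serve only to show how a rule maps a user-unfriendly tag to a friendlier field name.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical rules: tag name -> searchable field name. Not XIL syntax.
RULES = {"au": "Author Name", "ti": "Title"}

# Hypothetical documents; tag names and titles are placeholders.
DOCS = {
    "doc1.xml": "<book><ti>Sample Title A</ti><au>Stephen King</au></book>",
    "doc2.xml": "<book><ti>The Turn of the Screw</ti><au>Henry James</au></book>",
}


def build_index(docs, rules):
    """Map field name -> normalized value -> set of documents containing it."""
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, text in docs.items():
        for elem in ET.fromstring(text).iter():
            field = rules.get(elem.tag)  # apply the rule for this tag, if any
            if field and elem.text:
                index[field][elem.text.strip().lower()].add(doc_id)
    return index


def search(index, field, value):
    """Return the sorted list of documents whose field matches value exactly."""
    return sorted(index.get(field, {}).get(value.strip().lower(), ()))


if __name__ == "__main__":
    idx = build_index(DOCS, RULES)
    print(search(idx, "Author Name", "Stephen King"))
```

Keeping the rules in a table separate from the documents mirrors the stylesheet analogy: the same content can be indexed differently by swapping the rules, without touching the files.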
4.3 NXT 3 Indexing of PDF File Collections
4.3 NXT 3 Management of PDF File Collections
4.4 LivePublish Demo Script
- EPA Superfund Program Web Service
- Five Document Collections.
- Folder structure and indexing (could have more, like Tax Analysts Federal Research Library).
- Five Year Review Report.
- Interactive Map Interface and Other Links.
- 10 different link types.
- Advanced Search
- ROD and Action Levels.
- Institution Control, Deed Restrictions, and Fencing.
4.5 NXT 3 Demo Script
- EPA Superfund Data XML Web Service
- Superfund PDF Collection 1
- 22 PDF files, 45 MB (indexed in about 10 seconds).
- View 112609.pdf
- Advanced Search (set "Show document excerpts in the results list" to "Short"; this saves time retrieving each PDF!)
- ROD and Action Levels.
- Institution Control, Deed Restrictions, and Fencing
- Options
- Find similar
- Save search
- Manage Saved Searches
5. Exchange Network
- 5.1 Different content, integration, and work flow on separate servers (simulate on laptop today!)
  - Web Servers:
    - Personal Laptop.
    - Internal Superfund.
    - External Superfund Contractor.
- 5.2 NextPage XML Web Services.
- 5.3 Content Network Manager (Network Administration).
- 5.4 Peer-to-Peer Architecture and Hierarchical Peer-to-Peer Architecture.
- 5.5 Integrator Node of Exchange Network
- Only Public Superfund data and information are being used in this pilot work!
5.1 LivePublish Server
5.1 NXT 3 P2P Platform Server
5.2 NextPage XML Web Services
- NXT 3 has been delivering XML Web Services since July 2000, based on an early SOAP recommendation from before SOAP became a standard.
- NextPage is developing full support for the SOAP, WSDL, and UDDI standards and for conforming Web service frameworks such as .NET and Sun ONE (Java).
- Basic XML Web services provide low-level communication, and NXT 3 provides high-level data coordination when intelligent evaluation of distributed content and collaborative capabilities in the context of business processes are needed (just released Matrix).
5.3 Content Network Manager
5.4 Peer-to-Peer Architecture for the Exchange Network of Services and Stewardship
The outer nodes are desktops and portable devices with specialized Services and Applications.
The inner nodes are servers with State, Regional and National Services and Applications.
At least one of the inner nodes is also the Integrator Node with the directory and metadata functions.
Key: Both Client and Server Nodes (all circles)
This is called a peer-to-peer Web Services architecture, where every node can communicate with every other node using SOAP (the XML Protocol).
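To make the node-to-node messaging concrete, the sketch below builds and reads a minimal SOAP 1.1 request envelope. It is an illustration only, not the Exchange Network or NXT 3 message format: the service namespace `urn:example:exchange-node`, the operation name `GetSiteRecords`, and its `state` parameter are hypothetical. Only the envelope namespace is the standard SOAP 1.1 one.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_request(operation, params, service_ns="urn:example:exchange-node"):
    """Wrap an operation call and its parameters in a SOAP Envelope/Body."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def read_request(xml_text):
    """Extract the operation name and parameters from a SOAP request."""
    body = ET.fromstring(xml_text).find(f"{{{SOAP_ENV}}}Body")
    call = body[0]  # first child of Body is the operation element
    operation = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


if __name__ == "__main__":
    message = build_request("GetSiteRecords", {"state": "VA"})  # hypothetical operation
    print(message)
    print(read_request(message))
```

Since every node can both build and read such messages, any node can act as client or server, which is what the "all circles" key in the diagram expresses.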
5.4 Hierarchical Peer-to-Peer Architecture for the Exchange Network of Services and Stewardship
This is the way that most enterprises start with an Exchange Network.
Key: Client Nodes (outer circles), Server Nodes (inner circles)
5.5 Integrator Node of Exchange Network
6. Questions and Answers
- Willie Wong: XML for field data.
  - Part 1, page 19.
- Ethan McMahon: XML in the enterprise architecture framework.
  - Part 1, page 9 and Section 5.
- Linda Kirkland: retrieve metadata for databases and trends analyses.
  - Part 1, pages 23, 53, and 67.
- Gene Durman: harmonization with NEIEN.
  - Part 1, page 63.
- Andy Lowe: XML Web Services.
  - Part 1, pages 68-70.
- Michael Johnson: tagging standards.
  - Part 1, pages 17-18.
- Steve Vineski: backend architecture and query functions.
  - Part 1, pages 55 and 58.
- Lisa Jenkins: making lab and field data more accessible.
  - Part 1, page 23.
- Region 6 contractor: XML-ization of SDMS.
  - Part 1, pages 61-62.
7. Contact Information
- David Eng
  - Office of Emergency and Remedial Response (OERR)
  - Chair, XML Environmental Measurement Workgroup (XEMG)
  - Voice: 703-603-8827
  - Email: eng.david_at_epa.gov
- Brand Niemann, Ph.D.
  - USEPA Headquarters, EPA West, Room 6143D
  - Office of Environmental Information, MC 2822T
  - 1200 Pennsylvania Avenue, NW, Washington, DC 20460
  - 202-566-1657
  - niemann.brand_at_epa.gov
  - EPA: http://161.80.70.167
  - Outside EPA: http://130.11.44.140