High Performance Web Service Architecture for Sensors and Geographic Information Systems

1
High Performance Web Service Architecture for
Sensors and Geographic Information Systems
  • Galip Aydin

2
Geographic Information Systems
  • A Geographic Information System is a system for
    creating, storing, sharing, analyzing,
    manipulating and displaying spatial data and
    associated attributes.
  • GIS history saw the evolution from mainframe GIS
    to Desktop GIS to Distributed GIS.
  • Modern GIS require:
  • Distributed data access for spatial databases
  • Use of remote analysis, simulation or
    visualization tools.

3
Traditional Distributed GIS Approach
  • Problems with traditional approaches:
  • Distributed nature of the geo-data: various
    client-server models, databases, HTTP, FTP, RDBs,
    XML DBs etc.
  • Data format problems, conversion overheads
  • Data processing issues, hardware and software
    requirements, COM/ActiveX, CORBA/IIOP frameworks
  • These introduce three challenges:
  • Assembling data from distributed repositories
  • Adoption of universal standards for format
    interoperability
  • Interoperable services for better utilization of
    computational resources

4
Open Geographic Standards
  • Open GIS Standards bodies aim to make geographic
    information and services neutral and available
    across any network, application, or platform.
  • Two major standards bodies: OGC and ISO/TC211,
    the former being the more popular
  • OGC Specifications are widely accepted
  • Data format specs: GML, SensorML, O&M
  • Service specs: WFS, WMS, WCS
  • OGC Services are HTTP GET/POST based, with limited
    data transport capabilities (HTTP, FTP, files
    etc.)
  • They are not Web Services: tightly coupled,
    point-to-point communication results in
    centralized, synchronous applications.

5
Motivations
  • Lack of service orchestration capabilities
  • Complex problems require GIS applications to
    collaborate.
  • Coupling data sources to scientific applications
  • Data transport requirements
  • Proliferation of Sensors
  • Ability to analyze data on-the-fly, continuous
    streaming support, scalable systems for addition
    of new sensors.
  • High performance and high rate messaging
  • Real-time data access, rapid response systems,
    crisis management etc.
  • From the Grids perspective
  • To apply general Grid/Distributed computing
    principles to GIS
  • Investigate how to integrate with geophysical and
    other scientific applications

6
Motivating Use Cases
  • Pattern Informatics
  • Earthquake forecasting code developed by Prof.
    John Rundle (UC Davis) and collaborators, uses
    seismic archives.
  • Regularized Dynamic Annealing Hidden Markov
    Method (RDAHMM)
  • Time series analysis code, can be applied to GPS
    and seismic archives, can be applied to real-time
    data.
  • Interdependent Energy Infrastructure Simulation
    System (IEISS)
  • Models infrastructure networks (e.g. electric
    power systems and natural gas pipelines) and
    simulates their physical behavior,
    interdependencies between systems.
  • SOPAC GPS Networks provide real-time messages.

7
Research Issues 1
  • Applying Web Service principles to GIS data
    services
  • Orchestration of services and workflows; simple
    services are not suitable for large data sets or
    where quick response is required
  • High Performance support in GIS services.
  • Interoperability
  • The system should bridge GIS and Web Service
    communities by adapting standards from both.
  • Other GIS applications should be able to consume
    data without having to do costly format
    conversions.

8
Research Issues 2
  • Scalability
  • The system should be able to handle high volume
    and high rate data transport and processing.
  • Plugging in new sensors, data sources or
    geoprocessing applications should not degrade
    the system's overall performance.
  • Flexibility and extensibility
  • How to develop real-time services to process
    sensor data on the fly.
  • Ability to add new filters without system
    failures.
  • Quality of Service Issues
  • Is latency introduced by services in processing
    real-time sensor data acceptable?

9
SOA for GIS: Geophysical Data Grid
  • We utilize Web Services to realize Service
    Oriented Architecture, OGC data formats and
    application interfaces for interoperability at
    both levels.
  • GIS Data Grid Properties
  • Based on its source, geospatial data can be
    classified as archival or real-time. The
    architecture provides standard control and access
    interfaces for both types.
  • Supports alternate transport and representation
    schemes, uses topic based messaging
    infrastructure for large volume data transport.
  • UDDI-based FTHPIS as the service registry.
  • Streaming and non-streaming services to access
    archived data.
  • Real-Time and near real-time services for
    accessing sensor metadata and sensor measurements.

10
Geophysical Data Grid Architecture
[Figure: architecture diagram with two parts, Real-Time Data Grid and Archival Data Grid]

11
GIS Grid 1 - Archival Data Services
  • Web Feature Service is the default OGC
    specification for vector data.
  • We have built Web Service version of WFS for
    accessing geospatial data on distributed
    databases.
  • The first Web Service version of WFS has been
    successfully used in several scientific workflows
    with other services (WMS, HPSearch, FTHPIS).
  • WFS can access multiple distributed databases,
    can query other WFSs for remote features.
  • Problems with the Web Service version of the WFS:
  • Request-response, not asynchronous.
  • Performance: GI Services are not designed to
    handle non-trivial data transfers (large data
    requests, SOAP overhead).
  • XML encoding: the size of the geospatial data
    increases with GML encoding, which increases
    transfer times or may cause exceptions.

12
WFS Performance Improvements: Streaming WFS
  • To improve the performance of the WFS:
  • Utilized a publish/subscribe messaging system for
    high performance data transfer. Similar to WFS,
    but the data and control channels are separated,
    which allows one-to-many data distribution.
  • Used streaming database connection (MySQL) for
    faster retrieval of the query results, and lower
    GML creation overhead.
  • Binary XML Frameworks are integrated for reducing
    XML payload size which improves transfer times.
  • Binding data transfer to Grid messaging
    middleware reduces SOAP creation overhead.
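
A minimal Python sketch of the streaming idea on this slide: instead of building one large FeatureCollection, each row is converted to a small GML fragment and handed to a publish callback as soon as it is read. The function names, the row layout, the `fault` element name and the in-memory SQLite database (standing in here for the MySQL streaming connection) are illustrative assumptions, not the presentation's implementation.

```python
import sqlite3
from xml.sax.saxutils import escape


def row_to_gml(name, lon, lat):
    """Encode one point feature as a small GML featureMember fragment."""
    return (
        "<gml:featureMember><fault>"
        f"<name>{escape(name)}</name>"
        f"<gml:Point><gml:coordinates>{lon},{lat}</gml:coordinates></gml:Point>"
        "</fault></gml:featureMember>"
    )


def stream_features(conn, publish, batch_size=2):
    """Read rows in small batches and publish each one as it arrives.

    Returns the number of features published.
    """
    cursor = conn.execute("SELECT name, lon, lat FROM features ORDER BY name")
    count = 0
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for name, lon, lat in rows:
            publish(row_to_gml(name, lon, lat))
            count += 1
    return count


def demo():
    """Build a tiny hypothetical feature table and stream it to a list."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE features (name TEXT, lon REAL, lat REAL)")
    conn.executemany(
        "INSERT INTO features VALUES (?, ?, ?)",
        [("A", -117.1, 32.7), ("B", -118.2, 34.0), ("C", -120.4, 35.9)],
    )
    received = []
    n = stream_features(conn, received.append)
    return n, received
```

Here `received.append` plays the role of the data channel; in a real deployment the callback would publish to a messaging topic while the request itself travels on a separate control channel.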

13
WFS Interaction with services and data sources
14
GIS Grid Example: IEISS Integration
WMS: Ahmet Sayar. UDDI, Context Service: Mehmet Aktas
15
Streaming WFS Performance
  • We test the system for up to 10,000 features
  • The tests reveal the performance of the
    streaming service with and without Binary XML
    integration
  • We use BNUX and Fast Infoset Binary XML
    Frameworks for compressing the GML
    FeatureCollection documents
  • The BNUX and FI timings include encoding and
    decoding costs

16
GIS Grid 2 - Real-Time Data Services
  • Sensors and sensor networks are being deployed
    for measuring various geo-physical entities.
  • Sensors and GIS are closely related. Sensor
    measurements are used by GIS for statistical or
    analytical purposes.
  • With the proliferation of the sensors, data
    collection and processing paradigms are changing.
  • Most scientific geo-applications are designed to
    work with archived data.
  • Critical Infrastructure Systems and Crisis
    Management environments require fast and accurate
    access to real-time sources and a
    flexible/pluggable architecture for geoprocessing
    of the data.

17
SensorGrid Architecture
  • Major components
  • Real-Time filters
  • Grid Messaging Substrate
  • Information Service
  • Filters can be run as Web Services to create
    workflows.
  • Filter Chains can be deployed for complex
    processing.
  • Streaming messaging provides high-performance
    transfer options.

18
Real-Time Filters
  • Real-time data processing is supported by
    employing filters around publish/subscribe
    messaging system.
  • The filters are extended from a generic class to
    inherit publish and subscribe capabilities.
  • They can be connected in parallel or serial as
    chains to solve complex problems.
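
The filter pattern described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the `Broker` class is a toy in-memory stand-in for the messaging system, and the `Filter`, `ParseFilter` and `ScaleFilter` names and the topic names are hypothetical.

```python
from collections import defaultdict


class Broker:
    """Tiny in-memory topic-based publish/subscribe broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Copy the list so callbacks may subscribe during delivery.
        for callback in list(self._subscribers[topic]):
            callback(message)


class Filter:
    """Generic filter: subscribes to one topic, publishes results to another."""

    def __init__(self, broker, in_topic, out_topic):
        self.broker = broker
        self.out_topic = out_topic
        broker.subscribe(in_topic, self.on_message)

    def process(self, message):
        """Override in subclasses; return None to drop the message."""
        return message

    def on_message(self, message):
        result = self.process(message)
        if result is not None:
            self.broker.publish(self.out_topic, result)


class ParseFilter(Filter):
    """Turn a comma-separated text message into a list of floats."""

    def process(self, message):
        return [float(v) for v in message.split(",")]


class ScaleFilter(Filter):
    """Multiply every value by 1000 (for example metres to millimetres)."""

    def process(self, message):
        return [v * 1000.0 for v in message]


def demo():
    broker = Broker()
    # Serial chain: raw -> parsed -> scaled
    ParseFilter(broker, "raw", "parsed")
    ScaleFilter(broker, "parsed", "scaled")
    # Parallel operation: a second consumer on the same "parsed" topic
    serial_out, parallel_out = [], []
    broker.subscribe("scaled", serial_out.append)
    broker.subscribe("parsed", parallel_out.append)
    broker.publish("raw", "1.5,2.0,0.25")
    return serial_out, parallel_out
```

Because every filter only knows its input and output topics, a new filter can be attached to an existing topic without touching the filters already running.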

19
Filter Metadata and Chains
[Figure: filter chains in Parallel Operation and Serial Operation]
20
Use Case - GPS Sensors
  • GPS station networks are a good example of
    scientific sensors. GPS measurements are used for
    determining post-seismic deformation,
    understanding long-term crustal movement etc.
  • SOPAC GPS networks
  • 8 networks covering 80 stations produce 1 Hz
    high-resolution data.
  • Socket based real-time binary-RYO format access
    is available, but not utilized!
  • We developed filters to provide multiple format
    (RYO, ASCII, GML) real-time streaming access.
  • OHIO principle and chain of filters.
  • We use publish/subscribe based NaradaBrokering
    for managing real-time streams, topics for
    hierarchical organization of the sensors.
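
As an illustration of organizing sensors with hierarchical topics, the sketch below names one topic per network and format and selects topics with a single-level wildcard. The `SOPAC/GPS/<network>/<format>` layout, the `*` wildcard and the network name `NET_02` are assumptions made for this example; they are not the NaradaBrokering topic syntax.

```python
def topic_for(network, fmt):
    """Build a hypothetical hierarchical topic name for a network and format."""
    return f"SOPAC/GPS/{network}/{fmt}"


def topic_matches(pattern, topic):
    """Match a '/'-separated topic; '*' in the pattern matches one level."""
    p_parts = pattern.split("/")
    t_parts = topic.split("/")
    if len(p_parts) != len(t_parts):
        return False
    return all(p == "*" or p == t for p, t in zip(p_parts, t_parts))


def select_topics(pattern, topics):
    """Return the topics that match the pattern, in their original order."""
    return [t for t in topics if topic_matches(pattern, t)]


ALL_TOPICS = [
    topic_for(network, fmt)
    for network in ("CRTN_01", "NET_02")
    for fmt in ("RYO", "ASCII", "GML")
]
```

A client interested in every ASCII stream could then select `SOPAC/GPS/*/ASCII`, while a client following one network would select `SOPAC/GPS/CRTN_01/*`.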

21
SOPAC Real-Time Filters for GPS Streams
22
Application Integration with Real-Time Filters
  • Station Monitor Filter records real-time
    positions for 10 minutes and calculates position
    changes
  • Graph Plotter Application creates visual
    representation of the positions.
  • RDAHMM Filter records real-time positions for 10
    minutes and invokes RDAHMM application which
    determines state changes in the XYZ signal.
  • Graph Plotter Application creates visual
    representation of the RDAHMM output.
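
A hedged sketch of what a station monitor of this kind might do: keep the most recent 10 minutes of positions and report the change between the oldest and newest sample. The sliding-window interpretation, the class name and the method names are assumptions; the slide only states that positions are recorded for 10 minutes and position changes are calculated.

```python
from collections import deque


class StationMonitor:
    """Keep the last `window_s` seconds of positions and report displacement."""

    def __init__(self, window_s=600):
        self.window_s = window_s
        self.samples = deque()  # entries are (timestamp_s, (x, y, z))

    def add(self, timestamp_s, xyz):
        """Store a new position and drop samples older than the window."""
        self.samples.append((timestamp_s, xyz))
        while self.samples and timestamp_s - self.samples[0][0] > self.window_s:
            self.samples.popleft()

    def displacement(self):
        """Component-wise change between the oldest and newest sample."""
        if len(self.samples) < 2:
            return (0.0, 0.0, 0.0)
        (_, first), (_, last) = self.samples[0], self.samples[-1]
        return tuple(b - a for a, b in zip(first, last))
```

An RDAHMM-style filter could reuse the same buffering and hand the buffered window to the analysis code instead of computing a simple difference.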

23
AJAX and Real-Time positions on Google maps
24
Recording and Replaying Sensor Streams
  • Filters can be used to record and replay
    scenarios, such as earthquakes in the GPS case.
  • We developed RYO Recorder and RYO Publisher
    Filters.
  • The RYO Recorder creates daily archives of the
    GPS Streams.
  • RYO Publisher can be used to play daily or
    certain segments of the records.
  • We replayed the 2004 Southern California
    Earthquake using the Parkfield GPS network
    archive.
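
The record-and-replay idea can be sketched as follows. The `Recorder` class and `replay` function are hypothetical stand-ins for the RYO Recorder and RYO Publisher filters, and the payloads are opaque values rather than real RYO messages.

```python
class Recorder:
    """Stores (timestamp_s, payload) pairs in arrival order."""

    def __init__(self):
        self.records = []

    def on_message(self, timestamp_s, payload):
        self.records.append((timestamp_s, payload))


def replay(records, publish, start_s=None, end_s=None):
    """Publish recorded payloads whose timestamps fall in [start_s, end_s].

    Returns the number of messages replayed. No pacing is applied here; a
    real publisher would wait between messages to reproduce the stream rate.
    """
    count = 0
    for timestamp_s, payload in records:
        if start_s is not None and timestamp_s < start_s:
            continue
        if end_s is not None and timestamp_s > end_s:
            continue
        publish(payload)
        count += 1
    return count
```

Leaving both bounds unset replays the whole archive; setting them replays only a segment, for example the minutes around an event.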

25
SensorGrid Performance Tests
  • Two major goals: system stability and scalability
  • Ensuring stability of the distributed Filter
    Services for continuous operation.
  • Finding the maximum number of publishers
    (sensors) and clients that can be supported with
    a single broker.
  • Investigating whether the system scales to large
    numbers of sensors and clients.

26
Test Methodology
Ttransfer = (T2 - T1) + (T4 - T3)
  • The test system consists of a NaradaBrokering
    server and a three-filter chain for publishing,
    converting and receiving RYO messages.
  • We take 4 timings for determining mean end-to-end
    delivery times of GPS measurements.
  • The tests were run for at least 24 hours.
  • GridFarm001-008 servers are used in these tests.
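
A small worked example of the timing calculation, assuming the expression on this slide reads Ttransfer = (T2 - T1) + (T4 - T3), that is, the sum of two measured intervals per message. The function names and the sample values (in milliseconds) are made up for illustration.

```python
from statistics import mean


def transfer_time(t1, t2, t3, t4):
    """Sum of the two measured intervals for one message."""
    return (t2 - t1) + (t4 - t3)


def mean_transfer_time(samples):
    """Mean transfer time over an iterable of (t1, t2, t3, t4) tuples."""
    return mean(transfer_time(*s) for s in samples)


# Two hypothetical messages, timestamps in milliseconds.
SAMPLES = [(0, 3, 10, 12), (100, 104, 110, 113)]
```

For these samples the per-message values are 5 ms and 7 ms. Averaging such values over fixed intervals gives the kind of per-interval mean reported in the stability test.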

27
1- System Stability Test
  • The basic system with three filters and one
    broker.
  • The figure shows average results for every 30
    minutes.
  • The average transfer time shows the continuous
    operation does not degrade the system performance.

28
2- Multiple Publishers Test
  • We add more GPS networks by running more
    publishers.
  • The results show that 1000 publishers can be
    supported with no performance loss. The limit of
    1000 is imposed by the operating system.

29
3- Multiple Clients Test
[Figure: results while adding clients, up to 1000 clients]
  • We add more clients by running multiple Simple
    Filters which subscribe to the same ASCII topic.
  • The system can support as many as 1000 clients
    with very low performance decrease.

30
Extending Scalability
  • The limit of the basic system appears to be 1000
    clients or publishers.
  • This is due to an Operating System restriction of
    open file descriptors (1024 for Red Hat Linux).
  • To overcome this limit we create NaradaBrokering
    networks by linking multiple brokers.
  • We run 2 brokers to support 1500 clients.
  • Number of brokers can be increased indefinitely,
    so we can potentially support any number of
    publishers and subscribers.
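
The sizing arithmetic behind this slide can be written down directly. This is a rough sketch that uses the observed limit of about 1000 connections per broker; it ignores any descriptors consumed by the links between brokers, and the function name is an assumption.

```python
def brokers_needed(connections, per_broker_limit=1000):
    """Smallest number of brokers that can hold the given connections."""
    if connections <= 0:
        return 0
    # Integer ceiling division avoids floating-point rounding.
    return (connections + per_broker_limit - 1) // per_broker_limit
```

With the default limit, 1500 clients need 2 brokers, which matches the two-broker configuration used in the tests.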

31
4- Multiple Brokers Test
  • Messages published to first broker can be
    received from the second broker.
  • We take timings on each broker.
  • We connect 750 clients to each broker and run for
    24 hours.
  • The results show that the performance is very
    good and similar to single broker test.

32
4- Multiple Brokers Test
[Figure: results for each of the two brokers, 750 clients per broker]
33
Real-Time Filters Test Results
  • The RYO Publisher filter runs at 1 Hz and
    publishes a 24-hour archive of the CRTN_01 GPS
    network, which contains 9 GPS stations.
  • The single broker configuration can support 1000
    clients or publishers (1000 GPS networks of 9
    stations each, i.e. 9000 individual stations).
  • The system can be scaled up by creating
    NaradaBrokering broker networks.
  • Message order was preserved in all tests.

34
Contributions
  • A SOA approach to create a common platform to
    support both archival and real-time geospatial
    data in data-centric Grids.
  • Merging Web Services and Open Geographic
    Standards for supporting interoperability at both
    data and application levels.
  • We have shown that the GIS Services can be
    implemented as streaming services.
  • Integration of Binary XML Frameworks with the
    Streaming Services shows performance gains for
    long network distances.
  • We have shown that the Sensor Grids can be built
    on top of the publish/subscribe middleware.
  • Real-Time continuous data support is realized in
    a Service Architecture.
  • Scalable architecture implementation for large
    number of sensor networks.