Exploiting Connector Knowledge to Efficiently Disseminate Highly Voluminous Data Sets

1
Exploiting Connector Knowledge to Efficiently
Disseminate Highly Voluminous Data Sets
  • Chris A. Mattmann
  • David Woollard
  • Nenad Medvidovic
  • 3rd Workshop on SHaring and reusing ARchitectural
    Knowledge
  • 30th Intl Conference on Software Engineering
    (ICSE08)
  • Leipzig, Germany, Tuesday, April 15, 2008

2
Outline
  • Problem Space
  • Connector Selection
  • Connector Knowledge
  • Insight
  • Observation
  • DISCO: A Framework for Connector Selection
  • Results
  • Conclusion

3
Data Distribution Scenarios
A Backup Site periodically connects across the
WAN to the Digital Movie Repository to back up its
entire catalog and archive of over 20 terabytes
of movie data and metadata.
A medium-sized volume of data, e.g., on the order
of a gigabyte, needs to be delivered across a LAN,
using multiple delivery intervals consisting of
10 megabytes of data per interval, to a single
user.
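A minimal sketch of how the second scenario could be written down as data, so that its parameters can be reasoned about mechanically. The class and field names are hypothetical (they are not from the talk), and 1 gigabyte is taken as 1000 megabytes for the arithmetic.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Scenario:
    """Hypothetical record of a data distribution scenario."""
    total_volume_mb: float     # total data to deliver, in megabytes
    interval_volume_mb: float  # data delivered per delivery interval
    network: str               # "LAN" or "WAN"
    num_users: int             # number of recipients

    def num_intervals(self) -> int:
        # Round up: a final partial interval still counts as a delivery.
        return math.ceil(self.total_volume_mb / self.interval_volume_mb)


# The LAN scenario from the slide, assuming 1 GB = 1000 MB.
lan_scenario = Scenario(
    total_volume_mb=1000,
    interval_volume_mb=10,
    network="LAN",
    num_users=1,
)

print(lan_scenario.num_intervals())
```

Under these assumptions the gigabyte-scale delivery breaks down into 100 intervals of 10 megabytes each.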
4
Challenges of Selecting the Right Connector
Technology
Given our current architecture?
Given our distribution scenarios and requirements?
Which one is the best one?
  • XML-RPC
  • UFTP
  • Siena
  • GridFTP
  • bbFTP
  • FTP
  • Aspera
  • CORBA
  • RMI
  • SOAP
  • Bittorrent
  • SFTP
  • HTTP/REST
  • JXTA
  • SCP
  • GLIDE/PRISM-MW
5
This is an Architectural Decision
  • Architectural decisions (such as connector
    selection) impact functional and non-functional
    properties of the overall data distribution
    system architecture
  • It does matter what connector you select
  • Functional (performance)
  • Efficiency, consistency, scalability,
    dependability of the data transfer
  • Non-functional (e.g., interoperability, security)
  • We assert that this process has largely remained
    an art form and forces organizations to rely on
    organizational gurus whose knowledge is never
    encoded or understood

6
The Role of Architectural Knowledge
  • Connector selection is so difficult because there
    is no reproducible way to make the right
    connector selection that a guru would make
  • Why?
  • Lack of an audit trail
  • Lack of Architectural Knowledge
  • Our work defines and captures this knowledge about
    connectors!

7
Two Types of Knowledge
  • Insight
  • This is the inherent knowledge about the
    architectural properties of a connector that make
    it suitable for a particular distribution
    scenario
  • Example: Because connector A has Cache-based data
    access, it is inherently more scalable (than a
    connector B with Session-based access) and
    ultimately more applicable for larger volume
    scenarios
  • Key connector architectural properties: data
    access transient availability, which can have
    values of Peer-, Cache-, or Session-based.
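The insight example could be encoded roughly as follows. This is only a sketch: the function name and the rule table are hypothetical, and the table holds just the one ordering the slide states (Cache-based is more scalable than Session-based). No ordering involving Peer-based access is assumed, so those comparisons return no answer.

```python
from typing import Optional

# Pairs (x, y) meaning "x-based access is inherently more scalable than
# y-based access". Only the ordering stated on the slide is recorded.
MORE_SCALABLE_THAN = {("Cache", "Session")}


def insight_prefers(access_a: str, access_b: str) -> Optional[bool]:
    """Return True if A is preferred for large volumes, False if B is,
    and None when no insight has been recorded for this pair."""
    if (access_a, access_b) in MORE_SCALABLE_THAN:
        return True
    if (access_b, access_a) in MORE_SCALABLE_THAN:
        return False
    return None


print(insight_prefers("Cache", "Session"))  # connector A vs. connector B
```

Returning None rather than guessing keeps the recorded knowledge honest: a pair with no insight stays undecided until someone encodes one.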

8
Two Types of Knowledge
  • Observation
  • The observed characteristics of a connector that,
    based on past experience using it in a single (or
    family of) data distribution scenario(s), make it
    either applicable or inapplicable for a
    scenario
  • Example: Because connector A successfully
    delivered 100 MB of data while maintaining a
    transfer rate of near 10 MB/sec, it is more
    efficient and scalable than connector B, which had
    3 drops in connection, varying in transfer rate
    from 3 MB/sec to 8 MB/sec and delivering around 90
    MB of the same data as connector A.
  • Key observable properties: efficiency, scalability
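One way the observation example might be recorded and compared is sketched below. The record layout and the `dominates` rule are assumptions made for illustration, not the comparison the framework actually uses. Connector B's rate samples are simply the two endpoints the slide mentions (3 and 8 MB/sec), and both transfers are assumed to have requested 100 MB.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass(frozen=True)
class Observation:
    """Hypothetical record of one observed transfer."""
    connector: str
    requested_mb: float
    delivered_mb: float
    rate_samples_mb_s: List[float]
    connection_drops: int

    def delivered_fraction(self) -> float:
        return self.delivered_mb / self.requested_mb

    def mean_rate(self) -> float:
        return mean(self.rate_samples_mb_s)


def dominates(x: Observation, y: Observation) -> bool:
    """True if x is at least as good as y on every recorded measure."""
    return (
        x.delivered_fraction() >= y.delivered_fraction()
        and x.mean_rate() >= y.mean_rate()
        and x.connection_drops <= y.connection_drops
    )


# Values taken from the slide's example; B's samples are illustrative.
obs_a = Observation("A", 100, 100, [10.0], 0)
obs_b = Observation("B", 100, 90, [3.0, 8.0], 3)

print(dominates(obs_a, obs_b))
```

A strict "at least as good on every measure" rule is deliberately conservative: it only ranks two connectors when the observations leave no trade-off to argue about.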

9
An Example
  • 2 Postulated Connectors
  • Architectural Metadata captured
  • Connector model based on the Mehta et al.
    taxonomy of connectors (ICSE 2000)
  • Sample distribution scenario: Need to send 1 or more
    terabytes of data across a WAN from the US to
    Europe to exactly 1 user in a single delivery
    interval.
  • Insight and Observation

10
(No Transcript)
11
DISCO: A Framework for Connector Selection
12
Experimental Results
  • 30 real data distribution scenarios from JPL, and
    NCI EDRN projects
  • Run DISCO connector selection using architectural
    knowledge
  • Low (10 insights, 8 observations)
  • Medium (50 insights, 16 observations)
  • High Knowledge (100 insights, 24 observations)
  • Compare against expert selection answer key
  • 80% accuracy
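Comparing selections against an expert answer key amounts to a simple accuracy computation, sketched below. The function name is hypothetical, and the scenario IDs and connector names are toy placeholders, not data from the JPL or NCI EDRN experiments. One expert choice per scenario is assumed.

```python
from typing import Dict


def accuracy(selected: Dict[str, str], answer_key: Dict[str, str]) -> float:
    """Fraction of scenarios where the selected connector matches the
    expert's choice. Scenarios missing from `selected` count as wrong."""
    correct = sum(
        1 for scenario_id, expert_choice in answer_key.items()
        if selected.get(scenario_id) == expert_choice
    )
    return correct / len(answer_key)


# Toy placeholder data, for illustration only.
answer_key = {"s1": "conn_a", "s2": "conn_b", "s3": "conn_a", "s4": "conn_c"}
selected = {"s1": "conn_a", "s2": "conn_b", "s3": "conn_c", "s4": "conn_c"}

print(accuracy(selected, answer_key))
```

With this toy data three of four selections match, giving 0.75. The figure only illustrates the calculation and is unrelated to the reported results.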

13
Conclusions
  • Architectural Insight and Observation
  • Positive impact: framework for connector
    selection that's over 80% accurate
  • Standards for architectural knowledge description
  • Didn't show it, but using standard XML files and
    schemas to describe connectors, capture
    distribution scenarios, and record observation
    and insight
  • Needed first known step in capturing
    architectural knowledge about connectors in a
    standard form

14
Questions?