Towards software components - PowerPoint PPT Presentation

Title: Towards software components


1
COMPUTATIONAL RESEARCH DIVISION
Towards software components for efficient and
easy communication and data integration in P2P
networks
W. Hoschek, Jan. 21, 2005
2
Overview
  • Communication and data integration in P2P
    networks is difficult
  • Most APIs for network I/O either do not scale
    well, or are hard to use
  • XML is great but its APIs are harder to use than
    expected (generality)
  • Large overheads related to XML serialization and
    deserialization
  • It need not be that way

3
Overview
  • Seek to enable
  • robust and powerful commodity XML tool chains
  • while retaining good messaging performance
  • Components for
  • asynchronous non-blocking network I/O
  • binary encoding of XML
  • XQuery/XPath manipulations of messages
  • Trade-offs formed by
  • performance, usability, flexibility, expressive
    power
  • Some preliminary performance results

4
Synchronous blocking I/O
  • Requires one thread per connection
  • Many threads lead to scheduling inefficiencies
  • Concurrency and synchronization issues subtle and
    non-intuitive, hard to find and debug
  • Serious degradation under overload
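
The thread-per-connection model above can be sketched with plain java.net sockets. This is a minimal illustration, not code from the talk: the class name BlockingHelloServer and the loopback round trip are assumptions made for the example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical example class; not part of any toolkit named in the slides.
final class BlockingHelloServer {
    /** Accepts connections until the server socket is closed. */
    static void acceptLoop(ServerSocket server) {
        try {
            while (true) {
                Socket client = server.accept();         // blocks until a peer connects
                new Thread(() -> greet(client)).start(); // one dedicated thread per connection
            }
        } catch (IOException closed) {
            // accept() fails once the server socket is closed; stop accepting
        }
    }

    /** Sends the greeting with a blocking write, then closes the connection. */
    static void greet(Socket client) {
        try (Socket c = client; OutputStream out = c.getOutputStream()) {
            out.write("hello world".getBytes(StandardCharsets.UTF_8));
        } catch (IOException ignored) {
            // the peer went away; the socket is closed by try-with-resources
        }
    }

    /** Starts a server on a free loopback port, connects once, and returns the reply. */
    static String roundTrip() {
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            Thread acceptor = new Thread(() -> acceptLoop(server));
            acceptor.setDaemon(true);
            acceptor.start();
            try (Socket s = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                return new String(s.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```

With N concurrent peers this design holds N handler threads, which is where the scheduling and overload problems listed above come from.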

5
Async non-blocking I/O
  • Stages, Queues, Events, Event Handlers, Threads
  • One (or few) threads for N connections
  • Event-driven design rather than OO call interface
  • Few concurrency bugs since (mostly)
    single-threaded
  • Can avoid overload via explicit queue
    shaping/priorities
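
One way to picture a stage with an explicit, bounded event queue is the sketch below. The Stage class, its capacity parameter, and the drain() method are hypothetical names chosen for illustration; they are not the SEA toolkit's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

/** Hypothetical minimal stage: a bounded event queue plus one event handler. */
final class Stage<E> {
    private final BlockingQueue<E> queue;
    private final Consumer<E> handler;

    Stage(int capacity, Consumer<E> handler) {
        this.queue = new ArrayBlockingQueue<>(capacity); // the explicit bound is the overload control
        this.handler = handler;
    }

    /** Non-blocking enqueue; returns false when the queue is full and the event is rejected. */
    boolean enqueue(E event) {
        return queue.offer(event);
    }

    /** Handles everything currently queued on the calling thread; returns the number handled. */
    int drain() {
        List<E> batch = new ArrayList<>();
        queue.drainTo(batch);
        batch.forEach(handler); // handler runs on one thread, so its own state needs no locks
        return batch.size();
    }

    public static void main(String[] args) {
        Stage<String> stage = new Stage<>(2, e -> System.out.println("handled " + e));
        stage.enqueue("a");
        stage.enqueue("b");
        System.out.println("third accepted: " + stage.enqueue("c")); // queue is full here
        stage.drain();
    }
}
```

Rejecting events at enqueue time, rather than spawning more threads, is what lets such a design degrade predictably under overload.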

6
Implementation
  • SEA toolkit
      • A layer on top of Java NIO
      • TCP and UDP
  • Overhead of toolkit
      • (preliminary, not measuring network)
      • 30000 msg/s (tiny msg size)
      • 200 MB/s (large msg size)
  • Documentation
      • http://dsd.lbl.gov/sea
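
For readers unfamiliar with the Java NIO layer underneath, the sketch below shows a single thread serving several connections through a Selector. It uses only standard java.nio calls; the class NioHelloServer and its methods are assumptions for illustration and say nothing about how SEA itself is structured.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical example class built directly on java.nio.
final class NioHelloServer {
    private static final byte[] GREETING = "hello world".getBytes(StandardCharsets.UTF_8);

    /** Event loop: one thread multiplexes all connections until `clients` have been served. */
    static void serve(ServerSocketChannel server, int clients) throws IOException {
        try (Selector selector = Selector.open()) {
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            int served = 0;
            while (served < clients) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> ready = selector.selectedKeys().iterator();
                while (ready.hasNext()) {
                    SelectionKey key = ready.next();
                    ready.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client == null) continue; // nothing to accept after all
                        client.configureBlocking(false);
                        // Attach the pending reply and wait until the socket is writable.
                        client.register(selector, SelectionKey.OP_WRITE, ByteBuffer.wrap(GREETING));
                    } else if (key.isWritable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        ByteBuffer pending = (ByteBuffer) key.attachment();
                        client.write(pending);
                        if (!pending.hasRemaining()) { // reply fully sent
                            client.close();            // closing also cancels the key
                            served++;
                        }
                    }
                }
            }
        }
    }

    /** Runs the loop on one background thread and connects `clients` times over loopback. */
    static List<String> roundTrip(int clients) {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();
            Thread loop = new Thread(() -> {
                try { serve(server, clients); } catch (IOException e) { throw new UncheckedIOException(e); }
            });
            loop.setDaemon(true);
            loop.start();
            List<String> replies = new ArrayList<>();
            for (int i = 0; i < clients; i++) {
                try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port)) {
                    replies.add(new String(s.getInputStream().readAllBytes(), StandardCharsets.UTF_8));
                }
            }
            loop.join();
            return replies;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(3));
    }
}
```

The amount of bookkeeping in this loop is a fair indication of why a higher-level layer over NIO is attractive.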

7
Easy API (Hello World Server)
agent = new NetAgent();
myStage = new StageManager().createStage().start();
agent.addListenPort(myStage, 9000);
agent.start();

onAccepted(rsp) {
  rsp.getAgent().enqueue(
    new ChannelRequest.WriteData(rsp.getKey().channel(), "hello world"));
}
8
XML Serialization and Deserialization Overheads
  • XML is complex and very general
  • Standards-compliant XML handling -> inefficiencies
  • Serialization
      • 4-5 MB/s (standard textual XML)
      • 15-50 MB/s (bnux binary XML)
  • Deserialization (parsing)
      • 2-11 MB/s (standard textual XML)
      • 30-101 MB/s (bnux binary XML)
  • Data compression factor
      • 1.2 - 4
  • Guarantees well-formed XML, preserving W3C XML
    Infoset and W3C Canonical XML(!)

9
Binary XML Applications
  • Tightly coupled high-performance systems
      • exchanging large volumes of networked XML data
  • Compact main memory caches
  • Short-term storage as BLOBs
      • in backend databases or files
      • e.g. "session" data with limited duration
  • Not a standard - thus
      • not intended as a replacement for standard
        textual XML in loosely coupled systems where
        maximum long-term interoperability is the
        overarching concern
      • not intended for long-term data storage

10
XML Serialization and Deserialization Overheads
  • BNUX Binary XML
  • Eliminate tag redundancy via tokenization and
    string pooling
  • Eliminate DTD and XML Schema checking
  • Efficient buffer (re)use
  • Fast Unicode conversions
  • Careful implementation (profiling)
  • Simplicity is key to performance
  • Guarantees well-formed XML, preserving W3C XML
    Infoset and W3C Canonical XML(!)
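
The idea behind tokenization and string pooling can be shown with a toy encoder: each distinct name is stored once in a pool, and the document body refers to it by a small integer. This is an illustrative sketch only; the StringPool class is hypothetical and does not reflect the actual bnux wire format.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy string pool; not the bnux encoding. */
final class StringPool {
    private final Map<String, Integer> indexOf = new LinkedHashMap<>();

    /** Returns the token (pool index) for `s`, adding it to the pool on first sight. */
    int tokenize(String s) {
        Integer idx = indexOf.get(s);
        if (idx == null) {
            idx = indexOf.size();
            indexOf.put(s, idx);
        }
        return idx;
    }

    /** The pooled strings in order of first appearance, so index i decodes token i. */
    List<String> strings() {
        return new ArrayList<>(indexOf.keySet());
    }

    /** Replaces each name by its token; repeated names cost one int instead of a full string. */
    static int[] encode(List<String> names, StringPool pool) {
        int[] tokens = new int[names.size()];
        for (int i = 0; i < tokens.length; i++) {
            tokens[i] = pool.tokenize(names.get(i));
        }
        return tokens;
    }

    /** Restores the original names from tokens and the pooled strings. */
    static List<String> decode(int[] tokens, List<String> pooled) {
        List<String> names = new ArrayList<>();
        for (int t : tokens) {
            names.add(pooled.get(t));
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = List.of("book", "title", "book", "title", "book");
        StringPool pool = new StringPool();
        int[] tokens = encode(names, pool);
        System.out.println(pool.strings() + " " + java.util.Arrays.toString(tokens));
    }
}
```

In XML with many repeated element and attribute names, the pool stays small while the token stream replaces most of the tag text, which is consistent with the redundancy elimination described above.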

11
Easy BNUX API
Document doc = new Builder().build("/tmp/test.xml");

// write binary XML document to file
BinaryXMLCodec codec = new BinaryXMLCodec();
byte[] data = codec.serialize(doc, 0);
new FileOutputStream("/tmp/test.xml.bnux").write(data);

// read binary XML document from file
data = XOMUtil.toByteArray(
    new FileInputStream("/tmp/test.xml.bnux"));
doc = codec.deserialize(data);
System.out.println(doc);
12
Easy yet powerful XML: XQuery and XPath
  • Manipulating and querying XML data
      • Manual SAX/DOM cumbersome at best
      • XSLT often too complicated
      • Most APIs have steep learning curve, contain
        quite a few bugs
  • XQuery and XPath are powerful yet simple
      • Essentially SQL for XML data
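
To make the "SQL for XML" comparison concrete, here is a small path query run with the JDK's built-in javax.xml.xpath API rather than Nux. The message layout in MESSAGE, the class TimeoutQuery, and the predicate form of the path are all assumptions made for this sketch.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical example; uses the JDK XPath engine, not Nux or Saxon.
final class TimeoutQuery {
    // Assumed protocol message; the real element layout is not given in the slides.
    static final String MESSAGE =
        "<open><transactionID>123</transactionID>"
      + "<scope><timeout>30</timeout></scope></open>";

    /** Returns the timeout text of the given transaction, or "" when nothing matches. */
    static String timeoutOf(String xml, int transactionId) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The predicate in [...] plays the role of a SQL WHERE clause.
        String query = "/open[transactionID=" + transactionId + "]/scope/timeout";
        try {
            return xpath.evaluate(query, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(timeoutOf(MESSAGE, 123));
    }
}
```

One declarative path expression replaces the nested loops and node-type checks that a manual DOM traversal would need.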

Document doc = ...;
// retrieve timeout of a given transaction
// from XML protocol message
timeout = XQueryUtil.xquery(doc,
    "/open[transactionID=123]/scope/timeout");
13
XQueries can be powerful
  • List books published by Addison-Wesley after
    1991, including their year and title

<bib>
{
  for $b in doc("http://bstore1.example.com/bib.xml")/bib/book
  where $b/publisher = "Addison-Wesley" and $b/@year > 1991
  return
    <book year="{ $b/@year }">
      { $b/title }
    </book>
}
</bib>
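
For comparison, roughly the same selection can be written without an XQuery engine by combining a JDK XPath filter with hand-written result construction. This is a substitute technique, not what Nux does internally; the BibQuery class and the sample data in BIB are invented for illustration, and the output strings are built without XML escaping.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example; JDK XPath plus manual construction instead of XQuery.
final class BibQuery {
    // Invented sample bibliography, used only to exercise the query.
    static final String BIB =
        "<bib>"
      + "<book year=\"1994\"><title>Book A</title><publisher>Addison-Wesley</publisher></book>"
      + "<book year=\"1990\"><title>Book B</title><publisher>Addison-Wesley</publisher></book>"
      + "<book year=\"2000\"><title>Book C</title><publisher>Other Press</publisher></book>"
      + "<book year=\"1992\"><title>Book D</title><publisher>Addison-Wesley</publisher></book>"
      + "</bib>";

    /** Returns one serialized <book> element per matching book, in document order. */
    static List<String> run(String xml) {
        // Covers the "for ... where" part of the XQuery in a single path expression.
        String where = "/bib/book[publisher='Addison-Wesley' and @year > 1991]";
        try {
            NodeList books = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(where, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < books.getLength(); i++) {
                Element book = (Element) books.item(i);
                String title = book.getElementsByTagName("title").item(0).getTextContent();
                // Covers the "return" part: build the new element by hand.
                result.add("<book year=\"" + book.getAttribute("year") + "\"><title>"
                    + title + "</title></book>");
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        run(BIB).forEach(System.out::println);
    }
}
```

The XQuery version states both the filter and the shape of the result in one expression, which is the expressive power the slide refers to.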
14
Implementation
  • Leveraging existing software
      • standards compliance, efficiency, maturity
  • Designed straightforward API (Nux)
      • Internally glues Saxon XQuery engine to XOM
        library
      • Tricky internals (!)
  • Preliminary performance for simple queries
      • 2000 (100000) queries/sec over 100 (0.5) KB input
        documents, i.e. 200 (50) MB/s
      • served from memory, commodity PC 2004, Java 1.5
  • Example ballpark figures
      • use cases, documents and query complexity can
        vary wildly
  • Documentation at http://dsd.lbl.gov/nux
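
The MB/s figures on this slide follow from multiplying the query rate by the document size. The small check below assumes decimal units (1 MB = 1000 KB); the Throughput class is a hypothetical helper.

```java
// Hypothetical helper for checking the slide's arithmetic.
final class Throughput {
    /** MB/s = queries per second x document size in KB / 1000 (decimal units assumed). */
    static double megabytesPerSecond(double queriesPerSecond, double documentKB) {
        return queriesPerSecond * documentKB / 1000.0;
    }

    public static void main(String[] args) {
        System.out.println(megabytesPerSecond(2000, 100));   // large documents
        System.out.println(megabytesPerSecond(100000, 0.5)); // small documents
    }
}
```

The two cases give 200 MB/s and 50 MB/s, matching the figures quoted for 100 KB and 0.5 KB documents respectively.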

15
Putting it all together
  • Components
      • Async non-blocking I/O
      • Binary XML encoding
      • XQuery and XPath
  • Enable use of
      • robust and powerful commodity XML tool chains
      • while retaining good performance
  • Sweet spot in the space of trade-offs formed by
    performance, usability, flexibility and
    expressive power

16
Future Work
  • Integration with firefish, scishare, P2P routing
    and maintenance strategies, etc. as outlined in
    LDRD
  • More detailed performance studies
      • End-to-end rather than isolated studies