Grid InterNetworking and QoS for the Grid - PowerPoint PPT Presentation

1
Grid InterNetworking and QoS for the Grid
Michael Welzl, http://www.welzl.at
DPS NSG Team, http://dps.uibk.ac.at/nsg
Institute of Computer Science, University of Innsbruck
Q2S Colloquium, Q2S/NTNU, Trondheim, 4 September 2006
2
Grid InterNetworking
3
What is the Grid?
  • Metaphor: power grid
  • just plug in, don't care where (processing) power
    comes from, don't care how it reaches you
  • Common definition: "The real and specific problem
    that underlies the Grid concept is coordinated
    resource sharing and problem solving in dynamic,
    multi-institutional virtual organizations" (Ian
    Foster, Carl Kesselman and Steven Tuecke, "The
    Anatomy of the Grid: Enabling Scalable Virtual
    Organizations", International Journal on
    Supercomputer Applications, 2001)
  • Common terms: Virtual Team - members of one or
    several Virtual Organizations who use a Grid
  • Most of the time...
  • the real and specific goal is High Performance
    Computing
  • virtual organizations and virtual teams are well
    defined (as opposed to the SETI@home usage
    scenario)
  • i.e. not an open system; security is a big issue

4
Scope
  • Grid history: parallel processing at a growing
    scale
  • Parallel CPU architectures
  • Multiprocessor machines
  • Clusters
  • (Massively Distributed) computers on the
    Internet

Size
  • Traditional goal: processing power
  • Grid people = parallel people; thus, goal has not
    changed much
  • Broader definition (resource sharing)
  • reasonable - e.g., computers also have hard disks
    :-)
  • New research areas / buzzwords: Wireless Grid,
    DataGrid, Pervasive Grid, [this space reserved
    for your favorite research area] Grid
  • sometimes perhaps a little too broad, e.g., P2P
    Working Group is now part of the Global Grid
    Forum

Reasonable to focus on this.
5
Grid Workflow Applications
  • Components are built, Web (Grid) Services are
    defined, Activities are specified
  • Activities (which may communicate with each
    other) should automatically be distributed by a
    scheduler

6
UIBK-DPS development: ASKALON, a Grid Application
Development and Computing Environment
XML
7
Grid requirements
  • Efficiency + ease of use
  • Programmer should not worry (too much) about the
    Grid
  • Underlying system has to deal with
  • Error management
  • Authentication, Authorization and Accounting
    (AAA)
  • Efficient Scheduling / Load Balancing
  • Resource finding and brokerage
  • Naming
  • Resource access and monitoring
  • No problem: we do it all - in Middleware
  • de facto standard: Globus Toolkit
  • installation of GT3 in our high performance
    system: 1 1/2 hours or so...
  • yes, it truly does it all :) 1000s of
    addons - GridFTP, MDS, NWS, GRAM, ..
  • this is just the basis - e.g., ASKALON is layered
    on top of Globus

8
Grid-network peculiarities
  • Special behavior
  • Predictable traffic pattern - this is totally new
    to the Internet!
  • Web: users create traffic
  • FTP download: starts ... ends
  • Streaming video: either CBR or depends on
    content! (head movement, ..)
  • Could be exploited by congestion control
    mechanisms
  • Distinction: bulk data transfer (e.g. GridFTP)
    vs. control messages (e.g. SOAP)
  • File transfers are often "pushed" and not
    "pulled"
  • Distributed System which is active for a while
  • overlay-based network enhancements possible
  • Multicast
  • P2P paradigm ("do work for others for the sake of
    enhancing the whole system (in your own
    interest)") can be applied - e.g. act as a PEP,
    ...
  • sophisticated network measurements possible
  • can exploit longevity and distributed
    infrastructure
  • Special requirements
  • file transfer delay predictions
  • note: useless without knowing about shared
    bottlenecks
  • QoS, but for file transfers only ("advance
    reservation")

9
Research gap: Grid-specific network enhancements
Bringing the Grid to its full potential!
Applications with special network properties
and requirements
Driving a racing car on a public road
Traditional Internet applications (web browser,
ftp, ..)
10
What is EC-GIN?
  • European project: Europe-China Grid
    InterNetworking
  • STREP in European IST FP6 Call 6
  • 2.2 MEuro, 11 partners (7 Europe + 4 China)
  • Networkers developing mechanisms for Grids

11
Research Challenges
  • Research Challenges
  • How to model Grid traffic?
  • Much is known about web traffic (e.g.
    self-similarity) - but the Grid is different!
  • How to simulate a Grid-network?
  • Necessary for checking various environment
    conditions
  • May require traffic model (above)
  • Currently, Grid-Sim / Net-Sim are two separate
    worlds (different goals, assumptions, tools,
    people)
  • How to specify network requirements?
  • Explicit or implicit, guaranteed or elastic,
    various possible levels of granularity
  • How to align network and Grid economics?
  • Combined usage based pricing for various
    resources including the network
  • What P2P methods are suitable for the Grid?
  • What is the right means for storing short-lived
    performance data?

12
Some issues: application interface...
  • How to specify properties and requirements
  • Should be simple and flexible - use QoS
    specification languages?
  • Should applications be aware of this?? Trade-off
    between service granularity and transparency!

13
... and peer awareness
[Figure labels: Data flow; Grid end system; (b) NSG PEP]
14
Problem: How Grid folks see the Internet
Just like Web Service community
  • Abstraction - simply use what is available
  • still performance main goal
  • Existing transport system (TCP/IP, Routing, ..)
    works well
  • QoS makes things better, the Grid needs it!
  • we now have a chance for that, thanks to IPv6

Absolutely not like Web Service community !
Wrong.
  • Quote from a paper review:
  • "In fact, any solution that requires changing the
    TCP/IP protocol stack is practically unapplicable
    to real-world scenarios, (..)."
  • How to change this view
  • Create awareness - e.g. GGF GHPN-RG published
    documents such as "net issues with grids",
    "overview of transport protocols"
  • Develop solutions and publish them! (EC-GIN,
    GridNets)

15
A time-to-market issue
Typical Grid project
Result: thesis, running code, tests in
collaboration with different research areas
Typical Network project
Result: thesis, simulation code, perhaps early
real-life prototype (if students did well)
16
Machine-only communication
  • Trend in networks: from support of Human-Human
    Communication
  • email, chat
  • via Human-Machine Communication
  • web surfing, file downloads (P2P systems),
    streaming media
  • to Machine-Machine Communication
  • Growing number of commercial web service based
    applications
  • New "hype" technologies: Sensor nets, Autonomic
    Computing vision
  • Semantic Web (Services): first big step for
    supporting machine-only communication at a high
    level
  • So far, no steps at a lower level
  • This would be like RTP, RTCP, SIP, DCCP, ... for
    multimedia apps: not absolutely necessary, but
    advantageous

17
The long-term value of Grid-net research
  • Key for achieving this: change viewpoint from
    "what can we do for the Grid" to "what can the
    Grid do for us" (or from "what does the Grid
    need" to "what does the Grid mean to us")
  • A subset of Grid-net developments will be useful
    for other machine-only communication systems!

18
QoS for the Grid
  • A Grid InterNetworking example

19
QoS: the state-of-the-art :-(
  • Papers from SIGCOMM'03 RIPQOS Workshop: "Why do
    we care, what have we learned?"
  • "QoS's Downfall: At the bottom, or not at all!",
    Jon Crowcroft, Steven Hand, Richard Mortier,
    Timothy Roscoe, Andrew Warfield
  • "Failure to Thrive: QoS and the Culture of
    Operational Networking", Gregory Bell
  • "Beyond Technology: The Missing Pieces for QoS
    Success", Carlos Macian, Lars Burgstahler,
    Wolfgang Payer, Sascha Junghans, Christian
    Hauser, Juergen Jaehnert
  • "Deployment Experience with Differentiated
    Services", Bruce Davie
  • "Quality of Service and Denial of Service",
    Stanislav Shalunov, Benjamin Teitelbaum
  • "Networked games - a QoS-sensitive application
    for QoS-insensitive users?", Tristan Henderson,
    Saleem Bhatti
  • "What QoS Research Hasn't Understood About Risk",
    Ben Teitelbaum, Stanislav Shalunov
  • "Internet Service Differentiation using Transport
    Options: the case for policy-aware congestion
    control", Panos Gevros

20
Key reasons for QoS failure
  • Required participation of end users and all
    intermediate ISPs
  • "normal" Internet users want Internet-wide QoS,
    or no QoS at all
  • In a Grid, a virtual team wants QoS between its
    nodes
  • Members of the team share the same ISPs - flow of
    is possible
  • Technical inability to provision individual
    (per-flow) QoS
  • "normal" Internet users:
  • unlimited number of flows come and go at any time
  • heterogeneous traffic mix
  • Grid users:
  • number of members in a virtual team may be
    limited
  • clear distinction between bulk data transfer and
    SOAP messages
  • appearance of flows mostly controlled by
    machines, not humans
  • => QoS can work for the Grid!
  • still, often practical problems (involvement of
    many ISPs for global Grid, ..)

21
Proposed architecture
  • Goal: efficient per-flow QoS without signaling to
    routers
  • ultimate dream (very long-term goal): without any
    router involvement! (99% instead of 100% reliable
    guarantees)
  • Idea: use traditional coarse-grain QoS (DiffServ)
    to differentiate between
  • long-lived bulk data transfer with advance
    reservation (EF) and
  • everything else (= SOAP etc. over TCP) (best
    effort)
  • Allows us to assume isolated traffic; planned to
    drop this requirement later
  • Because data transfers are long-lived, apply
    admission control
  • Flows signal to resource broker (RB) when joining
    or leaving the network
  • Mandate usage of one particular congestion
    control mechanism for all flows in the EF
    aggregate
  • Enables efficient resource usage because flows
    are elastic
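
The EF / best-effort split above relies on senders marking their bulk-transfer packets. A minimal sketch of how an end system might request the EF code point on a TCP socket follows. The helper names `dscp_to_tos` and `open_marked_socket` are hypothetical and not from the talk. It assumes a platform where the `IP_TOS` socket option is available, and routers are free to ignore or re-mark the field.

```python
import socket

# Expedited Forwarding code point (DSCP 46, defined in RFC 3246)
EF_DSCP = 46

def dscp_to_tos(dscp):
    """Place a 6-bit DSCP value in the upper bits of the old TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2

def open_marked_socket(dscp=EF_DSCP):
    """Create a TCP socket whose outgoing packets request the given DSCP.

    Hypothetical helper: whether the marking survives depends on the
    operating system and on every DiffServ domain along the path.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

Everything else (SOAP etc.) would simply use an unmarked socket and stay in the best-effort class.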

22
Key ingredients of our QoS soup
  • Link capacities must be known, paths should be
    stable (capacity information should be updated
    upon routing change)
  • Shared bottlenecks must be known
  • Bottlenecks must be fairly shared by congestion
    control mechanism irrespective of RTT (max-min
    fairness required, i.e. all flows must increase
    their rates until they reach their limit)
  • No signaling to routers => no way to enforce
    proper behavior => there must be no cheaters
  • User incentive: fair behavior among cooperating
    nodes among which Grid application is distributed
  • Unfair behavior between Grid apps 1 and 2 in same
    Grid neglected (usually acceptable, as used by
    same Virtual Organization)

23
Link capacities must be known
  • Can be attained with measurements
  • Working on permanently active, (mostly) passive
    measurement system for the Grid that detects
    capacity with packet pair
  • send two packets p1 and p2 in a row; high
    probability that p2 is enqueued exactly behind p1
    at bottleneck
  • at receiver: calculate bottleneck bandwidth via
    time between p1 and p2
  • e.g. TCP Delayed ACK: receiver automatically
    sends packet pairs => passive TCP receiver
    monitoring is quite good!
  • exploit longevity - minimize error by listening
    for a long time
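
The packet-pair calculation above can be sketched as follows. Each pair yields one estimate (size of p2 divided by the arrival gap), and the median over many samples stands in for "listening for a long time". The function name, the median filter and the sample values are illustrative assumptions, not the measurement system described in the talk.

```python
from statistics import median

def packet_pair_capacity(packet_size_bytes, gaps_seconds):
    """Estimate bottleneck capacity in bit/s from packet-pair arrival gaps.

    gaps_seconds holds the receiver-side time between p1 and p2 for
    each observed pair; all packets are assumed to have the same size.
    """
    # Discard non-positive gaps (clock glitches, reordering)
    valid = [g for g in gaps_seconds if g > 0]
    if not valid:
        raise ValueError("no usable packet-pair samples")
    # One capacity estimate per pair: bits of p2 / dispersion
    estimates = [packet_size_bytes * 8 / g for g in valid]
    # The median damps outliers caused by cross traffic between p1 and p2
    return median(estimates)

# Hypothetical samples: 1500-byte packets, mostly 120 microseconds apart
gaps = [0.00012, 0.00012, 0.00035, 0.00012, 0.00011]
print(packet_pair_capacity(1500, gaps))
```

With these made-up samples the estimate comes out at roughly 100 Mbit/s; the single 350 microsecond gap, where cross traffic slipped between the two packets, does not move the median.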

24
Shared bottlenecks must be known
  • Simple basis: distributed traceroute tool
  • enhancement: traceroute terminates early upon
    detection of known hop
  • Handle "black holes" in traceroute
  • generate test messages from A, B to C - identify
    signature from B in A's traffic
  • method has worked in the past: controlled
    flooding for DDoS detection
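
Once hop lists are available, finding candidate shared bottlenecks reduces to intersecting paths. A minimal sketch, assuming each flow's path is already known as an ordered list of hop names (the flow labels and router names are made up); it does not cover the "black hole" case, where hops are missing from the traceroute output.

```python
def path_links(hops):
    """Turn an ordered hop list into the set of directed links it uses."""
    return set(zip(hops, hops[1:]))

def shared_links(paths):
    """Return the links crossed by two or more flows.

    paths maps a flow label to its ordered hop list; the result maps
    each shared link to the set of flows that traverse it.
    """
    usage = {}
    for flow, hops in paths.items():
        for link in path_links(hops):
            usage.setdefault(link, set()).add(flow)
    return {link: flows for link, flows in usage.items() if len(flows) > 1}

# Hypothetical traceroute results for two transfers towards C
paths = {
    "A->C": ["A", "r1", "r3", "C"],
    "B->C": ["B", "r2", "r3", "C"],
}
print(shared_links(paths))
```

Here only the link from r3 to C is reported, so that is the only place where the two transfers could share a bottleneck.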

25
Congestion Control mechanism must be max-min fair
  • Was once said to be impossible without per-flow
    state in routers
  • not true: XCP and some others
  • but these explicitly require router support...
  • Main problem: dependence on RTT
  • three good indications that this can be removed
    without router support
  • CADPC/PTP (my Ph.D. thesis)...
  • max-min fairness based on router feedback, but
    only capacity and available bandwidth (could also
    be obtained by measuring)
  • Result in old paper on phase effects by Sally
    Floyd
  • TCP Libra
  • Problem: efficiency - no max-min fair
    high-speed CC mechanism without router support
  • now plan to change existing one based on
    knowledge from above examples
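
For reference, the allocation that a max-min fair mechanism should converge to can be computed centrally. The sketch below handles a single bottleneck and flows with individual rate limits; it only illustrates the target allocation and is not one of the congestion control mechanisms named above. Function and flow names are assumptions.

```python
def max_min_share(capacity, limits):
    """Max-min fair rates on one link for flows with individual limits.

    limits maps a flow label to the highest rate that flow can use
    (for example because it is bottlenecked elsewhere).
    """
    rates = {}
    remaining = capacity
    # Serve the flows with the smallest limits first
    pending = sorted(limits.items(), key=lambda kv: kv[1])
    while pending:
        fair = remaining / len(pending)
        flow, limit = pending[0]
        if limit <= fair:
            # This flow cannot use its full share; give it its limit
            rates[flow] = limit
            remaining -= limit
            pending.pop(0)
        else:
            # Every remaining flow can use the equal share
            for name, _ in pending:
                rates[name] = fair
            break
    return rates

print(max_min_share(100, {"f1": 10, "f2": 40, "f3": 80}))
```

In this example f1 and f2 are capped by their own limits (10 and 40), and f3 receives the 50 units that are left.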

26
Per-flow QoS without signaling to routers
Traditional method: signaling to edge routers
(e.g. with COPS) at this point!
Synchronization of distributed (P2P-based)
database: link capacities known to all brokers
Synchronization of distributed (P2P-based)
database: all flows known to all brokers
Continuous measurements; update to BB upon path
change
27
Efficiency via elasticity
  • QoS guarantees in Grid: "File will be transferred
    within X seconds" => enables flexible resource
    usage

28
Efficiency via elasticity /2
  • Flow 1 stopped, flows 2-4 automatically increase
    their rates
  • leading to earlier termination times E2-E4
    known to (calculated by) BB

29
Efficiency via elasticity /3
  • Flow 5 asks BB for admission
  • BB knows about current rates and promised E2-E4,
    grants access

30
Efficiency via elasticity /4
Additional flow admitted and earlier termination
times than promised!
  • Flow 2 terminates in time
  • Flows 3-5 will also terminate in time
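
The admission decision walked through on the last four slides can be sketched with a simple fluid model: all admitted flows share one bottleneck equally, and a finishing flow's rate is picked up by the others. This is an illustrative assumption about how a BB could calculate termination times, not the broker's actual algorithm; the flow sizes and deadlines are made up.

```python
def finish_times(capacity, remaining, now=0.0):
    """Finish time of each flow when active flows share capacity equally.

    remaining maps a flow label to the data it still has to send, in
    units consistent with capacity (e.g. Mbit and Mbit/s).
    """
    left = dict(remaining)
    done = {}
    t = now
    while left:
        rate = capacity / len(left)
        # The flow with the least data left finishes next
        flow = min(left, key=left.get)
        dt = left[flow] / rate
        t += dt
        for name in left:
            left[name] = max(0.0, left[name] - rate * dt)
        del left[flow]
        done[flow] = t
    return done

def admit(capacity, remaining, deadlines, new_flow, new_size, new_deadline):
    """Admit the new flow only if every promised deadline still holds."""
    trial = dict(remaining)
    trial[new_flow] = new_size
    limits = dict(deadlines)
    limits[new_flow] = new_deadline
    ends = finish_times(capacity, trial)
    return all(ends[name] <= limits[name] for name in trial)

# Flow 1 has stopped; flows 2-4 still hold earlier promises E2-E4
remaining = {"f2": 1000, "f3": 2000, "f4": 3000}
promised = {"f2": 40, "f3": 70, "f4": 80}
print(admit(100, remaining, promised, "f5", 500, 30))
print(admit(100, remaining, promised, "f5", 3000, 200))
```

A small flow 5 fits because flows 2-4 are running ahead of their promises; a large one would push flow 4 past its deadline and is refused.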

31
Elasticity without Congestion Control?
  • Significant amount of additional signaling
    necessary

As flow 5 is admitted, signal "reduce your rates"
to flows 2-4 required!
As flow 1 stops, flows 2-4 could increase their
rates
Without congestion control, signal "increase your
rates" to flows 2-4 required!
32
Additional considerations
  • How to assign different rates to different flows?
  • max-min fairness: if a sender acts like two, it
    obtains twice the rate
  • consider rate consisting of slots (e.g. 1 kbit/s
    = 1 slot)
  • flows can consist of several slots
  • let congestion control mechanism operate on slots
  • Possibility: admit new flows when things look
    even worse
  • in previous example, flows terminated earlier
    despite the entry of flow 5 => unnecessary, would
    be possible to reduce their rates a bit (main
    goal of architecture: say yes to requests and
    fulfil them)
  • what if only one of the flows would terminate
    earlier than promised?
  • introduce unfairness: remove slots from a flow
    which previously increased its rate if deadline
    can still be kept (calculated in BB)
  • disadvantage: more signaling again
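
The slot idea lends itself to a small calculation in the BB: how many slots does a flow need to keep its deadline, and how many could be taken away? A minimal sketch, assuming one slot corresponds to a constant 1 kbit/s and that the flow keeps exactly its slot rate until the deadline; the function names and numbers are hypothetical.

```python
import math

def min_slots_for_deadline(remaining_kbit, seconds_left, slot_kbit_per_s=1.0):
    """Smallest whole number of slots that still meets the deadline."""
    if seconds_left <= 0:
        raise ValueError("deadline has already passed")
    return math.ceil(remaining_kbit / (seconds_left * slot_kbit_per_s))

def removable_slots(current_slots, remaining_kbit, seconds_left,
                    slot_kbit_per_s=1.0):
    """Number of slots that can be withdrawn without breaking the promise."""
    needed = min_slots_for_deadline(remaining_kbit, seconds_left,
                                    slot_kbit_per_s)
    return max(0, current_slots - needed)

# A flow with 80 slots, 6000 kbit left and 100 s until its deadline
print(removable_slots(80, 6000, 100))
```

In this example the flow needs 60 slots to finish on time, so 20 of its 80 slots could be handed to a newly admitted flow, at the cost of the extra signaling noted above.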

33
Difficult / distant future work
  • Drop requirement of traffic isolation via
    DiffServ
  • constantly obtain and update conservative
    estimate of available bandwidth using packet pair
    (works without saturating link)
  • ensure that limit is never exceeded; "condition
    red" otherwise!
  • Some open questions...
  • does this require the CC mechanism to be
    TCP-friendly?
  • "condition red": reduce slots, or let flows be
    aggressive for a short time?
  • How to handle routing changes
  • will be noticed, but can reduce capacity => break
    QoS guarantee
  • "condition red" can happen in worst case, but to
    be avoided at all cost
  • mitigation methods
  • very conservative estimate of available
    bandwidth: leave headroom
  • tell senders to reroute via intermediate end
    systems
  • Bottom line: lots of complicated issues, but
    possible to solve them

34
Thank you!
  • Questions?