1
Grid Computing
  • Outline
  • Introduction
  • Using the grid
  • Ongoing research
  • The presentation is based on the web, especially
    the work of Faisal N. Abu-Khzam and Michael A.
    Langston (University of Tennessee)

2
Introduction
  • What is Grid Computing?
  • Who Needs It?
  • An Illustrative Example
  • Grid Users
  • Current Grids

3
What is Grid Computing?
  • Computational Grids
  • Homogeneous (e.g., Clusters)
  • Heterogeneous (e.g., with one-of-a-kind
    instruments)
  • Cousins of Grid Computing
  • Methods of Grid Computing

4
Computational Grids
  • A network of geographically distributed resources
    including computers, peripherals, switches,
    instruments, and data.
  • Each user should have a single login account to
    access all resources.
  • Resources may be owned by diverse organizations.

5
Computational Grids
  • Grids are typically managed by gridware.
  • Gridware can be viewed as a special type of
    middleware that enables sharing and manages grid
    components based on user requirements and
    resource attributes (e.g., capacity, performance,
    and availability).

6
Cousins of Grid Computing
  • Parallel Computing
  • Distributed Computing
  • Peer-to-Peer Computing
  • Many others: Cluster Computing, Network
    Computing, Client/Server Computing, Internet
    Computing, etc.

7
Distributed Computing
  • People often ask: is Grid Computing a fancy new
    name for the concept of distributed computing?
  • In general, the answer is no. Distributed
    Computing is most often concerned with
    distributing the load of a program across two or
    more processes.

8
P2P Computing
  • Sharing of computer resources and services by
    direct exchange between systems.
  • Computers can act as clients or servers depending
    on what role is most efficient for the network.

9
Methods of Grid Computing
  • Distributed Supercomputing
  • High-Throughput Computing
  • On-Demand Computing
  • Data-Intensive Computing
  • Collaborative Computing
  • Logistical Networking

10
Distributed Supercomputing
  • Combining multiple high-capacity resources on a
    computational grid into a single, virtual
    distributed supercomputer.
  • Tackle problems that cannot be solved on a single
    system.

11
High-Throughput Computing
  • Uses the grid to schedule large numbers of
    loosely coupled or independent tasks, with the
    goal of putting unused processor cycles to work.
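As a minimal sketch of this pattern (illustrative only, not any particular gridware's API), a pool of workers simply drains a queue of independent tasks; the compound-screening task below is hypothetical:

    # Minimal sketch of high-throughput computing: many loosely
    # coupled tasks drained by whatever workers are idle.
    # Illustrative only; real gridware adds matchmaking,
    # checkpointing, and fault tolerance.
    from concurrent.futures import ProcessPoolExecutor

    def screen_compound(compound_id):
        # Stand-in for one independent task, e.g., screening
        # a chemical compound.
        return compound_id, compound_id % 7 == 0

    if __name__ == "__main__":
        tasks = range(10_000)
        # The local process pool stands in for idle grid machines.
        with ProcessPoolExecutor() as pool:
            hits = [c for c, ok in pool.map(screen_compound, tasks) if ok]
        print(len(hits), "promising compounds")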

12
On-Demand Computing
  • Uses grid capabilities to meet short-term
    requirements for resources that are not locally
    accessible.
  • Models real-time computing demands.

13
Data-Intensive Computing
  • The focus is on synthesizing new information from
    data that is maintained in geographically
    distributed repositories, digital libraries, and
    databases.
  • Particularly useful for distributed data mining.

14
Collaborative Computing
  • Concerned primarily with enabling and enhancing
    human-to-human interactions.
  • Applications are often structured in terms of a
    virtual shared space.

15
Logistical Networking
  • Global scheduling and optimization of data
    movement.
  • Contrasts with traditional networking, which does
    not explicitly model storage resources in the
    network.
  • Called "logistical" because of the analogy it
    bears with the systems of warehouses, depots, and
    distribution channels.

16
Who Needs Grid Computing?
  • A chemist may utilize hundreds of processors to
    screen thousands of compounds per hour.
  • Teams of engineers worldwide pool resources to
    analyze terabytes of structural data.
  • Meteorologists seek to visualize and analyze
    petabytes of climate data with enormous
    computational demands.

17
An Illustrative Example
  • Tiffany Moisan, a NASA research scientist,
    collected microbiological samples in the
    tidewaters around Wallops Island, Virginia.
  • She needed the high-performance microscope
    located at the National Center for Microscopy and
    Imaging Research (NCMIR), University of
    California, San Diego.

18
Example (continued)
  • She sent the samples to San Diego and used
    NPACI's Telescience Grid and NASA's Information
    Power Grid (IPG) to view and control the output
    of the microscope from her desk on Wallops
    Island. Thus, in addition to viewing the samples,
    she could move the platform holding them and make
    adjustments to the microscope.
  • The microscope produced a huge dataset of images.
  • This dataset was stored using a storage resource
    broker on NASA's IPG.
  • Moisan was able to run algorithms on this very
    dataset while watching the results in real time.

19
Grid Users
  • Grid developers
  • Tool developers
  • Application developers
  • End Users
  • System Administrators

20
Grid Developers
  • Very small group.
  • Implementers of the grid protocols, who provide
    the basic services required to construct a grid.

21
Tool Developers
  • Implement the programming models used by
    application developers.
  • Implement basic services similar to conventional
    computing services:
  • User authentication/authorization
  • Process management
  • Data access and communication
  • Also implement new (grid) services, such as:
  • Resource location
  • Fault detection
  • Security
  • Electronic payment

22
Application Developers
  • Construct grid-enabled applications for end-users
    who should be able to use these applications
    without concern for the underlying grid.
  • Provide programming models that are appropriate
    for grid environments and services that
    programmers can rely on when developing
    (higher-level) applications.

23
System Administrators
  • Balance local and global concerns.
  • Manage grid components and infrastructure.
  • Some tasks still not well delineated due to the
    high degree of sharing required.

24
Some Highly-Visible Grids
  • The NSF PACI/NCSA Alliance Grid.
  • The NSF PACI/SDSC NPACI Grid.
  • The NASA Information Power Grid (IPG).
  • The Distributed Terascale Facility (DTF) Project.

25
DTF
  • Currently being built by NSF's Partnerships for
    Advanced Computational Infrastructure (PACI).
  • A collaboration in which NCSA, SDSC, Argonne, and
    Caltech will work in conjunction with IBM, Intel,
    Qwest Communications, Myricom, Sun Microsystems,
    and Oracle.
  • DTF Expectations
  • A 40-billion-bits-per-second optical network
    (called TeraGrid) is to link computers,
    visualization systems, and data at four sites.
  • Performs 11.6 trillion calculations per second.
  • Stores more than 450 trillion bytes of data.

26
Using the Grid
  • Globus
  • Condor
  • Harness
  • Legion
  • IBP
  • NetSolve
  • Others

27
Globus
  • A collaboration of Argonne National Laboratory's
    Mathematics and Computer Science Division, the
    University of Southern California's Information
    Sciences Institute, and the University of
    Chicago's Distributed Systems Laboratory.
  • Started in 1996 and is gaining popularity year
    after year.
  • A project to develop the underlying technologies
    needed for the construction of computational
    grids.
  • Focuses on execution environments for integrating
    widely-distributed computational platforms, data
    resources, displays, special instruments and so
    forth.

28
The Globus Toolkit
  • The Globus Resource Allocation Manager (GRAM)
  • Creates, monitors, and manages services.
  • Maps requests to local schedulers and computers.
  • The Grid Security Infrastructure (GSI)
  • Provides authentication services.

29
The Globus Toolkit
  • The Monitoring and Discovery Service (MDS)
  • Provides information about system status,
    including server configurations, network status,
    and locations of replicated datasets, etc.
  • Nexus and globus_io
  • Provide communication services for heterogeneous
    environments.
  • Global Access to Secondary Storage (GASS)
  • Provides data movement and access mechanisms that
    enable remote programs to manipulate local data.
  • Heartbeat Monitor (HBM)
  • Used by both system administrators and ordinary
    users to detect failure of system components or
    processes.
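Failure detection of this kind can be modeled in a few lines. The sketch below is hypothetical (it is not the HBM protocol itself): components check in periodically, and anything silent past a timeout is suspected of failure.

    # Hypothetical sketch of heartbeat-based failure detection,
    # in the spirit of the Heartbeat Monitor (not its actual
    # protocol or API).
    import time

    TIMEOUT = 30.0   # seconds of silence before a component is suspected
    last_seen = {}   # component name -> time of last heartbeat

    def record_heartbeat(component):
        last_seen[component] = time.monotonic()

    def suspected_failures():
        now = time.monotonic()
        return [c for c, t in last_seen.items() if now - t > TIMEOUT]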

30
Condor
  • The Condor project started in 1988 at the
    University of Wisconsin-Madison.
  • The main goal is to develop tools to support High
    Throughput Computing on large collections of
    distributively owned computing resources.
  • Runs on a cluster of workstations to glean wasted
    CPU cycles.
  • A Condor pool consists of any number of
    machines, of possibly different architectures and
    operating systems, that are connected by a
    network.
  • Condor pools can share resources through a
    feature of Condor called flocking.

31
The Condor Pool Software
  • Job management services
  • Supports requests about the job queue.
  • Puts a job on hold.
  • Enables the submission of new jobs.
  • Provides information about jobs that are already
    finished.
  • A machine with job management installed is called
    a submit machine.

32
The Condor Pool Software
  • Resource management
  • Keeps track of available machines.
  • Performs resource allocation and scheduling.
  • Machines with resource management installed are
    called execute machines.
  • A machine could be a submit and an execute
    machine simultaneously.
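Condor pairs jobs with execute machines by matching job requirements against the attributes each machine advertises (its ClassAd mechanism). The sketch below is a simplified, hypothetical model of that matchmaking, not Condor's actual implementation:

    # Simplified, hypothetical model of Condor-style matchmaking:
    # job requirements are matched against attributes advertised
    # by execute machines. Condor's real mechanism is ClassAds.
    machines = [
        {"name": "node1", "arch": "INTEL", "opsys": "LINUX",
         "memory_mb": 2048, "idle": True},
        {"name": "node2", "arch": "SUN4u", "opsys": "SOLARIS",
         "memory_mb": 512, "idle": False},
    ]

    job = {"arch": "INTEL", "opsys": "LINUX", "min_memory_mb": 1024}

    def matches(job, machine):
        return (machine["idle"]
                and machine["arch"] == job["arch"]
                and machine["opsys"] == job["opsys"]
                and machine["memory_mb"] >= job["min_memory_mb"])

    print([m["name"] for m in machines if matches(job, m)])  # ['node1']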

33
Condor-G
  • A version of Condor that uses Globus to submit
    jobs to remote resources.
  • Allows users to monitor jobs submitted through
    the Globus toolkit.
  • Can be installed on a single machine, so there is
    no need to have a Condor pool installed.

34
Legion
  • An object-based metasystems software project
    designed at the University of Virginia to support
    millions of hosts and trillions of objects linked
    together with high-speed links.
  • Allows groups of users to construct shared
    virtual work spaces, collaborate on research, and
    exchange information.
  • An open system designed to encourage third party
    development of new or updated applications,
    run-time library implementations, and core
    components.
  • The key feature of Legion is its object-oriented
    approach.

35
Harness
  • A Heterogeneous Adaptable Reconfigurable
    Networked System
  • A collaboration between Oak Ridge National Lab,
    the University of Tennessee, and Emory
    University.
  • Conceived as a natural successor of the PVM
    project.

36
Harness
  • An experimental system based on a highly
    customizable, distributed virtual machine (DVM)
    that can run on anything from a Supercomputer to
    a PDA.
  • Built on three key areas of research: the
    Parallel Plug-in Interface, Distributed
    Peer-to-Peer Control, and Multiple DVM
    Collaboration.

37
IBP
  • The Internet Backplane Protocol (IBP) is a
    middleware for managing and using remote storage.
  • It was devised at the University of Tennessee to
    support Logistical Networking in large scale,
    distributed systems and applications.

38
IBP
  • Named because it was designed to enable
    applications to treat the Internet as if it were
    a processor backplane.
  • On a processor backplane, the user has access to
    memory and peripherals, and can direct
    communication between them with DMA.

39
IBP
  • IBP gives the user access to remote storage and
    standard Internet resources (e.g., content servers
    implemented with standard sockets) and can direct
    communication between them with the IBP API.

40
IBP
  • By providing a uniform, application-independent
    interface to storage in the network, IBP makes it
    possible for applications of all kinds to use
    logistical networking to exploit data locality
    and more effectively manage buffer resources.
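The usage pattern is roughly: allocate a time-limited byte array on a remote storage depot, receive a capability for it, then store and load through that capability. The sketch below models the pattern in hypothetical Python; IBP's real client library is a C API and its names differ.

    # Hypothetical sketch of the IBP usage pattern (not IBP's
    # actual C API): allocate a time-limited byte array on a
    # remote depot, get back a capability, use it to store/load.
    class Depot:
        def __init__(self, host):
            self.host = host
            self._areas = {}

        def allocate(self, size, duration_s):
            # A real depot would reclaim the area after duration_s;
            # the lease is ignored in this toy model.
            cap = "%s/%d" % (self.host, len(self._areas))
            self._areas[cap] = bytearray(size)
            return cap

        def store(self, cap, data, offset=0):
            self._areas[cap][offset:offset + len(data)] = data

        def load(self, cap, offset, length):
            return bytes(self._areas[cap][offset:offset + length])

    depot = Depot("depot.example.edu")              # hypothetical depot
    cap = depot.allocate(1 << 20, duration_s=3600)  # 1 MB for one hour
    depot.store(cap, b"intermediate results")
    print(depot.load(cap, 0, 12))                   # b'intermediate'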

41
NetSolve
  • A client-server-agent model.
  • Designed for solving complex scientific problems
    in a loosely-coupled heterogeneous environment.
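The division of labor among the three parties can be modeled as below. This is a hypothetical sketch of the interaction, not NetSolve's actual API: the client asks the agent for the best-suited server, then sends its request there.

    # Hypothetical model of NetSolve's client-agent-server
    # interaction (not the actual NetSolve API).
    class Server:
        def __init__(self, name, problems, load):
            self.name, self.problems, self.load = name, problems, load

        def solve(self, problem, data):
            return "%s(%s) solved on %s" % (problem, data, self.name)

    class Agent:
        def __init__(self, servers):
            self.servers = servers   # the agent's resource index

        def best_server(self, problem):
            # Pick the least-loaded server that offers the problem.
            able = [s for s in self.servers if problem in s.problems]
            return min(able, key=lambda s: s.load)

    agent = Agent([Server("a", {"fft"}, 0.9), Server("b", {"fft"}, 0.2)])
    server = agent.best_server("fft")      # agent brokers the request
    print(server.solve("fft", [1, 2, 3]))  # client contacts server "b"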

42
The NetSolve Agent
  • A resource broker that represents the gateway
    to the NetSolve system
  • Maintains an index of the available computational
    resources and their characteristics, in addition
    to usage statistics.

43
The NetSolve Agent
  • Accepts requests for computational services from
    the client API and dispatches them to the
    best-suited server.
  • Runs on Linux and UNIX.

44
The NetSolve Client
  • Provides access to remote resources through
    simple and intuitive APIs.
  • Runs on a user's local system.
  • Contacts the NetSolve system through the agent,
    which in turn returns the server that can best
    service the request.
  • Runs on Linux, UNIX, and Windows.

45
The NetSolve Server
  • The computational backbone of the system.
  • A daemon process that awaits client requests.
  • Runs on different platforms: a single
    workstation, a cluster of workstations, symmetric
    multiprocessors (SMPs), or massively parallel
    processors (MPPs).

46
The NetSolve Server
  • A key component of the server is the Problem
    Description File (PDF).
  • With the PDF, routines local to a given server
    are made available to clients throughout the
    NetSolve system.

47
The PDF Template
  • PROBLEM: program name
  • LIB: supporting library information
  • INPUT: specifications
  • OUTPUT: specifications
  • CODE
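For illustration only, a PDF instance for a hypothetical FFT service might fill the fields above along these lines (the values are made up, and NetSolve's concrete PDF syntax is not reproduced here):

    PROBLEM  fft
    LIB      -L/usr/local/lib -lfftw
    INPUT    a one-dimensional array of doubles (the signal)
    OUTPUT   a one-dimensional array of doubles (the transform)
    CODE     a short wrapper that calls the library routine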

48
Network Weather Service
  • Supports grid technologies.
  • Uses sensor processes to monitor CPU loads and
    network traffic.
  • Uses statistical models on the collected data to
    generate a forecast of future behavior (see the
    sketch below).
  • NetSolve is currently integrating NWS into its
    agent.
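As a toy version of that idea (hypothetical; NWS actually selects among several statistical predictors), exponential smoothing turns a series of sensor measurements into a forecast of the next value:

    # Toy version of NWS-style forecasting (hypothetical; NWS
    # chooses among several statistical predictors): exponential
    # smoothing of periodic CPU-load samples into a prediction.
    def forecast(measurements, alpha=0.3):
        estimate = measurements[0]
        for m in measurements[1:]:
            estimate = alpha * m + (1 - alpha) * estimate
        return estimate

    cpu_load = [0.20, 0.25, 0.90, 0.40, 0.35]  # from a sensor process
    print("predicted next load: %.2f" % forecast(cpu_load))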

49
Gridware Collaborations
  • NetSolve is using Globus' "Heartbeat Monitor" to
    detect failed servers.
  • A NetSolve client is now in testing that allows
    access to Globus.
  • Legion has adopted NetSolve's client-user
    interface to leverage its metacomputing
    resources.
  • The NetSolve client uses Legion's data-flow
    graphs to keep track of data dependencies.

50
Gridware Collaborations
  • NetSolve can access Condor pools among its
    computational resources.
  • IBP-enabled clients and servers allow NetSolve to
    allocate and schedule storage resources as part
    of its resource brokering. This improves fault
    tolerance.

51
Ongoing Research
  • Motivation
  • Special Projects
  • Ongoing work at Tennessee
  • General Issues
  • Open questions of interest to the entire research
    community

52
Motivation
  • Computer speed doubles every 18 months
  • Network speed doubles every 9 months

(Graph from Scientific American, Jan. 2001, by Cleo
Vilett; source: Vinod Khosla, Kleiner, Caufield and
Perkins)
53
Special Projects
  • The SInRG Project.
  • Grid Service Clusters (GSCs)
  • Data Switches
  • Incorporating Hardware Acceleration.
  • Unbridled Parallelism
  • SETI@home and Folding@home
  • The Vertex Cover Solver
  • Security.

54
The SInRG Project
55
The Grid Service Cluster
  • The basic grid building block.
  • Each GSC will use the same software
    infrastructure as is now being deployed on the
    national Grid, but tuned to take advantage of the
    highly structured and controlled design of the
    cluster.
  • Some GSCs are general-purpose and some are
    special-purpose.

56
The Grid Service Cluster
57
An advanced data switch
  • The components that make up a GSC must be able to
    access each other at very high speeds and with
    guaranteed Quality of Service (QoS).
  • Links of at least 1 Gbps assure QoS in many
    circumstances simply by over-provisioning.

58
Computational Ecology GSC
  • Collaboration between computer science and
    mathematical ecology.
  • 8-processor Symmetric Multi-Processor (SMP).
  • Initial in-core memory (RAM) is approximately 4
    gigabytes.
  • Out-of-core data storage unit provides a minimum
    of 450 gigabytes.

59
Medical Imaging GSC
  • Collaboration between computer science and the
    medical school.
  • High-end graphics workstations.
  • Distinguished by the need to have these
    workstations attached as directly as possible to
    the switch to facilitate interactive manipulation
    of the reconstructed images.

60
Molecular Design GSC
  • Collaboration between computer science and
    chemical engineering.
  • Data visualization laboratory
  • 32 dual processors
  • High performance switch

61
Machine Design GSC
  • Collaboration between computer science and
    electrical engineering.
  • 12 Unix-based CAD workstations.
  • 8 Linux boxes with Pilchard boards.
  • Investigating the potential of reconfigurable
    computing in grid environments.

62
Machine Design GSC
63
Types of Hardware
  • General purpose: hardware that can implement any
    function
  • ASICs: hardware that can implement only a
    specific application
  • FPGAs: reconfigurable hardware that can
    implement any function

64
The FPGA
  • FPGAs offer reprogrammability, which allows an
    optimal logic design of each function to be
    implemented.
  • Hardware implementations offer acceleration over
    software implementations that run on
    general-purpose processors.

65
The Pilchard Environment
  • Developed at the Chinese University of Hong Kong.
  • Plugs into a 133 MHz RAM DIMM slot and is an
    example of programmable active memory.
  • Pilchard is accessed through memory read/write
    operations (sketched below).
  • Higher bandwidth and lower latency than other
    environments.
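Memory-mapped access of this kind can be sketched as follows; the device path and register offsets below are made up for illustration, and the real Pilchard interface and register layout differ.

    # Hypothetical sketch of programmed I/O to a memory-mapped
    # device, in the spirit of Pilchard's read/write interface.
    # The device file and register layout are illustrative only.
    import mmap, os, struct

    fd = os.open("/dev/pilchard0", os.O_RDWR)   # hypothetical device
    regs = mmap.mmap(fd, 4096)                  # map one page of registers

    regs[0:4] = struct.pack("<I", 0xDEADBEEF)   # write operand register
    (result,) = struct.unpack("<I", regs[4:8])  # read result register
    print(hex(result))

    regs.close()
    os.close(fd)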

66
Objectives
  • Evaluate utility of NetSolve gridware.
  • Determine effectiveness of hardware acceleration
    in this environment.
  • Provide an interface for the remote use of FPGAs.
  • Allow users to experiment and gauge whether a
    given problem would benefit from hardware
    acceleration.

67
Sample Implementations
  • Fast Fourier Transform (FFT)
  • Data Encryption Standard algorithm (DES)
  • Image backprojection algorithm
  • A variety of combinatorial algorithms

68
Implementation Techniques
  • Two types of functions are implemented
  • Software version: runs on the PC's processor
  • Hardware version: runs in the FPGA
  • To implement the hardware version of the
    function, VHDL code is needed
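A hypothetical wrapper makes the pairing concrete: one entry point, two interchangeable implementations, chosen at call time (the FFT example and all names are illustrative):

    # Hypothetical sketch of the two-version pattern: a software
    # and a hardware implementation of the same function, selected
    # at call time.
    import cmath

    def fft_software(signal):
        # Pure-software version (a naive O(n^2) DFT) run on the
        # PC's processor.
        n = len(signal)
        return [sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(signal)) for k in range(n)]

    def fft_hardware(signal):
        # Would load the FFT configuration file onto the FPGA and
        # stream the data through it; stubbed here.
        raise NotImplementedError("no FPGA attached")

    def fft(signal, fpga_available=False):
        impl = fft_hardware if fpga_available else fft_software
        return impl(signal)

    print(fft([1, 0, 0, 0]))   # falls back to the software version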

69
The Hardware Function
  • Implemented in VHDL or some other hardware
    description language.
  • The VHDL code is then mapped onto the FPGA
    (synthesis).
  • CAD tools help make mapping decisions based on
    constraints such as chip area, I/O pin counts,
    routing resources and topologies, partitioning,
    and resource-usage minimization.

70
The Hardware Function
  • Result of synthesis is a configuration file (bit
    stream).
  • This file defines how the FPGA is to be
    reprogrammed in order to implement the new
    desired functionality.
  • To run, a copy of the configuration file must be
    loaded on the FPGA.

71
Behind the Scenes
(Diagram: VHDL code -> configuration file -> software
and hardware functions -> PDFs and libraries)
72
Conclusions
  • Hardware acceleration is offered to both local
    and remote users.
  • Resources are available through an efficient and
    easy-to-use interface.
  • A development environment is provided for
    devising and testing a wide variety of software,
    hardware and hybrid solutions.
  • Unbridled parallelism
  • Sometimes the overhead of gridware is unneeded
  • Well-known examples include SETI@home and
    Folding@home