International Virtual Data Grid Laboratory: Project Overview (presentation transcript)
1
  • International Virtual Data Grid Laboratory: Project Overview

Paul Avery, University of Florida (avery_at_phys.ufl.edu)
NSF Review, Washington, DC, February 10, 2004
2
iVDGL Goals
  • Deploy a Grid laboratory
    • Support the research mission of data-intensive experiments
    • Provide computing and personnel resources at university sites
    • Provide a platform for computer science technology development
    • Prototype and deploy a Grid Operations Center (iGOC)
  • Integrate Grid software tools into the computing infrastructures of the experiments
  • Support delivery of Grid technologies
    • Hardening of the Virtual Data Toolkit (VDT) and other middleware technologies developed by GriPhyN and other Grid projects
  • Education and Outreach
    • Lead and collaborate with Education and Outreach efforts
    • Provide tools and mechanisms for underrepresented groups and remote regions to participate in international science projects

3
Original iVDGL Science Drivers
  • US-ATLAS, US-CMS (LHC experiments)
    • 100s of Petabytes
  • LIGO (gravitational wave search)
    • 100s of Terabytes
  • Sloan Digital Sky Survey
    • 10s of Terabytes
  • Future Grid resources
    • Massive CPU (PetaOps)
    • Large distributed datasets (>100 PB)
    • Global communities (1000s)

4
Global LHC Data Grid Hierarchy
[Diagram] CMS data flow through the tiered Grid hierarchy:
  • CMS Experiment Online System → Tier 0 (CERN Computer Center) at 0.1-1.5 GBytes/s
  • Tier 0 → Tier 1 at 10-40 Gb/s
  • Tier 1 → Tier 2 at 2.5-10 Gb/s
  • Tier 2 → Tier 3 (physics caches) at 1-2.5 Gb/s
  • Tier 3 → Tier 4 (PCs) at 1-10 Gb/s
  • Scale: 10s of Petabytes/yr by 2007-8; 1000 Petabytes in < 10 yrs?
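As a rough plausibility check (added here, not part of the original slide), the quoted Tier 0 ingest rate can be converted into an annual data volume; the sketch below assumes a 100% duty cycle, which overstates what a real running year delivers.

    # Back-of-the-envelope check of the quoted rates (illustrative only).
    SECONDS_PER_YEAR = 365 * 24 * 3600  # ~3.15e7 s

    def yearly_volume_pb(rate_gb_per_s, duty_cycle=1.0):
        """Annual data volume in petabytes for an ingest rate in GBytes/s."""
        return rate_gb_per_s * duty_cycle * SECONDS_PER_YEAR / 1e6  # 1 PB = 1e6 GB

    for rate in (0.1, 1.5):  # GBytes/s range quoted for Online System -> Tier 0
        print(f"{rate} GB/s -> {yearly_volume_pb(rate):.0f} PB/yr at full duty cycle")
    # Prints roughly 3 and 47 PB/yr, consistent with "10s of Petabytes/yr
    # by 2007-8" once a realistic running fraction is applied.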
5
iVDGL Summary
  • iVDGL participants
    • Physicists from 4 frontier physics/astronomy experiments
    • Computer science support teams (Globus, Condor, VDT, ...)
  • iVDGL basics (2001-2006)
    • Funded by NSF ITR through the Physics Division
    • $13.65M (NSF) + $2M (matching)
    • 18 universities, SDSC, 4 labs, 100 people
    • Integrated Outreach effort (led by UT Brownsville)
  • iVDGL management
    • Paul Avery (Florida), co-Director
    • Ian Foster (Chicago), co-Director
    • Rob Gardner (Chicago), Project Coordinator
    • Jorge Rodriguez (Florida), Deputy Coordinator

6
iVDGL Institutions
  • U Florida
  • Boston
  • Caltech
  • Chicago
  • Harvard
  • Hampton
  • Indiana
  • Johns Hopkins
  • Penn State
  • Salish Kootenai
  • UC San Diego
  • Southern California
  • Texas, Austin
  • Texas, Brownsville
  • Wisconsin, Milwaukee
  • Wisconsin, Madison
  • Argonne
  • Brookhaven
  • Fermilab
  • LBL
  • SDSC
  • SLAC
  • New sites
  • Vanderbilt (Summer 2003)
  • Iowa (Fall 2003)
  • Michigan (March 2004?)
  • FIU (March 2004?)
  • U Buffalo (March 2004?)

Funded by iVDGL
7
iVDGL Sites (February 2004)
[Map] Sites: SKC, Boston U, UW Milwaukee, Michigan, PSU, UW Madison, BNL, Fermilab, LBL, Argonne, Iowa, Chicago, J. Hopkins, Indiana, Hampton, Caltech, ISI, Vanderbilt, UCSD, UF, Austin, FIU, Brownsville
  • Partners
    • EU
    • Brazil
    • Korea
8
Roles of iVDGL Institutions
  • U Florida: CMS (Tier2), Management
  • Caltech: CMS (Tier2), LIGO (Management)
  • UC San Diego: CMS (Tier2), CS
  • Indiana U: ATLAS (Tier2), iGOC (operations)
  • Boston U: ATLAS (Tier2)
  • Harvard: ATLAS (Management)
  • Wisconsin, Milwaukee: LIGO (Tier2)
  • Penn State: LIGO (Tier2)
  • Johns Hopkins: SDSS (Tier2), NVO
  • Vanderbilt: BTeV (Tier2)
  • Chicago: CS, Coord./Management, ATLAS (Tier2)
  • Southern California: CS
  • Wisconsin, Madison: CS
  • Texas, Austin: CS
  • Salish Kootenai: LIGO (Outreach, Tier3)
  • Hampton U: ATLAS (Outreach, Tier3)
  • Texas, Brownsville: LIGO (Outreach, Tier3)
  • Fermilab: CMS (Tier1), SDSS, NVO
  • Brookhaven: ATLAS (Tier1)

9
iVDGL Budget (9/2001 - 9/2006)
10
iVDGL Budget: Major Activities
11
iVDGL Management and Coordination
[Organization chart] Project Directors; Advisory Committee; GriPhyN and Collaborating Grid Projects; Project Steering Group; Project Coordination Group; work teams: Facilities, Core Software, Operations, Applications, GLUE Interoperability, Outreach
12
Integration of iVDGL and GriPhyN
  • Both NSF funded, overlapping periods
    • GriPhyN: $11.9M (NSF) + $1.6M (match) (9/2000 - 9/2005)
    • iVDGL: $13.7M (NSF) + $2M (match) (9/2001 - 9/2006)
  • Basic composition
    • GriPhyN: 12 universities, SDSC, 3 labs (80 people)
    • iVDGL: 18 institutions, SDSC, 4 labs (100 people)
    • Large overlap: people, institutions, experiments
  • GriPhyN (Grid research) vs. iVDGL (Grid deployment)
    • GriPhyN: 2/3 CS + 1/3 physics (~0% hardware)
    • iVDGL: 1/3 CS + 2/3 physics (20% hardware)
  • Many common elements
    • Common Directors, Advisory Committee, linked management
    • Common Virtual Data Toolkit (VDT)
    • Common Grid testbeds
    • Common Outreach effort

13
Management
  • Challenges from a large, dispersed, diverse project
    • 100 people
    • 15 funded institutions + several unfunded ones
    • Multi-culturalism: CS, 5 experiments
    • Different priorities and risk equations
  • Project coordinators have helped tremendously
    • Rob Gardner: iVDGL Coordinator (Chicago)
    • Jorge Rodriguez: iVDGL Deputy Coordinator (Florida)
    • Ruth Pordes: iVDGL Interim Coordinator, PPDG Coordinator
    • Mike Wilde: GriPhyN Coordinator (UC/Argonne)
    • Rick Cavanaugh: GriPhyN Deputy Coordinator (Florida)

14
Management (cont.)
  • Internal coordination
    • Many meetings, telecons, workshops, etc.
    • Presence of experiment and CS leaders in the Steering Group
    • Planning and milestones closely tied to those of the experiments
  • External coordination
    • Trillium: GriPhyN, PPDG
    • National: TeraGrid, Globus, NSF, SciDAC, ...
    • International: LCG, EDG, GGF, HICB, ...
    • Networks: Internet2, ESNET, ICFA-SCIC
    • Industry trends: OGSA, web services, etc.
    • Highly time dependent
    • Requires lots of travel, meetings, energy

15
Work Teams
  • Facilities Team
    • Hardware (Tier1, Tier2, Tier3)
  • Core Software Team
    • Grid middleware, toolkits (from US and EU)
  • Laboratory Operations Team (iGOC)
    • Operations, coordination, software support, monitoring
  • Applications Team
    • High energy physics, gravity waves, digital astronomy, etc.
  • Education and Outreach Team
    • Web tools, curriculum development, involvement of students
    • Integrated with GriPhyN
    • Connections to other projects (PPDG, CHEPREO, NPACI-EOT, ...)

16
External Advisory Committee
  • Members
    • Fran Berman (SDSC Director)
    • Dan Reed (NCSA Director)
    • Joel Butler (former head, FNAL Computing Division)
    • Jim Gray (Microsoft)
    • Bill Johnston (LBNL, DOE Science Grid)
    • Fabrizio Gagliardi (CERN, EDG Director)
    • David Williams (former head, CERN IT)
    • Roscoe Giles (Boston U, NPACI-EOT)
    • Paul Messina (resigned, Dec. 2003)
  • Met with us 3 times: 4/2001, 1/2002, 1/2003 (1/2004 meeting postponed to April/May)
  • Extremely useful guidance on project scope and goals

17
Students
  • Involved in many aspects of iVDGL
    • J. Zamora (B.S., UT Brownsville): Outreach, LIGO
    • A. Gupta (M.S., UFL): Web pages
    • A. Zahn (B.S., UFL): Info. systems, web pages
    • L. Sorrillo (B.S., Hampton): Grid2003
    • S. Morriss (B.S., UT Brownsville): LIGO, Grid2003
    • A. Martinez (B.S., UT Brownsville): LIGO, web pages
    • P. Peiris (B.S., UT Brownsville): LIGO, web pages
  • Theses
    • Sonal Patil (M.S., USC): Pegasus, 10/2003
    • Sarp Oral (Ph.D., UFL): Grid networking, 12/2003
    • S. Pena (M.S., UT Brownsville): LIGO analysis, 9/2004
    • R. Subramaniyan (Ph.D., UFL): Grid monitoring, 12/2005
    • R. Balasubramanian (Ph.D., UFL): Grid networking, 12/2006

18
Students (2)
  • GriPhyN students contributing (using iVDGL resources)
    • K. Ranganathan (Ph.D., Chicago): Data placement
    • C. Dumitrescu (Ph.D., Chicago): Policy models, enforcement
    • Y. Zhou (Ph.D., Chicago): Grid portal (outreach)
    • J. Uk In (Ph.D., Florida): Scheduling, policy algorithms
  • Students from other projects
    • A. Rodriguez (Ph.D., Illinois): Bioinformatics
  • QuarkNet
    • Paid for 3 students to spend time at Fermilab

19
iVDGL Project Pages Heavily Used
  • Database driven
    • Participants
    • Documents
    • Meetings
    • News items
  • Remote editing of pages
    • News items
    • Meeting agenda builder
    • Participant info
    • Talks at meetings
    • Documents
  • Linked pages
    • iVDGL, GriPhyN, Grid2003

20
GriPhyN/iVDGL Outreach Web Site
  • Basic information
    • Grid info
    • Physics experiments
    • EO activities
    • Documents
  • Technical support
    • How-to guides
  • Other
    • Student experiences

www.griphyn.org/outreach
21
GriPhyN Grid2003 Pages
22
Global Context: Data Grid Projects
Collaborating Grid infrastructure projects
  • U.S. projects
    • GriPhyN (NSF)
    • iVDGL (NSF)
    • Particle Physics Data Grid (DOE)
    • PACIs and TeraGrid (NSF)
    • DOE Science Grid (DOE)
    • NEESgrid (NSF)
    • NSF Middleware Initiative (NSF)
  • EU, Asia projects
    • European Data Grid (EU)
    • EDG-related national projects
    • DataTAG (EU)
    • LHC Computing Grid (CERN)
    • EGEE (EU)
    • CrossGrid (EU)
    • GridLab (EU)
    • Japanese, Korean projects
  • >$200M?
  • Two major clusters: US and EU

23
U.S. Trillium Grid Projects
  • Trillium = PPDG + GriPhyN + iVDGL
    • PPDG: $10M (DOE) (1999-2004)
    • GriPhyN: $12M (NSF) (2000-2005)
    • iVDGL: $14M (NSF) (2001-2006)
  • Basic composition (150 people)
    • PPDG: 4 universities, 6 labs
    • GriPhyN: 12 universities, SDSC, 3 labs
    • iVDGL: 18 universities, SDSC, 4 labs, foreign partners
    • Experiments: BaBar, D0, STAR, JLab, CMS, ATLAS, LIGO, SDSS/NVO
  • Complementarity of projects
    • GriPhyN: CS research, Virtual Data Toolkit (VDT) development
    • PPDG: End-to-end Grid services, monitoring, analysis
    • iVDGL: Grid laboratory deployment using VDT
    • Experiments provide frontier challenges

24
Trillium Project Coordination
  • Trillium participants
    • Large overlap in leadership, people, experiments
  • Benefits of coordination
    • Common software base and packaging: VDT + Pacman
    • Collaborative / joint projects: monitoring, demos, security, ...
    • Wide deployment of new technologies, e.g. Virtual Data
    • Stronger, broader outreach effort
  • Forum for US Grid projects
    • Joint strategies, meetings and work
    • Unified U.S. entity to interact with international Grid projects
  • Build significant Grid infrastructure: Grid2003

25
  • Grid2003: An Operational Grid
    • 28 sites (2100-2800 CPUs)
    • 400-1300 concurrent jobs
    • 10 applications
    • Running since October 2003

http://www.ivdgl.org/grid2003
26
Grid2003 Participants
[Diagram] Participating organizations: US-ATLAS, US-CMS, iVDGL, GriPhyN, PPDG, DOE Labs, Biology, Korea
27
Grid2003: Three Months of Usage
28
Grid2003 Success
  • Much larger than originally planned
    • More sites (28), CPUs (2800), simultaneous jobs (1300)
    • More applications (10) in more diverse areas
  • Able to accommodate new institutions and applications
    • U Buffalo (Biology): Nov. 2003
    • Rice U. (CMS): Feb. 2004
  • Continuous operation since October 2003
    • Strong operations team (iGOC at Indiana)
  • US-CMS using it for production simulations (next slide)

29
Production Simulations on Grid2003
[Chart] US-CMS Monte Carlo simulation on Grid2003: yield of 1.5 × dedicated US-CMS resources (USCMS and non-USCMS contributions shown)
30
Grid2003: A Necessary Step
  • Learning how to operate a Grid
    • Add sites, recover from errors, provide info, update, test, etc.
    • Need tools, services, procedures, documentation, organization
    • Need reliable, intelligent, skilled people
  • Learning how to cope with large scale
    • Interesting failure modes as scale increases
    • Increasing scale must not overwhelm human resources
  • Learning how to delegate responsibilities
    • Multiple levels: Project, Virtual Org., service, site, application
    • Essential for future growth
  • Grid2003 experience critical for building useful Grids
    • Frank discussion in the "Grid2003 Project Lessons" doc

31
Grid2003 and Beyond (1)
  • Further evolution of Grid3 (Grid3+, etc.)
    • Contributes to a persistent Grid
    • A development Grid for testing new software releases
    • New releases integrated into the persistent Grid over time
    • Participates in LHC data challenges
  • Involvement of new sites
    • New institutions and experiments
    • New international partners (e.g., Brazil, Taiwan, ...)
  • Improvements in Grid middleware and services
    • Integrating multiple VOs
    • Monitoring
    • Troubleshooting

32
Grid2003 and Beyond (2)
  • Coordination with the LHC Computing Grid
    • Federating Grid2003 resources with LCG
    • Participate in global data challenge exercises
  • Involvement of other disciplines
    • CS, LIGO, Astronomy, Biology, ...
  • Coordinated education/outreach
    • QuarkNet, GriPhyN, iVDGL, PPDG, CMS, ATLAS
    • CHEPREO center at Florida International U
    • Digital Divide efforts (Feb. 15-20 Rio workshop)

33
Open Science Grid
  • Goal: Build an integrated Grid infrastructure
    • Support the US-LHC research program and other scientific efforts
    • Resources from laboratories and universities
    • Federate with the LHC Computing Grid
  • Getting there: OSG-1 (Grid3), OSG-2, ...
    • Series of releases → increasing functionality and scale
    • Constant use of facilities for LHC production computing
  • Other steps
    • Jan. 12 meeting for public input, planning
    • White paper to be expanded into a roadmap
  • iVDGL role
    • Same project deliverables, but in a larger context
    • Laboratory activities, Tier2 centers, operations, outreach

34
Inputs to Open Science Grid
[Diagram] Inputs to the Open Science Grid: Technologists, Trillium, US-LHC, University facilities, Laboratory centers, Multi-disciplinary facilities, Computer Science, Education community, Other science applications
35
Extra Slides
36
New Experiments and Institutions
  • BTeV
    • Fermilab experiment, startup in 2008
    • Vanderbilt and Fermilab represent the BTeV effort within iVDGL
    • Intriguing idea of using the Grid for an online Level 4 trigger
  • Biology
    • U Buffalo
    • Under consideration
  • Existing HENP experiments within PPDG
    • D0, CDF, STAR, JLab experiments
    • Each has a petabyte of data
    • iVDGL involvement through its Trillium partner PPDG
    • Still being discussed