CMS MC Production over USCMS Integration Grid Testbed


1
CMS MC Production over USCMS Integration Grid Testbed
  • Anzar Afaq
  • Fermilab

2
From TB to IGT
  • Test, Integration and Production Grids
  • Separation of development and Production TBs
  • Still a Testbed
  • Tier-II resources combined into Grid effort
  • Over 240 CPUs
  • FNAL, UCSD, Caltech, UFL
  • With UWI providing middleware support
  • CERN/LCG will be joining soon
  • Using almost all available resources 24x7

3
Software Components
  • Using MCRunJob for job creation
  • Using MOP for production over the Grid
  • CMSIM 125.3, ORCA 6_2_0
  • VDT 1.1.3 Server and Client
  • Condor and FBSNG batch systems
  • Monitoring tools (Ganglia etc.)
  • Data transfer tools (Tony's scripts!)

4
Progressing
  • Smooth start
  • 10 → 50 → 100 in a week
  • Throttled submission
  • Few sites: 1.5 → 2 → 3 → 4
  • End-to-End
  • Smaller farms
  • End-to-End integration seemed to be working
  • Excellent throughput
  • Achieved 20K per day
  • (and decreasing!)
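
The throttled submission mentioned on this slide can be illustrated with a minimal Python sketch. It is not the MOP or MCRunJob implementation; the function name, the `submit` callable, and the idea of capping the number of in-flight jobs are assumptions made only for illustration.

```python
from collections import deque


def throttled_submit(pending, in_flight, max_in_flight, submit):
    """Submit jobs from `pending` until `max_in_flight` jobs are active.

    `pending` is a deque of job ids, `in_flight` is a set of active job ids,
    and `submit` is a hypothetical callable that hands one job to the grid.
    Returns the list of job ids submitted in this pass.
    """
    submitted = []
    while pending and len(in_flight) < max_in_flight:
        job = pending.popleft()
        submit(job)            # hypothetical hand-off to the submission tool
        in_flight.add(job)     # track the job until it is reported finished
        submitted.append(job)
    return submitted
```

Calling the function again after finished jobs are removed from `in_flight` tops the queue back up to the cap, so the cap can be raised step by step (for example 10, then 50, then 100) as sites prove stable.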

5
Problems unveiled
  • Breakdowns (3 major): scalability?
  • Unforeseen issues (not any more!)
  • Larger submissions (30)
  • Parallel globus-url-copy (g-u-c)
  • NFS timeouts
  • Larger farms
  • Disk management
  • Garbage collection for failing jobs
  • Mainly bottlenecks (NFS, sockets, disk)
  • No dead ends (workarounds exist)
  • Throttling, code adjustments and fixes
  • Cleanup and start again
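
As a rough illustration of garbage collection for failing jobs, the sketch below removes scratch directories left behind by failed jobs. The one-directory-per-job layout, the function name, and its parameters are hypothetical; the slides do not describe how the cleanup was actually done.

```python
import shutil
from pathlib import Path


def collect_failed_job_dirs(scratch_root, failed_job_ids):
    """Delete scratch directories left behind by failed jobs.

    Assumes (hypothetically) one directory per job under `scratch_root`,
    named after the job id. Returns the names of the directories removed.
    """
    removed = []
    for job_id in failed_job_ids:
        job_dir = Path(scratch_root) / str(job_id)
        if job_dir.is_dir():           # skip jobs that left nothing behind
            shutil.rmtree(job_dir)     # free the disk space used by the job
            removed.append(job_dir.name)
    return removed
```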

6
Effort
  • Needs babysitting
  • Monitoring and alarming helps
  • Shift duties
  • Need to spread experience
  • Manpower
  • Missing scheduler at Grid level
  • Manual submission/Resource allocation
  • Improving as we go (cannot avoid that)
  • End-to-end job tracking is hard (tool?)
  • Learning as we go
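
Since disk was one of the bottlenecks and monitoring and alarming is said to help, a disk-usage alarm might look like the sketch below. The function name and the 90% default threshold are assumptions for illustration only; the production setup used tools such as Ganglia, whose configuration is not shown in the slides.

```python
import shutil


def disk_alarm(path, max_used_fraction=0.9):
    """Report whether the filesystem holding `path` is over a usage threshold.

    Returns a tuple (alarm, used_fraction). The threshold is a hypothetical
    default, not a value taken from the presentation.
    """
    usage = shutil.disk_usage(path)              # total, used, free in bytes
    used_fraction = usage.used / usage.total
    return used_fraction >= max_used_fraction, used_fraction
```

A shift operator's script could call this periodically for each scratch area and send a notification whenever the first element of the result is true.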

7
Now and Soon
  • Running egamma big jet production
  • cmkin → oodigi!
  • Recently started h2root (for clarens)
  • Plans to add more CPUs for CMSIM only production.
  • CERN/LCG on its way to join IGT
  • Working closely with Middleware teams
  • Continue production 24x7!