Transcript and Presenter's Notes

Title: User Group Meeting Fall 2007


1
User Group Meeting - Fall 2007
  • University of Notre Dame
  • Center for Research Computing
  • September 17th, 2007

2
Agenda
  • Staff Introduction
  • Hardware Update
  • Software Update
  • End-to-End Network Performance Testing
  • Linux Workstations
  • User Support Resources
  • User Statistics
  • Outreach
  • OSG, NWICG, and Condor
  • Future Plans

3
Staff Introduction
4
Hardware Update
  • Rich Sudlow

5
Hardware Update
  • Additional 2 GB RAM added to 72 x2100 systems
    (dcopt073 - dcopt144)
  • All Xeon systems (128 nodes / 256 CPUs, 2 GB RAM)
    replaced with Opteron systems (144 nodes / 576 CPUs,
    8 GB RAM), ddcopt001 - ddcopt144. All systems
    support remote management (IPMI)
  • SMC switches replaced with Extreme Networks 450e
    switches stacked with 2 x 10 Gb loops and a 10 Gb
    uplink. 2 x 1 Gb Ethernet connections per system.

6
Hardware Update - Systems
  • 36 nodes: Dell SC1435, 2 x AMD Model 2218 CPUs
    (2.6 GHz, dual-core) (ddcopt), 8 GB RAM, 80 GB
    SATA disk, 1 rack
  • Dell donation of opteronA for additional
    interactive use.

7
Hardware Update - Systems
  • 108 nodes: Sun X2200 systems, 2 x AMD Model 2218
    CPUs (2.6 GHz, dual-core), 8 GB RAM, 146 GB 15K
    SAS disk, 3 racks

8
(No Transcript)
9
(No Transcript)
10
(No Transcript)
11
(No Transcript)
12
(No Transcript)
13
Hardware Update - Network
  • Addition of an Extreme Networks Black Diamond 8810
    core switch provides 10 Gb network connectivity
    within CRC's Union Station facility

14
Hardware Update - Storage
  • 4-server NetApp GX 3050, 8 Gb network interfaces
  • 56 TB raw - 14 TB per server (28 x 500 GB SATA
    disks)
  • 30 TB available after formatting, WAFL, etc.
  • Currently used for distributed scratch space
    (/dscratch) on CRC resources.

15
Hardware Update - Storage: Sun's x4500 Thumper
  • Inexpensive disk space: $25 - 35K for 24 TB.
    Multipurpose server: 4-processor Opteron, 16 GB
    RAM. Could be used for NFS fileservers / OpenAFS
    servers / Lustre. ZFS; 10 Gb network interfaces
    available. Runs Solaris 10 or Red Hat 4U4.
    Purchased the first one 11/2006, another in 1/2007.

16
Storage - Filesystems evaluated this past year
  • IBRIX - used at Purdue; talked to the vendor
    numerous times; on-site presentation. Purdue uses
    IBRIX/NFS translators to avoid purchasing IBRIX
    clients.
  • Lustre - evaluation with Data Direct hardware
    (8500) using 4 Sun x4100 Object Store Targets
    (OSTs). Great for large file transfers; limited
    client support. A number of issues being addressed
    now or soon: quotas, module load, kernel.
    Recently purchased by Sun.
  • HP's Scalable FS - HP's implementation of Lustre.
    Lagging features; tied to a single vendor for
    hardware and software.
  • NetApp GX Beta - looks and feels a LOT like AFS,
    but with NFS. Purchased a 4-server 3050 GX using
    SATA drives 9/2006. Currently used for distributed
    scratch space in the CRC. Security is an issue for
    clients outside the CRC.
  • OpenAFS - the old reliable standard.
  • Some of these filesystems are clear choices at BIG
    sites, but what's the right fit at Notre Dame,
    with a small but growing research community and
    limited staff? Typical research is done on fast
    desktops and also supercomputers. What filesystem
    works best for that? OpenAFS.
  • Currently using local scratch, OpenAFS, and NetApp
    GX (/dscratch)

17
Hardware Update - Storage: Sun's x4500 Thumper
  • Paper presented at the OpenAFS Best Practices
    Workshop 2007 at SLAC:
    http://www.nd.edu/rich/afsbpw2007
  • Currently starting to use Thumpers in Notre
    Dame's production cell
  • Discussions of a new AFS cell in the Advanced
    Computing Technologies team - a number of complex
    technical and resource questions.
  • Advanced Computing Technologies team: system
    administrators from across campus (Engineering,
    Science, Arts & Letters, COBA, University
    Library, OIT, CRC). Meets biweekly, Friday A.M. -
    join us: join the listserv ND-ACT@listserv.nd.edu

18
Hardware Update Part II
19
Software Update
  • J.C. Ducom

20
Software Update
  • User feedback on the module list
  • Confusion about which module to use
  • Modules take care of the CPU architecture and of
    the dependencies with other modules (e.g. Amber) -
    see the sketch below
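For scripted workflows, the same module environment can be set up from
Python. A minimal sketch, assuming the environment-modules package's
modulecmd binary with python output support; the modulecmd path and the
"amber" module name below are placeholders, not CRC settings:

    # Minimal sketch: loading an environment module from Python, assuming
    # the environment-modules "modulecmd" binary supports python output.
    # The path and module name below are placeholders, not CRC settings.
    import os
    import subprocess

    MODULECMD = "/usr/bin/modulecmd"  # adjust to the local install location

    def module(command, *args):
        # modulecmd prints Python statements (os.environ updates) on stdout;
        # executing them applies the module's settings to this process.
        out = subprocess.run([MODULECMD, "python", command, *args],
                             capture_output=True, text=True, check=True).stdout
        exec(out)

    module("load", "amber")           # module resolves arch and dependencies
    print(os.environ.get("PATH", ""))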

21
  • List too long
  • Modules take care of the CPU architecture
  • Removal of programs that come by default with the
    OS (e.g. Acroread)
  • Math libraries under /opt/und/mathlib
    (e.g. fftw, acml, blas, lapack, etc...)
  • See the CRC Wiki on how to use them

22
  • Software out of date
  • Default software versions will be moved to the
    latest revision of the software unless there are
    major known bugs.
  • Updates in the last two weeks of July and the
    last two weeks of December
  • Current plans are to retain only one "older"
    version of the software unless we hear from users
    (e.g. Matlab)
  • More info at
    http://www.nd.edu/jducom/ACT/module_cleanup.pdf

23
Network Performance Testing
Linux Workstations
  • James Rogers

24
End-to-End Network Performance Testing
  • BWCTL (Bandwidth Controller) - see the throughput
    sketch after this list
  • http://e2epi.internet2.edu/bwctl/index.html
  • NDT (Network Diagnostic Test)
  • http://e2epi.internet2.edu/ndt/index.html
  • http://ndt.helios.nd.edu:7123 (data center)
  • http://ndt.hpcc.nd.edu:7123 (Union Sta.)
  • OWP (One-Way Ping)
  • http://e2epi.internet2.edu/owamp/
  • Indiana GigaPoP
  • http://indiana.gigapop.net
  • http://weathermap.grnoc.iu.edu/gigapop_jpg.html
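These tools automate memory-to-memory throughput and latency tests
between measurement hosts. As a rough illustration of the underlying
idea only (this is not BWCTL or NDT), here is a minimal sketch of a TCP
throughput test; the port, duration, and buffer size are arbitrary
example values:

    # Minimal sketch of a memory-to-memory TCP throughput test. Run
    # "server" on one host and "client <host>" on another; port, duration,
    # and buffer size are arbitrary example values.
    import socket
    import sys
    import time

    PORT = 5201           # arbitrary test port
    DURATION = 10         # seconds to transmit
    CHUNK = b"x" * 65536  # 64 KB send buffer

    def server():
        with socket.socket() as srv:
            srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            srv.bind(("", PORT))
            srv.listen(1)
            conn, addr = srv.accept()
            with conn:
                total, start = 0, time.time()
                while True:
                    data = conn.recv(65536)
                    if not data:
                        break
                    total += len(data)
                elapsed = time.time() - start
                print(f"received {total / 1e6:.1f} MB from {addr[0]} in "
                      f"{elapsed:.1f} s = {total * 8 / elapsed / 1e6:.1f} Mb/s")

    def client(host):
        with socket.socket() as sock:
            sock.connect((host, PORT))
            sent, end = 0, time.time() + DURATION
            while time.time() < end:
                sent += sock.send(CHUNK)
            print(f"sent {sent / 1e6:.1f} MB in {DURATION} s")

    if __name__ == "__main__":
        server() if sys.argv[1] == "server" else client(sys.argv[2])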

25
Sample Weather Map
26
Sample Drill-down Graphs

Last Hour
Last Day
also last week, month and year
27
Linux Workstations
  • Recent software updates
  • Killer script
  • Condor
  • Flock with Purdue and Univ. of Wisconsin

28
User Support Resources
User Statistics
  • Paul Brenner

29
User Support Resources
  • Web page updates
  • Time-sensitive information on the right-hand side
    under Upcoming Events and Announcements
  • New Communications section - from the main page,
    click on Blog, Wiki & Support
  • Blog
  • Wiki
  • Archive of CRCsupport emails

30
User Statistics
  • Utilization statistics since Jan 2006
  • Based on Sun Grid Engine queue data
  • Currently at a granularity of 1 month
  • Does not indicate peak days or hours
  • Statistics are job-based
  • A job is assigned one or more cores (see the
    sketch below)
  • General statistics
  • Multiple users with special needs/requirements

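How such per-month, job-based numbers can be pulled out of Grid Engine is
sketched below. This assumes the colon-delimited accounting file format;
the file path and the field positions used (end time, wallclock, slots)
follow the accounting(5) layout and should be verified against the local
SGE installation:

    # Minimal sketch: monthly job counts and CPU-hours from an SGE
    # accounting file. Assumes colon-delimited records with end_time at
    # field 10, ru_wallclock at 13, and slots at 34 (verify locally).
    import time
    from collections import defaultdict

    ACCOUNTING = "/opt/sge/default/common/accounting"  # adjust path

    jobs = defaultdict(int)
    cpu_hours = defaultdict(float)

    with open(ACCOUNTING) as fh:
        for line in fh:
            if line.startswith("#"):
                continue
            f = line.rstrip("\n").split(":")
            end_time = int(f[10])
            if end_time == 0:          # job did not finish normally
                continue
            wallclock = float(f[13])   # ru_wallclock, in seconds
            slots = int(f[34])         # cores assigned to the job
            month = time.strftime("%Y-%m", time.localtime(end_time))
            jobs[month] += 1
            cpu_hours[month] += wallclock * slots / 3600.0

    for month in sorted(jobs):
        print(f"{month}  jobs={jobs[month]:6d}  "
              f"cpu_hours={cpu_hours[month]:10.1f}")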
31
(No Transcript)
32
CRC Usage
Cluster A maxed out Feb 06
User Training began Jan 07
33
NWICG Usage
New power user
User Training began Jan 07
Users need DoE certs and training to use the
system
Power user graduated
34
  • Notes
  • Top 20 users by jobs and by CPU hours are highly
    disjoint
  • Average number of total users was 64, with a
    steady growth trend
  • Small monthly standard deviation from this average

35
Total Storage: 1,114 GBytes
36
Total Storage: 9,773 GBytes
37
Queue Wait Time
  • After adding Cluster C, average wait times for 16-
    and 32-processor jobs fell from days to hours and
    minutes (see the wait-time sketch after this list)
  • Currently little to no wait time for Opteron jobs
    requesting up to 16 processors.
  • Jobs currently waiting in the queue:
  • Jobs submitted to the Sun v880s
  • Jobs that require sequential processing (results
    of Job 1 must be completed before Job 2 can
    begin)
  • Jobs with license limitations for the software
  • Jobs that do not match the available queues
  • Total job requests/month have steadily climbed
    from 4,900/month (Jan 06) to 17,900/month
    (July 07)

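Queue wait itself is just start_time minus submission_time in the same
accounting records. A minimal sketch under the same assumed field layout
as the statistics sketch above (submission_time at field 8, start_time
at 9, slots at 34); the 16-slot threshold for "large" jobs is only an
example:

    # Minimal sketch: average queue wait for large parallel jobs from the
    # SGE accounting file, under the same assumed field layout as above.
    ACCOUNTING = "/opt/sge/default/common/accounting"  # adjust path
    LARGE = 16  # example threshold for a "large" job, in slots

    waits = []
    with open(ACCOUNTING) as fh:
        for line in fh:
            if line.startswith("#"):
                continue
            f = line.rstrip("\n").split(":")
            submission, start, slots = int(f[8]), int(f[9]), int(f[34])
            if start > 0 and slots >= LARGE:
                waits.append(start - submission)

    if waits:
        print(f"{len(waits)} jobs with >= {LARGE} slots, average wait "
              f"{sum(waits) / len(waits) / 3600.0:.2f} hours")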
38
Outreach
  • Ed Bensman

39
Outreach
  • Training
  • http://crc.nd.edu/information/training.shtml
  • User Assistance
  • CRCsupport@nd.edu
  • User Surveys
  • The Fall 2007 survey will target non-users, to
    determine unmet needs
  • Spring 2007 survey results:
    https://www3.nd.edu/crcs/partners/advisory/survey_results.pdf
  • Grant Support
  • CRC staff have assisted faculty and researchers
    in quantifying the CRC resources available to users
  • Seminars, colloquia, and workshops
  • Monitor the CRC home page for the latest
    information

40
OSG, NWICG, and Condor
  • Ed Bensman

41
OSG, NWICG, and Condor
  • CRC is a member of the Open Science Grid (OSG)
  • In-Saeng Suh and Ed Bensman attended the User
    Group and Sys Admin meeting this summer at
    Fermilab
  • Working with the Engage Virtual Organization (VO)
    to improve the user experience with Grid computing
  • Northwest Indiana Computational Grid (NWICG)
  • New web site: http://www.nwicgrid.org/
  • Wiki format; includes job submission scripts and
    improved documentation
  • Condor
  • Working to install a separate gatekeeper for
    Condor users (should be ready by Spring 2008)

42
Future Plans
  • J.C. Ducom

43
Future Plans
  • Upgrade of the Sun Grid Engine scheduler to 6.1
  • Several new and enhanced features to further
    reduce the wait time and user frustration
  • Timeline: 2-3 months. The transition between the
    two (incompatible) versions will be done via
    module
  • Queue names will not be changed

44
Graphical qstat tool
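A lightweight version of such a tool can be built on qstat's XML output
(available in SGE 6.x). A minimal text-mode sketch that summarises
running and pending slots per user; the XML element names used here
(job_list, JB_owner, slots, and the state attribute) should be checked
against the local qstat output:

    # Minimal sketch of a text-mode qstat summary: running and pending
    # slots per user, parsed from `qstat -u '*' -xml` (SGE 6.x). Verify
    # the element names against the local qstat XML output.
    import subprocess
    import xml.etree.ElementTree as ET
    from collections import Counter

    xml_out = subprocess.run(["qstat", "-u", "*", "-xml"],
                             capture_output=True, text=True,
                             check=True).stdout
    root = ET.fromstring(xml_out)

    running, pending = Counter(), Counter()
    for job in root.iter("job_list"):
        owner = job.findtext("JB_owner", "?")
        slots = int(job.findtext("slots", "1"))
        if job.get("state") == "running":
            running[owner] += slots
        else:
            pending[owner] += slots

    print(f"{'user':<12} {'running':>10} {'pending':>10}")
    for owner in sorted(set(running) | set(pending)):
        print(f"{owner:<12} {running[owner]:>10} {pending[owner]:>10}")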
45
  • Visualization: large high-resolution display
  • Meetings with faculty in AME, Chemistry, Physics,
    Biology, CSE, and Psychology
  • What would be the benefit of an increased-size,
    higher-resolution display for research and
    teaching?
  • Would it help in developing more
    inter-disciplinary collaboration?
  • Would users benefit from it in future grant
    proposals or during site visits from DoD/DoE?
  • How often would users use it, knowing that it will
    be located remotely from their building of origin?

46
  • Proof-of-concept LCD-tiled display

HiperSpace at UCSD: 220M pixels (55 screens)
  • ND visualization display
  • 27-34 M pixels
  • 7-8.5 feet long
  • 3 feet high

47
  • All-in-one purposes
  • Scientific visualization
  • To view data at human-scale physical sizes
  • To view large amounts of data simultaneously
  • Nodes will be loaded with scientific
    visualization tools (avs, vis5d, tecplot, vmd,
    vtk, etc...)
  • Render farm
  • Plugged into SGE 6.1 with special queues for
    rendering jobs (see the sketch after this list)
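Routing render work to those queues is a matter of submitting with
qsub's -q option. A minimal sketch; the "render.q" queue name and the
render_frame.sh script are hypothetical placeholders, not actual CRC
configuration:

    # Minimal sketch: submit a batch of frame-rendering jobs to a
    # dedicated SGE queue with `qsub -q`. Queue name and script are
    # hypothetical placeholders.
    import subprocess

    QUEUE = "render.q"          # hypothetical rendering queue
    SCRIPT = "render_frame.sh"  # hypothetical per-frame render script

    for frame in range(1, 101):
        subprocess.run(
            ["qsub", "-q", QUEUE,          # route to the rendering queue
             "-N", f"render_{frame:04d}",  # job name
             SCRIPT, str(frame)],          # script argument: frame number
            check=True)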

48
  • Collaboration and education
  • Possibility to share data on the common display

49
  • Develop new research projects
  • visualization tools, user interface
    evaluation/performance, etc...

Timeline: 2-3 months. Our objective with this
proof-of-concept is to evaluate whether such a
visualization display can really be used as a
productive tool for research, teaching, and grant
proposal writing by CRC users. Suggestions and
critiques are more than welcome!