DSG Database Administration Projects Presentation for Experiments March 10, 2004 CD Projects Status - PowerPoint PPT Presentation

About This Presentation

Slides: 18
Provided by: bobfo3
Transcript and Presenter's Notes

1
DSG Database Administration Projects
Presentation for Experiments, March 10, 2004
CD Projects Status Meeting
2
Success Stories of the last 6 Months
  • CDF Hardware Replacement, Db upgrade
  • Delivered Dec. 1, 2003
  • Deliverables: Sun v880 to replace fcdfora1,
    upgrade of databases to Oracle 9i.
  • D0 Hardware Replacement, Db upgrade
  • Delivered July 2003
  • Deliverables: Sun v880 to replace d0ora1,
    upgrade of databases to Oracle 9i.
  • Minos using sam in development, hardware plans
    underway.
  • Upgrade to linux 7.3 on dbserver machines.
  • Deployment of calibration servers.
  • Consolidated all Oracle designer users to
    Designer 9i on fncduh1.
  • Investigation into use of DataGuard for cdf
    online complete. Continuing standby requirements
    and investigation with cdf.
    http://fcdfhome.fnal.gov/usr/rlc/standby/standby_04Mar04.ps

3
Database Support
  • Continued database support in a proactive manner
    to minimize down time.
  • Practice backup and recoveries.
  • Maintain toolman.
  • Keep Oracle Enterprise Manager current.
  • Apply security patches.
  • Tune database and queries.
  • General user support.
  • Moving toward incorporating linux configurations
    to our support matrix.

4
Database Support cont.
  • Production: 24x7 with primary and secondary on call
  • 4 machines
  • Development / Integration: 9x5 with primary and
    secondary
  • 7 machines: 5 (dev), 3 (int), 1 (cdfval), 2 (testbeds)
  • 3 DBAs
  • Anil Kumar - CDF/D0
  • R.Jetton - D0/CDF
  • Nelly Stanfield - CDF
  • S.Lebedeva - freeware
  • J.Trumbo - anywhere as needed
  • 2.5 sysadmins
  • Jeff Schmidt - CDF Online
  • Steven Kovich - CDF/D0
  • Richard Jetton - CDF/D0 ½ time
  • M.Mihalek - CDF/D0 as needed

5
Maintenance and continuing support of sam for d0,
cdf, minos, cms
  • DBA(s): A.Kumar, J.Trumbo, R.Jetton
  • Stakeholder Orgs: cdf, d0, minos, cd
  • Ongoing maintenance and support for the sam
    databases, including:
  • Schema modifications for db server rewrite are
    complete.
  • Created mini sam, to allow developers a small,
    empty schema for testing.
  • Assisted minos with setup of sam, and hardware
    proposals.
  • Cross training of R.Jetton as dba.
  • Work on chains and links development.
  • Continue working with the sam development team.
  • Comment: lost D.Bonham in Aug. 2003, plan to
    refind her Jan. 2005.

6
Implement Oracle Streams at CDF
  • Replace oracle's basic replication process with
    the oracle streams process for replication of cdf
    online and offline data. Oracle streams will
    allow for additional replicas to be established
    and seamless database schema cuts.
  • DBA(s): A.Kumar, N.Stanfield
  • Stakeholder Orgs: cdf, cd
  • The Streams project has been extended due to bugs
    in the streams software and problems when running
    in conjunction with basic replication. New hardware
    was delivered Dec. 2003. Streams was installed on
    the new hardware and new patches were tested
    Jan./Feb. 2004. Oracle now says the problem is
    running streams and basic replication
    simultaneously. A new plan has been developed to
    test that theory.
  • Plans:
    http://www-css.fnal.gov/dsg/internal/ora_repl/Streams_Status_03_04_2004_file
    http://www-css.fnal.gov/dsg/internal/ora_repl/stream_rep_dev_strategy.htm

7
Implement Oracle Advanced Security Option
  • Implement Oracle Advanced Security Option on
    oracle databases.
  • Stakeholder Orgs: Organizations using oracle
    databases.
  • Oracle Advanced Security Option will provide a
    kerberized logon to oracle databases. We would
    like to find additional resources for this.
    R.Jetton has done some initial work. This project
    will probably take 6-9 months to establish and
    implement.
  • A consultant from oracle provided proof of
    product in fall 2002.
  • http://www-css.fnal.gov/dsg/internal/ora_adm/index.htm

8
Minos
  • Implement a dev/int/prd database for minos,
    including the writing of an mou between minos
    and cd.
  • DBA(s): J.Trumbo, N.Stanfield
  • Stakeholder Orgs: cd, minos
  • Liz Buckley-Geer has begun drafting an mou. This
    mou should detail the longer term maintenance
    agreement between cd and minos involving the
    support and maintenance of minos databases.
  • Dsg will be providing both Oracle database and
    system administration support.
  • Dsg has assisted in getting a prototype sam
    station established using a mini sam schema on
    the d0 development database.

9
Minos
  • Dsg has also drafted hardware recommendations
    for both linux and sun solutions. We are
    initially working toward linux on sun, using
    solaris as a fallback.
  • Dsg is currently benchmarking a demo sun v20,
    opteron, rh 3.0 linux box for purchase.
  • Hardware purchase is pending benchmark results.
  • See
    http://www-css.fnal.gov/dsg/internal/minos/hardware_justification.html
    for plan.

10
CMS
  • Support for cms for calibration
  • DBA(s): A.Kumar
  • Stakeholder Orgs: cd, uscms
  • Dsg has provided consulting and training for
    oracle designer and relational database concepts
    to support the cms calibration prototype.
  • Dsg initially provided an oracle instance for
    prototyping, and data modeling expertise.
  • Hcal prototype application db is deployed using
    oracle with rh. Hcal will be joining emou for a
    joint database.
  • Plan is to design a generic schema to allow for
    additional detectors.

11
CMS
  • Pixel application db is under construction.
  • USCms PPD has alerted us to a potential request
    for additional database and system support.
    Details of the split of work between fermi and
    cern are not yet certain, nor is it certain that
    the work will be at fermi at all.
  • Prototype testing to begin in May, asking for
    24x7 support. Possible collaboration with Cern IT
    for 24x7 coverage.
  • Will be using Oracle licenses from LHC license
    pool.
  • For further details see
    http://www-css.fnal.gov/dsg/internal/cms/CMS/cms_db_status_files/v3_document.htm

12
San/backups
  • DBA(s): dsg
  • Stakeholder Orgs: cd, d0, cdf, csi
  • A budget proposal has been made to provide a
    multi-mirrored backup environment on san
    technology for d0ora2 and fcdfora4 rman backups.
  • Meeting scheduled week of March 8 to discuss plan
    with R.Pasetes.
  • Results of san testing could impact long term
    backup recovery scenarios and hardware planning.

13
Freeware
  • Mysql/Postgres prototype
  • Proof of product with CDF data
  • Mechanism for population is on demand; it does
    not support updates
  • CDF successfully tested with CDF code -
    (Karlsruhe)
  • Auger using mysql on fnal cluster
  • DSG has begun to provide consulting for freeware
    databases
  • Actively maintaining new versions of mysql and
    postgres in KITS and working towards a more
    robust environment
  • Actively maintaining documentation for mysql and
    postgres in our freeware area.
  • Reference url:
    http://www-css.fnal.gov/dsg/external/freeware
  • Actively assisting users with questions,
    upgrades, testing, etc. for freeware products.

14
CAD
  • C.Kastner implementing upgraded ansys software,
    Team Central (including oracle) and Ideas.
  • Oracle licenses are covered by the Team Central
    purchase.
  • Development will be housed on the old d0 E4500.
  • Where's production going?
  • A project team has been established to begin
    data scrubbing of current and legacy data.

15
D0ora2 array problems
  • The Clarion array on d0ora2 houses the database
    files for production sam. This is critical path
    hardware.
  • The Clarion array, with 1.2T of disk attached,
    began having disk failures in mid Dec. 2003. Many
    man hours were spent over the next 8 weeks
    attempting to fix this issue.
  • By xmas the failure rate was increasing and disks
    were being replaced several times a week.
  • Right after xmas the array gave the appearance
    that it would lock up. Both SPs would fail.

16
D0ora2 array problems
  • Park Place, the hardware vendor, was working on
    a solution.
  • No diagnostic tools seemed to be available.
  • Began throwing hardware at the array.
  • Database backups would consistently halt the
    array.
  • An expert from Cleveland came in Jan. 11 and
    replaced the qlogic software and firmware.
  • Ran fine with stress tests till Wed., Jan. 14.

17
D0ora2 array problems
  • Meeting with d0 on Jan. 12 to discuss options.
  • Had additional problems at the end of Jan.;
    however, EMC got involved and provided monitoring
    software.
  • Has been running flawlessly since then.
  • Clarion array is 5 years old and toward the end
    of its useful life. It will continue to be more
    and more costly to maintain.
  • Funding for a new array may be requested for fy
    2005.
  • Quotes are being gathered. A proposal laying out
    a plan of options will be made to D0 soon.