SCEC Data Management in SRB Digital Library

1
SCEC Data Management in SRB Digital Library
  • Reagan Moore
  • George Kremenek
  • Yifeng Cui
  • Yuanfang Hu
  • Jing Zhu
  • Marcio Faerman
  • San Diego Supercomputer Center
  • University of California, San Diego
  • SCEC/CME All Hands Meeting

2
Outline
  • SCEC data size in SRB digital library (DL)
  • Data organization in SRB DL
  • Data integrity in SRB DL
  • Metadata management in SRB DL

3
SCEC Data Size
  • 3D ground motion collection for the LA basin
      • 60 scenarios, 150M data per scenario
  • TeraShake 1 and 2
      • 4 TeraShake1 runs, 6 TeraShake2 runs (wave
        propagation and dynamic rupture)
      • Each simulation generates
          • Surface data and its derivatives
              • 1.2 TB surface velocity data
              • 0.4 TB velocity magnitude data
              • 1.2 TB surface displacement data
              • 0.4 TB displacement magnitude data
              • 1.2 TB surface seismogram data
          • 5 TB volume data (optional)
      • Each file has one backup copy at HPSS or SRB tape
      • Other data: checkpoints and visualizations
  • Total 168 TB, 3.5 million files
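
As a rough check on the per-simulation figures above, the listed sizes can simply be added up. The sketch below is plain arithmetic in Python; the names SURFACE_TB, VOLUME_TB and per_simulation_tb are illustrative, and it makes no attempt to reproduce the 168 TB total, which the slide does not break down.

```python
from decimal import Decimal

# Per-simulation output sizes in TB, as listed on the slide.
SURFACE_TB = {
    "surface velocity": Decimal("1.2"),
    "velocity magnitude": Decimal("0.4"),
    "surface displacement": Decimal("1.2"),
    "displacement magnitude": Decimal("0.4"),
    "surface seismograms": Decimal("1.2"),
}
VOLUME_TB = Decimal("5")  # optional volume data


def per_simulation_tb(include_volume: bool) -> Decimal:
    """Sum the listed output sizes for one simulation run."""
    total = sum(SURFACE_TB.values(), Decimal("0"))
    return total + VOLUME_TB if include_volume else total


if __name__ == "__main__":
    print(per_simulation_tb(False))  # surface data only
    print(per_simulation_tb(True))   # surface plus volume data
```

This gives 4.4 TB of surface data per run, or 9.4 TB when the optional volume data is kept, before counting backup copies.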

4
SCEC Data Organization
  • 3D ground motion collection for the LA basin
      • Organized by scenarios
  • TeraShake 1 and 2 simulation
      • simulation id
          • visualization
          • checkpoint
          • input
          • output
              • surface-velocity
                  • vmag
                  • peak
              • surface-displacement
                  • dmag
                  • peak
              • surface-seismograms
              • volume
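
The collection layout can be written down as a nested mapping and expanded into full paths. This is only a sketch: the nesting (output holding the surface and volume collections, with vmag, dmag and peak beneath them) is an assumption read from the flattened slide text, and LAYOUT, collection_paths and the root path used below are hypothetical names.

```python
import posixpath

# Sub-collections under one simulation id.
# Assumption: this nesting is reconstructed from the flattened slide text.
LAYOUT = {
    "visualization": {},
    "checkpoint": {},
    "input": {},
    "output": {
        "surface-velocity": {"vmag": {}, "peak": {}},
        "surface-displacement": {"dmag": {}, "peak": {}},
        "surface-seismograms": {},
        "volume": {},
    },
}


def collection_paths(root, layout=LAYOUT):
    """Return every collection path below root, parents before children."""
    paths = []
    for name, children in layout.items():
        path = posixpath.join(root, name)
        paths.append(path)
        paths.extend(collection_paths(path, children))
    return paths


if __name__ == "__main__":
    # "/example/sim-id" is a placeholder root, not a real SRB path.
    for path in collection_paths("/example/sim-id"):
        print(path)
```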

5
Data Integrity in SRB DL
  • Replication
      • Output data of each simulation run have backup
        copies at HPSS or SRB tape
  • MD5 checksums
      • Surface velocity data of each simulation run have
        MD5 checksums as metadata
  • Mutual data backup
      • Surface data and seismogram data serve as backups
        for each other
      • Codes convert surface data to seismogram data,
        and seismogram data to surface data
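
A checksum comparison of this kind can be sketched with Python's standard hashlib module. This is not the SCEC tooling itself: md5_of_file and is_intact are illustrative names, and how the stored checksum is read back from SRB metadata is left out.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for very large files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, stored_checksum):
    """Compare a fresh digest with the checksum kept as metadata."""
    return md5_of_file(path) == stored_checksum.strip().lower()
```

Reading in fixed-size chunks matters here because individual output files can be far larger than available memory.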

6
Metadata Management in SRB DL
  • Collection level
      • Every SCEC collection has a metadata attribute
        that briefly describes what kind of data the
        collection holds
  • File level
      • Every surface velocity file has a metadata
        attribute that stores its MD5 checksum for data
        integrity
      • Every seismogram file has metadata attributes
        that describe properties associated with the file

    Sufmeta xhist00001
    0 simulation_level leaf
    1 simulation_id 7
    2 data_product_type seismogram
    3 data_product_component east
    4 data_product_file_sequence_number 1
    5 computation_endian big
    6 computation_float_size 4
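
Attributes such as computation_endian and computation_float_size are what a reader needs to decode a binary file correctly. The sketch below uses Python's struct module and rests on two assumptions not stated on the slide: that computation_float_size is the number of bytes per value, and that the file is a flat sequence of floats. The names unpack_format and decode_samples are illustrative.

```python
import struct


def unpack_format(metadata, count):
    """Build a struct format string from the file's metadata attributes."""
    byte_order = {"big": ">", "little": "<"}[metadata["computation_endian"]]
    # Assumption: the float size is given in bytes (4 = single, 8 = double).
    type_code = {"4": "f", "8": "d"}[metadata["computation_float_size"]]
    return f"{byte_order}{count}{type_code}"


def decode_samples(raw, metadata):
    """Decode raw bytes as a flat array of floats (assumed file layout)."""
    size = int(metadata["computation_float_size"])
    count = len(raw) // size
    fmt = unpack_format(metadata, count)
    # Any trailing partial value is ignored.
    return list(struct.unpack(fmt, raw[:count * size]))
```

With the attributes shown for xhist00001 (big endian, size 4), the format string for three values would be ">3f".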
7
SRB Metadata Management Tool Set
  • Provides facilities to create, display, update and
    remove SRB metadata
      • For either SRB collections or SRB files
      • In an easy-to-use manner
  • Each operation needs an input file in a specific
    format
  • Suitable for large-scale metadata operations
      • Millions of files in SCEC simulations
  • Tool set is located in the SCEC CVS directory
    /home/cvs/SRB_Tools

8
Create Operation
  • Function: create new metadata entries for
    collections/files according to the attribute lists
    in the input file
  • Input file format
      • SRB collection
          • <dir> collection path (relative or absolute
            path)
          • metadata attribute n value n
      • SRB file
          • <file> file path (relative or absolute path)
          • metadata attribute n value n
  • Example
      • Command line: ./create.pl input_create

    more input_create
    <dir>/home/sceclib.scec/test/my_test
    DC.title This is a test collection
    DC.date 2006-07-10
    <file>/home/sceclib.scec/test/my_test/test1.dat
    DC.title test file 1
    <file>/home/sceclib.scec/test/my_test/test2.dat
    DC.title test file 2
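
To make the input format concrete, here is a minimal Python sketch of a parser for it. It is not the logic of create.pl, whose source is not shown. It assumes the markers are the literal tags `<dir>` and `<file>`, that each attribute sits on its own line, and that the first whitespace-delimited token is the attribute name with the rest of the line as its value. parse_input is a hypothetical name.

```python
def parse_input(text):
    """Parse tool-set input text into (kind, path, attributes) records."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        for tag, kind in (("<dir>", "collection"), ("<file>", "file")):
            if line.startswith(tag):
                records.append((kind, line[len(tag):].strip(), {}))
                break
        else:
            # Not a marker line, so it must be an attribute of the last record.
            if not records:
                raise ValueError("attribute line before any <dir> or <file> line")
            # Assumption: first token is the name, the remainder is the value.
            name, _, value = line.partition(" ")
            records[-1][2][name] = value.strip()
    return records
```

Applied to the example file above, this yields one collection record with two attributes followed by two file records with one attribute each.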
9
Display Operation
  • Function: display the metadata of
    collections/files listed in the input file
  • Input file format
      • SRB collection: <dir> collection path (relative
        or absolute path)
      • SRB file: <file> file path (relative or absolute
        path)
  • Example
      • Command line: ./display.pl input_display
      • Output

    more input_display
    <dir>/home/sceclib.scec/test/my_test
    <file>/home/sceclib.scec/test/my_test/test1.dat
    <file>/home/sceclib.scec/test/my_test/test2.dat

    ./display.pl input_display
    /home/sceclib.scec/test/my_test
    0 DC.title This is a test collection
    1 DC.date 2006-07-10
    /home/sceclib.scec/test/my_test/test1.dat
    0 DC.title test file 1
    /home/sceclib.scec/test/my_test/test2.dat
    0 DC.title test file 2
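
With millions of files, an input file like this would normally be generated rather than typed. The following Python sketch writes one `<dir>` line and one `<file>` line per file name. The marker syntax is assumed to be the literal tags `<dir>` and `<file>`, and write_display_input is a hypothetical helper, not part of the tool set.

```python
import posixpath


def write_display_input(collection, file_names, out):
    """Write a <dir> line for the collection and a <file> line per file.

    file_names may be any iterable, including a generator, so the full
    list of names never has to be held in memory. Returns the file count.
    """
    out.write(f"<dir>{collection}\n")
    count = 0
    for name in file_names:
        out.write(f"<file>{posixpath.join(collection, name)}\n")
        count += 1
    return count
```

The same output would serve as input for the remove operation, since both take only paths.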
10
Update Operation
  • Function: update metadata entries for
    collections/files according to the attribute lists
    in the input file. If a metadata attribute already
    exists, its value is updated; otherwise, it is
    created as a new metadata entry
  • Input file format
      • SRB collection
          • <dir> collection path (relative or absolute
            path)
          • metadata attribute n value n
      • SRB file
          • <file> file path (relative or absolute path)
          • metadata attribute n value n
  • Example
      • Command line: ./update.pl input_update

    more input_update
    <dir>/home/sceclib.scec/test/my_test
    DC.place SDSC
    <file>/home/sceclib.scec/test/my_test/test1.dat
    DC.date 2006-07-13
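
The update-or-create rule can be modelled on an in-memory mapping. This Python sketch only illustrates the semantics described on the slide; it does not talk to SRB and is not how update.pl is implemented. apply_update is a hypothetical name.

```python
def apply_update(existing, changes):
    """Return (merged, updated_names, created_names) for an update operation.

    Attributes already present have their values replaced; attributes not
    yet present are added as new entries. The input mapping is not modified.
    """
    merged = dict(existing)
    updated, created = [], []
    for name, value in changes.items():
        (updated if name in merged else created).append(name)
        merged[name] = value
    return merged, updated, created
```

In the example above, the test collection already has DC.title and DC.date from the create step, so DC.place falls on the create path; had the input named DC.date for the collection instead, its value would have been replaced.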
11
Remove Operation
  • Function: remove all metadata of
    collections/files listed in the input file
  • Input file format
      • SRB collection: <dir> collection path (relative
        or absolute path)
      • SRB file: <file> file path (relative or absolute
        path)
  • Example
      • Command line: ./remove.pl input_remove

    more input_remove
    <dir>/home/sceclib.scec/test/my_test
    <file>/home/sceclib.scec/test/my_test/test1.dat
    <file>/home/sceclib.scec/test/my_test/test2.dat
12
Summary
  • The SRB digital library provides data management
    facilities to SCEC
      • Large space to hold 100 TB of data
      • Replication, checksum and mutual backup
        mechanisms for data integrity
      • Easy-to-use metadata management tool set