Scientific Data Infrastructure in CAS - PowerPoint PPT Presentation

About This Presentation
Title:

Scientific Data Infrastructure in CAS

Description:

Title: data infrastructure
Author: Jianhui
Last modified by: Administrator
Created Date: 10/25/2004 3:30:19 AM
Document presentation format: (4:3)

Slides: 28
Provided by: Jianhui

Transcript and Presenter's Notes


1
Scientific Data Infrastructure in CAS
  • Dr. Jianhui Li (lijh_at_cnic.cn)
  • Scientific Data Center
  • Computer Network Information Center
  • Chinese Academy of Sciences

2
Scientific Data Infrastructure
  • Application-enabled environments and typical
    applications
  • Middleware (scientific data grid middleware,
    internet-based storage service middleware)
  • Software and toolkits (scientific data
    collection, curation, and publishing; data
    analysis and visualization)
  • Massive storage system, data-intensive computing
    facilities, high-speed network

3
DRC: Data Resource Center
  • A new organization responsible for data
    preservation, curation and access service in CAS

Long-term preservation of important data
[Diagram: Data Resource Center. Labels: collaborator, technology
service, network storage space, management system, staff, mass data,
application service, data online service, mass data analysis and
processing, system environment, mass data backup]
4
Infrastructure for DRC
  • High-speed network
    • 2 Gbps link to CSTNET
    • 2 Gbps link to CSTNET-CNGI
    • GLORIAD
  • Data-intensive computing facilities
    • 1000 CPU core clusters, Scientific Computing
      Grid (200 Tflops)
  • Massive storage system
    • 1 PB online disk, 5 PB tape
    • Construction of a storage network will start this year
    • 1 center, 1 archive center, 10 storage nodes
      around China
    • Over 20 PB

5
Scientific Databases (SDB)
  • A long-term mission started in 1986 and funded
    by CAS
  • many institutes involved
  • long-term, large-scale collaboration
  • data from research, for research
  • Collecting multi-discipline research data and
    promoting data sharing
  • More than 350 research databases and 400 datasets
    by 61 institutes
  • Over 60 TB of data available for open access and
    download

http://www.csdb.cn
6
Scientific Databases (cont.)
  • SDB Contents
  • Physics and Chemistry, Geosciences, Biosciences,
    Atmospheric and Ocean Science, Energy Science,
    Material Science, Astronomy and Space Science

7
Scientific Databases (cont.)
  • Database integration
  • Resource database
  • Reference database
  • Application oriented database

[Diagram: research databases, resource databases, reference
databases, application-oriented databases]
8
Scientific Databases (cont.)
  • 2 reference databases
    • China Species
    • Compound
  • 4 application-oriented databases
    • High Energy (ITER)
    • Western Environment Research
    • Ecology Research
    • Qinghai Lake Research
  • 8 resource databases
    • Geoscience
    • Biodiversity
    • Chemistry
    • Astronomy
    • Space Science
    • Microbiology and virus
    • Material science
    • Environment

9
CAS Scientific Data Grid
  • Based on Scientific Data Grid Middleware (SDG)
    • SDG is built upon the Scientific Databases and
      supports uniform, convenient, and secure
      discovery of and access to large-scale,
      distributed, heterogeneous scientific data
  • Building scientific data application grids
    according to domain requirements
    • Integrate distributed data, analysis tools, and
      storage and computing facilities, providing a
      uniform data service interface
  • 4 pilot grids
    • Bioscience grid
    • Geoscience grid
    • Chemistry grid
    • Astronomy and space science grid
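
The idea of a uniform data service interface over distributed, heterogeneous databases can be sketched in a few lines of Python. This is only an illustration, not the SDG middleware's actual API: the names `DataSource`, `InMemorySource`, and `UniformDataService`, the `search` method, and the sample records are all hypothetical, and in-memory lists stand in for remote data nodes.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


class DataSource(Protocol):
    """Hypothetical contract: any source that can answer a keyword search."""

    def search(self, keyword: str) -> list[dict]: ...


@dataclass
class InMemorySource:
    """Stand-in for one distributed database; a real node would be remote."""

    name: str
    records: list[dict]

    def search(self, keyword: str) -> list[dict]:
        kw = keyword.lower()
        return [r for r in self.records if kw in r["title"].lower()]


class UniformDataService:
    """Sends one query to every registered source and merges the hits."""

    def __init__(self) -> None:
        self._sources: dict[str, DataSource] = {}

    def register(self, name: str, source: DataSource) -> None:
        self._sources[name] = source

    def search(self, keyword: str) -> list[dict]:
        hits = []
        for name, source in self._sources.items():
            for record in source.search(keyword):
                # Tag each hit with its origin so callers can trace it.
                hits.append({**record, "source": name})
        return hits


# Made-up sample records, for illustration only.
service = UniformDataService()
service.register(
    "geoscience",
    InMemorySource("geoscience", [{"title": "Lake sediment cores"}]),
)
service.register(
    "bioscience",
    InMemorySource(
        "bioscience",
        [{"title": "Lake fish species"}, {"title": "Alpine plants"}],
    ),
)
results = service.search("lake")
```

The caller issues one query and does not need to know how each source stores its data; adding a source only requires implementing `search`.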

10
Function Framework of SDG
  • A scalable and integrated data sharing
    environment
  • Providing services for grid users, grid managers
    and resource providers
  • Operated by the operation center, science
    gateways and data nodes

[Diagram: User, Grid Manager, Resource Provider, Operation Center,
Science Gateway, Data Node]
11
Access Scientific Data Grid
12
VisualDB - Powering your database
  • A toolkit to manage, publish and share
    scientific databases through a visual configuration
    interface, without writing code
  • A database integration access broker
  • A data quality assessment tool
  • A database access and usage statistics tool

13
Function Framework of VisualDB
14
Catalog Builder

15
Security Center
16
Data Forge
17
vReport
18
Application-enabled environments and typical
applications
  • Domain-specific data-intensive application
    environment
    • Supports one specific research area
    • Integrates scientific data, storage, computing,
      analysis models and tools
    • An easy-to-use, friendly interactive interface
    • Scalable, user-defined data processing workflow
  • Typical pilot systems
    • Remote sensing data on-demand access and
      processing service environment
    • CFCI - China FLUX Cyber-Infrastructure
    • DarwinTree: molecular data analysis and
      application environment
    • Atmospheric science data integration analysis
      platform

19
Atmospheric science data integration analysis
platform
  • Status quo

20
Atmospheric science data integration analysis
platform
  • Problems
  • Atmospheric data volumes have reached the TB level,
    and the data are distributed.
  • The hard disk and memory of a personal computer
    limit the research work.
  • Many algorithms developed by researchers
    can't be shared easily.

21
Architecture
Scientific Data Analysis Online Platform
22
Workflow
Five steps
Iterative
23
Select data
24
Choose algorithm
25
Configure parameters
26
Plot and results
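
The workflow on the preceding slides (select data, choose an algorithm, configure parameters, plot and inspect the result, then iterate) can be sketched as below. The deck mentions five steps, but only these four are named in the transcript, so the sketch covers those. Everything here is an assumption made for illustration: the registry `ALGORITHMS`, the `register` decorator, `run_workflow`, the `moving_average` algorithm, and the sample series are hypothetical and are not the platform's actual interface.

```python
from statistics import mean
from typing import Callable

# Hypothetical shared registry: an algorithm is published once under a name,
# so other researchers can select it instead of exchanging code by hand.
ALGORITHMS: dict[str, Callable[..., list[float]]] = {}


def register(name: str):
    """Decorator that adds a function to the shared registry."""
    def decorator(func):
        ALGORITHMS[name] = func
        return func
    return decorator


@register("moving_average")
def moving_average(values: list[float], window: int = 2) -> list[float]:
    """Mean of each run of `window` consecutive values."""
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


# Made-up sample series standing in for a dataset held on the server.
DATASETS = {"demo_temperature": [10.0, 12.0, 14.0, 16.0]}


def run_workflow(dataset: str, algorithm: str, **params) -> dict:
    data = DATASETS[dataset]        # select data
    func = ALGORITHMS[algorithm]    # choose algorithm
    result = func(data, **params)   # configure parameters and run
    # Plot and result: one text bar per value keeps the sketch dependency-free.
    plot = ["#" * round(v) for v in result]
    return {"result": result, "plot": plot}


# Iterating means re-running with different parameters and comparing results.
first = run_workflow("demo_temperature", "moving_average", window=2)
second = run_workflow("demo_temperature", "moving_average", window=4)
```

Because the data and the algorithm both stay on the server side in this sketch, the user only sends names and parameters, which matches the motivation that personal computers lack the disk and memory for TB-level data.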
27
Thank you!