Prototyping a virtual filesystem for storing and processing petascale neural circuit datasets.

1
Prototyping a virtual filesystem for storing and
processing petascale neural circuit datasets.
Art Wetzel, Greg Hood and Markus Dittrich
National Resource for Biomedical Supercomputing,
Pittsburgh Supercomputing Center
awetzel_at_psc.edu 412-268-3912 www.psc.edu and www.nrbsc.org
R. Clay Reid, Jeff Lichtman, Wei-Chung Allen Lee
Harvard Medical School, Allen Institute for Brain Science,
Center for Brain Science, Harvard University
Davi Bock
HHMI Janelia Farm
David Hall and Scott Emmons
Albert Einstein College of Medicine
Jan 11, 2012 Connectomics Data Project Overview
2
Reconstructing brain circuits requires high
resolution electron microscopy over long
distances: BIGDATA
Vesicles: 30 nm diam.
A synaptic junction: >500 nm wide with a cleft gap of 20 nm
www.coolschool.ca/lor/BI12/unit12/U12L04.htm
Dendritic spine
Recent ICs have 32 nm features; 22 nm chips are
being delivered.
Dendrite
Gate oxide: 1.2 nm thick
3
A 10 Tvoxel dataset aligned by our group was an
essential part of the March 2011 Nature paper
with Davi Bock, Clay Reid and Harvard colleagues.
Now we are working on two datasets of
100 TB each and expect to reach PBs in 2-3 years.
4
The CS project is to implement and test a
prototype virtual filesystem to address common
problems associated with neural circuit and other
massive datasets.
  • The most important aim is reducing unwanted data
    duplication as raw data are preprocessed for
    final analysis. The virtual filesystem addresses
    this by replacing redundant storage with
    on-the-fly computing.
  • The second aim is to provide a convenient
    framework for efficient on-the-fly computation on
    multidimensional datasets within high performance
    parallel computing environments using both CPU
    and GPGPU processing.
  • The Filesystem in User Space mechanism (FUSE)
    provides a convenient implementation basis that
    will work across a variety of systems. There are
    many existing FUSE codes that serve as useful
    examples.
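
The first aim, replacing stored derived copies with on-the-fly computing, can be illustrated with a minimal sketch. It is not the project's implementation and uses no FUSE library: the class name VirtualTransformedFile, the 8-bit pixel assumption, and the gain and bias parameters are all hypothetical. The read method takes a size and an offset, the same two quantities a FUSE read request carries, and computes the requested bytes from the raw data instead of reading a preprocessed copy from disk.

```python
class VirtualTransformedFile:
    """Read-only view whose bytes are computed on demand as gain * raw + bias.

    Hypothetical illustration: the name, the 8-bit pixel assumption and
    the gain/bias parameters are not taken from the presentation.
    """

    def __init__(self, raw, gain, bias):
        self.raw = raw    # bytes-like raw data (bytes, bytearray or mmap.mmap)
        self.gain = gain
        self.bias = bias

    def size(self):
        # One output byte per input byte, so both files have the same length.
        return len(self.raw)

    def read(self, size, offset):
        # Only the requested range is touched; nothing derived is stored.
        chunk = self.raw[offset:offset + size]
        return bytes(
            min(255, max(0, round(b * self.gain + self.bias))) for b in chunk
        )


raw = bytes([0, 10, 100, 200])
virtual = VirtualTransformedFile(raw, gain=2.0, bias=5.0)
middle = virtual.read(size=2, offset=1)   # computed from raw[1:3] only
```

In a real FUSE filesystem the same logic would sit inside the read handler, so that ordinary programs could open the transformed file by path while only the raw data occupies storage.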

5
We would eventually like to have a flexible
software framework that allows a combination of
common prewritten and user written application
codes to operate together and take advantage of
parallel CPU and GPGPU technologies.
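
One way such a framework could combine prewritten and user-written codes is a registry of named block transforms that are chained into a pipeline. The sketch below is only an illustration under stated assumptions: the names REGISTRY, register, build_pipeline and run_parallel are hypothetical, blocks are plain byte strings, and a thread pool stands in for the parallel CPU and GPGPU back ends the slide mentions.

```python
from concurrent.futures import ThreadPoolExecutor

REGISTRY = {}  # hypothetical table mapping a stage name to a function


def register(name):
    """Decorator that adds a block transform to the registry."""
    def decorator(fn):
        REGISTRY[name] = fn
        return fn
    return decorator


@register("invert")            # stands in for a prewritten stage
def invert(block):
    return bytes(255 - b for b in block)


@register("threshold")         # stands in for a user-written stage
def threshold(block):
    return bytes(255 if b >= 128 else 0 for b in block)


def build_pipeline(names):
    """Chain the named stages; each stage maps a block to a block."""
    stages = [REGISTRY[n] for n in names]

    def run(block):
        for stage in stages:
            block = stage(block)
        return block
    return run


def run_parallel(pipeline, blocks, workers=4):
    """Apply the pipeline to independent blocks; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(pipeline, blocks))


pipeline = build_pipeline(["invert", "threshold"])
results = run_parallel(pipeline, [bytes([0, 100, 200]), bytes([255, 128])])
```

Because every stage has the same block-in, block-out shape, a user-written stage can be swapped in without changing the code that schedules the work.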
6
Multidimensional data structures to provide
efficient random and sequential access analogous
to the 1D representations provided by standard
filesystems will be part of this work.
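
The slide does not say which multidimensional structures will be used. As one common possibility, the sketch below stores a 3D volume as fixed-size cubic chunks and orders the chunks along a Morton (Z-order) curve, so that voxels close together in space tend to sit close together in the 1D file. The function names and the 64-voxel chunk edge are assumptions made for illustration.

```python
def morton3(x, y, z, bits=21):
    """Interleave the low `bits` bits of x, y and z into one integer."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def locate(x, y, z, chunk=64):
    """Map a voxel coordinate to (chunk id, offset inside that chunk).

    Chunks are ordered by Morton code; voxels inside a chunk are stored
    in plain row-major order with x varying fastest.
    """
    chunk_id = morton3(x // chunk, y // chunk, z // chunk)
    lx, ly, lz = x % chunk, y % chunk, z % chunk
    offset = (lz * chunk + ly) * chunk + lx
    return chunk_id, offset


def byte_position(x, y, z, chunk=64, bytes_per_voxel=1):
    """1D position of a voxel in a file made of consecutive chunks."""
    chunk_id, offset = locate(x, y, z, chunk)
    return (chunk_id * chunk ** 3 + offset) * bytes_per_voxel
```

With this layout a small cubic region touches only a few chunks, while reading the chunks in file order still walks the volume in a spatially coherent sequence.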
Students working on this project will have access
to a parallel cluster which holds our large
datasets along with the compilers and other tools
required. Minimal end-to-end functionality with
simple linear transforms can likely be achieved
in about 8 weeks and then extended as time
permits. Please contact Art Wetzel if there are
further questions: awetzel_at_psc.edu.