Improving the Research Bootstrap of Condor High Throughput Computing for Non-Cluster Experts Based on Knoppix Instant Computing Technology

About This Presentation
Description:

Author: FK. Last modified by: FK. Created: 4/25/2006 7:05:53 PM.

Slides: 25
Provided by: FK88

Transcript and Presenter's Notes

1
Improving the Research Bootstrap of Condor High
Throughput Computing for Non-Cluster Experts
Based on Knoppix Instant Computing Technology
  • RIKEN Genomic Science Center
  • Fumikazu KONISHI

2
Background
  • Biologists need a high performance computing
    system for their research process. However, they
    do not know how to build a cluster system by
    themselves.

3
Meet Chie-san.
I borrowed slides from Condor.
4
Chie-san's Application
  • Run a sequence sweep of InterProScan over mouse
    cDNAs, 103,000 clones in total.
  • InterProScan takes on average 1 minute per
    sequence on a typical workstation (total:
    103,000 × 1 min = 103,000 minutes, about 1,716 hours).
  • InterProScan requires a 6 GB public database
    set on each machine that runs it.
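
The arithmetic in the second bullet can be checked with a short sketch. The function name and the worker counts are illustrative assumptions, not figures from the presentation, and the estimate assumes the work divides evenly across workers, which a real cluster will only approximate.

```python
def sweep_hours(num_sequences, minutes_per_sequence, workers=1):
    """Estimate wall-clock hours for a sweep split evenly across workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    total_minutes = num_sequences * minutes_per_sequence
    return total_minutes / 60 / workers


if __name__ == "__main__":
    # Figures from the slide: 103,000 clones at about 1 minute each.
    print(f"1 workstation: {sweep_hours(103_000, 1.0):.1f} hours")
    # Hypothetical cluster sizes, chosen only to show the scaling.
    for n in (10, 50):
        print(f"{n} workers: {sweep_hours(103_000, 1.0, n):.1f} hours")
```

The estimate ignores scheduling overhead and the time needed to distribute the database, so actual run times would be longer.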

http://www.ebi.ac.uk/interpro/README1.html
5
I have 103,000 sequences to search for gene
functional domains, and I am not a cluster expert.
Who will help me?
6
Getting Knoppix for InterProScan High Throughput
Computing Edition
  • Available as a free download (search Google
    for "fumikazu").
  • Download the image file.
  • The image includes InterProScan 4.1,
    Condor 6.6.10, PVFS2 1.2, and Ganglia 3.0.1.

7
Chie-san can boot the lab's machines from an
Instant High Throughput Computing image that
includes her application.
She can borrow the lab's computers over the weekend
without installing any software.
8
Goal
  • The goal of this research is to provide an instant
    high performance bioinformatics research workbench
    for all biology researchers, and to allow easy
    setup in collaborative projects without side
    effects on the local system.

Bioinformatics
9
Instant Setup Technologies
  • Install-Based Deploy System
    • RPM-based automatic configuration technology
      (Red Hat)
    • NPACI Rocks toolkit (UCSD)
  • Image-Based Deploy System
    • Live-CD technology (Knoppix)

10
Key Solutions
  • Knoppix
    • A GNU/Linux distribution that boots a machine
      without hard disk installation.
  • Parallel File System
    • PVFS is intended as a high-performance parallel
      file system for cluster computing. It provides
      high-bandwidth access and a large storage
      area.

11
Parallel File System on RAM Disk
12
Knoppix for InterProScan4.1 High Throughput
Computing Edition
13
Worker Node
PXE Boot
Head Node
Database download server
14
Step 1: Booting the image
Boot the head node. The IP address leased by the
DHCP server is displayed after the boot sequence.
15
Step 2: After a successful boot, two setup
options, EASY and ADVANCED, are displayed on the
screen.
16
Step 3: Boot worker nodes
All nodes must support PXE boot. The system must
automatically assess whether sufficient resources
are available for the database arrangement of
InterProScan 4.1.
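
A resource assessment of this kind might look like the following minimal sketch. It is not the system's actual check. It assumes the 6 GB database set is pooled across the nodes' RAM disks through the parallel file system, and that each node holds back some memory for the operating system and running jobs; the function name and the 1 GB reserve are hypothetical.

```python
DB_SIZE_GB = 6.0  # database set size stated in the presentation


def can_hold_database(node_ram_gb, reserve_gb=1.0, db_size_gb=DB_SIZE_GB):
    """Return (ok, usable_gb) for a list of per-node RAM sizes in GB.

    usable_gb is the pooled memory left after each node keeps reserve_gb
    for itself; ok is True when that pool can hold the database.
    """
    usable = sum(max(ram - reserve_gb, 0.0) for ram in node_ram_gb)
    return usable >= db_size_gb, usable


if __name__ == "__main__":
    # Hypothetical lab: eight workstations with 2 GB of RAM each.
    ok, usable = can_hold_database([2.0] * 8)
    print(f"usable pool: {usable:.1f} GB, sufficient: {ok}")
```

A real check would also need to account for disk space, network bandwidth, and any overhead the file system itself adds.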
17
Step 4: Building the cluster system
18
(No Transcript)
19
Download InterProScan database set
20
Testing
The system submits a single test job, which
completes in a few minutes. The Condor job status
is displayed in the browser, and Ganglia provides
detailed information on all nodes. All
configurations can be tested in this phase.
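
As a rough illustration of what a test job could look like, the sketch below builds a Condor submit description as text. The wrapper script `run_iprscan.sh`, the input file `test.seq`, and the output file names are hypothetical; the presentation does not show the submit file the system actually generates.

```python
def make_submit_description(executable, input_files, log="test.log"):
    """Build a Condor submit description with one queued job per input file."""
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        f"log = {log}",
    ]
    for i, path in enumerate(input_files):
        # Attributes set before each "queue" apply to that job.
        lines += [
            f"arguments = {path}",
            f"output = job_{i}.out",
            f"error = job_{i}.err",
            "queue",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical single test job, as in the testing phase.
    print(make_submit_description("run_iprscan.sh", ["test.seq"]))
```

Saved to a file, such a description would be passed to `condor_submit`; the same function could produce one job per sequence chunk for the full sweep.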
21
Results
22
(No Transcript)
23
Web site
http://big.gsc.riken.jp/index_html/Members/fumikazu/htc
24
Questions