Virtual Laboratory: Data Intensive Science during Holiday @ Robinson Village in Italy! - PowerPoint PPT Presentation

About This Presentation
Title:

Virtual Laboratory: Data Intensive Science during Holiday @ Robinson Village in Italy!

Description:

Title: Virtual Lab: Data Intensive Science during Holiday @ Robinson Village in Italy. Author: Rajkumar Buyya. Last modified by: Rajkumar Buyya. Document presentation format: PowerPoint PPT presentation.

Slides: 72
Provided by: rajkuma

Transcript and Presenter's Notes

Title: Virtual Laboratory: Data Intensive Science during Holiday @ Robinson Village in Italy!


1
Virtual Laboratory: Data Intensive Science during
Holiday @ Robinson Village in Italy!
  • Rajkumar Buyya

Melbourne, Australia
www.buyya.com/ecogrid
2
Grid Warning!
  • This is a science fiction story on the future of
    grid computing
  • All actors mentioned in this talk are
  • Application under consideration is fictitious.
  • Prof. Watson-II is researching drug design.
  • The complete story is fictitious except the Grid
    technology!

3
Prof. Watson-II Spends all his time in Lab @
University of Lecce
4
Watson-II's wife was Unhappy
  • Since he was not spending any time at all with her
    and the kids.
  • Every day he goes to the lab @ 8am and comes back
    home at 11pm.
  • After a few days he and his wife had a big fight @
    Home
  • She gives him a warning: if he does not come home
    tomorrow by 6pm, he will have to face lifetime
    consequences.

5
Prof. Watson-II works up to 5pm in Lab @
University of Lecce
Goes to Work @ 9am
Returns home by 5.30PM!
6
Watson-II having moonlight dinner with his Wife
7
Prof. Watson-II works up to 5pm in Lab @
University of Lecce
Goes to Work @ 9am
Returns home by 5.30PM!
8
Watson-II promises his wife that he will soon
take her for a holiday @ Robinson Village
9
Prof. Watson-II hires an assistant and works smarter!
Goes to Work @ 9am
Returns home by 5.30PM!
10
Watson-II Family starts their holiday
11
Watson-II Family on 5 Day Holiday @ Robinson
Village
12
Day 1 @ Robinson Village
13
Watson-II @ Robinson Village Beach
14
Watson-II happens to meet a Grid researcher on
the beach!
15
Watson-II quickly reads a news clipping that he got
from the Grid researcher
16
Watson-II continues surfing???
17
Watson-II having moonlight dinner with his Wife
18
Day 2 @ Robinson Village
19
Goes to the Internet Room and does some surfing of
the Grid researcher's page
20
Drug Design: Data Intensive Computing on Grid
  • A Virtual Laboratory for Molecular Modelling for
    Drug Design on Peer-to-Peer Grid.
  • It provides tools for examining millions of
    chemical compounds (molecules) in the Protein
    Data Bank (PDB) to identify those having
    potential use in drug design.
  • In collaboration with
  • Kim Branson, Structural Biology, Walter and Eliza
    Hall Institute (WEHI)

http://www.csse.monash.edu.au/rajkumar/dd_at_home/
21
DesignDrug@Home Architecture: A Virtual Lab for
Molecular Modeling for Drug Design on P2P Grid
[Diagram. Components: Grid Info. Service, Grid Market Directory, Data Replica Catalogue, Resource Broker (RB), GTS (Grid Trade Server), PDB1, PDB2. Labels: "Screen 2K molecules in 30min. for 10"; "Give me list PDBs sources Of type aldrich_300?"; "service cost?"; "service providers?"; "mol.5 please?"; "mol.10 please?"; "get mol.10 from pdb1 screen it."; "RB maps suitable Grid nodes and Protein DataBank".]
22
Software Tools
  • Molecular Modelling Tools (DOCK)
  • Parameter Modelling Tools (Nimrod/enFuzion)
  • Grid Resource Broker (Nimrod-G)
  • Data Grid Broker
  • Protein Data Bank (PDB) Management and
    Intelligent Access Tools
  • PDB database Lookup/Index Table Generation.
  • PDB and associated index-table Replication.
  • PDB Replica Catalogue (that helps in Resource
    Discovery).
  • PDB Servers (that serve PDB client requests).
  • PDB Brokering (Replica Selection).
  • PDB Clients for fetching Molecule Record (Data
    Movement).
  • Grid Middleware (Globus and GrACE)
  • Grid Fabric Management (Fork/LSF/Condor/Codine/)

23
DOCK code (Enhanced by WEHI, U of Melbourne)
  • A program to evaluate the chemical and geometric
    complementarities between a small molecule and a
    macromolecular binding site.
  • It explores ways in which two molecules, such as
    a drug and an enzyme or protein receptor, might
    fit together.
  • Compounds which dock to each other well, like
    pieces of a three-dimensional jigsaw puzzle, have
    the potential to bind.
  • So, why is it important to be able to identify small
    molecules which may bind to a target
    macromolecule?
  • A compound which binds to a biological
    macromolecule may inhibit its function, and thus
    act as a drug.
  • Thus disabling the ability of the (HIV) virus
    to attach itself to a molecule/protein!
  • With system specific code changed, we have been
    able to compile it for Sun-Solaris, PC Linux, SGI
    IRIX, Compaq Alpha/OSF1

Original Code: University of California, San
Francisco: http://www.cmpharm.ucsf.edu/kuntz/
24
Dock input file
  • score_ligand yes
  • minimize_ligand yes
  • multiple_ligands no
  • random_seed 7
  • anchor_search no
  • torsion_drive yes
  • clash_overlap 0.5
  • conformation_cutoff_factor 3
  • torsion_minimize yes
  • match_receptor_sites no
  • random_search yes
  • . . . . . .
  • . . . . . .
  • maximum_cycles 1
  • ligand_atom_file S_1.mol2
  • receptor_site_file ece.sph
  • score_grid_prefix ece
  • vdw_definition_file parameter/vdw.defn
  • chemical_definition_file parameter/chem.defn

25
Parameterized Dock input file
  • score_ligand score_ligand
  • minimize_ligand minimize_ligand
  • multiple_ligands multiple_ligands
  • random_seed random_seed
  • anchor_search anchor_search
  • torsion_drive torsion_drive
  • clash_overlap clash_overlap
  • conformation_cutoff_factor conformation_cutoff_factor
  • torsion_minimize torsion_minimize
  • match_receptor_sites match_receptor_sites
  • random_search random_search
  • . . . . . .
  • . . . . . .
  • maximum_cycles maximum_cycles
  • ligand_atom_file ligand_number.mol2
  • receptor_site_file HOME/dock_inputs/receptor_site_file
  • score_grid_prefix HOME/dock_inputs/score_grid_prefix
  • vdw_definition_file vdw.defn
  • chemical_definition_file chem.defn
  • chemical_score_file chem_score.tbl
  • flex_definition_file flex.defn
  • flex_drive_file flex_drive.tbl
  • ligand_contact_file dock_cnt.mol2
  • ligand_chemical_file dock_chm.mol2
  • ligand_energy_file dock_nrg.mol2
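The parameterized file pairs each DOCK keyword with a placeholder of the same name, which is filled in per job to produce a concrete input file. A minimal sketch of that substitution in Python follows. It assumes a `$name` placeholder syntax, since the transcript does not preserve the original marker characters. The shortened template, the `home` key, and the `render_dock_input` helper are illustrative and not part of Nimrod or DOCK.

```python
from string import Template

# Hypothetical, shortened template. The "$name" placeholder syntax is an
# assumption; the real file lists many more DOCK keywords.
DOCK_BASE = Template(
    "score_ligand $score_ligand\n"
    "random_seed $random_seed\n"
    "ligand_atom_file ${ligand_number}.mol2\n"
    "receptor_site_file $home/dock_inputs/$receptor_site_file\n"
)


def render_dock_input(params):
    """Fill every placeholder; raises KeyError if a parameter is missing."""
    return DOCK_BASE.substitute(params)


if __name__ == "__main__":
    print(render_dock_input({
        "score_ligand": "yes",
        "random_seed": 7,
        "ligand_number": 12,
        "home": "/home/user",          # hypothetical home directory
        "receptor_site_file": "ece.sph",
    }))
```

Using `substitute` rather than `safe_substitute` means a job with a missing parameter fails early instead of sending a half-filled input file to a Grid node.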
26
Dock PlanFile (contd.)
parameter database_name label "database_name" text select oneof
    "aldrich" "maybridge" "maybridge_300" "asinex_egc" "asinex_epc"
    "asinex_pre" "available_chemicals_directory"
    "inter_bioscreen_s" "inter_bioscreen_n"
    "inter_bioscreen_n_300" "inter_bioscreen_n_500"
    "biomolecular_research_institute"
    "molecular_science" "molecular_diversity_preservation"
    "national_cancer_institute" "IGF_HITS"
    "aldrich_300" "molecular_science_500" "APP" "ECE"
    default "aldrich_300"
parameter score_ligand text default "yes"
parameter minimize_ligand text default "yes"
parameter multiple_ligands text default "no"
parameter random_seed integer default 7
parameter anchor_search text default "no"
parameter torsion_drive text default "yes"
parameter clash_overlap float default 0.5
parameter conformation_cutoff_factor integer default 5
parameter torsion_minimize text default "yes"
parameter match_receptor_sites text default "no"
parameter random_search text default "yes"
. . . . . . . . . . . .
parameter maximum_cycles integer default 1
parameter receptor_site_file text default "ece.sph"
parameter score_grid_prefix text default "ece"
parameter ligand_number integer range from 1 to 200 step 1
Molecules to be screened
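In this plan, every parameter except ligand_number has a single default value, so the range "from 1 to 200 step 1" is what turns one experiment description into 200 jobs. The Python sketch below shows how such a declaration can be expanded into a job list. It assumes the range is inclusive and that several ranged parameters would be combined as a cross product. The helper names are hypothetical and this is not Nimrod's own code.

```python
from itertools import product


def expand_range(start, stop, step):
    """Inclusive integer range, as in 'range from 1 to 200 step 1'."""
    return list(range(start, stop + 1, step))


def enumerate_jobs(fixed, swept):
    """Yield one parameter dict per job: fixed defaults plus one
    combination of the swept parameter values."""
    names = list(swept)
    for combo in product(*(swept[name] for name in names)):
        job = dict(fixed)
        job.update(zip(names, combo))
        yield job


# A few of the defaults declared in the plan above.
fixed = {
    "database_name": "aldrich_300",
    "score_ligand": "yes",
    "random_seed": 7,
    "maximum_cycles": 1,
}
swept = {"ligand_number": expand_range(1, 200, 1)}
jobs = list(enumerate_jobs(fixed, swept))

if __name__ == "__main__":
    print(len(jobs), jobs[0]["ligand_number"], jobs[-1]["ligand_number"])
```

Each resulting dict is the set of values one job would use when its copy of the parameterized input file is filled in.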
27
Dock PlanFile
task nodestart
    copy ./parameter/vdw.defn node.
    copy ./parameter/chem.defn node.
    copy ./parameter/chem_score.tbl node.
    copy ./parameter/flex.defn node.
    copy ./parameter/flex_drive.tbl node.
    copy ./dock_inputs/get_molecule node.
    copy ./dock_inputs/dock_base node.
endtask
task main
    nodesubstitute dock_base dock_run
    nodesubstitute get_molecule get_molecule_fetch
    nodeexecute sh ./get_molecule_fetch
    nodeexecute HOME/bin/dock.OS -i dock_run -o dock_out
    copy nodedock_out ./results/dock_out.jobname
    copy nodedock_cnt.mol2 ./results/dock_cnt.mol2.jobname
    copy nodedock_chm.mol2 ./results/dock_chm.mol2.jobname
    copy nodedock_nrg.mol2 ./results/dock_nrg.mol2.jobname
endtask
28
Nimrod/TurboLinux enFuzion GUI tools for
Parameter Modeling
29
Docking Experiment Preparation
  • Setup PDB DataGrid
  • Index PDB databases
  • Pre-stage (all) Protein Data Bank (PDB) on
    replica sites
  • Start PDB Server
  • Create Docking GridScore (receptor surface
    details) for a given receptor on home node.
  • Pre-Staging Large Files required for Docking
  • Pre-stage Dock executables and PDB access client
    on Grid nodes, if required (e.g., dock.Linux,
    dock.SunOS, dock.IRIX64, and dock.OSF1 on Linux,
    Sun, SGI, and Compaq machines respectively). Use
    globus-rcp.
  • Pre-stage/Cache all data files (3-13MB each)
    representing receptor details on Grid nodes.
  • This can be done on demand by Nimrod/G for each
    job, but a few input files are too large and they
    are required for all jobs. So,
    pre-staging/caching at http-cache or broker level
    is necessary to avoid the overhead of copying the
    same input files again and again!
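The point of the last bullet can be made concrete by counting transfers. The Python sketch below compares copying shared input files for every job with copying them once per node. The node names and the `plan_transfers` helper are hypothetical, and the function only models the bookkeeping, not an actual http-cache or broker.

```python
def plan_transfers(job_nodes, shared_files, cache=True):
    """Return the (node, file) copies needed to run the jobs in order.

    job_nodes lists the node that each job runs on. With cache=True a
    file already staged on a node is not copied there again.
    """
    staged = set()
    transfers = []
    for node in job_nodes:
        for name in shared_files:
            if cache and (node, name) in staged:
                continue
            staged.add((node, name))
            transfers.append((node, name))
    return transfers


if __name__ == "__main__":
    # Six jobs spread over two hypothetical nodes, two shared receptor files.
    nodes = ["nodeA", "nodeB", "nodeA", "nodeB", "nodeA", "nodeB"]
    files = ["ece.sph", "ece.grid"]  # "ece.grid" is an illustrative name
    print(len(plan_transfers(nodes, files, cache=False)))
    print(len(plan_transfers(nodes, files, cache=True)))
```

Without caching the copy count grows with the number of jobs. With caching it grows only with the number of nodes, which matters when each file is several megabytes.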

30
Protein Data Bank
  • Databases consist of small molecules from
    commercially available organic synthesis
    libraries, and natural product databases.
  • There is also the ability to screen virtual
    combinatorial databases, in their entirety.
  • This methodology allows only the required
    compounds to be subjected to physical screening
    and/or synthesis, reducing both time and expense.

31
Target Testcase
  • The target for the test case is endothelin
    converting enzyme (ECE). This is
    involved in heart stroke and other transient
    ischemia.
  • Ischemia: A decrease in the blood supply to a
    bodily organ, tissue, or part caused by
    constriction or obstruction of the blood vessels.

32
Resource Brokering Architecture for Molecular
Screening on World Wide Grid
[Diagram. User request: "Screen 2K molecules in 30min. for 10". Components: Nimrod/G Computational Grid Broker (Algorithm1 . . . AlgorithmN), Data Replica Catalogue, PDB Broker, Grid Info. Service, PDB Service (PDB2), and Grid Service Providers (GSP) GSP1, GSP2, GSP3, GSP4, GSPm, GSPn. Interactions numbered 1 to 7, with labels: "PDB replicas please?"; "advise PDB source?"; "selection advise use GSP4!"; "Is GSP4 healthy?"; "Screen mol.5 please?"; "mol.5 please?"; "process send results".]
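The request in this diagram combines a job count, a deadline, and a budget, and the broker has to pick Grid Service Providers that satisfy all three. The Python sketch below illustrates one simple way to do that: a greedy, cost-first allocation. It is not the Nimrod/G scheduling algorithm, which the slide only names as "Algorithm1 . . . AlgorithmN". The per-job times and prices are made-up numbers, and each provider is assumed to run its jobs one after another.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    name: str
    secs_per_job: float  # hypothetical time to screen one molecule
    cost_per_job: float  # hypothetical price, in budget units


def allocate(providers, n_jobs, deadline_secs, budget):
    """Fill the cheapest providers first, each up to what it can finish
    before the deadline. Returns {name: jobs}, or None if the deadline
    or the budget cannot be met."""
    plan = {}
    remaining = n_jobs
    spent = 0.0
    for p in sorted(providers, key=lambda p: p.cost_per_job):
        if remaining == 0:
            break
        capacity = int(deadline_secs // p.secs_per_job)
        take = min(capacity, remaining)
        if take > 0:
            plan[p.name] = take
            remaining -= take
            spent += take * p.cost_per_job
    if remaining > 0 or spent > budget:
        return None
    return plan


if __name__ == "__main__":
    gsps = [
        Provider("GSP1", secs_per_job=1.0, cost_per_job=0.004),
        Provider("GSP2", secs_per_job=2.0, cost_per_job=0.006),
    ]
    # 2000 molecules, 30 minutes, budget of 10 units.
    print(allocate(gsps, 2000, 30 * 60, 10))
```

A real broker would also re-check provider health and re-plan as jobs complete, as the "Is GSP4 healthy?" label suggests. This sketch only covers the initial feasibility check.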
33
Nimrod/G in Action: Screening on World-Wide Grid
34
Feels Inspired and Goes Surfing
35
Watson-II again sees the Grid researcher on the beach and
asks him a favor!
Can I borrow your Grid identity for 2 days?
The nice Grid researcher trusts Watson and gives him
his Grid identity, including access to his World
Wide Grid testbed!
Grid Trust on the Beach!
36
Watson-II @ Robinson Village
37
Day 3 @ Robinson Village
38
Watson-II continues surfing ???
39
Watson Gets an Idea while surfing
40
Goes to the Internet Room and connects to the Grid
researcher's machine
41
Connects to his U.Lecce lab machine and copies
all protein samples he prepared before taking
his holiday
42
Copies the Grid researcher's test experiment and
modifies it to use his lab experiment data.
43
Starts Parameter Exploration
44
Starts Molecular Experimentation
[Diagram. User request: "Screen 50K molecules in 120min. for 200". Components: Nimrod/G Computational Grid Broker (Algorithm1 . . . AlgorithmN), Data Replica Catalogue, PDB Broker, Grid Info. Service, PDB Service (PDB2), and GSP1, GSP2, GSP3, GSP4, GSPm, GSPn. Interactions numbered 1 to 6, with labels: "PDB replicas please?"; "advise PDB source?"; "use GSP4!"; "Is GSP4 healthy?"; "Screen mol.5 please?"; "mol.5 please?".]
45
Nimrod/G in Action: Screening on World-Wide Grid
46
Watson-II continues surfing ???
47
Comes back to the Internet Room after 2 hours and
asks his assistant to test the results
48
Watson-II's assistant conducts tests in the afternoon
Sends email to Watson in the evening: looks
like our client is improving
49
Day 4 @ Robinson Village
50
Watson-II does some more exploration, this time
with one million molecules. Asks Nimrod to email
results to his assistant for testing...
51
Starts Parameter Exploration
52
Watson-II continues surfing ???
53
After Lunch
  • Watson-II reads the email that he received from his
    assistant and is pleased with the results of his
    experiment.
  • Sends email to the VC of his university to do a press
    release of his breakthrough discovery

54
Did Watson-II invent a cure for AIDS?
Yes?
55
Vice Chancellor calls for Press Meeting
  • The news spreads like the "I Love You" virus around
    the world, including Sweden and Norway!

56
Day 5 @ Robinson Village
57
Watson-II continues surfing ???
58
Watson-II receives a phone call from Sweden
59
Sweden Announces Nobel Award for a Scientist on
Holiday @ Robinson Village!
60
Watson-II the Great in the evening @ Robinson
Village
61
Watson-II shares the success with the Grid
researcher!!!
62
Day 6 and Beyond!
63
Watson-II Family returns to their home happily.
64
Watson-II having moonlight dinner with his Wife
@ home
65
Prof. Watson-II works up to 5pm in Lab @
University of Lecce
Goes to Work @ 9am
Returns home by 5.30PM!
66
Watson-II had moonlight dinner with his Wife @
home all his life!
67
Do you want to repeat Watson-II's success in High
Energy Physics?
68
If so, download the Software and Explore it in 2006
when the LHC expt. starts
  • Nimrod: Parametric Computing
  • http://www.csse.monash.edu.au/davida/nimrod/
  • Economy Grid: Nimrod/G
  • http://www.buyya.com/ecogrid/
  • Virtual Laboratory/DesignDrug@Home
  • http://www.buyya.com/dd_at_home/
  • Grid Simulation (Java based)
  • http://www.buyya.com/gridsim/
  • World Wide Grid testbed
  • http://www.buyya.com/ecogrid/wwg/
  • Looking for new volunteers to grow ?
  • Please contact me to barter your and our machines!
  • Want to build on our work/collaborate?
  • Talk to me now or email rajkumar@csse.monash.edu.au

69
(No Transcript)
70
Further Information
  • Books
  • High Performance Cluster Computing, V1, V2,
    R.Buyya (Ed), Prentice Hall, 1999.
  • The GRID, I. Foster and C. Kesselman (Eds),
    Morgan-Kaufmann, 1999.
  • IEEE Task Force on Cluster Computing
  • http://www.ieeetfcc.org
  • Global Grid Forum
  • www.gridforum.org
  • IEEE/ACM CCGridxy www.ccgrid.org
  • CCGrid 2002, Berlin ccgrid2002.zib.de
  • Grid workshop - www.gridcomputing.org

71
Further Information
  • Cluster Computing Info Centre
  • http://www.buyya.com/cluster/
  • Grid Computing Info Centre
  • http://www.gridcomputing.com
  • IEEE DS Online - Grid Computing area
  • http://computer.org/dsonline/gc
  • Compute Power Market Project
  • http://www.ComputePower.com