CLADE Review 2003-2008

Nancy Wilkins-Diehr, wilkinsn_at_sdsc.edu, CLADE 2008, June 23, 2008



1
CLADE Review 2003-2008
  • Nancy Wilkins-Diehr
  • wilkinsn_at_sdsc.edu

2
The Origin of CLADE
  • The CLADE workshop began with a discussion at
    HPDC-11, July 24-26, 2002, at Edinburgh
    International Conference Center in Scotland.
  • Salim Hariri, C.S. Raghavendra, and I and likely
    a couple of others got to talking about the state
    of Grid applications.
  • At that time quite a lot of progress had been
    made with tools and technologies for distributed
    applications, but we were not seeing many
    applications papers at HPDC, or in other forums
    either.
  • So Salim suggested that we put together a
    workshop to focus attention on applications, and
    he asked me to help organize it.
  • Ray Bair

3
Keys to the Success of CLADE
  • Complements the HPDC program
  • Focus on real applications that demonstrate the
    use of Grid approaches on a significant scale.
  • CLADE's association with HPDC still distinguishes
    it from other conferences
  • Bringing together cutting edge computer science
    and applications
  • Support of the HPDC Steering Committee
  • Strong Program Committee chairs
  • Good advice from CLADE's Steering Committee
  • Engaged Program Committee members
  • Peer-review system has been important in
    selecting good papers that are timely and
    interesting
  • Distribution of the CLADE proceedings at the
    workshop increases the value and usefulness of
    the papers to the participants

4
2008 CLADE Organization
  • STEERING COMMITTEE
  • Raymond Bair, ANL
  • Ioana Banicescu, Mississippi State Univ.
  • Francine Berman, Univ. of Calif., San Diego
  • Jack Dongarra, Univ. of Tenn., Knoxville
  • Salim Hariri, University of Arizona
  • Manish Parashar, Rutgers University
  • Viktor Prasanna, Univ. of Southern Calif.
  • Joel Saltz, Ohio State University
  • Edward Seidel, Louisiana State University
  • Alan Sussman, University of Maryland
  • PROGRAM COMMITTEE
  • Henrique Andrade, IBM Research
  • David Bernholdt, ORNL
  • Jiannong Cao, HK PolyU
  • Umit Catalyurek, Ohio State U.
  • Kenneth Chiu, U. Binghamton
  • Jose Cunha, U. Nova de Lisboa
  • Ewa Deelman, ISI
  • Frederic Desprez, ENS Lyon
  • Hai Jin, HUST
  • Tevfik Kosar, Louisiana State U.
  • Tahsin Kurc, Ohio State U.
  • Jysoo Lee, Calit2
  • Sang Boem Lim, KonKuk U.
  • David Lowenthal, U. Georgia
  • Malika Mahoui, IUPUI
  • James Myers, NCSA
  • Gregory Newby, Arctic Region Supercomputing
    Center
  • Jun Ni, U. Iowa
  • Yoonho Park, IBM Research
  • Marlon Pierce, Indiana U.
  • Ilkyun Ra, U. Colorado Denver
  • Thomas Rauber, U. Bayreuth
  • Gudula Rünger, TU Chemnitz
  • Edward Walker, TACC
  • Shaowen Wang, UIUC
5
Today's Talk
  • Overview of CLADE keynotes, 2003-2007
  • 2003: Dynamic Data Driven Application Systems,
    Frederica Darema
  • 2004: A Grid based Diagnostics and Prognosis
    System for Rolls Royce Aero Engines: The DAME
    Project, Jim Austin
  • 2005: Enabling Science and Engineering
    Applications on the Grid, Ed Seidel
  • 2006: Gridcast - a Next Generation Broadcasting
    Infrastructure?, Terry Harmer
  • 2007: The Cancer Biomedical Informatics Grid:
    Connecting the Cancer Research Community, Scott
    Oster
  • TeraGrid Science Gateways

6
CLADE 2003, Seattle
  • Keynote Presentation
  • Frederica Darema, Senior Science and Technology
    Advisor and Director of the Next Generation
    Software Program, National Science Foundation
  • Dynamic Data Driven Application Systems
  • Highlighted the relationship between theory,
    simulation and experiment or field data
  • Dynamic feedback and control loop between
    simulation and experimental data
  • DDDAS has the potential for significant impact on
    science, engineering, and the commercial world,
    akin to the transformation effected since the
    1950s by the advent of computers
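
The feedback loop between simulation and measured data can be illustrated with a small predict-then-correct cycle. The sketch below is only one simple instance of the idea, a scalar Kalman-style estimator; the talk does not prescribe this method, and the linear model, the noise values, and all names (Estimate, predict, correct, run_loop) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    value: float     # current best estimate of the state
    variance: float  # uncertainty attached to that estimate


def predict(est, growth, model_noise):
    """Simulation step: advance the state with a hypothetical linear model."""
    return Estimate(est.value * growth, est.variance * growth ** 2 + model_noise)


def correct(est, measurement, sensor_noise):
    """Data step: pull the prediction toward a measurement, weighted by uncertainty."""
    gain = est.variance / (est.variance + sensor_noise)
    value = est.value + gain * (measurement - est.value)
    return Estimate(value, (1.0 - gain) * est.variance)


def run_loop(initial, measurements, growth=1.0, model_noise=0.01, sensor_noise=0.25):
    """Alternate simulation and data injection, one measurement per step."""
    est = initial
    for z in measurements:
        est = predict(est, growth, model_noise)
        est = correct(est, z, sensor_noise)
    return est


if __name__ == "__main__":
    # Start from a poor guess (0.0) while the sensor keeps reporting 1.0.
    final = run_loop(Estimate(0.0, 1.0), [1.0] * 50)
    print(final)
```

In this toy setting the estimate moves toward the measured value and its variance shrinks as data arrives. In a full DDDAS the loop would also run the other way, with the simulation steering what is measured next; that half is not modelled here.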

7
Example DDDAS Applications
  • Generalized methodology for state estimation and
    prediction
  • Predictor-Corrector methods
  • Advanced Driving Assistance Systems for
    automobiles
  • Tracking algorithms for Air Traffic Control
  • Enhancing oil exploration methods and
    capabilities
  • Enhanced manufacturing supply chains through
    sensor information

Source: Frederica Darema
8
  • Virtual operations re-planning and control
  • Event-driven simulations for systems subject to
    unplanned outages
  • Earthquake tolerant buildings and bridges
  • Fire propagation prediction and management

Source: Frederica Darema
9
  • Integrated Image-Guided Interventions
  • Real-time, three-dimensional (3D) imaging needs
    of surgeons.
  • Biodiversity and bio-complexity
  • Dramatic changes due to habitat transformation,
    invasions of exotic species, chemical
    contamination, diseases and epidemics, climate
    change, and floods and drought

Source: Frederica Darema
10
  • Hydro-complexity: Weather, Water and Pollution
  • Design and configuration methodologies for sensor
    networks
  • The oceanographic community at large has
    interests in DDDAS in order to help optimize
    observing systems for important scientific
    studies.

Source: Frederica Darema
11
CLADE 2004, Honolulu
  • Keynote Presentation
  • Jim Austin, University of York
  • A Grid based Diagnostics and Prognosis System for
    Rolls Royce Aero Engines: The DAME Project
  • Very practical engineering application
  • Applies a distributed, data-intensive Grid
    application to the diagnosis and prognosis of
    Rolls-Royce aero engines

12
Distributed Aircraft Maintenance Environment
(DAME)
  • UK e-Science pilot project
  • QUOTE
  • Neural network-based techniques for real-time
    monitoring
  • Compare stored vibration data with instantaneous
    snapshots
  • Each flight produces 1 GB of data, TBs per year of
    distributed data for a fleet.
  • AURA
  • Advanced Uncertain Reasoning Architecture for
    Pattern Matching
  • Pattern matching among terascale datasets,
    distribute for speed
  • CBR
  • Case Based Reasoning systems for intelligent
    decision support
  • Correlates engine anomalies with root cause
  • Combine into scalable system using grid
    middleware
  • Utilising large amounts of vibration and
    performance data available from modern
    aero-engines for fleet based diagnostics

Source: Jim Austin
13
  • Fault diagnosis and prognosis integrated with
    predictive maintenance
  • Detect that engine has deviated from normal
    (QUOTE)
  • Diagnose why (AURA)
  • Form a prognosis (CBR)
  • Plan remedial actions
  • Common components of all fault diagnosis and
    prognosis systems

Source: Jim Austin
14
  • Quality of Service and Security are the two most
    important project concerns
  • QoS critical for commercial deployment, SLAs will
    likely be a necessity
  • Workgroup formed to focus on security
  • Future directions
  • Base services can be used with many other apps
  • Put core services into a portal
  • More flexible workflow configurations
  • Current project considered a demonstration
    project
  • Commercial implementation will need high
    availability, reliability, data integrity,
    confidentiality

Source: Jim Austin
15
CLADE 2005, Research Triangle Park, NC
  • Keynote Presentation
  • Ed Seidel, Louisiana State University
  • Enabling Science and Engineering Applications on
    the Grid
  • Ed Seidel was recently named director of the
    Office of Cyberinfrastructure at NSF, reporting
    to Dr. Bement
  • Many years of experience with distributed
    applications and high performance computing

16
Optical Networks 1000x faster than regional
What are people doing with this?
  • Collaboration
  • Distributed communities (NEES, GEON), shared CI
    data, code, tools, resources, simulations
  • Standard things
  • Task farming, resource brokering, remote steering
  • New scenarios
  • Apps abstracted; dynamic apps find their own
    services, resources, people; distributed apps
    spawned, monitored
  • Grids bring it all together, but worries in the
    US about DOE, NSF CI funding

Source: Ed Seidel
17
Distributed computation: the old way
  • Why?
  • Capacity: computers can't keep up with needs
  • Throughput
  • Issues
  • Bandwidth (increasing faster than computation)
  • Latency
  • Communication needs, topology
  • Communication/computation
  • Techniques to be developed
  • Overlapping communication/computation
  • Extra ghost zones to reduce latency
  • Compression
  • Algorithms to do this for the scientist
  • Gridlab.org, cactuscode.org

Source: Ed Seidel
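
The "extra ghost zones" technique trades a little redundant computation for fewer messages: if each sub-domain carries a ghost region k cells wide, it can take k stencil steps before it has to talk to its neighbour again. The sketch below is a minimal illustration under stated assumptions, not code from Cactus or GridLab: a 1D explicit diffusion stencil, two sub-domains held in one process, and list copies standing in for network messages. The function names and the alpha value are hypothetical.

```python
def step(u, alpha=0.25):
    """One explicit diffusion step; the two end cells are held fixed."""
    interior = [u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
                for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]


def run_global(u, steps):
    """Reference: the whole domain advanced as a single piece."""
    for _ in range(steps):
        u = step(u)
    return u


def run_split(u, steps, ghost):
    """Two sub-domains that exchange `ghost` cells once every `ghost` steps."""
    mid = len(u) // 2
    if not 1 <= ghost <= mid:
        raise ValueError("ghost width must be between 1 and half the domain")
    left, right = u[:mid], u[mid:]
    exchanges = 0
    done = 0
    while done < steps:
        k = min(ghost, steps - done)
        # One "message" each way: copy k boundary cells from the neighbour.
        left_ext = left + right[:k]
        right_ext = left[-k:] + right
        exchanges += 1
        # Each step invalidates one more ghost cell, so k ghosts buy k steps.
        for _ in range(k):
            left_ext = step(left_ext)
            right_ext = step(right_ext)
        left = left_ext[:mid]   # keep owned cells, drop stale ghosts
        right = right_ext[k:]
        done += k
    return left + right, exchanges


if __name__ == "__main__":
    u0 = [0.0] * 16
    u0[7] = 1.0
    _, wide = run_split(u0, 8, ghost=4)
    _, narrow = run_split(u0, 8, ghost=1)
    print("exchanges with ghost=4:", wide, "with ghost=1:", narrow)
```

For 8 steps, a ghost width of 4 needs 2 exchanges where a width of 1 needs 8, and both reproduce the single-domain result. On a high-latency wide-area link, fewer but larger messages is the point of the technique; the cost is the extra work spent updating ghost cells that are later discarded.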
18
Distributed computation: the new way
  • Intelligent parameter surveys, Monte Carlos
  • May control other simulations
  • Dynamic staging: move to faster/cheaper/bigger
    machine (Grid Worm)
  • Need more memory? Need less?
  • Multiple universe: clone to investigate steered
    parameter (Grid Virus)
  • Automatic component loading
  • Needs of process change, discover/load/execute
    new component somewhere
  • Automatic look ahead, convergence testing
  • Spawn off and run coarser resolution to predict
    likely future, study convergence
  • Routine profiling
  • Best machine/queue, choose resolution parameters
    based on queue
  • Dynamic load balancing: inhomogeneous loads,
    multiple grids
  • DDDAS: injecting data into the above, feed back
    to experiment

Source: Ed Seidel
19
GridLab: 5M EU Project
  • Code/User/Infrastructure should be aware of
    environment
  • Discover resources available NOW, and their
    current state
  • What is my allocation?
  • What is the bandwidth/latency between sites?
  • Code/User/Infrastructure should be able to make
    decisions
  • A slow part of my simulation can run
    asynchronously: spawn it off!
  • New, more powerful resources just became
    available: migrate there!
  • Machine went down: reconfigure and recover!
  • Need more memory (or less!): get it by adding
    (dropping) machines!
  • Code/User/Infrastructure should be able to
    publish to central server for tracking,
    monitoring, steering
  • Unexpected event: notify users!
  • Collaborators from around the world all connect,
    examine simulation.
  • Rethink algorithms: task farming, vectors,
    pipelines, etc. all apply on Grids. The Grid IS
    your Computer!

Source: Ed Seidel
20
Ed's Conclusions
  • Optical Networks, grids promise new ways of
    computing
  • Networks need application toolkits, reasonable
    cost model
  • Standards developing
  • 15 years ago parallel computing drove
    interconnects, HPF, MPI
  • Now 2 levels: OGSA grid services, SAGA for apps
  • GridLab: www.gridlab.org
  • Grid Application Toolkit: www.gridlab.org/GAT
  • Documentation, publications, software download
  • Cactus Computational Toolkit: www.cactuscode.org
  • GGF Simple API for Grid Applications (SAGA)
  • Today, SAGA continues as an active research group
    in the Open Grid Forum (OGF)
  • Paper presentation on GAT/SAGA at TeraGrid '08
    last week

Source: Ed Seidel
21
CLADE 2006, Paris
  • Keynote Presentation
  • Terry Harmer, Technical Director of the Belfast
    e-Science Centre (BeSC)
  • Gridcast - a Next Generation Broadcasting
    Infrastructure?
  • Media broadcasting
  • BBC has offices in most world capitals
  • Large scale, distributed, dynamic, highly
    reactive management of broadcast content
  • A prototype broadcasting grid has been deployed
    since 2004
  • UK e-Science project
  • 50% of funding for UK e-Science centers must come
    from industry
22
Broadcasting is distributed
Undergoing rapid technical change
  • Grid can potentially address technical challenges
  • Secure, wide area distribution of high volume
    content
  • Secure remote access to high value technical
    resources
  • Advanced editing suites
  • Integration of devices, equipment, applications
  • Economic challenges to deliver cost-effective,
    resilient, extensible infrastructure in a rapidly
    changing environment
  • BBC wanted to move to commodity infrastructure
  • 280 gig per hour in data movement
  • Grid as integration framework
  • Tie together various platforms
  • Deploy software
  • Not really for computing at this stage
  • 13 May, 2008
  • BeSC awarded over 900,000 to continue its role
    in developing the successor to the world wide web
  • Use of grid via Gridcast provides greater
    programming autonomy among BBC sites

Source: Terry Harmer
23
CLADE 2007, Monterey, CA
  • Keynote Presentation
  • Scott Oster, Ohio State University
  • The Cancer Biomedical Informatics Grid:
    Connecting the Cancer Research Community
  • Goal: Relieve suffering due to cancer by 2015
  • 61 cancer labs supported by the National Cancer
    Institute (NCI)
  • More than 50 of these, 30 organizations, 800
    people involved in caBIG
  • Create scalable, actively managed organization
    that will connect members of the NCI-supported
    cancer enterprise by building a biomedical
    informatics network

24
caBIG Motivation
  • This year there will be approximately 1,400,000
    Americans diagnosed with cancer
  • More than 500,000 Americans are expected to die
    from cancer this year
  • In 2005, the NIH estimated costs for cancer at
    $209.9 billion, with direct medical costs of $74
    billion

Source: Scott Oster
25
What is caBIG?
  • Common, widely distributed infrastructure that
    permits the cancer research community to focus on
    innovation
  • Shared, harmonized set of terminology, data
    elements, and data models that facilitate
    information exchange
  • Collection of interoperable applications
    developed to common standards
  • Cancer research data available for mining and
    integration

Source: Scott Oster
26
Driving Needs
  • A multitude of legacy information systems, most
    of which cannot be readily shared between
    institutions
  • Difficulty in identifying and accessing available
    resources
  • Approach: standards-based grid, WSRF web
    services, Introduce
  • But standards in Web/Grid service domain are
    turbulent at best
  • Competing interests of big business and
    multiple standards bodies
  • An absence of tools to connect different
    databases
  • An absence of common data formats
  • Approach: Adopt XML as data exchange format
  • Cancer Data Standards Repository (caDSR) captures
    logical model with annotations; facilitates reuse
    and formal definition
  • A huge and growing volume of data must be
    collected, analyzed, and made accessible
  • GridFTP, move services to data
  • Few common vocabularies, making it difficult, if
    not impossible, to interlink diverse research and
    clinical results

Source: Scott Oster
27
  • An absence of information infrastructure to share
    data within an institution, or among different
    institutions
  • If cancer is cured, and caBIG resources play a
    role, there will be much interest in knowing who
    contributed what (and who funded them)
  • Technical Approach
  • Single sign on, Grid Authentication and
    Authorization with Reliably Distributed Services
    (GAARDS)
  • Federated Identity Management (Dorian)
  • Authorization solutions
  • GridGrouper for group-based
  • CSM for local policy
  • Globus PDPs for complex rules
  • Institutional Review Boards (IRB) involved for
    any protected health information (PHI) even for
    de-identified data
  • Grid is multi-institutional which means IRBs must
    reach agreements (read: separately employed
    lawyers working together)
  • Socio-Cultural Approach
  • Whole workspace in caBIG dedicated to it (DSIC)
  • NCI in a good position to encourage it
  • Large percentage of institutions' cancer research
    funding comes from NCI
  • Hope is motivation will be value-based once
    initially primed

Source: Scott Oster
28
Scott's Summary
  • The bad news
  • Large-scale, distributed knowledge sharing is
    hard
  • The good news
  • The potential rewards are large
  • The good news (for computer scientists)
  • There are lots of unsolved problems (and interest
    in getting them solved)
  • Disparate Systems
  • Lack of Common Data Formats
  • Data Interoperability
  • Finding Resources
  • Data Size
  • User Accounting
  • Data Privacy
  • Intellectual Capital
  • Complicated Trust Arrangements
  • Computationally Intensive
  • Evolving Infrastructure

Source: Scott Oster
29
TeraGrid Science Gateways
30
Phenomenal Impact of the Internet on Worldwide
Communication and Information Retrieval
Only 16 years since the release of Mosaic!
  • Implications on the conduct of science are still
    evolving
  • 1980s: Early gateways, National Center for
    Biotechnology Information BLAST server, search
    results sent by email, still a working portal
    today
  • 1992: Mosaic web browser developed
  • 1995: International Protein Data Bank Enhanced by
    Computer Browser
  • 2004: TeraGrid project director Rick Stevens
    recognized growth in scientific portal
    development and proposed the Science Gateway
    Program
  • Simultaneous explosion of digital information
  • Analysis needs in a variety of scientific areas
  • Sensors, telescopes, satellites, digital images
    and video
  • #1 machine on Top500 today is more powerful than
    all combined entries on the first list in 1993

31
1998 Workshop Highlights Early Impact of Internet
on Science
  • Shared access to geographically dispersed
    resources
  • Assembling the best minds to tackle the toughest
    problems regardless of location
  • Tackling the same problems differently, but also
    tackling different problems
  • Not only the scope, but the process of scientific
    investigation is changed
  • As the chemical applications and capabilities
    provided by collaboratories become more familiar,
    researchers will move significantly beyond
    current practice to exciting new paradigms for
    scientific work

Requirements for future success include
  • Development of interdisciplinary partnerships of
    chemists and computer scientists
  • Flexible and extensible frameworks for
    collaboratories
  • Means to deploy, support, and evaluate
    collaboratories in the field
32
Rapid Advances in Web Usability
  • First generation
  • Static Web pages
  • Second generation
  • Dynamic, database interfaces, cgi
  • Lacked the ease of use of desktop applications
  • Third generation
  • True networked and internetworked applications
    that enable dynamic two-way, even multi-way,
    communication and collaboration on the Web.
  • Remarkable new uses of the Web in the
    organizational workplace and on the Internet

Source: Screen Porch White Paper, The University
of Western Ontario (1996)
33
The Internet as a Resource for News and Information about Science: Summary of Findings at a Glance
  • 40 million Americans rely on the internet as their primary source for news and information about science.
  • For home broadband users, the internet and television are equally popular as sources for science news, and the internet leads the way for young broadband users.
  • The internet is the source to which people would turn first if they need information on a specific scientific topic.
  • The internet is a research tool for 87% of online users. That translates to 128 million adults.
  • Consumers of online science information are fact-checkers of scientific claims. Sometimes they use the internet for this, other times they use offline sources.
  • Convenience plays a large role in drawing people to the internet for science information.
  • Happenstance also plays a role in users' experience with online science resources. Two-thirds of internet users say they have come upon news and information about science when they went online for another reason.
  • Those who seek out science news or information on the internet are more likely than others to believe that scientific pursuits have a positive impact on society.
  • Internet users who have sought science information online are more likely to report that they have higher levels of understanding of science.
  • Between 40% and 50% of internet users say they get information about a specific topic using the internet or through email.
  • Search engines are far and away the most popular source for beginning science research among users who say they would turn first to the internet to get more information about a specific topic.
  • Half of all internet users have been to a website which specializes in scientific content.
  • Fully 59% of Americans have been to a science museum in the past year.
  • Science websites and science museums may serve effectively as portals to one another.
The convenience of getting scientific material on
the web opens doors to better attitudes and
understanding of science.
Source: John B. Horrigan, Associate Director, November 20, 2006,
http://www.pewinternet.org/pdfs/PIP_Exploratorium_Science.pdf
34
NSF (my sponsor) has long recognized the
importance of science and technology interactions
  • Interdisciplinary programs did much to facilitate
    application-technology integration and develop
    standard tools
  • 1997: PACI Program
  • Shotgun marriages of technologists and
    application scientists
  • A few groups served as path finders and
    benefited tremendously
  • NPACI neuroscience thrust in 1997 leads
    to Telescience portal and BIRN in 2001
  • Information Technology Research (ITR)
  • NSF Middleware Initiative (NMI)
  • Plug and play tools so more groups can benefit

35
NSF Continues Its Leadership Today
What Will Lead to Transformative Science?
  • Virtual environments have the potential to
    enhance collaboration, education, and
    experimentation in ways that we are just
    beginning to explore.
  • In every discipline, we need new techniques that
    can help scientists and engineers uncover fresh
    knowledge from vast amounts of data generated by
    sensors, telescopes, satellites, or even the
    media and the Internet.

Gateways are a terrific example of interfaces
that can support transformative science
36
Evolution of the Gateway Program
  • 2004: TeraGrid Science Gateway term originates
  • We will help them build gateway portals that
    leverage TeraGrid capabilities and provide
    web-based interfaces to community tools
  • 2005: Gateway requirements analysis team
  • Areas of identified commonality include
  • Web services, auditing, community accounts,
    flexible allocations, scheduling, outreach
  • Needs of command-line supercomputing users fairly
    well defined
  • SSH to tg-login
  • Data transfer to and from supercomputer
  • Software
  • MPI, math libraries, domain software
  • Compilers
  • Batch queue submission
  • Help desk
  • Need to address Gateway developer needs just as
    efficiently

37
Tremendous Opportunities Using the Largest Shared
Resources - Challenges too!
  • What's different when the resource doesn't belong
    just to me?
  • Resource discovery
  • Accounting
  • Security
  • Proposal-based requests for resources
    (peer-reviewed access)
  • Code scaling and performance numbers
  • Justification of resources
  • Gateway citations
  • Tremendous benefits at the high end, but even
    more work for the developers
  • Potential impact on science is huge
  • Small number of developers can impact thousands
    of scientists
  • But need a way to train and fund those developers
    and provide them with appropriate tools

38
Ongoing Work to Meet Common Needs
  • Web Services
  • GT4 deployment, identification of remaining
    capabilities
  • Information services, MDS
  • Registry of Gateway services
  • TG-specific: "where can I run soonest" with QBETS
  • Auditing
  • GRAM audit to retrieve usage information for
    individual compute jobs
  • GridShib
  • Counting gateway users, individualized
    accounting, increased security
  • Community Accounts
  • Policy finalized, security approaches being
    tested by RPs
  • GridShib development, testing with gateways
  • Resource requests
  • Collaboration with reviewers to develop
    guidelines for Gateway PIs
  • Adapt to usage uncertainties, ability to assess
    impact, Gateway management structure
  • Scheduling
  • Metascheduling
  • On-demand via SPRUCE framework
  • Outreach
  • Pathways project
  • Gateway use by educators
  • Training MSI students to build Gateways
  • Documentation
  • Extensive wiki information transformed into
    navigable documentation
  • Gateway Hosting
  • Available at IU through peer review
  • Staff Support
  • Targeted support, general capabilities,
    production coordinator

39
Variety of Gateways Available Today
Title | Discipline
Open Science Grid (OSG) | Advanced Scientific Computing
Special PRiority and Urgent Computing Environment (SPRUCE) | Advanced Scientific Computing
Massive Pulsar Surveys using the Arecibo L-band Feed Array (ALFA) | Astronomical Sciences
National Virtual Observatory (NVO) | Astronomical Sciences
Linked Environments for Atmospheric Discovery (LEAD) | Atmospheric Sciences
Computational Chemistry Grid (GridChem) | Chemistry
Computational Science and Engineering Online (CSE-Online) | Chemistry
Network for Earthquake Engineering Simulation (NEES) | Earthquake Hazard Mitigation
GEOsciences Network (GEON) | Earth Sciences
Network for Computational Nanotechnology and nanoHUB | Emerging Technologies Initiation
TeraGrid Geographic Information Science Gateway (GISolve) | Geography and Regional Science
CIG Science Gateway for the Geodynamics Community | Geophysics
QuakeSim | Geophysics
The Earth System Grid (ESG) | Global Atmospheric Research
National Biomedical Computation Resource (NBCR) | Integrative Biology and Neuroscience
Developing Social Informatics Data Grid (SIDGrid) | Language, Cognition, and Social Behavior
Neutron Science TeraGrid Gateway (NSTG) | Materials Research
Biology and Biomedicine Science Gateway | Molecular Biosciences
Open Life Sciences Gateway (OLSG) | Molecular Biosciences
The Telescience Project | Neuroscience Biology
Grid Analysis Environment (GAE) | Physics
SCEC Earthworks Project | Seismology
TeraGrid Visualization Gateway | Visualization, Graphics, and Image Processing
40
Easy Gateway True and False Test
Answers Provided
  • TeraGrid selects all gateways (F)
  • TeraGrid designs all gateways (F)
  • TeraGrid limits the number of gateways (F)
  • All gateways need TeraGrid funding to exist (F)
  • Any PI can request an allocation and use it to
    develop a gateway (T)
  • Gateway design is community-developed and that is
    the core strength of the program (T)
  • TeraGrid staff are alerted to gateway work when a
    proposal is reviewed or when a community account
    is requested (T)
  • Limited TeraGrid support can be provided for
    targeted assistance to integrate an existing
    gateway with TeraGrid (T)

41
Gateway Idea Resonates with Scientists
  • Capabilities provided by the Web are easy to
    envision because we use them in everyday life
  • Researchers can imagine scientific capabilities
    provided through a familiar interface
  • Groups resonate with the fact that gateways are
    designed by communities and provide interfaces
    understood by those communities
  • But also provide access to greater capabilities
    on the back end without the user needing to
    understand the details of those capabilities
  • Scientists know they can undertake more complex
    analyses and that's all they want to focus on
  • But this seamless access doesn't come for free.
    It all hinges on very capable developers.

42
Gateways Greatly Expand Access
  • Almost anyone can investigate scientific
    questions using high end resources
  • Not just those in the research groups of those
    who request allocations
  • Fosters new ideas, cross-disciplinary approaches
  • Encourages students to experiment
  • But used in production too
  • Increasing number of papers resulting from the
    use of gateways
  • Scientists can focus on challenging science
    problems rather than challenging infrastructure
    problems

43
Highlights: NanoHub Explosive User Growth
  • In past 12 months
  • 68,975 users
  • 43% from U.S.
  • 25,187 course downloads
  • 8,287 podcast downloads
  • 371 online meetings
  • Full featured gateway
  • Simulation tools, curricula, multimedia, user
    contributions, collaborations

44
Highlights: LEAD Inspires Students
Advanced capabilities regardless of location
  • A student gets excited about what he was able to
    do with LEAD
  • Dr. Sikora: Attached is a display of 2-m T and
    wind depicting the WRF's interpretation of the
    coastal front on 14 February 2007. It's
    interesting that I found an example using IDV
    that parallels our discussion of mesoscale
    boundaries in class. It illustrates very nicely
    the transition to a coastal low and the strong
    baroclinic zone with a location very similar to
    Markowski's depiction. I created this image in
    IDV after running a 5-km WRF run (initialized
    with NAM output) via the LEAD Portal. This
    simple 1-level plot is just a precursor of the
    many capabilities IDV will eventually offer to
    visualize high-res WRF output. Enjoy!
  • Eric (email, March 2007)

45
Highlights: GridChem Employs a Client-Server
Approach
46
for Production Science
  • Chemical Reactivity of the Biradicaloid
    (HO...ONO) Singlet States of Peroxynitrous Acid.
    The Oxidation of Hydrocarbons, Sulfides, and
    Selenides. Bach, R. D. et al. J. Am. Chem. Soc.
    2005, 127, 3140-3155.
  • The "Somersault" Mechanism for the P-450
    Hydroxylation of Hydrocarbons. The Intervention
    of Transient Inverted Metastable Hydroperoxides.
    Bach, R. D.; Dmitrenko, O. J. Am. Chem. Soc.
    2006, 128(5), 1474-1488.
  • The Effect of Carbonyl Substitution on the Strain
    Energy of Small Ring Compounds and their
    Six-member Ring Reference Compounds. Bach, R. D.;
    Dmitrenko, O. J. Am. Chem. Soc. 2006, 128(14),
    4598.
  • Azide Reactions for Controlling Clean Silicon
    Surface Chemistry: Benzylazide on Si(100)-2 x 1.
    Bocharov, S. et al. J. Am. Chem. Soc. 2006,
    128(29), 9300-9301.
  • Chemistry of Diffusion Barrier Film Formation:
    Adsorption and Dissociation of
    Tetrakis(dimethylamino)titanium on Si(100)-2 x 1.
    Rodriguez-Reyes, J. C. F.; Teplyakov, A. V.
    J. Phys. Chem. C 2007, 111(12), 4800-4808.
  • Computational Studies of [2+2] and [4+2]
    Pericyclic Reactions between Phosphinoboranes and
    Alkenes. Steric and Electronic Effects in
    Identifying a Reactive Phosphinoborane that
    Should Avoid Dimerization. Gilbert, T. M.;
    Bachrach, S. M. Organometallics 2007, 26(10),
    2672-2678.

47
cancer Bioinformatics Grid: Addressing today's
challenges in cancer research and treatment
  • The mission of caBIG is to develop a truly
    collaborative information network that
    accelerates the discovery of new approaches for
    the detection, diagnosis, treatment, and
    prevention of cancer, ultimately improving
    patient outcomes.
  • The goals of caBIG are to
  • Connect scientists and practitioners through a
    shareable and interoperable infrastructure
  • Develop standard rules and a common language to
    more easily share information
  • Build or adapt tools for collecting, analyzing,
    integrating, and disseminating information
    associated with cancer research and care.

Source: cabig.cancer.gov
48
caBIG and TeraGrid
  • caBIG conducted study of all Gateways
  • Pleased to discover that community accounts and
    web services will exactly meet their requirements
  • TeraGrid resources incorporated into geWorkbench,
    an open source platform for integrated genomics
    used to
  • Load data from local or remote data sources.
  • Visualize gene expression and sequence data in a
    variety of ways.
  • Provide access to client- and server-side
    computational analysis tools such as t-test
    analysis, hierarchical clustering, self
    organizing maps, regulatory networks
    reconstruction, BLAST searches, pattern/motif
    discovery, etc.
  • Clustering is used to build groups of genes with
    related expression patterns which may contain
    functionally related proteins, such as enzymes
    for a specific pathway
  • Validate computational hypotheses through the
    integration of gene and pathway annotation
    information from curated sources as well as
    through Gene Ontology enrichment analysis.

49
geWorkbench Integrates TeraGrid Resources
Although the new service is TeraGrid-aware, the
perspective from geWorkbench does not change.
As far as geWorkbench is concerned, it is still
connecting to a Hierarchical Clustering caGrid
service. The difference is now the caGrid
service is a gateway service that submits a
TeraGrid job on behalf of geWorkbench.
geWorkbench, however, does not notice this
difference.
Source: http://wiki.c2b2.columbia.edu/informatics/index.php/GeWorkbench_Example
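
The arrangement described on this slide, where the client keeps calling the same clustering service while a gateway forwards the work elsewhere, can be sketched as two implementations behind one interface. Everything below is illustrative rather than caGrid or TeraGrid code: ClusteringService, FakeBatchSystem, and the "community" account name are hypothetical, the remote resource is simulated in-process, and a toy one-dimensional grouping stands in for real hierarchical clustering.

```python
from abc import ABC, abstractmethod


def group_by_gap(points, threshold):
    """Toy stand-in for clustering: split sorted values at gaps above threshold."""
    groups = []
    for p in sorted(points):
        if groups and p - groups[-1][-1] <= threshold:
            groups[-1].append(p)
        else:
            groups.append([p])
    return groups


class ClusteringService(ABC):
    """What the client sees: one call, wherever the work actually runs."""

    @abstractmethod
    def cluster(self, points, threshold):
        ...


class LocalClusteringService(ClusteringService):
    """Runs the analysis directly on the service host."""

    def cluster(self, points, threshold):
        return group_by_gap(points, threshold)


class FakeBatchSystem:
    """Hypothetical remote resource; records which account ran each job."""

    def __init__(self):
        self.log = []

    def submit(self, account, func, *args):
        self.log.append(account)
        return func(*args)  # a real system would queue the job and return later


class GatewayClusteringService(ClusteringService):
    """Same interface, but the job is submitted on the caller's behalf."""

    def __init__(self, batch, community_account):
        self.batch = batch
        self.community_account = community_account

    def cluster(self, points, threshold):
        return self.batch.submit(self.community_account, group_by_gap,
                                 points, threshold)


def client(service):
    """Client code: unaware of which implementation it was handed."""
    return service.cluster([1.0, 1.2, 5.0, 5.1, 9.0], threshold=0.5)


if __name__ == "__main__":
    batch = FakeBatchSystem()
    print(client(LocalClusteringService()))
    print(client(GatewayClusteringService(batch, "community")))
    print("jobs ran under:", batch.log)
```

The client function is identical in both cases, which is the property the slide emphasizes. Running the job under a single shared account is an assumption borrowed from the community-account model mentioned elsewhere in the talk.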
50
Hide the C in CLADE with a Gateway
When is a gateway appropriate?
  • Researchers using defined sets of tools in
    different ways
  • Same executables, different input
  • GridChem, CHARMM
  • Creating multi-scale or complex workflows
  • Datasets
  • Common data formats
  • National Virtual Observatory
  • Earth System Grid
  • Some groups have invested significant efforts
    here
  • caBIG, extensive discussions to develop common
    terminology and formats
  • BIRN, extensive data sharing agreements
  • Difficult to access data/advanced workflows
  • Sensor/radar input
  • LEAD, GEON

51
Tremendous Potential for Gateways
  • In only 16 years, the Web has fundamentally
    changed human communication
  • Science Gateways can leverage this amazingly
    powerful tool to
  • Transform the way scientists collaborate
  • Streamline conduct of science
  • Influence the public's perception of science
  • Reliability, trust, continuity are fundamental to
    truly change the conduct of science through the
    use of gateways
  • High end resources can have a profound impact
  • The future is very exciting!

52
Thank you for your attention
  • wilkinsn_at_sdsc.edu
  • www.teragrid.org