Title: Cyberinfrastructure Technologies and Applications
1 Cyberinfrastructure Technologies and Applications
- Summit on Cyberinfrastructure Innovation At Work
- Banff Springs Hotel
- Banff, Canada, October 11, 2007
- Geoffrey Fox
- Computer Science, Informatics, Physics
- Pervasive Technology Laboratories
- Indiana University, Bloomington IN 47401
- http://grids.ucs.indiana.edu/ptliupages/presentations/
- gcf_at_indiana.edu http://www.infomall.org
2 e-moreorlessanything
- "e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it," from its inventor John Taylor, Director General of Research Councils UK, Office of Science and Technology
- e-Science is about developing tools and technologies that allow scientists to do faster, better or different research
- Similarly, e-Business captures an emerging view of corporations as dynamic virtual organizations linking employees, customers and stakeholders across the world
- This generalizes to e-moreorlessanything, including presumably e-AlbertaEnterprise, e-oilandgas and e-geoscience
- A deluge of data of unprecedented and inevitable size must be managed and understood
- People (see Web 2.0), computers and data (including sensors and instruments) must be linked
- On-demand assignment of experts, computers, networks and storage resources must be supported
3 What is Cyberinfrastructure
- Cyberinfrastructure is (from NSF) infrastructure that supports distributed science (e-Science): data, people, computers
- Clearly the core concept is more general than science
- It exploits Internet technology (Web 2.0), adding (via Grid technology) management, security, supercomputers etc.
- It has two aspects: parallel, with low latency (microseconds) between nodes, and distributed, with higher latency (milliseconds) between nodes
- The parallel aspect is needed to get high performance on individual large simulations, data analysis etc.; one must decompose the problem
- The distributed aspect integrates already distinct components and is especially natural for data
- Cyberinfrastructure is in general a distributed collection of parallel systems
- Cyberinfrastructure is made of services (originally Web services) that are just programs or data sources packaged for distributed access
4 Underpinnings of Cyberinfrastructure
- Distributed software systems are being revolutionized by developments from e-commerce, e-Science and the consumer Internet. There is rapid progress in technology families termed Web services, Grids and Web 2.0
- The emerging distributed system picture is of distributed services with advertised interfaces but opaque implementations, communicating by streams of messages over a variety of protocols
- Complete systems are built by combining either services or predefined/pre-existing collections of services together to achieve new capabilities
- As well as Internet/Communication revolutions (distributed systems), multicore chips will likely be hugely important (parallel systems)
- Industry, not academia, is leading innovation in these technologies
5 Service or Web Service Approach
- One uses GML, CML etc. to define the data structure in a system, and one uses services to capture methods or programs
- In eScience, important services fall in three classes:
  - Simulations
  - Data access, storage, federation, discovery
  - Filters for data mining and manipulation
- Services could use something like WSDL (Web Services Description Language) to define interoperable interfaces, but Web 2.0 follows old library practice: one just specifies the interface
- The Service Interface (WSDL) establishes a contract, independent of implementation, between two services or between a service and a client
- Services should be loosely coupled, which normally means they are coarse grain
- Services will be composed (linked together) by mashups (typically scripts) or workflow (often XML-based BPEL)
- Software Engineering and Interoperability/Standards are closely related
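As a minimal sketch of "a program packaged for distributed access", the Python below wraps a trivial filter behind an HTTP interface and calls it from a client that knows only the URL and the JSON shape. The names `mean_filter`, `FilterService`, `call_service` and the `/filter` path are illustrative assumptions, not part of any system described in these slides.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def mean_filter(values):
    """The 'program' being packaged: a trivial data-mining filter."""
    return sum(values) / len(values)


class FilterService(BaseHTTPRequestHandler):
    """Exposes mean_filter over HTTP; clients see only the JSON contract."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length))
        body = json.dumps({"mean": mean_filter(request["values"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def call_service(url, values):
    """Client side: depends on the interface, not on the implementation."""
    data = json.dumps({"values": values}).encode()
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    # Ignore any proxy settings so the local call goes straight to the server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read())


def demo():
    server = HTTPServer(("127.0.0.1", 0), FilterService)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/filter"
        return call_service(url, [1.0, 2.0, 6.0])
    finally:
        server.shutdown()
        server.server_close()
```

The implementation of `mean_filter` could be swapped for a large simulation without the client changing, which is the loose coupling the slide asks for.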
6 Computing and Cyberinfrastructure: TeraGrid
TeraGrid resources include more than 250
teraflops of computing capability and more than
30 petabytes of online and archival data storage,
with rapid access and retrieval over
high-performance networks. TeraGrid is
coordinated at the University of Chicago, working
with the Resource Provider sites: Indiana
University, Oak Ridge National Laboratory,
National Center for Supercomputing Applications,
Pittsburgh Supercomputing Center, Purdue
University, San Diego Supercomputer Center, Texas
Advanced Computing Center, University of
Chicago/Argonne National Laboratory, and the
National Center for Atmospheric Research.
[Figure: map of TeraGrid sites. Sites shown: UW, PSC, UC/ANL, NCAR, PU, NCSA, UNC/RENCI, IU, Caltech, ORNL, USC/ISI, SDSC, TACC. Legend: Grid Infrastructure Group (UChicago), Resource Provider (RP), Software Integration Partner]
7 Data and Cyberinfrastructure
- DIKW: Data → Information → Knowledge → Wisdom transformation
- Applies to e-Science, Distributed Business Enterprise (including outsourcing), Military Command and Control and general decision support
- (SOAP or just RSS) messages transport information expressed in a semantically rich fashion between sources and services that enhance and transform information so that complete system provides
- Semantic Web technologies like RDF and OWL might help us achieve rich expressivity, but they might be too complicated
- We are meant to build application-specific information management/transformation systems for each domain
- Each domain has Specific Services/Standards (for APIs and Information, such as KML and GML for Geographical Information Systems) and will use Generic Services (like R for data mining) and Generic Standards (such as RDF, WSDL)
- Standards made before consensus or not observant of technology progress are dubious
8 Information and Cyberinfrastructure
[Figure: Raw Data → Data → Information → Knowledge → Wisdom → Decisions pipeline. Sensor Services (SS) feed Filter Services (FS), Other Services (OS) and MetaData (MD), linked by Inter-Service Messages and reached through a Portal, with connections to other Grids and other services]
9 Information Cyberinfrastructure Architecture
- The Party Line approach to Information Infrastructure is clear: one creates a Cyberinfrastructure consisting of distributed services accessed by portals/gadgets/gateways/RSS feeds
- Services include:
  - Computing
  - Original data
  - Transformations or filters implementing the DIKW (Data Information Knowledge Wisdom) pipeline
  - Final Decision Support step converting wisdom into action
  - Generic services such as security, profiles etc.
- Some filters could correspond to large simulations
- Infrastructure will be set up as a System of Systems (Grids of Grids)
- Services and/or Grids just accept some form of DIKW and produce another form of DIKW
- Original data has no explicit input, just output
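The pipeline view above can be sketched in a few lines of Python: a source with no input, followed by filters that each accept one form of DIKW and produce another. The readings, the threshold and all function names are hypothetical, chosen only to make the example runnable.

```python
from statistics import median


def original_data():
    """Original data: no explicit input, only output (hypothetical readings)."""
    return [12.1, 11.9, 35.0, 12.0, 12.2]


def data_to_information(readings):
    """Data -> Information: annotate each reading with its distance from the median."""
    mid = median(readings)
    return [{"value": r, "deviation": round(abs(r - mid), 3)} for r in readings]


def information_to_knowledge(records, threshold=5.0):
    """Information -> Knowledge: keep only readings that look anomalous."""
    return [r for r in records if r["deviation"] > threshold]


def knowledge_to_decision(anomalies):
    """Final decision-support step: turn what was learned into an action."""
    return "inspect sensor" if anomalies else "no action"


def compose(source, *filters):
    """Run a source, then pass its output through each filter in turn."""
    result = source()
    for apply_filter in filters:
        result = apply_filter(result)
    return result


decision = compose(
    original_data,
    data_to_information,
    information_to_knowledge,
    knowledge_to_decision,
)
```

Because every stage has the same shape (one input, one output), a composed pipeline is itself a filter, which is one way to read the "Grids of Grids" remark.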
10 Virtual Observatory Astronomy Grid: Integrate Experiments
[Figure: sky images from different experiments: Radio, Far-Infrared, Visible, Dust Map, Visible X-ray, Galaxy Density Map]
12 CReSIS PolarGrid
- Important CReSIS-specific Cyberinfrastructure components include:
  - Managed data from sensors and satellites
  - Data analysis such as SAR processing, possibly with parallel algorithms
  - Electromagnetic simulations (currently commercial codes) to design instrument antennas
  - 3D simulations of ice-sheets (glaciers) with non-uniform meshes
  - GIS (Geographical Information Systems)
- Also need capabilities present in many Grids:
  - Portal, i.e. Science Gateway
  - Submitting multiple sequential or parallel jobs
- The need for three distinct types of components: Continental USA with multiple base and field camps
- Base and field camps must be power efficient
- Terrible connectivity from base and field camps to the Continental subGrid
13 CICC Chemical Informatics and Cyberinfrastructure Collaboratory Web Service Infrastructure
- Portal Services: RSS Feeds, User Profiles, Collaboration as in Sakai
- Core Grid Services: Service Registry, Job Submission and Management
- Local Clusters, IU Big Red, TeraGrid, Open Science Grid
14 Process Chemistry-Biology Interaction Data from HTS (High Throughput Screening)
- Percent Inhibition or IC50 data is retrieved from HTS
- Scientists at IU prefer Web 2.0 to Grid/Web Service for workflow
- Grids can link data analysis (e.g. image processing developed in existing Grids), traditional cheminformatics tools, as well as annotation tools (Semantic Web, del.icio.us) and enhance lead ID and SAR analysis
- A Grid of Grids linking collections of services at PubChem, ECCR centers and MLSCN centers
- Workflows encoding plate and control well statistics, distribution analysis, etc.
- Question: Was this screen successful?
- Workflows encoding distribution analysis of screening results
- Question: What should the active/inactive cutoffs be?
- Question: What can we learn about the target protein or cell line from this screen?
- Workflows encoding statistical comparison of results to similar screens, docking of compounds into proteins to correlate binding with activity, literature search of active compounds, etc.
- Compound data submitted to PubChem
[Diagram labels: CHEMINFORMATICS, PROCESS, GRIDS]
15 People and Cyberinfrastructure: Web 2.0
- Web 2.0 has tools (sites) and technologies
  - Technologies (later) are competition for Grids and Web Services
  - Sites (below) are the best way to integrate people into Cyberinfrastructure
- Kazaa, Instant Messengers, Skype, Napster and BitTorrent for P2P collaboration: text, audio-video conferencing, files
- del.icio.us, Connotea, Citeulike, Bibsonomy and Biolicious manage shared bookmarks
- MySpace, YouTube, Bebo, Hotornot, Facebook or similar sites allow you to create (upload) community resources and share them; Friendster and LinkedIn create networks
  - http://en.wikipedia.org/wiki/List_of_social_networking_websites
- Writely, Wikis and Blogs are powerful specialized shared document systems
- Google Scholar and Windows Live Academic Search tell you who has cited your papers, while publisher sites tell you about co-authors
16 Best Web 2.0 Sites -- 2006
- Extracted from http://web2.wsj2.com/
- Social Networking
- Start Pages
- Social Bookmarking
- Peer Production News
- Social Media Sharing
- Online Storage (Computing)
17 Web 2.0 Systems are Portals, Services, Resources
- Captures the incredible development of
interactive Web sites enabling people to create
and collaborate
18 Web 2.0 and Web Services I
- Web Services have clearly defined protocols (SOAP) and a well-defined mechanism (WSDL) to define service interfaces
  - There is good .NET and Java support
  - The so-called WS-* specifications provide a rich, sophisticated but complicated standard set of capabilities for security, fault tolerance, meta-data, discovery, notification etc.
- Narrow Grids build on Web Services and provide a robust managed environment with growing adoption in Enterprise systems and distributed science (so-called e-Science)
- Web 2.0 supports a similar architecture to Web services but has developed in a more chaotic but remarkably successful fashion, with a service architecture using a variety of protocols including those of Web and Grid services
  - Over 500 interfaces defined at http://www.programmableweb.com/apis
- Web 2.0 also has many well-known capabilities, with Google Maps and Amazon Compute/Storage services of clear general relevance
- There are also Web 2.0 services supporting novel collaboration modes and user interaction with the web, as seen in social networking sites, portals, MySpace, YouTube, ...
19 Web 2.0 and Web Services II
- I once thought Web Services were inevitable, but this is no longer clear to me
- Web services are complicated, slow and non-functional
  - WS-Security is unnecessarily slow and pedantic (canonicalization of XML)
  - WS-RM (Reliable Messaging) seems to have poor adoption and doesn't work well in collaboration
  - WSDM (distributed management) specifies a lot
- There are de facto standards like Google Maps and powerful suppliers like Google which define the rules
- One can easily combine SOAP (Web Service) based services/systems with HTTP messages, but the lowest common denominator suggests the additional structure/complexity of SOAP will not easily survive
20 Applications, Infrastructure, Technologies
- The discussion is confused by inconsistent use of terminology; this is what I mean
- Multicore, Narrow and Broad Grids and Web 2.0 (Enterprise 2.0) are technologies
- These technologies combine and compete to build infrastructures termed e-infrastructure or Cyberinfrastructure
  - Although multicore can and will support standalone clients, probably the most important client and server applications of the future will be internet enhanced/enabled, so a key aspect of multicore is its role and integration in e-infrastructure
- e-moreorlessanything is an emerging application area of broad importance that is hosted on the infrastructures e-infrastructure or Cyberinfrastructure
21 Some Web 2.0 Activities at IU
- Use of Blogs, RSS feeds, Wikis etc.
- Use of Mashups for Cheminformatics Grid workflows
- Moving from Portlets to Gadgets in portals (or at least supporting both)
- Use of Connotea to produce tagged document collections such as http://www.connotea.org/user/crmc for parallel computing
- Semantic Research Grid integrates multiple tagging and search systems and copes with overlapping inconsistent annotations
- MSI-CIEC portal augments Connotea to tag a mix of URLs and URIs, e.g. NSF TeraGrid use, PIs and Proposals
  - Hopes to support collaboration (for Minority Serving Institution faculty)
22 Use blog to create posts.
Display blog RSS feed in MediaWiki.
23 Semantic Research Grid (SRG) Architecture
24 MSI-CIEC Portal
MSI-CIEC: Minority Serving Institution CyberInfrastructure Empowerment Coalition
25 Mashups v Workflow?
- Mashup Tools are reviewed at http://blogs.zdnet.com/Hinchcliffe/?p=63
- Workflow Tools are reviewed by Gannon and Fox: http://grids.ucs.indiana.edu/ptliupages/publications/Workflow-overview.pdf
- Both include scripting in PHP, Python, sh etc., as both implement distributed programming at the level of services
- Mashups use all types of service interfaces and perhaps do not have the potential robustness (security) of the Grid service approach
- Mashups are typically pure HTTP (REST)
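To illustrate "distributed programming at the level of services" in script form, here is a small Python sketch of a mashup that joins two services. Both services are local stand-in functions returning made-up data; in a real mashup each would be an HTTP (REST) call, and the names `gps_station_service` and `displacement_service` are hypothetical.

```python
def gps_station_service():
    """Stand-in for an HTTP GET returning station positions as JSON (hypothetical data)."""
    return [
        {"id": "S1", "lat": 34.05, "lon": -118.25},
        {"id": "S2", "lat": 32.72, "lon": -117.16},
    ]


def displacement_service(station_id):
    """Stand-in for a second service returning displacement in mm (hypothetical data)."""
    return {"S1": 1.2, "S2": 7.5}[station_id]


def mashup(threshold_mm=5.0):
    """Compose the two services in a script and emit map markers for large motions."""
    markers = []
    for station in gps_station_service():
        moved = displacement_service(station["id"])
        if moved > threshold_mm:
            markers.append(
                {
                    "lat": station["lat"],
                    "lon": station["lon"],
                    "label": f"{station['id']}: {moved} mm",
                }
            )
    return markers
```

A workflow engine would express the same composition declaratively (for example in BPEL); the script version trades that structure for brevity.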
26 Grid Workflow Datamining in Earth Science
- Work with Scripps Institute
- Grid services controlled by workflow process real-time data from 70 GPS Sensors in Southern California
[Figure labels: NASA GPS, Earthquake]
27 Grid Workflow Data Assimilation in Earth Science
- Grid services triggered by abnormal events and
controlled by workflow process real time data
from radar and high resolution simulations for
tornado forecasts
Typical graphical interface to service composition
28 Web 2.0 uses all types of Services
- Here a Gadget Mashup uses a 3 service workflow with a JavaScript Gadget Client
29 Web 2.0 Mashups and APIs
- http://www.programmableweb.com/apis has (Sept 12 2007) 2312 Mashups and 511 Web 2.0 APIs, with GoogleMaps the most often used in Mashups
- The Web 2.0 UDDI (service registry)
30 The List of Web 2.0 APIs
- Each site has API and its features
- Divided into broad categories
- Only a few used a lot (49 APIs used in 10 or more mashups)
- RSS feed of new APIs
- Amazon S3 growing in popularity
31 Grid-style portal as used in Earthquake Grid
- The Portal is built from portlets, providing user interface fragments for each service that are composed into the full interface. It uses OGCE technology, as does the planetary science VLAB portal with University of Minnesota
- Now to Portals
32 Portlets v. Google Gadgets
- Portals for Grid Systems are built using portlets, with software like GridSphere integrating these on the server side into a single web page
- Google (at least) offers the Google sidebar and Google home page, which support Web 2.0 services and do not use a server-side aggregator
- Google is more user friendly!
- The many Web 2.0 competitions are an interesting model for promoting development in the world-wide distributed collection of Web 2.0 developers
- I guess the Web 2.0 model will win!
33 Typical Google Gadget Structure
- Lots of HTML and JavaScript </Content> </Module>
- Portlets build User Interfaces by combining fragments in a standalone Java Server
- Google Gadgets build User Interfaces by combining fragments with JavaScript on the client
34 Web 2.0 v Narrow Grid I
- Web 2.0 and Grids are addressing a similar application class, although Web 2.0 has focused on user interactions
  - So technology has similar requirements
- Web 2.0 chooses simplicity (REST rather than SOAP) to lower the barrier to everyone participating
- Web 2.0 and Parallel Computing tend to use traditional (possibly visual) (scripting) languages for the equivalent of workflow, whereas Grids use a visual interface with the backend recorded in BPEL
- Web 2.0 and Grids both use SOA (Service Oriented Architectures)
- System of Systems: Grids and Web 2.0 are likely to build systems hierarchically out of smaller systems
  - We need to support Grids of Grids, Webs of Grids, Grids of Services etc., i.e. systems of systems of all sorts
35 Web 2.0 v Narrow Grid II
- Web 2.0 has a set of major services like GoogleMaps or Flickr, but the world is composing Mashups that make new composite services
  - End-point standards are set by end-point owners
  - Many different protocols covering a variety of de facto standards
- Narrow Grids have a set of major software systems like Condor and Globus, and a different world is extending these with custom services and linking them with workflow
- Popular Web 2.0 technologies are PHP, JavaScript, JSON, AJAX and REST, with Start Page (e.g. Google Gadgets) interfaces
- Popular Narrow Grid technologies are Apache Axis, BPEL, WSDL and SOAP, with portlet interfaces
- Robustness of Grids demanded by the Enterprise?
- Not so clear that Web 2.0 won't eventually dominate other application areas, and with Enterprise 2.0 it's invading Grids
36 Web 2.0 v Narrow Grid III
- Narrow Grids have a strong emphasis on standards and structure; Web 2.0 lets a 1000 flowers (protocols) and a million developers bloom and focuses on functionality, broad usability and simplicity
  - Semantic Web/Grid has structure to allow reasoning
  - Annotation in sites like del.icio.us and uploading to MySpace/YouTube is unstructured, and free text search replaces structured ontologies
- Portals are likely to feature both Web and desktop client technology, although it is possible that the Web approach will be adopted more or less uniformly
- Web 2.0 has a very active portal activity which has a similar architecture to Grids
  - A page has multiple user interface fragments
- Web 2.0 user interface integration is typically client side, using Gadgets, AJAX and JavaScript, while Grids integrate on the server side in a special JSR168 portal, using Portlets, WSRP and Java
37 The Ten areas covered by the 60 core WS-* Specifications
WS-* Specification Area | Typical Grid/Web Service Examples
1 Core Service Model | XML, WSDL, SOAP
2 Service Internet | WS-Addressing, WS-MessageDelivery; Reliable Messaging WSRM; Efficient Messaging MTOM
3 Notification | WS-Notification, WS-Eventing (Publish-Subscribe)
4 Workflow and Transactions | BPEL, WS-Choreography, WS-Coordination
5 Security | WS-Security, WS-Trust, WS-Federation, SAML, WS-SecureConversation
6 Service Discovery | UDDI, WS-Discovery
7 System Metadata and State | WSRF, WS-MetadataExchange, WS-Context
8 Management | WSDM, WS-Management, WS-Transfer
9 Policy and Agreements | WS-Policy, WS-Agreement
10 Portals and User Interfaces | WSRP (Remote Portlets)
38 WS-* Areas and Web 2.0
WS-* Specification Area | Web 2.0 Approach
1 Core Service Model | XML becomes optional but still useful; SOAP becomes JSON, RSS, ATOM; WSDL becomes REST with API as GET, PUT etc.; Axis becomes XmlHttpRequest
2 Service Internet | No special QoS. Use JMS or equivalent?
3 Notification | Hard with HTTP without polling; JMS perhaps?
4 Workflow and Transactions (no Transactions in Web 2.0) | Mashups, Google MapReduce; scripting with PHP, JavaScript ...
5 Security | SSL, HTTP Authentication/Authorization; OpenID is Web 2.0 Single Sign on
6 Service Discovery | http://www.programmableweb.com
7 System Metadata and State | Processed by application, no system state; Microformats are a universal metadata approach
8 Management = Interaction | WS-Transfer style protocols GET, PUT etc.
9 Policy and Agreements | Service dependent. Processed by application
10 Portals and User Interfaces | Start Pages, AJAX and Widgets (Netvibes), Gadgets
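Row 1 of the table ("SOAP becomes JSON", "WSDL becomes REST with API as GET, PUT") can be made concrete with a short Python sketch that builds the same request both ways. The operation name `GetElevation`, the namespace `urn:example:geo` and the base URL are invented for illustration; only the SOAP 1.1 envelope namespace is standard.

```python
import json
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def soap_request(operation, params, service_ns="urn:example:geo"):
    """Build a SOAP envelope: the call is described inside an XML body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def rest_request(base_url, resource, params):
    """Build the REST equivalent: the HTTP verb and URL are the interface."""
    return "GET", f"{base_url}/{resource}?{urlencode(params)}"


def json_reply(payload):
    """Web 2.0 services typically answer with JSON rather than a SOAP body."""
    return json.dumps(payload)


params = {"lat": 51.17, "lon": -115.57}
soap_xml = soap_request("GetElevation", params)
method, url = rest_request("http://example.org/api", "elevation", params)
```

The REST form needs no envelope or interface description language, which is the simplicity argument of the table; the SOAP form carries structure that the WS-* specifications build on.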
39 Too much Computing?
- Historically one has tried to increase computing capabilities by:
  - Optimizing performance of codes
  - Exploiting all possible CPUs such as graphics co-processors and idle cycles
  - Making central computers available, such as NSF/DoE/DoD supercomputer networks
- The next crisis in the technology area will be the opposite problem: commodity chips will be 32-128 way parallel in 5 years' time and we currently have no idea how to use them, especially on clients
  - Only 2 releases of standard software (e.g. Office) in this time span
- Gaming and generalized decision support (data mining) are two obvious ways of using these cycles
  - Intel RMS analysis
- Note even cell phones will be multicore
- There is Too much data as well as Too much computing, but the implications are unclear
40 Intel's Projection
41 RMS: Recognition Mining Synthesis
- Recognition: What is ...? Model. Model-less; model-based multimodal recognition
- Mining: Is it ...? Find a model instance. Real-time streaming and transactions on static structured datasets; real-time analytics on dynamic, unstructured, multimodal datasets
- Synthesis: What if ...? Create a model instance. Very limited realism; photo-realism and physics-based animation
42 Recognition, Mining, Synthesis
- Recognition: What is a tumor?
- Mining: Is there a tumor here?
- Synthesis: What if the tumor progresses?
- It is all about dealing efficiently with complex multimodal datasets
- Images courtesy http://splweb.bwh.harvard.edu:8000/pages/images_movies.html
43 Intel's Application Stack
44 Multicore SALSA at IU
- Service Aggregated Linked Sequential Activities
- http://www.infomall.org/multicore
- Aims to link parallel and distributed (Grid) computing by developing parallel applications as services and not as programs or libraries
- Improve traditionally poor parallel programming development environments
- Can use messaging to link parallel and Grid services, but the performance-functionality tradeoffs are different
  - Parallelism needs a few µs latency for messaging and thread spawning
  - Network overheads in Grid are 10-100s of µs
- Developing Service (library) of multicore parallel data mining algorithms
45 Microsoft CCR for Parallelism
- Use Microsoft CCR/DSS, where DSS is a mash-up/workflow service model built from CCR, and CCR supports MPI or dynamic threads
- CCR supports exchange of messages between threads using named ports
  - FromHandler: Spawn threads without reading ports
  - Receive: Each handler reads one item from a single port
  - MultipleItemReceive: Each handler reads a prescribed number of items of a given type from a given port. Note items in a port can be general structures but all must have the same type.
  - MultiplePortReceive: Each handler reads one item of a given type from multiple ports.
  - JoinedReceive: Each handler reads one item from each of two ports. The items can be of different type.
  - Choice: Execute a choice of two or more port-handler pairings
  - Interleave: Consists of a set of arbiters (port -- handler pairs) of 3 types that are Concurrent, Exclusive or Teardown (called at end for clean up). Concurrent arbiters are run concurrently but exclusive handlers are not
- http://msdn.microsoft.com/robotics/
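The port and handler vocabulary above can be imitated in Python with thread-safe queues. This is only an analogy written for illustration: the `Port` class and the three functions below are hypothetical stand-ins and are not the CCR API, which is a .NET library.

```python
import queue
import threading


class Port:
    """A typed message queue, loosely analogous to a CCR port (Python stand-in)."""

    def __init__(self, item_type):
        self.item_type = item_type
        self._items = queue.Queue()

    def post(self, item):
        # All items in one port must share a type, as the slide notes.
        if not isinstance(item, self.item_type):
            raise TypeError(f"port accepts {self.item_type.__name__} only")
        self._items.put(item)

    def take(self):
        return self._items.get()  # blocks until an item arrives


def receive(port, handler):
    """Receive: the handler reads one item from a single port."""
    return handler(port.take())


def multiple_item_receive(port, count, handler):
    """MultipleItemReceive: the handler reads a prescribed number of items from one port."""
    return handler([port.take() for _ in range(count)])


def joined_receive(port_a, port_b, handler):
    """JoinedReceive: the handler reads one item from each of two ports (types may differ)."""
    return handler(port_a.take(), port_b.take())


def demo():
    partials = Port(int)
    labels = Port(str)
    # Four worker threads each post one partial sum to the same port.
    workers = [
        threading.Thread(target=partials.post, args=(sum(range(i * 10, (i + 1) * 10)),))
        for i in range(4)
    ]
    for worker in workers:
        worker.start()
    labels.post("total")
    total = multiple_item_receive(partials, 4, sum)  # gather all four partial sums
    for worker in workers:
        worker.join()
    partials.post(total)
    return joined_receive(labels, partials, lambda name, value: f"{name}={value}")
```

The gather in `demo` plays the role of an MPI reduction done with messages between threads, which is the usage the slide has in mind when it says CCR supports MPI-style or dynamic threading.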
46 Timing of HP Opteron Multicore as a function of number of simultaneous two-way service messages processed (November 2006 DSS Release)
[Plot: DSS Service Measurements]
- Measurements of Axis 2 show about 500 microseconds; DSS is 10 times better
47 MPI Exchange Latency in µs (20-30 µs computation between messaging)
Machine | OS | Runtime | Grains | Parallelism | MPI Exchange Latency (µs)
Intel8cgf12 (8 core 2.33 GHz) (in 2 chips) | Redhat | MPJE (Java) | Process | 8 | 181
Intel8cgf12 (8 core 2.33 GHz) (in 2 chips) | Redhat | MPICH2 (C) | Process | 8 | 40.0
Intel8cgf12 (8 core 2.33 GHz) (in 2 chips) | Redhat | MPICH2 Fast | Process | 8 | 39.3
Intel8cgf12 (8 core 2.33 GHz) (in 2 chips) | Redhat | Nemesis | Process | 8 | 4.21
Intel8cgf20 (8 core 2.33 GHz) | Fedora | MPJE | Process | 8 | 157
Intel8cgf20 (8 core 2.33 GHz) | Fedora | mpiJava | Process | 8 | 111
Intel8cgf20 (8 core 2.33 GHz) | Fedora | MPICH2 | Process | 8 | 64.2
Intel8b (8 core 2.66 GHz) | Vista | MPJE | Process | 8 | 170
Intel8b (8 core 2.66 GHz) | Fedora | MPJE | Process | 8 | 142
Intel8b (8 core 2.66 GHz) | Fedora | mpiJava | Process | 8 | 100
Intel8b (8 core 2.66 GHz) | Vista | CCR (C#) | Thread | 8 | 20.2
AMD4 (4 core 2.19 GHz) | XP | MPJE | Process | 4 | 185
AMD4 (4 core 2.19 GHz) | Redhat | MPJE | Process | 4 | 152
AMD4 (4 core 2.19 GHz) | Redhat | mpiJava | Process | 4 | 99.4
AMD4 (4 core 2.19 GHz) | Redhat | MPICH2 | Process | 4 | 39.3
AMD4 (4 core 2.19 GHz) | XP | CCR | Thread | 4 | 16.3
Intel4 (4 core 2.8 GHz) | XP | CCR | Thread | 4 | 25.8
48 Clustering algorithm anneals by decreasing the distance scale and gradually finds more clusters as resolution improves. Here we see 10 increasing to 30 as the algorithm progresses
49 Parallel Multicore Clustering (C# on Windows)
- Parallel Overhead on 8 Threads running on Intel 8 core
- Speedup = 8/(1 + Overhead)
- Overhead = Constant1 + Constant2/n
- Constant1 = 0.05 to 0.1 (Client Windows) due to thread runtime fluctuations
[Plot: Parallel Overhead versus 10000/(Grain Size n points per core), with curves for 10 Clusters and 20 Clusters]
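The relations on this slide, read here as Speedup = threads/(1 + Overhead) with Overhead = Constant1 + Constant2/n, can be evaluated with a few lines of Python. The slide gives Constant1 as 0.05 to 0.1 but does not give Constant2, so the value 500.0 used below is purely a placeholder assumption.

```python
def parallel_overhead(n, constant1, constant2):
    """Overhead model: a fixed part plus a part that shrinks with grain size n."""
    return constant1 + constant2 / n


def speedup(threads, overhead):
    """Speedup on a given number of threads for a given fractional overhead."""
    return threads / (1.0 + overhead)


# Constant1 = 0.05 is the low end of the range quoted on the slide;
# Constant2 = 500.0 is a made-up value for illustration only.
for grain_size in (1_000, 10_000, 100_000):
    overhead = parallel_overhead(grain_size, constant1=0.05, constant2=500.0)
    print(grain_size, round(overhead, 3), round(speedup(8, overhead), 2))
```

With any positive Constant1 the speedup saturates below the thread count as n grows, which is why the thread runtime fluctuations behind Constant1 matter.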
50 We use DSS as the Service Framework, as it is integrated with CCR supporting MPI/Threading
51 Intel 8-core C# with 80 Clusters: Vista Run Time Fluctuations for Clustering Kernel
- 2 Quadcore Processors
- This is the average of the standard deviation of the run time of the 8 threads between messaging synchronization points
52 Intel 8 core with 80 Clusters: Redhat Run Time Fluctuations for Clustering Kernel
- This is the average of the standard deviation of the run time of the 8 threads between messaging synchronization points
[Plot: Standard Deviation/Run Time versus Number of Threads]
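One plausible reading of the plotted quantity can be sketched in Python: for each interval between synchronization points, take the standard deviation of the per-thread run times, divide by the mean run time, then average over intervals. Whether the original measurements used the population or the sample standard deviation is not stated; the population form is assumed here, and the timings in the example are invented.

```python
from statistics import mean, pstdev


def runtime_fluctuation(intervals):
    """Average over intervals of (std dev of thread run times) / (mean run time).

    `intervals` holds one list per synchronization interval, each containing
    the run time of every thread in that interval.
    """
    ratios = [pstdev(times) / mean(times) for times in intervals]
    return mean(ratios)


# Hypothetical timings (arbitrary units) for 4 threads over 2 intervals.
example = [
    [10.0, 10.0, 10.0, 10.0],  # perfectly balanced interval
    [8.0, 12.0, 8.0, 12.0],    # unbalanced interval
]
fluctuation = runtime_fluctuation(example)
```

Since every thread waits for the slowest one at each synchronization point, a larger value of this ratio translates directly into parallel overhead.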
53 What should one do?
- i.e. how does one Cyberinfrastructure-enable a given area/application XYZ?
- As computing is free, focus on identifying the data/information/knowledge/wisdom needed (there is probably too much data but not so much wisdom in the DIKW pipeline)
  - Should we care just about original data or also about the whole DIKW pipeline?
- Scope out supercomputer/computer services needed and exploit OGF standards
- Identify services (filters, often data mining) needed by XYZ
  - Will we need parallel implementations of filters? If so, use multicore-compatible frameworks
- Identify standards for application XYZ
- Set up distributed XYZ Services
- Use Web 2.0 (as it makes things easier), not current Grids (which make things harder)
  - Build a Programmable XYZ Web
  - Emphasize Simplicity
- Is Secrecy important and in fact viable? Often important but hard
- What are the synergies of XYZ with pervasive capabilities such as Web 2.0 sites, National resources like TeraGrid, and Personal aides in an information-rich world (future of PC)?