Title: David P. Anderson
1Public Distributed Computing with BOINC
- David P. Anderson
- Space Sciences Laboratory
- University of California, Berkeley
- davea@ssl.berkeley.edu
2Public-resource computing
- Figure: timeline, 1995 to 2004. Labels: home PCs, your computers, academic, business
- Projects on the timeline: GIMPS, distributed.net; SETI@home, folding@home; fight@home, climateprediction.net
- Names: public-resource computing, peer-to-peer computing (no!), public distributed computing, "@home" computing
3The potential of public computing
- SETI@home: 500,000 CPUs, 65 TeraFLOPs
- 1 billion Internet-connected PCs in 2010, 50% privately owned
- If 100M participate:
  - 100 PetaFLOPs
  - 1 Exabyte (10^18 bytes) storage
- Figure: CPU power and storage capacity plotted against cost, comparing public computing, Grid computing, cluster computing, and supercomputing
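The totals on this slide imply per-machine averages that are easy to check. The following is a back-of-the-envelope sketch in Python; the per-host values are obtained by dividing the slide's totals and are not figures stated in the talk, and the helper name `per_host` is purely illustrative.

```python
# Back-of-the-envelope check of the slide's figures.
# The per-host averages are derived by division, not quoted from the talk.

MEGA = 10**6
GIGA = 10**9
TERA = 10**12
PETA = 10**15
EXA = 10**18


def per_host(total, hosts):
    """Average share of an aggregate resource per participating host."""
    return total / hosts


# SETI@home: 500,000 CPUs delivering 65 TeraFLOPs in aggregate.
seti_flops_per_cpu = per_host(65 * TERA, 500_000)

# Projection: 100M participants, 100 PetaFLOPs, 1 Exabyte of storage.
proj_flops_per_host = per_host(100 * PETA, 100_000_000)
proj_bytes_per_host = per_host(1 * EXA, 100_000_000)

print(f"SETI@home: {seti_flops_per_cpu / MEGA:.0f} MFLOPs per CPU")
print(f"Projection: {proj_flops_per_host / GIGA:.0f} GFLOPs per host")
print(f"Projection: {proj_bytes_per_host / GIGA:.0f} GB per host")
```

Read this way, the projection assumes roughly 1 GFLOPs of delivered computing and 10 GB of donated disk per participating PC.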
4Public/Grid differences
5Economics (0th order)
- cluster/Grid computing: you, network (free), resources ($)
- public-resource computing: you, Internet ($), resources (free)
- $1 buys 1 computer-day or 20 GB data transfer on commercial Internet
- Suppose processing 1 GB of data takes X computer-days
- Cost of processing 1 GB:
  - cluster/Grid: $X
  - PRC: $1/20
- So PRC is cheaper if X > 1/20 (SETI@home: X ≈ 1,000)
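The break-even argument can be written out as a small Python sketch. It only restates the slide's zeroth-order model (one dollar buys one computer-day or 20 GB of transfer); the function names are illustrative, and real costs would of course depend on many factors the model ignores.

```python
# Zeroth-order cost model from the slide: $1 buys either one
# computer-day or 20 GB of commercial Internet transfer.

DOLLARS_PER_COMPUTER_DAY = 1.0
GB_PER_DOLLAR = 20.0


def cluster_cost_per_gb(x_days):
    """Cluster/Grid: pay for X computer-days per GB; the network is free."""
    return x_days * DOLLARS_PER_COMPUTER_DAY


def prc_cost_per_gb():
    """PRC: computing is free; pay only to move 1 GB over the Internet."""
    return 1.0 / GB_PER_DOLLAR


def prc_is_cheaper(x_days):
    """True when X exceeds the 1/20 break-even point."""
    return prc_cost_per_gb() < cluster_cost_per_gb(x_days)


for x in (0.01, 0.05, 1.0, 1000.0):
    print(f"X={x}: cluster ${cluster_cost_per_gb(x):.2f}, "
          f"PRC ${prc_cost_per_gb():.2f}, PRC cheaper: {prc_is_cheaper(x)}")
```

With X around 1,000, as quoted for SETI@home, the model puts cluster/Grid at about $1,000 per GB against $0.05 for PRC.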
6Economics revisited
- Figure labels: you, ..., other institutions; underutilized free Internet (e.g. Internet2); commodity Internet
- Bursty, underutilized flat-rate ISP connection
- Traffic shapers can send at zero priority, so bandwidth may be free also
7Why isn't PRC more widely used?
- Lack of platform
  - jxta, Jabber: not a solution
  - Java: apps are in C, FORTRAN
  - commercial platforms: business issues
  - cosm, XtremWeb: not complete
- Need to make PRC technology easy to use for scientists
8BOINC: Berkeley Open Infrastructure for Network Computing
- Goals for computing projects
  - easy/cheap to create and operate projects
  - wide range of applications possible
  - no central authority
- Goals for participants
  - easy to participate in multiple projects
  - invisible use of disk, CPU, network
- NSF-funded; open source; in beta test
- http://boinc.berkeley.edu
10Climateprediction.net
- Global climate study (Oxford Univ.)
- Input: 10 MB executable, 1 MB data
- CPU time: 2-3 months (can't migrate)
- Output per workunit:
  - 10 MB summary (always upload)
  - 1 GB detail file (archive on client, may upload)
- Chaotic (incomparable results)
11Einstein@home (planned)
- Gravity wave detection; LIGO; UW/CalTech
- 30,000 data sets of 40 MB each
- Each data set is analyzed w/ 40,000 different parameter sets; each takes 6 hrs CPU
- Data distribution: replicated 2 TB servers
- Scheduling problem is more complex than bag of tasks
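The scale of this workload follows directly from the numbers on the slide. The sketch below multiplies them out in Python, under the assumption (not stated explicitly in the talk) that "6 hrs CPU" is the cost of analyzing one data set with one parameter set; the variable names are illustrative.

```python
# Scale of the planned workload, using the slide's figures.
# Assumption: 6 CPU hours is the cost of one (data set, parameter set) pair.

DATA_SETS = 30_000
MB_PER_DATA_SET = 40
PARAM_SETS_PER_DATA_SET = 40_000
CPU_HOURS_PER_TASK = 6

total_data_tb = DATA_SETS * MB_PER_DATA_SET / 1_000_000  # MB to TB (decimal)
total_tasks = DATA_SETS * PARAM_SETS_PER_DATA_SET
total_cpu_hours = total_tasks * CPU_HOURS_PER_TASK
total_cpu_years = total_cpu_hours / (24 * 365)

print(f"Input data: {total_data_tb} TB")
print(f"Tasks: {total_tasks:,}")
print(f"CPU time: {total_cpu_hours:,} hours (about {total_cpu_years:,.0f} CPU-years)")
```

Under that assumption each 40 MB data set is shared by 40,000 tasks, which is one plausible reason the scheduling is harder than a plain bag of tasks: it pays to send a host more work for a data set it already holds.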
12Intel/UCB Network Study (planned)
- Goal: map/measure the Internet
- Each workunit lasts for 1 day but is active only briefly (pings, UDP)
- Need to control time-of-day when active
- Need to turn off other apps
- Need to measure system load indices (network/CPU/VM)
14Project web site features
- Download core client
- Create account
- Edit preferences
  - General: disk usage, work limits, buffering
  - Project-specific: allocation, graphics
  - venues (home/school/work)
- Profiles
- Teams
- Message boards, adaptive FAQs
15General preferences
16Project-specific preferences
17Data architecture
- Files
  - immutable, replicated
  - may originate on client or project
  - may remain resident on client
- Executables are digitally signed
- Upload certificates prevent DoS
<file_info>
    <name>arecibo_3392474_jun_23_01</name>
    <url>http://ds.ssl.berkeley.edu/a3392474</url>
    <url>http://dt.ssl.berkeley.edu/a3392474</url>
    <md5_cksum>uwi7eyufiw8e972h8f9w7</md5_cksum>
    <nbytes>10000000</nbytes>
</file_info>
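A file descriptor of this shape carries enough information to check a download: several replica URLs, a size, and an MD5 checksum. The Python sketch below shows one way a client might use it. It is not BOINC's client code; the function names, the payload, and the example.org URLs are made up for illustration, and only the element names come from the slide.

```python
import hashlib
import xml.etree.ElementTree as ET


def parse_file_info(xml_text):
    """Extract name, replica URLs, checksum and size from a <file_info> element."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "urls": [u.text for u in root.findall("url")],
        "md5_cksum": root.findtext("md5_cksum"),
        "nbytes": int(root.findtext("nbytes")),
    }


def verify_download(info, data):
    """Check a downloaded payload against the declared size and MD5."""
    if len(data) != info["nbytes"]:
        return False
    return hashlib.md5(data).hexdigest() == info["md5_cksum"]


# Made-up payload and descriptor, for illustration only.
payload = b"example work unit data"
descriptor = f"""
<file_info>
  <name>example_file</name>
  <url>http://example.org/a/example_file</url>
  <url>http://example.org/b/example_file</url>
  <md5_cksum>{hashlib.md5(payload).hexdigest()}</md5_cksum>
  <nbytes>{len(payload)}</nbytes>
</file_info>
"""

info = parse_file_info(descriptor)
print(info["urls"])                           # replicas a client could try in turn
print(verify_download(info, payload))         # intact copy
print(verify_download(info, payload + b"x"))  # corrupted copy
```

Because files are immutable, a checksum recorded once in the descriptor stays valid for every replica.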
18Computation abstractions
- Applications
- Platforms
- Application versions
  - may involve many files
- Work units: inputs to a computation
  - soft deadline; CPU/disk/mem estimates
- Results: outputs of a computation
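These abstractions map naturally onto a handful of record types. The sketch below models them with Python dataclasses; every class and field name is a hypothetical choice made for illustration and should not be read as BOINC's actual database schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Platform:
    name: str  # hypothetical identifier, e.g. "windows_x86"


@dataclass(frozen=True)
class Application:
    name: str


@dataclass
class AppVersion:
    """One build of an application for one platform."""
    app: Application
    platform: Platform
    version: int
    files: List[str] = field(default_factory=list)  # may involve many files


@dataclass
class WorkUnit:
    """Inputs to a computation, plus resource estimates."""
    app: Application
    input_files: List[str]
    soft_deadline_s: float
    est_cpu_s: float
    est_disk_bytes: int
    est_mem_bytes: int


@dataclass
class Result:
    """Outputs of one execution of a work unit."""
    workunit: WorkUnit
    output_files: List[str] = field(default_factory=list)


def versions_for(platform, app, versions):
    """Application versions a host of the given platform could run."""
    return [v for v in versions if v.app == app and v.platform == platform]


app = Application("example_app")
win, linux = Platform("windows_x86"), Platform("linux_x86")
versions = [
    AppVersion(app, win, 1, ["example_app_1.exe", "example_lib.dll"]),
    AppVersion(app, linux, 1, ["example_app_1"]),
]
wu = WorkUnit(app, ["input_0001"], soft_deadline_s=7 * 86400,
              est_cpu_s=6 * 3600, est_disk_bytes=50_000_000,
              est_mem_bytes=64_000_000)
print(versions_for(linux, app, versions)[0].files)
print(Result(wu, ["output_0001"]).workunit.app.name)
```

Keeping work units independent of platforms is what lets the same input be run by whichever application version matches the requesting host.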
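19Scheduling: pull model
- Figure: the core client exchanges messages with a scheduling server and a data server
- core client to scheduling server: request X seconds of work, host description
- scheduling server to core client: result 1 ... result n
- download (data server to core client)
- ...compute...
- upload (core client to data server)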
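In a pull model the client asks for an amount of work and the server decides what to hand out. The following Python sketch illustrates that exchange with a deliberately naive policy: walk the queue and assign tasks until the requested seconds are covered, skipping tasks that do not fit the host's free disk. The class names, fields, and policy are assumptions for illustration, not BOINC's scheduler.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    est_cpu_s: float
    est_disk_bytes: int


@dataclass
class HostDescription:
    free_disk_bytes: int


def handle_request(queue: List[Task], seconds_requested: float,
                   host: HostDescription) -> List[Task]:
    """Remove and return tasks until the requested seconds are covered.

    Tasks whose disk estimate exceeds the host's remaining free disk
    are left in the queue for some other host.
    """
    assigned = []
    remaining = seconds_requested
    disk_left = host.free_disk_bytes
    for task in list(queue):          # iterate over a copy; queue is mutated
        if remaining <= 0:
            break
        if task.est_disk_bytes > disk_left:
            continue
        queue.remove(task)
        assigned.append(task)
        remaining -= task.est_cpu_s
        disk_left -= task.est_disk_bytes
    return assigned


queue = [
    Task("a", 3600, 10),
    Task("b", 3600, 10**12),   # too large for the example host
    Task("c", 3600, 10),
    Task("d", 3600, 10),
]
reply = handle_request(queue, 7000, HostDescription(free_disk_bytes=1000))
print([t.name for t in reply])   # tasks sent to this host
print([t.name for t in queue])   # tasks still waiting
```

Because the client initiates every exchange, this style of scheduling works for hosts behind firewalls and for hosts that are only intermittently connected.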
20Redundant computing
- Figure labels: work generator, replicator, scheduler, clients, validator, canonical result, assimilator
- validator: select canonical result, assign credit
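The validator's job, selecting a canonical result and assigning credit, can be sketched in a few lines of Python. This version assumes outputs can be compared for exact equality and that agreement between two hosts is enough; both the comparison rule and the quorum of 2 are assumptions for illustration, and applications with chaotic output would need a different notion of agreement.

```python
from collections import Counter


def select_canonical(outputs, quorum=2):
    """Return the output reported by at least `quorum` hosts, else None.

    `outputs` maps a host name to the output that host returned.
    """
    if not outputs:
        return None
    value, count = Counter(outputs.values()).most_common(1)[0]
    return value if count >= quorum else None


def assign_credit(outputs, canonical, credit=1.0):
    """Grant credit only to hosts whose output matches the canonical result."""
    return {host: (credit if out == canonical else 0.0)
            for host, out in outputs.items()}


# Made-up replicas of one work unit returned by three hosts.
outputs = {"host_a": "42.0", "host_b": "42.0", "host_c": "41.9"}
canonical = select_canonical(outputs)
print(canonical)
print(assign_credit(outputs, canonical))
```

If no output reaches the quorum, `select_canonical` returns `None`, which a project could treat as a signal to issue another replica of the work unit.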
21BOINC core client
- file transfers: restartable, concurrent, user-limited
- program execution: semi-sandboxed
- Figure: core client and apps communicate through the API over shared mem
- Figure labels: graphics control, checkpoint control; done, CPU time
22User interface
- Figure components: core client, apps, screensaver, control panel
- Figure labels: graphics, activate screensaver, control/state RPCs
24Anonymous platform mechanism
- User compiles applications from source, registers them with core client
- Report platform as anonymous to scheduler
- Purposes
  - obscure platforms
  - security-conscious participants
  - performance tuning of applications
25Project management tools
- Python scripts for project creation/start/stop
- Remote debugging
  - collect/store crash info (stack trace)
  - web-based browsing interface
- Strip charts
  - record, graph system performance metrics
- Watchdogs
  - detect system failures; dial pager
26Conclusion
- Public-resource computing is a distinct paradigm from Grid computing
- PRC has tremendous potential for many applications (computing and storage)
- BOINC: enabling technology for PRC
- http://boinc.berkeley.edu