1
SKA computing - how hard can it be?
  • Tim Cornwell

2
Overview
  • Three parts:
    • Software costs
    • Software development
    • Hardware costs
  • Rough, first-pass discussions only
  • All three are documented in more detail in SKA memos 48-50

3
Part 1: Software costs (with Athol Kemball)
  • Bottom-up costing is not possible in the absence of a concept, design, requirements, or timeline
  • Top-down costing is possible by scaling from a known budget
  • We use ALMA as the basis
  • Expect a really low-precision answer
    • Not even two bits!

4
Why is software development high cost and high risk?
  • No Moore's Law for software
  • Highly non-linear process
    • A 25% increase in requirements complexity produces a 100% increase in software complexity
  • Two major causes of runaway projects:
    • Inaccurate cost estimates
    • Unstable requirements
  • Reuse is of limited help
    • 10-35% cost saving
  • Design patterns are more useful

5
And
  • Scientific and operational requirements are much
    more demanding
  • Computational models of radio telescopes
    continually increase in complexity in response

6
(Chart: number of tables in the visibility database schema)
7
ALMA software budget
  • 435 person-years
    • 270/115 pre/post first science
    • 50 for further development of AIPS++, NGAST, ACS
  • The PDR characterized the budget as lean compared to the requirements
  • Our view: could be 1000 person-years to meet all requirements
  • Bottom line: $40M to $100M

8
Scaling to SKA
  • More observing modes?
  • More concurrent sessions?
  • More antennas
  • Stations as well as antennas?
  • More telescopes - 2 instead of 1
  • SKA = connected element + VLBI + mosaicing + pulsar + SETI

9
Details
  • Take the ALMA subsystem personnel costs
  • Apply a plausible scaling for each subsystem
  • Sum to get one scenario (a sketch of this procedure is given below)
  • Hold head in hands
  • Repeat for different scenarios
    • e.g. fewer observing modes, no hybrid, etc.
  • See SKA memo 51
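
The scaling sum above can be made concrete with a small script. This is a minimal sketch only: the subsystem breakdown and the multipliers are invented placeholders, not the figures from the ALMA budget or SKA memo 51; they are chosen merely so that the unscaled efforts add up to the quoted 435 person-years and the mechanics of the procedure are visible.

```python
# Illustrative top-down scaling: scale each ALMA subsystem effort, then sum.
# All subsystem names, efforts, and multipliers are hypothetical placeholders.

ALMA_EFFORT_PY = {            # person-years per subsystem (illustrative split of 435 total)
    "control": 80,
    "correlator software": 40,
    "pipeline/imaging": 120,
    "archive": 60,
    "observing tools": 75,
    "common services": 60,
}

SKA_SCALING = {               # assumed multipliers for one SKA scenario
    "control": 1.5,           # stations as well as antennas
    "correlator software": 2.0,
    "pipeline/imaging": 3.0,  # wide-field imaging, more observing modes
    "archive": 2.0,
    "observing tools": 1.5,
    "common services": 1.0,
}

def scenario_effort(effort, scaling):
    """Sum the scaled per-subsystem efforts to get one scenario (person-years)."""
    return sum(effort[s] * scaling[s] for s in effort)

if __name__ == "__main__":
    print(f"One scenario: {scenario_effort(ALMA_EFFORT_PY, SKA_SCALING):.0f} person-years")
```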

10
(No Transcript)
11
Some conclusions
  • Easy to get a cost of 2000 person-years
  • Hybrid doubles software costs
  • Halving the number of observing modes halves the
    cost
  • Number of antennas or stations has little bearing
    on costs

12
Main conclusion
  • Software costs are currently unknown
    • "Whereof one cannot speak, thereof one must be silent" - Wittgenstein
  • A range of $50M to $200M is plausible for an easy-to-use telescope (a la ALMA)
  • Allocate a fixed fraction, say 10-20%, until costs can be determined with more accuracy
  • Include software people in writing the requirements

13
Further actions?
  • Check against LOFAR (and other projects') costs
  • Firmer scientific definition of key observing
    modes

14
Part 2: Software development (with Brian Glendenning)
  • Cost is 500-2000 person-years
  • The software team could be 200 or more in size, spread across the world
  • Code base of 5-30 million lines of code
  • Likely to be done in a time of rapid change within computing
  • Software is high risk!

15
Based on
  • Comments from colleagues and co-workers
  • Experience from Keck, LIGO, SDSS, SNAP, LHC, Jim
    Gray, ADASS, NSF Project Science workshops
  • Collection of references and resources in SKA
    memo 50
  • Facts and Fallacies of Software Engineering, Glass, 2003

16
Things SKA won't have
  • Inoperative management and/or management
    structures
  • Quantity rather than quality of oversight
  • Cultural differences at team and organization
    levels
  • Politicization of technical decisions
  • Drastically insufficient funding
  • Unrealistic schedules

17
Things not to do
  • Don't omit software and data analysis
  • Don't write unrealistic requirements
  • Don't underestimate the cost of complexity
  • Don't let the hardware design drive the software
  • Don't forget some class of users
  • Don't manage control and other software separately
  • Don't claim accurate cost estimates too early
  • Don't divide the work before the software architecture is defined

18
Things to do
  • Do make software a key part of the project
  • Do hire a first-rate team
  • Do hire a dedicated management team
  • Do track risk
  • Do plan for logistical complexity
  • Do engage domain experts
  • Do demand an operational model early
  • Do manage and document change
  • Do prototype, test, and simulate

19
More things to do
  • Do establish and debug a software process early
  • Do build and test a complete system early and
    often
  • Do integrate early and often
  • Do invest in infrastructure
  • Do review appropriately
  • Do get help from computing experts outside
    astronomy

20
Other factors
  • Algorithm development is vital, and it is not part of the software effort
  • Establish a separate testing team
  • Plan for continuing development in operations

21
Part 3: Hardware costs
  • Wide-field imaging is probably the cost driver
  • RFI mitigation could also be expensive
  • Calculate wide-field imaging costs in 2004 (from simulations) and scale with Moore's Law (see the sketch below)
  • Assume that Moore's Law holds for computing costs, not necessarily for per-CPU costs
  • Current technology has problems with Moore's Law
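
As a rough illustration of the "cost in 2004, scale with Moore's Law" step, the sketch below assumes that the cost of a fixed amount of computing halves every 18 months. The halving time and the example figure are assumptions for illustration, not values taken from SKA memo 49.

```python
def cost_in_year(cost_2004_dollars, year, halving_time_years=1.5):
    """Scale a 2004 hardware cost estimate to a later year, assuming the cost
    of a fixed amount of computing halves every `halving_time_years`
    (an assumed Moore's Law constant, not necessarily the one used in the memo)."""
    return cost_2004_dollars / 2 ** ((year - 2004) / halving_time_years)

# Example: a task costing $100M with 2004 technology
print(f"${cost_in_year(100e6, 2015) / 1e6:.1f}M in 2015")   # roughly $0.6M
```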

22
Why is wide-field imaging needed? And why is it hard?
  • To reach continuum sensitivity limits, we must account for all sources in the FOV
  • At SKA sensitivities, the FOV is filled with sources
  • High spectral dynamic range: same problem
  • Two-dimensional Fourier transforms will not work
    • Basic physics: Fresnel diffraction
  • The current best algorithm, w-projection, gained a factor-of-ten increase in speed, but it's still time-consuming

23
Wide-field imaging
24
Scaling laws for wide-field imaging
  • Revised version of the Perley-Clark scaling laws
  • Scaling with baseline length and antenna diameter (for fixed collecting area)

25
Details
  • Simulate SKA observing
  • Use the w-projection algorithm
    • Hand-tuned FORTRAN inner loop
  • Find the scaling coefficients
  • See SKA memo 49 for the methodology
  • The numbers here are lower, using lower computing costs (per IEMT discussions)

26
Cost equation
  • The antenna diameter scaling is horrific! (see the sketch after this list)
    • Doubling the antenna size saves a factor of 256 in computing
  • The baseline dependency is tough
  • Easy to find hardware costs > the SKA cost
  • Multi-fielding is not included
  • Error of a factor of 3 in each direction
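
A minimal sketch of the form of this scaling is given below. The dish-diameter exponent of -8 is implied by the slide itself (doubling the antenna size saves a factor of 256 = 2^8); the baseline exponent and the reference configuration are placeholder assumptions, not the coefficients derived in SKA memo 49.

```python
def relative_compute_cost(baseline_km, dish_m,
                          ref_baseline_km=5.0, ref_dish_m=12.5,
                          baseline_exponent=2.0):
    """Wide-field imaging compute cost relative to a reference configuration.

    The D**-8 dependence follows from 'doubling antenna size saves a factor
    of 256'; the baseline exponent and reference values are assumptions for
    illustration only.
    """
    return ((baseline_km / ref_baseline_km) ** baseline_exponent
            * (ref_dish_m / dish_m) ** 8)

# Doubling the dish diameter at a fixed baseline cuts the cost by 256x:
print(relative_compute_cost(5.0, 25.0))    # -> 0.00390625 == 1/256
```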

27
Examples
  • Imaging with 12.5 m antennas on the 5 km baselines at 20 cm
    • $0.5M in 2015
  • To avoid confusion, need to go to at least 35 km
    • $30M in 2015
  • Increase the antenna diameter to 25 m
    • $120K in 2015
  • For 350 km baselines with 25 m antennas
    • $120M in 2015

28
Escape routes
  • Invent a new algorithm for wide-field imaging
    • The algorithm must be designed for the hardware resources
    • W-projection is possible with GB of memory
    • The optimum approach for multi-processors is likely to be quite different
  • Correlator FOV shaping
    • Note by Lonsdale, Doeleman, and Oberoi (17 July 2004)
  • Avoid small antennas
    • Very large antennas not good for imaging?
    • Lower the risk by developing a design for a cheap 20 m antenna
  • Only do the hard cases infrequently
  • Reinvest in computing periodically
  • Special-purpose hardware?
    • e.g. see the BEE2 and COTS correlator posters
  • Use stations on long baselines

29
(Figure: LNSD station beam compared to the ideal primary beam - PB for a 13-element station vs. PB for an 80 m filled aperture)
30
LNSD stations
  • Far out (> 35 km), antennas are clumped in stations of 13 antennas
  • Fewer stations means a lower data rate
  • BUT a small number of antennas per station means high station sidelobes, so the sampling requirements are set by the antenna size, NOT the station size
  • Processing: 100 stations of 13 x 12 m antennas vs. 333 x 23.7 m antennas (see the check below)
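
The two configurations in the last bullet (100 stations of 13 x 12 m antennas, and 333 single 23.7 m antennas) have essentially the same collecting area; the quick check below verifies this, on the assumption that an equal-area comparison is what the slide intends.

```python
import math

def dish_area_m2(diameter_m):
    """Geometric collecting area of a single dish."""
    return math.pi * (diameter_m / 2.0) ** 2

lnsd_area = 100 * 13 * dish_area_m2(12.0)   # 100 stations of 13 x 12 m antennas
large_dish_area = 333 * dish_area_m2(23.7)  # 333 single 23.7 m antennas

print(f"LNSD: {lnsd_area:.0f} m^2, 23.7 m dishes: {large_dish_area:.0f} m^2")
# Both come out near 1.47e5 m^2, i.e. the same total collecting area.
```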

31
Open questions about hardware costs
  • Confusion limits
    • 400,000 sources/deg² at 20 cm requires excellent Fourier coverage at 50-100 km
  • Impact of other concepts
    • Aperture arrays, cylinders
  • Effect of multi-fielding
    • Probably really bad!
  • High dynamic range / foreground subtraction
    • Very hard, and will increase computing costs

32
Further actions?
  • Hold a workshop on wide-field imaging for current and future telescopes
    • SKA, LOFAR, EVLA, LWA, eEVN, etc.
  • Focus simulations on wide-field imaging problems and possible solutions
    • Systematic exploration
    • Could actually get worse as we understand more of the details, e.g. RFI mitigation, lumpy antenna sidelobes
  • Add computing to the SKA cost model
  • The community needs to invest intellectual effort in parallelization - it's coming
  • Issue an SKA data-processing challenge

33
Summary
  • Software costs
    • 500-2000 person-years
    • Recommendation: allocate 20% of the SKA budget until we know better
  • Software development
    • Be conservative in setting requirements
    • Be careful about cost estimates
    • Follow best practices from other projects
  • Hardware costs
    • Highly non-linear function of dish diameter and baseline length
    • Still many open questions