Title: A Unified Approach to Fast Digital Processing for Beam Dampers, Instrumentation,
1A Unified Approach to Fast Digital Processing for
Beam Dampers, Instrumentation, Controls
- Bill Foster
- Beam Instrumentation Workshop
- May 6, 2004
2A Digital Manifesto
Example Application: 3-coordinate Bunch-by-Bunch
Beam Damper for Fermilab Main Injector, Implemented
on a Single Altera Stratix FPGA. Plus five other
applications using this same hardware.
3Once Upon a Time, there was a job
called Audio Frequency Analog Engineer
- Their products
- Mixers
- Equalizers
- Crossover networks
- Reverbs
- Fuzz Boxes.
Bob Widlar, inventor of the IC Op-Amp and
other analog gems:
http://www.elecdesign.com/Globals/PlanetEE/Content/3080.html
4Nowadays, an Audio Frequency Analog Engineer is
any high-school kid with a PC and SoundBlaster
Card
- Their products
- Mixers
- Equalizers
- Crossover networks
- Reverbs
- Fuzz Boxes
- Emulated on a PC
- PLUS
- Synthesizers that can fool my ears
- Time compressors that squeeze 20% off of a song's
play time without altering the pitch.
- Real-time tone substitution makes even Leonard
Cohen sing on key - try doing that with an op-amp!
5What Unemployed the Audio Analog Engineers?
- ADC Sampling Rates and Accuracies exceeded
requirements
- Audio requirements set by human ear
- Digital Processing capability exceeded
requirements at reasonable cost
- 2 GHz CPU executes 50,000 instructions per audio
waveform sample
6Once Upon a Time, there was a job called
Low-Level Radio Frequency Analog Engineer
- Their products
- Mixers
- Equalizers
- Phase Shifters
- Down converters
- Phase-locked Loops
Fermilab's Booster Low-Level RF system as it
exists today!
7Nowadays, a LLRF Analog Engineer is (or should
be) any old programmer with a fast digitizer and
an FPGA
- Their products
- Mixers
- Equalizers
- Phase Shifters
- Down converters
- PLLs
- Implemented in FPGAs
- PLUS
- Direct Digital Synthesis of complex RF waveforms
- Built-in system diagnostics
- Digital Reproducibility (spares!)
- High Speed Serial Links
- Multi-user support
8What Unemploys the Analog RF Engineers?
- ADC Sampling Rates and Accuracies exceed
requirements
- 4 samples per RF clock gives bunch-by-bunch
phase and amplitude
- Digital Processing capability exceeded
requirements at reasonable cost
- FPGAs and DSPs
- → For 100 MHz or less, it is GAME OVER
9Generic Hardware Concept for Accelerator
Instrumentation & Control
[Block diagram]
Clock, control, ...
INPUTS (cables from tunnel): BPM, Stripline Pickup, Resistive
Wall, Flying Wire PMT, RF Fanback, Kicker Monitor, etc.
Each input: Minimal Analog Filter -> FAST ADC -> Monster FPGA
Monster FPGA <-> CPU Bus (VME/VXI/PCI/PMC etc.) OR SERIAL LINK
Monster FPGA -> FAST DAC -> OUTPUTS: Stripline Kicker, RF Fanout,
Analog Monitor, etc.
10All-Coordinate Digital Damper
53 MHz, TCLK, MDAT,...
212 MHz
Stripline Pickup
FAST ADC
Monster FPGA
Minimal Analog Filter
12
Transverse Dampers: Identical X & Y
CPU VME/ VXI/ PCI/ PMC etc. OR Serial LINK
FAST ADC
Minimal Analog Filter
Stripline Kicker
Power Amp
424 MHz
FAST DACs
12
> 27 MHz
Resistive Wall Monitor
FAST ADC
Minimal Analog Filter
Longitudinal (Z) Damper
Broadband Cavity
Power Amp
FAST DACs
12
11The Board
Alexi Seminov, Sten Hansen, Bill Ashmanskas,
Dennis Nicklaus, Hyejoo Kang
12Some Example Applications using this same basic
hardware
- 1) Universal Beam Position Monitors (BPMs)
- Handles full variety of FNAL beam RF structure
- 2) Generic instrumentation readout Scope
- ex: Flying Wire readout for arbitrary bunches
- 3) Beam Loading Compensation
- 4) Universal Beam Dampers / Beamline Tuner
- 5) Entire Low-Level RF system
13Fast, High Precision Pipelined ADCs
- AD6645: 14 Bits, 105 MHz
- AD9430: 12 Bits, 210 MHz
- AD12500: 12 Bits, 500 MHz (hybrid)
- Several: 8 bits, 1-2 GHz (scopes)
- Private opinion: it appears that ADCs are about
to fall off of the Moore's Law curve the same way
that CPUs have
15AD6645 Functional Block Diagram
- Two-Stage Pipelined ADC
- Internal Track & Hold
- Differential Analog Inputs
16This ADC can sample 53 MHz signals at 4 samples
per cycle to measure both In-Phase and Quadrature
on each cycle
18Board Layout for High-Speed ADCs is a Lot Easier
Than it Used to Be
- LVDS signals eliminate digital noise
- 0.25V differential swing far quieter than TTL
- Direct glueless interface to FPGAs
- Fast input op-amps and surface mount components
with small parasitics
- Front-end layout is not critical since it is
physically small
19Clock Distribution for ADCs is a Lot Easier Than
it Used to Be
- Clock and Signal timing can be fixed ex post
facto via FPGA firmware timing adjustments
- Some A/Ds & D/As have internal PLLs to reduce or
eliminate effects of clock jitter
- FPGAs have high-quality clock distribution which
can be used to drive external A/Ds & D/As
- FPGA clock distribution can challenge Dedicated
Clock Distribution Chips (on notL. Dolittle)
20- Q: What ADC Clock Speed is needed?
- A: 4x RF Bunch Frequency
- Minimum needed for bunch-by-bunch Phase and
Amplitude measurement
- In frequency domain, 4x RF sampling measures both
in-phase and quadrature components.
- For Fermilab's 53 MHz RF → 212 MHz ADCs
21212 MHz Sampling of RWM Pulse
Low-pass Filter Spreads signal +/-5 ns in time so
it will not be missed by ADC
Filter Reduces ADC Dynamic Range requirement,
since spike does not have to be digitized
22212 MHz Sampling of Stripline Signal
Roles of Phase and Amplitude signals are
reversed from unipolar case.
23Repetitive Waveform looks like simple sine wave,
but contains bunch-by-bunch phase and amplitude
A - B gives bunch-by-bunch in-phase signal
D - (C+E)/2 gives bunch-by-bunch
out-of-phase or quadrature signal
Vector Sum sqrt(I^2 + Q^2) is insensitive to
clock jitter
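A minimal sketch of the idea, assuming an idealized waveform sampled at exactly 4 samples per RF cycle, s[n] = A*cos(2*pi*n/4 + phi). The sample names s0..s3 and the function names are illustrative and do not follow the A..E labels of the slide. It shows why the vector sum does not move when the clock phase does:

```python
import math

def sample_cycle(amplitude, phase):
    """Four equally spaced samples of one RF cycle (idealized cosine)."""
    return [amplitude * math.cos(2 * math.pi * n / 4 + phase) for n in range(4)]

def iq_from_four_samples(s0, s1, s2, s3):
    """Return (I, Q) for one cycle sampled at 4x the RF frequency.

    Both terms are differences, so a constant baseline offset cancels.
    """
    i = (s0 - s2) / 2.0   # in-phase:   A*cos(phi)
    q = (s3 - s1) / 2.0   # quadrature: A*sin(phi)
    return i, q

# Same bunch amplitude, several sampling-clock phases (a stand-in for jitter)
phases = (0.0, 0.1, 0.7, 2.0)
magnitudes = []
recovered = []
for phase in phases:
    i, q = iq_from_four_samples(*sample_cycle(1.5, phase))
    magnitudes.append(math.hypot(i, q))   # vector sum sqrt(I^2 + Q^2)
    recovered.append(math.atan2(q, i))    # bunch phase relative to the clock
```

The magnitude stays at the input amplitude for every phase, while the phase error shows up only in atan2(Q, I).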
24Bunch-By-Bunch Phase vs. Turn Number Measured
with MI Digital Damper
- Damper Output comes from derivative of individual
bunch phase errors
25Bunch-By-Bunch Intensity
26Synchronous vs. Asynchronous ADC Sampling
- The choice is between
- N x 53 MHz beam phase locked sampling, or
- Asynchronous sampling at a (possibly) lower rate
- Asynchronous sampling of a waveform will allow
you to recover all the information, IF
- you know that the input is a pure sine wave, or
- you know the input is repetitive (stored beam),
or
- the sampling rate is much higher than fMAX
- My belief is, undersampling is just a bad idea
27The Perils of Undersampling a Single-Pass Beam
If a single-pass beam does not have uniform bunch
populations, the ADC input is NOT a good sine
wave and an undersampled waveform can give an
erroneous picture of the beam signal.
The signal CAN be reconstructed with many passes
of stored beam.
28... dealing with this variety of beams would be
painful in Analog
29Digital Filter looking at many samples can still
extract individual bunch transverse positions
30Advantages of Digital Processing
- Digital filters more reproducible (=> spares!)
- Inputs and Outputs clearly defined (& stored!)
- filters can be developed & debugged offline
- Digital filter can also operate at multiple lower
frequencies ...simultaneously if desired.
- Re-use Standard hardware with new FPGA code
- or same code with different filter coefficients
31Conclusions on ADC Clock Rate
- A Bunch-by-Bunch processing system must sample
the raw waveform at a minimum of 4x the Bunch
frequency
- You can never be
- too rich
- too thin
- or have too many ADC samples
32What is an FPGA?
- Reconfigurable Logic Array: 10^6 logic gates
- Pre-built logic subassemblies: Megafunctions
- Multipliers/Accumulators
- Multi-port RAMs
- Gigabit serial links
- Entire CPUs
- Phase Locked Loops
- Complex I/O pads
- More transistors than a Pentium
- Impressive Support software
33XILINX and ALTERA Are the Industry Leaders
35What is an FPGA Good At?
- Big Synchronous Arithmetic Pipelines
- 400 MHz multiply/accumulators, filters..
- High-Speed Interface with Modern Parts
- ADCs, DACs, Serial Links
- Built-in system diagnostics
- Digital Scope on every signal
- Flexibility and Multiple Applications
- Use one board design for many applications
- Add features without hardware changes
36SYNCHRONOUS PIPELINES
- When people say "Analog is simple", they are
often referring to the deterministic execution
time (propagation delay).
- Analog circuits never fail to respond in time
because they are off servicing an interrupt.
- FPGA synchronous pipelines provide dedicated
logic which responds at a deterministic time.
- This captures a big advantage of Analog.
38FPGA Programming Languages: Graphical vs.
Text-mode
- Graphical Schematic Entry is useful for
- diagramming data flow
- giving talks
- Text Mode features
- Text is faster to enter and more concise
- Can diff two files to see what's changed
- Code management systems can handle text well
- I PERSONALLY RECOMMEND TEXT MODE
39FPGA Programming Languages: Proprietary vs.
Industry Standard
- Proprietary languages often lock you into a
single vendor.
- I use one (Altera AHDL) anyway.
- The industry standard VHDL language
- It is extremely verbose & repetitive
- Translating AHDL into VHDL increases the text
length by a factor of 2.
- LIKE TRANSLATING A DOCUMENT TO FRENCH
40Some Development Models
- How do you Download & Talk to this Board that
you've just built?
- FPGA Programming cable
- Firmware Serial Port Model
- Crate Backplane Bus Access
- On-Board CPU w/Ethernet
- Compiled-in On-chip CPU w/ Ethernet
41CPU Access to FPGA Registers
- Usually want ADDRESS/DATA R/W model for CPU
access to Control Registers
- These address and data busses are synthesized
in firmware
- Example (AHDL) for 32-bit read/write register:
    (Bus,Outputs) = BUS_REG( ) WITH(
        ADDRESS = H"08402020",
        WIDTH = 32 );
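The address/data read/write model can be sketched in software as a toy register file. This is only an illustration of the access model seen from the CPU side: the class and method names are hypothetical, and only the address and width are taken from the AHDL example.

```python
class RegisterBus:
    """Toy address/data register file (hypothetical, not the firmware itself)."""

    def __init__(self):
        self._regs = {}   # address -> (width_bits, value)

    def add_register(self, address, width=32):
        self._regs[address] = (width, 0)

    def write(self, address, data):
        width, _ = self._regs[address]        # KeyError if address is not decoded
        self._regs[address] = (width, data & ((1 << width) - 1))

    def read(self, address):
        return self._regs[address][1]

bus = RegisterBus()
bus.add_register(0x08402020, width=32)    # address from the AHDL example
bus.write(0x08402020, 0x123456789)        # 33-bit value: the top bit is dropped
value = bus.read(0x08402020)
```

Adding a control register then amounts to one more add_register call, which mirrors the one-line firmware declaration.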
42Development Model 1: FPGA Programming Cable
- The programming cable needed to program the FPGA
(usually through the PC printer port) can also be
used for limited communication
- Not clear how useful this is for real-time
response since it works through serial port
driver
- Altera provides compiled-in logic analyzer
which provides output that can be compared with
simulation.
43Development Model 2: Crate Backplane Bus Access
- Crate Backplane Bus connections to FPGA can
provide CPU access to registers in internal
address space of FPGA
- Internal Address Space is defined in firmware
- Requires many bits of bus buffers, etc.
- Be Careful of Backplane Noise (TTL)
44Development Model 3: Serial Port to PC
- A firmware-defined Serial Port can be used for
2-wire communication with the .COM port of a PC
- Terminal emulator can provide simple read/write
access to the internal address space of the FPGA
- Can also connect to spreadsheets, etc. via Visual
Basic access to .COM port
45Development Model 4: On-Board CPU w/Ethernet
- Postage Stamp Ethernet CPU or homebuilt DSP can
provide Ethernet and Web access to FPGA registers
- Firewire is an alternative
- Remote update of firmware possible
- NIM-like modules without need of crate backplane
46Development Model 5: Compiled-in On-chip CPU
with Firmware-defined Ethernet
- High-end FPGAs have built-in or
firmware-defined CPUs fast enough to support IP
stack, Web Servers, etc.
- These are available on Demo Boards
- C-language programming of these is integrated
into FPGA development environment (no new
software!).
47Adding a new ACNET Device
1) Add register(s) to FPGA Firmware
2) Start Recompile (takes 6 minutes)
3) Meanwhile, use DABBEL/D80 to define properties of
new ACNET device
4) Download Firmware & Reboot Crate (2 min.)
- → Takes about 10 minutes from concept to
Fast-Time Plot
48Application 1: Universal BPM (Beam Position
Monitor)
- Measures position of each bunch on each pass
around the ring with full-bandwidth FIR filter
- (R-L)/(R+L) for each bunch measurement.
- Multi-bunch averages available for lower noise
- per batch, per turn, many turns, different
bandwidths
- Multiple users can share hardware w/o conflicts
- ADC is always active, FPGA stores data many ways
- Same Hardware OK for Booster, MI, RR, TeV,
beamlines.
49FPGA Based Universal BPM
53 MHz, TCLK, MDAT,...
212 MHz
Split Plate Pickup 1
Monster FPGA(s)
Minimal Analog Filter
ADC
R
14
CPU VME/ VXI/ PCI/ PMC Ethernet etc.
Minimal Analog Filter
ADC
L
. . .
. . .
Pickup 4
Minimal Analog Filter
ADC
T
Minimal Analog Filter
B
ADC
Serial Link to Real Time Orbit Control System
Analog Position Monitor Test Point
(Optional) OR Modulation Output for Synchronous
Lock-in Detection Technique
FAST DAC
50Universal BPM Application: Signal Processing
Steps
- 1) Bandwidth-Limit input signal to 53 MHz
- 2) 12 Bit Digitization at 212 MHz
- 3) FIR filter(s) to get single-bunch signal(s)
- 4) Sum & Difference of plate signals
- 5) (Difference / Sum) gives position
- 6) Linearization lookup table or polynomials
- 7) Bunch, Batch, Multiturn Averaging
- 8) Scope Trace Buffers on every signal
- Multiple users can be acquiring and filtering
data multiple ways without conflicts
Front End
Inside FPGA
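Steps 4 through 7 can be sketched in a few lines. This is a rough illustration, not the firmware: the plate amplitudes, the 25 mm scale factor and the identity linearization polynomial are made-up values, and all function names are hypothetical.

```python
def bunch_position(r, l, scale_mm=1.0):
    """Steps 4-5: position estimate scale * (R - L) / (R + L)."""
    total = r + l
    if total == 0:
        raise ValueError("no beam signal (R + L == 0)")
    return scale_mm * (r - l) / total

def linearize(x, coeffs=(0.0, 1.0, 0.0, 0.0)):
    """Step 6: polynomial c0 + c1*x + c2*x^2 + c3*x^3 (identity by default)."""
    result = 0.0
    for c in reversed(coeffs):   # Horner's rule
        result = result * x + c
    return result

def batch_average(positions):
    """Step 7: plain average over the bunches of a batch."""
    return sum(positions) / len(positions)

# Hypothetical single-bunch plate amplitudes (R, L) in ADC counts
plates = [(1100, 900), (1050, 950), (1000, 1000)]
positions = [linearize(bunch_position(r, l, scale_mm=25.0)) for r, l in plates]
average = batch_average(positions)
```

Because the position is a ratio, it is independent of bunch intensity; a lookup table could replace the polynomial without changing the rest.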
51Main Injector BPM Response Map
J. Crisp
- Linearization can be done in FPGA or readout
software
52Universal BPM Signal Processing Step 7:
Averaging and Filtering
- Many Types of averaging possible
- Position Averaging over Bunches in a Batch
- Multi-Turn Averaging of Positions
- Multi-turn averaging of Raw Signals
- Fitting to betatron frequency (injection errors)
- - this gives info for β-function measurement
- Emulation of DDC chip functions
- Spectrum analysis of position & phase
- Different filters can be simultaneously active
53MAIN INJECTOR VERTICAL BPM (8 Bits)
1mm
DIGITAL DAMPER POSITION SIGNAL (Batch Average)
54Single-Bunch BPM Measurement was tested by
blowing out nearby bunches during Stacking Cycle
55BPM Resolution for 212 MHz Digitization of
Single 53 MHz Bunch
MAIN INJECTOR VERTICAL BPM (8 Bits)
1mm
DIGITAL DAMPER POSITION FOR SINGLE 53 MHz
BUNCH SINGLE-TURN (non-averaged)
56Multi-User Support
- FPGA CODE SUPPORTS
- 31 different users on different machine cycles
- Different averaging algorithms, simultaneously
active
- Each user can sit and observe a different single
bunch
- Different bunch frequencies on each cycle
- No User Interferences since Separate Dedicated
Logic is used for each purpose
57Application 2: Generic Instrumentation Readout
Scope
- What we want in a Generic Scope
- 1) Ability to trigger on TCLK events, Beam Synch
Events, analog threshold crossings of different
channels, etc.
- 2) Multiple Users Sharing without conflicts
- - separate copies of trigger logic
- - separate buffers to store captured signals
- - separate filter algorithms run simultaneously
- 3) Common hardware & software among systems
58Example Application of Generic Scope: Flying
Wire PMT Readout
53 MHz, TCLK, MDAT,...
PMT(s) in Tunnel
106 MHz
FAST ADC
Monster FPGA
Minimal Analog Filter
14
FAST ADC
Minimal Analog Filter
CPU Bus VME/ VXI/ PCI/ PMC etc.
FAST ADC
Minimal Analog Filter
Encoder Signals
FAST ADC
Minimal Analog Filter
Motor Drive
Motor
DAC
59Example Application of Generic Scope: Flying
Wire PMT Readout
- Photomultiplier Tube (PMT) pulses presented to
Analog filter to limit BW
- Summing circuits in FPGA give total PMT pulse
height in narrow and wide gates
- Individual gates report signals for 36x36 or more
bunches, average over many turns, etc.
- FPGA can be used to control & trigger the fly
- Raw PMT pulses can be simultaneously looked at
via multi-user hooks
60Application 3: God's Own Beam Loading
Compensation
- 1) Digital Pipeline to reproduce I/Q signals from
RW bunch monitor with N-turn delay. (N = 1 ... 1/νs)
- 2) Digital filters for transients and synchrotron
osc.
- Inputs: Resistive-wall monitor & RF fanback.
- Digitization: bunch-by-bunch I & Q signals
- Outputs: I/Q to damper cavity, or LLRF
- frequency swing issues for LLRF drive
- Antiproton vs. Proton timing
61Application 4: Universal Damper
- A single FPGA has enough capability to do damping
calculations for X, Y, & longitudinal.
- Digital Filter which operates on I & Q signals
from individual 53 MHz bunches can also be
reprogrammed to operate at lower frequencies.
- Frequency swing during acceleration introduces
some timing complications, which can be fixed by
components (FIFOs, Dual-Port RAMS) inside of FPGA.
62Universal-Damper Application: Signal
Processing Steps
- 1) Bandwidth-Limit input signal to 53 MHz
- 2) 12 Bit Digitization at 212 MHz
- 3) FIR filter to get single-bunch signal
- 4) Sum & Difference of plate signals
- 5) Multi turn difference filter (FIR) w/delay
- 6) Pickup Mixing for correct Betatron Phase
- 7) Bunch-by-bunch gain, dead band etc.
- 8) Timing Corrections for Frequency Sweep
- 9) Pre-Emphasis for Kicker Power Amp
- 10) Power Amp for Kicker
Front End
Inside FPGA
Buy
63Longitudinal Beam Instability in FMI
- Occurs with 7 bunches filled (out of 588)
- Prevents low emittance bunch coalescing
First Bunch OK
7th Bunch Trashed
- Driven by cavity wake fields within bunch train
64Longitudinal Damper FPGA Logic
THRESH
8-Turn FIR calculates derivative of bunch phase
Bunch-by-Bunch Digital Phase Detector
Resistive Wall Pickup
-THRESH
+/- KICK to DAMPER
Multi-Turn Memory
ADC
14
THRESH
Bunch Intensity FIR Filter
option (currently unused)
Individual Bunches are kicked + or - depending on
whether they are moving right or left in phase
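A rough software sketch of this logic for one bunch follows. The actual 8-turn FIR coefficients are not given on the slide, so the derivative estimate here (mean of the newest four turns minus mean of the oldest four) is an assumption, as are the sign convention, the threshold value and all names.

```python
from collections import deque

def phase_slope(history):
    """Crude derivative: mean of the newest half minus mean of the oldest half."""
    half = len(history) // 2
    old = sum(history[:half]) / half
    new = sum(history[half:]) / half
    return new - old

def kick_sign(slope, thresh):
    """Return -1 or +1 outside the dead band (+/- thresh), 0 inside it."""
    if slope > thresh:
        return -1      # phase advancing: kick against the motion
    if slope < -thresh:
        return +1
    return 0

class BunchDamper:
    """Multi-turn memory plus threshold logic for a single bunch."""

    def __init__(self, n_turns=8, thresh=0.5):
        self.history = deque(maxlen=n_turns)
        self.thresh = thresh

    def update(self, phase):
        self.history.append(phase)
        if len(self.history) < self.history.maxlen:
            return 0   # not enough turns stored yet
        return kick_sign(phase_slope(list(self.history)), self.thresh)

damper = BunchDamper()
# Phase drifts for 8 turns, then holds still
kicks = [damper.update(p) for p in [0, 1, 2, 3, 4, 5, 6, 7] + [7] * 8]
```

In the FPGA there would be one such memory per bunch, addressed by bucket number; the kick returns to zero once the phase stops moving.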
65FPGA Code for Universal Damper (8-turn Filter)
66Transverse Damper 3 - Turn Filter
Arbitrary Betatron Phase of Kicker can be
accommodated
- Damper kick is calculated from single BPM
position reading on 3 successive turns.
67HERA-P Damper uses a 3-turn Digital FIR Filter
Klute, Kohaupt et al., EPAC 96
- Digital Bunch by Bunch @ 96 ns Spacing
- Immediate digitization following peak detection
683 Turn Filter Coefficients
- Damper kick is weighted sum of beam positions on
the 3 previous turns.
- 3 Filter Coefficients Uniquely Determined by
- System Gain
- Betatron Phase Desired at Kicker
- Constraint that sum of filter coefficients = 0
(so that filter does not respond to DC
offsets.)
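The three conditions form a 3x3 linear system, which can be sketched as follows. This assumes a pure betatron oscillation with phase advance mu per turn and the convention that c_k multiplies the position k turns ago; the fractional tune 0.43 and the 90 degree kick phase are illustrative values, and the function names are hypothetical.

```python
import cmath
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def three_turn_coefficients(gain, kick_phase, mu):
    """Solve for c0, c1, c2 such that
         c0 + c1 + c2 = 0                                 (no DC response)
         sum_k c_k * exp(-1j*k*mu) = gain * exp(1j*kick_phase)
    where mu is the betatron phase advance per turn in radians."""
    rows = [
        [1.0, 1.0, 1.0],
        [1.0, math.cos(mu), math.cos(2 * mu)],      # real part of the response
        [0.0, -math.sin(mu), -math.sin(2 * mu)],    # imaginary part
    ]
    rhs = [0.0, gain * math.cos(kick_phase), gain * math.sin(kick_phase)]
    d = det3(rows)
    if abs(d) < 1e-12:
        raise ValueError("tune too close to an integer or half-integer")
    coeffs = []
    for col in range(3):                            # Cramer's rule
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    return coeffs

mu = 2 * math.pi * 0.43                             # illustrative fractional tune
c = three_turn_coefficients(gain=1.0, kick_phase=math.pi / 2, mu=mu)
response = sum(ck * cmath.exp(-1j * k * mu) for k, ck in enumerate(c))
```

The system becomes singular only at integer or half-integer tune, which is why any other betatron phase at the kicker can be accommodated.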
69Frequency Sweep Issues
- Machines with frequency sweep (Booster!) must
adjust ADC input clock and DAC output clock
phases as frequency sweeps.
- This can be generated with Phase-locked loops and
delay-locked loops present inside FPGAs.
- This requires access to both the RF clock, and
a cable-delayed version of the RF
clock, as timing references.
- One-turn digital delay using FIFOs in FPGAs.
- Same hardware can be used for Booster thru
Tevatron.
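The one-turn FIFO delay can be sketched as a fixed-depth queue with one entry per RF bucket. This is a single-clock simplification: in a swept-frequency machine the write and read sides would run on the separate ADC and DAC clocks. The class name is hypothetical, and the depth of 588 is the Main Injector bucket count quoted elsewhere in the talk.

```python
from collections import deque

class OneTurnDelay:
    """FIFO delay line: a sample comes out exactly `harmonic_number` pushes later."""

    def __init__(self, harmonic_number, fill=0):
        self._fifo = deque([fill] * harmonic_number, maxlen=harmonic_number)

    def push(self, sample):
        delayed = self._fifo[0]      # oldest entry: same bucket, one turn ago
        self._fifo.append(sample)    # maxlen drops the oldest entry automatically
        return delayed

H = 588
delay = OneTurnDelay(H)
# Feed two turns of samples, each tagged with its own index
outputs = [delay.push(n) for n in range(2 * H)]
```

During the first turn the output is the fill value; from then on every output is the sample recorded for the same bucket on the previous turn, which is what a multi-turn filter needs.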
70RF Clocking with acceleration
Equal Length Cable Fanout so Beam Sees Same RF
Phase at all Cavities as RF Frequency Sweeps
During Acceleration
# of Clock Cycles per Turn is harmonic number h
71ADC Clocking during frequency sweep
Round-Trip Cable Delay on ADC Clock ensures ADC
Clock & Beam Input Stay in Phase as Beam
Accelerates
May need additional phase adjustment to track
phase jumps at transition, etc.
72DAC Clocking during frequency sweep
Propagation Delay CK → DAC → Cable →
Kicker should match RF Fanout Delay so Kick Stays
in Phase as Beam Accelerates
73Generic Damper tolerating frequency sweep
All Logic Inside FPGA
FIFO needed due to phase shifts between DAC and
ADC clocks as beam accelerates
74Damper Output Precompensation
- 53 MHz Bunch-by-bunch kicker wants a 19ns square
pulse with a good flat top to minimize timing
sensitivity
- Power Amp and Cable have non-ideal response for
square wave (ringing and tail)
- DAC operating at 424 MHz is used to produce
specially sculpted pulse necessary to convince
Amp & Cable to make a nice flat pulse at kicker.
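A toy illustration of pre-emphasis follows. The amp and cable are modeled here as a single-pole low-pass with a made-up coefficient a = 0.6, which is an assumption for the sake of the example; the real response (ringing and tail) would need a measured model and a longer FIR. With 424 MHz output there are 8 DAC samples per 53 MHz bucket.

```python
def amp_and_cable(x, a=0.6):
    """Toy amp + cable model: one-pole low-pass y[n] = a*y[n-1] + (1-a)*x[n]."""
    y, prev = [], 0.0
    for sample in x:
        prev = a * prev + (1.0 - a) * sample
        y.append(prev)
    return y

def pre_emphasis(desired, a=0.6):
    """Exact inverse of the toy model: x[n] = (d[n] - a*d[n-1]) / (1-a)."""
    x, prev = [], 0.0
    for d in desired:
        x.append((d - a * prev) / (1.0 - a))
        prev = d
    return x

SAMPLES_PER_BUCKET = 8   # 424 MHz DAC / 53 MHz bunch rate
# One kicked bunch between two idle buckets
square = ([0.0] * SAMPLES_PER_BUCKET
          + [1.0] * SAMPLES_PER_BUCKET
          + [0.0] * SAMPLES_PER_BUCKET)

plain = amp_and_cable(square)                  # slow edge and tail
sculpted = pre_emphasis(square)                # what the DAC would play out
shaped = amp_and_cable(sculpted)               # flat top restored at the kicker
```

The sculpted pulse overshoots on the leading edge and undershoots on the trailing edge, so in practice the DAC and amplifier headroom limit how much correction is available.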
76Echotek Card Used for Initial Dampers
105 MSPS
AD6645
212 MHz DAC Daughter Card (S. Hansen/ PPD)
- Echotek Board Originally Built to SLAC Design
Specification
- 65MHz DDC version to be used for RR BPM upgrade
- 105 MHz version (with DAC daughter card) used
for Dampers
77Butchering the Echotek Board
- Scorched-Earth FPGA rewrite (GWF)
- 65 pages of firmware since Jan 03
- 212 MHz DAC Daughtercard
- Sten Hansen & T. Wesson (PPD)
- 3 channels for X,Y,Z
- 212 MHz Output FIR (W. Schappert, RFI)
- Pre-emphasis compensation for analog outputs
- Prototype for 424 MHz output on final board
- Input Buffer Amp/Splitter Box (Brian Fellenz,RFI)
79Multi-batch w/o and with transverse dampers
with dampers
w/o dampers
1 to 11 Booster turns
9 Booster turns
2.3×10^13
@ 8.9 GeV/c: νx=26.43, νy=25.42, ξx=-20,
ξy=-16; @ 11.7 GeV/c: νx=26.39, νy=25.425
@ 8.9 GeV/c: νx=26.44, νy=25.47, ξx=-5, ξy=-5; @
11.7 GeV/c: νx=26.36, νy=25.46
80Pushing up the intensity
@ 8.9 GeV/c: νx=26.43, νy=25.47, ξx=ξy=-5; @ 10.3
GeV/c: ξx=ξy=-5; @ 11.7 GeV/c: νx=26.36, νy=25.46
Transverse Dampers ON
13-14 Booster turns
3.3×10^13
Beam extracted at 0.65 s, just before transition
81What if I turn dampers off ?
turn dampers off
intensity
vertical
horizontal
82DON'T TELL BOB MAU (or the rest of the
green-yellow color blind operators) THAT WE DID
THIS
83Filter for Undamped, Damped, and Anti-Damped
Bunches
84Blowing Selected Bunches out of the Machine (in
X,Y, or both)
1110111001110001111
→ Neutrino Communications!
85Application 5: LLRF System on a Chip
- The single FPGA on the damper board has 12
Phase-locked loops and enough capability to
synthesize all the signals needed for a complete
LLRF system.
- The Damper Board has already been used (in a
simple way) to drive the FNAL Debuncher Ring LLRF
system.
- A more ambitious (but achievable) goal is to
replace the entire Booster LLRF with a Damper.
86MOTIVATION: Booster Low-Level RF. The Final
Frontier.
87Booster Low-Level RF
881. WCM
2. RPOS
5. MI RF
4. BDOT
→ Notching and Cogging
3. TCLK
AB DRIVE OUT
89Booster LLRF External Connections
- 5 Inputs
- Wall-Current Monitor (Phase)
- Transverse Pickup (RPOS) (BNL Uses two)
- Start Pulse (TCLK)
- BDOT (Low bandwidth: replace w/lookup?)
- MI AA Marker (Phase lock notch cogging)
- Two Outputs: Cavity AB Drives
- (Optional?) Beam Clock Output
90Digital Booster LLRF Concept
TCLK, 53 MHz, MI AA, MDAT,...
ETHERNET
crystal ? 400 MHz
Monster FPGA
DDS Beam Synched Clock 160-212 MHz (4x Booster
RF)
Wall Current Monitor (PHASE)
FAST ADC
Minimal Analog Filter
12
12
91CONCLUSIONS
- Fast ADCs and Huge FPGAs are revolutionizing
Accelerator Instrumentation
- The same basic hardware can perform a large
number of Instrumentation & Control functions
- A good first application of this technology is
the 3-coordinate bunch-by-bunch beam damper