Title: TAGnet link protocol for generating event-coherent DMA bursts in trigger farms Hans Muller, Filipe Vinci dos Santos, Angel Guirao, Francois Bal, Sebastien Gonzalve CERN ED Electronics
TAGnet was developed as part of the LHCb (CERN)
HS trigger project, started in February 2002 in
collaboration with KIP Heidelberg, for use in
the level-1 VELO trigger farm.
- TAGnet is a protocol for the creation of
event-coherent DMA transfers between hardware
DMA engines of readout buffers and CPUs of a
trigger farm. TAGnet interconnects slave DMA
modules via twisted pair to a TAGnet scheduler
which collects all data requests from CPUs.
Definition: event-coherent DMA means that
interconnected hardware DMA engines are triggered
to send specified event fragments to one
requesting CPU.
2. Features of event-coherent DMA
- CPU requests new events from scheduler whilst
processing the previous one
- cheap implementation (twisted pair, FPGA
logic, PCI card)
- added functionality via message TAGs
(buffer checks, common Xon/Xoff)
- highest possible use of the network raw
bandwidth (hardware timing)
- no spurious arrivals of event fragments: all
arrive concatenated in time
- no problem with crashing farm CPUs (just one
CPU fewer)
- no worst-case destination buffer as for
round-robin (2 buffers sufficient)
- no problem with very large variations in CPU
time per event
3. LHCb VELO trigger problem
TAGnet: schedule memory-directed DMAs in
an event-coherent way
1. STF format 5 overhead )
4. What is a TAG
- More
- 4 TAG classes
- 4 TAG types
- 7 bit Done counter
- 7 bit Source Module ID
- 7 bit Coded information
- 7 bit Error correcting code
5. TAGs on a 16 bit bus
- 64 bit TAGs are transmitted in four 16 bit
words followed by 1 idle
- 17th bit (Flag) used to delimit TAGs;
heartbeats (1111011110..)
- Error-correcting Hamming code in the last word
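The framing described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word order (most significant word first), the idle pattern, and the reading that the Flag bit is 1 during the four TAG words and 0 during the idle slot (which would produce the 1111011110.. heartbeat) are not confirmed by the slides. The names `serialize_tag`, `deserialize_tag` and `flag_stream` are hypothetical.

```python
IDLE_WORD = 0x0000  # assumed idle pattern; the slides do not specify it


def serialize_tag(tag64):
    """Split a 64 bit TAG into four (flag, word16) slots plus one idle slot."""
    if not 0 <= tag64 < (1 << 64):
        raise ValueError("TAG must fit in 64 bits")
    # Assumption: most significant word first, so the last word sent
    # (the one said to carry the Hamming code) is the low 16 bits.
    words = [(tag64 >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]
    frame = [(1, word) for word in words]  # assumed: flag = 1 on TAG words
    frame.append((0, IDLE_WORD))           # assumed: flag = 0 on the idle slot
    return frame


def deserialize_tag(frame):
    """Rebuild the 64 bit TAG from one five-slot frame."""
    flags = [flag for flag, _ in frame]
    if flags != [1, 1, 1, 1, 0]:
        raise ValueError("frame does not match the 1111 0 flag pattern")
    tag64 = 0
    for _, word in frame[:4]:
        tag64 = (tag64 << 16) | word
    return tag64


def flag_stream(tags):
    """Flag-bit sequence seen on the 17th line for a run of TAGs."""
    return "".join(str(flag) for tag in tags for flag, _ in serialize_tag(tag))
```

Under these assumptions, two back-to-back TAGs give the flag sequence 1111011110, matching the heartbeat pattern quoted on the slide.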
6. TAGs over narrow links
Logical 64 bit
7. TAGnet slave
VHDL design and synthesis for FPGA
[Figure: simplified block diagram of the TAGnet slave, with IN and OUT ports]
- Packet Reception stores all incoming TAGs in
a 64 bit bypass register
- Packet FIFO only stores TAGs which are
directed to the slave
- Decoding and Execution takes the desired action
- DMA engine gets loaded with
source/destination and starts operation
- Packet Transmission copies used TAGs back
into the TAGnet ring
8. Heartbeat transport
9. TAGnet components
10. Tag classes
Valid  Consume  TAG class
0      0        invalid, not consumable. These filler TAGs have no other purpose than allowing the scheduler to control the level of usable TAGs. In order to clear the TAGnet ring the scheduler transmits only these TAGs.
0      1        invalid, consumable. These are freely consumable TAGs which can be used by any TAGnet slave to create valid TAGs at its output.
1      0        valid, not consumable. These TAGs fall into the type of directed scheduler messages, normally created by a TAGnet slave. They contain message information (such as errors) for the scheduler and hence must not be consumed by other TAGnet slaves.
1      1        valid, consumable. These TAGs fall into the types undirected command, undirected message and directed slave message, and hence contain important scheduler information (command / address / message) to be consumed by TAGnet slaves.
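The four classes follow directly from the Valid and Consume bits, which a short Python sketch can make explicit. The names `TagClass`, `classify` and `slave_may_overwrite` are hypothetical and only mirror the table; the slides do not describe how the FPGA logic encodes this.

```python
from enum import Enum


class TagClass(Enum):
    # Values are (valid, consume) bit pairs as listed in the class table.
    FILLER = (0, 0)             # invalid, not consumable
    FREE = (0, 1)               # invalid, consumable
    SCHEDULER_MESSAGE = (1, 0)  # valid, not consumable
    SLAVE_TAG = (1, 1)          # valid, consumable


def classify(valid, consume):
    """Map the two class bits to a TagClass member."""
    return TagClass((valid, consume))


def slave_may_overwrite(tag_class):
    """Only invalid, consumable TAGs may be reused by a slave to emit a valid TAG."""
    return tag_class is TagClass.FREE
```

For example, `classify(1, 0)` yields `TagClass.SCHEDULER_MESSAGE`, which a slave has to pass on untouched.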
11. Tag types
Directed/undirected  Encoded message  TAG type (only defined for valid TAGs)
0                    0                undirected slave command (C-TAG) of class VALID CONSUMABLE. These TAGs convey command and data to all slaves. These real-time TAGs are the large majority of all TAGs.
0                    1                undirected slave message (M-TAG) of class VALID CONSUMABLE. These TAGs send an encoded message to all slaves.
1                    0                directed slave message (M-TAG) of class VALID CONSUMABLE. These TAGs send command and data to one slave.
1                    1                directed scheduler message (M-TAG) of class VALID NON CONSUMABLE. These TAGs send an encoded message (error or other) from a slave to the scheduler.
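The type table also fixes which class each type must belong to, so a consistency check can be sketched as below. This is an illustrative reading of the table only: the function names and the idea of rejecting inconsistent bit combinations are assumptions, not behaviour documented in the slides.

```python
# Type bits -> (description, consume bit required by the table).
TAG_TYPES = {
    (0, 0): ("undirected slave command (C-TAG)", 1),
    (0, 1): ("undirected slave message (M-TAG)", 1),
    (1, 0): ("directed slave message (M-TAG)", 1),
    (1, 1): ("directed scheduler message (M-TAG)", 0),
}


def describe_tag(valid, consume, type_bits):
    """Return the type description, or None when the TAG is not valid."""
    if not valid:
        return None  # types are only defined for valid TAGs
    name, required_consume = TAG_TYPES[type_bits]
    if consume != required_consume:
        raise ValueError("class bits do not match the class required by this type")
    return name


def is_c_tag(valid, type_bits):
    """C-TAGs are the valid TAGs with type bits 0 0."""
    return bool(valid) and type_bits == (0, 0)
```

A valid, consumable TAG with type bits 0 0 is therefore reported as an undirected slave command, while the same type bits on an invalid TAG carry no meaning.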
12. C-TAGs -> event-coherent DMA
C-TAGs are the vast majority of TAGs. Each C-TAG
creates 1 event-coherent DMA burst to a
requesting CPU: all DMA slaves are triggered to
load identical Source/Destination in their DMA
engines and to transmit their data. Result: a
fast succession of subevents to the requester CPU.
[Figure: event-coherent DMA transfer]
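A toy software model may help to picture what one C-TAG does as it travels around the ring. Everything here is a sketch: `DmaSlave`, `circulate_c_tag`, the dictionary used as a source buffer and the per-slave increment of the 7 bit Done counter are illustrative assumptions, since the real behaviour is implemented in FPGA logic.

```python
from dataclasses import dataclass


@dataclass
class DmaSlave:
    module_id: int
    fragments: dict  # hypothetical source buffer: event number -> fragment bytes

    def on_c_tag(self, source, destination, cpu_memory):
        # Load the same source/destination as every other slave, then send
        # this slave's fragment of that event to the requesting CPU.
        cpu_memory[destination].append((self.module_id, self.fragments[source]))


def circulate_c_tag(slaves, source, destination, cpu_memory):
    """Pass one C-TAG around the ring and return the final done count."""
    done = 0
    for slave in slaves:  # ring order
        slave.on_c_tag(source, destination, cpu_memory)
        done = (done + 1) & 0x7F  # assumed use of the 7 bit Done counter
    return done
```

With three slaves holding fragments of event 7, a single call `circulate_c_tag(slaves, 7, 2, cpu_memory)` leaves three consecutive fragments in the buffer of CPU 2 and nothing anywhere else, which is the event-coherent property.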
13. Message TAGs (M-TAGs)
Message TAGs (M-TAGs) coexist with C-TAGs for
messages between slaves and scheduler. Generated
by the scheduler software, M-TAGs are not time
critical.
14. TAGnet in shared-memory farms
TAGs may be used for event-coherent
event building in any system. Shared memory
for high rate (triggers):
1.) perform high-rate event building using
memory-to-memory copy (may require block-size
aggregation)
2.) create TAGs at high rate on CPU demand
[Figure: shared-memory farm. Labels: CPU farm, TAGnet, CPU, scheduler, mem, S/N bridge, DMA, network, PCI, NIC, input links]
15. DMA measurements PCI to PCI over 6 Gbit/s network
PCI 64 bit @ 66 MHz
4 Slink
16. TAGnet Scheduler
- Hardware: FPGA logic in PCI card
  - serialize TAGs to twisted pair link (mezzanine card)
  - monitor TAGnet ring alive status (heart/errorbeats/clock)
  - auto-generation of next event-buffer (default 1)
  - monitor status of outstanding and returning C-TAGs
  - timeout for C-TAG return (programmable via a control register)
  - decode errors received via M-TAGs from slaves
  - error reporting via interrupts
  - accumulation of log-files from returning M-TAGs (SDRAM buffer)
- Software: C-TAG PCI driver, M-TAG control, error handling
  - PCI driver (Linux, W2000)
  - initialize/configure all TAGnet slaves
  - disable (throttle) triggers during setup
  - creation of C-TAGs from request table at rates > trigger rate
  - creation of special C-TAGs (Reset, Align, Flush)
  - use M-TAG functions for all setup / monitoring / diagnostic tasks
  - read / check log-files from returning M-TAGs (including error TAGs from slaves)
  - routines for interrupt error handling
  - regular source buffer verifications / flushing via M-TAGs
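The bookkeeping behind "monitor status of outstanding and returning C-TAGs" with a programmable timeout can be pictured with a small software model. This is a sketch only: the class `OutstandingCTags`, its method names and the use of microsecond timestamps are hypothetical, and the real scheduler does this in FPGA logic with a control register.

```python
class OutstandingCTags:
    """Track C-TAGs that were sent into the ring and have not yet returned."""

    def __init__(self, timeout_us):
        # Stands in for the programmable timeout control register.
        self.timeout_us = timeout_us
        self.sent_at = {}

    def sent(self, tag_id, now_us):
        self.sent_at[tag_id] = now_us

    def returned(self, tag_id):
        """Mark a C-TAG as returned; False means it was not outstanding."""
        return self.sent_at.pop(tag_id, None) is not None

    def expired(self, now_us):
        """Identifiers of C-TAGs that exceeded the timeout (error candidates)."""
        return sorted(
            tag_id
            for tag_id, sent_time in self.sent_at.items()
            if now_us - sent_time > self.timeout_us
        )
```

In such a model, an entry reported by `expired` would be the point at which the hardware raises an error interrupt.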
17. Scheduler hardware
18. Scheduler software
- C-TAG creation
  - CPU request: 12 bit identifier
  - at 1 MHz trigger rate (LHCb) minimum C-TAG request bandwidth is 2 Mbyte/s
  - burst-mode PCI driver transmits CPU requests from memory to scheduler's buffer @ 1 MHz
- M-TAG creation
  - assemble any class/type of an M-TAG on user request
  - send M-TAG
- M-TAG result collection
  - readout of M-TAG logfile from SDRAM
  - identify returned M-TAG (Type, ID, Command)
  - read result
- Error handling
  - PCI interrupt handler
  - interrupt code register PCI
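The 2 Mbyte/s figure quoted for C-TAG requests can be reproduced with simple arithmetic, assuming each 12 bit identifier is padded to one 16 bit word before it crosses the PCI bus. That padding is an assumption made here to match the slide; the function name and the `word_bits` parameter are hypothetical.

```python
import math


def request_bandwidth_bytes_per_s(identifier_bits, trigger_rate_hz, word_bits=16):
    """Bandwidth needed to send one padded request identifier per trigger."""
    words = math.ceil(identifier_bits / word_bits)  # whole words per request
    return words * (word_bits // 8) * trigger_rate_hz


# 12 bit identifier, 1 MHz trigger rate, assumed 16 bit transfer words
bandwidth = request_bandwidth_bytes_per_s(12, 1_000_000)
```

With these inputs the result is 2,000,000 bytes/s, that is 2 Mbyte/s; without padding the same request stream would need only 1.5 Mbyte/s.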
19. C-TAG software loop for 2D shared memory cluster
20. C-TAG loop timing result
Emulation of request loop for a 16 x 16 farm on
an old PC (PCI bus 32 bit, 33 MHz).
Measured transfer to PCI: 1.4 us for up to 24 CPU
request bits. Safe to say that >> 1 MHz applies
for a faster PC with 64 bit 66 MHz PCI.
Local segment (id=0x80400, size=1024) is created.
Local segment (id=0x80400) is mapped to user space.
The physical address for local segment is 2f6000
Local segment (id=0x80400) is available for remote connections.
Waiting for the DMA transfer to be ready ....
Node 8 received interrupt (0x0)
DMA transfer done!
Client data 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1024 1024 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Detecting Orca on 2f.... OK
Physical address 2f6000
duration for the writing of 161000 32bits WORDS 22843 usec
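The last line of the log is enough to estimate the sustained write throughput of this test. The following lines are only a worked calculation on the two numbers printed there; the function name is hypothetical and Mbyte is taken as 10^6 bytes.

```python
def throughput_mbyte_per_s(words, word_bytes, duration_us):
    """Bytes per microsecond, which equals Mbyte/s when Mbyte = 10**6 bytes."""
    return words * word_bytes / duration_us


# 161000 words of 32 bits (4 bytes) written in 22843 us, as printed in the log
rate = throughput_mbyte_per_s(161000, 4, 22843)
```

This comes out at roughly 28 Mbyte/s for the 32 bit, 33 MHz PCI test machine.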
21. Summary
- TAGnet is a 64 bit protocol which sends TAGs at up to 5 MHz rate in a ring of DMA slaves
- Interconnection within a TAGnet ring is based on twisted pair (CAT5)
- C-TAGs organize event-coherent DMA transfers on CPU demand
- M-TAGs serve for initialization, error reporting and control
- 4 TAGnet classes and 4 TAGnet types (16 flavors)
- TAGnet scheduler is a PCI card which receives CPU requests
- First experimental TAGnet slave implementation in LHCb Readout Unit (FPGA)
- First experimental TAGnet master implementation via programmable PCI-FLIC card
- software loop for CPU requests to scheduler demonstrated to work at more than 1 MHz
- successive event-coherent DMA measured at rates up to 2 MHz for 128 byte payloads
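To put the last summary point in context, the payload throughput implied by 2 MHz bursts of 128 bytes can be compared with the nominal peak of the 64 bit, 66 MHz PCI bus mentioned in the DMA measurement slide. This is a back-of-the-envelope sketch: the function names are hypothetical, protocol overheads are ignored, and the comparison itself is not made in the slides.

```python
def payload_throughput_mbyte_per_s(burst_rate_mhz, payload_bytes):
    """MHz times bytes per burst gives 10**6 bytes per second."""
    return burst_rate_mhz * payload_bytes


def pci_peak_mbyte_per_s(bus_bits, clock_mhz):
    """Nominal peak of a parallel PCI bus: one bus-width transfer per clock."""
    return bus_bits // 8 * clock_mhz


dma_rate = payload_throughput_mbyte_per_s(2, 128)  # 2 MHz, 128 byte payloads
pci_peak = pci_peak_mbyte_per_s(64, 66)            # 64 bit @ 66 MHz
fraction_of_peak = dma_rate / pci_peak
```

Under these assumptions the measured DMA rate corresponds to 256 Mbyte/s of payload, a little under half of the 528 Mbyte/s nominal PCI peak.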
22. PCI card with TAGnet mezzanine
FLIC card (EP-ED)
- low-cost FPGA card
- very fast host bus IF
- 64 Mbyte SDRAM
- drivers for Linux/Windows
- programmable Slink IF
23. TAGnet on LHCb Readout Unit
[Figure: Readout Unit block diagram. Labels: dual DMA engines, readout network, TAGnet, networked embedded CPU, input links (4 Slink), subevent buffer]