Title: ALICE DCS Review, Geneva November 14, 2005 DCS System Components and Performance
(Version 1.7)
2. Outline
- Functionality and performance of DCS software components
- The ALICE DCS computers
- The core software components
  - PVSSII
  - Hardware access standards
  - User interface and remote system access
  - DCS databases
  - Alerts and error handling
3. Introduction
- The ALICE DCS is based on the PVSSII SCADA system with add-ons
- Several commercial standards and technologies were tested and adopted
- JCOP and ALICE add-ons allow the existing system to be adapted to ALICE-specific needs
- The performance of individual components was studied
  - Invaluable information provided by JCOP (SUP project, FWWG, ...)
  - Data collected from commercial documentation and performance studies
  - ALICE test setup in the DCS lab
  - Experience of other PVSSII users
4. In this presentation we will discuss
The functionality and performance of the individual system components:
- The test setup
- PVSSII
- Hardware access
- User interface
- Databases
- Error handling
6. The DCS Computing Setup
- Back-end and front-end system prototypes are installed in the DCS lab
- The infrastructure is used for
  - Prototyping
  - Performance tests
  - HW and SW compatibility tests
- The computers operate on a separate firewalled network routed from CERN
- Transition to CNIC foreseen
7. The DCS Computing Lab
Front-end prototype
Pre-installation servers
Backend prototype
8. Backend Systems
Pre-installation servers (now installed at P2)
Backend prototype
9. Pre-installation Servers
- Configuration database server
- Archive database server
- Primary domain controller (to be replaced by NICEFC)
- Secondary domain controller
- Backend services
- Remote access server
- File server
The pre-installation servers are already installed at P2.
10. Worker Node Prototypes
- Several 2U rack-mounted computers were tested for use in the DCS
- Tests covered hardware compatibility and long-term stability
- Several recommended models cover the whole range of detector needs
11. View Inside a 2U Rack-mounted Computer
- 2U PCI riser
- PCI CAN controller
- NI-MXI2 VME master
13. In this section we will explain
- Basic concepts of PVSSII
  - Managers
  - Data model
  - Data flow
- PVSSII extension by the JCOP Framework
14. PVSSII Architecture
- PVSSII (currently ver. 3.01 SP1) is a commercial SCADA system used in all four LHC experiments
- PVSSII functions are carried out by managers
  - Independent program modules
  - Communicate via the PVSS protocol over TCP/IP
  - Can be scattered across several computers for optimum load balancing
- Managers subscribe to data provided by the detector equipment
  - Data is sent on change
  - Deadbands can be defined for drivers (data is sent only if a significant change is seen)
- The number and type of managers needed per detector depends on the hardware architecture and operational mode
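The deadband mechanism can be sketched in a few lines; this is an illustrative Python model of the idea, not PVSSII code, and all names are invented for the example.

```python
# Illustrative model of a driver deadband: a value change is
# forwarded to the Event Manager only if it differs from the last
# forwarded value by more than the deadband.
def deadband_filter(values, deadband):
    last = None
    forwarded = []
    for v in values:
        if last is None or abs(v - last) > deadband:
            forwarded.append(v)   # significant change: send on change
            last = v
    return forwarded

readings = [10.00, 10.02, 10.05, 10.60, 10.61, 11.30]
forwarded = deadband_filter(readings, deadband=0.5)
# only 10.00, 10.60 and 11.30 reach the EM; the small jitter is suppressed
```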
15. PVSSII Architecture
16. Basic PVSSII Managers
- Event Manager (EV)
  - Responsible for the data flow in PVSSII
  - Coordinates communication in the system
  - Distributes information to all managers which subscribed to data
- Database Manager (DB)
  - Responsible for data manipulation; interface to the databases
- Control Manager (CTRL)
  - Allows execution of user scripts written in a C-like language
- API Manager (API)
  - Provides access to PVSSII functions and data for custom code written in C
- User Interface (UI)
  - Allows creation of custom user interfaces to the PVSSII system
  - In development mode provides an environment for creating custom panels and for parametrization of PVSSII objects and functionality
- Driver (D)
  - Handles communication with the hardware
17. In a scattered system the managers run on many machines; in a simple system all managers run on the same machine.
18. In a distributed system several PVSSII systems (simple or scattered) are interconnected.
19. PVSSII Data Model
- Data in PVSSII is carried by DataPoints (DP) of a given DataPoint Type (DPT)
- The DPT describes the device structure
  - User definable
  - Allows for high complexity
- A DP instantiated from the DPT represents the controlled device (e.g. an HV channel)
- A DataPoint is structured into elements (DPE)
  - DPEs can be of several types: float, int, char, dynamic array, ...
  - A DPE can contain references to other datapoints
  - A DPE value can be derived as a function of other DPEs
- DataPoint Elements can contain sub-elements (configs) which define their behavior (e.g. archival, alert handling, etc.)
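The DPT/DP/DPE hierarchy can be pictured with a small Python sketch; the HV-channel element names are hypothetical, chosen only to mirror the example above.

```python
# A DataPoint Type (DPT) describes the device structure; a DataPoint
# (DP) instantiated from it represents one controlled device, and its
# typed DataPoint Elements (DPEs) carry the values.
HV_CHANNEL_DPT = {"vSet": float, "vMon": float, "iMon": float, "status": int}

def instantiate(dpt, name):
    """Create a DP from a DPT with default-initialised DPEs."""
    return {"name": name, "elements": {dpe: t() for dpe, t in dpt.items()}}

ch = instantiate(HV_CHANNEL_DPT, "hv/crate1/board0/channel3")
ch["elements"]["vSet"] = 1500.0   # write one DPE of this channel
```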
20. Example: Simple HV Channel
21. DPE attributes (Example: OPC config)
22. DPE attributes (Example: Archive config)
In this example the DPE archiving is configured.
24. Why do we need the Framework?
- A standard device definition can contain several thousand parameters to be defined as DPs
- Common hardware solutions are widely adopted; the software needs are also common
- The JCOP FW shortens the development process by providing standard tools and definitions
[Diagram: Projects 1-3 sharing a common device definition (e.g. SY1527)]
25. Benefits from the Framework
- The JCOP Framework reduces the development effort
  - Reuse of components
  - Hides complexity
  - Facilitates integration
  - Helps in the learning process
  - Provides development guidelines
- The FW provides a higher layer of abstraction
  - Reduces the required knowledge of the tools
  - Interface for non-experts
- The FW is not a final application
26. Framework Deliverables
- Guidelines document
  - Naming conventions, colors, alert classes, development
- Devices
  - Common hardware; mechanism to define new types
- Tools
  - Device Editor & Navigator, Finite State Machines, trending, configuration from a DB, installation
- Help system
  - Within the code for libraries; an HTML file per panel
27. JCOP FW Architecture
Supervisory Application
Framework Tools
Framework Devices
Framework Core
FSM
PVSS
OPC, DIM, DIP
Databases
Web
28. PVSSII
- Operation
- Performance
29. In this section we will explain
- The dataflow in PVSS
- Protection against overload
- Raw PVSSII performance (data digestion)
30. Dataflow in PVSS
- Data in PVSSII is exchanged via messages between the individual managers
- All managers maintain their input and output queues
- Communication with the EM is synchronous
- The queue length is configurable
  - The standard settings allow for storing 100k DPEs in a queue
[Diagram: CTRL and UI managers with IN/OUT queues communicating synchronously with the EM; a driver (client/server, e.g. OPC or DIM) connects the EM to the hardware]
31. PVSSII Protection Against Overload
- If a manager which is not a driver creates traffic which cannot be processed by the EM, the connection is closed: the so-called evasive action
- If a driver generates excessive traffic, an overflow queue is created
  - Memory is dynamically allocated
  - The OVF queue contains one entry per changing DPE
  - Older values are rewritten with newer ones
  - Once the standard queue is drained, the contents of the OVF queue are transmitted
[Diagram: driver (e.g. OPC, DIM) feeding the EM through the overflow queue; CTRL and UI managers with IN/OUT queues]
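The overflow-queue bookkeeping described above amounts to keeping one slot per DPE in which newer values overwrite older ones; a minimal sketch:

```python
# One entry per changing DPE: an overloaded driver's values collapse
# to the latest value per element, transmitted once the standard
# queue is drained.
overflow = {}

def on_overflow(dpe, value):
    overflow[dpe] = value          # newer value rewrites the older one

for dpe, value in [("ch1.vMon", 10.0), ("ch2.vMon", 20.0),
                   ("ch1.vMon", 11.0), ("ch1.vMon", 12.0)]:
    on_overflow(dpe, value)

drained = sorted(overflow.items())  # what is sent after the drain
# ch1.vMon kept only its latest value, 12.0
```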
32. Raw PVSSII Performance
- Question: how much data can be digested by the PVSSII system?
- The default queue size is set to store 100 000 DPEs
  - This parameter is configurable
- For reference, a fully equipped CAEN crate needs 8000 DPEs to be configured
  - Only a small subset of these is monitored
33. Data Processing Tests with PVSSII
- In the following tests the data insertion rates were studied
- The system was flooded with data using the following algorithm
34. PVSSII Operation During a Burst, without Archival (OS performance counters)
1. The CPU (PIV 2 GHz) needs 7.7 s to execute 100k DP changes
2. PVSSII needs 49 s to exchange the dpSet-related messages between the involved managers
3. PVSSII needs 52 s until all queues are empty
- The CPU is loaded during the burst generation (execution of the dpSet command)
- Network activity indicates the communication between the managers
- PVSSII confirms the end of the burst (dpSetWait), but the network activity continues until the queues are drained
35. PVSSII Operation During a Burst, with Archival (OS performance counters)
1. The CPU (PIV 2 GHz) needs 9.1 s to execute 100k DP changes
2. PVSSII needs 51 s to exchange the dpSet-related messages between the involved managers
3. PVSSII needs 55 s until all queues are empty
- If PVSSII archival is active, the CPU is involved as well; the data generation is slower by ~10%
- The time needed for full queue processing increases by ~20%
36. Burst Tests with a Single Data Source
PIV EM64T, 3.4 GHz on I7221 chipset
37. Partial Conclusions on the Previous Tests
- The processing times scale linearly with the size of the burst
- The unaltered PVSSII system was able to digest bursts of up to 170 000 changes (floats) before the evasive action occurred
- The measurements suggest that the system could cope with a sustained rate of 5500 DP changes/s coming from the same source
38. Burst Tests with Multiple Data Sources
- The previous test was repeated with several data sources (scripts) with synchronized execution
- The independent processes were restarted once all previous bursts were fully digested
- Tests were performed with 1, 2, 3 and 4 independent data generators
39. Burst Tests with Multiple Data Sources
PIV EM64T, 3.4 GHz on I7221 chipset
40. Partial Conclusions on the Previous Tests
- Tests with multiple generators showed that the processing time scales linearly
- 4 generators induced up to 700 000 DPE changes in synchronized bursts
- The sustained data processing rate is slightly better than the one obtained with a single generator (thanks to the parallel execution of the generators)
- Long-term tests did not reveal any problems even after several days of running
41. Big Distributed Systems
- The load in ALICE will be shared by a distributed system
- What load can be digested by a distributed system?
- Can a really big distributed system be created and operated?
- Is it possible to retrieve data from other (remote) systems in a distributed environment?
42. Overall System Performance
- A big distributed system can be seen as a collection of individual PVSSII systems
- A data overload on one system in principle does not affect the other PVSSII systems
  - Individual systems operate independently
  - Data exchanged between individual systems is filtered (e.g. an alert avalanche is never propagated across systems)
- If the load over DIST becomes significant, PVSSII can take evasive actions
43. Distributed System Size
- 130 systems were created
- 40 000 DPEs were defined in each system
  - Equivalent of 5 fully equipped CAEN crates
- 5 200 000 DPEs were defined in total
- The systems were interconnected and operated successfully
44. Data Retrieval in a Distributed System: Trends
- PVSSII behavior was studied in a setup with 16 computers organized in a 3-level tree architecture
- Each system had 40 000 DPEs defined
- Each leaf node was generating data with a 1000 ms delay / 750 changes
- The top-level node was able to display trends from the remote systems
- Tests were interrupted for practical reasons (no space left) when 48 trend windows (each showing 16 remote channels) were open
45. Intermediate Conclusions
- A big distributed PVSSII system with 130 connected systems was successfully created
- Data retrieval from remote nodes works correctly
46. Remark Concerning the Functionality
- No indirect connections between PVSSII systems are possible
- Only systems connected via a DIST manager can communicate (A-B, C-B, D-A, D-B)
- The DIST manager does not allow request routing (A-C is not possible in this example)
- No risk of inadvertent data corruption by a remote system (not connected via DIST)
[Diagram: systems A, B, C, D with their DIST connections]
48. In this section we will explain
- The two preferred technologies used to access the DCS hardware
- OPC (OLE for Process Control)
  - Functionality
  - Performance
- DIM (Distributed Information Management) system
  - Functionality
  - Performance
49. Access to Hardware Devices
- Many hardware technologies are used in the DCS
- Low-level access is carried out by 3rd-party device drivers provided by the hardware manufacturers
- Need for device standardization (e.g. CAN controllers)
- The standard protocols and technologies OPC and DIM are implemented on top of the proprietary protocols to hide the hardware details
- PVSSII needs to know only these tools in order to communicate with the devices
50. OPC Architecture
- OPC (OLE for Process Control) is a client-server model providing access to device data
- OPC servers can be provided by different vendors
- The OPC server is mainly a gateway between the OPC protocol (built on top of DCOM) and the proprietary protocols
[Diagram: OPC client -> OPC interface -> OPC server -> device access -> device]
51. OPC Servers are Widely Used in ALICE
(Wiener, ISEG, ELMB and CAEN devices)
52. OPC Items and Groups
- OPC represents data as items collected in groups
- An OPC item is a transient object linked to the source data
- OPC servers typically create a separate thread for each group
  - Important for performance tuning
[Diagram: two OPC clients accessing the device channels through groups of items]
53. Raw OPC Performance
- Tests performed at CERN confirmed the raw OPC performance numbers specified in the OPC whitepapers
- The time T to read n OPC items in synchronous mode is
  - T = 515 + 85·n µs (1 < n < 10000)
- The performance of a server connected to real hardware depends on
  - Implementation
  - Hardware internal latency
  - Server refresh rate
  - Grouping of OPC items
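Reading the slide formula as T = 515 + 85·n µs, a quick worked example:

```python
# Synchronous OPC read time from the whitepaper figures:
# T = 515 + 85*n microseconds, valid for 1 < n < 10000 items.
def opc_sync_read_time_us(n_items):
    assert 1 < n_items < 10000
    return 515 + 85 * n_items

t_100 = opc_sync_read_time_us(100)    # 9015 us, i.e. ~9 ms for 100 items
```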
54. CAEN OPC Server and SY1527
- Tests were performed with a CAEN SY1527 mainframe equipped with 15 boards (12 channels each)
55. CAEN OPC Server and SY1527
- Each channel was switched ON and then OFF using the JCOP framework hierarchy (FSM)
- Switching time for n channels with a 1000 ms refresh rate:
  - T = 1200 + 20·n ms
- Switching time for n channels with a 500 ms refresh rate:
  - T = 950 + 20·n ms
- The average time spent in the FSM is <3 s; this must be added to the OPC switching time
- Total switching time for 180 channels: ~4.5 s, i.e. with the FSM time up to ~7.5 s in total
56. Remarks on the Previous Numbers
- CAEN claims that the single-channel performance does not depend on the module type and is constant
- The internal latency of the devices has an impact on the overall OPC server performance
- The internal latency T_I of a system controlled via the CAEN 1617A EASY branch controller depends on
  - the number of boards, N1
  - the number of channels per board, N2
  - the number of parameters monitored per channel (Vmon, Imon, ...), N3
- T_I = (N1·10 + N2·N3·1.5) ms
  - (i.e. 1.5 ms per parameter, 10 ms per board)
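Reading the garbled slide formula as T_I = N1·10 + N2·N3·1.5 ms (10 ms per board, 1.5 ms per monitored parameter), a cross-check against the SY1527 test crate from the previous slides:

```python
# Internal latency of a CAEN 1617A EASY branch, as read from the
# slide: 10 ms per board plus 1.5 ms per monitored parameter.
def easy_internal_latency_ms(n_boards, n_channels, n_params):
    return n_boards * 10 + n_channels * n_params * 1.5

# 15 boards, 12 channels/board, 2 monitored parameters (Vmon, Imon)
t = easy_internal_latency_ms(15, 12, 2)   # 150 + 36 = 186 ms
```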
57. Remarks on the CAEN 1617A Branch Controller
- Once gathered by the 1617A, the parameters are accessible via the OPC server, which polls each channel
- New, improved firmware will allow asynchronous communication with the OPC server
  - The refresh time will then not depend on the number of channels
58. Device Access (continued): DIM
59.
- OPC is a well documented standard, but
  - Tied to Windows technology
  - Sometimes difficult to implement
- DIM (Distributed Information Management system) provides a lightweight mechanism built on top of TCP/IP
  - Platform independent
  - Based on the client/server paradigm
  - Servers provide services to clients
  - Servers accept commands from clients
60. DIM Operation
- DIM is based on the concept of named services
- Servers publish their services by registering them with the name server
- A client requests a service and is updated
  - at regular time intervals
  - whenever the conditions change
- Clients subscribe to a service by contacting the server directly (after receiving the server name from the name server)
[Diagram: server registers its services with the name server; client requests the service info, then subscribes directly to the server, receiving service data and sending commands]
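The handshake above can be caricatured in a few lines of Python; this is a toy registry illustrating the lookup-then-subscribe pattern, not the real DIM API, and the service name and address are invented.

```python
# Toy DIM-style name service: servers register services under a
# name, clients resolve the name once and then talk to the server
# directly.
name_server = {}                       # service name -> server address

def register(service, server_addr):
    name_server[service] = server_addr

def subscribe(service):
    addr = name_server[service]        # 1) ask the name server once
    return f"subscribed to {service} at {addr}"  # 2) contact the server

register("ALICE/HV/vMon", "srv04:2505")
msg = subscribe("ALICE/HV/vMon")
```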
61. Use of DIM in the ALICE DCS
- DIM is the underlying protocol of several tools, including the FSM
- The ALICE DCS deploys DIM to interface with custom systems such as the laser calibration system or the AREM Pro power supplies
- DIM is essential for communication with the Front-end and Readout Electronics (FERO)
62. DIM Service Updating Rates
Test configuration: SRV04 and SRV03: 2x Xeon 2.8 GHz, 1 GB RAM; pclhcb155: PIV 3 GHz, 1 GB RAM
63. DIM Server Throughput
Test configuration: SRV04 and SRV03: 2x Xeon 2.8 GHz, 1 GB RAM; pclhcb155: PIV 3 GHz, 1 GB RAM
64. PVSSII User Interfaces (Local and Remote UI)
65. UI Terminology
- All graphical user interfaces in PVSSII are called panels
- Panels are created using the graphical editor GEDI
- The PARA module is an interface allowing DP definition, parametrization and manipulation
- In a panel, processes can be visualized and controlled using graphic objects and underlying scripts
66. Example: Panel Editor and Event Parametrization
- Each panel event (e.g. a mouse click) can trigger an action described in a script
- Panel objects can be linked to datapoints
67. Local UI Tests
- Tests were performed with up to 100 local UIs
- A local UI needs 27 MB of memory per UI
- The CPU load depends on what the UI is doing; the UI itself adds negligible load
- PVSSII can take evasive actions if it runs out of resources
68. Remote Access to the DCS
- It is requested that the DCS can be accessed remotely
- A remote access mechanism was designed and tested in ALICE
- For safety and security reasons, only remote monitoring will be available as the default option
69. The PVSS Remote UI
[Diagram: User Interface Manager with an RDP connection (to the external client) and a PVSS remote UI connection (to the PVSS system) on the DCS network]
70. Remote Access to the DCS Network
[Diagram: CERN network connected to the DCS network through the DCS firewall]
71. Remote Access to the DCS Network: Tunneling via CERNTS
- The approach was tested from Utrecht, Bratislava and Catania
- Details of the TS setup are currently being discussed with IT
- The ALICE test setup is operational
[Diagram: Internet -> CERN FW -> CERN TS -> DCS FW -> DCS TS -> DCS network]
72. Terminal Server Test Setup
- A dual Xeon 2.8 GHz, 3 GB RAM machine running Windows 2003 Server was used as the terminal server
- The master project was running on a P-IV 3 GHz equipped with 1 GB RAM
- A number of P-IV computers were used as remote user stations
- In the tests we studied the impact of remote connections on CPU utilization and memory allocation on the terminal server, the remote user stations and the master project
73. Computer Infrastructure for Remote Access Tests
[Diagram: PVSS master project (Windows XP Pro) on the DCS private network 192.168.39.0; a terminal server/router (Windows Server 2003) bridging to the CERN network; several remote user stations (Windows XP Pro)]
74. Performed Tests
- The master project generated 50000 datapoints, which were repeatedly updated in bursts of 3000 DPs
- The remote UI displayed 50 to 200 of them at the same time
- Tests were performed for
  - different numbers of remote sessions
  - different numbers of datapoints updated on the remote screen
- Connections were made to both a busy and an idling master project
75. Terminal Server Load
The master project generated 50000 datapoints and updated 3000/s. Annotated phases: master project stopped / running; remote panel displaying 50 / 200 values.
76. Load of the Workstation Running the Master Project
Annotated phases: master project stopped / running; remote client disconnected / connected.
77. Computer Loads for a Large Number of Remote Clients
The master project generated 50000 datapoints and updated 3000/s; the remote client displayed 50 values at a time.
78. Conclusions on the TS and UI Tests
- No performance problems were observed for the master project
- The TS load depends on the number of external connections
  - The TS CPU easily supported up to 60 external connections (no tests were made beyond this number)
  - Memory consumption was measured to be 30 MB per remote session
  - In addition, each remote UI consumed 30 MB of memory
- No disturbing effects (such as freezing) were observed on the remote workstations
- The TS client stores persistent bitmaps, so the remote screen refresh is very fast
80. In this section we will explain
- The alert handling implementation in PVSSII
- The performance of alert handling
81.
- PVSSII offers a comprehensive alert handling strategy based on industrial standards
  - DIN 3699: Process control using display screens
  - DIN 19235: Measurement and control; signaling of operating conditions
82.
- Alerts are sent if the monitored DPE value changes from one alert range to another
- Alerts can be visualized
  - as text on the alert screen
  - as a state display (e.g. a flashing graphical element)
- Alerts might require acknowledgment on the Alert and Error Screen (AES)
- Automatic functions can be assigned to alerts
  - The function is executed when the alert is triggered
83. Alert Severity Levels in the Framework
- FW components hide the internals of alert handling and provide
  - tools and conventions for alert definition
  - uniform naming which assures a coherent system view across detectors (e.g. priorities are grouped into severity levels with a well defined meaning)
- The agreed FW severity levels are
  - Warning: an alert level which indicates an undesired condition. In such a case there is no impact on the physics data quality and no immediate intervention is required.
  - Error: an alert level which indicates an undesired condition for which action is required. In such a case there is a risk of an unwanted impact on the physics data quality or a risk to the detector.
  - Fatal: an alert condition for which immediate intervention is required.
84. Alert Ranges
- Alert ranges must be matched with the severity levels
- For each DPE at least two ranges must exist
  - Valid range
  - Alert range
- Example: a temperature probe with 5 defined ranges
85. Alert Classes
- Common alert classes simplify the parameterization of alert handling
  - by grouping together the parameterizable properties of the alert ranges of datapoint elements
- Each alert has an alert class assigned
- The alert class describes the acknowledgment mode for the alert
- A CTRL script can be used, for example, to change the status of graphical elements, start the horn, display a message, etc.
- Users can define additional alert classes
86. Example of Alert Parametrization Using Ranges and Classes
87. Alert Statuses
- Each alert can have one of 5 possible statuses
  - No alert: normal status
  - Incoming, not acknowledged: alert active, not yet acknowledged
  - Incoming, acknowledged: alert active, already acknowledged
  - Outgoing, unacknowledged: status not pending, not yet acknowledged
  - Outgoing, acknowledged: status not pending, already acknowledged
88. Acknowledgment Types
- Acknowledgment types are assigned to DPEs to allow for different handling algorithms
  1. Not acknowledgeable: the alert changes only as a result of a status change
  2. Acknowledgment deletes: a pending alert is reset to "No alert"
  3. Incoming acknowledgeable: the message is acknowledgeable
  4. Alert pair requires acknowledgment: same as 3, but the outgoing alert also requires acknowledgment; a new alert overwrites the old one
  5. Incoming and outgoing require acknowledgment: same as 4, but both alerts must be acknowledged (a new alert does not overwrite the old one)
- Types 1 and 5 are supported by the FW tools
89. Group Alerts
- Several alerts can be OR-linked to form a group alert
- The alert is fired if any of the linked alerts is pending
- Using the panel topology, group alerts can be linked across several panels
- Only summary information is brought to the attention of the operator (e.g. "crate has a problem" if any of its channels has a problem)
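The OR-linking of a group alert is literally a logical OR over the member alerts; a one-line sketch with invented channel names:

```python
# Summary ("group") alert: pending if any linked channel alert is
# pending, so the operator sees only "crate has a problem".
channel_alerts = {"ch0": False, "ch1": True, "ch2": False}
crate_alert = any(channel_alerts.values())
```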
90. Suppression (Masking) of Alerts
- The suppression function can activate/deactivate any of the alerts
- Very useful if the monitored value is outside of its limits but this is irrelevant for the process (e.g. current fluctuations during the ramping process or FERO configuration)
- Suppressed alerts are considered non-existent
91. Alert Hysteresis
- Hysteresis is defined to avoid repeated alert generation for values fluctuating around the alert limit
- Value-related hysteresis uses dead zones around the alert limit
- Time hysteresis for digital values: a bit must remain in a given state for a certain amount of time
- Time hysteresis for analog values: adds time inertia to the value-related hysteresis
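A minimal sketch of the value-related dead zone, with invented numbers: the alert fires only above limit + dead zone and clears only below limit − dead zone.

```python
# Value-related hysteresis: suppress repeated alerts for a value
# jittering around the alert limit.
def alert_states(values, limit, dead_zone):
    alert, states = False, []
    for v in values:
        if not alert and v > limit + dead_zone:
            alert = True               # entered the alert range
        elif alert and v < limit - dead_zone:
            alert = False              # left it again, past the dead zone
        states.append(alert)
    return states

states = alert_states([9.9, 10.1, 10.6, 10.1, 9.6, 9.2],
                      limit=10.0, dead_zone=0.5)
# one alert cycle instead of several, despite the jitter around 10.0
```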
92. Value-Related Hysteresis
[Diagram: three alert ranges separated by upper/lower limit pairs (dead zones); a fluctuating value crosses the zones, triggering the priority levels defined in PVSS (1, 2, 3, 2, 1)]
93. Framework Alert Config Panel
Framework tools hide the complexity of PVSSII alert handling and provide an easy way to configure the application.
94. Alert Handling Performance
95. What is the system load caused by the alert definition?
- Question: does the alert definition affect the achievable DPE change rate?
- Answer: no; if the alert is not provoked, the DPE change rate is not affected by the alert definition (with or without alert checking)
  - The test system could cope with 1400 DPE changes/s
- What is the load if the alert is provoked?
  - If one level was passed, the rate dropped to 1100 changes/s
  - If two levels were passed, the rate dropped to 725 changes/s (because both alerts were processed)
96. Memory Consumption Caused by the Alert Definition
- Tests were performed by defining the DPEs and measuring the memory used by the EM
- After adding the alert definition, the memory consumption was measured again
- Declaration of alerts takes extra memory
  - A 5-level alert definition requires approx. 2500 additional bytes per DPE
  - Activation of the alerts takes ~2% more memory
97. Interpretation of the Test Results
- The described tests showed the limits for alert absorption by the PVSSII system
- Further tests were performed to show the difference between displaying alerts from the local and from a remote system
- For a typical CAEN crate, all local alerts are displayed within 2 s
- If the display of additional alerts from a remote system is required, the total time is 3 s
98. Present Status of Developments
- Alert definition and handling is fully supported by all FW devices
- Tools are available in the FW
- Alert configuration is supported by the FW configuration database tool
- The FWWG is studying an improved AES screen
- ETM is working on improved alert filtering methods
99. Partial Conclusions
- PVSSII offers a powerful alert handling mechanism based on industrial standards
- The test system could cope with an alert avalanche of 10000 alerts
- The test system could support a sustained alert rate of 200 alerts/s
- Group alerts and the panel topology allow for efficient alert reporting
- In a distributed system each AES screen can retrieve alerts from a remote system without significant overhead
- New developments are underway
100. ALICE DCS Databases
- Configuration
- Archival
101. The DCS Data Flow
[Diagram: PVSSII at the center, exchanging data via DIM/DIP with DAQ, TRI, HLT and ECS (FERO version tag); services (electricity, ventilation, cooling, gas, magnets, safety, access control, LHC); the configuration DB and the archive]
103. Configuration Database
- DCS configuration data:
  - System static configuration (e.g. which processes are running, managers, drivers, etc.)
  - Device static configuration (device structure, addresses, etc.)
  - Device dynamic configuration, the recipe (device settings, archiving, alert limits, etc.)
  - ALICE add-on: the FERO configuration, which is in fact also a device configuration, both dynamic and static
104. Configuration Database
[Diagram: the configuration DB holds the system static, device static and device dynamic configurations (common solution, FW devices only) plus the ALICE-specific FERO configuration, feeding PVSS-II, the underlying software and the hardware]
105. Configuration Database
- Device static configuration
  - One-dimensional versioning
  - Complete description of the PVSS device
- Device dynamic configuration (recipes)
  - Set-points and alert configuration
  - Two-dimensional versioning (operating mode and version)
- Initially a default recipe is loaded to the device as part of its static configuration
106. FW PVSS Configuration: Prototype Implementation
- Database access is based on ADO, both on Windows and Linux
  - ETM provides sets of libraries enabling quasi-ADO functionality on Linux
- The underlying database system is Oracle
- The configuration database tool is part of the framework distribution and is available for download on the framework pages
107. FW Configuration Database Tests (1)
- Development and tests were performed by Piotr Golonka (the results are still preliminary, but already convincing)
- The test configuration consists of a CAEN SY1527 with 16 A1511 boards (12 channels each)
- Typical properties were taken into account
  - Values: i0, i1, v0, v1 setpoints, switch on/off, ramp speeds, trip time
  - Alerts: overcurrent, overvoltage, trip, undervoltage, hw alert, current, channel on, status, voltage
- Test results
  - 10 s needed to store the recipe in the cache
  - 10.5 s to apply the recipe from the cache to PVSS
  - No performance drop observed for parallel operation of 26 systems
108. The FW Configuration DB: Results
- The total time needed to download a configuration for a CAEN crate (using the caching mechanism) is 10-10.5 s
- Significant speed improvement: this number should be compared with the time of ~180 s reached with the previous generation of FW configuration tools
- Further performance improvements are expected (using a prefetching mechanism, etc.)
- Incremental recipes will bring both performance and functionality improvements
  - An incremental recipe is loaded on top of an existing recipe
109. FERO Configuration
- The FERO configuration is ALICE-specific
- Each detector has different requirements, e.g. data volumes, data structures, access patterns
- The FERO configuration data is downloaded by specialized software (FED servers) on PVSSII request
- Depending on the architecture, the FERO data is stored either as individual records per channel or as BLOBs
- In the following tests we estimated the data downloading performance based on the available information
110. Data Download from the Oracle Server (BLOBs)
- Large BLOBs were stored on the DB server connected to the private DCS network (1 Gbit/s server connection, 100 Mbit/s client connection)
- The retrieval rate of 150 MB configuration BLOBs by 3 concurrent clients was measured to be 3-11 MB/s per client
  - The upper limit of 11 MB/s is set by the client network connection
- For comparison, retrieval of 150 MB configuration BLOBs by 1 client on the CERN 10 Mbit/s network: 0.8 MB/s (cached)
- The results depend on the Oracle cache status: the first retrieval is slower, succeeding accesses are faster
- Depending on the detector access patterns, the performance can be optimized by tuning the server's cache
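As a back-of-envelope check of the numbers above (our arithmetic, not from the slide):

```python
# Time to fetch one 150 MB configuration BLOB at the measured
# per-client rates; 11 MB/s is close to the ~12 MB/s ceiling of a
# 100 Mbit/s client link, supporting the network-limit interpretation.
blob_mb = 150
seconds = {rate: blob_mb / rate for rate in (3, 11)}   # MB/s -> s
# 50 s per BLOB at 3 MB/s, ~13.6 s at 11 MB/s
```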
111. Test with Small BLOBs
- Test records consist of 10 BLOBs of 10 kB each
- 260 configuration records were retrieved per test
- The allocation of BLOBs to configuration records was altered
  - from random
  - to shared (certain BLOBs were re-used between configuration records) -> to test the Oracle caching mechanism
112. BLOB Retrieval Rates
Retrieval rate per client for different numbers of clients, as a function of the fraction of re-used BLOBs.
113. Data Download Results (BLOBs)
- The obtained results are comparable with the raw network throughput (direct copy between two computers)
- The DB server does not add significant overhead, not even for concurrent client sessions
- The main bottleneck could be the network
- Further network optimization is possible according to the DB usage pattern (the DCS switches are expandable to more performant backbone technologies; the servers can be organized in a cluster)
- The problem will be studied continuously as more input from the detectors becomes available
- For the pre-installation phase, the existing configuration server is a 2x Xeon (3 GHz) with a 2 TB SATA RAID array connected to a powerful switch
114. Data Download from Tables
- Performance was studied using a realistic example: the SPD configuration
- The FERO data of one SPD readout chip consists of
  - 44 DAC settings per chip
    - There are in total 1200 chips used in the SPD
  - A mask matrix for 8192 pixels per chip
  - 15x120 front-end registers
- The full SPD configuration can be loaded within 3 s
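A rough size estimate for this configuration (our arithmetic, not from the slide; it assumes one byte per DAC setting and one mask bit per pixel, and ignores the front-end registers):

```python
# Order-of-magnitude size of the SPD FERO configuration.
chips = 1200
dac_bytes = 44 * chips            # 44 DAC settings/chip, 1 byte assumed
mask_bytes = 8192 // 8 * chips    # 8192-pixel mask/chip, 1 bit per pixel
total_mb = (dac_bytes + mask_bytes) / 1e6
# ~1.3 MB in total, so loading within 3 s needs only ~0.4 MB/s sustained
```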
116. The DCS Archive and Conditions Data
- The DCS archive contains all measured values (tagged for archival)
  - Used by the trending tools
  - Allows access to historical values, e.g. for system debugging
  - Too big and too complex for offline data analysis (contains information which is not directly related to physics data, such as safety, services, etc.)
- The main priority of the DCS is to assure the reliable archival of its data
- The conditions database contains a sub-set of the measured values, stripped from the archive
- Unlike the other LHC experiments, ALICE does not use the COOL-based conditions database
117Archive Organization
- Online Archive
- Available in P2
- Buffers DCS data
- Limited size
- Backup Archive
- Created by Oracle streaming
- Contains full data set
- Available for external requests
- Conditions DB
- Population mechanism under development
- Contains data relevant to analysis
[Diagram: subsystem data sources (DAQ, TRI, HLT, ECS, LHC, electricity, ventilation, cooling, gas, magnets, safety, access, configuration) feed PVSSII; data flows into the Online Archive at P2, is replicated to the Backup Archive, and a subset is stripped into the Conditions DB]
118DCS Archive implementation
- Standard PVSS archival
  - Archives are stored in local files (one set of files per PVSSII system)
  - OFFLINE archives are stored on backup media (tools for backup and retrieval are part of PVSSII)
  - Problems with external access to the archive files due to the PVSS proprietary format
- The PVSS RDB archival
  - Is replacing the previous method based on local files
  - An Oracle database server is required
  - The architecture resembles the previous file-based concept: an ORACLE tablespace replaces the file
  - A library handles the management of data on the Oracle server (create/close tables, back up tables)
119DCS Archival status
- The mechanism based on ORACLE was delivered by ETM and tested by the experiments
- All PVSS-based tools are compatible with the new approach (we can profit from the PVSS trending etc.)
- Several problems were discovered
  - The main concern is the performance (~100 inserts/s per PVSSII system), which is not enough to handle peak loads in a reasonable time
  - ATLAS demonstrated a way of inserting 1000 changes/s by modifying the mechanism; confirmed in the ALICE DCS lab
  - Architectural changes were requested
- A list of requirements was compiled and sent to ETM
- An ORACLE expert (paid by CERN) is working with ETM on improvements (as of this week)
- The DCS will provide file-based archival during the pre-installation phase and replace it with ORACLE as soon as a new version qualifies for deployment
- Tools for later conversion from files to the RDB will be provided
120Data Insertion Rate to the DB Server
- The mediocre RDB archival performance was confirmed also by other groups
- In order to exclude a possible incorrect server configuration, additional tests were carried out in the DCS lab
- Data was inserted into a 2-column table (number(38), varchar2(128), no index)
- The following results were obtained (inserting 10⁷ rows into the DB):
  - OCCI autocommit: 500 values/s
  - PL/SQL (bind variables): 10000 values/s
  - PL/SQL (varrays):
    - >73000 rows/s (1 client)
    - >42000 rows/s/client (2 concurrent clients)
    - >25000 rows/s/client (3 concurrent clients)
- Comparing these results with the estimated performance of the PVSS systems, we can conclude that a possible limitation could be the interface implementation
(results obtained with the new server installed last week)
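The large gap between autocommit and array inserts above is a generic effect of batching and commit frequency, not something specific to Oracle. A minimal stand-in demonstration (sqlite3 instead of Oracle; the table and row counts are illustrative, not the measurements quoted above):

```python
# Per-row autocommit vs. array-bound bulk insert, using sqlite3 as a
# stand-in for Oracle. The slide's numbers were measured with OCCI and
# PL/SQL; this only illustrates why batching wins.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE archive (id INTEGER, val TEXT)")
rows = [(i, f"value-{i}") for i in range(100_000)]

# Per-row insert with an immediate commit (analogue of OCCI autocommit).
t0 = time.perf_counter()
for r in rows[:1000]:
    conn.execute("INSERT INTO archive VALUES (?, ?)", r)
    conn.commit()
slow = time.perf_counter() - t0

# Array-bound bulk insert in a single transaction (analogue of varrays).
t0 = time.perf_counter()
conn.executemany("INSERT INTO archive VALUES (?, ?)", rows)
conn.commit()
fast = time.perf_counter() - t0

print(f"per-row: {1000 / slow:.0f} rows/s, batched: {len(rows) / fast:.0f} rows/s")
```

The relative speed-up depends on the engine, but the pattern (fewer round-trips and commits per row) is the same one exploited by the PL/SQL varray results.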
121Internal PVSSII Archive Organization
- For each group of archived values a set of tables is generated
  - LastVal: input table storing the most recent value
  - Current: a table keeping a history record of measured values
    - An Oracle trigger is used to write data to this table
    - The table is closed if it exceeds a predefined size and is replaced with a new one
    - The latest available values are copied to the new table automatically
  - Online: a copy of the closed Current table
  - Offline: a table written to backup media and not directly available
- A set of internal tables for database maintenance exists
- Archive tables use their own tablespace (this is why a trigger is needed)
- For each Datapoint several parameters are stored
  - Value
  - Timestamp(s)
  - Flags (for example reliability)
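The table life cycle described above (a LastVal table plus a size-limited Current table that is closed, kept Online, and replaced with a fresh table seeded with the latest values) can be sketched in a few lines. The class name, table-name format and row limit here are illustrative assumptions, not the real PVSS schema:

```python
# Toy model of the PVSSII archive rollover scheme described above.
class ArchiveGroup:
    def __init__(self, name, max_rows=3):
        self.name = name
        self.max_rows = max_rows  # "predefined size" triggering a close
        self.seq = 0              # running table number
        self.last_val = {}        # most recent (value, ts) per datapoint
        self.current = []         # the open Current history table
        self.online = {}          # closed tables, still directly readable

    def insert(self, dp, value, ts):
        self.last_val[dp] = (value, ts)
        self.current.append((dp, value, ts))
        if len(self.current) >= self.max_rows:
            self._rollover()

    def _rollover(self):
        # Close the Current table: it becomes an Online table.
        self.online[f"{self.name}History{self.seq:08d}"] = self.current
        self.seq += 1
        # Latest available values are carried into the new Current table.
        self.current = [(dp, v, t) for dp, (v, t) in self.last_val.items()]

g = ArchiveGroup("SPD")
for i in range(5):
    g.insert("hv.vMon", 1500 + i, i)
print(sorted(g.online))  # two closed tables after five inserts
```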
122Internal PVSSII Archive Organization
[Table naming example for the SPD: numbered Offline history tables (Offline SPD History / HistoryValues 00000011, 00000012), numbered Online tables (Online SPD History / HistoryValues 00000123, 00000124), the open Current tables (Current SPD History / HistoryValues 00000125), and the SPD LastVal / LastValValues tables; the ...Values tables hold arrays, the others basic datapoints]
123Accessing the Archive Tables
- A set of libraries is available on the PVSS side to access the data from the database
- The libraries hide the archival complexity: the user only needs to specify the Datapoint name and the time interval
- A dedicated library allows for direct database access (bypassing the PVSS managers)
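What such a library has to hide can be shown with a toy version: the caller supplies only a datapoint name and a time interval, and the function scans all history-table partitions behind the scenes. The table layout and names below are invented for the example, not the real PVSS schema:

```python
# Toy version of the archive-access library: resolve a datapoint/interval
# query across several history-table partitions.
def dp_get_period(tables, dp, t_from, t_to):
    """tables: dict of table name -> list of (dp, value, ts) rows."""
    hits = [
        (ts, value)
        for rows in tables.values()
        for (name, value, ts) in rows
        if name == dp and t_from <= ts <= t_to
    ]
    return sorted(hits)  # chronological (ts, value) pairs

# Illustrative partitioned archive for one group of values.
tables = {
    "SPDHistory00000000": [("hv.vMon", 1500.0, 10), ("hv.iMon", 2.1, 11)],
    "SPDHistory00000001": [("hv.vMon", 1502.5, 20)],
    "SPDHistoryCurrent":  [("hv.vMon", 1503.0, 30)],
}
print(dp_get_period(tables, "hv.vMon", 0, 25))  # [(10, 1500.0), (20, 1502.5)]
```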
124Example of Data Selection in PVSS (1)
125Example of Data Selection in PVSS (2)
126Example of Data Selection in PVSS (3)
127Example of Data Selection in PVSS (4)
Result of the query
Generated SQL Statement
128External Access to PVSSII RDB Archive
- The database schema is available: data can be queried from external programs
- A set of basic Oracle views is provided by ETM
  - The complexity of the table partitioning is hidden
- A PVSS HTTP server allows for remote DP queries
129AMANDA
- Developments towards the conditions DB implementation have been launched
- AMANDA is a PVSS-II manager which uses the PVSS API to access the archives
- Developed by the DCS team in collaboration with the offline team
- Work is still in progress (performance tests not yet finished)
- AMANDA components:
  - PVSSII AMANDA manager (server)
  - Win C client
  - ROOT client and interface to the OFFLINE
  - AMANDA communication protocol
- The archive architecture is transparent to AMANDA
- A dedicated client can access AMANDA at predefined time intervals (or at the end of each run) and retrieve data from the DCS archive
130AMANDA
[Diagram: an AMANDA client connects to the AMANDA server, an API manager attached to the PVSS-II system alongside the user interface managers, control manager, event manager, database manager, drivers and archive manager(s), which in turn access the archive(s)]
131Operation of AMANDA
- After receiving a connection request, AMANDA creates a thread which handles the client request
- Due to present limitations of the PVSS DM, the requests are served sequentially as they arrive (the DM is not multithreaded)
- AMANDA checks the existence of the requested data and returns an error if it is not available
- AMANDA retrieves data from the archive and sends it back to the client in formatted blocks
- AMANDA adds additional load to the running PVSS system!
- Final qualification of AMANDA for use in the production system depends on the user requirements
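The serving scheme above (one thread per client connection, but a single-threaded DM forcing sequential archive access) amounts to a lock around the data-manager call. A minimal sketch with illustrative names:

```python
# One thread per client, but archive requests are serialized through a
# single lock because the data manager is not multithreaded.
import threading

dm_lock = threading.Lock()  # stands in for the single-threaded PVSS DM
served = []

def fake_dm_query(request):
    # Placeholder for the archive lookup done via the PVSS API.
    return f"data-for-{request}"

def handle_client(request):
    # Each client gets its own thread, but only one request at a time
    # reaches the data manager.
    with dm_lock:
        served.append(fake_dm_query(request))

threads = [threading.Thread(target=handle_client, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(served))
```

Concurrency therefore buys responsiveness on the network side only; the archive reads themselves queue up, which is why AMANDA adds load to the running PVSS system.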
132AMANDA in a distributed system
- In the case of file-based data archival, PVSSII can directly access the data stored by its own data manager
- PVSSII can also request data from remote systems if they are accessible via the DIST manager
- The DM of the remote system is always involved in the data transfer
- One or more AMANDA servers can be installed per detector
- In the case of RDB archival, the DM can retrieve any data provided by other PVSSII systems directly from the database, without involving the other systems
- Dedicated PVSSII systems running general-purpose AMANDA servers can be provided
133AMANDA in the distributed environment (archiving
to files)
[Diagram: several PVSSII systems interconnected via their DIST managers, each archiving to its own local files]
134AMANDA in the distributed environment (archiving
to RDB)
[Diagram: several PVSSII systems interconnected via their DIST managers, all archiving to a common ORACLE RDB]
135Additional CondDB Developments
- Development of a new mechanism for retrieving data from the RDB archive has been launched (at present the work is driven by ATLAS)
- A separate process will access the RDB directly, without involving PVSS
  - This approach will overcome the PVSS API limitations
  - The data-gathering process can run outside of the online DCS network
- The conditions data will be described in the same database (which datapoints should be retrieved, what processing should be applied, etc.)
- Configuration of the conditions will be done via PVSS panels by the DCS users: a unified interface for all detectors
- Data will be written to the desired destination (ROOT files in the case of ALICE, COOL for ATLAS, CMS and LHCb)
- Parts of the AMANDA client could be re-used
- First prototype available
136Present status and Conclusions
- Configuration DB
  - The FW configuration database concept was redesigned
    - Significant speed improvement
    - New functionality is being added (e.g. recipe switching)
  - The FW configuration database and tools can also be used for non-standard devices (e.g. FERO) if they follow the JCOP FW specifications
  - FERO configuration performance was studied; input from the detectors is essential
- Archival DB
  - The present RDB problems are being solved
  - File-based archival will be used in the early stages of the pre-installation
  - AMANDA provides an interface to external clients, independent of the archive technology
  - Present ATLAS and future FW developments could provide an efficient method for accessing the RDB archive
137 138- DCS software components were presented
- PVSSII provides a solid base for detector operation
- The FW tools significantly simplify the effort needed by the detector groups and contribute to the integrity of the final system
- Some ALICE add-ons are detector specific and their design is based on detector requirements
- Performance studies of the individual components allow for an estimate of the global DCS performance (see talk this afternoon)
139Acknowledgments
- The data and know-how shown in this presentation were collected from many sources
- Many of the referenced numbers were provided by Angela Brett, Paul Burkimsher, Giacinto de Cataldo, Jim Cook, Clara Gaspar, Frank Glege, Piotr Golonka, Oliver Holme, Sveto Kapusta, Sasha Schmeling,