Title: EGEE information systems: MDS and BDII
Zoltán Balaton, MTA SZTAKI
2 Contents
- The Information Systems of the LCG-2
- Globus MDS2
- LCG-2 LDAP based IS (BDII)
- R-GMA
- Architecture
- GRIS
- GIIS
- BDII
- LDAP protocol
- GLUE Schema
3 How to discover resources?
Uses of the Information System
Once a user is logged into a User Interface (UI),
they are ready to use the Grid for their own
application. But which resources are available to
accomplish their tasks? The answer comes from
interacting with the Information System (IS).
The IS provides information about the Grid
resources and their status. The resources include
hardware (CPU, memory, disk), software
(applications, services), storage, network, etc.
Both users (through the UI) and other services
(e.g. the Resource Broker, RB) need the IS.
4 Uses of the Information System
- If you are a middleware developer
  - Workload Management System: matching job
    requirements and Grid resources
  - Monitoring Services: retrieving information
    about Grid resource status and availability
- If you are a user
  - Retrieve information about
    - Grid resources and status
    - Resources that can run your job
    - Status of your jobs
- If you are a site manager or service
  - You generate the information, for example
    relative to your site or to a given service
5 The Information System
- Two main Information System technologies are used
  in EGEE: one LDAP based, from Globus, and R-GMA,
  developed by the European DataGrid Project
- The LDAP based information system is based on the
  Globus Monitoring and Discovery Service (MDS)
- In LCG-2, the Berkeley DB Information Index
  (BDII), based on an updated version of the
  Monitoring and Discovery Service (MDS) from
  Globus, was adopted as the main provider of the
  Information Service
- The Relational Grid Monitoring Architecture
  (R-GMA) is also adopted in both the current LCG
  middleware (LCG-2) and the new EGEE middleware
  (gLite 3.0), to which the production grid is
  currently transitioning
6 What is LDAP
- The MDS and BDII information systems are built on
  the Lightweight Directory Access Protocol (LDAP)
- A directory is a specialized distributed database
  optimized for reading, browsing and searching
- It offers a hierarchical view of information made
  up of entries
- Entries are attribute collections identified by a
  unique and global Distinguished Name (DN)
- The DN also defines the hierarchy of entries,
  which are organized in a Directory Information
  Tree (DIT)
- Resources (computers, storage, ...) each publish
  their part of this tree
- Queries can be posed to the Information and
  Monitoring Service using LDAP search commands
- LDAP establishes the transport and format of the
  messages used by clients to access a directory.
  It is the internal protocol used by the EGEE/LCG
  services to share information
7 Information in LDAP
- The LDAP information model is based on entries
- These are attribute collections identified by a
  unique and global DN (Distinguished Name)
- Information is organized in a tree-like
  structure. A special attribute, objectclass, can
  be defined for each entry. It defines the class
  hierarchy the entry belongs to, and it can be
  used to filter for entries containing a given
  object class
- The information is imported into and exported
  from the LDAP server via LDIF files (LDAP Data
  Interchange Format)
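As an illustration of the LDIF format and of filtering on objectclass, here is a minimal Python sketch. It assumes a simplified LDIF (one "name: value" pair per line, blank line between entries) and ignores line continuations and base64 values, which real LDIF allows. The sample entries and the helper names `parse_ldif` and `with_objectclass` are made up for this example.

```python
def parse_ldif(text):
    """Parse simplified LDIF text into a list of entries.

    Each entry is a dict mapping lower-cased attribute names to a list
    of values. Continuation lines and base64 ("::") values are NOT handled.
    """
    entries, current = [], {}
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:                      # a blank line closes the entry
            if current:
                entries.append(current)
                current = {}
            continue
        if line.startswith("#"):          # LDIF comment
            continue
        name, _, value = line.partition(":")
        current.setdefault(name.strip().lower(), []).append(value.strip())
    if current:
        entries.append(current)
    return entries


def with_objectclass(entries, cls):
    """Keep only the entries whose objectclass list contains cls."""
    wanted = cls.lower()
    return [e for e in entries
            if wanted in (v.lower() for v in e.get("objectclass", []))]


# Illustrative entries only, not taken from a real server.
SAMPLE = """\
dn: id=pml,ou=IT,or=CERN,st=Geneva,c=Switzerland,o=grid
objectClass: person
cn: Patricia M. L.
phone: 5555666

dn: ou=IT,or=CERN,st=Geneva,c=Switzerland,o=grid
objectClass: organizationalUnit
ou: IT
"""

people = with_objectclass(parse_ldif(SAMPLE), "person")
print(people[0]["dn"][0])
```

The object class comparison is case-insensitive here because LDAP attribute names and object class names are matched without regard to case.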
8 The Directory Information Tree
- The Lightweight Directory Access Protocol
  structures data as a tree
- Following a path from the node back to the root
  of the DIT, a unique name is built (the DN):
  id=pml,ou=IT,or=CERN,st=Geneva,c=Switzerland,o=grid
Tree shown on the slide, one level per line:
  o=grid (root of the DIT)
    c=US   c=Switzerland   c=Spain
      st=Geneva
        or=CERN
          ou=IT   ou=EP
            id=pml   id=gv   id=fd
Attributes of the example entry:
  objectClass: person
  cn: Patricia M. L.
  phone: 5555666
  office: 28-r019
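The relation between a DN and the path through the DIT can be sketched in a few lines of Python. This is a simplified illustration: it splits on plain commas and does not handle escaped commas or multi-valued RDNs, which the LDAP DN syntax permits. The function names are hypothetical.

```python
def dn_to_path(dn):
    """Return the RDNs of a DN ordered from the root of the DIT to the entry."""
    rdns = [part.strip() for part in dn.split(",") if part.strip()]
    return list(reversed(rdns))


def parent_dn(dn):
    """Return the DN of the parent entry, or None for a single-RDN DN."""
    rdns = [part.strip() for part in dn.split(",") if part.strip()]
    return ",".join(rdns[1:]) if len(rdns) > 1 else None


dn = "id=pml,ou=IT,or=CERN,st=Geneva,c=Switzerland,o=grid"
for depth, rdn in enumerate(dn_to_path(dn)):
    print("  " * depth + rdn)   # prints the path as an indented tree
```

Reading the DN from right to left walks down the tree from o=grid to the entry; dropping the leftmost component gives the parent.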
9 An LDAP Hierarchy
10 The GLUE Schema
- A schema describes the attributes, and the types
  of the attributes, associated with data objects
- The EGEE Information System conforms to the GLUE
  Schema
  - GLUE: Grid Laboratory for a Uniform Environment
- The GLUE Schema activity aims to define a common
  conceptual data model to be used for Grid
  resource monitoring and discovery
- There are three main components of the GLUE
  Schema, which describe the attributes and values
  of site information
  - Computing Element, Storage Element and Network
    Monitoring
- Key information for the RB
  - GlueCEApplicationRuntimeEnvironment tags
  - TotalCPUs, FreeCPUs
  - EstimatedTraversalTime (ETT)
  - Network Cost
- It describes the Grid resource information
  stored by the IS
- It follows the DIT (Directory Information Tree)
  hierarchical structure for objectclasses and
  attributes
11 lcg-info --list-attrs
Attribute name   Glue object class   Glue attribute name
EstRespTime      GlueCE              GlueCEStateEstimatedResponseTime
WorstRespTime    GlueCE              GlueCEStateWorstResponseTime
TotalJobs        GlueCE              GlueCEStateTotalJobs
TotalCPUs        GlueCE              GlueCEInfoTotalCPUs
MaxRunningJobs   GlueCE              GlueCEPolicyMaxRunningJobs
CE               GlueCE              GlueCEUniqueID
WaitingJobs      GlueCE              GlueCEStateWaitingJobs
MaxCPUTime       GlueCE              GlueCEPolicyMaxCPUTime
MaxTotalJobs     GlueCE              GlueCEPolicyMaxTotalJobs
CEStatus         GlueCE              GlueCEStateStatus
CEVOs            GlueCE              GlueCEAccessControlBaseRule
FreeCPUs         GlueCE              GlueCEStateFreeCPUs
RunningJobs      GlueCE              GlueCEStateRunningJobs
MaxWCTime        GlueCE              GlueCEPolicyMaxWallClockTime
Accesspoint      GlueCESEBind        GlueCESEBindCEAccesspoint
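The table can be read as a lookup from short attribute names to GLUE attributes. The Python sketch below, using a few rows copied from the table, shows how such a lookup could be turned into an LDAP search filter. The function `ldap_filter` is hypothetical and is not how lcg-info itself is implemented; values are inserted without LDAP filter escaping, so it is only suitable for simple numeric or alphanumeric values.

```python
# A few rows of the table: short name -> (Glue object class, Glue attribute)
GLUE_ATTRS = {
    "CE": ("GlueCE", "GlueCEUniqueID"),
    "TotalCPUs": ("GlueCE", "GlueCEInfoTotalCPUs"),
    "FreeCPUs": ("GlueCE", "GlueCEStateFreeCPUs"),
    "RunningJobs": ("GlueCE", "GlueCEStateRunningJobs"),
    "WaitingJobs": ("GlueCE", "GlueCEStateWaitingJobs"),
    "Accesspoint": ("GlueCESEBind", "GlueCESEBindCEAccesspoint"),
}


def ldap_filter(short_name, value):
    """Build an equality filter for one short attribute name.

    NOTE: the value is not escaped; special characters such as
    '(', ')', '*' and '\\' would need escaping in a real client.
    """
    object_class, glue_attr = GLUE_ATTRS[short_name]
    return f"(&(objectClass={object_class})({glue_attr}={value}))"


print(ldap_filter("FreeCPUs", 2))
```

A filter string like this could be passed as the last argument of an ldapsearch command to restrict the result to matching entries.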
12 Monitoring and Discovery Service
The Architecture of the MDS-2
- Computing and storage resources at a site
  implement an entity called Information Provider,
  which generates the relevant information about
  the resource (e.g. the number of free CPUs, the
  used space in an SE)
- This information is published by the Grid
  Resource Information Servers, or GRISes
- One or more Grid Index Information Servers
  (GIISes) can get information from different
  GRISes and publish it, allowing specialised or
  global views and searching
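The roles described above (information providers feeding a GRIS, GIISes aggregating GRISes or other GIISes) can be modelled with a toy Python sketch. This is only a conceptual in-memory model under invented names: the classes, host names and attribute keys are hypothetical, and the real services talk LDAP rather than calling each other directly.

```python
class GRIS:
    """Publishes one entry for a resource, filled in by its information providers."""

    def __init__(self, resource, providers):
        self.resource = resource
        self.providers = providers        # callables returning dicts of attributes

    def entries(self):
        merged = {"resource": self.resource}
        for provider in self.providers:
            merged.update(provider())     # each provider contributes attributes
        return [merged]


class GIIS:
    """Collects entries from registered sources (GRISes or other GIISes)."""

    def __init__(self, sources):
        self.sources = sources

    def entries(self):
        collected = []
        for source in self.sources:
            collected.extend(source.entries())
        return collected

    def search(self, predicate):
        return [e for e in self.entries() if predicate(e)]


# Hypothetical resources and attribute names.
ce = GRIS("ce.example.org", [lambda: {"FreeCPUs": 3}])
se = GRIS("se.example.org", [lambda: {"UsedSpaceKB": 2472756}])
site = GIIS([ce, se])      # site-level index
top = GIIS([site])         # higher-level index built from site indexes

print(top.search(lambda e: e.get("FreeCPUs", 0) > 0))
```

Because a GIIS only needs its sources to offer `entries()`, the same class serves both the site level and a global level, which mirrors the "specialised or global view" mentioned on the slide.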
13 The Architecture of the MDS-2
14 The MDS-2 Architecture and BDII
- Computing and storage resources at a site report
  their static and dynamic status via the GRISes
  (Grid Resource Information Servers) to the GIIS
  (Grid Index Information Server)
- The role of the GIIS is to collect information
  from all the GRISes and from other GIISes, but it
  has shown scalability limits as the number of
  sites grows
- Because of this, the BDII (Berkeley DB
  Information Index) was introduced by LCG
- The GIIS was at first kept at site level, to
  collect information from the site GRISes. Later
  it was replaced by a site BDII
15 The BDII
- The BDII queries the GRISes periodically and acts
  as a cache, storing information about the Grid
  status in its database
- Each BDII contains information from the site
  GIISes defined by a configuration file, which it
  accesses through web interfaces
- Very up-to-date information can be found by
  directly interrogating the site GIISes or the
  local GRISes that run on the specific resources
- The Resource Broker uses a BDII for matchmaking
  purposes
- Users and other Grid services can interrogate
  BDIIs to get information about the Grid status
16 The responsible services
- Lower level: GRIS
  - Scripts and configuration files generate LDIF
    files containing the information (for example,
    general information about the nodes)
  - Other tools, the so-called information
    providers, are responsible for the dynamic
    information (for example, available and/or used
    space in an SE)
- Medium level: local GIIS or site BDII
  - Same procedure, taking the information from the
    registered GRISes
- Top level: BDII
  - Publishes the information of the site GIISes,
    refreshing every 2 minutes
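The caching behaviour of the BDII, combined with the 2 minute refresh mentioned above, can be approximated with a small Python sketch. Note the substitution: a real BDII refreshes in the background on a schedule, whereas this sketch refreshes lazily when a query arrives and the cached copy is older than the maximum age. The class name, the `fetch` callable and the injectable clock are all assumptions made for the example.

```python
import time


class CachingIndex:
    """Serve cached data, refetching it once it is older than max_age seconds."""

    def __init__(self, fetch, max_age=120.0, clock=time.monotonic):
        self._fetch = fetch        # callable that collects data from the lower level
        self._max_age = max_age    # 120 s corresponds to "every 2 minutes"
        self._clock = clock        # injectable so the behaviour can be tested
        self._data = None
        self._stamp = None

    def query(self):
        now = self._clock()
        if self._stamp is None or now - self._stamp >= self._max_age:
            self._data = self._fetch()   # lazy refresh, not a background thread
            self._stamp = now
        return self._data


# Usage with a fake clock and a fetch function that counts its calls.
fake_now = [0.0]
fetch_calls = []


def fake_fetch():
    fetch_calls.append(fake_now[0])
    return {"fetched_at": fake_now[0]}


index = CachingIndex(fake_fetch, max_age=120.0, clock=lambda: fake_now[0])
first = index.query()
fake_now[0] = 60.0
second = index.query()     # still served from the cache
fake_now[0] = 120.0
third = index.query()      # cache is 120 s old, so it is refreshed
```

The trade-off is the one the slides describe: clients get fast answers from the cache, at the cost of data that can be up to one refresh interval old.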
17 IS Components: GRISes, GIISes and BDII
The EGEE information system
Abbreviations
- BDII: Berkeley DataBase Information Index
- GIIS: Grid Index Information Server
- GRIS: Grid Resource Information Server
Each site or VO can run a BDII. It collects the
information coming from the GIISes.
At each site, a site GIIS collects the information
given by the GRISes.
Local GRISes run on CEs and SEs at each site and
report dynamic and static information.
From LCG 2.3.0, the site GIIS has been replaced by
a local BDII.
18 LDAP Browser
- Command line to query a BDII
  - ldapsearch -h gridit-cert-rb.cnaf.infn.it -p 2170 -b "mds-vo-name=local, o=grid" -x
- To query a GIIS/GRIS
  - ldapsearch -h gridit-ce-001.cnaf.infn.it -p 2135 -b "mds-vo-name=local, o=grid" -x
  - ldapsearch -h grid007g.cnaf.infn.it -p 2135 -b "mds-vo-name=local, o=grid" -x
- Windows tool
  - Softerra LDAP Browser 2.6 (freeware),
    http://www.ldapadministrator.com/download/index.php
- Linux
  - LDAP Browser/Editor, http://www.iit.edu/~gawojar/ldap
  - GQ LDAP client, http://biot.com/gq/
19 ldapsearch
- E.g.
  ldapsearch -x
    -h <hostname>
    -p 2135
    -b "mds-vo-name=local, o=grid"
- or
  ldapsearch -x
    -H <LDAP_URI>
    -b "mds-vo-name=local, o=grid"
20 Interrogating the GRIS on a CE
ldapsearch
- The command used to interrogate the GRIS located
  on host lxn1181.cern.ch is
  ldapsearch -x
    -h lxn1181.cern.ch
    -p 2135
    -b "mds-vo-name=local, o=grid"
- or
  ldapsearch -x
    -H ldap://lxn1181.cern.ch:2135
    -b "mds-vo-name=local, o=grid"
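The two equivalent forms of the command (host and port options versus an LDAP URI) can be generated from one Python helper, which may be convenient when scripting queries against several hosts. This is a sketch: `ldapsearch_argv` is a hypothetical helper that only builds the argument list and does not run anything. The `-h`/`-p` options are documented as deprecated in favour of `-H` in current OpenLDAP manual pages, so the URI form is the safer choice on recent systems.

```python
import shlex


def ldapsearch_argv(host, port=2135, base="mds-vo-name=local,o=grid", use_uri=False):
    """Build the argument list for an anonymous (-x) ldapsearch query."""
    argv = ["ldapsearch", "-x"]
    if use_uri:
        argv += ["-H", f"ldap://{host}:{port}"]     # URI form
    else:
        argv += ["-h", host, "-p", str(port)]       # host/port form from the slides
    argv += ["-b", base]
    return argv


host_form = ldapsearch_argv("lxn1181.cern.ch")
uri_form = ldapsearch_argv("lxn1181.cern.ch", use_uri=True)

# shlex.join produces a shell-safe command line for display or logging.
print(shlex.join(host_form))
print(shlex.join(uri_form))
```

The resulting list could be handed to `subprocess.run(argv, capture_output=True, text=True)` on a machine where ldapsearch is installed and the server is reachable.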
21 The complete hierarchy
- Local GRISes run on CEs and SEs at each site and
  report dynamic and static information regarding
  the status and availability of the services
  ldapsearch -x -h <hostname> -p 2135 -b "mds-vo-name=local,o=grid"
- At each site, a site GIIS or site BDII collects
  the information of all resources given by the
  GRISes
  ldapsearch -x -h <hostname> -p 2135 -b "mds-vo-name=<name>,o=grid"
  ldapsearch -x -h <hostname> -p 2170 -b "mds-vo-name=<name>,o=grid"
- Each site can run a top level BDII. It collects
  the information coming from the sites and stores
  it in a database
  ldapsearch -x -h <hostname> -p 2170 -b "o=grid"
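The port and search base used at each level of the hierarchy can be summarised as a small lookup table. The Python sketch below encodes the values from the commands above; the level labels ("gris", "site-giis", and so on) and the function name are invented for this example.

```python
# level -> (port, base DN template), as used in the ldapsearch commands above
LEVELS = {
    "gris": (2135, "mds-vo-name=local,o=grid"),
    "site-giis": (2135, "mds-vo-name={site},o=grid"),
    "site-bdii": (2170, "mds-vo-name={site},o=grid"),
    "top-bdii": (2170, "o=grid"),
}


def query_target(level, site=None):
    """Return (port, base DN) for the given level of the information system."""
    port, base = LEVELS[level]
    if "{site}" in base:
        if not site:
            raise ValueError(f"level {level!r} needs a site name")
        base = base.format(site=site)
    return port, base


print(query_target("gris"))
print(query_target("site-bdii", site="my-site"))   # "my-site" is a placeholder
print(query_target("top-bdii"))
```

The pattern is that port 2135 belongs to the MDS services (GRIS, GIIS) and port 2170 to BDIIs, while the search base widens from a single resource to the whole o=grid tree.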
22 How to query the IS?
LCG info commands
- To query the IS elements directly, two high level
  tools are provided.
These tools should be enough for most common user
needs and will usually avoid the need for raw
LDAP queries.
23 lcg-infosites
- The lcg-infosites command can be used as an easy
  way to retrieve information on Grid resources for
  most use cases.
USAGE: lcg-infosites --vo <vo name> options
       -v <verbose level> --is <BDII to query>
- Check if the LCG_GFAL_INFOSYS environment
  variable is set to the correct Information Index
  (BDII)
  - echo $LCG_GFAL_INFOSYS
  - export LCG_GFAL_INFOSYS=grid004.ct.infn.it:2170
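A script that wraps the info commands may want to verify the variable before use. The following Python sketch checks that LCG_GFAL_INFOSYS has the host:port form shown above and splits it. The function name is hypothetical, and the sketch assumes a single endpoint in the variable.

```python
import os


def bdii_endpoint(environ=os.environ):
    """Return (host, port) parsed from LCG_GFAL_INFOSYS, or raise ValueError."""
    value = environ.get("LCG_GFAL_INFOSYS", "")
    host, sep, port = value.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(
            f"LCG_GFAL_INFOSYS should look like host:port, got {value!r}"
        )
    return host, int(port)


# Passing a dict instead of the real environment keeps the example reproducible.
host, port = bdii_endpoint({"LCG_GFAL_INFOSYS": "grid004.ct.infn.it:2170"})
print(host, port)
```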
24 lcg-infosites options
25 Obtaining information about CE
- lcg-infosites --vo gilda ce
These are the related data for gilda (in terms of
queues and CPUs)
CPU   Free  Total Jobs  Running  Waiting  ComputingElement
----------------------------------------------------------
  4     3       0          0        0     cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-long
  4     3       0          0        0     cn01.be.itu.edu.tr:2119/jobmanager-lcglsf-short
 34    33       0          0        0     grid010.ct.infn.it:2119/jobmanager-lcgpbs-long
 16    16       0          0        0     grid011f.cnaf.infn.it:2119/jobmanager-lcgpbs-long
  1     1       0          0        0     grid006.cecalc.ula.ve:2119/jobmanager-lcgpbs-log
  2     1       1          0        1     gildace.oact.inaf.it:2119/jobmanager-lcgpbs-short
..
- lcg-infosites --vo gilda ce --v 2
RAMMemory  Operating System  System Version  Processor  CE Name
---------------------------------------------------------------
1024       SLC               3               P4         ced-ce0.datagrid.cnr.it
4096       SLC               3               Xeon       cn01.be.itu.edu.tr
1024       SLC               3               PIII       cna02.cna.unicamp.br
917        SLC               3               PIII       gilda-ce-01.pd.infn.it
1024       SLC               3               Athlon     gildace.oact.inaf.it
1024       SLC               3               Xeon       grid-ce.bio.dist.unige.it
..
26 Obtaining information about SE
- lcg-infosites --vo gilda se
These are the related data for gilda (in terms of
SE)
Avail Space(Kb)  Used Space(Kb)  Type  SEs
------------------------------------------
143547680        2472756         disk  cn02.be.itu.edu.tr
168727984        118549624       disk  grid009.ct.infn.it
13908644         2819288         disk  grid003.cecalc.ula.ve
108741124        2442872         disk  gildase.oact.inaf.it
28211488         2948292         disk  testbed005.cnaf.infn.it
349001680        33028           disk  gilda-se-01.pd.infn.it
31724384         2819596         disk  cna03.cna.unicamp.br
387834656        629136          disk  grid-se.bio.dist.unige.it
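Output like the SE listing above is easy to post-process. The Python sketch below parses rows of the form "available used type host" and picks the SE with the most available space. It is an illustration only: it assumes whitespace-separated columns as shown here, silently skips any line that does not have two leading integers (headers, separators, or rows with non-numeric values), and uses three rows copied from the listing as sample input.

```python
def parse_se_rows(text):
    """Parse 'avail_kb used_kb type se' rows; skip headers and separators."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0].isdigit() and parts[1].isdigit():
            rows.append({
                "avail_kb": int(parts[0]),
                "used_kb": int(parts[1]),
                "type": parts[2],
                "se": parts[3],
            })
    return rows


SAMPLE_OUTPUT = """\
Avail Space(Kb)  Used Space(Kb)  Type  SEs
------------------------------------------
143547680        2472756         disk  cn02.be.itu.edu.tr
168727984        118549624       disk  grid009.ct.infn.it
349001680        33028           disk  gilda-se-01.pd.infn.it
"""

rows = parse_se_rows(SAMPLE_OUTPUT)
best = max(rows, key=lambda r: r["avail_kb"])
print(best["se"], best["avail_kb"])
```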
27 lcg-info intro
- This command can be used to list either the CEs
  or the SEs that satisfy a given set of
  conditions, and to print the values of a given
  set of attributes
- The information is taken from the BDII specified
  by the LCG_GFAL_INFOSYS environment variable
- The query syntax is like this:
  attr1 op1 value1,...,attrN opN valueN
  - where attrN is an attribute name
  - op is =, >= or <=, and the cuts are ANDed
  - The cuts are comma-separated and spaces are
    not allowed
After the upgrade to the new GLUE Schema it is no
longer possible to use the operators >= and <=.
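To make the query syntax concrete (comma-separated cuts, operators =, >= and <=, all cuts ANDed), here is a Python sketch that parses such a query string and applies it to plain dictionaries. It illustrates the semantics only and is not how lcg-info is implemented; the function names and the sample records are made up. Values are compared numerically when both sides parse as numbers, otherwise only string equality is supported.

```python
import operator
import re

OPS = {"=": operator.eq, ">=": operator.ge, "<=": operator.le}
CUT = re.compile(r"^([A-Za-z]\w*)(>=|<=|=)(.+)$")


def parse_query(query):
    """Split 'attr1=val1,attr2>=val2' into (attr, op, value) tuples."""
    cuts = []
    for part in query.split(","):
        match = CUT.match(part)
        if match is None:
            raise ValueError(f"cannot parse cut {part!r}")
        cuts.append(match.groups())
    return cuts


def matches(record, cuts):
    """Return True if the record satisfies every cut (logical AND)."""
    for attr, op, value in cuts:
        actual = record.get(attr)
        if actual is None:
            return False
        try:
            ok = OPS[op](float(actual), float(value))   # numeric comparison
        except ValueError:
            ok = op == "=" and str(actual) == value     # strings: equality only
        if not ok:
            return False
    return True


cuts = parse_query("FreeCPUs>=2,OS=SLC")
records = [
    {"CE": "ce-a.example.org", "FreeCPUs": 3, "OS": "SLC"},   # hypothetical data
    {"CE": "ce-b.example.org", "FreeCPUs": 1, "OS": "SLC"},
]
selected = [r["CE"] for r in records if matches(r, cuts)]
print(selected)
```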
28 lcg-info options
29 lcg-info examples
List all the CE(s) in the BDII satisfying given
conditions
- lcg-info --list-ce --query 'FreeCPUs=2' --attrs 'FreeCPUs,OS'
- CE: gildace.oact.inaf.it:2119/jobmanager-lcgpbs-infinite
  - FreeCPUs  2
  - OS        SLC
- CE: gildace.oact.inaf.it:2119/jobmanager-lcgpbs-long
  - FreeCPUs  2
  - OS        SLC
- CE: gildace.oact.inaf.it:2119/jobmanager-lcgpbs-short
  - FreeCPUs  2
  - OS        SLC
- CE: trigrid-ce00.unime.it:2119/jobmanager-lcgpbs-infinite
  - FreeCPUs  2
30 Questions
- THANK YOU FOR YOUR ATTENTION