Managing Grid-Databases with GRelC

1
Managing Grid-Databases with GRelC
  • Ph.D. Sandro Fiore
  • SPACI Consortium and University of Salento
    (Lecce), Italy
  • ISSGC2007 - July 12th

2
Outline
  • Motivations
  • GRelC Project
  • GRelC DAS
  • Architecture
  • SDK
  • GUI
  • Queries
  • Experimental Results
  • Porting on gLite
  • Deployment
  • On Line User Tutorial (GILDA)
  • Conclusions

3
Motivations
  • Data Grids should provide a low level framework
    also for grid-database management (fine grained
    approach)
  • No new DBMS or new query language
  • Legacy systems/databases and standard SQL
  • Need for more complex and efficient queries in
    the grid
  • Integration with production grid environments
    (based on gLite, Globus, etc.)
  • Main requirements: security, transparency,
    interoperability, efficiency, robustness, etc.

4
Introducing the GRelC Project
  • Grid Relational Catalog is a project which aims
    at designing and developing a set of efficient,
    secure and transparent Data Grid Services
    (Starting date, Jan 2001).
  • GRelC Data Access Service aims at providing a
    large set of functionalities to access both
    relational and non relational DataBases in a grid
    environment.

5
GRelC Project: a bit of history
6
GRelC DAS Architecture
GRelC DAS
7
GRelC DAS Main Features
  • Entirely based on C programming language
  • Multithreaded web service
  • It exposes a GSI-enabled, WS-I compliant web
    service interface
  • Mutual authentication based on GSI (X.509v3
    digital certificates)
  • GRelC DAS authorization based on ACLs for local
    management
  • VOMS support for global management
  • Information System support (BDII compliant)
  • Wide set of data access control policies
  • Full GSI support: data encryption, data
    integrity, protection against replay attacks and
    detection of out-of-sequence packets

8
GRelC DAS Main Features
  • XML data validation for recordset
  • SingleQuery, MultiQuery and MultiSingleQuery
    Support
  • Support for synchronous and asynchronous queries
  • Dynamic binding to heterogeneous DBMSs
  • Two-level logging (users, connections, queries,
    etc.)
  • GSI-enabled remote administration tools and
    remote log
  • Compression, chunking, prefetching and streaming
    to enhance performance on a WAN
  • Wide SDK for developers (both for C and C++)
  • No dependencies on other middleware (only
    GSI)

9
GRelC DAS Architecture
10
GRelC DAS SDAI
11
Standard Database Access Interface
  • Features
  • Standard access to data sources
  • Types uniformity
  • Error uniformity
  • Plug-in architecture based on dynamic libraries
  • Dynamic binding to: PostgreSQL, MySQL, SQLite,
    IBM/DB2, Oracle9i, MS-SQL Server, UnixODBC,
    textual DBs, etc.
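
The plug-in idea on this slide (one uniform interface, uniform error codes, one driver per DBMS) can be sketched in C with a table of function pointers. This is only an illustration: the names `sdai_driver`, `sdai_find_driver`, `sdai_run` and the `SDAI_*` codes are hypothetical and are not part of the GRelC SDAI library. The slide says real drivers are bound from dynamic libraries; a static table with stand-in drivers is used here so that the example stays self-contained.

```c
#include <stddef.h>
#include <string.h>

/* Uniform error codes shared by every driver (hypothetical values). */
enum { SDAI_OK = 0, SDAI_ERR_NO_DRIVER = -1, SDAI_ERR_BIND = -2, SDAI_ERR_QUERY = -3 };

/* One entry per DBMS plug-in: a name plus the entry points it implements. */
typedef struct {
    const char *name;
    int (*bind)(const char *dbname);
    int (*query)(const char *sql);
} sdai_driver;

/* Stand-in drivers: a real one would wrap the native client library. */
static int fake_bind(const char *dbname)
{
    return (dbname != NULL && *dbname != '\0') ? SDAI_OK : SDAI_ERR_BIND;
}

static int fake_query(const char *sql)
{
    return (sql != NULL && *sql != '\0') ? SDAI_OK : SDAI_ERR_QUERY;
}

/* Static registry standing in for drivers loaded from dynamic libraries. */
static const sdai_driver drivers[] = {
    { "postgresql", fake_bind, fake_query },
    { "mysql",      fake_bind, fake_query },
    { "sqlite",     fake_bind, fake_query },
};

/* Look a driver up by DBMS name; returns NULL when none is registered. */
const sdai_driver *sdai_find_driver(const char *name)
{
    if (name == NULL)
        return NULL;
    for (size_t i = 0; i < sizeof drivers / sizeof drivers[0]; i++)
        if (strcmp(drivers[i].name, name) == 0)
            return &drivers[i];
    return NULL;
}

/* Bind and submit a query through whichever driver matches `dbms`. */
int sdai_run(const char *dbms, const char *dbname, const char *sql)
{
    const sdai_driver *d = sdai_find_driver(dbms);
    if (d == NULL)
        return SDAI_ERR_NO_DRIVER;
    int rc = d->bind(dbname);
    if (rc != SDAI_OK)
        return rc;
    return d->query(sql);
}
```

The caller only ever sees `sdai_run` and the shared error codes, which is the point of the uniform interface: adding a DBMS means adding one table entry (or, in a dynamic-library setup, one loadable module) without touching client code.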

12
New drivers: IBM/DB2, Oracle, MS-SQL
[Figure: SDAI drivers diagram. Labels: authorized client built on top of the set of services; SOAP, XML, GSS-API, GSI; AuthUser, AuthDB User; SQ access policy, MQ access policy, configuration policy; production drivers and pre-production drivers; data resources (UnixODBC data source, MySQL, PostgreSQL, Oracle, IBM/DB2, MS SQL Server)]
13
GRelC SDAI Library APIs (I)
  • int grelc_sdai_handle_set_grelc_dbname
    (grelc_sdai_handle handle, const char *value)
  • int grelc_sdai_bind(grelc_sdai_handle handle)
  • int grelc_sdai_unbind(grelc_sdai_handle handle)
  • int grelc_sdai_init()
  • int grelc_sdai_exit()
  • int grelc_sdai_query_submission(grelc_sdai_handle
    handle, char *query)
  • int grelc_sdai_ntuples(grelc_sdai_handle
    handle)
  • int grelc_sdai_nfields(grelc_sdai_handle
    handle)
  • int grelc_sdai_field_name(char field,
    grelc_sdai_handle handle, int i)
  • int grelc_sdai_field_type(grelc_sdai_handle
    handle, int i)

14
GRelC SDAI Library APIs (II)
  • int grelc_sdai_get_value(char value,
    grelc_sdai_handle handle, int i, int j)
  • int grelc_sdai_clear_result(grelc_sdai_handle
    handle)
  • int grelc_sdai_lock(grelc_sdai_handle handle,
    int mode, char *table)
  • int grelc_sdai_unlock(grelc_sdai_handle handle)
  • int grelc_sdai_begin_transaction(grelc_sdai_handle
    handle)
  • int grelc_sdai_commit_transaction(grelc_sdai_handle
    handle)
  • int grelc_sdai_rollback_transaction(grelc_sdai_handle
    handle)
  • int grelc_sdai_get_tables(grelc_sdai_handle
    handle)
  • int grelc_sdai_get_fields(grelc_sdai_handle
    handle, char *table)

15
A simple SDAI Client
SDAI Client
    if ((res = grelc_sdai_bind (handle))) {
        fprintf (stderr, "ERROR! Database bind failed Code %d!\n", res);
        return -1;
    }
    if (grelc_sdai_query_submission (handle, query)) {
        fprintf (stderr, "ERROR! Query submission failed!\n");
        return -2;
    }
    if (strcasestr (query, "SELECT"))
        for (outer = 0; outer < grelc_sdai_ntuples (handle); outer++)

16
GRelC DAS Internal Components
17
The GRelC Library APIs Classification
  • Database access and query services
  • bind
  • unbind
  • query submission
  • Remote manipulation services
  • get_value
  • get_current_tuple
  • Resultset store and retrieving services
  • store_result_disk
  • fetch_stored_recordset
  • User management services
  • add_user
  • remove_user
  • set_user_policy
  • Enterprise Grid management services
  • add_host
  • add_dbms
  • Virtual space management services
  • create_virtual_database

Wide SDK for both C and C++ developers
18
SDK (I)
  • Database access and query services
  • grelc__data_access__bind
  • grelc__data_access__unbind
  • grelc__data_access__query_submission
  • grelc__data_access__multi_query_submission
  • On-line Approach
  • grelc__data_access__ntuples
  • grelc__data_access__nfields
  • grelc__data_access__field_name
  • grelc__data_access__field_type
  • grelc__data_access__get_value
  • grelc__data_access__clear_result
  • grelc__data_access__get_current_tuple
  • Memory Approach
  • grelc__data_access__store_result_memory
  • File Approach
  • grelc__data_access__store_result_disk
  • grelc__data_access__fetch_stored_recordset.

19
SDK (II)
  • User management services
  • grelc__data_access__add_user
  • grelc__data_access__delete_user
  • grelc__data_access__get_users
  • grelc__data_access__set_user_policy
  • grelc__data_access__get_user_policy
  • grelc__data_access__delete_stored_procedure
  • grelc__data_access__alter_stored_procedure_alias
  • grelc__data_access__alter_stored_procedure
  • Enterprise Grid management services
  • grelc__data_access__add_host
  • grelc__data_access__delete_host
  • grelc__data_access__get_hosts
  • grelc__data_access__add_dbms
  • grelc__data_access__delete_dbms
  • grelc__data_access__set_dbms_port
  • grelc__data_access__set_dbms_login

20
SDK (III)
  • Virtual space management services
  • grelc__data_access__get_databases
  • grelc__data_access__create_virtual_database
  • grelc__data_access__drop_virtual_database
  • grelc__data_access__register_database
  • grelc__data_access__create_database
  • grelc__data_access__drop_database
  • grelc__data_access__create_phy_db_and_register
  • grelc__data_access__clear_out_database
  • grelc__data_access__make_dump
  • grelc__data_access__get_login
  • QoS service
  • grelc__data_access__relocate_database

21
SDK (IV)
  • GRelCRecordset
  • grelc_service_open_recordset
  • grelc_service_movefirst
  • grelc_service_movenext
  • grelc_service_move
  • grelc_service_eof
  • grelc_service_eof_group_records
  • grelc_service_get_value
  • grelc_service_get_attribute_name
  • grelc_service_get_attribute_type
  • grelc_service_get_num_attribute
  • grelc_service_get_num_records
  • findfirst
  • findnext
  • grelc_service_free_recordset
  • remove_file_from_disk.

22
GRelC-CppProxy: a C++ Module
  • A C++ module was created to allow easy
    development of new web service clients in
    this language.

This module hides the communication layer with
Web Services
23
CppProxy Class
    class CppProxy {
    public:
        int bind (string grelc_db_name);
        int unbind ();
        int query_submission (string query);
        ....
        int create_database (string grelc_db_name, string identity,
                             string dbms, string host, string istance,
                             int log_type);
        int create_physical_database_and_register (string grelc_db_name,
                             string dbms, string host, string istance,
                             int log_type);
        int drop_database (string grelc_db_name);
        int get_log (int num_lines, string log);
        int get_log_database (string grelc_dbname, int num_lines, string log);
        int get_host_position_info (HostInfoResponse response);
        int get_value (int row, int column, string value);
    private:
        struct soap soap;
        struct gsi_plugin_data data;
        char connection;
        bool connected;
        bool enable_credential;
        string dn;
    };
24
XGRelC: a console for Grid-DB management
  • Functionalities
  • User management
  • Web Service registration
  • Host Management
  • Logging
  • DBMS configuration
  • Database creation
  • Import Database
  • Database configuration
  • Query submission
  • Map deployment

25
XGRelC GUI Snapshots
26
GRelC Queries
  • The latest GRelC release supports the following
    query types:
  • Single Query Online
  • Single Query Memory (chunk management)
  • Single Query File (chunk management)
  • Single Query File ZIP (chunk management)
  • Single Query Prefetch (parallel chunk
    download/processing)
  • Single Query Stream (resultset streaming)
  • Web Single Query XHTML (chunk management /
    paging)
  • CSS v2.0, XHTML v1.0 Strict
  • Results displayed in the following formats
  • Tabular
  • XML
  • HTML
  • RAW

27
Single Query On-Line
[Figure: Client, SQS submission, Get Value/Get Tuple, result submission; GRelC-Data-Access, SQL, recordset, DBMS]
This kind of query is suitable for DML statements
or to retrieve small resultsets
28
Single Query File Approach (Zip)
This kind of query is suitable to retrieve
medium/large resultsets
[Figure: Client, SQS query, data delivery; GRelC Data Access, SQL, recordset, DBMS; GRelCRecordset APIs, GRelCLoad recordset, GRelCRecordset in XML format]
29
Single Query File chunk (Zip)
This kind of query is suitable to retrieve
medium/large resultsets
[Figure: Client, SQS query, data delivery; GRelC Data Access, SQL, recordset, DBMS; GRelCRecordset APIs, GRelCLoad recordset, GRelCRecordset in XML format]
30
GRelC Data Access Clients
31
Single Query HTML
[Figure: Client, HTTP connection, SQS submission, URI result; GRelC-Data-Access, SQL, recordset, DBMS]
32
Single Query HTML (Pre-production)
[Figure: Client, HTTPS connection using X.509 certificates, SQS submission, URI result; GRelC-Data-Access, SQL, recordset, DBMS]
33
Asynchronous Query
  • Asynchronous queries
  • Batch mode
  • Users can define a lifetime for results
    availability on the GRelC DAS
  • decoupling client/server (e.g. WN gLite)
  • New clients (submission, status, abort)
  • Additional thread to manage requests
  • Preliminary internal tests were ok
  • Added within the current release v2.2.0

34
Asynchronous Query
  1. Asynchronous query submission
  2. Request dispatching
  3. Data delivery
  4. Data manipulation
[Figure: Client, SQS query, ID query, Get File, data delivery; GRelC Data Access, SQL, recordset, DBMS; GRelCRecordset APIs, GRelCLoad recordset, GRelCRecordset in XML format]
35
Async Query State diagram
[Figure: state diagram. States: QUEUED, RUNNING, DONE, FAILED, ABORTED, PURGE. Transition labels: query submission, execution, completion, failure, abort, timeout, purge or timeout]
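
The states and transition labels named on this slide can be sketched as a small transition function in C. The transcript does not preserve which arrow connects which pair of states, so the mapping below is an assumption: a submitted query starts in QUEUED, `execution` moves it to RUNNING, `completion` and `failure` end it in DONE or FAILED, `abort` is accepted while queued or running, and `purge` or `timeout` moves any finished query to PURGE. The standalone `timeout` label of the diagram is not modelled separately. All type and function names are hypothetical.

```c
/* States and events taken from the slide labels; names are hypothetical. */
typedef enum { Q_QUEUED, Q_RUNNING, Q_DONE, Q_FAILED, Q_ABORTED, Q_PURGE } q_state;
typedef enum { EV_EXECUTION, EV_COMPLETION, EV_FAILURE,
               EV_ABORT, EV_PURGE, EV_TIMEOUT } q_event;

/*
 * Return the state reached from `s` on event `e`.
 * An event that does not apply in the current state leaves it unchanged.
 * The arrow layout is an assumption, not a transcription of the diagram.
 */
q_state q_next(q_state s, q_event e)
{
    switch (s) {
    case Q_QUEUED:
        if (e == EV_EXECUTION) return Q_RUNNING;
        if (e == EV_ABORT)     return Q_ABORTED;
        break;
    case Q_RUNNING:
        if (e == EV_COMPLETION) return Q_DONE;
        if (e == EV_FAILURE)    return Q_FAILED;
        if (e == EV_ABORT)      return Q_ABORTED;
        break;
    case Q_DONE:
    case Q_FAILED:
    case Q_ABORTED:
        /* Results are dropped on explicit purge or when their lifetime expires. */
        if (e == EV_PURGE || e == EV_TIMEOUT) return Q_PURGE;
        break;
    case Q_PURGE:
        break; /* final state */
    }
    return s;
}
```

Keeping the whole table in one function makes it easy to check that no event can move a purged query back into an active state.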
36
Async Query Functions list
  • Insert Query in the Catalog

grelc_service_insert_async_query -s <server_IP>
-p <port> -d <db_name> -q <query>
  • Check Status

grelc_service_check_status_async_query -s
<server_IP> -p <port> -i <id_query>
  • Abort Query

grelc_service_abort_async_query -s <server_IP> -p
<port> -i <id_query>
  • Purge Query

grelc_service_purge_async_query -s <server_IP> -p
<port> -i <id_query>
  • Get File

grelc_service_purge_async_query -s <server_IP> -p
<port> -i <id_query> -f <destination_file_name>
  • Get List

grelc_service_purge_async_query -s <server_IP> -p
<port> -d <dn> -S <status>
37
Single Query Stream
[Figure: Client, SQS submission, result submission; GRelC-Data-Access, SQL, recordset, DBMS]
This kind of query is suitable to retrieve VERY
LARGE resultsets
38
Testbed
  • SQs Comparison
  • Test DB: bioinformatics relational database
  • Sequential tests
  • SELECT statements

39
Test Performance (III)
40
Test Performance (IV)
41
GRelC gLite
42
GRelC on gLite: Porting
  • Porting of GRelC on gLite was straightforward
  • Porting on gLite is ok both for client and server
    side
  • The middleware works fine both on LCG-2-7-0 and
    current gLite 3.x middleware
  • GRelC DAS also runs on several platforms:
  • Linux
  • Mac OS X
  • FreeBSD
  • Both IA64 and IA32 platforms are supported (the
    GRelC DAS is currently installed at
    SPACI-LECCE-IA64, an EGEE SA1 partner)

43
GRelC on gLite: New Service
  • Straightforward integration within the EGEE farm
    model
  • GRelC DAS provides a fine-grained data management
    service
  • This service can be used both as farm service and
    as VO service depending on the context, the
    database policies/constraints, etc.

[Figure: Extended EGEE Farm Model. Labels: BDII, BDII query, Computing Element, Storage Element, files, data transfer (files), worker nodes (WN)]
44
GRelC on gLite: VOMS
  • We provide global authorization by means of VOMS
    extensions
  • High level of scalability concerning DAPs related
    to VOs
  • Two-level authorization framework: both local
    and global policy management can be provided
    (mixed mode)

Coarse Grained
Fine Grained
45
Two-level authorization
  • Global authorization (through VOMS extensions)
  • Local authorization (by means of the local GRelC
    DAS authorization framework)
  • The two masks obtained from global and local
    authorization are combined to infer the final
    User Privileges Mask (UPM)
  • 3 scenarios
  • global mode, coarse grained approach
  • local mode, fine grained approach
  • combined mode

46
Global Mode
  • User credentials must be obtained through
    voms-proxy-init
  • The UPM is inferred from the available VOMS
    extensions
  • No additional authorization setting is required
    on the GRelC DAS
  • Easy and fast setup procedure
  • It scales well
  • Feasible for a real production grid environment

47
Global Mode
48
Local Mode
  • User credentials must be obtained through
    grid-proxy-init
  • The UPM is drawn out of the GRelC DAS metadata
    catalogue
  • No VOMS extensions are added to the user proxy
  • The setup procedure must be carried out on each
    GRelC DAS
  • Scalability is worse

49
Local Mode
50
Combined Mode
  • User credentials must be obtained through
    voms-proxy-init
  • The UPM is inferred by joining the access policy
    information coming from the VOMS extensions and
    the GRelC DAS metadata catalogue
  • VOMS level (grant or revoke)
  • GRelC DAS level (setting, undefining, unsetting)
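
One way to picture how the global and local masks could be joined into the User Privileges Mask is a pair of bitwise operations. The slides do not give the actual combination rule, so the sketch below is an assumption: VOMS contributes a mask of granted privileges, while the GRelC DAS stores two local masks, one for privileges that are explicitly set and one for privileges that are explicitly unset. Any bit in neither local mask is treated as undefined and inherited from VOMS. The privilege bits and the names `local_policy` and `upm_combine` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical privilege bits. */
enum { PRIV_SELECT = 1, PRIV_INSERT = 2, PRIV_UPDATE = 4, PRIV_DELETE = 8 };

/* Local GRelC DAS settings: bits in neither mask are "undefined". */
typedef struct {
    uint32_t set;    /* privileges forced on locally  */
    uint32_t unset;  /* privileges forced off locally */
} local_policy;

/*
 * Assumed rule: start from the VOMS grants, add locally set bits,
 * then remove locally unset bits (unset wins if a bit is in both masks).
 */
uint32_t upm_combine(uint32_t voms_mask, local_policy lp)
{
    return (voms_mask | lp.set) & ~lp.unset;
}
```

Under this rule the three scenarios fall out of the same function: global mode is an empty `local_policy`, local mode is a zero VOMS mask, and combined mode uses both inputs.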

51
Combined Mode - An Example
52
Roles and Groups on VOMS (I)
Case A (fine grained)
/gilda/grelc/das/host1/grid-db1/Role=grelc-db-insert
53
Roles and Groups on VOMS (II)
Case B (intermediate level)
/gilda/grelc/das/host1/Role=grelc-db-insert
54
Roles and Groups on VOMS (III)
Case C (coarse grained)
/gilda/grelc/das/Role=grelc-db-insert
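
The three cases differ only in how many path components follow `/grelc/das` before the `Role=` part, so the granularity can be read straight off the attribute string. The sketch below assumes the layout `/<vo>/grelc/das[/<host>[/<db>]]/Role=<role>`, generalised from the three examples on these slides. The type and function names are hypothetical, and this is not the parser used by GRelC.

```c
#include <stdio.h>
#include <string.h>

typedef enum { SCOPE_INVALID, SCOPE_ALL_DAS, SCOPE_HOST, SCOPE_DATABASE } fqan_scope;

typedef struct {
    fqan_scope scope;
    char host[64];
    char db[64];
    char role[64];
} fqan_info;

/* Classify an attribute string as coarse (all DAS), host or database level. */
fqan_info fqan_parse(const char *fqan)
{
    static const char marker[] = "/grelc/das";
    fqan_info out;
    memset(&out, 0, sizeof out);
    out.scope = SCOPE_INVALID;

    const char *start = strstr(fqan, marker);
    const char *role = strstr(fqan, "/Role=");
    if (start == NULL || role == NULL || start > role)
        return out;

    /* Role name runs up to the next '/' or the end of the string. */
    if (sscanf(role + strlen("/Role="), "%63[^/]", out.role) != 1)
        return out;

    /* Copy what lies between "/grelc/das" and "/Role=": "", "/host" or "/host/db". */
    const char *p = start + strlen(marker);
    size_t len = (size_t)(role - p);
    char path[160];
    if (len >= sizeof path)
        return out;
    memcpy(path, p, len);
    path[len] = '\0';

    if (len == 0) {
        out.scope = SCOPE_ALL_DAS;          /* case C: coarse grained */
        return out;
    }
    char extra;
    int n = sscanf(path, "/%63[^/]/%63[^/]/%c", out.host, out.db, &extra);
    if (n == 1)
        out.scope = SCOPE_HOST;             /* case B: intermediate level */
    else if (n == 2)
        out.scope = SCOPE_DATABASE;         /* case A: fine grained */
    return out;
}
```

For example, `fqan_parse("/gilda/grelc/das/host1/Role=grelc-db-insert")` yields host-level scope with `host` set to `host1` and `role` set to `grelc-db-insert`.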
55
GRelC on gLite: BDII
  • GLUE schema extension providing information about
    VOs and databases (we plan to interact with OGF
    GLUE-WG)
  • The local admin can set the Information Provider
    Level parameter:
  • Min 0, to publish just basic info (only the
    contact string)
  • Max 7, for all info (contact string, VOs, DBs,
    tables, fields, etc.)

Information System Extensions
Database specific Information
56
GRelC on gLite: Porting on SLC4.x
  • Porting on SLC4.x is an ongoing activity
  • Preliminary results are very good
  • Porting will be completed before EGEE Conference
    in Budapest
  • A release based on SLC4.x will be available on
    the GRelC website in September
  • Current tests cover both IA32 and IA64
    (Itanium2 processors) platforms
  • This activity is part of the SPACI-LECCE-IA64 SA1
    activity within the EGEE Project

57
INFN GRID Deployment
  • Involved Sites
  • INAF Trieste (IA32)
  • INFN Bari (IA32)
  • INFN Catania (IA32)
  • INFN Padova (IA32)
  • SPACI LECCE (IA32, IA64)
  • Testing Activities
  • Sequential tests
  • Concurrent tests
  • Bug reports
  • Bug Fixing
  • Optimization

[Figure: map of DAS servers and DAS clients at INAF Trieste, INFN Padova, INFN Bari, SPACI Lecce and INFN Catania]
58
SEPAC Grid Deployment
59
Important numbers and technologies
  • Some important numbers about the GRelC project
  • 1 patent
  • About 18 international works
  • More than 50,000 lines of code
  • 91 C++ classes
  • 28 QT GUI Windows
  • 103 services
  • Wide documentation
  • Technologies
  • GSI
  • gSOAP
  • GSI-plugin
  • QT Library
  • SDAI Library

60
GRelC WebSite
  • Main sections
  • Download
  • (rpms available)
  • News
  • Publications
  • Events
  • Deployment
  • Documentation
  • Components
  • ..

GRelC website URL: http://grelc.unile.it/
Mailing list: grelc-user_at_sara.unile.it
61
User tutorial: GILDA t-Infrastructure
  • GRelC DAS user tutorial on the GILDA Grid CT
    wiki website
  • Info about:
  • Log in to the grid
  • Query submission
  • For any information about the GILDA
    t-Infrastructure, please contact
    roberto.barbera_at_ct.infn.it or
    grid-prod_at_ct.infn.it
  • GRelC DAS tutorial link:
    https://grid.ct.infn.it/twiki/bin/view/GILDA/GRelCDataAccessService

Special thanks to the GILDA Staff for their
support
62
Conclusions
  • GRelC DAS provides support in Grid for a wide
    range of DBMSs.
  • It is currently tested on several grid
    environments (SPACI, SEPAC, GILDA, INFNGRID)
  • A wide SDK is available for developers
  • CLI and XGRelC graphical interface to ease Grid-DB
    management
  • gLite compliant (porting on gLite 3.x and
    integration with VOMS framework, BDII, etc.)
  • Support for several platforms (IA32 and IA64)
  • The software is currently a candidate for the
    EGEE Respect Program

63
For any information
  • Supervisor: Prof. Giovanni Aloisio
    (giovanni.aloisio_at_unile.it)
  • Project P.I.: Ph.D. Sandro Fiore
    (sandro.fiore_at_unile.it)
  • Team members:
  • Ph.D. Massimo Cafaro
  • MSc Alessandro Negro
  • MSc Salvatore Vadacca
  • GRelC website: http://grelc.unile.it
  • Mailing list: grelc-user_at_sara.unile.it