Title: Infiniband enables scalable Real Application Clusters
1. InfiniBand enables scalable Real Application Clusters: Update, Spring 2008
- Sumanta Chatterjee, Oracle
- Richard Frank, Oracle
2. What is an Oracle Real Application Clusters (RAC) Database?
[Figure: Instances 1-4, each with its own SGA (SGA 1-4), one database, a private network, and a public network to grid computing nodes]
- Multiple instances
- One database
- The SGA database memory of all instances is aggregated and appears to applications as one single database through Cache Fusion.
3. RAC for SAP Benchmark Results: Scalability
4. Advantages of RAC
- Performance
  - Increase the performance of a RAC database by adding servers to the cluster.
- Fault Tolerance
  - A RAC database is made up of multiple instances. Performance may degrade, but the loss of an instance does not bring down the entire database.
- Scalability
  - Scale a RAC database by adding instances to the cluster database.
5. Shifting Trend in Deployment Paradigm
- Past: Monolithic SMP
  - Application and database on the same SMP server
- Present: Mixed Configuration
  - Application tier on commodity application servers
  - Database tier on SMP database servers
- Future: Grid Computing
  - Application tier and database tier on all commodity servers
6. Commodity Cluster Requires Unified Fabric for Efficient, Scalable IPC and Storage I/O
- RDS / IB shows significant real-world application performance gains
  - 50% less CPU than IP over IB, UDP
  - ½ the latency of UDP (no user-mode acks)
  - 50% faster cache-to-cache Oracle block throughput (ping)
  - Scales well beyond GE (600 Mbytes; ran out of CPU)
- Minimal Oracle code change
- Supports fail-over across and within HCAs
- Certified for 16 nodes (64 processors)
- GA in 10g R2 (10.2.0.3)
7. Current Status
- Several TPC-H benchmarks with RDS and SRP
- Large-scale deployments at several Oracle customer sites
- Many pilot projects are in progress
- Folks waiting for RDS on OFED
- 16-node Oracle 11g RAC certification of OFED 1.2.5.5 submitted for audit
- Voltaire and QLogic have completed platform certification; audit in progress
- Certification on Unix in progress
8. RDS - Communication model
- Works well with existing IPC clients
  - Parallel Query communications
  - Buffer cache fusion
- Working on providing support for additional clients with RDMA plus atomic operations
  - We expect significant performance improvements with RDMA
  - With atomics, even greater scalability and performance can be gained
- Incentives for simple NICs to add RDMA and atomics
9. RDS - evolution
- RDS v2 with b-copy send, recv in OFED 1.2.5.5
- New features in RDS v3, available in OFED 1.3
  - Supports RDMA read and RDMA write
  - Introduces cmsgs for asynchronous operation submit and completion notifications
  - Large data transfers presently up to 1 MB; will go up to 8 MB
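The cmsg mechanism mentioned above is the standard sockets way of attaching control (ancillary) messages to a send or receive call. The sketch below is not RDS code: it assumes a plain Unix domain socket pair and the well-known SCM_RIGHTS control message as a stand-in, only to show how a control message travels next to the payload and is decoded on the receiving side. The helper names send_with_cmsg, recv_with_cmsg and demo are hypothetical.

```python
import array
import os
import socket


def send_with_cmsg(sock, payload, fds):
    """Send payload together with one control message carrying file descriptors."""
    # Stand-in control message: SCM_RIGHTS, not an RDS-specific cmsg type.
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                  array.array("i", fds).tobytes())]
    return sock.sendmsg([payload], ancillary)


def recv_with_cmsg(sock, bufsize, max_fds):
    """Receive payload and decode any SCM_RIGHTS control messages sent with it."""
    fds = array.array("i")
    data, ancdata, _flags, _addr = sock.recvmsg(
        bufsize, socket.CMSG_LEN(max_fds * fds.itemsize))
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore a truncated trailing integer, if any.
            usable = len(cdata) - (len(cdata) % fds.itemsize)
            fds.frombytes(cdata[:usable])
    return data, list(fds)


def demo():
    """Pass a pipe's write end as a control message, then signal through it."""
    sender, receiver = socket.socketpair()
    read_end, write_end = os.pipe()
    try:
        send_with_cmsg(sender, b"block 42", [write_end])
        data, fds = recv_with_cmsg(receiver, 1024, max_fds=1)
        # The receiver uses the descriptor it got out of band to notify the sender.
        os.write(fds[0], b"done")
        ack = os.read(read_end, 4)
        os.close(fds[0])
        return data, ack
    finally:
        os.close(read_end)
        os.close(write_end)
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(demo())
```

In RDS v3 the control messages would carry operation submissions and completion notifications rather than file descriptors, but the calling pattern (payload plus a list of typed control messages) is the same idea.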
10. RDS v4
- Plans for RDS v4
  - Masked fetch_and_add
  - Masked compare_and_swap
  - Zero copy completions via cmsg
- RDS v4 will also be more portable: we will work to abstract the generic RDS operations away from O/S primitive support and network operations. A platform that provides the O/S and network primitives library should be able to take all the generic RDS code as is.
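The masked atomics in the plan above can be illustrated on a single 64-bit word. This is a minimal software sketch, assuming the usual InfiniBand-style semantics: a masked compare_and_swap compares and replaces only the bits selected by its masks, and a masked fetch_and_add treats each set bit in a boundary mask as the most significant bit of a field, so carries do not cross from one field into the next. The function and parameter names are illustrative, not an RDS API.

```python
MASK64 = (1 << 64) - 1


def masked_compare_and_swap(word, compare, compare_mask, swap, swap_mask):
    """Return (old, new): swap the swap_mask bits if the compare_mask bits match."""
    if (word & compare_mask) == (compare & compare_mask):
        new = (word & ~swap_mask & MASK64) | (swap & swap_mask)
    else:
        new = word
    return word, new


def masked_fetch_and_add(word, add, boundary_mask):
    """Return (old, new): add field by field, dropping carries at field boundaries."""
    result = 0
    carry = 0
    for bit in range(64):
        total = ((word >> bit) & 1) + ((add >> bit) & 1) + carry
        result |= (total & 1) << bit
        carry = total >> 1
        if (boundary_mask >> bit) & 1:
            # This bit is the top of a field: do not carry into the next field.
            carry = 0
    return word, result


if __name__ == "__main__":
    # Two 32-bit counters packed into one word; the low counter is about to wrap.
    packed = 0x00000001_FFFFFFFF
    boundaries = (1 << 31) | (1 << 63)
    old, new = masked_fetch_and_add(packed, 1, boundaries)
    print(hex(old), hex(new))

    # Change only the low byte (a state field), leaving the rest untouched.
    old, new = masked_compare_and_swap(0xAB01, 0x01, 0xFF, 0x02, 0xFF)
    print(hex(old), hex(new))
```

Packing several independent counters or a lock byte plus payload into one word is the typical reason to want the masked variants: one remote atomic can update a single field without disturbing its neighbours.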
11. RDS Compatibility
- Linux request to all IB vendors
  - Please ensure compatibility across HCAs and switches
- Ideally, the RDS driver in OFED is ported to all platforms
- Advantages include
  - One code body, wider testing
  - Interoperability across platforms
- Towards this end, we plan to
  - Abstract the RDS protocol driver generically (OS, RDMA)
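The plan to abstract the protocol driver from the OS and RDMA layers can be pictured as a small interface split. The sketch below is purely illustrative and is not taken from the RDS sources: PlatformPrimitives, LoopbackPrimitives and GenericDatagramLayer are hypothetical names, and the "generic" logic is reduced to tagging each datagram with a sequence number. The point is only that the generic layer never touches platform code directly, so a port would supply a new primitives class and reuse the rest as is.

```python
from abc import ABC, abstractmethod
from collections import deque


class PlatformPrimitives(ABC):
    """Hypothetical per-platform layer: the only part a port would rewrite."""

    @abstractmethod
    def post_send(self, dest, payload):
        """Hand a raw payload to the fabric for delivery to dest."""

    @abstractmethod
    def poll_receive(self, local):
        """Return the next raw payload addressed to local, or None."""


class LoopbackPrimitives(PlatformPrimitives):
    """In-memory stand-in for a real OS and network binding."""

    def __init__(self):
        self._queues = {}

    def post_send(self, dest, payload):
        self._queues.setdefault(dest, deque()).append(payload)

    def poll_receive(self, local):
        queue = self._queues.get(local)
        return queue.popleft() if queue else None


class GenericDatagramLayer:
    """Platform-independent logic; here it only adds a 4-byte sequence number."""

    def __init__(self, primitives, local):
        self._primitives = primitives
        self._local = local
        self._next_seq = 0

    def send(self, dest, data):
        header = self._next_seq.to_bytes(4, "big")
        self._primitives.post_send(dest, header + data)
        self._next_seq += 1

    def receive(self):
        raw = self._primitives.poll_receive(self._local)
        if raw is None:
            return None
        return int.from_bytes(raw[:4], "big"), raw[4:]


if __name__ == "__main__":
    fabric = LoopbackPrimitives()
    node_a = GenericDatagramLayer(fabric, "node-a")
    node_b = GenericDatagramLayer(fabric, "node-b")
    node_a.send("node-b", b"hello")
    print(node_b.receive())
```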