Transcript and Presenter's Notes

Title: Global Peer-to-Peer Replication Utilizing Sybase Replication Server


1
DM208 Global Peer-to-Peer Data Replication
Utilizing Sybase Replication Server
Mich Talebzadeh, Consultant, Peridale
Systems, mitch.talebzadeh@db.com
2
Aimed At Audience with
  • Full Familiarity with Sybase ASE
  • Working knowledge of Sybase's Replication Server
    Products
  • Knowledge of UNIX and shell scripting useful.

3
Topics Covered
  • Reasons for Peer-to-Peer Replication
  • Project Issues to be Addressed
  • Planning for a Replication System
  • Issues to be Addressed
  • Peer-to-Peer Replication Implementation

4
Topics Covered
  • Tuning Replication Server for Better Performance
  • Monitoring the Latency and Delivery of Data
  • Use Replication Server to Measure the Volume of
    Replication Traffic on a Daily Basis

5
Reasons for Peer-to-Peer Replication
  • Need to Provide Business Solutions to a Community
    Scattered Around the Globe
  • Limitations of Traditional Techniques Using a
    Central Database Repository in a Major Location
    with Applications Connected Remotely Include
  • Limited Application Performance Because of the
    Geographical Distance
  • Degradation in Network Response When the Traffic
    over the WAN Gets Heavy. For Example, Remote DSS
    Versus Local Transaction Inputs

6
Reasons for Peer-to-Peer Replication
  • The Data Server Becomes a Bottleneck As a Larger
    Business Community Contends for Access
  • Data Becomes Unavailable When There is a Network
    Failure.
  • Impact on Maintenance Tasks When Remote Users
    Logged In.

7
Reasons for Peer-to-Peer Replication
  • A typical third-party application may not access
    data in the most efficient way. For example, if
    the Application Makes a Large Number of Discrete
    Queries to the Database, Connection Latency
    between the Application and the Data Server could
    Cause Start-Up Delays
  • If the Application Crashes, the System Becomes
    Unusable to Traders and Has an Unacceptable
    Business Impact.

8
Project Issues to Be Addressed
  • The Package Eligibility for Replication
  • The Availability of Required Bandwidth
  • The Technical Knowledge of the Package or
    Application to Be Replicated
  • Local DBA Support for Sybase Replication

9
Planning For A Replication System
  • Start A Pilot Project with One Way Replication
    Between Two Sites (Not to be confused with Sybase
    Warm Standby)
  • Establish the volume of data growth and Deltas on
    a Daily Basis on the Replicated Database
  • Establish the Bandwidth between the Two Replicate
    Sites. May have to Improve the Bandwidth
  • Identify External Feeders to the Database

10
Planning For A Replication System
  • The Best Option is to set up the Pilot on the
    Local and Remote Sites, otherwise use two
    different Servers Locally linked via a WAN
    Simulator
  • Decide on What to Replicate
  • Replicate Tables and Turn off Triggers for
    Replicated Transactions
  • Do Not Replicate Tables and Let Triggers Insert
    the Records
  • Replicate Stored Procedures

11
Issues to Be Addressed
  • If Using Table Replication
  • Which Tables are you Replicating
  • Does the Table Being Replicated Meet the Criteria
    for Replication
  • Does the Table Have a Primary Key
  • How is the Unique ID on the Table Generated
  • Determine Rows and Columns to Be Replicated

12
Issues To Be Addressed
  • Time Zones Dependency and Handling Time Stamps.
    Discuss and Agree with Business on How to Handle
    These. Remember Remote Locations Mean Different
    Time Zones. Serious Business Implication
  • Managing Conflicts in Peer-to-Peer Replication
  • Managing Inserts
  • Conflicting Updates
  • Let Us Take a Look at These Site Specific
    Solutions.

13
Issues To Be Addressed
  • Managing Inserts
  • A typical local table will include inserts from
    the local application in addition to inserts
    delivered via replicated transactions. In a
    peer-to-peer setup the flow between the tables is
    bi-directional.
  • Designers tend to use unique IDs to uniquely
    identify records in a table. Primary keys or
    unique clustered indexes are usually built on the
    unique ID.

14
Issues To Be Addressed
  • The unique ID for a given table tends to be a
    monotonically incrementing number, a 32-bit
    integer, stored in a table, the so-called
    table_next_row_id, and retrieved from this table.
  • Inter-site conflicts occur when rows are inserted
    in a local table and distributed to the remote
    table. If the remote table already has a record
    with the same unique ID, the replicated insert
    will be rejected, and the data at the remote
    table will be inconsistent with the local table.
  • Possible solutions are

15
Issues To Be Addressed
  • Add a location key to the tables if not already
    there. Include the location key in the primary
    key for replicated tables. This is useful if the
    application is at the design stage. As
    replication implementation is normally an
    afterthought, this approach may not be possible
    without a substantial change to the database
    schema and code. Beware of Maintenance and
    Retrofitting.
  • Localise the table_next_row_id and DO NOT
    replicate it. Allocate ranges for the next_row_id
    column for each location. A 32-bit integer
    provides the ability to store up to 2 billion
    unique values. For the lifetime of a typical
    application this range is far more than enough to
    cater for all sites.

16
Issues To Be Addressed
  • For example, you can allocate the following
    ranges (see the sketch below)
  • Location     Reserved range for next_row_id column
  • London       1 to 100,000,000
  • Hong Kong    > 100,000,000 and < 150,000,000
  • Tokyo        > 150,000,000 and < 200,000,000
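A minimal sketch of range-based ID allocation follows. The table
table_next_row_id comes from the deck; the procedure name
get_next_row_id_sp and its exact shape are illustrative assumptions.

    /* Illustrative sketch: allocate the next unique ID from the
       localised, non-replicated table_next_row_id table.
       Procedure and column names are assumptions. */
    create procedure get_next_row_id_sp
        @table_name varchar(30),
        @next_id    int output
    as
    begin transaction
        /* the exclusive lock taken by the update serialises
           concurrent callers until commit */
        update table_next_row_id
        set    next_row_id = next_row_id + 1
        where  table_name = @table_name
        select @next_id = next_row_id
        from   table_next_row_id
        where  table_name = @table_name
    commit transaction
    go

Seed next_row_id per site from the reserved ranges above (London 1,
Hong Kong 100,000,001, Tokyo 150,000,001) and do not mark
table_next_row_id for replication.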

17
Issues To Be Addressed
  • Conflicting Updates
  • The best way to handle conflicting updates from
    different sites is to construct the application
    and the environment so that conflicts cannot
    happen. However, in the majority of cases one
    needs to rely on business rules to reduce or
    eliminate the conflicts. These rules are
    application specific. For example, for a trading
    system they could be

18
Issues To Be Addressed
  • When simultaneous new trades are entered against
    the same holding at different sites, problems
    will occur on calculated fields, such as quantity
    and P&L. Furthermore, there are no signals to
    warn the users when the problem occurs. The
    adopted solution is to recalculate these fields
    nightly so that when the portfolio is loaded the
    following day, it will show the correct figures.
  • However, we have not yet encountered such a
    problem.

19
Issues To Be Addressed
  • A simultaneous update on the same order could
    happen due to a mistake. The business rule is
    that whoever created the order should be the one
    who updates it. If a conflicting update does
    happen, the quantity (P&L) data will be out of
    sync. Again, there will be no warning message to
    indicate this and it will be very difficult for
    IT support to detect. The traders/users will
    inform Application Support that the P&L or
    quantity is wrong. The Application Support group
    then needs to check the historical order and
    amend it appropriately. Once this is carried out,
    the correct details will be replicated to the
    other sites and the databases will be in sync
    again.

20
Issues To Be Addressed
  • Make Sure that the IT and Business Managers plus
    Application Support are Aware of the Fact that
    Technology is Not a Substitute for Business Rules
    and Ultimately they are Responsible For
    Consistency of the Data.

21
Peer-to-Peer Replication Implementation
  • For Simplicity We Illustrate the Peer-to-Peer
    Implementation Among Three Sites, Although This
    Model Can Be Expanded Further to Include
    Additional Sites
  • We Have the Following

22
Peer-to-Peer Replication Implementation
23
Peer-to-Peer Replication Setup
24
Peer-to-Peer Replication Implementation
  • Basics
  • Create Replication Servers on all Three sites.
    Each Replication Server Will Handle the Local
    RSSD and the Local Database to Be Replicated
  • Choose the ID Server for Replication Where the
    Replication and Support Knowledge is Highest
  • Ideally All Sites Should Have Sybase sa and
    root Access to all Servers. As a Minimum the ID
    Server Site Should Have Access to All These

25
Peer-to-Peer Replication Implementation
  • Create the Diagnostics Run Files for Replication
    Servers so the DBA May Observe Each Replicated
    Transaction Performed by the Server (Invaluable
    in Identifying Problems).
  • This is Achieved by Replacing the repserver
    Binary with the repserver.diag Binary in the Run
    File and Adding the Following Entry to the
    Replication Server .cfg File (see the sketch
    below)
  • trace=DSI,DSI_CMD_DUMP
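A sketch of the two changes, assuming a London replication server
installed under /sybase; all paths and file names are illustrative:

    # RUN_lon_rep_server - swap the binary for the diagnostic build
    /sybase/REP-12_5/bin/repserver.diag -Slon_rep_server \
        -C/sybase/REP-12_5/install/lon_rep_server.cfg \
        -E/sybase/REP-12_5/install/lon_rep_server.log

    # lon_rep_server.cfg - append the DSI trace flag
    trace=DSI,DSI_CMD_DUMP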

26
Peer-to-Peer Replication Implementation
  • Create the error class rs_sqlserver_error_class
    (the default SQL Server error class) in the ID
    replication server only.
  • This will Handle SQL Server Errors in
    Replication Server. The Default Error Action for
    all Errors Returned by SQL Server is to Stop
    Replication! You Can Assign Actions to the
    Created Error Class (see the example below). Very
    Important!
  • Error Actions are Stored in the Table
    rs_erroractions. Use rs_helperror error_no, v to
    Get Information About an Error.
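For example, to stop duplicate-key inserts from halting replication
(error 2601 is an illustrative choice, not a recommendation from the
deck):

    create error class rs_sqlserver_error_class
    go
    /* ignore SQL Server error 2601 (duplicate key) instead of
       stopping replication; pick error numbers to suit your site */
    assign action ignore for rs_sqlserver_error_class to 2601
    go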

27
Peer-to-Peer Replication Implementation
  • Turn off trigger settings For Each Replicated
    Database in the Repserver Controlling that
    Database. Use the configure connection command
    with the dsi_keep_triggers option set to off. For
    example, in lon_rep_server run the following
    command
  • configure connection to london_sql_server.db set
    dsi_keep_triggers to 'off'
  • Create Interfaces Files with All SQL Server and
    Repserver Entries for All Sites (see the sketch
    below). Ensure that the Application Servers also
    have All the Relevant Information for All SQL
    Servers
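A sketch of a UNIX interfaces file entry; host names and port
numbers are illustrative:

    lon_sql_server
            master tcp ether lonhost1 5000
            query tcp ether lonhost1 5000

    lon_rep_server
            master tcp ether lonhost1 5100
            query tcp ether lonhost1 5100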

28
Peer-to-Peer Replication Implementation
  • Creating Direct Routes
  • In our triangle diagram, we need to create routes
    in order for our three replication servers to
    send messages to destination replication servers.
  • A route is a one-way message stream that sends
    requests from one replication server to another,
    carrying data modification commands, replicated
    functions and stored procedures. In this design
    the routes are created as follows

29
Peer-to-Peer Replication Implementation
  • Route type   Source           Destination
  • Direct       lon_rep_server   hk_rep_server
  • Direct       lon_rep_server   tyo_rep_server
  • Direct       hk_rep_server    lon_rep_server
  • Direct       hk_rep_server    tyo_rep_server
  • Direct       tyo_rep_server   lon_rep_server
  • Direct       tyo_rep_server   hk_rep_server

30
Peer-to-Peer Replication Implementation
  • For example in lon_rep_server run the following
    command to create route to hk_rep_server and
    tyo_rep_server respectively
  • create route to hk_rep_server set username
    hk_rep_server_rsi set password
    hk_rep_server_rsi_ps
  • create route to tyo_rep_server set username
    tyo_rep_server_rsi set password
    tyo_rep_server_rsi_ps

31
Peer-to-Peer Replication Implementation
  • Where hk_rep_server_rsi is the RSI username
    already created by rs_init when you created
    hk_rep_server, and hk_rep_server_rsi_ps is the
    default password for that user. Use rs_helproute
    in any RSSD to check the status of the routes
    created (see the example below).
  • Keep all passwords below 30 characters. If you
    want to change the password for the RSI user, do
    so when creating the repserver. It makes life
    easier later.
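For example, from the RSSD of the London replication server (the
RSSD database name follows the usual rs_init convention and is
illustrative here):

    use lon_rep_server_RSSD
    go
    rs_helproute
    go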

32
Peer-to-Peer Replication Implementation
  • Loading the Database to Be Replicated
  • In order to perform the initial load of the
    database to be replicated, you may consider the
    following steps
  • Decide where you are going to load your initial
    database. In our case we chose London as the
    starting point. Otherwise you may have to merge
    databases, which requires a good migration plan
  • Run dbcc checks on the database and perform
    update statistics in London.
  • Review all the primary keys for tables to be
    replicated.

33
Peer-to-Peer Replication Implementation
  • Turn off all replication flags in the user
    tables. Use sp_setreptable table_name, false
  • Do dbcc settrunc(ltm, ignore) on the database
  • Dump the transaction log with truncate_only
  • Dump the database to the dump directory.
  • FTP the dump file to the remote servers. Consider
    compressing the file first.
  • On the BCP site load the database (the command
    sequence is sketched below)
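A sketch of the pre-dump sequence in London; trading_db, the table
name and the dump path are illustrative:

    use trading_db
    go
    /* repeat for each replicated table */
    sp_setreptable 'positions', 'false'
    go
    /* clear the secondary truncation point */
    dbcc settrunc(ltm, ignore)
    go
    dump transaction trading_db with truncate_only
    go
    dump database trading_db to '/dumps/trading_db.dmp'
    go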

34
Peer-to-Peer Replication Implementation
  • Load the database from the dump file in remote
    locations
  • Localise the so-called local tables. For example,
    if you have table_next_row_id, set the
    next_row_id column to the appropriate starting
    value for the location
  • Adding databases is quite straightforward (see
    the sketch below). For example
  • Add the London Production database to the
    replication system using lon_rep_server
  • Add the Hong Kong database to the replication
    system using hk_rep_server
  • Add the Tokyo database to the replication system
    using tyo_rep_server
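rs_init normally performs this step; the equivalent RCL, with an
illustrative maintenance user, looks like this:

    create connection to london_sql_server.db
    set error class rs_sqlserver_error_class
    set function string class rs_sqlserver_function_class
    set username db_maint
    set password db_maint_ps
    with log transfer on
    go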

35
Peer-to-Peer Replication Implementation
  • Creating Replication and Subscription Definitions
  • All the Information you need is in the database
    to be replicated
  • Use the Local Repserver and SQL Server to Create
    Replication Definitions for the Database to be
    Replicated (see the example below)
  • You will need to extract information from the
    system tables sysobjects, syscolumns, systypes
    and syskeys, provided that you have defined
    primary keys for your tables!
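A sketch of one generated replication definition; the positions
table and its columns are illustrative:

    create replication definition positions_rd
    with primary at london_sql_server.db
    with all tables named 'positions'
    (trade_id int, location char(3), quantity float,
     last_updated datetime)
    primary key (trade_id, location)
    go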

36
Peer-to-Peer Replication Implementation
  • I have prepared shell scripts which will
    automatically generate replication definitions
    for all tables. This script takes the format
  • genrepdef.ksh -R <REP_SERVER_NAME> -S
    <SQL_SERVER_NAME> -D <DATABASE_NAME>
  • where the parameters refer to the repserver
    controlling the database, the name of the SQL
    server and the database name respectively.
  • In the same way, you can create subscription
    definitions using the script
  • gensubdef.ksh -R <REP_SERVER_NAME> -S
    <SUBSCRIBER_SERVER> -D <DATABASE_NAME>

37
Peer-to-Peer Replication Implementation
  • Where the parameters refer to the repserver in
    which subscriptions are defined, the name of the
    SQL server which subscribes to the replication
    definitions and the database name respectively.
  • Use activate.ksh, validate.ksh and checksub.ksh
    to complete the subscription process (the
    underlying commands are sketched below).
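A sketch of the RCL the scripts issue for one table, following the
bulk-materialisation route; subscription and server names are
illustrative:

    define subscription positions_sub for positions_rd
    with replicate at hk_sql_server.db
    go
    activate subscription positions_sub for positions_rd
    with replicate at hk_sql_server.db
    go
    /* load the data at the replicate (dump/load or bcp), then */
    validate subscription positions_sub for positions_rd
    with replicate at hk_sql_server.db
    go
    check subscription positions_sub for positions_rd
    with replicate at hk_sql_server.db
    go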

38
Peer-to-Peer Replication Implementation
  • In summary for replications we have

39
Peer-to-Peer Replication Implementation
  • Likewise for subscriptions we have

40
Peer-to-Peer Replication Implementation
  • At the end of subscription definitions you should
    have two subscriptions for each replication
    definition. In other words, running sp_helpsub
    for each table should show 3 x 2 = 6 subscription
    lines. This should be visible in any RSSD
    database.
  • Use of Function Strings to Apply Local Timestamps
  • Replication Server converts functions to commands
    for destination data servers, and submits them to
    these data servers.

41
Peer-to-Peer Replication Implementation
  • For example, a new row inserted in the source
    table causes Replication Server to distribute an
    rs_insert function specific to that table to the
    subscriber databases.
  • A possible solution for applying local timestamps
    at the replicate database would be to modify
    rs_insert for a given source table to invoke an
    RPC at the destination database. The RPC in turn
    inserts a local timestamp into the subscribed
    table, as shown below

42
Peer-to-Peer Replication Implementation
  • alter function string tran_history_db_rd.rs_insert
    for rs_sqlserver_function_class
    output rpc
    'execute ins_tran_history_sp
        @time_stamp = ?time_stamp!new?,
        @action = ?action!new?,
        @table_name = ?table_name!new?,
        @row_id = ?row_id!new?,
        @host_name = ?host_name!new?,
        @user_name = ?user_name!new?,
        @pid = ?pid!new?'
  • go

43
Peer-to-Peer Replication Implementation
  • create procedure ins_tran_history_sp
    ( @time_stamp datetime null,
      @action int null,
      @table_name varchar(24) null,
      @row_id int null,
      @host_name varchar(24) null,
      @user_name varchar(24) null,
      @pid numeric(35) null )
  • as
  • begin transaction
  • insert into tran_history
    ( timestamp, action, table_name, row_id,
      host_name, user_name, pid )

44
Peer-to-Peer Replication Implementation
  • values
    ( getdate(), @action, @table_name, @row_id,
      @host_name, @user_name, @pid )
  • commit transaction
  • go
  • Note that getdate() supplies the local
    (replicate-site) timestamp, which is the point of
    the technique; the origin @time_stamp parameter
    is deliberately not used.

45
Tuning Replication Server for Better Performance
  • There are some configuration parameters that can
    be altered in order to get better performance
    from the Replication Servers
  • init_sqm_write_delay - the stable queue manager
    waits at least init_sqm_write_delay milliseconds
    for a block to fill before it writes the block to
    the correct queue on the stable device (default
    1000). Try decreasing this parameter.
  • init_sqm_max_write_delay - a flush to the queue
    is guaranteed to happen after waiting for
    init_sqm_max_write_delay milliseconds, if the DSI
    or RSI thread reading the queue is unable to
    connect to the target or has been suspended
    (default 10000). Decrease this parameter if
    required.

46
Tuning Replication Server for Better Performance
  • sqt_max_cache_size - increase this value if there
    are many open transactions and/or large
    transactions. Memory for sqt_max_cache_size is
    taken from the global memory pool (default is
    131072 bytes)
  • batch_sz - the larger the batch_sz, the less
    often the truncation point is updated (default
    1000 commands). Try increasing this value (see
    the example below)
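Server-level parameters are changed with configure replication
server; the values here are illustrative starting points, not
recommendations from the deck:

    configure replication server
    set init_sqm_write_delay to '500'
    go
    configure replication server
    set sqt_max_cache_size to '2097152'
    go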

47
Monitoring the Latency and Delivery of Data
  • It is common practice for DBAs to set up a DBA
    specific table in the replicated database and to
    check the latency and health of the replication
    system by updating data in this table. In its
    simplest form one can insert or update records in
    this table and see if the data is replicated to
    the other sites (a minimal sketch follows). The
    time taken for the data to reach the remote site
    gives an indication of latency.
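A minimal heartbeat sketch, run from cron at each site; the
dba_db..rep_heartbeat table is hypothetical:

    insert into dba_db..rep_heartbeat (origin_server, sent_at)
    values (@@servername, getdate())
    go

At the remote site, latency is roughly getdate() minus the
replicated sent_at value, adjusted for clock differences between
the servers.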

48
Monitoring the Latency and Delivery of Data
  • Although the above method can be used as a simple
    monitoring tool, it suffers from the following
    drawbacks
  • It is not an indicator of whether the business
    transactions arrive at remote sites in a timely
    manner
  • The DBA table is normally used by one
    application, for example a cron job issuing a
    simple transaction. User applications create a
    larger number of transactions, and the data
    delivery is impacted by the following

49
Monitoring the Latency and Delivery of Data
  • Concurrency
  • Size of the user tables
  • Locking mechanism employed by the application
  • It should be noted here that the maintenance user
    trying to deliver the replicated data could be
    blocked by local users. If the statistics on the
    user tables are not current then the replicated
    data may take a longer time to be delivered
    resulting in remote users being blocked waiting
    for locks to be released.

50
Monitoring the Latency and Delivery of Data
  • It is important to get a more realistic estimate
    of replication delivery. A possible solution
    would be to look at the entries in an audit or
    transaction history table in the replicate
    database and check the delivery timestamp. By
    comparing the timestamps for records delivered,
    and adjusting for clock differences between the
    servers, the latency can be estimated (see the
    sketch below).
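A sketch, assuming the history table stores both the origin
timestamp (time_stamp, carried by the replicated RPC) and the local
delivery timestamp (timestamp, set by getdate() in
ins_tran_history_sp); the column names are illustrative:

    select avg(datediff(ss, time_stamp, timestamp))
           as avg_latency_secs
    from   tran_history
    where  timestamp >= dateadd(hh, -1, getdate())
    go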

51
Use Replication Server to Measure the Volume of
Replication Traffic
  • Use the Replication Command admin who, sqm
  • Take the Column First Seg.Block. This is the
    first undeleted segment and block number in the
    queue. As we go through the day the Seg.Block
    value increases. When the Replication Server is
    shut down these values are reset. Take readings
    at different time intervals; the difference
    between the values read gives an indication of
    the data passed to a replication queue over that
    time.
  • We then need to convert blocks read to MB

52
Queue   Server          Current      Previous     Start of Day  Received/Sent
                        Read(S.Blk)  Read(S.Blk)  Read(S.Blk)   today/MB
105:0   hk_server.db    40196.52     40196.35     40171.60      24.87
105:1   hk_server.db    8487.27      8487.24      8472.60       14.48
116:0   bcp_server.db   34872.64     34872.47     34844.20      28.69
116:1   bcp_server.db   5245.16      5245.13      5236.18       8.97
117:0   lon_server.db   3742.25      3742.25      3733.60       8.45
117:1   lon_server.db   45706.58     45705.62     45660.20      46.59
118:0   ny_server.db    21153.49     21153.32     21124.23      29.41
118:1   ny_server.db    2709.48      2709.45      2702.48       7.00
119:0   tyo_server.db   14081.53     14081.36     14053.30      28.36
119:1   tyo_server.db   2664.40      2664.37      2656.28       8.19
(Queue type suffix: 0 = received, 1 = sent)
53
Use Replication Server to Measure the Volume of
Replication Traffic
  • Stable queues are composed of segments stored on
    disk partitions.
  • Each segment has a megabyte of message space,
    divided into 64 blocks of 16K
  • Take the reading at the start of day (say 11PM)
  • No_of_16K_Blocks_cur = (int(Seg.Block_cur) - 1) * 64
    + (Seg.Block_cur - int(Seg.Block_cur)) * 100
  • No_of_16K_Blocks_start = (int(Seg.Block_start) - 1) * 64
    + (Seg.Block_start - int(Seg.Block_start)) * 100
  • Traffic today (MB) = (No_of_16K_Blocks_cur -
    No_of_16K_Blocks_start) / 64
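As a worked check against the output above, take the hk_server.db
received queue (current 40196.52, start of day 40171.60):

    No_of_16K_Blocks_cur   = (40196 - 1) * 64 + 52 = 2572532
    No_of_16K_Blocks_start = (40171 - 1) * 64 + 60 = 2570940
    Traffic today (MB)     = (2572532 - 2570940) / 64 = 24.87

which matches the 24.87 MB reported for that queue.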

54
Total messages attributed to different servers up
to now are:
London Production                 55.05 MB
Hong Kong                         39.36 MB
Tokyo                             36.55 MB
New York                          36.41 MB
London bcp                        37.66 MB
DBA specific traffic (estimate)   44.84 MB
Application traffic              160.17 MB
Sum total all sites              205.02 MB