E110 Best Practices for Implementing with Adaptive Server IQ - PowerPoint PPT Presentation

1 / 58

About This Presentation

Title:

E110 Best Practices for Implementing with Adaptive Server IQ

Description:

Can map ASE proxy tables to stored procedure calls in IQ-M ... Better performance achieved by casting the date or datetime formats rather than ... – PowerPoint PPT presentation

Number of Views:60

Avg rating:3.0/5.0

Slides: 59

Provided by: Kar9252

Category:

more less

Transcript and Presenter's Notes

Title: E110 Best Practices for Implementing with Adaptive Server IQ

1
E110Best Practices for Implementing with
Adaptive Server IQ

Mark Mumy
Principal Systems Consultant
Business Intelligence Division
mark.mumy_at_sybase.com

2
About This Presentation

Hardware Configuration
Creating the Database
IQ-M Server Settings
Connectivity
Data Model Recommendations
Database Programming
Data Manipulation
Maintenance Tasks
Memory Usage

3
Target Audience

Database Administrators
All Levels
Query Developers
All Levels

4
For This Presentation

Lets Keep It Interactive
I will entertain questions on a subject
Would be happy to speak to you offline, if
desired
No Question is too Basic
Chances are others may have the same question
Open Question Forum at the end
Time Permitting

5
Hardware Configuration Drive Arrays

Volume Manager
Not a necessary or recommended component for IQ-M
Additional overhead and software that has no
value add to an IQ-M installation
Acceptable for an IQ-M single node operation
where hardware does not have the ability to apply
RAID to the disk devices
RAID Level
Recommend using RAID 5 as a blend between
performance, protection, and cost
Recommend using raw devices (a must for
multi-node IQ-M)
For details on drive array specifications and
configurations see the Sun Reference Architecture
Whitepapers and EMC Whitepapers on array
configurations with IQ-M

6
Hardware Configuration Memory

How Much RAM for IQ-M
As much as possible!
Most systems should have at least 2 GB
Dont forget additional RAM for each load being
performed

7
Hardware Configuration Processors and Disk
Controllers

Optimal CPU configuration
Queries 1 CPU per active query more if queries
are complex and can be run in parallel
Loads 1 CPU per HG index plus 1 CPU per 2-5
columns being loaded
Disk Controllers
1 fiber controller per 5 write CPUs
1 fiber controller per 10 read CPUs
In mixed mode operation use the 51 ratio
Recommend a minimum of 2 controllers

8
Creating The Database Creation Options

Creation Options
Block Size and Page Size
Minimum should be 128K Page Size (8K Page)
Use 256K Page for larger databases
Larger the page size the more RAM that will be
required
CASE IGNORE vs. CASE RESPECT
Use CASE RESPECT whenever possible
Removes case comparison steps and improves
performance
Java in the Database
Install if Java will be needed in the database
No impact if installed and not used

9
Creating The Database File Placement

Use relative links makes it easier to relocate
files
Use symbolic links for IQ MAIN and IQ TEMP
/IQM_devs/IQ_MAIN_00
/IQM_devs/IQ_TEMP_00
Place the transaction log and transaction log
mirror on significantly large file system
Recommend 5 GB of filesystem space per IQ-M node
for database related storage
Transaction logs, IQ MSG, scripts, catalog file,
etc

10
Creating The Database Files Locations

Filename rules
Keep the names unique for each IQ-M node
Catalog DATABASE_NAME.db
Transaction log DATABASE_NAME.tlog
IQ MSG DATABASE_NAME.iqmsg
IQ Main Store DATABASE_NAME_iqmain_000
000 would be an incrementing number
IQ Temp Store DATABASE_NAME_iqtmp_000
000 would be an incrementing number
IQ Configuration File DATABASE_NAME.cfg

11
IQ-M Server Settings Query Performance

FORCE_NO_SCROLL_CURSORS
Should always be set to ON
Very few applications require this to be OFF
Can improve query performance

12
IQ-M Server Settings Query Plans

Query Plan Settings to Provide Optimal Query
Information to DBAs and Sybase Engineering
set temporary option query_plan'on'
set temporary option query_plan_after_run'on'
set temporary option query_timing'on'
set temporary option query_detail'on'
set temporary option dml_options10'on'
set temporary option query_plan_as_html'on'
set option QUERY_NAME Query Name
Should not be set globally as the IQ MSG file
will grow rapidly

13
IQ-M Server Settings Storage

Append_Load
Can be used to improve load performance
Will not reuse Row IDs or the space occupied by
those Row IDs
Great for systems where large, contiguous chunks
of data are deleted
May not good if random rows are deleted as it can
lead to fragmentation and allocated, but unused
space

14
IQ-M Server Settings Storage Continued

Disk_Striping
If ON, IQ-M will stripe writes to all available
devices
If OFF, the first device must be full before the
next is used
Disk_Striping_Packed
If ON, it forces better space usage and less
fragmentation
Fragmentation is indicated when out of space
messages are returned but the main dbspace used
is less than 100 full
Allows Adaptive Server IQ to better utilize small
pieces of unused space that remain after
compression

15
IQ-M Server Settings Data Loads

LOAD_MEMORY_MB
Set to 0 (default) on systems with enough RAM
Set to something lt 500 on systems where RAM is
tight or there are simultaneous loads taking
place

16
Client Access and Network Connectivity

Different internal environments and settings for
ODBC, JDBC, and Open Client connections
JDBC and ODBC are recommended
JConnect is Sybases implementation of JDBC
Use the JDBC 2 driver (JConnect 5.2/5.5)
com.sybase.jdbc2.jdbc.SybDriver
Open Client
Use with caution
Most applications written using Open Client
expect an ASE server

17
Client Access and Network Connectivity Open
Client vs. ODBC

See Chapter 32 of the ASA Users Manual as well as
the ASA Reference Section Transact-SQL and
SQL/92 compatibility options for complete list
of differences
If writing stored procedures or embedded
application code, make sure to explicitly make
settings for compatibility as these options will
get set to different values for Open Client vs.
ODBC connections
ALLOW_NULLS_BY_DEFAULT
QUOTED_IDENTIFIER
STRING_RTRUNCATION
ANSI_BLANKS
ANSINULL
CHAINDED

18
Client Access and Network Connectivity

AutoPreCommit Within ODBC
Set registry setting AutoPreCommit to Y
Forces applications to issue a COMMIT before each
query
Go to the registry and update the corresponding
Sybase Data Source Name (DSN) created, by adding
a new value 'AutoPreCommit with a value of Y'
HKEY_LOCAL_MACHINE/SOFTWARE/ODBC/ODBC.INI/DSN
Packet Sizes
Larger packet sizes will help with large data
retrieval
Use p option in IQ-M configuration file to
increase size
Use CommBufferSize parameter in ODBC connection
string

19
Client Access and Network Connectivity

Network Speed
Data retrieval will be as fast as the network
100 MB of data will take 80 seconds on a 10 Mbit
LAN
100 MB of data will take 8 seconds on a 100 Mbit
LAN
100 MB of data will take 0.8 seconds on a Gigabit
LAN
LAN speed may be the performance bottleneck for
queries that return large amounts of data
The faster the network cards and LAN the better
off concurrency will be more available bandwidth
per user
Network Interface Cards
Adding multiple network cards to the IQ-M node
will help with network

20
Client Access and Network Connectivity

ASE Component Integration Services (CIS)
Can have ASE CIS reference IQ-M tables (proxy
tables)
Use ASE 12.5 CIS class of ASIQ
Prior to to ASE 12.5 ASAnywhere or ASEnterprise
classes had to be used
Not a viable option if joining non-IQ-M tables
with an IQ-M proxy table
Can map multiple ASE logins to a single IQ-M
login
Data modifications should not be performed on
proxy tables
Can map ASE proxy tables to stored procedure
calls in IQ-M

21
Client Access and Network Connectivity

IQ-M Component Integration Services (CIS)
Can only be used on the Solaris and WinNT
versions of IQ-M
Can map to tables in Oracle (ODBC), ASE (ODBC
JDBC), ASA (ODBC JDBC), DB2 (ODBC), MS SQL
Server (ODBC), and any ODBC source
Not a viable option if joining IQ-M tables with
non-IQ-M proxy tables
Data modifications may be performed on proxy
tables

22
Data Model Recommendations IQ UNIQUE

Use whenever possible to help save storage space
and possibly improve query performance
Not necessary for cardinality greater than 64K
(65536)
Not necessary for datatypes of tinyint or char(1)
Does not need to be exact, but should be close to
cardinality
If the value is lt 255 then IQ-M will place a
1-byte FP index on the column 1 byte of storage
per row
If the value is gt 255 but lt 65536 then IQ-M will
place a 2-byte FP index on the column 2 bytes
of storage per row
May slightly hinder data loads, but improve query
speeds

23
Data Model Recommendations NULL Values

Always specify NULL or NOT NULL
Open Client and ODBC connections have different
defaults
Allows the optimizer a better guess at join
criteria
NULL data does not save space on the database
page as it would in ASE
Will be compressed out when stored on disk

24
Data Model Recommendations Unsigned Datatypes

Use unsigned datatypes where possible
Use for surrogate keys and join columns
Unsigned data comparisons are quicker
The caveat to this is that Open Client may
misinterpret the value if it is too large as it
does not understand large unsigned data
Can convert to signed integer, numeric, or
decimal if returning data to an Open Client
application
This caveat applies to moving data between IQ-M
servers with INSERT FROM LOCATION

25
Data Model Recommendations Long Varchar and
Long Varbinary

Can be used to store large amounts of text or
binary data
VARCHAR() or VARBINARY() datatypes
Maximum width is 32K (32768) bytes
The WORD index is the only index allowed on data
256 bytes or wider
Storage will be allocated in 256 byte chunks
A 257 byte string will require 512 bytes of
storage
Much less than the 2K requirement in ASE TEXT and
IMAGE columns

26
Data Model Recommendations Varchar vs. Char

Use CHAR() whenever possible
All data in IQ-M is stored fixed width
VARCHAR() types only add overhead
A VARCHAR(100) columns will require 101 bytes of
storage
100 bytes for data
1 byte for the size of data
CHAR() data is blank padded, VARCHAR() is not

27
Data Model Recommendations Proper Datatype
Sizing

Use the smallest datatypes possible for data
Be aware of all datatypes in IQ-M there may be
more than you know
If hour, minute and second information is not
necessary, use DATE instead of DATETIME
If the data will fit within a TINYINT or SMALLINT
datatype use that rather than INTEGER or BIGINT
Allows the engine to store data in smaller units
(1-byte TINYINT or 2-byte SMALLINT versus 4-byte
INTEGER or 8-byte BIGINT
Dont over allocate storage when defining
NUMERIC() or DECIMAL() as it can be costly for
data that doesnt need all that space

28
Data Model Recommendations When and Where to
Use Indexes

Always use indexes on
Join columns
Searchable columns
Aggregation columns
If uncertain, place an LF or HG index on the
column
A column with an HNG, CMP, or WD index should
have a corresponding LF or HG index (very rare
circumstances will negate this)
Indexes are not needed on columns whose data is
ONLY returned to the client (projected)
Use Primary Key or UNIQUE HG indexes where
appropriate

29
Data Model Recommendations Multi-column Indexes

Currently, only UNIQUE HG and PRIMARY KEY indexes
support multiple columns
HG inserts are the most expensive in IQ-M
Try to guarantee that inserts will happen at the
end of the index
Place generally incrementing data at the
beginning of the index list
For instance, a transaction date or batch number
Something that will try to guarantee a sequential
key

30
Data Model Recommendations Join Column

Prefer joining integer datatypes (unsigned if
possible)
Integer comparisons are quicker than character
comparisons
Keep the datatypes as narrow as possible to
improve join performance

31
Data Model Recommendations Primary Keys

Multi-column primary keys should have an
additional LF or HG index placed on each
individual column
Must be done manually
UNIQUE CONSTRAINT, UNIQUE HG, and Primary Key are
identical structures HG index with no G-Array
Use when possible
Helps optimizer make more informed query path
decisions even if index is not used in joins or
searches
You get an HG Index created automatically
This HG index has no G-Array (uses less space)

32
Data Model Recommendations Temporary Tables

3 Types of Temporary Tables
tables
create table temp table ( col1 int )
Local Temporary Tables
declare local temporary table temp table ( col1
int )
Behave just like tables
Global Temporary Tables
create global temporary table temp table ( col1
int )
Tables structure is static across connections and
reboots

33
Data Model Recommendations Temporary Tables

On Commit Preserve Rows
Use this option so that rows in temporary tables
remain after the transaction has been committed
Temporary tables are available at the current
level (parent) and all of its children
A parent cannot see a child's temporary table(s)

34
Data Model Recommendations Cursors

Avoid using cursors
Generally means row based processing
IQ-M was designed for set based processing
Sometimes they cannot be avoided
Open With Hold
Allows the cursor to remain open across
transactions
If not used, the cursor is closed when a commit
is issued

35
Database Programming Language Watcom SQL vs.
T-SQL

IQ-M (ASA) is not 100 T-SQL Compatible, but very
close
Recommend using Watcom SQL
All system procedures written with it
Many more code examples and more IQ-M people
versed in it
Watcom SQL has some extensions that T-SQL does
not
Dynamic SQL
Better Loop control
Full cursor movement rather than just read next
Batches and procedures must be written in the
same dialect
Cannot mix T-SQL with Watcom SQL

36
Database Programming Language Watcom SQL vs.
T-SQL

Behavior differences include

DECLARE CURSOR
GOTO
IF
PRINT
RAISERROR
SET

WHILE (T-SQL) vs. LOOP
Global variables
Variable Names
CALL
FOR
ASA requires variables to be declared immediately
after a BEGIN

37
Database Programming Language Commit and
Rollback

Use transaction control around logical units of
work, even read only queries
Should commit before a read/write batch is
started to ensure latest version of data is
available
Should issue commit and rollback after batch
completion to release all query resources
Rollback will free memory resources in use by
previous operations

38
Database Programming Language Custom Functions

Custom functions can be written in either SQL or
Java
Great way to encapsulate business logic for
transforming data
Can have a significant performance impact on
queries
Functions are executed in the catalog portion of
the engine
All result rows may need to be moved to ASA
Can be time consuming for large result sets
Turn on query plans to see what impact the
functions have on effective query plans

39
Database Programming Language Outer Joins

T-SQL Outer Join or
ANSI Standard/Watcom SQL Outer Join
leftrightfull outer join
Be careful of outer join syntax
The T-SQL syntax can be very ambiguous or
non-deterministic for IQ-M to translate
All T-SQL outer joins must be converted to ANSI
outer joins and then processed
Use the ANSI standard instead as they are not
ambiguous and are always clear in their meaning
Visit http//my.sybase.com/detail?id1017447 for
more details

40
Data Manipulation Load Table

Parallel Load Table
Make sure to put column delimiter after last
column
Must use ROW DELIMITED BY and DELIMITED BY
options in load table command
Column and row delimiters must be a single
character
1 to 4 byte delimiters allowed for serial loads

41
Data Manipulation Load Table

If possible, perform load table from binary
datafiles
Can be 3 to 10 times faster than ASCII loads
Can use FILLER() clause with a delimiter or byte
count
Better performance achieved by casting the date
or datetime formats rather than letting IQ-M
guess
If possible, issue a single load table with
multiple files rather than 1 load table per file
to be loaded into a table

42
Data Manipulation Insert From Location

Great way to move data from any Open Client
source
Syntax
insert into TABLE()
location SERVERNAME.DBNAME
select statement
IQ-M username and login must match on remote
system
Interfaces entry must match the SERVERNAME
Can also be used to move data quickly from an ASA
table to an IQ-M in the same server

43
Data Manipulation Single Row Operations

Avoid at all costs for large data manipulation
operations
Different from single statement operations that
modify many rows
Individual INSERT VALUES() will be slower than
bulk load operations
Expect no more than 5,000 to 20,000 operations
per hour

44
Data Manipulation Named Pipes and Flat Files

Named pipes can be faster no disk I/O
How to make a named pipe
mknod PIPE_NAME p
Can use BCP, GZIP, UNCOMPRESS, or applications to
push IQ-M formatted data into named pipe
LOAD TABLE command can read from named pipe
Can also fast extract data to named pipes so that
they can be read by another application or even
compressed and stored

45
Data Manipulation Partial Width Inserts

Can induce fragmentation and overly large space
consumption if not watched
Row IDs from deleted data are not reused during
a partial width insert operation
Space from delete data is not reused (because Row
IDs are not reused)
Partial width inserts are analogous to
APPEND_LOADon in terms of Row ID and space
behavior
Only becomes a problem if partial width inserts
are a way of life for a table

46
Data Loading The Deep Fact Table

Many databases have a deep table
A growing table with tens of millions of rows
Rows typically rolled out and replaced over
time
HG indexes are very slow to load/delete
Loading time can steadily increases as rows are
added
Solution (with a caveat)
Partition the table (Example time day, week,
month)
Build a view that is a Union of all the tables

47
Data Loading Partitioned Fact Table
Big Fact Table
Accessed by View
Create View bigtable as Select from t1 Union
All Select from t2 Union All Select from t3
48
Data Loading Why Partitioning?

Loads are faster and predictable
x million rows will load consistently
IQ may process the Union All in parallel
As long as cpu resources are available
To roll data out, truncate one table
Truncate table is much faster than Delete
No changes to DDL required
Load new data into the empty (truncated) table

49
Data Extraction

For fastest data unloads use the TEMP_EXTRACT
options
Can unload data in ASCII or BINARY format
Recommend BINARY format as it is faster to reload
into IQ-M
Great way to archive portions of the database
Can unload to one or more files, serially
Avoid using ISQL or DBISQL redirection to a file
Much slower than fast unloads

50
Maintenance Tasks Database Consistency Checks

The routine to check the database for potential
corruptions is SP_IQCHECKDB
There are no run-time options for this command,
however the DBCC options control the behavior of
the SP_IQCHECKDB command
Recommendations
Run Level 1 DBCCs every 1 to 4 weeks (will
process 1-2 GB per second)
Run Level 2 DBCCs every 1 to 3 months (will
process 50-100 MB per second)

51
Maintenance Tasks Parallel Create index

Create Multiple Indexes in a batch
Syntax
BEGIN PARALLEL IQ
Create HG Index
Create LF Index
END PARALLEL IQ
Recommendation
Create no more indexes in a batch than you have
cpu's
Note Two indexes on same column will be serial

52
Maintenance Tasks Setting the IQ Caches

Start by allocating memory to caches
40 IQ Main Cache
60 IQ Temporary Cache
Adjust allocations from here based on monitoring
Number of HG indexes in a table may alter this
plan for loading data

53
Maintenance Tasks IQ Monitoring

IQ Monitor is a diagnostic tool for DBAs
It collects and reports internal counters from
the IQ Buffer Caches
Main Cache interaction with the IQ Store
Temporary Cache interaction with the IQ Temp
Store
IQ Monitor offers a series of views of the
counters to showing differing aspects of the
server and buffer cache workload

54
Maintenance Tasks IQ Monitoring

Provides different views of buffer activity
summary report of both caches
detailed report of one cache
i/o activity of a cache
a debug report of all buffer cache activity
You must specify an option when you start the
monitor to specify what view to monitor
One Monitor may be running for a cache
May have one for each (Main Temp)

55
Managing Memory Inventory Your System Memory