E110 Best Practices for Implementing with Adaptive Server IQ - PowerPoint PPT Presentation

1 / 58
About This Presentation
Title:

E110 Best Practices for Implementing with Adaptive Server IQ

Description:

Can map ASE proxy tables to stored procedure calls in IQ-M ... Better performance achieved by casting the date or datetime formats rather than ... – PowerPoint PPT presentation

Number of Views:60
Avg rating:3.0/5.0
Slides: 59
Provided by: Kar9252
Category:

less

Transcript and Presenter's Notes

Title: E110 Best Practices for Implementing with Adaptive Server IQ


1
E110Best Practices for Implementing with
Adaptive Server IQ
  • Mark Mumy
  • Principal Systems Consultant
  • Business Intelligence Division
  • mark.mumy_at_sybase.com

2
About This Presentation
  • Hardware Configuration
  • Creating the Database
  • IQ-M Server Settings
  • Connectivity
  • Data Model Recommendations
  • Database Programming
  • Data Manipulation
  • Maintenance Tasks
  • Memory Usage

3
Target Audience
  • Database Administrators
  • All Levels
  • Query Developers
  • All Levels

4
For This Presentation
  • Lets Keep It Interactive
  • I will entertain questions on a subject
  • Would be happy to speak to you offline, if
    desired
  • No Question is too Basic
  • Chances are others may have the same question
  • Open Question Forum at the end
  • Time Permitting

5
Hardware Configuration Drive Arrays
  • Volume Manager
  • Not a necessary or recommended component for IQ-M
  • Additional overhead and software that has no
    value add to an IQ-M installation
  • Acceptable for an IQ-M single node operation
    where hardware does not have the ability to apply
    RAID to the disk devices
  • RAID Level
  • Recommend using RAID 5 as a blend between
    performance, protection, and cost
  • Recommend using raw devices (a must for
    multi-node IQ-M)
  • For details on drive array specifications and
    configurations see the Sun Reference Architecture
    Whitepapers and EMC Whitepapers on array
    configurations with IQ-M

6
Hardware Configuration Memory
  • How Much RAM for IQ-M
  • As much as possible!
  • Most systems should have at least 2 GB
  • Dont forget additional RAM for each load being
    performed

7
Hardware Configuration Processors and Disk
Controllers
  • Optimal CPU configuration
  • Queries 1 CPU per active query more if queries
    are complex and can be run in parallel
  • Loads 1 CPU per HG index plus 1 CPU per 2-5
    columns being loaded
  • Disk Controllers
  • 1 fiber controller per 5 write CPUs
  • 1 fiber controller per 10 read CPUs
  • In mixed mode operation use the 51 ratio
  • Recommend a minimum of 2 controllers

8
Creating The Database Creation Options
  • Creation Options
  • Block Size and Page Size
  • Minimum should be 128K Page Size (8K Page)
  • Use 256K Page for larger databases
  • Larger the page size the more RAM that will be
    required
  • CASE IGNORE vs. CASE RESPECT
  • Use CASE RESPECT whenever possible
  • Removes case comparison steps and improves
    performance
  • Java in the Database
  • Install if Java will be needed in the database
  • No impact if installed and not used

9
Creating The Database File Placement
  • Use relative links makes it easier to relocate
    files
  • Use symbolic links for IQ MAIN and IQ TEMP
  • /IQM_devs/IQ_MAIN_00
  • /IQM_devs/IQ_TEMP_00
  • Place the transaction log and transaction log
    mirror on significantly large file system
  • Recommend 5 GB of filesystem space per IQ-M node
    for database related storage
  • Transaction logs, IQ MSG, scripts, catalog file,
    etc

10
Creating The Database Files Locations
  • Filename rules
  • Keep the names unique for each IQ-M node
  • Catalog DATABASE_NAME.db
  • Transaction log DATABASE_NAME.tlog
  • IQ MSG DATABASE_NAME.iqmsg
  • IQ Main Store DATABASE_NAME_iqmain_000
  • 000 would be an incrementing number
  • IQ Temp Store DATABASE_NAME_iqtmp_000
  • 000 would be an incrementing number
  • IQ Configuration File DATABASE_NAME.cfg

11
IQ-M Server Settings Query Performance
  • FORCE_NO_SCROLL_CURSORS
  • Should always be set to ON
  • Very few applications require this to be OFF
  • Can improve query performance

12
IQ-M Server Settings Query Plans
  • Query Plan Settings to Provide Optimal Query
    Information to DBAs and Sybase Engineering
  • set temporary option query_plan'on'
  • set temporary option query_plan_after_run'on'
  • set temporary option query_timing'on'
  • set temporary option query_detail'on'
  • set temporary option dml_options10'on'
  • set temporary option query_plan_as_html'on'
  • set option QUERY_NAME Query Name
  • Should not be set globally as the IQ MSG file
    will grow rapidly

13
IQ-M Server Settings Storage
  • Append_Load
  • Can be used to improve load performance
  • Will not reuse Row IDs or the space occupied by
    those Row IDs
  • Great for systems where large, contiguous chunks
    of data are deleted
  • May not good if random rows are deleted as it can
    lead to fragmentation and allocated, but unused
    space

14
IQ-M Server Settings Storage Continued
  • Disk_Striping
  • If ON, IQ-M will stripe writes to all available
    devices
  • If OFF, the first device must be full before the
    next is used
  • Disk_Striping_Packed
  • If ON, it forces better space usage and less
    fragmentation
  • Fragmentation is indicated when out of space
    messages are returned but the main dbspace used
    is less than 100 full
  • Allows Adaptive Server IQ to better utilize small
    pieces of unused space that remain after
    compression

15
IQ-M Server Settings Data Loads
  • LOAD_MEMORY_MB
  • Set to 0 (default) on systems with enough RAM
  • Set to something lt 500 on systems where RAM is
    tight or there are simultaneous loads taking
    place

16
Client Access and Network Connectivity
  • Different internal environments and settings for
    ODBC, JDBC, and Open Client connections
  • JDBC and ODBC are recommended
  • JConnect is Sybases implementation of JDBC
  • Use the JDBC 2 driver (JConnect 5.2/5.5)
  • com.sybase.jdbc2.jdbc.SybDriver
  • Open Client
  • Use with caution
  • Most applications written using Open Client
    expect an ASE server

17
Client Access and Network Connectivity Open
Client vs. ODBC
  • See Chapter 32 of the ASA Users Manual as well as
    the ASA Reference Section Transact-SQL and
    SQL/92 compatibility options for complete list
    of differences
  • If writing stored procedures or embedded
    application code, make sure to explicitly make
    settings for compatibility as these options will
    get set to different values for Open Client vs.
    ODBC connections
  • ALLOW_NULLS_BY_DEFAULT
  • QUOTED_IDENTIFIER
  • STRING_RTRUNCATION
  • ANSI_BLANKS
  • ANSINULL
  • CHAINDED

18
Client Access and Network Connectivity
  • AutoPreCommit Within ODBC
  • Set registry setting AutoPreCommit to Y
  • Forces applications to issue a COMMIT before each
    query
  • Go to the registry and update the corresponding
    Sybase Data Source Name (DSN) created, by adding
    a new value 'AutoPreCommit with a value of Y'
  • HKEY_LOCAL_MACHINE/SOFTWARE/ODBC/ODBC.INI/DSN
  • Packet Sizes
  • Larger packet sizes will help with large data
    retrieval
  • Use p option in IQ-M configuration file to
    increase size
  • Use CommBufferSize parameter in ODBC connection
    string

19
Client Access and Network Connectivity
  • Network Speed
  • Data retrieval will be as fast as the network
  • 100 MB of data will take 80 seconds on a 10 Mbit
    LAN
  • 100 MB of data will take 8 seconds on a 100 Mbit
    LAN
  • 100 MB of data will take 0.8 seconds on a Gigabit
    LAN
  • LAN speed may be the performance bottleneck for
    queries that return large amounts of data
  • The faster the network cards and LAN the better
    off concurrency will be more available bandwidth
    per user
  • Network Interface Cards
  • Adding multiple network cards to the IQ-M node
    will help with network

20
Client Access and Network Connectivity
  • ASE Component Integration Services (CIS)
  • Can have ASE CIS reference IQ-M tables (proxy
    tables)
  • Use ASE 12.5 CIS class of ASIQ
  • Prior to to ASE 12.5 ASAnywhere or ASEnterprise
    classes had to be used
  • Not a viable option if joining non-IQ-M tables
    with an IQ-M proxy table
  • Can map multiple ASE logins to a single IQ-M
    login
  • Data modifications should not be performed on
    proxy tables
  • Can map ASE proxy tables to stored procedure
    calls in IQ-M

21
Client Access and Network Connectivity
  • IQ-M Component Integration Services (CIS)
  • Can only be used on the Solaris and WinNT
    versions of IQ-M
  • Can map to tables in Oracle (ODBC), ASE (ODBC
    JDBC), ASA (ODBC JDBC), DB2 (ODBC), MS SQL
    Server (ODBC), and any ODBC source
  • Not a viable option if joining IQ-M tables with
    non-IQ-M proxy tables
  • Data modifications may be performed on proxy
    tables

22
Data Model Recommendations IQ UNIQUE
  • Use whenever possible to help save storage space
    and possibly improve query performance
  • Not necessary for cardinality greater than 64K
    (65536)
  • Not necessary for datatypes of tinyint or char(1)
  • Does not need to be exact, but should be close to
    cardinality
  • If the value is lt 255 then IQ-M will place a
    1-byte FP index on the column 1 byte of storage
    per row
  • If the value is gt 255 but lt 65536 then IQ-M will
    place a 2-byte FP index on the column 2 bytes
    of storage per row
  • May slightly hinder data loads, but improve query
    speeds

23
Data Model Recommendations NULL Values
  • Always specify NULL or NOT NULL
  • Open Client and ODBC connections have different
    defaults
  • Allows the optimizer a better guess at join
    criteria
  • NULL data does not save space on the database
    page as it would in ASE
  • Will be compressed out when stored on disk

24
Data Model Recommendations Unsigned Datatypes
  • Use unsigned datatypes where possible
  • Use for surrogate keys and join columns
  • Unsigned data comparisons are quicker
  • The caveat to this is that Open Client may
    misinterpret the value if it is too large as it
    does not understand large unsigned data
  • Can convert to signed integer, numeric, or
    decimal if returning data to an Open Client
    application
  • This caveat applies to moving data between IQ-M
    servers with INSERT FROM LOCATION

25
Data Model Recommendations Long Varchar and
Long Varbinary
  • Can be used to store large amounts of text or
    binary data
  • VARCHAR() or VARBINARY() datatypes
  • Maximum width is 32K (32768) bytes
  • The WORD index is the only index allowed on data
    256 bytes or wider
  • Storage will be allocated in 256 byte chunks
  • A 257 byte string will require 512 bytes of
    storage
  • Much less than the 2K requirement in ASE TEXT and
    IMAGE columns

26
Data Model Recommendations Varchar vs. Char
  • Use CHAR() whenever possible
  • All data in IQ-M is stored fixed width
  • VARCHAR() types only add overhead
  • A VARCHAR(100) columns will require 101 bytes of
    storage
  • 100 bytes for data
  • 1 byte for the size of data
  • CHAR() data is blank padded, VARCHAR() is not

27
Data Model Recommendations Proper Datatype
Sizing
  • Use the smallest datatypes possible for data
  • Be aware of all datatypes in IQ-M there may be
    more than you know
  • If hour, minute and second information is not
    necessary, use DATE instead of DATETIME
  • If the data will fit within a TINYINT or SMALLINT
    datatype use that rather than INTEGER or BIGINT
  • Allows the engine to store data in smaller units
    (1-byte TINYINT or 2-byte SMALLINT versus 4-byte
    INTEGER or 8-byte BIGINT
  • Dont over allocate storage when defining
    NUMERIC() or DECIMAL() as it can be costly for
    data that doesnt need all that space

28
Data Model Recommendations When and Where to
Use Indexes
  • Always use indexes on
  • Join columns
  • Searchable columns
  • Aggregation columns
  • If uncertain, place an LF or HG index on the
    column
  • A column with an HNG, CMP, or WD index should
    have a corresponding LF or HG index (very rare
    circumstances will negate this)
  • Indexes are not needed on columns whose data is
    ONLY returned to the client (projected)
  • Use Primary Key or UNIQUE HG indexes where
    appropriate

29
Data Model Recommendations Multi-column Indexes
  • Currently, only UNIQUE HG and PRIMARY KEY indexes
    support multiple columns
  • HG inserts are the most expensive in IQ-M
  • Try to guarantee that inserts will happen at the
    end of the index
  • Place generally incrementing data at the
    beginning of the index list
  • For instance, a transaction date or batch number
  • Something that will try to guarantee a sequential
    key

30
Data Model Recommendations Join Column
  • Prefer joining integer datatypes (unsigned if
    possible)
  • Integer comparisons are quicker than character
    comparisons
  • Keep the datatypes as narrow as possible to
    improve join performance

31
Data Model Recommendations Primary Keys
  • Multi-column primary keys should have an
    additional LF or HG index placed on each
    individual column
  • Must be done manually
  • UNIQUE CONSTRAINT, UNIQUE HG, and Primary Key are
    identical structures HG index with no G-Array
  • Use when possible
  • Helps optimizer make more informed query path
    decisions even if index is not used in joins or
    searches
  • You get an HG Index created automatically
  • This HG index has no G-Array (uses less space)

32
Data Model Recommendations Temporary Tables
  • 3 Types of Temporary Tables
  • tables
  • create table temp table ( col1 int )
  • Local Temporary Tables
  • declare local temporary table temp table ( col1
    int )
  • Behave just like tables
  • Global Temporary Tables
  • create global temporary table temp table ( col1
    int )
  • Tables structure is static across connections and
    reboots

33
Data Model Recommendations Temporary Tables
  • On Commit Preserve Rows
  • Use this option so that rows in temporary tables
    remain after the transaction has been committed
  • Temporary tables are available at the current
    level (parent) and all of its children
  • A parent cannot see a child's temporary table(s)

34
Data Model Recommendations Cursors
  • Avoid using cursors
  • Generally means row based processing
  • IQ-M was designed for set based processing
  • Sometimes they cannot be avoided
  • Open With Hold
  • Allows the cursor to remain open across
    transactions
  • If not used, the cursor is closed when a commit
    is issued

35
Database Programming Language Watcom SQL vs.
T-SQL
  • IQ-M (ASA) is not 100 T-SQL Compatible, but very
    close
  • Recommend using Watcom SQL
  • All system procedures written with it
  • Many more code examples and more IQ-M people
    versed in it
  • Watcom SQL has some extensions that T-SQL does
    not
  • Dynamic SQL
  • Better Loop control
  • Full cursor movement rather than just read next
  • Batches and procedures must be written in the
    same dialect
  • Cannot mix T-SQL with Watcom SQL

36
Database Programming Language Watcom SQL vs.
T-SQL
  • Behavior differences include
  • DECLARE CURSOR
  • GOTO
  • IF
  • PRINT
  • RAISERROR
  • SET
  • WHILE (T-SQL) vs. LOOP
  • Global variables
  • Variable Names
  • CALL
  • FOR
  • ASA requires variables to be declared immediately
    after a BEGIN

37
Database Programming Language Commit and
Rollback
  • Use transaction control around logical units of
    work, even read only queries
  • Should commit before a read/write batch is
    started to ensure latest version of data is
    available
  • Should issue commit and rollback after batch
    completion to release all query resources
  • Rollback will free memory resources in use by
    previous operations

38
Database Programming Language Custom Functions
  • Custom functions can be written in either SQL or
    Java
  • Great way to encapsulate business logic for
    transforming data
  • Can have a significant performance impact on
    queries
  • Functions are executed in the catalog portion of
    the engine
  • All result rows may need to be moved to ASA
  • Can be time consuming for large result sets
  • Turn on query plans to see what impact the
    functions have on effective query plans

39
Database Programming Language Outer Joins
  • T-SQL Outer Join or
  • ANSI Standard/Watcom SQL Outer Join
  • leftrightfull outer join
  • Be careful of outer join syntax
  • The T-SQL syntax can be very ambiguous or
    non-deterministic for IQ-M to translate
  • All T-SQL outer joins must be converted to ANSI
    outer joins and then processed
  • Use the ANSI standard instead as they are not
    ambiguous and are always clear in their meaning
  • Visit http//my.sybase.com/detail?id1017447 for
    more details

40
Data Manipulation Load Table
  • Parallel Load Table
  • Make sure to put column delimiter after last
    column
  • Must use ROW DELIMITED BY and DELIMITED BY
    options in load table command
  • Column and row delimiters must be a single
    character
  • 1 to 4 byte delimiters allowed for serial loads

41
Data Manipulation Load Table
  • If possible, perform load table from binary
    datafiles
  • Can be 3 to 10 times faster than ASCII loads
  • Can use FILLER() clause with a delimiter or byte
    count
  • Better performance achieved by casting the date
    or datetime formats rather than letting IQ-M
    guess
  • If possible, issue a single load table with
    multiple files rather than 1 load table per file
    to be loaded into a table

42
Data Manipulation Insert From Location
  • Great way to move data from any Open Client
    source
  • Syntax
  • insert into TABLE()
  • location SERVERNAME.DBNAME
  • select statement
  • IQ-M username and login must match on remote
    system
  • Interfaces entry must match the SERVERNAME
  • Can also be used to move data quickly from an ASA
    table to an IQ-M in the same server

43
Data Manipulation Single Row Operations
  • Avoid at all costs for large data manipulation
    operations
  • Different from single statement operations that
    modify many rows
  • Individual INSERT VALUES() will be slower than
    bulk load operations
  • Expect no more than 5,000 to 20,000 operations
    per hour

44
Data Manipulation Named Pipes and Flat Files
  • Named pipes can be faster no disk I/O
  • How to make a named pipe
  • mknod PIPE_NAME p
  • Can use BCP, GZIP, UNCOMPRESS, or applications to
    push IQ-M formatted data into named pipe
  • LOAD TABLE command can read from named pipe
  • Can also fast extract data to named pipes so that
    they can be read by another application or even
    compressed and stored

45
Data Manipulation Partial Width Inserts
  • Can induce fragmentation and overly large space
    consumption if not watched
  • Row IDs from deleted data are not reused during
    a partial width insert operation
  • Space from delete data is not reused (because Row
    IDs are not reused)
  • Partial width inserts are analogous to
    APPEND_LOADon in terms of Row ID and space
    behavior
  • Only becomes a problem if partial width inserts
    are a way of life for a table

46
Data Loading The Deep Fact Table
  • Many databases have a deep table
  • A growing table with tens of millions of rows
  • Rows typically rolled out and replaced over
    time
  • HG indexes are very slow to load/delete
  • Loading time can steadily increases as rows are
    added
  • Solution (with a caveat)
  • Partition the table (Example time day, week,
    month)
  • Build a view that is a Union of all the tables

47
Data Loading Partitioned Fact Table
Big Fact Table
Accessed by View
Create View bigtable as Select from t1 Union
All Select from t2 Union All Select from t3
48
Data Loading Why Partitioning?
  • Loads are faster and predictable
  • x million rows will load consistently
  • IQ may process the Union All in parallel
  • As long as cpu resources are available
  • To roll data out, truncate one table
  • Truncate table is much faster than Delete
  • No changes to DDL required
  • Load new data into the empty (truncated) table

49
Data Extraction
  • For fastest data unloads use the TEMP_EXTRACT
    options
  • Can unload data in ASCII or BINARY format
  • Recommend BINARY format as it is faster to reload
    into IQ-M
  • Great way to archive portions of the database
  • Can unload to one or more files, serially
  • Avoid using ISQL or DBISQL redirection to a file
  • Much slower than fast unloads

50
Maintenance Tasks Database Consistency Checks
  • The routine to check the database for potential
    corruptions is SP_IQCHECKDB
  • There are no run-time options for this command,
    however the DBCC options control the behavior of
    the SP_IQCHECKDB command
  • Recommendations
  • Run Level 1 DBCCs every 1 to 4 weeks (will
    process 1-2 GB per second)
  • Run Level 2 DBCCs every 1 to 3 months (will
    process 50-100 MB per second)

51
Maintenance Tasks Parallel Create index
  • Create Multiple Indexes in a batch
  • Syntax
  • BEGIN PARALLEL IQ
  • Create HG Index
  • Create LF Index
  • END PARALLEL IQ
  • Recommendation
  • Create no more indexes in a batch than you have
    cpu's
  • Note Two indexes on same column will be serial

52
Maintenance Tasks Setting the IQ Caches
  • Start by allocating memory to caches
  • 40 IQ Main Cache
  • 60 IQ Temporary Cache
  • Adjust allocations from here based on monitoring
  • Number of HG indexes in a table may alter this
    plan for loading data

53
Maintenance Tasks IQ Monitoring
  • IQ Monitor is a diagnostic tool for DBAs
  • It collects and reports internal counters from
    the IQ Buffer Caches
  • Main Cache interaction with the IQ Store
  • Temporary Cache interaction with the IQ Temp
    Store
  • IQ Monitor offers a series of views of the
    counters to showing differing aspects of the
    server and buffer cache workload

54
Maintenance Tasks IQ Monitoring
  • Provides different views of buffer activity
  • summary report of both caches
  • detailed report of one cache
  • i/o activity of a cache
  • a debug report of all buffer cache activity
  • You must specify an option when you start the
    monitor to specify what view to monitor
  • One Monitor may be running for a cache
  • May have one for each (Main Temp)

55
Managing Memory Inventory Your System Memory
  • Assess whats running and how much memory it uses
    (exclusive of IQ-M)
  • Typically these include
  • Operating System
  • OLAP Servers
  • Middleware
  • Other applications

56
Managing Memory
  • Deduct that total from RAM
  • Allow 20 for File System Cache
  • If using file systems
  • Deduct another 10 (just in case)
  • The rest is for IQ and IQ Caches
  • A picture tells this story

57
Memory The Big Picture
  • Start with the O/S and other applications
  • Determine what the IQ server needs (normal load)
  • IQ Overhead (Heap for loading)
  • Allocate IQ Caches
  • 40 IQ Main Cache
  • 60 IQ Temp Cache

58
Best Practices for Implementing with Adaptive
Server IQ
Session E110
Questions?
Write a Comment
User Comments (0)
About PowerShow.com