Tuning a Very Large Data Warehouse - PowerPoint PPT Presentation

About This Presentation
Title:

Tuning a Very Large Data Warehouse

Description:

Low Intensity Changes with low impact but with high performance benefits. Localized changes ... Like SUNDAY can be reloaded again on the same partition next Sunday. ... – PowerPoint PPT presentation

Number of Views:148
Avg rating:3.0/5.0
Slides: 36
Provided by: bpic
Learn more at: http://www.nocoug.org
Category:
Tags: data | tuning | warehouse

less

Transcript and Presenter's Notes

Title: Tuning a Very Large Data Warehouse


1
Tuning a Very Large Data Warehouse
  • Pichai Bala

2
About Me
  • Working in the IT industry for the past 17 years
  • Working in Oracle since 1993.
  • Working in Data Warehouse and BI since 2003

3
Disclaimer
  • The views expressed in this presentation are
  • mine and does not represent the organization I
    work for
  • or the organizations I had worked for in the
    past.
  • Please always test and validate the ideas
    presented here
  • in a test environment first.

4
A Day in the life of a DBA
5
How about this one?
6
What the chaos mean for the DBA?
  • Free buffer waits
  • enq KO - fast object checkpoint
  • enq TM - contention
  • db sequential read
  • CPU time
  • Logical I/O
  • Physical I/O

7
What it means to the End Users?
  • ETL Load/Batch Job Delays
  • Reporting Delays
  • Decision Making Delays
  • Business Analytics Delays
  • Customer Intelligence Delays
  • Planning and Forecasting Delays
  • Key Performance Metrics Delays

8
Data Warehouse is now in Death Bed
9
But, Why?
10
Data Warehouse Vicious Cycle
  • Data gets deployed
  • Gains User Acceptance
  • More Users and More Demands and Needs
  • Existing Data Grows and New Data gets Deployed
  • and gets into the Death Spiral

11
(No Transcript)
12
Possible Causes.
  • Lack of proper and meaningful maintenance
  • Human errors
  • Poor Design
  • Bad SQLs by developers, users
  • Poor monitoring and scheduling etc..

13
Tuning Strategy
  • Keep it Simple
  • Low Intensity Changes with low impact but with
    high performance benefits
  • Localized changes
  • No change in logic
  • Easy to understand, test and deploy

14
Reduce Wastage
  • Reduce CPU
  • Reduce Logical IO
  • Reduce Physical IO
  • Reduce UNDO
  • Reduce Direct Path Reads

15
How it can be done?
  • Server Tuning
  • Instance Tuning and Maintenance
  • Database Tuning and Maintenance
  • Table Reorganizations/Redefinitions
  • New Indexes
  • Regular Statistics Collection
  • Views
  • SQL/PLSQL Code Changes
  • Working with other teams
  • Educating/Training the users

16
Instance/Database Tuning
  • SGA Max Size
  • DB Cache Size
  • Shared Pool
  • Large Pool
  • No. of DB Writers
  • Redo Log File Size
  • Typical Init.ORA parameters like QUERY_REWRITE,
    BITMAP_MERGE_JOIN

17
SQL/PLSQL Tuning
  • Avoid Clutter
  • Use Indexes when appropriate
  • Full Table Scan is not bad
  • Revisit the code
  • Cunning code is not always necessary
  • Work with other teams and business to reduce
    complexity in code
  • Avoid Hints

18
Query Results can be wrong
  • In 10G use ORDER BY whenever GROUP BY is used
  • Hidden parameter can be enabled with the help of
    Oracle Support

19
Pillars of the Data Warehouse
  • Partitioning
  • Parallelism
  • Aggregations
  • Compression
  • Materialized Views
  • Read Only Tablespaces
  • Data Archival

20
Partitioning
  • Range Partitioning
  • List Partitioning
  • Range List Partitioning
  • Range Hash Partitioning
  • Hash Partitioning
  • Caveat Joins beware.

21
Parallelism
  • Tables can be built parallel
  • Parallel Indexes
  • Parallel Hints while loading or querying.
  • Alter table ltxxxgt move parallel (degree 8)
  • Alter table ltxxxgt split parallel( degree 4)
  • Create table ltxxxgt parallel(degree 4)
  • Sufficient LARGE_POOL helps greatly

22
Aggregations
  • Aggregations and MVs are the soul of any DSS
  • Most BI tools supports Aggregation Awareness
  • Have multiple aggregations
  • Aggregations help users with adhoc queries
  • Daily, Monthly and Yearly Aggregations are very
    common in most DSS

23
Compression
  • Saves Disk Space by 40 to 50
  • Reduces Logical IO
  • Reduces Physical IO
  • Reads will be fast
  • DMLs will be slow
  • Compress Table as well as Index
  • Caveat You cant uncompress after the table is
    compressed
  • ORA-01735 invalid ALTER TABLE option

24
Materialized View
  • Fast Refresh may be very slow
  • From 10G MV can be parallel
  • MVs can be partitioned
  • MV_CAPABILITY results can be misleading.
  • ALTER MATERIALIZED VIEW ltmv_namegt parallel
    (degree 4 )
  • For MV Fast Refresh to be successful a Complete
    Refresh should happen before

25
Exchange Partitions
  • Very Useful
  • Dictionary update only
  • Cant Exchange a table with bitmap indexes with a
    partition
  • Partition exchange has issue with BITMAP
    indexes with the ora error for mismatch indexes
    0RA-14098

26
READONLY Tablespaces
  • Data Warehouse has time variant non-volatile data
  • Say Range Partition on TIME, and making historic
    tablespaces READONLY helps Database Checkpoint
    process

27
Data Archival
  • With various regulatory and internal requirements
    data needs to be retained for 2 to 30 years.
  • Data growth is exponential
  • Archival is needed to start it small and keep it
    small
  • Saves in Database licenses and maintenance.
  • Helps the optimizer to get results faster from a
    smaller set

28
Rolling Partitions
  • If design permits instead of creating new
    partitions every time the same partition can be
    reused again and again.
  • Like SUNDAY can be reloaded again on the same
    partition next Sunday.
  • Rolling Partitions by HOUR or by DAY of the WEEK
    can be considered
  • Helps Data Retention Strategies too.

29
Case of HUGE UNDO
  • More than 30G of UNDO was getting generated for a
    1.5G table
  • Fix the code and fix the problem.

30
Misleading Vlock
  • Blocking locks wont show in vlock but locks
    would exist
  • Use xkgllk or xkglpn to identify and kill the
    blocking sessions.

31
Package Invalidations
  • Package gets invalidated but cant recompile
    itself because of sessions holding them invisibly
  • Coding and deployment standards can help

32
ORA-02049 timeout distributed transaction
waiting for lock
  • Flush the Shared Pool, the failures go away
  • From 10G you can avoid bounces by flushing
    buffer_cache and shared_pool

33
Again?
34
Stuck in traffic? Meet the new supercar based on
Ferrari that could fly you out of jams. Only
500,000.
35
Flying Ferrari
Write a Comment
User Comments (0)
About PowerShow.com