BIN315 : SSIS Connectivity With ETL Best Practice - PowerPoint PPT Presentation

1 / 33
About This Presentation
Title:

BIN315 : SSIS Connectivity With ETL Best Practice

Description:

Describe connectivity and extensibility in SSIS ... Who are our connectivity partners? What data sources can SSIS load from/to? ... – PowerPoint PPT presentation

Number of Views:412
Avg rating:3.0/5.0
Slides: 34
Provided by: gal89
Category:

less

Transcript and Presenter's Notes

Title: BIN315 : SSIS Connectivity With ETL Best Practice


1
BIN315 SSIS Connectivity WithETL Best Practice
  • Itamar Ben Hemo
  • CO-CEO Vision.bi
  • Itamar_at_vision.bi

2
Session Objectives
  • Design ETL Why, What and How
  • Describe connectivity and extensibility in SSIS
  • Explain how the various data sources are
    supported

Key Takeaways
  • ETL Design - Success keys to make it smart and
    simple
  • ?? ??? ???? ???? ???? ?? ????"
  • Who are our connectivity partners?
  • What data sources can SSIS load from/to?
  • What to expect from these connections in terms
    of performance and 64bit?

3
ETL Why, What and How
Extract, Transform and Load Data from Operational
system to DWH environment (engine for
data-centric-integration). Data is recognized as
a strategic asset ,Many large organizations are
data processing engines E-World, Insurance,
banking, financial services The ETL Must be
clear, stable, easy for maintenance and with the
best of data quality (Bad/missing data have
greater implied costs)
4
Independencies between packages
5
Referential Integrity
Dont Handle Referential Integrity in the SSIS
ETL packages 5 Advantages for separating the
referential integrity Simple ETL packages- best
performance, simple for testing Independencies
between packages, dimensions and facts Save
memorySave maintenance timeOne convention for
all shared dimensions
6
DWH Sample Architecture
7
Operational Fields
Sources - SourceCreationTime, SourceUpdateTime DWH
- DwhCreationTime, DwhUpdateTime Dimensions
RI_Flag 5 Advantages for working with
Operational Fields Testing In development and
production Nearly real time DWH Gap time
between OS to DWH Fast recovery from garbage
data Quality Assurance - Rate of RI records for
each dimension
8
Configuration files and Parameters
Incremental Repository One central table with
parameters for all source Table_ID, SourceDB,
SourceSchema, SourceTable, StatusLastETL,
NoRecordsLastETL, FromID, ToID, ChunkSize
Separate Configuration files (one for each Data
Source)- XML with Environment parameters 5
Advantages for working with MngSourceTables and
Configuration file Production Testing
Parameters and values for Quality Assurance tests
(Status, NoRecords, From, To) Recovery from
unexpected gap (Chunk size) History Migration at
the first upload (Chunk size)Transfer between
environments (Configuration Dev, Testing,
Production)Management all incremental processes
in one place
9
Standard and other small tips- Dim
Dimension ETL Truncate insert to STGUpdate
Else Insert to DWHIf data in the source is
sorted Set IsSortedTrue. Performance- save
unnecessary sorting
10
Standard and other small tips
  • Convention names- Documentation, Debugging and
    Package execution reports
  • Select just needed columns save warnings,
    Save execution time
  • OLE DB Source component SQL statement
    Select a.Version_id,a.Version_Name From
  • Lookup / Merge join Not absolute answer (even
    though Lookup should be faster specially in
    small tables). Anyway when using Lookup -gt
    use cache type Full/partial

11
Standard and other small tips
  • Insert log into table (for analyzing and fine
    tuning)
  • Filter the data in the source adapter- if the
    source is file use the conditional split
  • Fine tuning of DefaultBufferMaxSize and
    DefaultBufferMaxRows properties

12
Standard and other small tips
  • Use Templates for one standard (Logging, Event
    handler and configuration), Our
    recommendation

13
Data Sources
14
Data Source Categories
15
Application Systems
16
Relational DB Systems
17
Structured Semi-Structured Data
18
Queue Systems Protocols
19
Providers Provider Stack
Source Provider
Destination Provider
20
Providers Provider Stack
21
SSIS vs. Provider Stacks
22
SSIS Boundary
23
SSIS Boundary
24
SQL Server(built-in)Support
25
64-bit Support
26
Office (2007) Support
27
Oracle Support
28
CautionOracle Number type
29
IBM DB2 Support
30
SAP Integration
31
Sowhat can SSIS connect to?
32
Summary
  • ETL Design - Design it smart and simple
    Success keys to make it smart and simple
  • ?? ??? ???? ???? ???? ?? ????"
  • Who are our connectivity partners?
  • What data sources can SSIS load from/to?
  • What to expect from these connections in terms
    of performance and 64bit?

33
(No Transcript)
Write a Comment
User Comments (0)
About PowerShow.com