A Hybrid Model for Data Synchronism in Data Warehouse Projetcs - PowerPoint PPT Presentation

1 / 21
About This Presentation
Title:

A Hybrid Model for Data Synchronism in Data Warehouse Projetcs

Description:

Analytical DB dos not contain any pre-aggregated information. Any aggregated value requirement ... Dr dynamic model (real time synchronism) ... – PowerPoint PPT presentation

Number of Views:28
Avg rating:3.0/5.0
Slides: 22
Provided by: terryk1
Category:

less

Transcript and Presenter's Notes

Title: A Hybrid Model for Data Synchronism in Data Warehouse Projetcs


1
A Hybrid Model for Data Synchronism in Data
Warehouse Projetcs
  • Isabel Cristina Italiano
  • ici_at_ime.usp.br

João Eduardo Ferreira jef_at_ime.usp.br
Universidade de Sao Paulo - Brazil
Seventh International Database Engineering and
Applications Symposium IDEAS 2003 Hong Kong
2
Data Warehouse Synchronism Model
CriticalInformation
3
Objective
  • Introduce a framework to define the most suitable
    synchronism option
  • Parameters and function
  • For each data portion

4
Dynamic Synchronism Model
  • Dynamic Synchronism updates on transactional
    environment reflected immediately on analytical
    environment
  • DW loading divided among various maintenance
    transactions
  • Data portions based on business niches
  • Solution for paralelization of static and dynamic
    loadings

5
Hybrid Synchronism Model
Static DW
  • Portions are synchronized in different time
    intervals
  • Based on Characteristics of
  • Transaction/Operational Environment
  • Analytical Applications Environment

Dynamic DW
6
Transactional Environment Characteristics
  • Transactional Environment
  • Set of functions that generates information
    stored in Data Warehouse
  • Functions that handle data sets
  • Regarded as functional units in transactional
    environment

7
Transactional Parameter P1Applicability
  • The need for dynamic maintenance
  • Typically static
  • Transactional system has characteristics that
    require a typically static loading, with no
    requirement for implementing dynamic updates
  • Dynamic
  • Transactional system requires a data warehouse
    dynamic update, considering the analyzed set of
    transactions

8
Transactional Parameter P2Adaptability
  • The adaptation capacity of the transactional
    environment
  • Dynamic loading is allowed to be implemented
  • Requirements could be added for gathering and
    propagating information to the data warehouse and
    the transactions involved
  • Dynamic loading is not allowed to be implemented
  • There is not a minimum control level to implement
    the functions

9
Transactional Parameter P3Time Interval
  • The desirable time interval for synchronism
  • Minimum Possible (Real Time)
  • Sysnchronism must be in real time, keeping only
    the minimum time required for information
    transmission and processing
  • Determined Interval
  • Application requires synchronism in a certain
    period of time, longer than required for real
    time, but shorter than the static loading
    frequency

10
Transactional Parameter P3Time Interval
  • The desirable time interval for synchronism
    (cont.)
  • Postponed Synchronism
  • Synchronism can be kept at regular intervals
    determined by the data warehouse update static
    process during its periodic loading (eg. day-1,
    day-2, month etc.)

11
Transactional Parameter P4Complexity in
Propagation
  • Complexity in Propagation number of different
    objects involved in the transaction during a
    transactional database update
  • Low
  • Small number of distinct objects involved in the
    update (1 or 2)
  • Average
  • From 3 to 5 distinct objects involved
  • High
  • More than 5 distinct objects involved

12
Transactional Parameter P5Complexity in
Consistency
  • Complexity in Consistency degree of complexity
    of check and validation operations
  • Low
  • There is not, or it is too small the need to
    validate or check other elements in addition to
    the information handled by the transaction
    (simple selections with no more complex
    operations)
  • Average
  • Average level of complexity when preparing
    information for loading (joining-related
    selection operations)
  • High
  • Transaction requieres highly complex checks and
    validations when preparing information to be
    transmitted to DW (joins, group bys, having
    operators)

13
Analytical Environment Characteristics
  • Analytical Environment
  • Different types of users, queries
  • Variety of views of the DW
  • Numerous combinations of aggregations and
    detailed data
  • Several strategies to implement dynamic
    aggregations (on fly), virtual views,
    materialized views, summarization tables, cubes

14
Analytical Parameter P6Use of Aggregates
  • The use of aggregates in analytical queries
  • Just Detail
  • Queries involved do not use aggregates em,
    therefore, can only access information at a lower
    granularity level
  • Aggregates
  • Queries perform functions such as sum, count,
    group by and others, supplying aggregate
    information to the user

15
Analytical Parameter P7Aggregates Implementation
  • How aggregates are implemented in the analytical
    base
  • None
  • Analytical DB dos not contain any pre-aggregated
    information. Any aggregated value requirement
  • Virtual Views with Aggregates
  • Analytical DB contains aggregation views (not
    materialized) that can be used by the queries
    (computed during queries executions)

16
Analytical Parameter P7Aggregates Implementation
  • How aggregates are implemented in the analytical
    base (cont.)
  • Materialized views with aggregates
  • Analytical DB contains tabels or views of the
    materialized aggregations, which contain
    aggregated information required by the inquiry
  • Cubes
  • All aggregation levels required by the queries
    have been preciously prepared and stored in cubes

17
Decision Trees
  • Parameters must be jointly evaluated
  • Displayed as decision trees representing all
    possible combinations
  • Leaves represent the select synchronism level
  • Dr dynamic model (real time synchronism)
  • Ddt dynamic model with synchronism in a certain
    time interval (?t)
  • St static model with synchronism based on
    periodic loadings
  • -- combination path is invalid

18
The Parameter Analysis Function
  • Applied on information subsets contained in data
    warehouse
  • Analysis considering that portion of DW
    information and the characteristics of the
    transactional environment that originates that
    information
  • Based on values of decision trees parameters

19
The Parameter Analysis Function
  • Result the most suitable synchronism option to
    an specific data portion
  • Prototype was developed
  • Requirements change over the time
  • New business requirements
  • New analysis requirements
  • DW volume different aggregation strategies
  • Frequent monitoring to identify possible changes
    in the synchronism model

20
Conclusion
  • We have presented
  • Customized solution for different processes of
    data update between transactional and analytical
    environments
  • We have proposed
  • Hybrid data synchronism model for data warehouse
    projects
  • Guidelines for transactions classification
  • Parameters and function to determine the most
    suitable synchronism option

21
Future Works
  • Incorporation of the prototype developed through
    data transfer tools
  • Evaluation and extension of the hybrid data model
    for heterogeneous databases
  • Methodology for using hybrid models for data
    synchronism in data warehouse projetcs
Write a Comment
User Comments (0)
About PowerShow.com