Title: Using i* modeling for the multidimensional design of data warehouses
1 Using i* modeling for the multidimensional design of data warehouses
- Jose-Norberto Mazón, jnmazon_at_dlsi.ua.es
- Juan Trujillo, jtrujillo_at_dlsi.ua.es
- Toronto, 17th July 2008
2 Contents
- Introduction
- Current research
- Requirements for DWs
- Reconciling with data sources
- Deriving logical representations
- Conclusions and short term research
4 Introduction: Research problem
- Data warehouse
- Integrated collection of historical data in support of the decision making process
- Multidimensional (MD) modeling
- Fact
- Contains interesting measures of a business process
- Dimension
- Represents context of analysis
- Resembles traditional method for database design
- Model at conceptual level
- Abstracting details related to specific technologies
9 Introduction: Research problem
- Integrated collection of historical data in support of decision makers
[Figure: data warehouse architecture. Labels: DATA SOURCES (INTERNAL, EXTERNAL), ETL, DATA WAREHOUSE, OLAP, CUBES, REPORTS, DATA MINING, WHAT-IF ANALYSIS, DECISION MAKERS]
- Information needs cannot be understood by only analyzing data sources
- Decision making processes must be understood by designers
10 Introduction: Drawbacks of the state of the art
- Only data sources are analyzed to define the conceptual MD model
- Incorrect information needs may be modeled
- Requirements are specified once the conceptual MD model is defined (even after the deployment of the DW)
- Incorrect MD elements may be modeled
- Requirements and data sources are not reconciled
- Complex ETL processes to populate the DW
- Thus, the DW is not viewed as a valuable resource
13 Introduction: Novelty of our proposal
- 1. Explicit requirement analysis stage
- Focus on decision making processes
- Information requirements
- 2. Transformation to a conceptual MD model
- Model Driven approach
- MD model agrees with decision makers' expectations
- 3. Reconcile requirement model with data sources
- MD model satisfies decision makers' needs
- MD model agrees with data sources
- Completeness
- Faithfulness
14 Introduction: Objectives of our proposal
- Defining a goal-oriented approach for DWs
- Based on i*
- Model decision processes
- Decision makers are concerned about GOALS, not directly DATA
- Traceability to a conceptual MD model
- Align with MDA
- Integrate requirements and data sources
15 MDA
- Model Driven Architecture (MDA)
- Object Management Group (OMG) standard
- Using models in software development
- Computation Independent Model (CIM)
- Platform Independent Model (PIM)
- Platform Specific Model (PSM)
- Transformations between models
- Query/View/Transformation language (QVT)
- The code is obtained from PSMs
16 MDA
- Model Driven Architecture (MDA)
- CIM: describes user requirements
- PIM: contains information about the functionality and structure of the system, without taking into account the technology used to implement it
- PSM: includes information about the specific technology used to implement the system on a specific platform
- Code: every PSM is transformed into code to be executed, obtaining the final software product
17 MDA
- Query/View/Transformation language (QVT)
- Declarative part of QVT
- Transformation: a set of relations
- Relations between metamodels are formally defined and automatically performed
- Relations applied to models
18 MDA
- The declarative approach of QVT specifies relationships that must hold between candidate models
[Figure: QVT relation R between MODEL 1 and MODEL 2. Labels: CANDIDATE MODEL, DOMAIN, METAMODEL NAME, KIND OF RELATION, WHEN / WHERE CLAUSES]
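The idea that a relation either holds or does not hold between two candidate models can be illustrated with a small Python sketch. This is not QVT syntax and not the authors' transformation: `Element`, `relation_holds`, and the Class/Table sample models are hypothetical names chosen only to show a check-only reading of one relation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    kind: str  # metamodel type of the element, e.g. "Class" or "Table" (assumed)
    name: str


def relation_holds(source, target, source_kind, target_kind):
    """Check-only reading of a relation: every source element of source_kind
    must be matched by a target element of target_kind with the same name."""
    target_names = {e.name for e in target if e.kind == target_kind}
    return all(e.name in target_names for e in source if e.kind == source_kind)


# Hypothetical candidate models.
model1 = [Element("Class", "Sales"), Element("Class", "Customer")]
model2 = [Element("Table", "Sales"), Element("Table", "Customer")]

print(relation_holds(model1, model2, "Class", "Table"))
```

In enforcement mode a QVT engine would also create or update the missing target elements; the sketch only covers checking.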
19 Introduction: Our proposal
[Figure: overview of the proposal, annotated with publications: DOLAP 2005, DaWaK 2006, DSS 2008; REBNITA 2005, RIGiM 2007; ER 2006, ER 2007, DKE 2007]
21 Requirements for DWs
- Goal Oriented Requirement Engineering
- DW supports the decision making process to fulfill goals of an organization
- Decision makers are concerned about goals
- Information requirements are obtained by refining decision makers' goals
- MDA approach
- Information requirements must be derived into a conceptual MD model
22 Requirements for DWs
- CIM
- Goals and information requirements
- PIM
- Conceptual MD model
- QVT
- Transformation between models
23 Requirements for DWs: Defining a CIM
- Classification of DW goals
- Strategic goals
- Change to a better situation
- Decision goals
- Take appropriate actions
- Information goals
- Related to required information
- Information requirements
- Interesting measures of business process
- Context of analysis
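The goal classification above can be sketched as a small data structure. The following Python is only an illustration under stated assumptions: the refinement order (strategic to decision to information), the names `Goal`, `refine`, and `information_requirements`, and the sample goals are all hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass, field

# Assumed refinement order between goal kinds (illustrative only).
REFINES_TO = {"strategic": "decision", "decision": "information"}


@dataclass
class Goal:
    kind: str  # "strategic", "decision", or "information"
    text: str
    subgoals: list = field(default_factory=list)
    requirements: list = field(default_factory=list)  # information requirements


def refine(parent, child):
    """Attach child under parent, enforcing the assumed refinement order."""
    if REFINES_TO.get(parent.kind) != child.kind:
        raise ValueError(
            f"a {parent.kind} goal cannot be refined into a {child.kind} goal"
        )
    parent.subgoals.append(child)
    return child


def information_requirements(goal):
    """Collect every information requirement reachable from goal."""
    found = list(goal.requirements)
    for sub in goal.subgoals:
        found.extend(information_requirements(sub))
    return found


# Hypothetical example goals.
strategic = Goal("strategic", "Increase sales")
decision = refine(strategic, Goal("decision", "Decide which promotions to run"))
info = refine(decision, Goal("information", "Analyze product sales"))
info.requirements.append(
    {"measures": ["quantity_sold"], "context": ["product", "store"]}
)
```

Each information requirement records the two things the slide lists: measures of the business process and the context of analysis.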
24 Requirements for DWs: Defining a CIM
- i* framework
- Modeling goals of decision makers and the required tasks and resources to fulfil them
- Several decision makers with different goals
- Two extensions of UML
- Profile for i*
- Profile for adapting i* to the DW domain
26Requirements for DWsSample CIM
27Requirements for DWsSample CIM
28Requirements for DWsSample CIM
29Requirements for DWsSample CIM
30Requirements for DWsSample CIM
31Requirements for DWsSample CIM
32 Conceptual MD model
- UML Profile for MD modeling
- Luján, Trujillo, Song. A UML profile for Multidimensional Modeling in Data Warehouses. Data and Knowledge Engineering. 2006.
- Class diagram
- Stereotypes (icons not reproduced): Fact, Dimension, Base, FactAttribute, DimensionAttribute, Rolls-UpTo (<<Rolls-UpTo>>)
33 Conceptual MD model [figure]
34-35 Conceptual MD model: Obtaining an initial PIM [figures]
36 Conceptual MD model: Sample initial PIM [figure]
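Since the figures for this step are not available, the following Python sketch gives one possible reading of how an initial PIM could be obtained from information requirements: measures become fact attributes and contexts of analysis become dimensions. It is an assumption-laden illustration, not the authors' QVT transformation; `InformationRequirement`, `Fact`, `to_initial_pim`, and the sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class InformationRequirement:
    business_process: str
    measures: list  # interesting measures of the business process
    contexts: list  # contexts of analysis


@dataclass
class Fact:
    name: str
    fact_attributes: list = field(default_factory=list)
    dimensions: list = field(default_factory=list)


def to_initial_pim(requirements):
    """Merge requirements about the same business process into one Fact."""
    facts = {}
    for req in requirements:
        fact = facts.setdefault(req.business_process, Fact(req.business_process))
        for measure in req.measures:
            if measure not in fact.fact_attributes:
                fact.fact_attributes.append(measure)
        for context in req.contexts:
            if context not in fact.dimensions:
                fact.dimensions.append(context)
    return list(facts.values())


# Hypothetical requirements for a single business process.
reqs = [
    InformationRequirement("Sales", ["quantity"], ["Product", "Store"]),
    InformationRequirement("Sales", ["income"], ["Product", "Customer"]),
]
pim = to_initial_pim(reqs)
```

The sketch ignores hierarchy levels (Base classes and Rolls-UpTo associations), which the real profile also models.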
37 Reconciling with data sources
[Figure: reconciliation process. Labels: USER REQUIREMENTS, INITIAL PIM, RECONCILIATION, DATA SOURCES, PIM, PSM]
38 Reconciling with data sources
- The MD conceptual model is reconciled with the available data sources
- The DW will be properly populated from data sources
- The analysis potential provided by the data sources is captured by the MD conceptual model
- Redundancies are avoided
- Optional dimension levels are controlled to enable summarizability and to avoid inconsistent queries
- The reconciliation process is automatically performed
- QVT relations based on Multidimensional Normal Forms
- Lechtenbörger and Vossen. Multidimensional normal forms for data warehouse design. Information Systems 28 (2003)
39 Reconciling with data sources [figure]
40 Reconciling with data sources
[Figure: <<Rolls-upTo>> association with ends d (multiplicity 1..n) and r (multiplicity 1); n_t1 = district, n_t2 = state]
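The multidimensional normal forms of Lechtenbörger and Vossen are built on functional dependencies, so one way to picture the district/state example is as a check that the data source supports the roll-up: every district must belong to exactly one state. The Python below is a minimal sketch of that check on sample rows, not the QVT relation used by the authors; `functionally_determines` and the rows are hypothetical.

```python
def functionally_determines(rows, lower, upper):
    """Return True if each value of `lower` maps to exactly one value of `upper`."""
    seen = {}
    for row in rows:
        # setdefault stores the first upper value seen for this lower value
        if seen.setdefault(row[lower], row[upper]) != row[upper]:
            return False
    return True


# Hypothetical rows from a data source.
source_rows = [
    {"district": "D1", "state": "S1"},
    {"district": "D2", "state": "S1"},
    {"district": "D1", "state": "S1"},
]

# district -> state holds, so a roll-up from district to state is supported.
print(functionally_determines(source_rows, "district", "state"))
# state -> district does not hold, so the reverse roll-up is not supported.
print(functionally_determines(source_rows, "state", "district"))
```

Checking instances can only refute a dependency; in practice the dependencies would come from the source schema (keys and constraints) rather than from sampled data.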
41 Deriving logical representations
- PIM
- UML profile for MD modeling (Luján et al., DKE 2006)
- PSM
- Common Warehouse Metamodel (CWM)
- From PIM to each PSM
- QVT transformation
42 Deriving logical representations
- Common Warehouse Metamodel (CWM)
- Resource layer
- Standard to represent the structure of data according to certain technologies
- Relational metamodel
- Tables, columns, primary keys, and so on
- Multidimensional metamodel
- Generic data structures
- Vendor specific extension
- Oracle Express extension
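To make the relational target more concrete, the sketch below derives a star-like set of tables, columns, and primary keys from a fact with its measures and dimensions. It is an illustration in plain Python dictionaries, not the CWM relational metamodel or the authors' QVT transformation; `to_relational`, the `_id` naming scheme, and the sample fact are assumptions.

```python
def to_relational(fact_name, measures, dimensions):
    """Derive tables, columns, and primary keys for one fact and its dimensions."""
    tables = {}
    for dim in dimensions:
        key = f"{dim}_id"  # assumed surrogate key naming scheme
        tables[dim] = {"columns": [key], "primary_key": [key]}
    foreign_keys = [f"{dim}_id" for dim in dimensions]
    tables[fact_name] = {
        # the fact table references every dimension and stores the measures
        "columns": foreign_keys + list(measures),
        "primary_key": foreign_keys,
    }
    return tables


# Hypothetical fact with two measures and two dimensions.
schema = to_relational("sales", ["quantity", "income"], ["product", "store"])
for name, table in schema.items():
    print(name, table)
```

A multidimensional PSM would be derived from the same PIM by a different transformation, which is the point of keeping the PIM platform independent.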
44 Conclusions: Objectives
- DW projects fail to support the decision making process
- The requirement analysis stage is overlooked when defining a conceptual MD model
- Using the i* framework together with MDA
45 Conclusions: Scientific contributions
- MDA framework
- UML profile for i*
- Extension for using i* in the DW domain
- Transformations to obtain a conceptual MD model
- Several kinds of logical representations
- Multidimensional normal forms
- Reconciling data sources and requirements in a hybrid approach
- Eclipse-based prototype
46 Eclipse-based prototype [figure]
47 Conclusions: Related work at LUCENTIA research group
[Figure labels, in slide order]
- MDA: DKE 2007, DSS 2008
- Requirements for DWs: RIGiM 2007
- CIM
- UML profile for data mining: DKE 2007
- UML profile for MD modeling: DKE 2006
- Data sources analysis: ER 2007
- PIM
- Common Warehouse Metamodel
- Security: DSS 2006, IS 2007
- UML for physical modeling: JCIS 2006
- PSM
48 Short term research
- Studying unstructured decision processes in depth to model them in i* diagrams
- Taking advantage of every i* feature
- Considering complex mechanisms to reason about goals and structure decision processes
- Prioritization of goals