Sosialisasi A2 - PowerPoint PPT Presentation

About This Presentation
Title:

Sosialisasi A2

Description:

Introduction to Data Warehousing and OLAP. An OLAM Architecture: Data Warehouse, Meta Data, MDDB, OLAM Engine, OLAP Engine, User GUI API, Data Cube API, Database API, Data cleaning ...

Date added: 19 January 2020
Slides: 37
Provided by: ImasSSit
Learn more at: http://www.biomaterial.lipi.go.id
Transcript and Presenter's Notes

1
Introduction to Data Warehousing and OLAP
2
Agenda
  • Definition of a data warehouse
  • The multidimensional data model
  • OLAP operations
  • Data warehouse architecture
  • Uses of a data warehouse

3
What Is Data Warehousing?
  • A data warehouse is a subject-oriented,
    integrated, time-variant, and nonvolatile
    collection of data in support of the
    decision-making process.
  • It is often integrated with various application
    systems to support information processing and
    data analysis by providing a platform for
    historical data.
  • Data warehousing: the process of constructing
    and using data warehouses.

4
Data Warehouse: Subject-Oriented
  • A data warehouse is organized around major
    subjects such as customer, product, and
    sales.
  • It focuses on modeling and analyzing data for
    decision making, not on daily operations or
    transaction processing.
  • It provides a simple and concise view of a
    particular subject by excluding data that is
    not useful in the decision-making
    process.

5
Data Warehouse: Integrated
  • Constructed by integrating multiple
    heterogeneous data sources.
  • relational databases, flat files, on-line
    transaction records
  • Data cleaning and data integration techniques
    are applied
  • to ensure consistency in naming conventions,
    encoding structures, attribute measures, etc.
    among the different data sources.
  • Example: hotel price (currency, tax, breakfast
    covered, etc.)
  • Data is converted when it is moved to the warehouse.
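
The hotel price example can be sketched as a conversion step. This is only an illustration: the two source layouts, the field names, the exchange rate, and the tax rate are all assumptions made for the example, not values from the slides.

```python
from dataclasses import dataclass

# Hypothetical conversion constants; real values would come from reference data.
USD_PER_EUR = 1.10
TAX_RATE = 0.10


@dataclass
class HotelPrice:
    hotel: str
    price_usd: float          # warehouse convention: USD, tax included
    breakfast_included: bool


def from_source_a(rec: dict) -> HotelPrice:
    # Assumed source A: price in USD with tax included, breakfast as "Y"/"N".
    return HotelPrice(rec["name"], round(rec["price"], 2), rec["bfast"] == "Y")


def from_source_b(rec: dict) -> HotelPrice:
    # Assumed source B: price in EUR before tax, breakfast as a boolean.
    usd = rec["eur_net"] * USD_PER_EUR * (1 + TAX_RATE)
    return HotelPrice(rec["hotel_name"], round(usd, 2), rec["breakfast"])


# Both sources end up in one consistent representation.
rows = [
    from_source_a({"name": "Hotel A", "price": 120.0, "bfast": "Y"}),
    from_source_b({"hotel_name": "Hotel B", "eur_net": 100.0, "breakfast": False}),
]
```

After conversion, every row uses the same currency, the same tax convention, and the same encoding for the breakfast flag, so prices from different sources can be compared directly.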

6
Data Warehouse: Time-Variant
  • Data is stored to provide information from a
    historical perspective, e.g., the past 5-10
    years.
  • Key structure in the data warehouse
  • Contains an element of time, either explicitly
    or implicitly.
  • The key of operational data, by contrast, may
    or may not contain a time element.

7
Data Warehouse: Nonvolatile
  • A data warehouse is a physically separate
    store of data transformed from the
    operational environment.
  • A data warehouse does not require transaction
    processing, recovery, or concurrency control
    mechanisms.
  • It usually requires only two operations in
    data accessing: initial loading of data
    and access of data.

8
OLAP (On-Line Analytical Processing)
  • OLAP is a class of database operations for
    obtaining summarized data, with aggregation
    as the main mechanism.
  • There are three types:
  • Relational OLAP (ROLAP)
  • Multidimensional OLAP (MOLAP)
  • Hybrid OLAP (HOLAP): splits data between
    relational tables and specialized storage.

9
Data Warehouse vs. Operational DBMS
  • OLTP (on-line transaction processing)
  • Major task of traditional relational DBMS
  • Day-to-day operations: purchasing, inventory,
    banking, manufacturing, payroll, registration,
    accounting, etc.
  • OLAP (on-line analytical processing)
  • Major task of data warehouse system
  • Data analysis and decision making
  • Distinct features (OLTP vs. OLAP)
  • User and system orientation: customer vs. market
  • Data contents: current, detailed vs. historical,
    consolidated
  • Database design: ER + application vs. star +
    subject
  • View: current, local vs. evolutionary, integrated
  • Access patterns: update vs. read-only but complex
    queries

10
(No Transcript)
11
From Tables and Spreadsheets to Data Cubes
  • A data warehouse is based on a multidimensional
    data model, in which data is viewed in the
    form of a data cube
  • A data cube, such as sales, allows data to be
    modeled and viewed in multiple dimensions
  • Dimension tables, such as item (item_name, brand,
    type) or time (day, week, month, quarter, year)
  • The fact table contains measures (such as
    dollars_sold) and keys to each of the related
    dimension tables.
  • The n-D base cube is called the base cuboid. The
    0-D cuboid is the topmost cuboid; it holds the
    highest level of summarization and is called
    the apex cuboid. The lattice of cuboids
    forms a data cube.

12
Cube: A Lattice of Cuboids
  • 0-D (apex) cuboid: all
  • 1-D cuboids: (time), (item), (location),
    (supplier)
  • 2-D cuboids: (time, item), (time, location),
    (time, supplier), (item, location),
    (item, supplier), (location, supplier)
  • 3-D cuboids: (time, item, location),
    (time, item, supplier),
    (time, location, supplier),
    (item, location, supplier)
  • 4-D (base) cuboid: (time, item, location,
    supplier)
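
The lattice can be enumerated mechanically: each cuboid corresponds to one subset of the dimensions. The sketch below assumes no concept hierarchies within a dimension, so four dimensions give 2^4 = 16 cuboids; the variable names are illustrative.

```python
from itertools import combinations

dimensions = ("time", "item", "location", "supplier")

# Group every subset of the dimensions by its size k (the k-D cuboids).
lattice = {
    k: list(combinations(dimensions, k))
    for k in range(len(dimensions) + 1)
}

for k, cuboids in lattice.items():
    print(f"{k}-D:", cuboids)

# Total number of cuboids in the lattice.
total = sum(len(c) for c in lattice.values())
```

The empty tuple at level 0 plays the role of the apex cuboid ("all"), and the single tuple at level 4 is the base cuboid.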
13
Conceptual Modeling of Data Warehouses
  • Star schema: a fact table in the middle
    connected to a set of dimension
    tables.
  • Snowflake schema: a refinement of the star schema
    in which a dimensional hierarchy is normalized
    into a set of smaller dimension tables
  • Fact constellation: multiple fact tables
    share the same dimension tables; it can be
    viewed as a collection of star schemas
    and is therefore also called a galaxy
    schema.
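
A star schema can be sketched with Python's built-in sqlite3 module. The table names, column names, and rows below are hypothetical; only the overall shape (one fact table holding measures and foreign keys to dimension tables) follows the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT, type TEXT);
CREATE TABLE fact_sales (
    time_key INTEGER REFERENCES dim_time(time_key),
    item_key INTEGER REFERENCES dim_item(item_key),
    dollars_sold REAL
);
INSERT INTO dim_time VALUES (1, 'Q1', 2001), (2, 'Q2', 2001);
INSERT INTO dim_item VALUES (1, 'TV', 'BrandX', 'electronics'),
                            (2, 'Radio', 'BrandY', 'electronics');
INSERT INTO fact_sales VALUES (1, 1, 400.0), (1, 2, 100.0), (2, 1, 250.0);
""")

# Join the fact table to one dimension table and aggregate the measure.
sales_by_quarter = conn.execute("""
    SELECT t.quarter, SUM(f.dollars_sold)
    FROM fact_sales AS f
    JOIN dim_time AS t ON t.time_key = f.time_key
    GROUP BY t.quarter
    ORDER BY t.quarter
""").fetchall()
```

A snowflake variant would split a dimension such as dim_item further, for example moving brand into its own table referenced by a key.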

14
(No Transcript)
15
(No Transcript)
16
(No Transcript)
17
A Concept Hierarchy: Dimension (Location)
  • all: all
  • region: Europe, ..., North_America
  • country: Germany, ..., Spain, Canada, ..., Mexico
  • city: Frankfurt, ..., Vancouver, ..., Toronto
  • office: L. Chan, ..., M. Wind
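
Rolling a measure up this hierarchy amounts to summing children into their parent at each level. The sketch below uses the cities, countries, and regions named on the slide; the sales figures and the helper name roll_up are hypothetical.

```python
from collections import defaultdict

# Parent links for the location hierarchy: city -> country -> region.
city_to_country = {"Vancouver": "Canada", "Toronto": "Canada", "Frankfurt": "Germany"}
country_to_region = {
    "Canada": "North_America",
    "Mexico": "North_America",
    "Germany": "Europe",
    "Spain": "Europe",
}

# Hypothetical sales per city (not from the slides).
sales_by_city = {"Vancouver": 10, "Toronto": 20, "Frankfurt": 5}


def roll_up(values: dict, parent: dict) -> dict:
    """Aggregate values one level up the hierarchy by summing children."""
    out = defaultdict(int)
    for child, value in values.items():
        out[parent[child]] += value
    return dict(out)


sales_by_country = roll_up(sales_by_city, city_to_country)
sales_by_region = roll_up(sales_by_country, country_to_region)
```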
18
Data Warehouse Views and Hierarchies
  • Specification of hierarchies
  • Schema hierarchy
  • day < {month < quarter; week} < year
  • Set_grouping hierarchy
  • {1..10} < inexpensive
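
Both kinds of hierarchy can be sketched as small mapping functions. In the time hierarchy, week sits on a separate branch from month and quarter because weeks do not nest inside months. The function names, the use of ISO week numbering, and the "ungrouped" label are assumptions for illustration.

```python
from datetime import date


def time_levels(d: date) -> dict:
    """Map a day to its ancestors in the schema hierarchy."""
    # ISO weeks are an assumption here; the ISO year can differ from the
    # calendar year for days near the start or end of a year.
    iso_year, iso_week, _ = d.isocalendar()
    return {
        "day": d.isoformat(),
        "week": f"{iso_year}-W{iso_week:02d}",
        "month": f"{d.year}-{d.month:02d}",
        "quarter": f"{d.year}-Q{(d.month - 1) // 3 + 1}",
        "year": d.year,
    }


def price_group(price: int) -> str:
    """Set-grouping hierarchy: the range 1..10 is grouped as 'inexpensive'."""
    if 1 <= price <= 10:
        return "inexpensive"
    return "ungrouped"  # hypothetical label; the slide defines no other ranges


levels = time_levels(date(2020, 1, 19))
```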

19
Multidimensional Data
  • Sales volume as a function of product, month,
    and region
  • Dimensions: Product, Location, Time
  • Hierarchical summarization paths:
  • Product: Industry, Category, Product
  • Location: Region, Country, City, Office
  • Time: Year, Quarter, Month or Week, Day
20
(No Transcript)
21
Cuboids Corresponding to the Cube
  • 0-D (apex) cuboid: all
  • 1-D cuboids: (product), (date), (country)
  • 2-D cuboids: (product, date), (product, country),
    (date, country)
  • 3-D (base) cuboid: (product, date, country)
22
Browsing a Data Cube
  • Visualization
  • OLAP capabilities
  • Interactive manipulation

23
OLAP Operations
  • Roll up (drill-up): summarize data
  • by climbing up hierarchy or by dimension
    reduction
  • Drill down (roll down): reverse of roll-up
  • from higher level summary to lower level summary
    or detailed data, or introducing new dimensions
  • Slice and dice
  • project and select
  • Pivot (rotate)
  • reorient the cube, visualization, 3D to series of
    2D planes.
  • Other operations
  • drill across: involving (across) more than one
    fact table
  • drill through: through the bottom level of the
    cube to its back-end relational tables (using
    SQL)
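
The basic operations can be sketched on a tiny cube stored as a dictionary from (product, month, region) to sales volume. The data and function names are hypothetical, and this is a toy illustration rather than how an OLAP server implements them. Drill-down is not shown because it needs the more detailed data that a roll-up has already summed away.

```python
from collections import defaultdict

# Hypothetical cube cells: (product, month, region) -> sales volume.
cube = {
    ("TV", "Jan", "Europe"): 10,
    ("TV", "Feb", "Europe"): 20,
    ("TV", "Jan", "Asia"): 5,
    ("Radio", "Jan", "Europe"): 7,
}
DIMS = ("product", "month", "region")


def roll_up(cells, drop):
    """Roll up by dimension reduction: sum out the dimension named `drop`."""
    i = DIMS.index(drop)
    out = defaultdict(int)
    for key, value in cells.items():
        out[key[:i] + key[i + 1:]] += value
    return dict(out)


def slice_(cells, dim, value):
    """Slice: select a single value on one dimension."""
    i = DIMS.index(dim)
    return {k: v for k, v in cells.items() if k[i] == value}


def dice(cells, **allowed):
    """Dice: select on several dimensions, each with a set of allowed values."""
    return {
        k: v for k, v in cells.items()
        if all(k[DIMS.index(d)] in vals for d, vals in allowed.items())
    }


def pivot(cells, order):
    """Pivot: reorder the axes of the cube without changing any value."""
    idx = [DIMS.index(d) for d in order]
    return {tuple(k[i] for i in idx): v for k, v in cells.items()}


by_product_region = roll_up(cube, "month")
jan = slice_(cube, "month", "Jan")
tv_europe = dice(cube, product={"TV"}, region={"Europe"})
rotated = pivot(cube, ("region", "product", "month"))
```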

24
Illustration
  • Illustration of the operations on
    multidimensional data.

25
Data Warehouse Design: A Business Analysis
Framework
  • Four views regarding the design of a data
    warehouse
  • Top-down view
  • allows selection of the relevant information
    necessary for the data warehouse
  • Data source view
  • exposes the information being captured, stored,
    and managed by operational systems
  • Data warehouse view
  • consists of fact tables and dimension tables
  • Business query view
  • sees the perspectives of data in the warehouse
    from the view of end-user

26
Data Warehouse Design Process
  • Top-down, bottom-up approaches or a combination
    of both
  • Top-down: starts with overall design and planning
    (mature)
  • Bottom-up: starts with experiments and prototypes
    (rapid)
  • From software engineering point of view
  • Waterfall: structured and systematic analysis at
    each step before proceeding to the next
  • Spiral: rapid generation of increasingly
    functional systems, with short turnaround
    time
  • Typical data warehouse design process
  • Choose a business process to model, e.g., orders,
    invoices, etc.
  • Choose the grain (atomic level of data) of the
    business process
  • Choose the dimensions that will apply to each
    fact table record
  • Choose the measure that will populate each fact
    table record

27
(No Transcript)
28
Data Warehouse Back-End Tools and Utilities
  • Data extraction
  • get data from multiple, heterogeneous, and
    external sources
  • Data cleaning
  • detect errors in the data and rectify them when
    possible
  • Data transformation
  • convert data from legacy or host format to
    warehouse format
  • Load
  • sort, summarize, consolidate, compute views,
    check integrity, and build indices and
    partitions
  • Refresh
  • propagate the updates from the data sources to
    the warehouse
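
The back-end steps can be sketched as a small pipeline. Everything concrete here (the sample rows, the cleaning rules, the table and index names) is an assumption made for illustration; real back-end tools are far more elaborate.

```python
import sqlite3


def extract():
    # Stand-in for reading from multiple heterogeneous sources (hypothetical rows).
    return [
        {"item": " tv ", "amount": "400"},
        {"item": "Radio", "amount": "n/a"},   # bad measure that cannot be repaired
        {"item": "TV", "amount": "250.5"},
    ]


def clean(rows):
    # Detect errors and rectify them when possible: normalize item names,
    # and drop rows whose amount is not numeric.
    out = []
    for r in rows:
        try:
            amount = float(r["amount"])
        except ValueError:
            continue
        out.append({"item": r["item"].strip().upper(), "amount": amount})
    return out


def transform(rows):
    # Convert to the warehouse format: (item, dollars_sold) tuples.
    return [(r["item"], r["amount"]) for r in rows]


def load(conn, rows):
    # Load the rows and build an index; calling this again with newly
    # extracted rows plays the role of a refresh.
    conn.execute("CREATE TABLE IF NOT EXISTS sales (item TEXT, dollars_sold REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.execute("CREATE INDEX IF NOT EXISTS idx_sales_item ON sales (item)")
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(clean(extract())))
totals = conn.execute(
    "SELECT item, SUM(dollars_sold) FROM sales GROUP BY item"
).fetchall()
```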

29
Three Data Warehouse Models
  • Enterprise warehouse
  • collects all of the information about subjects
    spanning the entire organization
  • Data Mart
  • a subset of corporate-wide data that is of value
    to specific groups of users. Its scope is
    confined to specific, selected groups, such as
    a marketing data mart
  • Independent vs. dependent (directly from
    warehouse) data mart
  • Virtual warehouse
  • A set of views over operational databases
  • Only some of the possible summary views may be
    materialized
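
The difference between a virtual summary view and a materialized one can be sketched with sqlite3. The table and view names are hypothetical. SQLite has no materialized views, so a table filled from the view stands in for one here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical operational table.
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL);
INSERT INTO orders VALUES (1, 'Europe', 100.0), (2, 'Europe', 50.0), (3, 'Asia', 70.0);

-- Virtual warehouse: a summary view computed on demand from operational data.
CREATE VIEW v_sales_by_region AS
    SELECT region, SUM(amount) AS total FROM orders GROUP BY region;

-- Stand-in for a materialized summary: a snapshot table filled from the view.
CREATE TABLE m_sales_by_region AS SELECT * FROM v_sales_by_region;
""")

# A new operational row changes the view's result but not the snapshot.
conn.execute("INSERT INTO orders VALUES (4, 'Asia', 30.0)")
live = dict(conn.execute("SELECT region, total FROM v_sales_by_region"))
stored = dict(conn.execute("SELECT region, total FROM m_sales_by_region"))
```

The view always reflects current operational data at query cost, while the snapshot is fast to read but must be refreshed to stay current.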

30
Data Warehouse Development: A Recommended Approach
[Figure: development flow. Labels: Define a high-level corporate data model; Model refinement; Data Mart; Enterprise Data Warehouse; Distributed Data Marts; Multi-Tier Data Warehouse]
31
OLAP Server Architectures
  • Relational OLAP (ROLAP)
  • Use relational or extended-relational DBMS to
    store and manage warehouse data and OLAP
    middleware to support missing pieces
  • Include optimization of DBMS backend,
    implementation of aggregation navigation logic,
    and additional tools and services
  • greater scalability
  • Multidimensional OLAP (MOLAP)
  • Array-based multidimensional storage engine
    (sparse matrix techniques)
  • fast indexing to pre-computed summarized data
  • Hybrid OLAP (HOLAP)
  • User flexibility, e.g., low level: relational,
    high-level: array
  • Specialized SQL servers
  • specialized support for SQL queries over
    star/snowflake schemas
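
The MOLAP idea of sparse storage plus fast lookup of pre-computed summaries can be sketched with plain dictionaries. This is a toy model under stated assumptions: the data is hypothetical, "*" is an invented marker for a summed-out dimension, and real engines use compressed array structures rather than Python dicts.

```python
from itertools import combinations

DIMS = ("product", "month", "region")

# Sparse storage: only non-empty cells are kept (hypothetical data).
base_cells = {
    ("TV", "Jan", "Europe"): 10,
    ("TV", "Feb", "Europe"): 20,
    ("Radio", "Jan", "Asia"): 7,
}


def precompute(cells):
    """Materialize every cuboid; '*' marks a dimension that has been summed out."""
    summaries = {}
    n = len(DIMS)
    for k in range(n + 1):
        for kept in combinations(range(n), k):
            for key, value in cells.items():
                agg_key = tuple(key[i] if i in kept else "*" for i in range(n))
                summaries[agg_key] = summaries.get(agg_key, 0) + value
    return summaries


summaries = precompute(base_cells)

# Queries become dictionary lookups against the pre-computed summaries.
total_all = summaries[("*", "*", "*")]
tv_total = summaries[("TV", "*", "*")]
europe_jan = summaries[("*", "Jan", "Europe")]
```

Materializing every cuboid trades storage for query speed; a ROLAP system would instead compute such aggregates from relational tables with GROUP BY queries.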

32
Data Warehouse Usage
  • Three kinds of data warehouse applications
  • Information processing
  • supports querying, basic statistical analysis,
    and reporting using crosstabs, tables, charts and
    graphs
  • Analytical processing
  • multidimensional analysis of data warehouse data
  • supports basic OLAP operations, slice-dice,
    drilling, pivoting
  • Data mining
  • knowledge discovery from hidden patterns
  • supports associations, constructing analytical
    models, performing classification and prediction,
    and presenting the mining results using
    visualization tools.
  • Differences among the three tasks

33
From On-Line Analytical Processing to On-Line
Analytical Mining (OLAM)
  • Why online analytical mining?
  • High quality of data in data warehouses
  • DW contains integrated, consistent, cleaned data
  • Available information processing structure
    surrounding data warehouses
  • ODBC, OLEDB, Web accessing, service facilities,
    reporting and OLAP tools
  • OLAP-based exploratory data analysis
  • mining with drilling, dicing, pivoting, etc.
  • On-line selection of data mining functions
  • integration and swapping of multiple mining
    functions, algorithms, and tasks.
  • Architecture of OLAM

34
An OLAM Architecture
  • Layer 4 (User Interface): mining query in,
    mining result out, via the User GUI API
  • Layer 3 (OLAP/OLAM): OLAM Engine and OLAP
    Engine, working on Layer 2 via the Data Cube API
  • Layer 2 (MDDB): MDDB and Meta Data, built from
    Layer 1 via the Database API (filtering and
    integration)
  • Layer 1 (Data Repository): Databases and Data
    Warehouse (data cleaning, data integration)
35
References
  • Jiawei Han and Micheline Kamber, Data Mining:
    Concepts and Techniques, 2001
  • Tan, Steinbach, and Kumar, Introduction to
    Data Mining, 2004

36
  • Thank you