University of Economics, Prague - PowerPoint PPT Presentation

Slides: 21
Provided by: ber59
Transcript and Presenter's Notes

1
University of Economics, Prague
  • MLNET related activities of the Laboratory for Intelligent Systems
    and the Dept. of Information and Knowledge Engineering
  • http://lisp.vse.cz/berka/MLNet.html

2
Research
  • probabilistic methods - decomposable probability
    models and Bayesian networks
  • symbolic methods - generalized association rules
    and decision rules
  • logical calculi for knowledge discovery in
    databases

3
People
Petr Berka
Jiří Ivánek
Radim Jiroušek
Jan Rauch
Tomáš Kočka
Vojtěch Svátek
4
Software
  • LISp-Miner
  • two data mining procedures: 4FT-Miner (generalised
    association rules) and KEX (decision rules),
  • large preprocessing module including SQL,
  • output of rules in database format enables
    users to implement their own interpretation procedures.

5
LISp-Miner procedures
  • 4FT-Miner (GUHA procedure)
  • generalised association rules in the form
    Ant ≈ Suc / Cond
  • KEX
  • weighted decision rules in the form
    Ant ⇒ C (weight)
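
The slide gives only the form of a KEX rule, not how the weights of several applicable rules are merged. The sketch below is a minimal illustration under a stated assumption: that weights lie strictly between 0 and 1 and are merged with a PROSPECTOR-style pseudo-Bayesian combination, in which 0.5 is neutral. The function names `combine` and `classify_weight`, the rule representation, and the sample rules are hypothetical and are not taken from LISp-Miner.

```python
from functools import reduce


def combine(w1: float, w2: float) -> float:
    """Pseudo-Bayesian combination of two weights from the open interval (0, 1).

    Assumption: this PROSPECTOR-style formula stands in for the
    combination function, which the slides do not spell out.
    """
    return (w1 * w2) / (w1 * w2 + (1 - w1) * (1 - w2))


def classify_weight(example: dict, rules: list) -> float:
    """Merge the weights of all rules whose antecedent holds for the example.

    Each rule is a pair (antecedent, weight); the antecedent is a dict of
    attribute -> required value (a conjunction of conditions).
    """
    applicable = [
        weight
        for antecedent, weight in rules
        if all(example.get(attr) == value for attr, value in antecedent.items())
    ]
    # 0.5 is the neutral element of combine, so no applicable rule yields 0.5.
    return reduce(combine, applicable, 0.5)


# Hypothetical rules "Ant => C (weight)" for one class C.
rules = [
    ({"Sex": "F"}, 0.8),
    ({"Salary": "low"}, 0.8),
    ({"Sex": "M"}, 0.3),
]
result = classify_weight({"Sex": "F", "Salary": "low"}, rules)
```

Two rules with weight 0.8 reinforce each other, so the merged weight in this toy case is higher than either single weight; a rule with weight below 0.5 would pull the result down.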

6
4FT-Miner
Data Matrix
                 CLIENTS                            LOANS
Id      Age  Sex  Salary  District   Amount  Payment  Months  Quality
1       45   F    28 000  Prague     48 000  1 000    48      good
...     ...  ...  ...     ...        ...     ...      ...     ...
70 000  18   M    12 000  Brno       36 000  2 000    18      bad

Problem: Are there segments of clients SC and segments of loans SL such that
  • to be in SC is at 90% equivalent to having a loan from SL, and
  • there are at least 100 such clients?

"Ant is at 90% equivalent to Suc":
Ant ⇔0.90, 100 Suc is true iff a/(a+b+c) ≥ 0.9 ∧ a ≥ 100

        Suc   ¬Suc
Ant     a     b
¬Ant    c     d

a - number of objects satisfying Ant and Suc
b - number of objects satisfying Ant and not satisfying Suc
c - number of objects not satisfying Ant and satisfying Suc
d - number of objects satisfying neither Ant nor Suc
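
The condition on this slide can be sketched in a few lines of Python: count the four frequencies for a given Ant and Suc, then test a/(a+b+c) ≥ p and a ≥ Base. The names `four_fold_table` and `quantifier_holds`, the predicate-based representation of Ant and Suc, and the toy rows are assumptions made for illustration; they are not the LISp-Miner interface.

```python
from typing import Callable, Iterable, Mapping, Tuple

Row = Mapping[str, object]
Predicate = Callable[[Row], bool]


def four_fold_table(
    rows: Iterable[Row], ant: Predicate, suc: Predicate
) -> Tuple[int, int, int, int]:
    """Count the frequencies (a, b, c, d) of Ant and Suc in the data matrix."""
    a = b = c = d = 0
    for row in rows:
        in_ant, in_suc = ant(row), suc(row)
        if in_ant and in_suc:
            a += 1          # satisfies Ant and Suc
        elif in_ant:
            b += 1          # satisfies Ant, not Suc
        elif in_suc:
            c += 1          # satisfies Suc, not Ant
        else:
            d += 1          # satisfies neither
    return a, b, c, d


def quantifier_holds(a: int, b: int, c: int, p: float = 0.9, base: int = 100) -> bool:
    """True iff a/(a+b+c) >= p and a >= base; d plays no role in this quantifier."""
    if a + b + c == 0:
        return False
    return a / (a + b + c) >= p and a >= base


# Toy data matrix (hypothetical values, far smaller than the 70 000 clients).
rows = (
    [{"Sex": "F", "Quality": "good"}] * 3
    + [{"Sex": "F", "Quality": "bad"}]
    + [{"Sex": "M", "Quality": "good"}] * 2
    + [{"Sex": "M", "Quality": "bad"}] * 4
)
table = four_fold_table(
    rows,
    ant=lambda r: r["Sex"] == "F",
    suc=lambda r: r["Quality"] == "good",
)
```

Because d does not enter the formula, objects outside both segments never influence whether the relation holds.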
7
4FT-Miner
  • Input
  • Data matrix,
  • quantifier ⇔0.90, 100
  • Derived attributes for SC (possible Ant): Age (7
    values), Sex (2 values),
    Salary (3 values), District (77 values)
  • Derived attributes for SL (possible Suc):
    Amount (6 values), Duration (5 values),
    Quality (2 values)
  • Output
  • All Ant ⇔0.90, 100 Suc true in data matrix
  • (5 equivalences from about 5 million possible
    relations)
  • an example
  • Age(20 - 30) ∧ Sex(F) ∧ Salary(low) ∧ District(Prague)
    ⇔0.90, 100 Amount⟨20, 50) ∧ Quality(Bad)

            Suc     ¬Suc
    Ant     950     30
    ¬Ant    20      69000

    a/(a+b+c) = 0.95 ≥ 0.9 ∧ 950 ≥ 100
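
The input/output description above amounts to a search over candidate (Ant, Suc) pairs. A brute-force sketch of that search follows; it enumerates short conjunctions of attribute values and keeps the pairs for which a/(a+b+c) ≥ p and a ≥ Base. This is an illustration only, not the algorithm 4FT-Miner actually uses; the function names, the `max_len` limit, and the toy rows are assumptions.

```python
from itertools import combinations, product


def literals(rows, attributes):
    """All (attribute, value) pairs occurring in the data for the given attributes."""
    return sorted({(attr, row[attr]) for row in rows for attr in attributes})


def conjunctions(lits, max_len):
    """Conjunctions of up to max_len literals, each attribute used at most once."""
    for n in range(1, max_len + 1):
        for combo in combinations(lits, n):
            if len({attr for attr, _ in combo}) == n:
                yield combo


def satisfies(row, conj):
    """True if the row satisfies every literal of the conjunction."""
    return all(row[attr] == value for attr, value in conj)


def mine(rows, ant_attrs, suc_attrs, p=0.9, base=100, max_len=2):
    """Return all (Ant, Suc, a, b, c) with a/(a+b+c) >= p and a >= base."""
    found = []
    ants = list(conjunctions(literals(rows, ant_attrs), max_len))
    sucs = list(conjunctions(literals(rows, suc_attrs), max_len))
    for ant, suc in product(ants, sucs):
        a = b = c = 0
        for row in rows:
            in_ant, in_suc = satisfies(row, ant), satisfies(row, suc)
            a += in_ant and in_suc
            b += in_ant and not in_suc
            c += in_suc and not in_ant
        if a >= base and a + b + c > 0 and a / (a + b + c) >= p:
            found.append((ant, suc, a, b, c))
    return found


# Toy data (hypothetical); thresholds are lowered to fit ten rows.
rows = (
    [{"Sex": "F", "Salary": "low", "Quality": "bad"}] * 4
    + [{"Sex": "F", "Salary": "high", "Quality": "good"}]
    + [{"Sex": "M", "Salary": "high", "Quality": "good"}] * 5
)
found = mine(rows, ["Sex", "Salary"], ["Quality"], p=0.9, base=3, max_len=2)
```

The number of candidate pairs grows quickly with the number of attribute values, which is why the slide reports millions of possible relations for a handful of attributes.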

8
KEX - classification
9
KEX - learning
10
LISp-Miner
11
LISp-Miner
12
LISp-Miner
13
LISp-Miner
14
4FT-Miner and KEX
  • Applications
  • truck reliability assessment
  • quality control in a brewery
  • segmentation of clients of a bank
  • short-term electric load prediction

15
LISp-Miner
  • References
  • Berka, P. - Ivanek, J.: Automated Knowledge
    Acquisition for PROSPECTOR-like Expert Systems.
    In: (Bergadano, De Raedt eds.) Proc. ECML'94,
    Springer 1994, 339-342.
  • Berka, P. - Rauch, J.: Data Mining using GUHA and
    KEX. In: (Callaos, Yang, Aguilar eds.) 4th Int.
    Conf. on Information Systems, Analysis and
    Synthesis ISAS'98, 1998, Vol 2, 238-244.
  • Rauch, J.: Classes of Four Fold Table Quantifiers.
    In: (Zytkow, Quafafou eds.) Principles of Data
    Mining and Knowledge Discovery. Springer 1998,
    203-211.

16
Datasets
  • PKDD99 Discovery Challenge data
    (http://lisp.vse.cz/pkdd99/chall.htm)
  • financial data: clients of a bank, their
    accounts, transactions, loans etc.,
  • medical data: patients with collagen disease

17
Financial data
18
Medical data
19
Other activities
  • Organized conferences
    http://lisp.vse.cz/ecml97/
    http://lisp.vse.cz/pkdd99/
  • Teaching (in Czech)
  • KDD
  • KDD seminar
  • ML
20
New projects
  • SOL-EU-NET project: Data Mining and Decision
    Support for Business Competitiveness: A European
    Virtual Enterprise
  • (supported by EU grant IST-1999-11.495)