Title: University of Economics, Prague
1University of Economics, Prague
- MLNET-related activities of the Laboratory for Intelligent Systems and the Dept. of Information and Knowledge Engineering
- http://lisp.vse.cz/berka/MLNet.html
2Research
- probabilistic methods: decomposable probability models and Bayesian networks
- symbolic methods: generalized association rules and decision rules
- logical calculi for knowledge discovery in databases
3People
Petr Berka
Jiří Ivánek
Radim Jiroušek
Jan Rauch
Tomáš Kočka
Vojtěch Svátek
4Software
- LISp-Miner
- two data mining procedures
- 4FT Miner (generalised association rules) and
- KEX (decision rules),
- large preprocessing module including SQL,
- output of rules in database format, which enables users to implement their own interpretation procedures.
5LISp-Miner procedures
- 4FT-Miner (GUHA procedure)
- generalised association rules in the form
- Ant ≈ Suc / Cond
- KEX
- weighted decision rules in the form
- Ant ⇒ C (weight)
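To illustrate how weighted decision rules of the form Ant ⇒ C (weight) might be used, here is a minimal sketch. The slides do not say how KEX combines the weights of several applicable rules; the sketch assumes the pseudo-Bayesian combination familiar from PROSPECTOR-like expert systems, w1 ⊕ w2 = w1·w2 / (w1·w2 + (1 − w1)(1 − w2)), with 0.5 as the neutral weight. The names `combine` and `classify` and the rule representation are hypothetical.

```python
from functools import reduce


def combine(w1: float, w2: float) -> float:
    """Pseudo-Bayesian combination of two weights in [0, 1] (assumed, not from the slides)."""
    num = w1 * w2
    den = num + (1.0 - w1) * (1.0 - w2)
    if den == 0.0:
        raise ValueError("cannot combine the contradictory weights 0 and 1")
    return num / den


def classify(example: dict, rules: list, neutral: float = 0.5) -> float:
    """Combine the weights of all rules whose antecedent is satisfied by the example.

    Each rule is a pair (antecedent, weight); the antecedent is a dict of
    attribute -> required value (a hypothetical representation).
    """
    applicable = [w for ant, w in rules if ant.items() <= example.items()]
    return reduce(combine, applicable, neutral)


# Hypothetical rules for the class "Quality = bad"
rules = [
    ({"Salary": "low"}, 0.8),
    ({"District": "Prague"}, 0.8),
    ({"Sex": "M"}, 0.3),
]
score = classify({"Salary": "low", "District": "Prague", "Sex": "F"}, rules)
```

Under this assumption a weight above 0.5 supports the class, a weight below 0.5 speaks against it, and two supporting rules reinforce each other.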
64FT-Miner
Data matrix

CLIENTS                                 LOANS
Id      Age  Sex  Salary  District      Amount  Payment  Months  Quality
1       45   F    28 000  Prague        48 000  1 000    48      good
...     ...  ...  ...     ...           ...     ...      ...     ...
...     ...  ...  ...     ...           ...     ...      ...     ...
70 000  18   M    12 000  Brno          36 000  2 000    18      bad

Problem: Are there segments of clients SC and segments of loans SL such that
- to be in SC is at 90% equivalent to having a loan from SL, and
- there are at least 100 such clients?

"Ant is at 90% equivalent to Suc":
Ant ⇔0.90, 100 Suc is true iff a/(a+b+c) ≥ 0.9 ∧ a ≥ 100

        Suc   ¬Suc
Ant     a     b
¬Ant    c     d

a - number of objects satisfying Ant and Suc
b - number of objects satisfying Ant and not satisfying Suc
c - number of objects not satisfying Ant and satisfying Suc
d - number of objects satisfying neither Ant nor Suc
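The condition above can be checked mechanically once the four frequencies are known. The following is a minimal sketch: it counts a, b, c and d from two Boolean columns and tests a/(a+b+c) ≥ p ∧ a ≥ Base. The names `FourFoldTable`, `four_fold_table` and `holds` are illustrative and are not part of LISp-Miner.

```python
from typing import Iterable, NamedTuple


class FourFoldTable(NamedTuple):
    a: int  # objects satisfying Ant and Suc
    b: int  # objects satisfying Ant and not Suc
    c: int  # objects not satisfying Ant and satisfying Suc
    d: int  # objects satisfying neither Ant nor Suc


def four_fold_table(ant: Iterable[bool], suc: Iterable[bool]) -> FourFoldTable:
    """Count the four frequencies from per-object truth values of Ant and Suc."""
    a = b = c = d = 0
    for in_ant, in_suc in zip(ant, suc):
        if in_ant and in_suc:
            a += 1
        elif in_ant:
            b += 1
        elif in_suc:
            c += 1
        else:
            d += 1
    return FourFoldTable(a, b, c, d)


def holds(table: FourFoldTable, p: float = 0.90, base: int = 100) -> bool:
    """True iff a/(a+b+c) >= p and a >= base."""
    covered = table.a + table.b + table.c
    return covered > 0 and table.a / covered >= p and table.a >= base
```

Note that d does not enter the condition: objects that satisfy neither Ant nor Suc have no influence on whether the relation holds.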
74FT Miner
- Input
- Data matrix,
- quantifier ⇔0.90, 100
- Derived attributes for SC (possible Ant): Age (7 values), Sex (2 values), Salary (3 values), District (77 values)
- Derived attributes for SL (possible Suc): Amount (6 values), Duration (5 values), Quality (2 values)
- Output
- All Ant ⇔0.90, 100 Suc true in data matrix
- (5 equivalences from about 5 million possible relations)
- an example
- Age(20 - 30) ∧ Sex(F) ∧ Salary(low) ∧ District(Prague) ⇔0.90, 100 Amount<20,50) ∧ Quality(Bad)

        Suc   ¬Suc
Ant     950   30
¬Ant    20    69000

a/(a+b+c) = 0.95 ≥ 0.9 ∧ 950 ≥ 100
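The input/output contract on this slide can be sketched as a naive search: enumerate every conjunction of attribute values as a candidate Ant and Suc, and keep the pairs whose four-fold table satisfies a/(a+b+c) ≥ p ∧ a ≥ Base. This brute-force version only illustrates what is being computed; it makes no claim about how 4FT-Miner is actually implemented, and the function names and the toy rows are hypothetical.

```python
from itertools import combinations, product


def conjunctions(attributes, rows):
    """Yield (conjunction, cover) for every conjunction using each attribute at most once."""
    for k in range(1, len(attributes) + 1):
        for subset in combinations(sorted(attributes), k):
            value_sets = [sorted({row[name] for row in rows}) for name in subset]
            for values in product(*value_sets):
                conj = dict(zip(subset, values))
                cover = [all(row[n] == v for n, v in conj.items()) for row in rows]
                yield conj, cover


def mine(rows, ant_attrs, suc_attrs, p=0.90, base=100):
    """Return all (Ant, Suc, (a, b, c, d)) with a/(a+b+c) >= p and a >= base."""
    found = []
    succedents = list(conjunctions(suc_attrs, rows))
    for ant, ant_cover in conjunctions(ant_attrs, rows):
        for suc, suc_cover in succedents:
            a = sum(x and y for x, y in zip(ant_cover, suc_cover))
            b = sum(x and not y for x, y in zip(ant_cover, suc_cover))
            c = sum(y and not x for x, y in zip(ant_cover, suc_cover))
            d = len(rows) - a - b - c
            if a + b + c > 0 and a >= base and a / (a + b + c) >= p:
                found.append((ant, suc, (a, b, c, d)))
    return found


# Toy data matrix (hypothetical), with Base lowered to fit its size
rows = [
    {"Sex": "F", "Quality": "bad"},
    {"Sex": "F", "Quality": "bad"},
    {"Sex": "F", "Quality": "bad"},
    {"Sex": "M", "Quality": "good"},
]
result = mine(rows, ["Sex"], ["Quality"], p=0.90, base=2)
```

For the frequencies in the example on this slide, the same test gives 950/(950 + 30 + 20) = 0.95 ≥ 0.9 and 950 ≥ 100, so the relation holds.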
8KEX - classification
9KEX - learning
10LISp-Miner
11LISp-Miner
12LISp-Miner
13LISp-Miner
14 4FT Miner and KEX
- Applications
- truck reliability assessment
- quality control in a brewery
- segmentation of clients of a bank
- short-term electric load prediction
15LISp-Miner
- References
- Berka, P. - Ivanek, J.: Automated Knowledge Acquisition for PROSPECTOR-like Expert Systems. In (Bergadano, deRaedt eds.) Proc. ECML'94, Springer 1994, 339-342.
- Berka, P. - Rauch, J.: Data Mining using GUHA and KEX. In (Callaos, Yang, Aguilar eds.) 4th Int. Conf. on Information Systems, Analysis and Synthesis ISAS'98, 1998, Vol 2, 238-244.
- Rauch, J.: Classes of Four Fold Table Quantifiers. In (Zytkow, Quafafou eds.) Principles of Data Mining and Knowledge Discovery. Springer 1998, 203-211.
16Datasets
- PKDD99 Discovery Challenge data (http://lisp.vse.cz/pkdd99/chall.htm)
- financial data: clients of a bank, their accounts, transactions, loans etc.
- medical data: patients with collagen disease
17Financial data
18Medical data
19Other activities
- Teaching (in Czech)
- KDD
- KDD seminar
- ML
- http://lisp.vse.cz/ecml97/
- http://lisp.vse.cz/pkdd99/
20New projects
- SOL-EU-NET project: Data Mining and Decision Support for Business Competitiveness: A European Virtual Enterprise
- (supported by EU grant IST-1999-11.495)