Data Mining Concepts and Research Trends - PowerPoint PPT Presentation

Loading...

PPT – Data Mining Concepts and Research Trends PowerPoint presentation | free to download - id: 5eed98-ZjQwY



Loading


The Adobe Flash plugin is needed to view this content

Get the plugin now

View by Category
About This Presentation
Title:

Data Mining Concepts and Research Trends

Description:

KISS-SIGDB Tutorial 1998 Data Mining Concepts and Research Trends Do-Heon LEE Database Laboratory Dept. of Computer Science Chonnam National University – PowerPoint PPT presentation

Number of Views:124
Avg rating:3.0/5.0
Slides: 35
Provided by: 6649912
Learn more at: http://cs.sungshin.ac.kr
Category:

less

Write a Comment
User Comments (0)
Transcript and Presenter's Notes

Title: Data Mining Concepts and Research Trends


1
Data Mining Concepts and Research Trends
KISS-SIGDB Tutorial 1998
Do-Heon LEE Database Laboratory Dept. of Computer
Science Chonnam National University
1998. 5. 21.
2
Table of Contents
  • Definition and Motivation of Data Mining
  • Classification of Data Mining Techniques
  • Mining Association Rules
  • Attribute Dependencies
  • Database Summarization
  • Data Mining Projects
  • DBMiner/GeoMiner/WebMiner
  • MineSet
  • Data Mining and Data Warehousing
  • References

3
Definition of Data Mining
  • Data mining is the
  • nontrivial extraction of
  • implicit, beyond databases and catalogs
  • previously unknown, and exclude well-known
    knowledge
  • potentially useful information
    application-dependent usefulness
  • from large volume of performance perspective
  • actual data . missing, erroneous data
  • Some counter examples
  • The 3th attribute of table EMP is SALARY.
  • Explicit information in the DB catalog
  • Most of college students have been graduated from
    high schools.
  • Well-known information, common sense

4
Motivation of Data Mining Research
Growing reliance on database systems
Database operational data collection useful
resource reflecting domain characteristics
Fast advance of database system technology
Increasing volume of data stored in databases
Mining databases for useful knowledge that can be
exploited in decision making
5
Comparison with Machine Learning
Data Mining Dynamic data Errorneous
data Uncertain data Missing data Coexistence
of irrelvant data Immense size Structured data
Machine Learning Static data Error-free
data Exact data No missing data Only relevant
data Moderate size Flat collection of data
  • Data mining is an actual application of machine
    learning methodologies.

6
Classification of Data Mining Techniques
  • On knowledge types to be discovered
  • Characterization generalized description of
    data characteristics
  • Classfication description of discriminating
    characteristics
  • Clustering grouping data having common
    properties
  • Association co-occurence relationships among
    multiple events
  • Trend analysis characterize evolution trend
    of temporal data
  • Pattern analysis find specified patterns in
    large DBs
  • Types of mining targets are continuously evolved
  • according to emerged application demands. ( cf.
    SQL evolution )
  • On database types to be mined
  • relational, transactional, object-oriented,
    temporal, multi-media etc ..
  • On techniques adopted
  • statistics, symbolic learning, neural networks,
    visualization etc..

7
Association Rules Definition and Applications
  • QUEST project at IBM Almaden Research Center
  • Association rules ( among items )
  • Given a collection of transactions each of which
    is item-1, ..., item-n ,
  • an association rule has a form of
  • item-11, item-12, ... , item-1m --gt
    item-21, item-22, ... , item-2k
  • antecedent items
    consequence items
  • The existence of an item(or items) implies the
    existence of other item(s) in the same
    transaction.

In a POS(Point-Of-Sales) data set, 10/15/1301
coke, bread, hamburger 10/15/1421 coke,
hamburger , juice 10/15/1425 milk, sandwich,
juice 10/15/1513 sandwich, milk, juice,
bread 10/15/1631 hamburger, juice,
coke .....
association rules
decision making for shelf layout design, direct
mailing, etc ...
hamburger --gt coke sandwich, juice --gt
milk
  • Customer usage patterns in public communication
    services
  • Fault co-occurence analysis in complex systems

8
Association Rules Usefulness Measures
  • Two measures for identifying useful association
    rules
  • support statistical significance - the
    fraction of transactions containing all items
  • confidence rule strength - the fraction of
    transactions containing consequence items to

  • transactions containing antecedent items

hamburger o o x x o o o o o x 7
coke o o x x o o o x x o 6
both o o x x o o o x x x 5
coke, bread, hamburger coke, hamburger ,
juice milk, sandwich, juice sandwich,
milk, juice, bread hamburger, juice, coke
coke, bread, hamburger coke, hamburger ,
juice hamburger, juice milk, hamburger,
sweater coke, milk, juice
For an assoication rule coke --gt hamburger
, support 5 out of 10 50 confidence 5
out of 6 83
9
Association Rules Mining Procedures
The first phase finding frequent item-sets (
high support ) the threshold value for support
is given as 40
coke, bread, hamburger coke, hamburger ,
juice milk, sandwich, juice sandwich,
milk, juice, bread hamburger, juice, coke
coke, bread, hamburger coke, hamburger ,
juice hamburger, juice milk, hamburger,
sweater coke, milk, juice coke, juice
coke, sweater
coke 8 bread 3 hamburger 7 juice
8 milk 4 sandwich 2 sweater 2
coke, hamburger 5 coke, juice
5 hamburger, juice 4
coke, hamburger, juice 2
The second phase finding strong associations
(high confidence) the threshold value for
confidence is given as 70
  • Blind search 2N candidates
  • AIS basic algorithm
  • SETM sort-merge algorithm
  • Apriori tree-structured candidate sets
  • AprioriTid temprary table generation
  • Partition partitioned mining
  • DHP hash-based algorithm

coke --gt hamburger 5 out of 8 62.5
hamburger --gt coke 5 out of 7 71
coke --gt juice 5 out of 8
62.5 juice --gt coke 5 out of 8
62.5
10
Sequential Patterns
CID 1 1 2 2 2 3 4 4 4 5
Time 95/06/25 95/06/30 95/06/10 95/06/15 95/06/2
0 95/06/25 95/06/25 95/06/30 95/07/25 95/06/12
Items 30 90 10,20 30 40,60,70 30,50,70 30 40,7
0 90 90
CID 1 2 3 4 5
Sequence lt(30) (90)gt lt(10,20) (30)
(40,60,70)gt lt(30,50,70)gt lt(30) (40,70)
(90)gt lt(90)gt
Maximal sequential patterns with support gt 25
lt(30) (90)gt lt(30) (40,70)gt
11
Telecommunication Network Diagnosis
node-A
node-B
time 30 min
(C, 123 )
( F, 678 )
(E, 256 )
node-C
node-F
node-D
Co-occurence of 123 alarm in C and 256 alarm in
E implies 678 alarm in F in 30 minintes.
node-E
node-I
node-G
node-H
12
Attribute Dependencies
  • Given attributes A1, A2, ..., Am
  • f(A1, A2, ..., Am, a set of constants) gt

  • g(A1, A2, ... Am, a set of constants)
  • where f and g are arbitrary (boolean)
    functions.
  • e.g. (A1 c1 and A2 c2) then (A3 c3 and
    A4 c4)
  • Intractable problems because the number of
    possible functions and constants are potentially
    infinite.
  • Thus, several constraints are given to make them
    tractable in actual domains.
  • e.g. LHS is a conjuction of simple
    predicates and RHS is an assertion of
    classification --gt Classification problem

13
Classification
  • Symbolic classification rules(e.g. decision
    trees)
  • The most well-studied area among inductive
    learning problems.

A1
A1 A2 C a d 1 a e 2 b f 3 b g 3
a
b
A2
d
e
1
2
3
  • Neural network approach
  • Weight values in edges --gt symbolic description
    of classification rules
  • Still far from a practical solution lt-- too
    costly learning time
  • Suitable for single-learning/multiple-runs
    problems

14
Bottom-Up Summarization
  • DBLEARN project at J.Han's Lab., Simon Fraser
    Univ., Canada

Name Lee Kim Yoon Park Choi Hong
Major music physics math painting computing stati
stics
Birth_Place Kwangju Sunchon Mokpo Yeosu Taegu Suw
on
GPA 3.4 3.9 3.7 3.4 3.8 3.2
vote 1 1 1 1 1 1
Major art science science art science science
Birth_Place Chunnam Chunnam Chunnam Chunnam Kyung
buk Kyonggi
GPA good execellent execellent good execellent go
od
vote 1 1 1 1 1 1
Major art science science science
Birth_Place Chunnam Chunnam Kyungbuk Kyonggi
GPA good execellent execellent good
vote 2 2 1 1
attribute-oriented substitution
merging redundant records
Domain Knowledge
15
Top-Down Summarization
  • CLEVER system at DB Lab. KAIST

Table to be summarized
user's selection
tSD 0.4
lt w, w gt 1.000
PROGRAM vi emacs word gcc tetris
USER John Tom Lee Park Yang
lt engineering, w gt 0.833
lt w, developer gt 0.800
lt w, marketer gt 0.411
lt engineering, developer gt 0.700
lt w, programmer gt 0.589
Fuzzy set hierarchies
PROG_01
USR_01
lt editor, developer gt 0.489
lt engineering, programmer gt 0.522
lt editor, programmer gt 0.456
16
Data Mining Projects
  • QUEST IBM Almeden Research Center
  • a common set of operations in a unified framework
  • classfication, association etc..
  • KDW(Knowledge Discovery Workbench) GTE
    Laboratory Inc.
  • focus on architectural issues of data mining
    system
  • clustering, classification, summarization,
    deviation detection etc
  • IMACS(Intelligent Market Analysis and
    Classification System) ATT Bell Lab
  • focus on human interaction on data mining
  • data archaeology
  • CoverStory Information Resources Incorporated
  • summarization on supermarket scanner data
  • DBMiner/GeoMiner/WebMiner Simon Fraser Univ.
  • MineSet Silicon Graphics Inc.

17
DBMiner
  • DBMiner Research Group in Simon Fraser Univ.,
    Canada
  • DMQL a SQL-like Data Mining Query Language
  • Data structures Generalized relations,
    multi-dimensional data cube

18
DBMiner(contd)
  • Functions
  • Characterizer the general characteristics of a
    set of user-specified data
  • attribute-oriented induction
  • eg. Cold(x) gt headache(x) and cough(x)
  • eg. Fever(x) gt headache(x) and
    low-leucocyte-count(x)
  • Discriminator features that distinguish the
    target class from constrasting classes
  • eg. Low-leucocyte-count(x) gt Fever(x)
  • Classifier generalization-based decision tree
    induction
  • Association rule finder multi-level association
    rules
  • Meta-rule guided miner confine the search to
    specific forms of rules
  • eg. Meta-rule major(s student, x) and p(s, y)
    gt GPA(s, z)
  • Predictor predict the possible values for
    missing data, after factor analysis
  • eg. An employees potential salary can be
    predicted based on the salary distribution of
    similar employees in the company
  • Data evolution evaluator
  • eg. Growth patterns of certain stocks
  • Deviation evaluator
  • eg. A set of stocks whose growth patterns deviate
    from the major trend.

19
GeoMiner/WebMiner
  • GeoMiner with GMQL(Geo-Mining Query Language)
  • An extension of DBMiner for spatial data mining
  • Modules
  • Geo-characterizer
  • eg. Given spatial hierarchies of Western Canada,
    discover general weather patterns according to
    region partitions
  • Geo-comparator( discriminator)
  • eg. The differences in weather patterns between
    British Columbia and Alberta
  • Geo-associator
  • WebMiner with WebQL
  • It finds resources in the internet related to a
    specific topic
  • eg. What is the most popular document about data
    mining in terms of number of accesses
  • cf. WEB traversal pattern discovery(by Chen, Park
    and Yu, 1996)
  • eg. If a user visits h1 gt h2 gt h5 then he/she
    is apt to visit h8 gt h11

20
MineSet
  • Developed by Silicon Graphics Inc.
  • Combine intelligent data mining algorithms and
    multidimensional data visualization techniques
  • Association rule generator/rule visualizer
  • Classification tools
  • MLC based classification modules
  • Decision tree inducer
  • Option tree inducer
  • Evidence classifier inducer
  • Decision table inducer
  • Tree/evidence visualizers
  • Map visualizer spatial data analysis
  • Clustering module
  • Regressin tree inducer predict unknown values

21
Rule Visualizer of MineSet
Cited from the Silicon Graphics Inc. Home Pages
22
Decision Tree Visualizer of MineSet
Cited from the Silicon Graphics Inc. Home Pages
23
Map Visualizer of MineSet
Cited from the Silicon Graphics Inc. Home Pages
24
Two Perspectives on Data Mining
  • AI practitioners perspective
  • Extensions of machine learning technology
  • Focus on sophisticated measures and theories
    rather than efficiency improvement
  • DB practitioners perspective
  • Application of machine learning paradigms to
    massive and actual data management problems
  • A suggestion as a DB practitioner
  • First step Blindly search possible knowledge
    gt Data Mining
  • There is no guru who could guide the search
    directions.
  • No available heuristics Rather ignore
    heuristics for unknown patterns.
  • Second step Validate the discovered rough
    knowledge in detail

25
Data Mining and Data Warehousing
Process-oriented
Data Mining
Metadata
Subject-oriented
Relational DB-1
Data mart-1
Relational DB-2
Data warehouse builder/ manager
Data mart-2
Object-oriented DB-1
Data warehouse
Data mart-3
Object-oriented DB-2
Data mart-4
Legacy DB-1
Data mart-5
File system-1
Operational Data
Data for Decision Support
26
Research Issues
  • Looking for useful mining targets
  • Associations, characteristic rules,
    classification, clustering
  • Functional dependency, regression trees
  • Similar sequential patterns/time series
  • Variations of association rules
  • Alternatives for simple support and confidence
    measures
  • Generalized/multilevel association rules
  • Performance enhancement for association rule
    discovery
  • System implementation issues
  • Identify core functions(eg. A tightly-coupled
    architectureMEO98, MLC)
  • Elicit common DBMS requirements for various data
    mining tasks
  • Integration with relational databases and/or
    multi-dimensional databases
  • Data/knowledge visualization
  • Extended query language or extened CLI eg. DMQL
  • And so on ...

27
References
  • Data Mining General
  • FRW91 W. J. Frawley, G. Piatetsky-Shapiro and
    C. J. Matheus, Knowledge Discovery in Databases
    An Overview, Knowledge Discovery in Databases,
    G. Piatetsky-Shapiro and W. J. Frawley Ed., AAAI
    Press, 1991, pp. 1-27
  • AGR93a R. Agrawal, T. Imielinski and A. Swami,
    Database Mining A Performance Perspective,
    IEEE Trans. on Knowledge and Data Enginieering,
    Vol. 5, No. 6, 1993, pp. 914-925
  • MAT93 C. J. Matheus, P. Chan and G.
    Piatetsky-Shapiro, Systems for Knowledge
    Discovery in Databases, IEEE TKDE, Vol. 5, No.
    6, 1993, pp. 903-913
  • HOL94a M Holsheimer and A. Siebes, Data Mining
    The Search for Knowledge in Databases, Report
    CS-R9406, ISSN 0169-118X, CWI(Centrum voor
    Wiskunde en Informatica), The Netherland, 1994
  • Association Rules
  • AGR93b R. Agrawal, T. Imielinski and A. Swami,
    Mining Associations between Sets of Items in
    Massive Databases, Proc. ACM SIGMOD, Washington
    D.C., May 1993
  • AGR94 R. Agrawal and R. Srikant, Fast
    Algorithms for Mining Association Rules in Large
    Databases, Proc. VLDB, Santiago, Sep. 1994, pp.
    487-499
  • KLE94 M. Klemettien, H. Mannila, P. Ronakainen,
    H. Toivonen and A. Verkamo, Finding Interesting
    Rules from Large Sets of Discovered Association
    Rules, Proc. CIKM, Gaithersburg, Nov. 1994, pp.
    401-407

28
References(Contd)
  • HOT95 M. Houtsma and A. Swami, Set-Oriented
    Mining for Association Rules in Relational
    Databases, Proc. ICDE, Taipei, Mar. 1995, pp.
    25-33
  • SAV95 A. Savasere, E. Omiecinski, S. Navathe,
    An Efficient Algorithm for Mining Association
    Rules in Large Databases, Proc. VLDB, Zurich,
    Sep. 1995, pp. 432-444
  • SRI95 R. Srikant and R. Agrawal, Mining
    Generalized Association Rules, Proc. VLDB,
    Zurich, Sep. 1995, pp. 407-419
  • HAN95 J. Han and Y. Fu, Discovery of
    Multiple-level Association Rules from Large
    Databases, Proc. VLDB, Zurich, Sep. 1995, pp.
    420-431
  • PAR95a J. -S. Park and Y. Fu, An Efficient
    Hash Based Algorithm for Mining Association
    Rules, Proc. SIGMOD, 1995, pp. 175-186
  • PAR95b J. -S. Park, M. -S. Chen and P. S. Yu,
    Efficient Parallel Data Mining for Association
    Rules, Proc. CIKM, 1995
  • SRI96 R. Srikant and R. Agrawal, Minining
    Quantitative Association Rules in Large
    Relational Tables, Proc. SIGMOD, Quebec, Jun.
    1996, pp. 1-12
  • FUK96 T. Fukuda, Y. Morimoto, S. Morishita and
    T.Tokuyama, Data Mining Using Two-Dimensional
    Optimized Association Rules Scheme, Algorithms,
    and Visualization, Proc. SIGMOD, Quebec, Jun.
    1996, pp. 13-23
  • CHE96 D. Cheung, J. Han, V. Ng and C.Wong,
    Maintenance of Discovered Association Rules in
    Large Databases An Incremental Updating
    Technique, Proc. ICDE, New Orleans, Feb. 1996,
    pp. 106-114

29
References(Contd)
  • BRI97a S. Brin, R. Motwami, J. Ullman and S.
    Tsur, Dynamic Itemset Counting and Implication
    Rules for Market Basket Data, Proc. SIGMOD,
    1997, pp. 255-264
  • BRI97b S. Brin, R. Motwami and C. Silverstein,
    Beyond Market Baskets Generalizing Association
    Rules to Correlations, Proc. SIGMOD, 1997, pp.
    265-276
  • HAN97 E. H. Han, G. Karypis and V. Kumar,
    Scalable Parallel Data Mining for Association
    Rules, Proc. SIGMOD, 1997, pp. 277-288
  • AGG98 C. C. Aggarwal and P. S. Yu, Online
    Generation of Association Rules, Proc. Intl
    Conf. on Data Engineering, 1998, pp. 402-411
  • OZD98 B. Özden, S. Ramaswamy and A.
    Silberschatz, Cyclic Association Rules, Proc.
    Intl Conf. on Data Engineering, 1998, pp.
    412-423
  • LIN98 J. -L. Lin and M. H. Dunham, Mining
    Association Rules Anti-Skew Algorithms, Proc.
    Intl Conf. on Data Engineering, 1998, pp.
    486-493
  • SAV98 A. Savasere, E. Omiecinski ans S.
    Navathe, Mining for Strong Negative Associations
    in a Large Database of Customer Transactions,
    Proc. Intl Conf. on Data Engineering, 1998, pp.
    494-502
  • RAS98 R. Rastogi and K. Shim, Mining Optimized
    Association Rules with Categorical and Numeric
    Attributes, Proc. Intl Conf. on Data
    Engineering, 1998, pp. 503-513

30
References(Contd)
  • Characterization
  • HAN91 Y. Cai, N. Cercone and J. Han,
    Attribute-Oriented Induction in Relational
    Databases, Knowledge Discovery in Databases, G.
    Piatetsky-Shapiro and W. Frawley Ed., AAAI Press,
    1991, pp. 213-228
  • HAN92a J. Han, Y. Cai and N. Cercone,
    Knowledge Discovery in Databases An
    Attribute-Oriented Approach, Proc. VLDB, 1992,
    pp. 547-559
  • HAN92b J. Han, Y. Cai, N. Cercone and Y. Huang,
    DBLEARN A Knowledge Discovery System for Large
    Databases, Proc. CIKM, 1992, pp. 473-481
  • HAN93 J. Han, Y. Cai and N. Cercone,
    Data-Driven Discovery of Quantitative Rules in
    Relational Databases, IEEE TKDE, Vol. 5, No. 1,
    Feb. 1993, pp. 29-40
  • LEE94 D.-H. Lee and M. H. Kim, Discovering
    Database Summaries through Refinements of Fuzzy
    Hypotheses, Proc. ICDE, Houston, Feb. 1994, pp.
    223-230
  • LEE97 D.H. Lee and M.H. Kim, "Database
    Summarization Using Fuzzy ISA Hierarchies", IEEE
    Transactions on Systems, Man and Cybernetics,
    Vol.27, No.4, August 1997, pp. 671-680

31
References(Contd)
  • Sequential Patterns
  • ARG93c R. Agrawal, C. Faloutsos and A. Swami,
    Efficient Similarity Search in Sequence
    Databases, Proc. the 4th Intl Conf. on
    Foundations of Data Organization and Algorithms,
    Chicago, Oct 1993
  • FAL94 C. Faloutsos, M. Ranganathan and Y.
    Manolopoulos, Fast Subsequence Matching in
    Time-Series Databases, Proc. SIGMOD,
    Minneapolis, May. 1994, pp. 419-429
  • AGR95a R. Agrawal and R. Srikant, Mining
    Sequential Patterns, Proc. ICDE, Taipei, Mar.
    1995, pp. 3-14
  • AGR95b R. Agrawal, K.Lin, H. Sawhney and K.
    Shim, Fast Similarity Search in the Presense of
    Noise, Scaling, and Translation in Time-Series
    Databases, Proc. VLDB, Zurich, Sep. 1995, pp.
    490-501
  • AGR95c R. Agrawal, G. Psaila, E. Wimmers and M.
    Zait, Querying Shapes of Histories, Proc. VLDB,
    Zurich, Sep. 1995, pp. 502-514
  • HAT96 K. Hatonen, M. Klemettinen, H. Mannila,
    P. Ronkainen and H. Toivonen, Knowledge
    Discovery from Telecommunication Network Alarm
    Databases, Proc. ICDE, New Orleans, Feb. 1996,
    pp. 115-123
  • SHA96 H. Shatkay and S.Zdonik, Approximate
    Queries and Representations for Large Data
    Sequences, Proc. ICDE, New Orleans, Feb. 1996,
    pp. 536-545
  • LI96 C. Li, P. Yu and V. Castelli,
    HierarchyScan A Hierarchical Similarity Search
    Algorithm for Databases of Long Sequences, Proc.
    ICDE, New Orleans, Feb. 1996, pp. 546-555
  • CHE96 M. -S. Chen, J. S. Park and P. S. Yu,
    Data Mining for Path Traversal Patterns in a Web
    Environment, Proc. ICDCS, 1997, pp. 385-392
  • SHA97 J. Shafer and R. Agrawal, Parallel
    Algorithms for High-Dimensional Proximity Joins,
    Proc. VLDB, 1997, pp. 176-185

32
References(Contd)
  • Classification/Clustering
  • QUI89 J. Quinlan and R. Rivest, Inferring
    Decision Trees Using the Minimum Description
    Length Principle, Information and Computation,
    Vol. 80, 1989, pp. 227-248
  • YAS91 R. Yasdi, Learning Classification Rules
    from Database in the Context of Knowledge
    Acquisition and Representation, IEEE TKDE, Vol.
    3, No. 3, Sep. 1991, pp. 293-306
  • CHA91 K. Chan and A. Wong, A Statistical
    Technique for Extracting Classificatory Knowledge
    from Databases, Knowledge Discovery in
    Databases, G. Piatetsky-Shapiro and W. Frawley
    Ed., AAAI Press, 1991, pp. 107-123
  • UTH91 R. Uthursamy, U. Fayyad and S. Spangler,
    Learning Useful Rules from Inconclusive Data,
    Knowledge Discovery in Databases, G.
    Piatetsky-Shapiro and W. Frawley Ed., AAAI Press,
    1991, pp. 141-157
  • ZIA91 W. Ziarko, The Discovery, Analysis and
    Representation of Data Dependencies in
    Databases, Knowledge Discovery in Databases, G.
    Piatetsky-Shapiro and W. Frawley Ed., AAAI Press,
    1991, pp. 195-209
  • PIA91 G. Piatetsky-Shapiro, Discovery,
    Analysis and Presentation of Strong Rules,
    Knowledge Discovery in Databases, G.
    Piatetsky-Shapiro and W. Frawley Ed., AAAI Press,
    1991, pp. 229-248
  • MAN91 M. Manago and Y. Kodratoff, Induction of
    Decision Trees from Complex Structured Data,
    Knowledge Discovery in Databases, G.
    Piatetsky-Shapiro and W. Frawley Ed., AAAI Press,
    1991, pp. 289-306

33
References(Contd)
  • SMY92 P. Smyth and R. Goodman, An Information
    Theoretic Approach to Rule Induction from
    Databases, IEEE TKDE, Vol. 4, No. 4, Aug. 1992,
    pp. 301-316
  • WAN92 L. Wang and J. Mendel, Generating Fuzzy
    Rules by Learning from Examples, IEEE TSMC, Vol.
    22, No. 6, Nov. 1992, pp. 1414-1427
  • AGR92 R. Agrawal, S. Ghosh, T. Imielinski, B.
    Iyer and A. Swami, An Interval Classifier for
    Database Mining Applications, Proc. VLDB,
    Vancouver, Aug. 1992, pp.207-216
  • LU95 H. Lu, R. Setiono and H. Liu, NeuroRule
    A Connectionist Approach to Data Mining, Proc.
    VLDB, Zurich, Sep. 1995, 478-489
  • HON91 J. Hong and C. Mao, Incremental
    Discovery of Rules and Structure by Hierarchical
    and Parallel Clustering, Knowledge Discovery in
    Databases, G. Piatetsky-Shapiro and W. Frawley
    Ed., AAAI Press, 1991, pp. 177-194
  • NG94 R. Ng and J. Han, Efficient and Effective
    Clustering Methods for Spatial Data Mining,
    Proc. VLDB, 1994, pp. 144-155
  • XU98 X. Xu, M. Ester, H. -P. Kriegel and J.
    Sander, A Distribution-Based Clustering
    Algorithm for Mining in Large Spatial Databases,
    Proc. Intl Conf. on Data Engineering, 1998, pp.
    324-333

34
References(Contd)
  • System Implementations
  • SEL96 P.Selfridge, D.Srivastava and L. Wilson,
    IDEA Interactive Data Exploration and
    Analysis, Proc. SIGMOD, Quebec, Jun. 1996, pp.
    24-34
  • MEO98 R. Meo, G. Psalia and S. Ceri, A
    Tightly-Coupled Architecture for Data Mining,
    Proc. Intl Conf. on Data Engineering, 1998, pp.
    316-323
  • HAN96 J. Han et. al., DBMiner A System for
    Mining Knowledge in Large Relational Databases,
    Proc. KDD, 1996
  • HAN97 J. Han et. al., GEOMiner A System
    Prototype for Spatial Data Mining, Proc. SIGMOD,
    1997
  • HAN98 WebMiner A Resource and Knowledge
    Discovery System for the Internet,
    http//db.cs.sfu.ca/WebMiner/
  • KOH96 R. Kohavi et. al., Data Mining Using
    MCL A Machine Learning Library in C, Proc.
    Tools with AI, 1996, pp. 234-245
  • HAL98 C. Hall ed., MineSet 2.0 for Data Mining
    and Multidimensional Data Analysis,
    http//www.cgi.com/Products/software/MineSet/DMStr
    ategies/index.html
About PowerShow.com