Title: CIS303 Advanced Forensic Computing

Transcript and Presenter's Notes
1
CIS303 Advanced Forensic Computing
  • Dr Giles Oatley

2
Relational data mining
  • Relational data mining is data mining applied to
    relational databases.
  • Unlike traditional data mining algorithms, which
    look for patterns in a single table
    (propositional patterns), relational data mining
    algorithms look for patterns among multiple
    tables (relational patterns).
  • For most types of propositional patterns, there
    are corresponding relational patterns.
  • For example, there are relational classification
    rules, relational regression trees, relational
    association rules, and so on.
  • The most important theoretical foundation of
    relational data mining is inductive logic
    programming.

3
Rule-Based Learning
[Figure: positive and negative examples]
  • Goal: induce a rule (or rules) that explains ALL
    positive examples and NO negative examples

4
Inductive Logic Programming (ILP)
  • Encode background knowledge in first-order logic
    as facts

containsBlock(ex1,block1A). containsBlock(ex1,block1B).
is_red(block1A). is_square(block1A).
is_blue(block1B). is_round(block1B).
onTopOf(block1B,block1A).
and logical relations
above(A,B) :- onTopOf(A,B).
above(A,B) :- onTopOf(A,Z), above(Z,B).
5
Inductive Logic Programming (ILP)
  • Covering algorithm applied to explain all data

  • Choose some positive example
  • Generate the best rule that covers this example
  • Remove all examples covered by this rule
  • Repeat until every positive example is covered
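A minimal sketch of this covering loop in Python, assuming a hypothetical find_best_rule search (for example, one that grows a rule from the seed example's bottom clause) and a covers test on rules; neither is specified on the slide:

# Sequential covering, as outlined above (illustrative sketch only).
def covering(positives, negatives, find_best_rule):
    """Learn a rule set that covers all positives and no negatives."""
    rules = []
    remaining = list(positives)
    while remaining:
        seed = remaining[0]                      # choose some positive example
        # assumed to return a rule that covers the seed and no negatives
        rule = find_best_rule(seed, remaining, negatives)
        rules.append(rule)
        # remove all positive examples covered by this rule
        remaining = [e for e in remaining if not rule.covers(e)]
    return rules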
6
Inductive Logic Programming (ILP)
  • Saturate an example by writing everything true
    about it
  • The saturation of an example is the bottom clause
    (⊥)

positive(ex2) :-
  containsBlock(ex2,block2A), containsBlock(ex2,block2B),
  containsBlock(ex2,block2C),
  isRed(block2A), isRound(block2A),
  isBlue(block2B), isRound(block2B),
  isBlue(block2C), isSquare(block2C),
  onTopOf(block2B,block2A), onTopOf(block2C,block2B),
  above(block2B,block2A), above(block2C,block2B),
  above(block2C,block2A).
[Figure: example ex2, block C on top of block B on top of block A]
7
Inductive Logic Programming (ILP)
Selected literals from ⊥
  • Candidate clauses are generated by
  • choosing literals from ?
  • converting ground terms to variables
  • Search through the space of candidate clauses
    using a standard AI search algorithm
  • The bottom clause ensures the search is finite

containsBlock(ex2,block2B)
isRed(block2A)
onTopOf(block2B,block2A)

Candidate clause:
positive(A) :- containsBlock(A,B), onTopOf(B,C), isRed(C).
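An illustrative sketch (not from the slides) of this variabilisation step: selected ground literals from ⊥ become a candidate clause body by mapping ground terms to variables consistently.

# Replace ground terms with variables, reusing the same variable for the
# same constant, exactly as in the candidate clause above.
def variabilise(literals):
    """literals: list of (predicate, args) tuples with constant args."""
    var_of = {}                                   # constant -> variable name
    names = iter("ABCDEFGHIJKLMNOPQRSTUVWXYZ")

    def var(term):
        if term not in var_of:
            var_of[term] = next(names)
        return var_of[term]

    return [(pred, tuple(var(a) for a in args)) for pred, args in literals]

selected = [("containsBlock", ("ex2", "block2B")),
            ("isRed", ("block2A",)),
            ("onTopOf", ("block2B", "block2A"))]
body = variabilise(selected)
# -> containsBlock(A,B), isRed(C), onTopOf(B,C), as in the candidate clause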
8
Contact lens dataset
9
Contact lens decision list

IF Tear-production = reduced THEN Lenses = none (12)
ELSE /* Tear-production = normal */
  IF Astigmatic = no THEN Lenses = soft (6/1)
  ELSE /* Astigmatic = yes */
    IF Spectacles = myope THEN Lenses = hard (3)
    ELSE /* Spectacles = hypermetrope */ Lenses = none (3/1)

Confusion matrix:
  a  b  c    <-- classified as
  5  0  0 |  a = soft
  0  3  1 |  b = hard
  1  0 14 |  c = none
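The decision list transliterated into plain conditionals (attribute and value names as on the slide; the comments repeat the coverage counts given above):

def predict_lenses(tear_production, astigmatic, spectacles):
    if tear_production == "reduced":
        return "none"          # covers 12 examples
    # tear-production = normal
    if astigmatic == "no":
        return "soft"          # covers 6 examples, 1 error
    # astigmatic = yes
    if spectacles == "myope":
        return "hard"          # covers 3 examples
    return "none"              # spectacles = hypermetrope; 3 examples, 1 error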
10
Contact lens association rules
1.  (1.00) Tear-production=reduced 12 ==> Lenses=none 12
2.  (1.00) Astigmatic=yes Tear-production=reduced 6 ==> Lenses=none 6
3.  (1.00) Astigmatic=no Tear-production=reduced 6 ==> Lenses=none 6
4.  (1.00) Spectacles=hypermetrope Tear-production=reduced 6 ==> Lenses=none 6
5.  (1.00) Spectacles=myope Tear-production=reduced 6 ==> Lenses=none 6
6.  (1.00) Lenses=soft 5 ==> Astigmatic=no Tear-production=normal 5
7.  (1.00) Astigmatic=no Lenses=soft 5 ==> Tear-production=normal 5
8.  (1.00) Tear-production=normal Lenses=soft 5 ==> Astigmatic=no 5
9.  (1.00) Lenses=soft 5 ==> Tear-production=normal 5
10. (1.00) Lenses=soft 5 ==> Astigmatic=no 5
11. (0.86) Astigmatic=no Lenses=none 7 ==> Tear-production=reduced 6
12. (0.86) Spectacles=myope Lenses=none 7 ==> Tear-production=reduced 6
13. (0.83) Astigmatic=no Tear-production=normal 6 ==> Lenses=soft 5
14. (0.83) Spectacles=hypermetrope Astigmatic=yes 6 ==> Lenses=none 5
15. (0.80) Lenses=none 15 ==> Tear-production=reduced 12
16. (0.75) Astigmatic=yes Lenses=none 8 ==> Tear-production=reduced 6
17. (0.75) Spectacles=hypermetrope Lenses=none 8 ==> Tear-production=reduced 6
18. (0.75) Age=presbyopic 8 ==> Lenses=none 6

          B   ¬B
    H    12    3   15
   ¬H     0    9    9
         12   12   24

          B   ¬B
    H     6    9   15
   ¬H     2    7    9
          8   16   24
11
Clauses sorted by confirmation
1.  (.76 .00) Tear-prod=reduced -> Lenses=none
2.  (.76 .12) Lenses=none -> Tear-prod=reduced
3.  (.67 .04) Lenses=none -> Age=presb or Tear-prod=reduced
4.  (.63 .04) Astigm=no and Tear-prod=normal -> Lenses=soft
5.  (.54 .00) Astigm=no and Tear-prod=normal -> Age=presb or Lenses=soft
6.  (.50 .08) Astigm=yes and Tear-prod=normal -> Lenses=hard
7.  (.50 .08) Lenses=none -> Age=pre-presb or Tear-prod=reduced
8.  (.47 .04) Lenses=none -> Specs=hmetr or Tear-prod=reduced
9.  (.47 .04) Lenses=none -> Astigm=yes or Tear-prod=reduced
10. (.47 .00) Lenses=soft -> Astigm=no
11. (.47 .00) Lenses=soft -> Tear-prod=normal
12. (.47 .00) Specs=myope and Astigm=yes and Tear-prod=normal -> Lenses=hard
13. (.47 .00) Lenses=none -> Age=presb or Specs=hmetr or Tear-prod=reduced
14. (.47 .00) Lenses=none -> Age=presb or Astigm=yes or Tear-prod=reduced
15. (.45 .00) Specs=hmetr and Astigm=no and Tear-prod=normal -> Lenses=soft
16. (.44 .29) Astigm=no -> Lenses=soft
17. (.44 .29) Tear-prod=normal -> Lenses=soft

          B   ¬B
    H    12    3   15
   ¬H     0    9    9
         12   12   24

          B   ¬B
    H     5    0    5
   ¬H     7   12   19
         12   12   24
12
A toy example
13
East-West trains (flattened)
  • Example: eastbound(t1).
  • Background knowledge:
    hasCar(t1,c1). hasCar(t1,c2). hasCar(t1,c3). hasCar(t1,c4).
    cshape(c1,rect). cshape(c2,rect). cshape(c3,rect). cshape(c4,rect).
    clength(c1,short). clength(c2,long). clength(c3,short). clength(c4,long).
    croof(c1,none). croof(c2,none). croof(c3,peak). croof(c4,none).
    cwheels(c1,2). cwheels(c2,3). cwheels(c3,2). cwheels(c4,2).
    hasLoad(c1,l1). hasLoad(c2,l2). hasLoad(c3,l3). hasLoad(c4,l4).
    lshape(l1,circ). lshape(l2,hexa). lshape(l3,tria). lshape(l4,rect).
    lnumber(l1,1). lnumber(l2,1). lnumber(l3,1). lnumber(l4,3).
  • Hypothesis: eastbound(T) :- hasCar(T,C), clength(C,short),
    not croof(C,none).

15
East-West trains (terms)
  • Example:
    eastbound([car(rect,short,none,2,load(circ,1)),
               car(rect,long,none,3,load(hexa,1)),
               car(rect,short,peak,2,load(tria,1)),
               car(rect,long,none,2,load(rect,3))]).
  • Background knowledge: member/2, arg/3
  • Hypothesis: eastbound(T) :- member(C,T), arg(2,C,short),
    not arg(3,C,none).

17
ER diagram for East-West trains
18
Train-as-set database
SELECT DISTINCT TRAIN_TABLE.TRAIN
FROM TRAIN_TABLE, CAR_TABLE
WHERE TRAIN_TABLE.TRAIN = CAR_TABLE.TRAIN
  AND CAR_TABLE.LENGTH = 'short'
  AND CAR_TABLE.ROOF != 'none'
19
Individual-centred representations
  • ER diagram is a tree (approximately)
  • root denotes individual
  • looking downwards from the root, only one-to-one
    or one-to-many relations are allowed
  • one-to-one cycles are allowed
  • Database can be partitioned according to
    individual
  • Alternative: all information about a single
    individual packed together in a term
  • tuples, lists, trees, sets, multisets, graphs, ...

20
Mutagenesis
21
Complexity of learning problems
  • Simplest case: single table with primary key
  • attribute-value or propositional learning
  • example corresponds to a tuple of constants
  • Next: single table without primary key
  • multi-instance problem
  • example corresponds to a set of tuples of constants
  • Complexity resides in many-to-one foreign keys
  • non-determinate variables
  • lists, trees, sets, multisets, graphs, ...

22
Subgroup discovery
  • An interesting subgroup has a class distribution
    which differs significantly from overall
    distribution
  • This can be modelled as classification with
    profits (for true pos/neg) and costs (for false
    pos/neg)
  • Requires different heuristics and/or trade-off
    between accuracy and generality
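A minimal sketch of this idea: compare the subgroup's proportion of positives with the overall proportion, weighted by the subgroup's generality. The particular quality function below is just one possible heuristic, shown for illustration.

def subgroup_quality(n_sub, pos_sub, n_total, pos_total):
    """Weigh how unusual the subgroup's class distribution is."""
    p_sub = pos_sub / n_sub            # proportion of positives in the subgroup
    p_all = pos_total / n_total        # overall proportion of positives
    generality = n_sub / n_total       # trade-off: how large the subgroup is
    return generality * (p_sub - p_all)

# Made-up example: 8 of 24 examples fall in the subgroup, 6 of them positive,
# versus 12 positives overall; a positive score means positives are
# over-represented in the subgroup.
print(subgroup_quality(8, 6, 24, 12))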

23
Upgrading to first-order logic
  • Use function-free Prolog as representation
    language
  • normal-form logic, simple syntax
  • specialisation well understood
  • For rule evaluation, generate all grounding
    substitutions
  • specialisation may increase sample size
  • if problematic, use first-order features and
    count only over global variables

24
First-order features
  • Features concern interactions of local variables
  • The following rule has one boolean feature, 'has
    a short closed car':
  • eastbound(T) :- hasCar(T,C), clength(C,short),
    not croof(C,none).
  • The following rule has two boolean features, 'has
    a short car' and 'has a closed car':
  • eastbound(T) :- hasCar(T,C1), clength(C1,short),
    hasCar(T,C2), not croof(C2,none).

25
Propositionalising rules
  • Equivalently
  • eastbound(T) :- hasShortCar(T), hasClosedCar(T).
  • hasShortCar(T) :- hasCar(T,C1), clength(C1,short).
  • hasClosedCar(T) :- hasCar(T,C2), not croof(C2,none).
  • Given a way to construct and select first-order
    features, rule construction is semi-propositional
  • head and body literals have the same global
    variable(s)
  • corresponds to single table, one row per example
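A small sketch of this propositionalisation, assuming a hypothetical Python encoding of train t1 from the earlier background knowledge; each first-order feature becomes one boolean column in a single table with one row per train.

# Cars of t1 as in the flattened representation: short/none, long/none,
# short/peak, long/none.
trains = {
    "t1": [{"length": "short", "roof": "none"},
           {"length": "long",  "roof": "none"},
           {"length": "short", "roof": "peak"},
           {"length": "long",  "roof": "none"}],
}

def has_short_car(cars):
    return any(c["length"] == "short" for c in cars)

def has_closed_car(cars):
    return any(c["roof"] != "none" for c in cars)

# One row per train, one boolean column per first-order feature.
table = {t: {"hasShortCar": has_short_car(cs), "hasClosedCar": has_closed_car(cs)}
         for t, cs in trains.items()}
print(table)   # {'t1': {'hasShortCar': True, 'hasClosedCar': True}}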

26
First-order feature bias in Tertius
  • Flattened representation, but derived from
    strongly-typed term representation
  • one free global variable
  • each (binary) structural predicate introduces a
    new existential local variable and uses either
    global variable or local variable introduced by
    other structural predicate
  • utility predicates only use variables
  • all variables are used
  • NB. features can be non-boolean
  • if all structural predicates are one-to-one

27
The Tertius system
  • A*, anytime top-down search algorithm
  • optimal refinement operator
  • 7500 lines of GNU C
  • propositional Weka plug-in available
  • P.A. Flach & N. Lachiche (2001),
    Confirmation-guided discovery of first-order
    rules with Tertius, Machine Learning 42(1/2):
    61-95
  • www.cs.bris.ac.uk/Research/MachineLearning/Tertius/

28
Subgroups vs. classifiers
  • Classification rules aim at pure subgroups
  • Subgroups aim at significantly higher (or
    different) proportion of positives
  • essentially the same as cost-sensitive
    classification
  • instead of a false-negative cost we have a
    true-positive profit

29
Relational Data Mining
  • Single-table assumption
  • (Multi-)relational data mining and ILP
  • FO representations
  • Upgrading propositional DM systems to FOL
  • A case study Mining Association rules
  • Conclusions

30
Standard Data Mining Approach
  • Most existing data mining approaches look for
    patterns in a single table of data (or DB
    relation)
  • Each row represents an object and columns
    represent properties of objects.
  • Single table assumption

31
Standard Data Mining Approach
  • In the customer table we can add as many
    attributes about our customers as we like.
  • A person's number of children
  • For other kinds of information the single-table
    assumption turns out to be a significant
    limitation
  • Add information about orders placed by a
    customer, in particular
  • Delivery and payment modes
  • With which kind of store the order was placed
    (size, ownership, location)
  • For simplicity, no information on the goods
    ordered

32
Standard Data Mining Approach
  • This solution works fine for once-only customers
  • What if our business has repeat customers?
  • Under the single-table assumption we can
  • Make one entry for each order in our customer
    table
  • We get the usual problems of non-normalized tables
  • Redundancy, anomalies, ...

33
Standard Data Mining Approach
  • one line per order ⇒ analysis results will really
    be about orders, not customers, which is not what
    we might want!
  • Aggregate order data into a single tuple per
    customer.
  • No redundancy. Standard DM methods work fine, but
  • There is a lot less information in the new table
  • What if the payment mode and the store type are
    important?

34
Relational Data
  • A database designer would represent the
    information in our problem as a set of tables (or
    relations)

35
Relational Data Mining
  • (Multi-)Relational data mining algorithms can
    analyze data distributed in multiple relations,
    as they are available in relational database
    systems.
  • These algorithms come from the field of inductive
    logic programming (ILP)
  • ILP has been concerned with finding patterns
    expressed as logic programs
  • Initially, ILP focussed on automated program
    synthesis from examples
  • In recent years, the scope of ILP has broadened
    to cover the whole spectrum of data mining tasks
    (association rules, regression, clustering, )

36
ILP successes in scientific fields
  • In the field of chemistry/biology
  • Toxicology
  • Prediction of Diterpene classes from nuclear
    magnetic resonance (NMR) spectra
  • Analysis of traffic accident data
  • Analysis of survey data in medicine
  • Prediction of ecological biodegradation rates
  • The first commercial data mining systems with ILP
    technology are becoming available.

37
Relational patterns
  • Relational patterns involve multiple relations
    from a relational database.
  • They are typically stated in a more expressive
    language than patterns defined on a single data
    table.
  • Relational classification rules
  • Relational regression trees
  • Relational association rules
  • IF Customer(C1,N1,FN1,Str1,City1,Zip1,Sex1,SoSt1,
    In1,Age1,Resp1)
  • AND order(C1,O1,S1,Deliv1,Pay1)
  • AND Pay1 = credit_card
  • AND In1 >= 108000
  • THEN Resp1 = yes

38
Relational patterns
  • IF Customer(C1,N1,FN1,Str1,City1,Zip1,Sex1,SoSt1,
    In1,Age1,Resp1)
  • AND order(C1,O1,S1,Deliv1,Pay1)
  • AND Pay1 = credit_card
  • AND In1 >= 108000
  • THEN Resp1 = yes
  • good_customer(C1) ←
  • customer(C1,N1,FN1,Str1,City1,Zip1,Sex1,SoSt1,
    In1,Age1,Resp1) ∧ order(C1,O1,S1,Deliv1,
    credit_card) ∧
  • In1 >= 108000
  • This relational pattern is expressed in a subset
    of first-order logic!
  • A relation in a relational database corresponds
    to a predicate in predicate logic (see deductive
    databases)

39
Relational decision tree
  • Equivalent Prolog program
  • class(sendback) :- worn(X), not_replaceable(X), !.
  • class(fix) :- worn(X), !.
  • class(keep).

40
Relational regression rule
  • Background knowledge

Induced model
41
Relational association rule
  • Relational database

likes(KID, piglet), likes(KID, ice-cream)
  ==> likes(KID, dolphin)    (9%, 85%)
likes(KID, A), has(KID, B) ==> prefers(KID, A, B)    (70%, 98%)
42
First-order representations
  • An example is a set of ground facts, that is a
    set of tuples in a relational database
  • From the logical point of view this is called a
    (Herbrand) interpretation because the facts
    represent all atoms which are true for the
    example, thus all facts not in the example are
    assumed to be false.
  • From the computational point of view each example
    is a small relational database or a Prolog
    knowledge base
  • A Prolog interpreter can be used for querying an
    example.

43
FO representation (ground clauses)
  • Example:
    eastbound(t1) :-
      car(t1,c1), rectangle(c1), short(c1), none(c1), two_wheels(c1),
      load(c1,l1), circle(l1), one_load(l1),
      car(t1,c2), rectangle(c2), long(c2), none(c2), three_wheels(c2),
      load(c2,l2), hexagon(l2), one_load(l2),
      car(t1,c3), rectangle(c3), short(c3), peaked(c3), two_wheels(c3),
      load(c3,l3), triangle(l3), one_load(l3),
      car(t1,c4), rectangle(c4), long(c4), none(c4), two_wheels(c4),
      load(c4,l4), rectangle(l4), three_load(l4).
  • Background theory:
  • polygon(X) :- rectangle(X).
  • polygon(X) :- triangle(X).
  • Hypothesis: eastbound(T) :- car(T,C), short(C), not
    none(C).

44
Background knowledge
  • As background knowledge is visible for each
    example, all the facts that can be derived from
    the background knowledge and an example are part
    of the extended example.
  • Formally, an extended example is the minimal
    Herbrand model of the example and the background
    theory.
  • When querying an example, it suffices to assert
    the background knowledge and the example; the
    Prolog interpreter will do the necessary
    derivations.

45
Learning from interpretations
  • The ground-clause representation is characteristic
    of an ILP setting known as learning from
    interpretations.
  • Similar to older work on structural matching.
  • It is common to several relational data mining
    systems, such as
  • CLAUDIEN: searches for a set of clausal
    regularities that hold on the set of examples
  • TILDE: top-down induction of logical decision
    trees
  • ICL: inductive classification logic (an upgrade of
    CN2)
  • It contrasts with the classical ILP setting
    employed by the systems PROGOL and FOIL.

46
FO representation (flattened)
  • Example: eastbound(t1).
  • Background theory:
    car(t1,c1). car(t1,c2). car(t1,c3). car(t1,c4).
    rectangle(c1). rectangle(c2). rectangle(c3). rectangle(c4).
    short(c1). long(c2). short(c3). long(c4).
    none(c1). none(c2). peaked(c3). none(c4).
    two_wheels(c1). three_wheels(c2). two_wheels(c3). two_wheels(c4).
    load(c1,l1). load(c2,l2). load(c3,l3). load(c4,l4).
    circle(l1). hexagon(l2). triangle(l3). rectangle(l4).
    one_load(l1). one_load(l2). one_load(l3). three_loads(l4).
  • Hypothesis: eastbound(T) :- car(T,C), short(C), not
    none(C).

47
FO representation (terms)
  • Example:
    eastbound([c(rectangle,short,none,2,l(circle,1)),
               c(rectangle,long,none,3,l(hexagon,1)),
               c(rectangle,short,peaked,2,l(triangle,1)),
               c(rectangle,long,none,2,l(rectangle,3))]).
  • Background theory: empty
  • Hypothesis: eastbound(T) :- member(C,T), arg(2,C,short),
    not arg(3,C,none).

48
FO representation (strongly typed)
  • Type signature:
    data Shape = Rectangle | Hexagon
    data Length = Long | Short
    data Roof = None | Peaked
    data Object = Circle | Hexagon
    type Wheels = Int
    type Number = Int
    type Load = (Object,Number)
    type Car = (Shape,Length,Roof,Wheels,Load)
    type Train = [Car]
    eastbound :: Train -> Bool
  • Example:
    eastbound([(Rectangle,Short,None,2,(Circle,1)),
               (Rectangle,Long,None,3,(Hexagon,1)),
               (Rectangle,Short,Peaked,2,(Triangle,1)),
               (Rectangle,Long,None,2,(Rectangle,3))]) = True
  • Hypothesis:
    eastbound(t) = (exists \c -> member(c,t) &&
                    proj2(c) == Short && proj3(c) != None)
  • Example language: Escher (functional logic
    programming)

49
FO representation (database)
SELECT DISTINCT TRAIN_TABLE.TRAIN
FROM TRAIN_TABLE, CAR_TABLE
WHERE TRAIN_TABLE.TRAIN = CAR_TABLE.TRAIN
  AND CAR_TABLE.LENGTH = 'short'
  AND CAR_TABLE.ROOF != 'none'
50
Individual-centered representation
  • The database contains information on a number of
    trains.
  • Each train is an individual.
  • The database can be partitioned according to
    individual to obtain a ground-clause
    representation
  • Problem: sometimes individuals share common parts.
  • Example: we want to discriminate black and white
    figures on the basis of their position.
  • Each geometric figure is an individual

51
Object-centered representation
  • The whole sequence is an object, which can be
    represented by a multiple-head ground clause
  • black(x11) ∨ black(x12) ∨ white(x13) ∨ black(x14) :-
  •   first(x11), crl(x11), next(x12,x11), crl(x12),
  •   sqr(x13), crl(x14), next(x14,x13), next(x13,x12).
  • This is the representation adopted in ATRE.

52
How to upgrade propositional DM algorithms to
first-order
  • Identify the propositional DM system that best
    matches the DM task
  • Use interpretations to represent examples
  • Upgrade the representation of propositional
    hypotheses: replace attribute-value tests by
    first-order literals and modify the coverage test
    accordingly.
  • Structure the search-space by a more-general-than
    relation that works on first-order
    representations
  • θ-subsumption
  • Adapt the search operators for searching the
    corresponding rule space
  • Use a declarative bias mechanism to limit the
    search space
  • Implement
  • Evaluate your (first-order) implementation on
    propositional and relational data
  • Add interesting extra features

53
Mining association rules: a case study
  • A set I of literals, called items.
  • A set D of transactions t such that t ⊆ I.
  • X ==> Y (s, c)
  • "IF a pattern X appears in a transaction t, THEN
    the pattern Y tends to hold in the same
    transaction t"
  • X ⊆ I, Y ⊆ I, X ∩ Y = ∅
  • s = p(X ∪ Y)   (support)
  • c = p(Y|X) = p(X ∪ Y) / p(X)   (confidence)
    (a worked sketch follows below)
  • Agrawal, Imielinski & Swami.
  • Mining association rules between sets of items in
    large databases.
  • Proc. SIGMOD 1993
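A worked sketch of these definitions over a made-up transaction set; support and confidence are computed by simple counting.

# Toy transactions, each a set of items (invented for illustration).
transactions = [
    {"bread", "butter", "cheese"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "cheese"},
    {"bread", "butter", "cheese", "milk"},
]

def support(itemset, D):
    """p(itemset): fraction of transactions containing every item."""
    return sum(itemset <= t for t in D) / len(D)

def confidence(X, Y, D):
    """p(Y|X) = p(X u Y) / p(X)."""
    return support(X | Y, D) / support(X, D)

X, Y = {"bread", "butter"}, {"cheese"}
print(support(X | Y, transactions))    # 0.4     -> support of X ==> Y
print(confidence(X, Y, transactions))  # 0.666.. -> confidence of X ==> Y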

54
What is an association rule?
Example: market basket analysis. Each transaction is
the list of items bought by a customer on a single
visit to a store. It is represented as a row in a
table.
IF a customer buys bread and butter THEN he also buys
cheese (20%, 66%)
Given that 20% of customers buy bread, cheese and
butter, 66% of customers who buy bread and butter
also buy cheese.
55
Mining association rules The propositional
approach
  • Problem statement
  • Given
  • a set of transactions D
  • a couple of thresholds, minsup and minconf
  • Find
  • all association rules that have support and
    confidence greater than minsup and minconf
    respectively.

56
Mining association rules The propositional
approach
  • Problem decomposition
  • Find large (or frequent) itemsets
  • Generate highly-confident association rules
  • Representation issues
  • The transaction set D may be a data file, a
    relational table or the result of a relational
    expression
  • Each transaction is a binary vector

57
Mining association rules The propositional
approach
  • Solution to the first sub-problem
  • The APRIORI algorithm (Agrawal & Srikant, 1999)
  • Find large 1-itemsets
  • Cycle on the size k (k > 1) of the itemsets
  • APRIORI-gen: generate candidate k-itemsets from
    large (k-1)-itemsets
  • Generate large k-itemsets from candidate
    k-itemsets (cycle on the transactions in D)
  • until no more large itemsets are found.
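A compact sketch of this level-wise loop, assuming set-valued transactions; candidate generation is reduced to its simplest join-and-prune form, without the counting optimisations of the real algorithm.

from itertools import combinations

def apriori(transactions, minsup):
    n = len(transactions)

    def freq(itemset):
        # fraction of transactions containing the itemset
        return sum(itemset <= t for t in transactions) / n

    items = {frozenset([i]) for t in transactions for i in t}
    large = {s for s in items if freq(s) >= minsup}      # large 1-itemsets
    all_large, k = set(large), 2
    while large:
        # APRIORI-gen: join large (k-1)-itemsets into candidate k-itemsets
        candidates = {a | b for a in large for b in large if len(a | b) == k}
        # prune candidates with an infrequent (k-1)-subset
        candidates = {c for c in candidates
                      if all(frozenset(s) in large for s in combinations(c, k - 1))}
        large = {c for c in candidates if freq(c) >= minsup}
        all_large |= large
        k += 1
    return all_large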

58
Mining association rules The propositional
approach
  • Solution to the second sub-problem
  • For every large itemset Z, find all non-empty
    proper subsets X of Z
  • For every subset X, output a rule of the form
    X ==> (Z - X) if support(Z)/support(X) >= minconf
    (a small sketch follows after the references below)
  • Relevant work
  • Agrawal & Srikant (1999). Fast Algorithms for
    Mining Association Rules, in Readings in Database
    Systems, Morgan Kaufmann Publishers.
  • Han & Fu (1995). Discovery of Multiple-Level
    Association Rules from Large Databases, in Proc.
    21st VLDB Conference.
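A sketch of the rule-generation step for a single large itemset Z, assuming a support function like the counting one in the earlier sketch (here taken as already closed over the transaction set D).

from itertools import combinations

def rules_from_itemset(Z, support, minconf):
    """Emit every rule X ==> Z - X whose confidence reaches minconf."""
    Z = frozenset(Z)
    rules = []
    for r in range(1, len(Z)):                       # non-empty proper subsets
        for X in map(frozenset, combinations(Z, r)):
            conf = support(Z) / support(X)           # confidence of X ==> Z - X
            if conf >= minconf:
                rules.append((set(X), set(Z - X), conf))
    return rules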

59
Mining association rules The ILP approach
  • Problem statement
  • Given
  • a deductive relational database D
  • a couple of thresholds, minsup and minconf
  • Find
  • all association rules that have support and
    confidence greater than minsup and minconf
    respectively.

60
Mining association rules The ILP approach
  • Problem decomposition
  • Find large (or frequent) atomsets
  • Generate highly-confident association rules
  • Representation issues
  • A deductive relational database is a relational
    database which may be represented in first-order
    logic as follows
  • Relation → set of ground facts (EDB)
  • View → set of rules (IDB)

61
Mining association rules The ILP approach
  • Example: relational database

likes(joni, ice-cream)                               (an atom)
likes(KID, piglet), likes(KID, ice-cream)            (an atomset)
  ==> likes(KID, dolphin)    (9%, 85%)
likes(KID, A), has(KID, B) ==> prefers(KID, A, B)    (70%, 98%)
62
Mining association rules The ILP approach
  • Solution to the first sub-problem
  • The WARMR algorithm (Dehaspe & De Raedt, 1997)
  • L. Dehaspe & L. De Raedt (1997). Mining
    Association Rules in Multiple Relations, Proc.
    Conf. on Inductive Logic Programming
  • Compute large 1-atomsets
  • Cycle on the size k (k > 1) of the atomsets
  • WARMR-gen: generate candidate k-atomsets from
    large (k-1)-atomsets
  • Generate large k-atomsets from candidate
    k-atomsets (cycle on the observations loaded from
    D)
  • until no more large atomsets are found.

63
Mining association rules The ILP approach
  • WARMR
  • Breadth-first search on the atomset lattice
  • Loading of an observation o from D (query result)
  • Largeness of candidate atomsets computed by a
    coverage test
  • APRIORI
  • Breadth-first search on the itemset lattice
  • Loading of a transaction t from D (tuple)
  • Largeness of candidate itemsets computed by a
    subset check

64
Mining association rules The ILP approach
  • Pattern Space

(queries ordered from most specific to most general)
false  ≤  Q1 = ?- is_a(X, large_town), intersects(X,R), is_a(R, road)
       ≤  Q2 = ?- is_a(X, large_town), intersects(X,Y)
       ≤  Q3 = ?- is_a(X, large_town)
       ≤  true
65
Mining association rules The ILP approach
  • Candidate generation

Refinement step
is_a(X, large_town), intersects(X,R), is_a(R,
road)
Pruning step
66
Mining association rules The ILP approach
  • Candidate evaluation

is_a(X, large_town), intersects(X,R), is_a(R, road),
adjacent_to(X,W), is_a(W, water)

?- is_a(X, large_town), intersects(X,R), is_a(R, road),
   adjacent_to(X,W), is_a(W, water)

D
Large?
<X=barletta, R=a14, W=adriatico>  <X=bari, R=ss16bis, W=adriatico>  ...
67
Mining association rules The ILP approach
is_a(X, large_town), intersects(X,R), is_a(R,
road), adjacent_to(X,W), is_a(W, water)
68
Conclusions and future work
  • Multi-relational data mining: more data mining
    than logic program synthesis
  • choice of representation formalisms
  • input format more important than output format
  • data modelling e.g. object-oriented data mining
  • new learning tasks and evaluation measures
  • Reference
  • Saso Dzeroski and Nada Lavrac, editors,
  • Relational Data Mining,
  • Springer-Verlag, September 2001