CFI-Stream: Mining Closed Frequent Itemsets in Data Streams

Nan Jiang, Le Gruenwald. SIGKDD 06. 2006/10/04.

1
CFI-Stream: Mining Closed Frequent Itemsets in Data Streams
  • Nan Jiang, Le Gruenwald
  • SIGKDD 06
  • 2006/10/04

2
Introduction
  • Mining closed frequent itemsets
  • Computes and maintains closed itemsets online and
    incrementally
  • Performs closure checking
  • Outputs the current closed frequent itemsets in
    real time, based on user-specified thresholds

3
Definition
  • D: the data stream
  • I = {i1, i2, ..., in}: a set of n elements,
    called items
  • T: a subset of all the transactions in D
  • X: a subset of all the items appearing
    in the data stream

4
Definition
  • C(X): the smallest closed set containing X
  • Definition 1
  • An itemset X is said to be closed if and only
    if C(X) = f(g(X)) = f∘g(X) = X
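Definition 1 can be made concrete with a small sketch. The slide does not spell out f and g; the code below assumes the usual reading, in which g(X) is the set of transactions containing every item of X and f(T) is the set of items common to every transaction in T. The function names are illustrative, the toy data is the four example transactions used in the slides, and returning an empty set for an itemset with no supporting transaction is a simplification made here.

```python
# Four example transactions from the slides: {C,D}, {A,B}, {A,B,C}, {A,B,C}.
transactions = [
    frozenset("CD"),
    frozenset("AB"),
    frozenset("ABC"),
    frozenset("ABC"),
]


def g(itemset, transactions):
    """Transactions that contain every item of `itemset` (assumed meaning of g)."""
    return [t for t in transactions if itemset <= t]


def f(tset):
    """Items common to all transactions in `tset` (assumed meaning of f)."""
    if not tset:
        return frozenset()  # simplification for itemsets with zero support
    return frozenset.intersection(*tset)


def closure(itemset, transactions):
    """C(X) = f(g(X))."""
    return f(g(itemset, transactions))


def is_closed(itemset, transactions):
    """X is closed if and only if C(X) == X."""
    return closure(itemset, transactions) == itemset


closure_of_a = closure(frozenset("A"), transactions)
ab_is_closed = is_closed(frozenset("AB"), transactions)
```

Here {A} is not closed because every transaction that contains A also contains B, so its closure is {A, B}, which is itself closed.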

5
Algorithm
  • CFI-Stream algorithm
  • DIrect Update (DIU) tree
  • Performs closure checking online over a sliding
    window on the data stream
  • Conditions to check for closed itemsets
  • Checked when performing addition and deletion
    operations on the DIU tree

6
DIU tree
  • Maintains the current closed itemsets
  • The DIU tree has k levels; each level i
    stores the closed i-itemsets

7
DIU tree
  • Each node in the DIU tree stores:
    - a closed itemset
    - its current support information
    - links to its parent and children nodes
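The node layout described on these slides can be sketched as a small data structure. This is a hypothetical arrangement, not the authors' implementation: the class names `DIUNode` and `DIUTree`, the `insert` and `frequent` methods, and the rule that a parent is a stored proper subset one level up are all assumptions made for illustration. The supports in the usage lines are counted from the four example transactions in the slides.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class DIUNode:
    """A closed itemset, its current support, and its links (hypothetical layout)."""
    itemset: frozenset
    support: int
    parents: list = field(default_factory=list, repr=False)
    children: list = field(default_factory=list, repr=False)


class DIUTree:
    def __init__(self):
        # Level i maps each stored closed i-itemset to its node.
        self.levels = {}

    def insert(self, itemset, support):
        """Store a closed itemset, or update its support if already present."""
        itemset = frozenset(itemset)
        level = self.levels.setdefault(len(itemset), {})
        if itemset in level:
            level[itemset].support = support
            return level[itemset]
        node = DIUNode(itemset, support)
        level[itemset] = node
        # Assumed linking rule: connect to proper subsets one level up
        # and proper supersets one level down.
        for parent in self.levels.get(len(itemset) - 1, {}).values():
            if parent.itemset < itemset:
                parent.children.append(node)
                node.parents.append(parent)
        for child in self.levels.get(len(itemset) + 1, {}).values():
            if itemset < child.itemset:
                node.children.append(child)
                child.parents.append(node)
        return node

    def frequent(self, min_support):
        """Closed itemsets whose support meets a user-specified threshold."""
        return {
            node.itemset: node.support
            for level in self.levels.values()
            for node in level.values()
            if node.support >= min_support
        }


tree = DIUTree()
for items, support in [("C", 3), ("CD", 1), ("AB", 3), ("ABC", 2)]:
    tree.insert(items, support)
frequent_at_2 = tree.frequent(2)
```

Because supports are kept on the nodes, answering a query for a given threshold is a scan over the stored closed itemsets and needs no pass over the stream.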

8
Add a Transaction to the DIU Tree
  • T1: original transaction set
  • t: newly arrived transaction
  • Conditions to check for closed itemsets
  • (1) When t is in T1: if the largest itemset X that
    it contains is not currently in the DIU tree
    -> check all of X's subsets Y that are in T1

9
  • (2) When t is not in T1: each subset Y of t
    that is in T1 needs to be checked

10
Closure Checking for Addition

11
  • Transaction 1: C, D
  • Transaction 2: A, B
  • Transaction 3: A, B, C
  • Transaction 4: A, B, C
  • (Figure: DIU tree, including node AB)
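To see how the set of closed itemsets changes as the example transactions arrive, a simple baseline is sketched here. It is not the CFI-Stream closure check: it relies on the general fact that, after a transaction t is added, the closed itemsets are the previous ones plus t itself plus every non-empty intersection of t with a previous closed itemset. The names `support`, `add_transaction`, `window`, and `history` are illustrative.

```python
def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def add_transaction(closed, transactions, t):
    """Append t and return the updated {closed itemset: support} mapping.

    Intersection-based baseline, not the CFI-Stream conditions.
    """
    t = frozenset(t)
    transactions.append(t)
    candidates = set(closed) | {t} | {c & t for c in closed}
    candidates.discard(frozenset())  # the empty itemset is not tracked
    return {c: support(c, transactions) for c in candidates}


window = []   # transactions currently in the window
closed = {}   # closed itemset -> support
history = []  # snapshot of `closed` after each arrival
for items in ["CD", "AB", "ABC", "ABC"]:
    closed = add_transaction(closed, window, items)
    history.append(dict(closed))
```

After the third arrival the itemset C becomes closed, since it is the intersection of {C, D} and {A, B, C}; after all four arrivals the closed itemsets are C, CD, AB, and ABC.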
12
Delete a Transaction in DIU Tree
  • Conditions to check for closed itemsets
  • When the number of transactions with the same
    itemset as X equals zero: each subset Y of X
    that is a closed itemset in the original
    transaction set needs to be checked

13
Closure Checking for Deletion

14
  • Transaction 1: C, D
  • Transaction 2: A, B
  • Transaction 3: A, B, C
  • Transaction 4: A, B, C
  • (Figure: DIU tree with nodes C, CD, AB, and ABC)
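The effect of a deletion on the example can be sketched with a brute-force re-check. This is a baseline under stated assumptions, not the CFI-Stream procedure: when a transaction X leaves the window, only previously closed itemsets that are subsets of X can change, so each of those is re-tested by recomputing its closure over the remaining transactions. The function names and the starting supports (counted from the four example transactions) are illustrative.

```python
def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Intersection of all transactions containing `itemset`, or None if none do."""
    covering = [t for t in transactions if itemset <= t]
    return frozenset.intersection(*covering) if covering else None


def delete_transaction(closed, transactions, x):
    """Remove one copy of x and return the updated {closed itemset: support}."""
    x = frozenset(x)
    transactions.remove(x)
    updated = {}
    for itemset, old_support in closed.items():
        if not itemset <= x:
            updated[itemset] = old_support  # not a subset of x: unaffected
        elif closure(itemset, transactions) == itemset:
            updated[itemset] = support(itemset, transactions)
    return updated


window = [frozenset("CD"), frozenset("AB"), frozenset("ABC"), frozenset("ABC")]
closed = {
    frozenset("C"): 3,
    frozenset("CD"): 1,
    frozenset("AB"): 3,
    frozenset("ABC"): 2,
}
after_delete = delete_transaction(closed, window, "CD")
```

Deleting transaction 1 removes CD, which no longer occurs, and also C: every remaining transaction that contains C is {A, B, C}, so C is no longer closed.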
15
Experiment
  • Synthetic datasets
  • T10.I6.D100K and T5.I4.D100K

16
Experiment

17
Experiment