Mining Frequent Patterns Without Candidate Generation - PowerPoint PPT Presentation

Provided by: Jiaw166

1
??? ??????????? ????
  • ????????
  • ???????????
  • ??ARMA?????????
  • ???????????????????
  • ???????????
  • ??????????
  • AprioriAll ??
  • AprioriSome ??
  • GSP??

2
????????
  • ????(Time Series)?????????????????,????????? ?
  • ???,??????????????????????????????????????????????
    ???????????????,??????????????????????????????????
    ???????
  • ???????????????????,???????????,???????????

3
????????
  • ????????,????????????????????????,????????????????
  • ??????????????????????,???????,?????????????????
  • ???,??????????????????????????????????????????????
    ????????,?????????????,????????????????????
  • ????????,?????????????????X(t)????,??????t1, t2, …, tn
    (t????,?t1 < t2 < … < tn)??????????Xt1, Xt2, …, Xtn??????
    ??????X(t)???????,Xti (i = 1, 2, …, n)????????,????????
    ??

4
????????
  • ?????????????????????,??????????????????????
    ??????????
  • ??????????????????,??????????????????????
  • ??????????????????????????,???????????????????????
    ?????????????????
  • ?????????????????????????????????,???????????????
  • ?????????????????????????????????,???????????????
  • ????????????????????????????,????????????????????
    ????(????)???,???????????????

6
???????????
  • ????????????????,????????????????????,???????
    ??????????????????????,?????????????????????
  • ???????????
  • ??????????
  • ????

7
???????????(?)
  • ???????????
  • ???????????????,??????????????,?????????????????
    ????,?????????????,???????????????????????
  • ???????????????????????????????,????????????????
    ????????????????
  • ?????????????????????????????????????
  • ??????????(???)????????(??????????)?
  • ???????????????
  • ?Tt??????,St ?????????,Ct ?????????,Rt???????,yt
    ???????????????????????????????
  • Additive model: yt = Tt + St + Ct + Rt.
  • Multiplicative model: yt = Tt·St·Ct·Rt.
  • Mixed model: yt = Tt·St + Rt, or yt = St + Tt·Ct·Rt.

8
???????????(?)
  • ??????????
  • ????????,???????????,????????
  • ?????????,??????(Auto Regressive,??AR)?????????(Moving
    Average,??MA)????????(Auto Regressive Moving
    Average,??ARMA)?????????
  • ????
  • ??????????????,???????????????????????????,????
    ????????????????????????????????,?????????????????
    ??????,?????????????,??????????

10
??ARMA?????????
  • ARMA??(??????AR??)?????????????????????????1927?,
    G. U. Yule????AR??,??,AR???????ARMA?????ARMA???ARMA
    ????????????ARMA???????????,??????????????????????
    ,????????????????
  • 1.ARMA??
  • ??????????????
    ,?X?t??????????n?????
    ??,?????m??????
    ??(n,m1,2,),???????????,???????ARMA(n,m)??
  • ?? ?

11
??ARMA?????????(?)
  • 2.AR??
  • AR(n)???ARMA(n,m)???????????ARMA(n,m)?????,?
    ?,?
  • ?? ????????????????,????n?????
    ?,??AR(n)?
  • 3 . MA??
  • MA(m)???ARMA(n,m)????????????ARMA(n,m)?????,?
    ?,?
  • ?? ?????????????,????m?????(
    Moving Average)??,??MA(m)?

12
??AR??
??AR????????????????????? ??AR(n)??,?
,??
, ????????????? ,
, ,
? ?????????? , ?? ???
???????,???? ???????? ?
13
??????
  • ???????,??????????
    ????? ,????????????????????Yi????? ?
  • ? ??n???,?????n??????,??????????????
    n???Rn?????????,????????????????????
  • 1.Euclide
  • 2.????????
  • ?? ???????????,N??????????
  • 3.Mahalanobis????
  • ?? ????????????
  • 4.Mann????
  • ??, ???????????, ?????????

15
???????????????????
  • ??????,??????????????????????
  • ??????
  • Len(X)????X???
  • First(X)????X??????
  • Last(X)????X???????
  • ??X?i?????,
  • ????????<??,???X?,??i < j ,??Xi < Xj
  • ??? ??X????,????X?k????,???????????? ?
  • ?????lt??, ?X????,??
  • ,?? ?
  • ?????(Overlap),??X S1,XS2?X??????,??
    ?
  • ??,?XS1?XS2???

16
???????????????????
  • ???,??????????
  • ????(Whole Matching)???N???
  • ???????X,??????????,????
    ,????? ?????
  • ?????(Subsequence Matching)???N??????????
    ???????X??????????????
    ????????,???????X???????????

17
????
  • ???????????????????????????????,????????,?????
    ?????
  • 1.????
  • ????????
    ,?X?????????,?? ,
  • ??,X?xt??????,? ? ??????,
    , ???????
  • 2.????
  • ??Parseval???,?????????????????,??

18
????(?)
  • ??Parseval???,?????????
  • ????????,?????????????????,???????????????????????
    ??????? ???,?
  • ??,
  • ???????????????????????
  • 3.????
  • ???????????????????????????,?????????????????,????
    ???
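
The text on these two slides is only partly legible, but it cites Parseval's theorem. A common use of that theorem in time-series similarity search is that a unitary discrete Fourier transform preserves Euclidean distance, so comparing only the first few coefficients never overestimates the true distance. The sketch below illustrates that property under this assumption; the series `X` and `Y` and the helper names are hypothetical and do not come from the slides.

```python
import cmath
import math

def dft(x):
    """Unitary DFT (scaled by 1/sqrt(n)) so that Parseval's theorem holds exactly."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n)) / math.sqrt(n)
        for f in range(n)
    ]

def euclidean(a, b):
    """Euclidean distance; works for real and complex sequences."""
    return math.sqrt(sum(abs(p - q) ** 2 for p, q in zip(a, b)))

# Hypothetical example series (not taken from the slides).
X = [1.0, 2.0, 4.0, 3.0, 5.0, 4.0, 6.0, 7.0]
Y = [1.5, 2.5, 3.0, 3.5, 4.0, 5.0, 6.5, 6.0]

d_time = euclidean(X, Y)                      # distance in the time domain
d_freq = euclidean(dft(X), dft(Y))            # same distance in the frequency domain
d_first3 = euclidean(dft(X)[:3], dft(Y)[:3])  # lower bound from 3 coefficients
```

Because the truncated distance is a lower bound, filtering on it can produce false alarms but no false dismissals; surviving candidates still need to be checked against the full series.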

20
???????????

?6-3 ??X?Y
?6-4 ??Gap????X?Y

?6-5 ????????X?Y
?6-6 ????????X?Y
21
????
  • ??6-1 ???? ??????????????
    ?Y ?????????????
    ????3???,????
  • ? ??-similar
  • (1)????
    ???
  • (2)?????????????????????
  • ????????????,?????????????????,??????????
  • (3)???,????
  • ???????????X?Y?????????????????????????,????
    ?X?Y??-similar???????????,???????????????
  • ???????X?Y???????,???????????????

?
22
???????????
  • Agrawal?X?Y???????????????????????????????
  • 1.??????
  • ?????????????????????,??????(Atomic
    Matching)???????????????????????(???520),????????
    ???????,???????????????????
  • ????????????????????????,?????????????????
  • ?? ??????i????, ? ???????????????????
    ????????????????????(-1,1)??????????????????

23
???????????(?)
  • 2.????
  • ????(Window Stitching)??????,?????????????????????
    ??????????
  • ???X?Y?m????????,??
  • ??????????????????
  • (1)?????i?? ??
  • (2)????j > i,
  • (3)???i > 1,?? ?? ??,? ?
    ???Gap?????,??Y????????? ? ??,?????d, ?
    ??????????d?
  • (4)X???????????????????????,Y?????????????????????
    ????

24
???????????(?)
  • 3.?????
  • ?????????????????,?????????(Subsequence
    Ordering),???????????????
  • ????????????????????????????????????
  • ????????????????????,???????????????????,?????????
    ????,??????????????

26
????
  • ????????????,?????????????????????????????????????
    ??,????????????????????????,??????????????????????
    ???????????,????????????????
  • ??????????Agrawal?????,???????????????????????????
    ????????????????????????
  • ????????????????????????,???????????????,?DNA?????
    ???????Web???????????????????????

27
????????
  • ??6-3 ????(Sequence)???????,??aa1?a2???an,????ai?
    ????(Itemset)????????(Length)???????????k???????k-
    ???
  • ??6-4 ???aa1?a2???an,??ßß1?ß2???ßm
    ??????i1 < i2 < … < in,??

  • ,
  • ????a???ß????,???ß????a???????,?????a?????????
    ?,??a????????(Maximal sequence)?
  • ??6-5 ????S,?????DT,??S????(Support)??S?DT????????
    ????????S????????????????????(min-sup)?k-??,??DT??
    ??k-???

28
??????????
?6-1?????????????
??????????????????????(Customer-id)?????(Transaction-Time)
??????????(Item)?????????6-1??????????????
????????????????,?????????????????????,??????????
???????????????6-2????6-1???????????
Customer ID (Cust_id)   Transaction time (Tran_time)   Items (Item)
1                       June 25 '99                    30
1                       June 30 '99                    90
2                       June 10 '99                    10, 20
2                       June 15 '99                    30
2                       June 20 '99                    40, 60, 70
3                       June 25 '99                    30, 50, 70
4                       June 25 '99                    30
4                       June 30 '99                    40, 70
4                       July 25 '99                    90
5                       June 12 '99                    90
?6-2???????
Customer ID (Cust_id)   Customer sequence
1                       <(30)(90)>
2                       <(10,20)(30)(40,60,70)>
3                       <(30,50,70)>
4                       <(30)(40,70)(90)>
5                       <(90)>
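
The step from the transaction table to the customer-sequence table can be sketched as follows: sort by customer ID and transaction time, then collect each customer's itemsets in order. This is a minimal illustration using the rows of the table above; the names `TRANSACTIONS` and `build_customer_sequences` are hypothetical, and dates are encoded as `(month, day)` pairs on the assumption that all rows fall in the same year.

```python
from collections import defaultdict

# Rows of the transaction table: (customer id, (month, day), items).
TRANSACTIONS = [
    (1, (6, 25), (30,)),
    (1, (6, 30), (90,)),
    (2, (6, 10), (10, 20)),
    (2, (6, 15), (30,)),
    (2, (6, 20), (40, 60, 70)),
    (3, (6, 25), (30, 50, 70)),
    (4, (6, 25), (30,)),
    (4, (6, 30), (40, 70)),
    (4, (7, 25), (90,)),
    (5, (6, 12), (90,)),
]

def build_customer_sequences(transactions):
    """Sort by (customer id, time) and group the itemsets per customer."""
    by_customer = defaultdict(list)
    for cust_id, _when, items in sorted(transactions, key=lambda t: (t[0], t[1])):
        by_customer[cust_id].append(tuple(sorted(items)))
    return dict(by_customer)

CUSTOMER_SEQUENCES = build_customer_sequences(TRANSACTIONS)
```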
29
??????????(?)
?????????????????????????????????????????????????
??????????????????????????????????????????????????
???6-3??????????????,????????????????????????????
?6-3??????????
Pro_id:    744, 744, 1069, 9, 1069, 744, 1069, 9, -1
Call_time: 04011030, 04011031, 04011032, 04011034, 04011035, 04011038, 04011039, 04011040
Call_id:   23, 14, 4, 24, 5, 81, 62, 16
?6-2???????
?6-4???????????
Pro_id   Call_sequence
744      <(23,14,81)>
1069     <(14,24,16)>
9        <(4,5,62)>
30
???????????
  • ??????????????????????????????????????????????????
    ?????????????
  • 1. ????
  • ????????(Sort),????????????????????(??????????????
    ???????)???,??????????,??????(Cust_id)?????(trans-
    time)????,???????????????????????????????
  • 2. ?????
  • ??????????????(????)?????L????,????????1-???????,?
    ltlgt l ?L?
  • ????6-2???????????,??????2,???????(30),(40),(70),(
    40,70)?(90)??????,??????????????????,????????????6
    -6????????,???????????????????

Large Itemsets   Mapped To
(30)             1
(40)             2
(70)             3
(40,70)          4
(90)             5
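
The large itemsets in the mapping table can be reproduced from the customer sequences. The sketch below counts, for each itemset, the number of customers with at least one transaction containing it. It assumes a minimum support of 2 customers, which is what the partly legible text appears to state. The function name `large_itemsets` is hypothetical.

```python
from itertools import combinations

# Customer sequences: one tuple of items per transaction.
CUSTOMER_SEQUENCES = {
    1: [(30,), (90,)],
    2: [(10, 20), (30,), (40, 60, 70)],
    3: [(30, 50, 70)],
    4: [(30,), (40, 70), (90,)],
    5: [(90,)],
}

def large_itemsets(sequences, min_support):
    """Return {itemset: customer support} for itemsets meeting min_support."""
    support = {}
    for transactions in sequences.values():
        seen = set()  # each customer counts at most once per itemset
        for items in transactions:
            for size in range(1, len(items) + 1):
                seen.update(combinations(sorted(items), size))
        for itemset in seen:
            support[itemset] = support.get(itemset, 0) + 1
    return {s: n for s, n in support.items() if n >= min_support}

LARGE = large_itemsets(CUSTOMER_SEQUENCES, min_support=2)  # assumed threshold

# Integer codes copied from the mapping table.
MAPPING = {(30,): 1, (40,): 2, (70,): 3, (40, 70): 4, (90,): 5}
```

Enumerating every subset of a transaction is only practical for small transactions like these; a real implementation would use a level-wise Apriori pass instead.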
31
???????????(?)
  • 3. ????
  • ???????????,?????????????????????????????????
  • ?6-7????6-2???????????????,??ID??2????????????
    ,??(10,20)????,???????????????(40,60,70)????????
    (40),(70),(40,70)???
  • 4. ????
  • ????????????????,????(Large Sequence)?
  • 5. ?????
  • ????????????(Maximal Sequences)?
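
The transformation phase can be sketched as follows: each transaction is replaced by the set of large-itemset codes it contains, transactions with no large itemset are dropped, and so are customers left with no transactions. The data is taken from the customer-sequence and mapping tables; the function name `transform` is hypothetical.

```python
# Integer codes for the large itemsets, as in the mapping table.
MAPPING = {(30,): 1, (40,): 2, (70,): 3, (40, 70): 4, (90,): 5}

CUSTOMER_SEQUENCES = {
    1: [(30,), (90,)],
    2: [(10, 20), (30,), (40, 60, 70)],
    3: [(30, 50, 70)],
    4: [(30,), (40, 70), (90,)],
    5: [(90,)],
}

def transform(sequences, mapping):
    """Replace each transaction by the set of codes of the large itemsets it contains."""
    transformed = {}
    for cust_id, transactions in sequences.items():
        new_seq = []
        for items in transactions:
            codes = {code for itemset, code in mapping.items()
                     if set(itemset) <= set(items)}
            if codes:  # drop transactions that contain no large itemset
                new_seq.append(codes)
        if new_seq:    # drop customers whose sequence became empty
            transformed[cust_id] = new_seq
    return transformed

TRANSFORMED = transform(CUSTOMER_SEQUENCES, MAPPING)
```

For customer 2, the transaction (10,20) disappears and (40,60,70) becomes the code set {2, 3, 4}, matching the description above.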

33
AprioriAll??
  • AprioriAll?????????Apriori,??Apriori?????????????,
    ???????????????
  • ????????????????????????,??????????????????????
  • ???????,????????????????1-??????
  • ??????,???????????????,???????,?????????????
  • ???????,????????????1-?????????

34
AprioriAll??
1. AprioriAll???? ??6-1 AprioriAll?? ???????????
?????DT ????????
(1) L1 = {large 1-sequences};  // ??????????
(2) FOR (k = 2; Lk-1 ≠ ∅; k++) DO BEGIN
(3)   Ck = aprioriALL_generate(Lk-1);  // Ck??Lk-1?????????
(4)   FOR each customer-sequence c in DT DO  //???????????????c
(5)     Sum the count of all candidates in Ck that are contained in c;  //????c?Ck?????????
(6)   Lk = Candidates in Ck with minimum support;  // LkCk????????????
(7) END
(8) Answer = Maximal Sequences in ∪k Lk;
????????????Apriori??????????,
??????????????????????????????,??????????
?6-2???????
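
Steps (1) to (7) of the AprioriAll listing can be sketched in Python on a transformed database in which each transaction is a set of large-itemset codes. This is an illustrative sketch, not the slides' implementation. The database literal is derived from the five-customer example with codes 1 to 5, the minimum support of 2 is an assumption, and the final maximal-sequence step (8) is left out. The join here also lets a sequence pair with itself, so candidates with repeated codes such as <1,1> are considered.

```python
# Transformed database: each customer sequence is a list of code sets.
TRANSFORMED = {
    1: [{1}, {5}],
    2: [{1}, {2, 3, 4}],
    3: [{1, 3}],
    4: [{1}, {2, 3, 4}, {5}],
    5: [{5}],
}

def contains(customer_seq, candidate):
    """True if the candidate's codes occur in order, each in a later transaction."""
    pos = 0
    for code in candidate:
        while pos < len(customer_seq) and code not in customer_seq[pos]:
            pos += 1
        if pos == len(customer_seq):
            return False
        pos += 1  # the next code must come from a strictly later transaction
    return True

def generate(prev_large):
    """Join sequences that share all but their last code, then prune."""
    joined = {p + (q[-1],) for p in prev_large for q in prev_large
              if p[:-1] == q[:-1]}
    return {c for c in joined
            if all(c[:i] + c[i + 1:] in prev_large for i in range(len(c)))}

def apriori_all(db, min_support):
    """Return {k: set of large k-sequences}."""
    codes = set().union(*(t for seq in db.values() for t in seq))
    candidates = {(code,) for code in codes}
    large_by_length = {}
    k = 1
    while candidates:
        large = {c for c in candidates
                 if sum(contains(seq, c) for seq in db.values()) >= min_support}
        if not large:
            break
        large_by_length[k] = large
        candidates = generate(large)
        k += 1
    return large_by_length

RESULT = apriori_all(TRANSFORMED, min_support=2)  # assumed threshold
```

On this data the loop stops after length 2, because every 3-candidate has a 2-subsequence that is not large.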
35
AprioriAll????
?? 6-1 ??????????3?????????????aprioriALL_generate
?????,????????6-10 (b)????????????L3?????,?????
??6-10 (c)??????????,??<1,2,4,3>,??<2,4,3>??L3?3??
?,??<1,2,4,3>??????????????????????,???????????4??
?????????4???,??<1,2,4>?<1,3,5>???,???WHERE???????
????
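
The join-and-prune behaviour in this example can be sketched as below. The set `L3` is an assumption chosen to be consistent with the sequences named in the text (<2,4,3> is not in it, and <1,2,4> and <1,3,5> do not share a prefix); the figure that listed the actual L3 did not survive. The function name `apriori_generate` follows the identifier used in the slides, but the body is only an illustration.

```python
def apriori_generate(prev_large):
    """Candidate k-sequences from large (k-1)-sequences: join, then prune."""
    prev = set(prev_large)
    # Join: p and q must agree on their first k-2 elements.
    joined = {p + (q[-1],) for p in prev for q in prev if p[:-1] == q[:-1]}
    # Prune: every (k-1)-subsequence of a candidate must itself be large.
    return {c for c in joined
            if all(c[:i] + c[i + 1:] in prev for i in range(len(c)))}

# Assumed large 3-sequences, consistent with the text of this example.
L3 = {(1, 2, 3), (1, 2, 4), (1, 3, 4), (1, 3, 5), (2, 3, 4)}
C4 = apriori_generate(L3)
```

Here <1,2,4,3> is produced by the join but removed by the prune because <2,4,3> is not in `L3`, and <1,2,4> is never joined with <1,3,5> because their prefixes differ.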
36
AprioriAll????
??6-2????????????????6-11(a)??,????????????????,??
??????????????????????40(???????????)????????????
?1-??,????AprioriAll??????????????????6-11????????

37
AprioriAll????

38
AprioriAll????
??,AprioriAll?????????-k???,?L1?L2?L3?L4,?
??-k?????Maximal Sequences in ?kLk??,?????????????
??????? AprioriAll???Apriori?????,??????????
????????????????????????????6-8???????????,???????
???????????????6-11?,??L2??C3???,??<2,3,4>?<2,4,3>
????????,?????<2,4,3>????????????L3????????
Sequences      Support
<1,2,3,4>      2
<1,3,5>        2
<4,5>          2
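
The maximal phase that yields a result like the table above can be sketched as a filter that drops every large sequence contained in another one. The input set below is hypothetical: it holds the three sequences from the table plus a few of their subsequences. Elements are treated as atomic codes, so inclusion between itemsets is ignored in this sketch.

```python
def is_subsequence(small, big):
    """True if `small` can be obtained from `big` by deleting elements."""
    remaining = iter(big)
    return all(x in remaining for x in small)  # `in` consumes the iterator

def maximal_sequences(sequences):
    """Keep only sequences not contained in any other sequence of the set."""
    seqs = set(sequences)
    return {s for s in seqs
            if not any(s != t and is_subsequence(s, t) for t in seqs)}

# Hypothetical union of large sequences of all lengths.
LARGE_SEQUENCES = {
    (4,), (5,), (1, 3), (1, 5), (3, 5), (4, 5),
    (1, 2, 3), (1, 3, 5), (2, 3, 4), (1, 2, 3, 4),
}
MAXIMAL = maximal_sequences(LARGE_SEQUENCES)
```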
40
AprioriSome??
  • AprioriSome???????AprioriAll?????,??????????
  • ??????????????????????
  • ??????????????????????

??6-3 AprioriSome?? ????????????????DT ????????
// Forward Phase ????
(1)  L1 = {large 1-sequences};  //?????????
(2)  C1 = L1;
(3)  last = 1;  //?????Clast
(4)  FOR (k = 2; Ck-1 ≠ ∅ and Llast ≠ ∅; k++) DO BEGIN
(5)    IF (Lk-1 known) THEN Ck = New candidates generated from Lk-1;  //Ck???Lk-1?????
(6)    ELSE Ck = New candidates generated from Ck-1;  //Ck???Ck-1?????
(7)    IF (k = next(last)) THEN BEGIN
(9)      FOR each customer-sequence c in the database DO  //???????????????c
(10)       Sum the count of all candidates in Ck that are contained in c;  //????c??Ck?????????
(11)     Lk = Candidates in Ck with minimum support;  //Lk?Ck????????????
(12)     last = k;
(13)   END
(14) END
41
AprioriSome??(?)
// Backward Phase ????
(15) FOR (k--; k >= 1; k--) DO
(16)   IF (Lk not found in forward phase) THEN BEGIN  // Lk????????????
(17)     Delete all sequences in Ck contained in some Li, i > k;  //?????Ck????Lk????,i > k
(18)     FOR each customer-sequence c in DT DO  //???DT?????????c
(19)       Sum the count of all candidates in Ck that are contained in c;  //??Ck????c???????????
(20)     Lk = Candidates in Ck with minimum support;  // Lk?Ck????????????
(21)   END
(22)   ELSE Delete all sequences in Lk contained in some Li, i > k;  // Lk??
(23) Answer = ∪k Lk;  //?k?m?Lk???
?????(forward phase)?,??????????????????,??????????1?2?4?6
?????(?????),????3?5?????????????next?????????????
???,?????????????????
??6-4 next(k: integer)
  IF (hitk < 0.666) THEN return k + 1;
  ELSEIF (hitk < 0.75) THEN return k + 2;
  ELSEIF (hitk < 0.80) THEN return k + 3;
  ELSEIF (hitk < 0.85) THEN return k + 4;
  ELSE return k + 5;
hitk is the ratio of the number of large k-sequences to the number of candidate k-sequences, i.e. |Lk|/|Ck|. ????????????????????,??
?????????????????????????????
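
The `next` heuristic can be written as a small Python function. The sketch assumes the operators lost from the listing are `<` and `+`, so a higher hit ratio |Lk|/|Ck| lets the forward phase skip more lengths. The names `next_length` and `hit_ratio` are hypothetical.

```python
def hit_ratio(num_large, num_candidates):
    """hit_k = |Lk| / |Ck| for the last length that was counted."""
    return num_large / num_candidates

def next_length(k, hit):
    """Next sequence length to count in the forward phase, given hit_k."""
    if hit < 0.666:
        return k + 1
    elif hit < 0.75:
        return k + 2
    elif hit < 0.80:
        return k + 3
    elif hit < 0.85:
        return k + 4
    else:
        return k + 5
```

For instance, if 9 of 10 candidate 2-sequences turn out to be large, `next_length(2, hit_ratio(9, 10))` skips ahead to length 7, and the skipped lengths are handled in the backward phase.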
?6-2???????
42
AprioriSome??
?? 6-3 ??????AprioriAll??????6-11(a)????????AprioriSome
?????????????L1(??6-9(b)?L1??)???next(k) = 2k,?
????C2????L2(??6-11(d)??L2??)???????,apriori_generate
???L2?????????C3??6-12(e)???C3????????????C3,??
????L3????apriori_generate???C3???C4,??????,??????
?6-9(i)???C4?????C4??L4(?6-12(i))??,??????C5,?????
???????????????6-12???
43
AprioriSome??

44
AprioriSome??

????????????6-13???
45
AprioriAll?AprioriSome??
  • AprioriAll?AprioriSome??
  • AprioriAll?Lk-1????????Ck,?AprioriSome????Ck-1????
    ???? Ck.,??Ck-1??Lk-1,??AprioriSome??????????
  • ??AprioriSome???????,?????????????,??????????????
  • ??????,AprioriSome???????????????,(??????????)???,
    ????????????????????,???AprioriSome????AprioriAll?
    ??
  • ????????, ???????, ????????????, ?? AprioriSome
    ????

47
GSP??
  • GSP??????????
  • ????????,?????1?????L1,????????
  • ??????i ????Li ????????????????i1???????Ci1????
    ?????,??????????????,?????i1?????Li1,??Li1?????
    ??
  • ??????,????????????????????????
  • ??,?????????????
  • ????????????S1?????????????S2???????????????,????
    S1?S2????,??S2??????????S1??
  • ????????????????????????,????????????????,???????
    ??????
  • ????????????????????
  • ?????????????C,???????DT,??????????d,????C??d?????
    ???????,??????????

48
GSP??
??6-5 GSP?? ????????????????DT? ??????
(1) L1 = {large 1-sequences};  // ??????????
(2) FOR (k = 2; Lk-1 ≠ ∅; k++) DO BEGIN
(3)   Ck = GSPgenerate(Lk-1);
(4)   FOR each customer-sequence c in the database DT DO
(5)     Increment the count of all candidates in Ck that are contained in c;
(6)   Lk = Candidates in Ck with minimum support;
(7) END
(8) Answer = Maximal Sequences in ∪k Lk;
49
GSP????
?? 6-5 ?6-9???????3??????????4???????????
?????,??<(1,2),3>???<2,(3,4)>??,??<(,2),3>?<2,(3
,)>????,???????<(1,2),(3,4)> <(1,2),3>?<2,3,5>??,
??<(1,2),3,5>???????????????3??????,??<(1,2),4>???
?????3?????,??????????<(2),(4,)>??<(2),(4)()>???
? ?6-9 GSP???? ?????<(1,2),3,5>????,????
<1,3,5>???L3?,?<(1,2),(3,4)>????3??????L3????????
Sequential patterns    Candidate 4-sequences   Candidate 4-sequences
with length 3          after join              after pruning
<(1,2),3>              <(1,2),(3,4)>           <(1,2),(3,4)>
<(1,2),4>              <(1,2),3,5>
<1,(3,4)>
<(1,3),5>
<2,(3,4)>
<2,3,5>
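
The join and prune steps behind this table can be sketched in Python. A sequence is a tuple of elements and each element is a sorted tuple of items. Two sequences join when dropping the first item of one gives the same result as dropping the last item of the other. The prune step here deletes one item at a time and requires every result to be a known pattern, which assumes no max-gap constraint. All function names are hypothetical, and the special case of joining 1-sequences is not handled.

```python
def drop_first(seq):
    """Remove the first item of the first element."""
    head = seq[0][1:]
    return ((head,) if head else ()) + seq[1:]

def drop_last(seq):
    """Remove the last item of the last element."""
    tail = seq[-1][:-1]
    return seq[:-1] + ((tail,) if tail else ())

def join(s1, s2):
    """Extend s1 with the last item of s2 (caller checks they are joinable)."""
    item = s2[-1][-1]
    if len(s2[-1]) == 1:                 # item was its own element in s2
        return s1 + ((item,),)
    return s1[:-1] + (tuple(sorted(s1[-1] + (item,))),)

def delete_one_item(seq):
    """Yield every sequence obtained by deleting a single item from seq."""
    for i, element in enumerate(seq):
        for j in range(len(element)):
            rest = element[:j] + element[j + 1:]
            yield seq[:i] + ((rest,) if rest else ()) + seq[i + 1:]

def gsp_generate(prev_large):
    """Return (candidates after join, candidates after pruning)."""
    prev = set(prev_large)
    joined = {join(s1, s2) for s1 in prev for s2 in prev
              if drop_first(s1) == drop_last(s2)}
    pruned = {c for c in joined
              if all(sub in prev for sub in delete_one_item(c))}
    return joined, pruned

# Length-3 sequential patterns from the table above.
L3 = {
    ((1, 2), (3,)), ((1, 2), (4,)), ((1,), (3, 4)),
    ((1, 3), (5,)), ((2,), (3, 4)), ((2,), (3,), (5,)),
}
AFTER_JOIN, AFTER_PRUNE = gsp_generate(L3)
```

The candidate <(1,2),3,5> is discarded because deleting item 2 leaves <1,3,5>, which is not among the length-3 patterns.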
51
  • Thank you !!!