Why Not Store Everything in Main Memory? Why use disks? - PowerPoint PPT Presentation

About This Presentation
Title:

Why Not Store Everything in Main Memory? Why use disks?

Description:

FAUST Analytics: X(X1..Xn) ⊆ R^n, |X| = N. If X is a classified training set with classes C = {C1..CK}, then X = X(X1..Xn, C). d = (d1..dn), p = (p1..pn) ∈ R^n.

Slides: 37
Provided by: William1269
Transcript and Presenter's Notes

Title: Why Not Store Everything in Main Memory? Why use disks?


1
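FAUST Analytics: $X(X_1..X_n) \subseteq \mathbb{R}^n$, $|X| = N$. If
X is a classified training set with classes
$C = \{C_1..C_K\}$, then $X = X(X_1..X_n, C)$. $d = (d_1..d_n)$,
$p = (p_1..p_n) \in \mathbb{R}^n$. Functionals $F: \mathbb{R}^n \to \mathbb{R}$, $F = L, S, R$
(in terms of bit columns (compressed or not), of
mappings from a PTS to a SPTS).
$L_{d,p} \equiv (X-p) \circ d = X \circ d - p \circ d$. Letting $L_d \equiv X \circ d$, $L_{d,p} = L_d - p \circ d$.
$S_p \equiv (X-p) \circ (X-p) = X \circ X + X \circ (-2p) + p \circ p = L_{-2p} + X \circ X + p \circ p$.
$R_{d,p} \equiv S_p - L_{d,p}^2 = X \circ X + L_{-2p} + p \circ p - (L_d)^2 + 2(p \circ d)(X \circ d) - (p \circ d)^2$
$= L_{-2p + 2(p \circ d)d} - (L_d)^2 + X \circ X + p \circ p - (p \circ d)^2$.
$F^{\min}_{d,p,k} \equiv \min(F_{d,p} \,\&\, C_k) = \min F_{d,p,k}$, where $F_{d,p,k} = F_{d,p} \,\&\, C_k$ (the values of $F_{d,p}$ on class $C_k$).
$F^{\max}_{d,p,k} \equiv \max(F_{d,p} \,\&\, C_k) = \max F_{d,p,k}$.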
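As a small check on the three functionals, the following is a minimal sketch in plain Python (no pTrees or bit columns, which is an assumption made purely for illustration). It computes L, S and R directly from their definitions and then reassembles R from the precomputed columns X∘X and L_d, using the expansion S_p - L_{d,p}^2 = X∘X + X∘(-2p) + p∘p - (L_d)^2 + 2(p∘d)L_d - (p∘d)^2. The function names and the toy data are hypothetical, and d is assumed to be a unit vector.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def L(X, d, p):
    """Linear functional L_{d,p} = (x - p) o d for every row x of X."""
    return [dot([xi - pi for xi, pi in zip(x, p)], d) for x in X]

def S(X, p):
    """Spherical functional S_p = (x - p) o (x - p) for every row x of X."""
    diffs = ([xi - pi for xi, pi in zip(x, p)] for x in X)
    return [dot(diff, diff) for diff in diffs]

def R(X, d, p):
    """Radial functional R_{d,p} = S_p - L_{d,p}^2 (d assumed unit length)."""
    return [s - l * l for s, l in zip(S(X, p), L(X, d, p))]

def R_from_precomputed(XoX, Ld, X, d, p):
    """Same R_{d,p}, assembled from the precomputed columns XoX and L_d = X o d."""
    pod, pop = dot(p, d), dot(p, p)
    minus2p = [-2 * pi for pi in p]
    return [xox + dot(x, minus2p) + pop - ld * ld + 2 * pod * ld - pod * pod
            for xox, ld, x in zip(XoX, Ld, X)]

# Hypothetical toy data: three points in R^3.
X = [[1.0, 2.0, 2.0], [0.0, 3.0, -1.0], [4.0, 0.0, 1.0]]
p = [0.5, -1.0, 2.0]
d = [1 / 3, 2 / 3, 2 / 3]            # unit vector
XoX = [dot(x, x) for x in X]         # precomputed once
Ld = [dot(x, d) for x in X]          # precomputed once per d

direct = R(X, d, p)
fast = R_from_precomputed(XoX, Ld, X, d, p)
```

With d of unit length, R is the squared distance from each point to the line through p in direction d, so it should never be negative.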

$FPCC_{d,p,k,j} \equiv$ the $j$th precipitous count change
(left-to-right) of $F_{d,p,k}$. Same notation for
PCIs and PCDs (incr/decr).
GAP (Gap Clusterer): If the DensityThreshold, DT,
isn't reached, cut C mid-gap of $L_{d,p} \,\&\, C$ using the
next (d,p) from dpSet.
PCC (Precipitous Count Change Clusterer): If DT
isn't reached, cut C at the PCCs of $L_{d,p} \,\&\, C$ using the
next (d,p) from dpSet.
Fusion step may be required? Use density,
proximity, or use Pillar pkMeans (next slide).
TKO (Top K Outlier Detector): Use
$rank_{n-1} S_x$ for the TopKOutlier-slider.
LIN (Linear Classifier): $y \in C_k$ iff $y \in LH_k \equiv \{z \mid \min L_{d,p,k} \le L_{d,p,k}(z) \le \max L_{d,p,k}\ \forall (d,p) \in dpSet\}$.
$LH_k$ is a linear hull around $C_k$. dpSet is a set of (d,p)
pairs, e.g., (Diag, DiagStartPt).
LSR (Linear Spherical Radial Classifier): $y \in C_k$
iff $y \in LSRH_k \equiv \{z \mid \min F_{d,p,k} \le F_{d,p,k}(z) \le \max F_{d,p,k}\ \forall (d,p) \in dpSet,\ \forall F = L, S, R\}$.
$X \circ X$ can be pre-computed, one time.
What should we pre-compute besides $X \circ X$?
stats (min/avg/max/std); $X \circ p$ with p = class_Avg/Med;
$X \circ d$; $X \circ x$; $d^2(X,x)$; $Rk_i\, d^2(X,x)$; $L_{d,p}$, $R_{d,p}$.
2
William Perrizo: Attached here are today's
notes. I hope to get some new Research
Assistants in place soon (so far, Maninder Singh)
to do the following: 1. Sign a Treeminer NDA. 2.
Get access to and learn to use the Treeminer
development environment (Eclipse, Java, Git) to
code up all the new algorithms we are looking
at. 3. Test the new algorithms using the Treeminer
real-world datasets. I hope to get these RAs
going soon. Others can join this project
group (e.g., Rajat Singh). The payoff will be
great if this happens to a substantial degree.
It will take a concentrated effort and maybe a
little assistance (help-desk type) from Treeminer?
Mark Silverman <msilverman_at_treeminer.com> Tue
11/25/2014 5:26 PM: FYI, some updated results in
text classification: a clear indication that
FAUST in and of itself is capable of producing
results as accurate as anything out there. This
is a famous dataset (7,500 docs). Plain Jane FAUST
got 80%. Extra boost by: - Eliminating any term
that appears in 2 documents or fewer in the
training set. - Using chi-squared to reduce
attributes a further 20% (e.g., pick the 80% most
important attributes from the training set). A
key advantage we have is that by processing
vertically, we can toss attributes easily before
we expend a lot of CPU on them. If we can toss
them intelligently, we can improve the accuracy
of the results, as well as reduce classification
time! In this case, we eliminated about 70% of
the attributes from the test dataset and achieved
better accuracy than the classifiers referenced
on the Stanford Natural Language Processing site!!
We're exploring other approaches to further
identify the critical attributes. About to turn
this loose on datasets approaching 1TB in size.
William Perrizo Wed 11/26/2014 8:10 AM: Great
news! If the classification setting is such that
every test sample goes in some class (i.e., no
"other" or "no class" samples), then FAUST Oblique
using the midpoint of each gap as the cut point
should be the best approach (i.e., the original
FAUST Oblique). If the dataset has test samples
which do not go in any of the classes (which is
always the case when doing "one class"
classification, for example), then by making two
cuts for every gap, one at the beginning of the
gap and the other at the end of the gap, we
produce a "piecewise linear hull" around each
class and thereby accommodate samples that do not
belong to any class (namely those test samples that
fall in the interior of a gap). That's really the
only difference between the older FAUST Oblique
(cutpoint = gap midpoint) method and the newer
FAUST Oblique Hull method.
Mark Silverman Wed 11/26/2014 9:06 AM: We
are adjusting the midpoint as well, based on
cluster deviation; this gives us an extra 4
percentage points or so of accuracy over the straight
midpoint. The hull is an interesting case. As we
are looking at situations like this, we are
already able to predict which members are poor
matches to a class. I will look more closely at
that; this is a very interesting and very
important case (multiclass even).
William Perrizo Wed 11/26/2014: Yes, we have
also discovered that one has to think about the
quality of the training set. If it is very high
quality (expected to fully encompass all
borderline cases of all classes), then using exact
gap endpoints is probably wise, but if there is
reason to worry about the comprehensiveness of
the training set (e.g., when there are very few
training samples, which is often the case in
medical expert systems where getting a sufficient
number of training samples is difficult and
expensive), then it is probably better to move
the cutpoints toward the midpoint (reflecting the
vagueness of the training set class boundaries).
What does one use to decide how much to move
away from the endpoints? That's not an easy
question to answer. Cluster deviation seems like
a useful measure to employ. One last thought on
how to decide whether to cut at the gap midpoint,
at the endpoints, or to move the cutpoints away from
the endpoints toward the midpoint: if one has a
time-stamp on training samples, one might assess
the "class endpoint" change rate over time. As
the training set gets larger and larger, if an
endpoint stops moving much and isn't an outlier,
then cutting at the endpoint seems wise.
However, if an endpoint is still changing a lot,
then moving away from that endpoint seems wise
(maybe based on the rate of change of that
endpoint as well as other measures?).
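The three cut-point choices discussed in this exchange (gap endpoints, gap midpoint, or somewhere in between) can be sketched on a single projected axis. This is only an illustration under stated assumptions: the class intervals do not overlap along the projection, the parameter t and all function names are hypothetical, and how t should be chosen (cluster deviation, endpoint change rate, ...) is left open exactly as in the discussion. With t = 0 the cuts sit at the class endpoints (hull); with t = 1 both cuts of a gap meet at its midpoint.

```python
def class_intervals(train):
    """train: {label: [projected values]} -> [(min, max, label)] sorted by min."""
    return sorted((min(v), max(v), k) for k, v in train.items())

def move_cuts(intervals, t):
    """Move each interior cut a fraction t of the way from the gap endpoint
    to the gap midpoint. The outermost ends stay at the class endpoints,
    which is a simplification made for this sketch."""
    out = []
    for i, (lo, hi, label) in enumerate(intervals):
        if i > 0:
            lo -= t * (lo - intervals[i - 1][1]) / 2
        if i < len(intervals) - 1:
            hi += t * (intervals[i + 1][0] - hi) / 2
        out.append((lo, hi, label))
    return out

def classify(intervals, y):
    """Return the label of the first interval containing y, else "Other"."""
    for lo, hi, label in intervals:
        if lo <= y <= hi:
            return label
    return "Other"

# Hypothetical projected training values for two classes.
train = {"A": [0.10, 0.20, 0.25], "B": [0.60, 0.70, 0.90]}
hull = class_intervals(train)        # t = 0: cut at gap endpoints
midpoint = move_cuts(hull, 1.0)      # t = 1: cut at the gap midpoint
halfway = move_cuts(hull, 0.5)       # in between
```

A test value of 0.40 falls inside the gap: the hull version reports it as "Other", while the midpoint version assigns it to class A.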
Mark Silverman Wed 11/26/2014: A related
point: dominant attributes may exist in only some
classes; this must be factored in when ascribing
weight/value to an attribute.
3
Graph theory (Wikipedia)
A hyperedge is an edge with any number of vertices. A
simple graph is a special case of the hypergraph
(a 2-uniform hypergraph). Without qualification, an edge
is assumed to consist of at most 2 vertices, and
a graph is never confused with a hypergraph.
A graph is connected if there is a path between any 2
vertices; otherwise, the graph is disconnected. A
cut set (vertex cut, separating set) is a set of
vertices whose removal disconnects the remaining
subgraph. A bridge set is the analogous edge set.
If there is a path between any 2 vertices even after
removing any k-1 vertices, G is k-connected (iff
it has k internally disjoint paths between any 2
vertices). The vertex (edge) connectivity
of a graph G is the minimum number
of vertices (edges) that need to be removed to
disconnect G.
The set of neighbors of v, that is, vertices
adjacent to v not including v itself, is called
the (open) neighborhood of v and denoted $N_G(v)$.
When v is also included, it is called the closed
neighborhood and denoted $N_G[v]$. A graph with
n vertices can be represented by its adjacency
matrix, an n-by-n matrix whose entry in row i,
column j is the number of edges from vertex i to j.
A graph with two disjoint vertex sets such that every
edge must run from one set to the other is
bipartite; with 3 sets, tripartite; with k sets, k-partite
or multipartite. A complete multipartite graph is a
graph in which vertices are adjacent if and only
if they belong to different partite sets. A
complete bipartite graph is also referred to as a
biclique; if its partite sets contain n and m
vertices, respectively, then the graph is denoted
$K_{n,m}$.
Let G = (X, Y, E) be a bipartite graph. A biclique
$(S_x, S_y)$ is a complete bipartite subgraph induced
by the bipartite vertex set $(S_x, S_y)$.
The consensus set of $S_x$ is $P_y(S_x) \equiv \bigcap_{x \in S_x} N_y(x)$,
i.e., the set of all y's that are adjacent (edge
connected) to every x in $S_x$.
Thm1: $(S_x, S_y)$ is a maximal biclique iff $S_y = P_y(S_x)$ and $S_x = P_x(S_y)$.
Find all bicliques starting with the $S_y$ singletons.
Then examine the $S_{y_1 y_2}$ doubletons s.t. $P_x(S_{y_1 y_2}) \ne \emptyset$,
i.e., $N(y_1) \cap N(y_2) \ne \emptyset$.
Then examine the $S_{y_1 y_2 y_3}$ tripletons s.t.
$P_x(S_{y_1 y_2 y_3}) \ne \emptyset$, i.e., $P_x(S_{y_i y_j}) \ne \emptyset$ for all
$i < j$ and $P_x(S_{y_i y_j}) \cap N(y_k) \ne \emptyset$, k not i or j.
Then examine the $S_{y_1 y_2 y_3 y_4}$ quadrupletons s.t.
$P_x(S_{y_1 y_2 y_3 y_4}) \ne \emptyset$, i.e., $P_x(S_{y_i y_j y_k}) \ne \emptyset$ for all $i < j < k$ and
$P_x(S_{y_i y_j y_k}) \cap N(y_h) \ne \emptyset$, h not i or j or k...
Will this find all bicliques or do we need to
also reverse x and y?
Examining the MGRs (x = docs, y = words): each singleton
wordset, $S_y$, forms a nonempty biclique.
AND pairwise to find all nonempty doubleton
wordset bicliques, $S_{y_1 y_2}$.
AND those nonempty doubleton wordsets with each
other singleton wordset to find all nonempty
tripleton wordset bicliques, $S_{y_1 y_2 y_3}$...
Start with singleton docs and include another...
until the empty set. The last nonempty set is a
max-biclique and all subsets are bicliques, so we
can remove all of them and iterate.
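The level-wise search described above can be sketched with ordinary Python sets standing in for the ANDed bit vectors. This is a hedged illustration, not the slides' implementation: the toy doc-word graph, the function names and the exhaustive growth loop (exponential in the worst case) are all assumptions. Each nonempty word set Sy is closed to the pair (Px(Sy), Py(Px(Sy))), which satisfies Thm1 by construction.

```python
# Hypothetical toy bipartite graph: doc -> set of adjacent words.
docs = {
    "d1": {"baby", "cry"},
    "d2": {"baby", "cry", "mother"},
    "d3": {"mother"},
}
words = sorted(set().union(*docs.values()))

def Px(Sy):
    """Docs adjacent to every word in Sy."""
    return frozenset(x for x, ws in docs.items() if Sy <= ws)

def Py(Sx):
    """Consensus set: words adjacent to every doc in the nonempty set Sx."""
    return frozenset(set.intersection(*(docs[x] for x in Sx)))

def is_maximal(Sx, Sy):
    """Thm1: (Sx, Sy) is a maximal biclique iff Sy = Py(Sx) and Sx = Px(Sy)."""
    return Sy == Py(Sx) and Sx == Px(Sy)

def maximal_bicliques():
    """Grow word sets level by level (singletons, doubletons, ...), keeping
    only those with a nonempty doc set, and record the closure of each."""
    found = set()
    level = [frozenset([w]) for w in words]
    while level:
        next_level = set()
        for Sy in level:
            Sx = Px(Sy)
            if not Sx:
                continue
            found.add((Sx, Py(Sx)))
            for w in words:
                if w not in Sy and Px(Sy | {w}):
                    next_level.add(Sy | {w})
        level = list(next_level)
    return found

found = maximal_bicliques()
```

In this sketch, any maximal biclique with both sides nonempty is reached from the word side alone, because its word set and all of that set's subsets have nonempty doc sets and Sx = Px(Sy) is determined by Sy.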
7 13 w4 7 35 42 w7 7 35
w10 7 13 33 45 w13 7 30 43
w24 7 9 23 29 45 w42 7 10 11 12 25 41
w44 7 13 w4 w13 7 35
w7 w10 7 45 w13 w42
1 8 w58 1 14 w21 1 17 w49 1 23 w52 1 28
w52 1 30 w49 1 41 w52 1 46 w49 1 48 w52 1 8
none 1 14 none 1 17 30 46 w49 1 23 28 41 48
w52 1 28 23 41 48 1 30 17 46 1 41 23 28 48 1 46
17 30 1 48 23 28 41
4 8 w25 4 29 w2 4 30 w2 4 35 w25 4 39 w2 4
46 w2 w25 4 50 w25 4 8 35 46 50 w25 4 29 30
39 46 w2
5 10 14 w26 5 11 17 32 36 w38 5 36 41
w34 5 36 w34 w38
6 15 18 32 w22 6 49 w5
2 37 w57 2 46 w45 2 47 w57 2 37 47 w57
3 13 w51 3 29 w8 3 46 w51 3 47 w8 3 13 46
w51 3 29 47 w8
8 26 w16 8 35 46 50 w25
9 26 27 45 w3 9 27 29 45 w42 9 44 w35 9
45 w3 w42
10 11 12 25 41 w44 10 14 w42 1010 44
w32 10 21 w12 w19
11 12 25 41 w44 11 14 17 32 36 w38 11 35 37
w17
13 21 50 w47 13 23 w4 13 33 45 w13 13 21 50
w47 13 21 43 w54 13 46 w51 13 21 w47
w54
17 30 46 w42 17 32 36 w38 17 39 47 w18 17 48
w56
14 17 32 36 w47 14 39 w55
12 25 41 w44 12 25 w59 12 26 w15 12 25
w44 w59
15 18 32 w22 15 35 w50 15 44 w31
16 33 37 w48 16 28 w6
26 27 45 w3 26 28 w60 26 29 w1 26 38
w36 26 39 w30
30 32 41 w27 30 35 w23 30 43
w24 30 46 w2 w49
22 35 w43 22 50 w53
28 35 w28 28 38 46 w20 28 39 50 w9 28 41 48
w52
21 35 w10 21 42 w4 21 43 w54 21 50 w47
23 28 41 48 w52
29 45 w28 29 47 w8 29 30 39 46 w2
27 35 w43 27 50 w53
18 32 w22 18 44 w10
25 41 w44
35 36 w33 35 37 w17 35 38 w40 35 39 w45 35 41
w34 35 42 w7
39 41 w39 39 46 w2 39 47 w18 39 50 w9 w45
45
47 50 w25
41 48 w52
33 37 w48 33 45 w13 33 49 w29
36
32 36 w22 32 41 w27
42
43
44
50
46
48
49
37 47 w41 w57
38 46 w57
So there are 12 non-single-word bicliques. Note
also that the single-word-bicliques are not
necessarily the entire single-word-query-result
either.
4
FAUST Clustering1
L-Gap Clusterer: Cut C mid-gap (of F&C) using the
next (d,p) from dpSet, where F = L, S, R.
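A single pass of the gap clusterer can be sketched as follows. The sketch assumes the functional values have already been computed for each document, cuts at the midpoint of every gap wider than a threshold, and leaves out the density threshold and the recursion with the next (d,p) from dpSet. The function name, the threshold parameter and the toy values are hypothetical.

```python
def gap_cut(values, min_gap):
    """Split {doc: functional value} into clusters by cutting at the
    midpoint of every gap (between consecutive sorted values) wider than
    min_gap. Returns (clusters, cut_points)."""
    ordered = sorted(values.items(), key=lambda kv: kv[1])
    clusters, cuts = [], []
    current = [ordered[0][0]]
    for (_, a), (doc, b) in zip(ordered, ordered[1:]):
        if b - a > min_gap:
            cuts.append((a + b) / 2)   # cut mid-gap
            clusters.append(current)
            current = []
        current.append(doc)
    clusters.append(current)
    return clusters, cuts

# Hypothetical functional values for six documents.
values = {"d1": 0.17, "d2": 0.21, "d3": 0.25, "d4": 0.56, "d5": 0.64, "d6": 1.47}
clusters, cuts = gap_cut(values, min_gap=0.2)
```

Singleton clusters produced this way (here "d6") are the natural candidates to report as outliers.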
$2^{-1}$ separates 7, 50; $2^{-2}$ separates the .27s.
Bit slices of the D = d35 values (bit positions $2^1$, $2^0$, $2^{-1}$, $2^{-2}$):
value   2^1  2^0  2^-1  2^-2
0        0    0    0     0
0.27     0    0    0     1
0.55     0    0    1     0
3.60     1    1    1     0
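The bit patterns in this table (0.27 as 0 0 0 1, 0.55 as 0 0 1 0, 3.60 as 1 1 1 0 at positions 2^1, 2^0, 2^-1, 2^-2) can be reproduced with a short sketch. It assumes non-negative values below 4 and simply peels off one binary weight at a time; the function name is hypothetical.

```python
def bit_slices(value, exponents=(1, 0, -1, -2)):
    """Return the bits of a non-negative value at the given binary positions,
    highest position first, discarding whatever is left below the last one."""
    bits, remainder = [], value
    for e in exponents:
        weight = 2.0 ** e
        bit = 1 if remainder >= weight else 0
        bits.append(bit)
        remainder -= bit * weight
    return bits

patterns = {v: bit_slices(v) for v in (0, 0.27, 0.55, 3.60)}
```

Grouping documents by their high-order bits is then a coarse form of gap cutting: only 3.60 has the 2^1 and 2^0 bits set, and only 0.55 and 3.60 have the 2^-1 bit set.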
D = d35 (value: docs)
0: d26 d1 d27 d3 d44 d16 d6 d17 d47 d18 d10 d43 d12 d33 d14 d23 d49 d25 d45 d2 d29 d13 d9 d32
0.27: d28 d41 d42 d30 d21 d22 d15 d36 d11 d38 d46 d5 d8 d37 d48 d39 d4
0.55: d50 d7
3.60: d35
35, 7, 50 outliers.
D = Σ(.27s) (value: docs)
0: d9 d49 d45
0.09: d6 d3 d33 d18 d44
0.18: d43 d25 d22 d12 d16 d2
0.27: d27 d23 d42 d15 d13 d47
0.36: d26 d29 d36
0.46: d38 d14 d48 d8 d10 d37
0.55: d32 d1 d5
0.64: d21 d4 d11 d17
0.92: d30
1.01: d41 d28
1.10: d39
1.29: d46
28, 30, 39, 41, 46: cluster.
D = Σ(.64s) (value: docs)
0: d26 d33 d3 d27 d45 d2 d44 d23 d9 d15 d49 d16 d38 d6 d18 d22
0.25: d1 d37 d43 d8 d29 d25 d42 d12 d47 d48
0.51: d32 d14 d4 d36 d13 d5
0.77: d10
1.03: d11
1.29: d17
1.54: d21
The 0's, .25s, .51s are clusters. d10, d11, d17, d21 are outliers.
Going back to D = d35, how close does HOB come? $2^1$, $2^0$ separate 35.
D = Σ(44 docs), GT = .08 (value: docs)
0.17: d22 d49
0.21: d42 d2 d16
0.25: d18 d3 d43 d6
0.34: d23 d15 d44 d38 d25 d36
0.38: d33 d48 d8
0.43: d4 d12
0.47: d47 d9 d37
0.51: d5
0.56: d1 d32 d45 d14 d27
0.64: d10 d17 d21 d29 d11
0.69: d26 d50 d13
0.73: d30
0.77: d28
0.82: d41
0.86: d39
0.99: d46
1.16: d7
1.47: d35
C1 (.17 ≤ xod ≤ .25) = {2,3,6,16,18,22,42,43,49}
C2 (.34 ≤ xod ≤ .56) = {1,4,5,8,9,12,14,15,23,25,27,32,33,36,37,38,44,45,47,48}
C3 (.64 ≤ xod ≤ .86) = {10,11,13,17,21,26,28,29,30,39,41,50}
Single: 46 (xod = .99), 7 (1.16), 35 (1.47)
Next, on each Ck try D = ΣCk, Thres = .2
D = sum of all C1 docs (value: docs)
0.42: d16 d2 d3 d42 d43 d22
0.63: d18 d49
0.85: d6
C11 (xod = .42) = {2,3,16,22,42,43}; 6, 18, 49 outliers.
D = sum of all C11 docs (value: docs)
0.57: d2 d3 d16 d22 d42 d43
D = sum of all C2 docs (value: docs)
0.27: d23
0.36: d25 d4 d38
0.45: d15 d33 d12 d36
0.54: d8 d44 d47
0.63: d1 d37 d5 d32 d50
0.72: d27 d45 d9
0.81: d14
D = sum of all C3 docs (value: docs)
0.56: d11
0.66: d17 d29
0.75: d13
0.85: d30 d10
0.94: d28 d26 d41 d50
1.03: d21
1.41: d39
C31 (.56 ≤ xod ≤ 1.03) = {10,11,13,17,21,26,28,29,30,41,50}; 39 outlier.
D = sum of all C31 docs (value: docs)
0.63: d17 d29 d11
0.84: d50 d13 d30
0.95: d26 d28 d10 d41
1.16: d21
C311 (.63) = {11,17,29}; C312 (.84) = {13,30,50}; C313 (.95) = {10,26,28,41}; 21 outlier.
Other clustering methods later.
C11 2. This little pig went to market. This
little pig stayed at home. This little pig had
roast beef. This little pig had none. This little
pig said Wee, wee. I can't find my way home. 3.
Diddle diddle dumpling, my son John. Went to bed
with his breeches on, one stocking off, and one
stocking on. Diddle diddle dumpling, my son
John. 16. Flour of England, fruit of Spain, met
together in a shower of rain. Put in a bag tied
round with a string. If you'll tell me this
riddle, I will give you a ring. 22. Had a little
husband no bigger than my thumb. I put him in a
pint pot, and there I bid him drum. I bought a
little handkerchief to wipe his little nose and a
little garters to tie his little hose. 42. Bat
bat, come under my hat and I will give you a
slice of bacon. And when I bake I will give you a
cake, if I am not mistaken. 43. Hark hark, the
dogs do bark! Beggars are coming to town. Some in
jags and some in rags and some in velvet gowns.
C2 1. Three blind mice! See how they run! They
all ran after the farmer's wife, who cut off
their tails with a carving knife. Did you ever
see such a thing in your life as three blind
mice? 4. Little Miss Muffet sat on a tuffet,
eating of curds and whey. There came a big spider
and sat down beside her and frightened Miss
Muffet away. 5. Humpty Dumpty sat on a wall.
Humpty Dumpty had a great fall. All the King's
horses, and all the King's men cannot put Humpty
Dumpty together again. 8. Jack Sprat could eat no
fat. His wife could eat no lean. And so between
them both they licked the platter clean. 9. Hush
baby. Daddy is near. Mamma is a lady and that is
very clear. 12. There came an old woman from
France who taught grown-up children to dance. But
they were so stiff she sent them home in a sniff.
This sprightly old woman from France. 14. If all
seas were one sea, what a great sea that would
be! And if all the trees were one tree, what a
great tree that would be! And if all the axes
were one axe, what a great axe that would be! And
if all the men were one man what a great man he
would be! And if the great man took the great axe
and cut down the great tree and let it fall into
great sea, what a splish splash it would be! 15.
Great A. little a. This is pancake day. Toss the
ball high. Throw the ball low. Those that come
after may sing heigh ho! 23. How many miles is it
to Babylon? Three score miles and ten. Can I get
there by candle light? Yes, and back again. If
your heels are nimble and light, you may get
there by candle light. 36. Little Tommy
Tittlemouse lived in a little house. He caught
fishes in other mens ditches. 37. Here we go
round mulberry bush, mulberry bush, mulberry
bush. Here we go round mulberry bush, on a cold
and frosty morning. This is way we wash our
hands, wash our hands, wash our hands. This is
way we wash our hands, on a cold and frosty
morning. This is way we wash our clothes, wash
clothes, wash our clothes. This is way we wash
our clothes, on a cold and frosty morning. This
is way we go to school, go to school, go to
school. This is the way we go to school, on a
cold and frosty morning. This is the way we come
out of school, come out of school, come out of
school. This is the way we come out of school, on
a cold and frosty morning. 38. If I had as much
money as I could tell, I never would cry young
lambs to sell. Young lambs to sell, young lambs
to sell. I never would cry young lambs to sell.
44. The hart he loves the high wood. The hare
she loves the hill. The Knight he loves his
bright sword. The Lady loves her will. 47. Cocks
crow in the morn to tell us to rise and he who
lies late will never be wise. For early to bed
and early to rise, is the way to be healthy and
wealthy and wise. 48. One two, buckle my shoe.
Three four, knock at the door. Five six, pick up
sticks. Seven eight, lay them straight. Nine ten.
a good fat hen. Eleven twelve, dig and delve.
Thirteen fourteen, maids a courting. Fifteen
sixteen, maids in the kitchen. Seventeen
eighteen. maids a waiting. Nineteen twenty, my
plate is empty.
C311 11. One misty moisty morning when cloudy
was weather, I met an old man clothed all in
leather. He began to compliment and I began to
grin. How do And how do? And how do again 17.
Here sits the Lord Mayor. Here sit his two men.
Here sits the cock. Here sits the hen. Here sit
the little chickens. Here they run in. Chin
chopper, chin chopper, chin chopper, chin! 29.
When little Fred went to bed, he always said his
prayers. He kissed his mamma and then his papa,
and straight away went upstairs.
C312 13. A robin and a robins son once went to
town to buy a bun. They could not decide on plum
or plain. And so they went back home again. 30.
Hey diddle diddle! The cat and the fiddle. The
cow jumped over the moon. The little dog laughed
to see such sport, and the dish ran away with the
spoon. 50. Little Jack Horner sat in the corner,
eating of Christmas pie. He put in his thumb and
pulled out a plum and said What a good boy am I!
C313 10. Jack and Jill went up the hill to fetch
a pail of water. Jack fell down, and broke his
crown and Jill came tumbling after. When up Jack
got and off did trot as fast as he could caper,
to old Dame Dob who patched his nob with vinegar
and brown paper. 26. Sleep baby sleep. Our
cottage valley is deep. The little lamb is on the
green with woolly fleece so soft and clean. Sleep
baby sleep. Sleep baby sleep, down where the
woodbines creep. Be always like the lamb so mild,
a kind and sweet and gentle child. Sleep baby
sleep. 28. Baa baa black sheep, have you any
wool? Yes sir yes sir, three bags full. One for
my master and one for my dame, but none for the
little boy who cries in the lane. 41. Old King
Cole was a merry old soul. And a merry old soul
was he. He called for his pipe and he called for
his bowl and he called for his fiddlers three.
And every fiddler, he had a fine fiddle and a
very fine fiddle had he. There is none so rare as
can compare with King Cole and his fiddlers three.
5
FAUST Cluster 1.2
real HOB Alternate WS0, DS0
OUTLIER 38. If I had as much money as I could
tell, I never would cry young lambs to sell.
Young lambs to sell, young lambs to sell. I never
would cry young lambs to sell.
Notes: Using HOB, the final WordSet is the
document cluster theme! When the theme is too
long to be meaningful (C4), we can recurse on
those (using the opposite, DS0 or WS0?). The other
thing we can note is that DS0 almost always gave
us outliers (except for C5) and WS0
almost always gave us clusters (except for the
first one, 46). What happens if we reverse it?
What happens if we just use WS0?
6
real HOB Alternate WS0, DS0 recursing on C3 and C4
FAUST Cluster 1.2.1
Doc43 and
doc44 have none of the 6 words in common, so
these two will come out as outliers on the next
recursion! OUTLIERS 43. Hark hark, the dogs do
bark! Beggars are coming to town. Some in jags
and some in rags and some in velvet gowns. 44.
The hart he loves the high wood. The hare she
loves the hill. The Knight he loves his bright
sword. The Lady loves her will.
recurse on C3
7
HOB Alternate WS0, DS0
FAUST Cluster 1.2.2
16 OUTLIERS 2 3 6 10 13 16 22 26 30 35 38 39
42 43 44 46
Categorize clusters (hub-spoke, cyclic, chain,
disjoint...)? Separate disjoint sub-clusters?
Each of the 3 C423 words gives a disjoint
cluster! Each of the 2 C32 words gives a
disjoint sub-cluster also.
C4231 day 15. Great A. little a. This is
pancake day. Toss ball high. Throw ball low.
Those come after sing heigh ho! 18. I had 2
pigeons bright and gay. They flew from me other
day. What was reason they go? I can not tell, I
do not know.
C4232 eat 4. Little Miss Muffet sat on
tuffet, eat curd, whey. Came big spider, sat down
beside her, frightened away 8. Jack Sprat could
eat no fat. Wife could eat no lean. Between them
both they licked platter clean.
C4233 girl 33. Buttons, farthing pair! Come
who will buy them? They are round, sound,
pretty, fit for girls of city. Come, who will
buy ? Buttons, farthing a pair 49. There was
little girl had little curl right in the middle
of her forehead. When she was good she was very
good and when she was bad she was horrid.
C1 mother 7. Old Mother Hubbard went to
cupboard to give her poor dog a bone. When she
got there cupboard was bare, so poor dog had
none. She went to baker to buy some bread. When
she came back dog was dead. 9. Hush baby. Daddy
is near. Mamma is a lady and that is very
clear. 27. Cry baby cry. Put your finger in your
eye and tell your mother it was not I. 29. When
little Fred went to bed, he always said his
prayers. He kissed his mamma and then his papa,
and straight away went upstairs. 45. Bye baby
bunting. Father has gone hunting. Mother has gone
milking. Sister has gone silking. And brother has
gone to buy a skin to wrap the baby bunting in.
C2 fiddle old men cyclic 11. 1 misty
moisty morning when cloudy was weather, Chanced
to meet old man clothed all leather. He began to
compliment,I began to grin. How do you do How do?
How do again 32. Jack come give me your fiddle,
if ever you mean to thrive. No I'll not give
fiddle to any man alive. If I'd give my fiddle
they will think I've gone mad. For many joyous
day fiddle and I've had 41. Old King Cole was
merry old soul. Merry old soul was he. He called
for his pipe, he called for his bowl, he called
for his fiddlers 3. And every fiddler, had a fine
fiddle, a very fine fiddle had he. There is none
so rare as can compare with King Cole and his
fiddlers three.
C11 cut men run cyclic 1. Three
blind mice! See how run! All ran after farmer's
wife, cut off tails with carving knife. Ever see
such thing in life as 3 blind mice? 14. If all
seas were 1 sea, what a great sea that would be!
And if all trees were 1 tree, what a great tree
that would be! And if all axes were 1 axe, what a
great axe that would be! if all men were 1 man
what a great man he would be! And if great man
took great axe and cut down great tree and let it
fall into great sea, what a splish splash that
would be! 17. Here sits Lord Mayor. Here sit his
2 men. Here sits the cock. Here sits hen. Here
sit the little chickens. Here they run in. Chin
chopper, chin chopper, chin chopper, chin!
C321 men 5. Humpty Dumpty sat on wall. Humpty
Dumpty had great fall. All Kings horses, all
Kings men can't put Humpty together again. 36.
Little Tommy Tittlemouse lived in little house.
He caught fishes in other mens ditches.
C322 three 23. How many miles to Babylon? 3
score 10. Can I get there by candle light? Yes,
back again. If your heels are nimble, light, you
may get there by candle light. 28. Baa baa black
sheep, have any wool? Yes sir yes sir, 3 bags
full. One for my master and one for my dame, but
none for the little boy who cries in the
lane. 48. One two, buckle my shoe. Three four,
knock at the door. Five six, pick up sticks.
Seven eight, lay them straight. Nine ten. a good
fat hen. Eleven twelve, dig and delve. Thirteen
fourteen, maids a courting. Fifteen sixteen,
maids in the kitchen. Seventeen eighteen. maids a
waiting. Nineteen twenty, my plate is empty.
C4.1 morn way 37. Here we go round mulberry
bush, mulberry bush, mulberry bush. Here we go
round mulberry bush, on cold and frosty morn.
This is way wash our hands, wash our hands, wash
our hands. This is way wash our hands, on a cold
and frosty morning. This is way we wash our
clothes, wash our clothes, wash our clothes. This
is way we wash our clothes, on a cold and frosty
morning. This is way we go to school, go to
school, go to school. This is the way we go to
school, on a cold and frosty morning. This is the
way we come out of school, come out of school,
come out of school. This is the way we come out
of school, on a cold and frosty morning. 47.
Cocks crow in the morn to tell us to rise and he
who lies late will never be wise. For early to
bed and early to rise, is the way to be healthy
and wealthy and wise.
C421 plum 21. Lion Unicorn were fighting
for crown. Lion beat Unicorn all around town.
Some gave them white bread and some gave them
brown. Some gave them plum cake sent them out of
town. 50. Little Jack Horner sat in corner,
eating of Christmas pie. He put in his thumb and
pulled out a plum and said What a good boy am I!
C422 old woman 12. There came an old woman
from France who taught grown-up children to
dance. But they were so stiff she sent them home
in a sniff. This sprightly old woman from
France. 25. There was old woman. What do you
think? She lived upon nothing but victuals, and
drink. Victuals and drink were the chief of her
diet, and yet this old woman could never be quiet.
Let's pause and ask "What are we after?" Of
course it depends upon the client. 3 main
categories for relationship mining: text
corpuses, market baskets (includes recommenders),
bioinformatics? Others? What do we want from
text mining? (anomaly detection, cliques,
bicliques?) What do we want from market basket
mining? (future purchase predictions,
recommendations...) What do we want in
bioinformatics? (cliques, strong clusters,
...???)
8
FAUST Cluster 1.2.3
word-labeled document graph
We have captured only a few of the salient
sub-graphs. Can we capture more of them? Of
course we can capture a sub-graph for each word,
but that might be 100,000. Let's stare at what
we got and try to see what we might wish we had
gotten in addition.
A bake-bread sub-corpus would have been
strong. (docs7 21 35 42)
There are many others.
Using AVG1 (columns are word numbers):
       2   9  10  25  45  47
d21    0   0   1   0   0   1
d35    0   0   1   1   1   0
d39    1   1   0   0   1   0
d46    1   0   0   1   0   0
d50    0   1   0   1   1   1
9
HOB2 Alt (use other HOBs)
FAUST Cluster 1.2.4
wAvg1, dAvg1
      away  boy  bread  eat  pie  plum
         2    9     10   25   45    47
d21      0    0      1    0    0     1
d35      0    0      1    1    1     0
d39      1    1      0    0    1     0
d46      1    0      0    1    0     0
d50      0    1      0    1    1     1
recurse wAv2, dAvg-1
      away  boy  bread  eat  pie
         2    9     10   25   45
d35      0    0      1    1    1
d39      1    1      0    0    1
d46      1    0      0    1    0
d50      0    1      0    1    1
And if we want to pull out a particular word
cluster, just turn the word-pTree into a list.
w = baby (columns: away = 2, baby = 3): d9 0 1; d26 1 1; d27 0 1; d45 1
w = boy (columns: away = 2, boy = 9): d28 0 1; d39 1 1; d50 0 1
For a particular doc cluster, just turn the
doc-pTree into a list
10
FAUST HULL Classification 1
Using the clustering of FAUST Clustering1 as
classes, we extract 80% from each class as the
TrainingSet (w/ class = cluster). How accurate is
FAUST Hull Classification on the remaining 20%
plus the outliers (which should be "Other")?
Use Lpd, Sp, Rpd with p = ClassAvg and d = unitized
ClassSum.
All 6 class hulls are separated using Lpd, p = CLavg,
D = CLsum. D311 separates C311, D312 separates
C312 and D313 separates C313 from all others. D2
separates C11 and C2. Now, remove some false
positives with S and R using the same p's and d's.
D11?C11 pavC11 Sp 1.6C11 3.4 4 4C311
5.4 6C313 2.4
4.4C2 5C312
D2?C2 pavC2 Sp 2 2.3C11 4.5
5.8C313 1.8 3.5C2
5 5.1C312 2.5 3.5C311
D313?C313 pavC313 Sp 3.5 4.2C11
6.5C312 2.8
6.2C2 3.8 6.2C311 2.5
3.5C313
D311?C311 pavC311 Sp 1.2C311 4.2C11
6.2 7.2C312 2.2 6.2C2
6.2 8.2C313
D312?C312 pavC312 Sp 3.5 4.5C11
6.5 7.5C313 4.5
6.5C2 2.5C312 5.5C311
Sp removes a lot of the potential for false
positives. (Many of the classes lie a single
distance from p.)
D11?C11 pavC11 Rpd 1.2C11 1.4
2C2 1.7 2C311
2.2 2.C312 2.2 2.4C313
D1?TS pavTS Rpd 1.3 1.4C11 1.3
1.9C2 1.5 1.8C311
2.1C312 2.0
2.2C313
D2?C2 pavC2 Rpd 1.3 1.4C11 1.3
1.8C2 1.6 1.8C311
2.2C312 2.1 2.4C313
D312?C312 pavC312 Rpd 1.3 1.4C11 1.4
2C2 1.7 1.9C311
1.5C312 2.2 2.4C313
D313?C313 pavC313 Rpd 1.3 1.4C11 1.3
2C2 1.6 2C311
2.2C312 1.5 1.8C313
D311?C311 pavC311 Rpd 1.4C11
1.2 2C2 1.1C311
2.2C312 2.2 2.4C313
Rpd removes even more of the potential for false
positives.
11
FAUST Hull Classification 2 (TESTING)
D = ΣTS
Rpd   Sp    Lpd   trueCL  doc  pred(R)  pred(S)  pred(L)  Final predicted
1.41  2.19  -0.4  11      d3   2        Oth      11       Other
1.40  2.06  -0.3  2       d4   2        2        11       Other
1.92  3.71  0.01  2       d14  Oth      2        11       Other
1.38  1.99  -0.2  2       d23  211      211      Oth      Other
1.97  3.92  -0.1  311     d29  Oth      312313   11       Other
2.22  4.99  -0.2  312     d13  313      313      11       Other
2.60  6.78  -0.0  313     d26  Oth      Oth      11       Other
1.40  2.13  -0.3          d6   211      2        11       Other
2.50  6.37  0.34          d7   313      Oth      2        Other
1.40  2.06  -0.3          d18  211      211      Oth      Other
2.42  5.92  -0.1          d21  313      Oth      Oth      Other
3.46  12.2  0.47          d35  Oth      Oth      Oth      Other
2.60  6.78  -0.0          d39  Oth      Oth      11       Other
2.35  5.57  0.14          d46  Oth      Oth      2        Other
1.41  2.19  -0.4          d49  2        2        Oth      Other
8/15 = 53% correct just with D = ΣTS, p = AvgTS. Note: It's
likely to get worse as we consider more D's.
Let's think about TrainingSet quality resulting
from clustering. This is a poor-quality TrainingSet
(from clustering Mother Goose Rhymes). MGR is a
difficult corpus to cluster since, in MGR,
almost every document is isolated (an outlier),
so the clustering is vague (no 2 MGRs deal with
the same topic, so their word use is quite
different). Instead of tightening the class
hulls by replacing CLASSmin and CLASSmax by
CLASSfpci (fpci = first precipitous count increase)
and CLASSlpcd, we might loosen the class hulls (since
we know the classes are somewhat arbitrary) by
expanding the [CLASSmin, CLASSmax] interval as
follows: Let A = Avg{CLASSmin, CLASSmax} and
R (for radius) = A - CLASSmin (= CLASSmax - A
also). Use [A-R-e, A+R+e]. Letting e = .8 increases
accuracy to 100% (assuming all Other stay Other).
e.8 predicted Class 11 2 2 2 311(all 3112
all) 312(all 312313 a Other . . . . . . . Other
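The hull-loosening rule [A-R-e, A+R+e] can be written out as a short sketch. The class bounds and test value below are hypothetical (they are not the slide's numbers), and the function names are assumptions made for illustration. Algebraically the loosened interval is just [CLASSmin - e, CLASSmax + e].

```python
def loosened_interval(class_min, class_max, e):
    """Expand [class_min, class_max] to [A - R - e, A + R + e], where
    A is the interval center and R its radius."""
    A = (class_min + class_max) / 2
    R = A - class_min
    return A - R - e, A + R + e

def predict(intervals, y):
    """Return every class whose interval contains y, or ["Other"] if none does."""
    hits = [label for label, (lo, hi) in intervals.items() if lo <= y <= hi]
    return hits if hits else ["Other"]

# Hypothetical class bounds along one functional.
tight = {"A": (1.0, 2.0)}
loose = {k: loosened_interval(lo, hi, 0.5) for k, (lo, hi) in tight.items()}
```

A test value of 2.25 is rejected by the tight hull but accepted once the hull is loosened with e = 0.5, which is the effect the e = .8 experiment relies on.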
Finally, it occurs to me that clustering to
produce a TrainingSet, then setting aside a
TestSet, gives a good way to measure the quality
of the clustering. If the TestSet part
classifies well under the TrainingSet part, the
clustering must have been high quality (it produced
a good TrainingSet for classification). This
clustering quality test method is probably not
new (check the literature?). If it is new, we
might have a paper here? (Discuss this quality
measure and assess it using different e's?)
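A minimal sketch of this quality measure, under simplifying assumptions: each document is reduced to a single functional value, the cluster labels serve as classes, every third item is held out as the TestSet, and a test item is assigned a class only when exactly one loosened interval [min-e, max+e] contains it. All names here are hypothetical.

```python
def class_intervals(train):
    """train: list of (value, label). Returns {label: (min, max)} per class."""
    hulls = {}
    for value, label in train:
        lo, hi = hulls.get(label, (value, value))
        hulls[label] = (min(lo, value), max(hi, value))
    return hulls

def predict(value, hulls, e=0.0):
    """Class whose loosened interval contains value; 'Other' if none or several do."""
    hits = [lab for lab, (lo, hi) in hulls.items() if lo - e <= value <= hi + e]
    return hits[0] if len(hits) == 1 else "Other"

def clustering_quality(labeled, test_every=3, e=0.0):
    """Hold out every test_every-th item, train on the rest, return test accuracy."""
    test = labeled[::test_every]
    train = [x for i, x in enumerate(labeled) if i % test_every != 0]
    hulls = class_intervals(train)
    correct = sum(predict(v, hulls, e) == lab for v, lab in test)
    return correct / len(test)

# Illustrative data: two well-separated clusters on one functional.
data = [(1.0, "a"), (1.1, "a"), (1.2, "a"), (5.0, "b"), (5.1, "b"), (5.2, "b")]
print(clustering_quality(data, e=0.0), clustering_quality(data, e=0.2))
```

Running it for several values of e gives the kind of comparison suggested above: a clustering whose held-out part classifies well for small e is tighter than one that needs a large e.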
12
APPENDIX: FAUST Clustering 2. Other variations of
the FAUST Clustering1 Algorithm
Functional Gap Cluster Dendrogram
D=sum of all docs in subcluster, but use all gaps!
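One level of this functional gap clustering might look like the sketch below. It assumes documents are term-count vectors, takes D as the sum of all document vectors in the current subcluster, projects each document onto D with a dot product, and cuts at every gap wider than a threshold. The threshold parameter and all names are assumptions for illustration; the slide does not state how large a gap must be.

```python
def functional_gap_clusters(docs, gap_threshold):
    """docs: {doc_id: [term counts]}. Cut the sorted projections onto D at every gap."""
    dim = len(next(iter(docs.values())))
    # D = sum of all document vectors in this (sub)cluster.
    d = [sum(vec[i] for vec in docs.values()) for i in range(dim)]
    # Functional value of each doc: dot product with D, sorted ascending.
    proj = sorted((sum(x * y for x, y in zip(vec, d)), doc_id)
                  for doc_id, vec in docs.items())
    clusters, current = [], [proj[0][1]]
    for (prev, _), (val, doc_id) in zip(proj, proj[1:]):
        if val - prev > gap_threshold:   # a gap: close the current cluster
            clusters.append(current)
            current = []
        current.append(doc_id)
    clusters.append(current)
    return clusters

# Illustrative two-term vectors.
docs = {"a": [1, 0], "b": [1, 1], "c": [5, 5], "d": [6, 5]}
print(functional_gap_clusters(docs, gap_threshold=20))
```

To build a dendrogram, the same function would be re-applied to each resulting subcluster, recomputing D from that subcluster's documents only.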
22. I had a little husband no bigger than my
thumb. I put him in a pint pot, and there I bid
him drum. I bought a little handkerchief to wipe
his little nose and a pair of little garters to
tie his little hose. 49. There was a little girl
who had a little curl right in the middle of her
forehead. When she was good she was very very
good and when she was bad she was horrid.
2. This little pig went to market. This little
pig stayed at home. This little pig had roast
beef. This little pig had none. This little pig
said Wee, wee. I can't find my way home. 16.
Flour of England, fruit of Spain, met together in
a shower of rain. Put in a bag tied round with a
string. If you'll tell me this riddle, I will
give you a ring. 42. Bat bat, come under my hat
and I will give you a slice of bacon. And when I
bake I will give you a cake, if I am not mistaken.
3. Diddle diddle dumpling, my son John. Went to
bed with his breeches on, one stocking off, and
one stocking on. Diddle diddle dumpling, my son
John. 43. Hark hark, the dogs do bark! Beggars
are coming to town. Some in jags and some in rags
and some in velvet gowns.
6. See a pin and pick it up. All the day you will
have good luck. See a pin and let it lay. Bad
luck you will have all the day. 18. I had two
pigeons bright and gay. They flew from me the
other day. What was the reason they did go? I can
not tell, for I do not know.
23. How many miles is it to Babylon? Three score
miles and ten. Can I get there by candle light?
Yes, and back again. If your heels are nimble and
light, you may get there by candle light. 25.
There was an old woman, and what do you think?
She lived upon nothing but victuals, and drink.
Victuals and drink were the chief of her diet,
and yet this old woman could never be quiet. 36.
Little Tommy Tittlemouse lived in a little house.
He caught fishes in other mens ditches.
8. Jack Sprat could eat no fat. His wife could
eat no lean. And so between them both they licked
the platter clean. 33. Buttons, a farthing a
pair! Come, who will buy them of me? They are
round and sound and pretty and fit for girls of
the city. Come, who will buy them of me? Buttons,
a farthing a pair! 48. One two, buckle my shoe.
Three four, knock at the door. Five six, pick up
sticks. Seven eight, lay them straight. Nine ten.
a good fat hen. Eleven twelve, dig and delve.
Thirteen fourteen, maids a courting. Fifteen
sixteen, maids in the kitchen. Seventeen
eighteen. maids a waiting. Nineteen twenty, my
plate is empty.
37. Here we go round mulberry bush, mulberry
bush, mulberry bush. Here we go round mulberry
bush, on a cold and frosty morning. This is way
we wash our hands, wash our hands, wash our
hands. This is way we wash our hands, on a cold
and frosty morning. This is way we wash our
clothes, wash our clothes, wash our clothes. This
is way we wash our clothes, on a cold and frosty
morning. This is way we go to school, go to
school, go to school. This is the way we go to
school, on a cold and frosty morning. This is the
way we come out of school, come out of school,
come out of school. This is the way we come out
of school, on a cold and frosty morning. 47.
Cocks crow in the morn to tell us to rise and he
who lies late will never be wise. For early to
bed and early to rise, is the way to be healthy
and wealthy and wise.
1. Three blind mice! See how they run! They all
ran after the farmer's wife, who cut off their
tails with a carving knife. Did you ever see such
a thing in your life as three blind mice? 27. Cry
baby cry. Put your finger in your eye and tell
your mother it was not I. 45. Bye baby bunting.
Father has gone hunting. Mother has gone milking.
Sister has gone silking. And brother has gone to
buy a skin to wrap the baby bunting in.
11. One misty moisty morning when cloudy was the
weather, I chanced to meet an old man clothed all
in leather. He began to compliment and I began to
grin. How do you do And how do you do? And how do
you do again 29. When little Fred went to bed, he
always said his prayers. He kissed his mamma and
then his papa, and straight away went upstairs.
13. A robin and a robins son once went to town to
buy a bun. They could not decide on plum or
plain. And so they went back home again. 50.
Little Jack Horner sat in the corner, eating of
Christmas pie. He put in his thumb and pulled out
a plum and said What a good boy am I!
13
FAUST Clustering3 (HOB clustering1)
Functional Gap Clusterer
D=sum of all docs in subcluster, but use HOB!
27. Cry baby cry. Put your finger in your eye and
tell your mother it was not I. 45. Bye baby
bunting. Father has gone hunting. Mother has gone
milking. Sister has gone silking. And brother has
gone to buy a skin to wrap the baby bunting in.
1 d45 1 d27 1 d1
d7 d35
3 4 6 8 9 23 25 33 36 38 43 47
2 16 22 42 49
1. Three blind mice! See how they run! They all
ran after the farmer's wife, who cut off their
tails with a carving knife. Did you ever see such
a thing in your life as three blind mice? 5.
Humpty Dumpty sat on a wall. Humpty Dumpty had a
great fall. All the Kings horses, and all the
Kings men cannot put Humpty Dumpty together
again. 10. Jack and Jill went up the hill to
fetch a pail of water. Jack fell down, and broke
his crown and Jill came tumbling after. When up
Jack got and off did trot as fast as he could
caper, to old Dame Dob who patched his nob with
vinegar and brown paper. 11. One misty moisty
morning when cloudy was the weather, I chanced to
meet an old man clothed all in leather. He began
to compliment and I began to grin. How do you do
And how do you do? And how do you do again 13. A
robin and a robins son once went to town to buy a
bun. They could not decide on plum or plain. And
so they went back home again. 14. If all the seas
were one sea, what a great sea that would be! And
if all the trees were one tree, what a great tree
that would be! And if all the axes were one axe,
what a great axe that would be! And if all the
men were one man what a great man he would be!
And if the great man took the great axe and cut
down the great tree and let it fall into the
great sea, what a splish splash that would
be! 17. Here sits the Lord Mayor. Here sit his
two men. Here sits the cock. Here sits the hen.
Here sit the little chickens. Here they run in.
Chin chopper, chin chopper, chin chopper,
chin! 21. The Lion and the Unicorn were fighting
for the crown. The Lion beat the Unicorn all
around the town. Some gave them white bread and
some gave them brown. Some gave them plum cake,
and sent them out of town. 26. Sleep baby sleep.
Our cottage valley is deep. The little lamb is on
the green with woolly fleece so soft and clean.
Sleep baby sleep. Sleep baby sleep, down where
the woodbines creep. Be always like the lamb so
mild, a kind and sweet and gentle child. Sleep
baby sleep. 28. Baa baa black sheep, have you any
wool? Yes sir yes sir, three bags full. One for
my master and one for my dame, but none for the
little boy who cries in the lane. 29. When little
Fred went to bed, he always said his prayers. He
kissed his mamma and then his papa, and straight
away went upstairs. 30. Hey diddle diddle! The
cat and the fiddle. The cow jumped over the moon.
The little dog laughed to see such sport, and the
dish ran away with the spoon. 32. Jack come and
give me your fiddle, if ever you mean to thrive.
No I will not give my fiddle to any man alive. If
I should give my fiddle they will think that I've
gone mad. For many a joyous day my fiddle and I
have had 50. Little Jack Horner sat in the
corner, eating of Christmas pie. He put in his
thumb and pulled out a plum and said What a good
boy am I!
d7 d35
1 5 10 11 13 14 17 21 26 27 28 29 30 32 39 41 45 46 50
.4 d11 .4 d29
1.37 d50 1.37 d13
6 23 25 36 43
3 4 8 9 33 38 47
3 4 8 9 33 38
47
15. Great A. little a. This is pancake day. Toss
the ball high. Throw the ball low. Those that
come after may sing heigh ho! 44. The hart he
loves the high wood. The hare she loves the hill.
The Knight he loves his bright sword. The Lady
loves her will.
39 46
27 45
1 5 10 11 13 14 17 21 26 28 29 30 32 50
12. There came an old woman from France who
taught grown-up children to dance. But they were
so stiff she sent them home in a sniff. This
sprightly old woman from France. 48. One two,
buckle my shoe. Three four, knock at the door.
Five six, pick up sticks. Seven eight, lay them
straight. Nine ten. a good fat hen. Eleven
twelve, dig and delve. Thirteen fourteen, maids a
courting. Fifteen sixteen, maids in the kitchen.
Seventeen eighteen. maids a waiting. Nineteen
twenty, my plate is empty.
4 8 9 33 38
3
d48 d33 d8
4. Little Miss Muffet sat on a tuffet, eating of
curds and whey. There came a big spider and sat
down beside her and frightened Miss Muffet
away. 9. Hush baby. Daddy is near. Mamma is a
lady and that is very clear. 33. Buttons, a
farthing a pair! Come, who will buy them of me?
They are round and sound and pretty and fit for
girls of the city. Come, who will buy them of me?
Buttons, a farthing a pair! 38. If I had as much
money as I could tell, I never would cry young
lambs to sell. Young lambs to sell, young lambs
to sell. I never would cry young lambs to sell.
4 9 33 38
8
6. See a pin and pick it up. All the day you will
have good luck. See a pin and let it lay. Bad
luck you will have all the day. 23. How many
miles is it to Babylon? Three score miles and
ten. Can I get there by candle light? Yes, and
back again. If your heels are nimble and light,
you may get there by candle light. 25. There was
an old woman, and what do you think? She lived
upon nothing but victuals, and drink. Victuals
and drink were the chief of her diet, and yet
this old woman could never be quiet. 36. Little
Tommy Tittlemouse lived in a little house. He
caught fishes in other mens ditches. 43. Hark
hark, the dogs do bark! Beggars are coming to
town. Some in jags and some in rags and some in
velvet gowns.
0.63 d3 0.63 d43
0.94 d18 0.94 d6
41
0.47 d23 0.47 d25 0.47 d36
2. This little pig went to market. This little
pig stayed at home. This little pig had roast
beef. This little pig had none. This little pig
said Wee, wee. I can't find my way home. 16.
Flour of England, fruit of Spain, met together in
a shower of rain. Put in a bag tied round with a
string. If you'll tell me this riddle, I will
give you a ring. 22. I had a little husband no
bigger than my thumb. I put him in a pint pot,
and there I bid him drum. I bought a little
handkerchief to wipe his little nose and a pair
of little garters to tie his little hose. 42. Bat
bat, come under my hat and I will give you a
slice of bacon. And when I bake I will give you a
cake, if I am not mistaken. 49. There was a
little girl who had a little curl right in the
middle of her forehead. When she was good she was
very very good and when she was bad she was
horrid.
0.17 d22 0.17 d49
0.21 d42 0.21 d2 0.21 d16
14
FAUST Clustering4
C1 (mother theme) 7. Old Mother Hubbard went to
the cupboard to give her poor dog a bone. When
she got there cupboard was bare and so the poor
dog had none. She went to baker to buy him some
bread. When she came back dog was dead. 9. Hush
baby. Daddy is near. Mamma is a lady and that is
very clear. 27. Cry baby cry. Put your finger in
your eye and tell your mother it was not I. 45.
Bye baby bunting. Father has gone hunting. Mother
has gone milking. Sister has gone silking. And
brother has gone to buy a skin to wrap the baby
bunting in.
WS0 2 3 13 20 22 25 38 42 44 49
52 ------------------------------------- DS1
WS1 3 13 20 42 7 ------------------ 27
DS2 WS2 3 13 42 45 7 ------------ 46
9 DS3C1 27 7 45 9
27 45
C2 1. Three blind mice! See how they run! They
all ran after the farmer's wife, who cut off
their tails with a carving knife. Did you ever
see such a thing in your life as three blind
mice? 4. Little Miss Muffet sat on a tuffet,
eating of curds and whey. There came a big spider
and sat down beside her and frightened Miss
Muffet away. 11. One misty moisty morning when
cloudy was the weather, I chanced to meet an old
man clothed all in leather. He began to
compliment and I began to grin. How do you do And
how do you do? And how do you do again 17. Here
sits the Lord Mayor. Here sit his 2 men. Here
sits the cock. Here sits the hen. Here sit the
little chickens. Here they run in. Chin chopper,
chin chopper, chin chopper, chin! 30. Hey diddle
diddle! The cat and the fiddle. The cow jumped
over the moon. The little dog laughed to see such
sport, and the dish ran away with the spoon. 32.
Jack come and give me your fiddle, if ever you
mean to thrive. No I will not give my fiddle to
any man alive. If I should give my fiddle they
will think that I've gone mad. For many a joyous
day my fiddle and I have had 41. Old King Cole
was a merry old soul. And a merry old soul was
he. He called for his pipe and he called for his
bowl and he called for his fiddlers three. And
every fiddler, he had a fine fiddle and a very
fine fiddle had he. There is none so rare as can
compare with King Cole and his fiddlers
three. 46. Tom Tom the piper's son, stole a pig
and away he run. The pig was eat and Tom was beat
and Tom ran crying down the street.
WS0 2 22 25 38 44 49 52 --------------------------
--- DS1 WS1 2 25 27 38 44 49 52 1
----------------------- 4 DS2 11 1 17
4 30 11 32 17 41 30 46 32
41 46
This is not as good a cluster as C1. Let's try
starting with DS0 = docs where dc(MG') > ½ max (> 6.5)
C2 (pie theme) 35. Sing a song of sixpence, a
pocket full of rye. Four and twenty blackbirds,
baked in a pie. When the pie was opened, the
birds began to sing. Was not that a dainty dish
to set before the king? The king was in his
counting house, counting out his money. The queen
was in the parlor, eating bread and honey. The
maid was in the garden, hanging out the clothes.
When down came a blackbird and snapped off her
nose. 39. A little cock sparrow sat on a green
tree. And he chirped and chirped, so merry was
he. A naughty boy with his bow and arrow,
determined to shoot this little cock sparrow.
This little cock sparrow shall make me a stew,
and his giblets shall make me a little pie, too.
Oh no, says sparrow, I'll not make a stew. So he
flapped his wings and away he flew. 50. Little
Jack Horner sat in the corner, eating of
Christmas pie. He put in his thumb and pulled out
a plum and said What a good boy am I!
DS0 WS130 45 26 -------------------- 35 DS1
WS29 25 30 45 39 26 --------------- 35
DS2WS39 25 45 39 35 -----------
50 39 DS3 50 35 39
50
C3 1. Three blind mice! See how they run! They
all ran after the farmer's wife, who cut off
their tails with a carving knife. Did you ever
see such a thing in your life as three blind
mice? 11. One misty moisty morning when cloudy
was the weather, I chanced to meet an old man
clothed all in leather. He began to compliment
and I began to grin. How do you do And how do you
do? And how do you do again 17. Here sits the
Lord Mayor. Here sit his two men. Here sits the
cock. Here sits the hen. Here sit the little
chickens. Here they run in. Chin chopper, chin
chopper, chin chopper, chin! 30. Hey diddle
diddle! The cat and the fiddle. The cow jumped
over the moon. The little dog laughed to see such
sport, and the dish ran away with the spoon. 32.
Jack come and give me your fiddle, if ever you
mean to thrive. No I will not give my fiddle to
any man alive. If I should give my fiddle they
will think that I've gone mad. For many a joyous
day my fiddle and I have had 41. Old King Cole
was a merry old soul. And a merry old soul was
he. He called for his pipe and he called for his
bowl and he called for his fiddlers three. And
every fiddler, he had a fine fiddle and a very
fine fiddle had he. There is none so rare as can
compare with King Cole and his fiddlers
three. 46. Tom Tom the piper's son, stole a pig
and away he run. The pig was eat and Tom was beat
and Tom ran crying down the street.
WS02 22 38 44 49 52 ------------------------- DS1
WS12 27 38 44 49 52 1 --------------------
11 DS2 17 1 30 11 32 17 41 30
46 32 41 46
This is not a good cluster! Let's try again, starting
with DS0 = docs where dc(MG'') > ½ max (> 3.5)
C3 (crown and brown theme?) 10. Jack and Jill
went up the hill to fetch a pail of water. Jack
fell down, and broke his crown and Jill came
tumbling after. When up Jack got and off did trot
as fast as he could caper, to old Dame Dob who
patched his nob with vinegar and brown paper. 21.
The Lion and the Unicorn were fighting for the
crown. The Lion beat the Unicorn all around the
town. Some gave them white bread and some gave
them brown. Some gave them plum cake, and sent
them out of town.
DS0 WS11 8 12 19 26 32 41 47 54 57 60 10 29
---------------------------------- 13 37 DS1
WS212 19 14 44 10 --------------------------
--- 21 47 21 DS2 26 10 28
21
Remove C3. Start with DS0 = docs where dc(MG''') > ½ max
(> 3.5)
C4 (morning theme) 37. Here we go round mulberry
bush, mulberry bush, mulberry bush. Here we go
round mulberry bush, on a cold and frosty
morning. This is way we wash our hands, wash our
hands, wash our hands. This is way we wash our
hands, on a cold and frosty morning. This is way
we wash our clothes, wash our clothes, wash our
clothes. This is way we wash our clothes, on a
cold and frosty morning. This is way we go to
school, go to school, go to school. This is the
way we go to school, on a cold and frosty
morning. This is the way we come out of school,
come out of school, come out of school. This is
the way we come out of school, on a cold and
frosty morning. 47. Cocks crow in the morn to
tell us to rise and he who lies late will never
be wise. For early to bed and early to rise, is
the way to be healthy and wealthy and wise.
DS0 WS1 1 8 41 57 60 13
---------------------------- 14 DS1 WS2 1
8 41 57 26 26 ---------------------- 28
29 DS2 WS3 8 41 57 29 37 29
---------------- 37 47 37 DS3 WS4 41
57 44 47 37 ----------- 47
47 DS4 37
47
Remove C4. Start with DS0 = docs where dc(MG''') > ½ max
(> 3.5)
15
FAUST Clustering4 (continued)
Remove C3. Start with DS0 = docs where dc(MG''') > ½ max
(> 3.5)
C5 (sheep theme? (But 13 is an internal class
outlier!)) Let's consider an alternative C5
starting with DS0 instead of WS0! 13. A robin and
a robins son once went to town to buy a bun. They
could not decide on plum or plain. And so they
went back home again. 26. Sleep baby sleep. Our
cottage valley is deep. The little lamb is on the
green with woolly fleece so soft and clean. Sleep
baby sleep. Sleep baby sleep, down where the
woodbines creep. Be always like the lamb so mild,
a kind and sweet and gentle child. Sleep baby
sleep. 28. Baa baa black sheep, have you any
wool? Yes sir yes sir, three bags full. One for
my master and one for my dame, but none for the
little boy who cries in the lane.
WS0 1 3 4 5 6 8 11 13 15 16 20 22 25 26 29 31 36
38 44 48 51 52 54 59 60 --------------------------
---------- DS1 WS1 1 3 4 6 9 13 15 16 20 28 30
36 47 51 52 54 6 13 ---------------------
---------- 26 28 DS2 13 26 28
C5 (sleep-lamb hub(26) and spokes(28,29)
theme?) 26. Sleep baby sleep. Our cottage valley
is deep. The little lamb is on the green with
woolly fleece so soft and clean. Sleep baby
sleep. Sleep baby sleep, down where the woodbines
creep. Be always like the lamb so mild, a kind
and sweet and gentle child. Sleep baby sleep. 28.
Baa baa black sheep, have you any wool? Yes sir
yes sir, three bags full. One for my master and
one for my dame, but none for the little boy who
cries in the lane. 29. When little Fred went to
bed, he always said his prayers. He kissed his
mamma and then his papa, and straight away went
upstairs.
DS0 WS1 1 60 13 -------------- 14 DS1
WS2 1 60 26 26 28 28 29 29
44
Remove C5. Start with DS0 = docs where dc(MG''') > ½ max
(> 2.5)
C6 fall (and men) theme 5. Humpty Dumpty sat on a
wall. Humpty Dumpty had a great fall. All the
Kings horses, and all the Kings men cannot put
Humpty Dumpty together again. 14. If all the seas
were one sea, what a great sea that would be! And
if all the trees were one tree, what a great tree
that would be! And if all the axes were one axe,
what a great axe that would be! And if all the
men were one man what a great man he would be!
And if the great man took the great axe and cut
down the great tree and let it fall into the
great sea, what a splish splash that would be!
DS0 WS113 26 31 38 5 33----------
8 38DS1WS226 14 15 38 12 44 5
------ 13 4814 DS2 5 14
Remove C6. Start with DS0 = docs where dc(MG''') > ½ max
(> 2.5)
C7 hub(buy,13,33) spoke(high,15,44) theme 13. A
robin and a robins son once went to town to buy a
bun. They could not decide on plum or plain. And
so they went back home again. 15. Great A. little
a. This is pancake day. Toss the ball high. Throw
the ball low. Those that come after may sing
heigh ho! 33. Buttons, a farthing a pair! Come,
who will buy them of me? They are round and sound
and pretty and fit for girls of the city. Come,
who will buy them of me? Buttons, a farthing a
pair! 44. The hart he loves the high wood. The
hare she loves the hill. The Knight he loves his
bright sword. The Lady loves her will.
DS0WS1 13 31 8 -------------- 12 DS1
WS213 31 13 13 --------- 15 15 33 33
38 44 44 48
Remove C7. Start with DS0 = docs where dc(MG''') > ½ max
(> 1.5)
C8 old people theme 11. One misty moisty morning
when cloudy was the weather, I chanced to meet an
old man clothed all in leather. He began to
compliment and I began to grin. How do you do And
how do you do? And how do you do again 25. There
was an old woman, and what do you think? She
lived upon nothing but victuals, and drink.
Victuals and drink were the chief of her diet,
and yet this old woman could never be quiet.
DS0WS1 5 22 25 44 52 59 all-------------
-- DS1WS244 59 6 ----------- 12
DS2 WS3 25 12 44 59 25
Remove C8. Start with DS0 = docs where dc(MG''') > ½ max
(> 1.5)
C9 theme? 4. Little Miss Muffet sat on a tuffet,
eating of curds and whey. There came a big spider
and sat down beside her and frightened Miss
Muffet away. 6. See a pin and pick it up. All the
day you will have good luck. See a pin and let it
lay. Bad luck you will have all the day. 18. I
had two pigeons bright and gay. They flew from me
the other day. What was the reason they did go? I
can not tell, for I do not know. 49. There was a
little girl who had a little curl right in the
middle of her forehead. When she was good she was
very very good and when she was bad she was
horrid.
DS0WS1 5 22 25 53 all-------------- DS1WS2
5 22 25 6 --------------- 8 DS2WS3 5
22 25 18 4 22 6 49 18 49
Remove C9. Start with DS0 = docs where dc(MG''') > ½ max
(> 1.5)
C10 theme? 2. This little pig went to market.
This little pig stayed at home. This little pig
had roast beef. This little pig had none. This
little pig said Wee, wee. I can't find my way
home. 3. Diddle diddle dumpling, my son John.
Went to bed with his breeches on, one stocking
off, and one stocking on. Diddle diddle dumpling,
my son John. 8. Jack Sprat could eat no fat. His
wife could eat no lean. And so between them both
they licked the platter clean. 16. Flour of
England, fruit of Spain, met together in a shower
of rain. Put in a bag tied round with a string.
If you'll tell me this riddle, I will give you a
ring. 22. I had a little husband no bigger than
my thumb. I put him in a pint pot, and there I
bid him drum. I bought a little handkerchief to
wipe his little nose and a pair of little garters
to tie his little hose. 23. How many miles is it
to Babylon? Three score miles and ten. Can I get
there by candle light? Yes, and back again. If
your heels are nimble and light, you may get
there by candle light. 36. Little Tommy
Tittlemouse lived in a little house. He caught
fishes in other mens ditches. 37. Here we go
round mulberry bush, mulberry bush, mulberry
bush. Here we go round mulberry bush, on a cold
and frosty morning. This is way we wash our
hands, wash our hands, wash our hands. This 38.
If I had as much money as I could tell, I never
would cry young lambs to sell. Young lambs to
sell, young lambs to sell. I never would cry
young lambs to sell. 39. A little cock sparrow
sat on a green tree. And he chirped and chirped,
so merry was he. A naughty boy with his bow and
arrow, determined to shoot this little cock
sparrow. This little cock sparrow shall make me a
stew, and his giblets shall make me a little pie,
too. Oh no, says the sparrow, I will not make a
stew. So he flapped his wings and away he
flew. 42. Bat bat, come under my hat and I will
give you a slice of bacon. And when I bake I will
give you a cake, if I am not mistaken. 43. Hark
hark, the dogs do bark! Beggars are coming to
town. Some in jags and some in rags and some in
velvet gowns. 48. One two, buckle my shoe. Three
four, knock at the door. Five six, pick up sticks.
Seven eight, lay them straight. Nine ten. a good
fat hen. Eleven twelve, dig and delve. Thirteen
fourteen, maids a courting. Fiftee