Title: Why Not Store Everything in Main Memory? Why use disks?
Slide 1
Near Neighbor Classifiers and FAUST

FAUST is really a Near Neighbor Classifier (NNC) in which, for each class, we construct a big box neighborhood (bbn) which we think, based on the training points, is most likely to contain that class and least likely to contain the other classes.
In the current FAUST, each bbn is a coordinate box, i.e., for coordinate (band) R, the coordinate_box cb(R, class, aR, bR) is the set of all points x such that aR < xR < bR (either of aR or bR can be infinite, and either or both of the < can be ≤). The values aR and bR are what we have called the cut_points for that class.
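As a concrete illustration (our own minimal sketch; the function name cb_contains and the dict-of-bands pixel layout are assumptions, not from the slides), the cb membership test reduces to a per-band interval check:

import math

def cb_contains(x, band, a, b, closed_lo=False, closed_hi=False):
    """Membership test for a coordinate_box cb(band, class, a, b):
    the set of points x with a < x[band] < b.  Either bound may be
    infinite, and either strict < may be relaxed to <=."""
    v = x[band]
    lo_ok = v >= a if closed_lo else v > a
    hi_ok = v <= b if closed_hi else v < b
    return lo_ok and hi_ok

# A pixel with three bands; cut_points 80 and 140 on the R band.
pixel = {"R": 120, "G": 200, "B": 90}
print(cb_contains(pixel, "R", 80, 140))        # True
print(cb_contains(pixel, "G", 150, math.inf))  # True: bG is infinite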
bbn's are constructed using the training set and applied to the full set of unclassified pixels. The bbn's are always applied sequentially, but can be constructed either sequentially or divisively. If the construction is sequential, the application sequence is the same as the construction sequence (and the application for each class follows the construction for that class immediately, i.e., before the next bbn is constructed). All pixels in the first bbn are classified into the first class (the class of that bbn). All remaining pixels which are in the second bbn are classified into the second class (the class of that bbn), etc. Thus, iteratively, all remaining unclassified pixels which fall in the next bbn are classified into its class.
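A minimal sketch of that sequential application step (the helper names and data layout are our assumptions):

def apply_bbns_sequentially(pixels, bbns):
    """Sequential application: every pixel in the first bbn gets the
    first class; remaining pixels in the second bbn get the second
    class; and so on.  Pixels in no bbn are left unclassified (None).
    pixels: dict pixel_id -> feature dict
    bbns:   ordered list of (class_label, contains_fn) pairs"""
    labels = {p: None for p in pixels}
    remaining = set(pixels)
    for cls, contains in bbns:
        hit = {p for p in remaining if contains(pixels[p])}
        for p in hit:
            labels[p] = cls
        remaining -= hit  # only still-unclassified pixels continue
    return labels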
The reason bbn's are applied sequentially is that they intersect. Thus, the first bbn should be the strongest in some sense, then the next strongest, then the next strongest, etc. In each round, from the remaining classes, we construct FAUST cb's by choosing the attribute-class with the maximum gap between consecutive mean values, or the maximum number of stds between consecutive means, or the gap between consecutive means allowing the minimum rank (i.e., the "best remaining gap"). Note that the mean can be replaced by the median or any other representer.
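A sketch of one way to compute the "best remaining gap" under the max-gap-between-consecutive-means criterion (illustrative only; the slides give no code, and the training-dict layout is assumed):

import statistics

def best_remaining_gap(training, remaining_classes, attributes):
    """Pick the (attribute, class) whose mean is separated from the
    nearest consecutive class mean by the widest gap, among the
    classes not yet boxed.  training: dict (attr, class) -> values."""
    best = None
    for attr in attributes:
        # Order the remaining classes by their mean in this attribute.
        means = sorted((statistics.mean(training[(attr, c)]), c)
                       for c in remaining_classes)
        for i, (m, c) in enumerate(means):
            gaps = []
            if i > 0:
                gaps.append(m - means[i - 1][0])
            if i + 1 < len(means):
                gaps.append(means[i + 1][0] - m)
            if not gaps:
                continue
            gap = min(gaps)  # gap to the nearest consecutive mean
            if best is None or gap > best[2]:
                best = (attr, c, gap)
    return best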
We could take the bbn's to be "multi-coordinate_band" (mcb) boxes of the form: the INTERSECTION of the "best" k cb's for a given class (k ≤ n-1, assuming n classes), where "best" can be with respect to any of the above maximizations. And instead of using a fixed number of coordinates, k, we could use only those coordinates in which the "quality" of the cb is higher than a threshold, where "quality" might be measured many ways involving the dimensions of the gaps (or in other ways?). Many pixels may then not get classified (this hypothesis needs testing!). It should be accurate, though.
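A sketch of an mcb membership test as a pure intersection of cb's (names and interval values are illustrative assumptions):

def mcb_contains(x, cbs):
    """An mcb bbn is the INTERSECTION of several coordinate boxes:
    x belongs iff every per-band interval test passes.
    cbs: list of (band, a, b) open intervals a < x[band] < b."""
    return all(a < x[band] < b for band, a, b in cbs)

# Intersect the "best" two cb's for a hypothetical class.
box = [("R", 80, 140), ("G", float("-inf"), 100)]
print(mcb_contains({"R": 120, "G": 60, "B": 30}, box))  # True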
Slide 2
Near Neighbor Classifiers and FAUST-2
We note that mcb's are already used for vegetation indexing: high green (aG high and bG = ∞, i.e., all x such that xG > aG) together with low red (aR = -∞ and bR low, i.e., all x such that xR < bR) is the standard "vegetation index," and it measures crop health well. So if, instead of predicting grass, we were predicting lush grass, we could use the vegetation index, which involves mcb bbn's. Similarly, mcb bbn's would be used for any colored object whose color is not pure (in the bands provided). Therefore a "blue-red" car would ideally involve a bbn that is the intersection of a red cb and a blue cb. Most paint colors are not pure. Worse yet, what does pure mean? Pure only makes sense in the context of the camera taking the image in the first place. The definition of a pure color in a given image is a color entirely within one band (column) of that image dataset (with all other bands showing zero values only). So almost all actual objects would be multi-color objects and would require, or at least benefit from, a multi-cb bbn approach.
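A sketch of the lush-grass mcb just described; the thresholds a_G and b_R are made-up illustrative values, not from the slides:

def lush_vegetation(pixel, a_G=120, b_R=60):
    """Vegetation-index mcb: high green (x_G > a_G, with bG = infinity)
    intersected with low red (x_R < b_R, with aR = -infinity)."""
    return pixel["G"] > a_G and pixel["R"] < b_R

print(lush_vegetation({"R": 35, "G": 180, "B": 50}))  # True
print(lush_vegetation({"R": 90, "G": 180, "B": 50}))  # False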
Slide 3
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 1 1 1 1 0 0 0 0 0 0 0 0 1 0
1 1 1 1 0 1 1 1 0 1 1 1 0 1 0 0 1 0 1 1 1 0 1 1 0 0 1 1 0 0
0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 0 1 0 1 1 0 0 0 0 1
0 0 0 0 0 1 0 0 1 0 1 1 1 1 1 1 0 1 0 0 1 1 0 0 1 1 0 0 1 0
1 1 1 1 0 1 1 1 1 1 1 0 0 1 1 1 0 1 1 0 0 0 0 0 0 1 1 0 1 0
0 1 1 0 1 0 1 0 1 1 1 1 0 0 1 1 1 0 1 1 1 1 0 0 0 1 1 0 1 1
0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 1 1 0 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1
0 1 0 1 1 1 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 1 1 1 1 1 1 1 1 1 0 0 1 0 1 1 1 0 1 1 1 0 1 0 0 1 0 0 0 0
0 1 1 1 1 1 0 1 0 1 0 1 1 0 0 1 0 0 1 0 0 1 1 0 1 0 0 0 1 0
0 1 1 1 1 1 1 0 0 1 0 0 1 0 0 1 0 1 0 1 1 1 1 0 0 0 0 1 0 0
1 1 0 0 0 0 0 0 1 0 0 1 1 1 1 1 1 0 0 0 0 1 1 1 0 1 1 1 0 1
1 1 1 1 0 1 1 1 1 1 1 0 1 1 1 1 0 1 0 0 0 1 1 1 0 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
1 0 1 0 0 0 0 1 1 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0
1 0 1 0 0 0 0 1 1 0 0 1 0 1 1 0 1 1 1 0 1 1 1 1 1 1 1 1 0 0
1 0 1 1 1 0 0 1 1 1 0 1 1 1 1 0 0 1 0 1 0 1 1 1 1 0 1 0 1 0
1 0 1 0 1 1 1 0 1 0 0 1 1 0 0 0 0 0 1 0 1 1 0 1 1 0 0 0 0 0
0 0 1 0 1 0 0 1 1 1 0 1 1 0 0 1 0 1 1 0 1 0 1 0 0 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 1 0 1 1 0 1 1 1 1
1 0 0 1 1 0 1 0 1 1 0 0 1 0 1 1 1 0 1 1 1 0 1 0 0 1 0 0 0 0
0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 1 0 0 0 0 1 0 1 0 1 0 1 0 0 0
1 0 1 1 0 1 1 1 1 1 0 0 0 1 0 1 0 1 1 1 1 1 0 1 1 0 1 1 0 1
Appendix

Note on problems. Difficult separations problem, e.g., white cars from white roofs: include, as feature attributes, the pixel coordinate value columns as well as the bands. If the color is not sufficiently different to make the distinction (and no other non-visible band makes the distinction either) and if the classes are contiguous objects (as they are in Aurora), then, because the white car training points are likely to be far from the white roof training points, FAUST may still work well using x and y pixel coordinates as additional feature attributes (and attributes such as "shape", edge_sharpness, etc., if available). CkNN, applied to neighbors taken from the training set, should also work.
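A minimal sketch of the coordinate-augmentation idea (the array layout and function name are our assumptions):

import numpy as np

def add_xy_features(image):
    """Append x and y pixel coordinates as extra feature columns so
    that spatially separated look-alikes (white cars vs. white roofs)
    can be split on position even when their bands match.
    image: ndarray of shape (n_bands, height, width)
    returns: ndarray of shape (height*width, n_bands + 2)"""
    n_bands, h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    bands = image.reshape(n_bands, -1).T
    coords = np.stack([xs.ravel(), ys.ravel()], axis=1)
    return np.hstack([bands, coords])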
Noise Class Problem. In pixel classification, there may be a Default_Class or Noise class (NO). (The Aurora classes are Red_Cars, White_Cars, Black_Cars, ASphalt, White_Roof, GRass and SHadow, and in the "Parking Lot Scene" case at least, there does not appear to be a NOise class, i.e., every pixel is in one of the 7 classes above.) So, in some cases, we may have 8 classes: RC, WC, BC, AS, WR, GR, SH, NO. Picking out NO may be a challenge for any algorithm if it contains pixels that match training pixels from several of the legitimate classes, i.e., if NO is composed of tuples with values similar to other classes. (Dr. Wettstein calls this the "red shirt" problem: if a person has a red shirt and is in the field of view, those pixels may be electromagnetically indistinguishable from Red_Car pixels. In that case, no correct algorithm will distinguish them electromagnetically, using only reflectance bands.) Other attributes, such as x and y position, size and shape (if available), etc., may provide a distinction.
Using FAUSTseq, where we (1) maximize the size of the gap between consecutive means, or (2) maximize the number of stds in the gap between consecutive means, or (3) minimize the K which produces no overlap (between the rankK set and the rank(n-K+1) set of the next class) in the gap between consecutive classes: instead of taking as cut_point the point produced by that maximization, we should back off from it and narrow the interval around that class mean by going only a fraction of the way either side (some parameterized fraction), which would remove many of the Noise Class points from that class prediction.
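A sketch of that back-off step; "fraction" is our name for the parameterized fraction above:

def narrowed_interval(class_mean, cut_lo, cut_hi, fraction=0.5):
    """Back off from the gap-derived cut points: go only a
    parameterized fraction of the way from the class mean toward
    each cut point, clipping off likely Noise Class points.
    fraction=1.0 reproduces the original interval."""
    lo = class_mean - fraction * (class_mean - cut_lo)
    hi = class_mean + fraction * (cut_hi - class_mean)
    return lo, hi

print(narrowed_interval(100, 80, 130, fraction=0.5))  # (90.0, 115.0)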
Inconsistent ordering of classes over the various
attributes (columns) may be an indicator of
something?
Slide 4
An old version of the basic algorithm: I took the first 40 tuples each of setosa, versicolor and virginica and put the other 30 tuples in a class called "noise".

1. Sort ACS's ascending by median; gap = rankK(this class) - rank(n-K+1)(next class).
2. Do Until ( rankK(ACS) ≤ rank(n-K+1)(next higher ACS in same A) or K = n/2 )
3.   Find gap, except Kth
4.   K = K - 1
END DO; return K for each (Attribute, Class) pair.
Build ACS tables (gap > 0). cut_pt = rankK + S·(gap), S = 1. Minimize K.
TsLN  cl  md  K   rnK  gap
      se  50  12  52   1
      no  57  12  51   1
      ve  60  15  62   1
      vi  64

TsWD  cl  md  K   rnK  gap
      ve  28  20  29   2
      vi  30  10  28   1
      no  30  12  31   1
      se

TpLN  cl  md  K   rnK  gap
      se  15  10  19   3
      no  42  17  41   2
      ve  44   5  48   1
      vi  56

TpWD  cl  md  K   rnK  gap
      se   2   7   4   2
      no  12  16  12   1
      ve  14   5  16   1
      vi  20
The 1st pass produces a tie for min K, in (pLN, vi) and (pWD, vi) (note that in both, vi doesn't have a higher gap since it is the highest class). Thus we can take both, and either AND the conditions or OR them. If we OR the conditions, (PpLN,vi ≥ 48) OR (PpWD,vi ≥ 16), we get perfect classification; if we AND them, we get 5 mistakes.
Recompute:

TsLN  cl  md  K   rnK  gap
      se  50  12  52   1
      no  57  12  51   1
      ve  60

TsWD  cl  md  K   rnK  gap
      ve  28  15  29   1
      no  30  12  31   1
      se

TpLN  cl  md  K   rnK  gap
      se  15  10  19   3
      no  42  17  41   2
      ve  44

TpWD  cl  md  K   rnK  gap
      se   2   7   4   2
      no  12  16  12   1
      ve  14
min K in (pWD, vi). PpWD,vi ≥ 5 gets 9 mistakes.
TsLN  cl  md  K   rnK  gap
      no  57  12  51   1
      ve  60

TsWD  cl  md  K   rnK  gap
      ve  28  15  29   1
      no  30

TpLN  cl  md  K   rnK  gap
      no  42  17  41   2
      ve  44

TpWD  cl  md  K   rnK  gap
      no  12  16  12   1
      ve  14
min K in (sLN, no). PsLN,no ≥ 51 gets 12 mistakes.
FAUSTseq,mrk VPHD. The set of training values in one column (attribute) and one class is called an Attribute-Class-Set, ACS. K(ACS) = |ACS| (all |ACS| = n = 10 here). In the algorithm below, c = root_count and ps = position (there is a separate root_count and position for each ACS and for each of K and n-K+1 for that ACS; so c = c(attr, class, K or (n-K+1))). S = gap enlargement parameter (it can be adjusted to try to clip out the Noise Class, NC).

1. Sort ACS's ascending by median; gap = rankK(this class) - rank(n-K+1)(next class).
2. Do Until ( rankK(ACS) ≤ rank(n-K+1)(next ACS) or K = 0 )
3.   Find the rankK and rank(n-K+1) values of each ACS (except the 1st and the Kth)
4.   K = K - 1
END DO; return K for each (Attribute, Class) pair.
5. Cut_pts are placed above/below that class (using values in attr):
   hi cut_pt = rankK + S·(higher_gap); low cut_pt = rank(n-K+1) - S·(lower_gap).
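A sketch of the K-minimizing loop in steps 1-4, under our reading of the garbled notation (taking rankK to mean the K-th smallest value of the ACS, which is an assumption):

def min_overlap_free_k(this_acs, next_acs, s=1.0):
    """Decrement K until the rankK value of this ACS no longer
    overlaps the rank(n-K+1) value of the next (higher-median) ACS,
    then place the high cut point enlarged by fraction s of the gap.
    this_acs, next_acs: lists of training values, same length n."""
    this_acs, next_acs = sorted(this_acs), sorted(next_acs)
    n = len(this_acs)
    k = n
    # rankK of this class is this_acs[k-1]; rank(n-K+1) of the next
    # class is next_acs[n-k].
    while k > 0 and this_acs[k - 1] > next_acs[n - k]:
        k -= 1  # still overlapping: shrink K and retry
    if k == 0:
        return None  # no overlap-free K exists
    gap = next_acs[n - k] - this_acs[k - 1]
    # Step 5: hi cut_pt = rankK + S*(higher_gap)
    return k, this_acs[k - 1] + s * gap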