Multidimensional Detective - PowerPoint PPT Presentation

Slides: 18
Provided by: Zhen7
Learn more at: https://ics.uci.edu
Transcript and Presenter's Notes

1
Multidimensional Detective
  • Alfred Inselberg, Multidimensional Graphs Ltd
  • Tel Aviv University, Israel
  • Presented by Yimeng Dou
  • 04-24-2002 ydou_at_ics.uci.edu

2
Parallel Coordinates
  • We can use parallel coordinates to model
    relations among multiple variables, and turn our
    problem into a 2-D pattern recognition problem.
  • It's very useful for visual data mining.
  • Two examples: a VLSI chip and a model of a country's
    economy.
  • The model can be used to do trade-off analyses,
    discover sensitivities, do approximate
    optimizations, monitoring, and decision support.
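
A minimal sketch of the mapping behind parallel coordinates, assuming each data item is a plain list of numbers: every item becomes one polyline with a single vertex per axis, so the representation grows linearly with the number of dimensions. The function name `to_polylines`, the min-max scaling, and the sample rows are illustrative assumptions, not taken from the presentation.

```python
from typing import List, Sequence, Tuple


def to_polylines(rows: Sequence[Sequence[float]]) -> List[List[Tuple[int, float]]]:
    """Map each N-dimensional row to a polyline with one vertex per axis.

    Axis j is the vertical line x = j; values are min-max scaled to [0, 1]
    per axis (an assumption made here so that all axes share one height).
    """
    n_axes = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(n_axes)]
    highs = [max(r[j] for r in rows) for j in range(n_axes)]

    polylines = []
    for r in rows:
        line = []
        for j in range(n_axes):
            span = highs[j] - lows[j]
            # A constant column has no spread, so draw it at mid-height.
            y = 0.5 if span == 0 else (r[j] - lows[j]) / span
            line.append((j, y))
        polylines.append(line)
    return polylines


# Made-up sample: three items in three dimensions.
rows = [[0.0, 10.0, 5.0], [2.0, 20.0, 5.0], [4.0, 30.0, 5.0]]
lines = to_polylines(rows)
```

Each polyline can then be drawn by joining consecutive vertices; relations among the variables show up as 2-D patterns in the bundle of lines.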

3
Goals of The Program
  • Without any loss of information.
  • Low representational complexity O(N) (N is the
    number of dimensions).
  • Works for any N.
  • Treat every variable uniformly.
  • Can use transformations to recognize objects
    (rotation, translation, scaling, etc.).
  • Easily/Intuitively convey information on the
    properties of the N-Dimensional object.
  • Should be based on rigorous mathematical and
    algorithmic results.

4
In order to discover patterns from a large data set
  • Must use parallel coordinates effectively, with
    proper geometrical understanding and queries
    (hence the notion of Multidimensional
    Detective).
  • Instead of mimicking the experience derived from
    standard displays, a good model should exploit the
    special strengths of the methodology and avoid its
    weaknesses.
  • This task is similar to accurately cutting
    complicated portions of an N-dimensional
    watermelon. The cutting tools should be well
    chosen and intuitive.

5
The VLSI Chip Problem
  • Understand Figure 1: the full real data set. 473
    batches, 16 processes (X1-X16).
  • X1: Yield (the percentage of useful chips produced
    in the batch).
  • X2: Quality (speed performance).
  • X3 through X12: 10 different types of defects. Zero
    defects appears at the top of each axis.
  • X13 through X16: physical parameters.
  • The author didn't specify how to find high yield
    or high quality. I think high values appear at the
    top, judging from some of his later
    descriptions.

6
Objective
  • Raise the yield (X1), and maintain high quality
    (X2). It's a multiobjective optimization problem.
  • It's believed that the presence of defects
    hinders high yields and quality.
  • So the goal is to achieve zero defects.
  • (But is that really the case? Let's see.)

7
Observations From Figure 2
  • It isolates the batches having the highest X1 and
    X2. Also, notice the two clusters of X15.
  • It doesn't include some batches having high X3
    values (nearly 0 defects). So it casts doubt on
    the goal of achieving zero defects. Is it the
    right aim?
  • To answer this question, we construct Figure 3,
    which includes batches having 0 defects in at
    least 9 categories (they are really close to the
    aim of zero defects). Do they have high yields
    and quality?
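
The Figure 3 query (batches with 0 defects in at least 9 of the 10 defect categories) can be sketched as a simple filter. This is only an illustration under assumptions: each batch is a dict holding raw defect counts under the keys X3 to X12, and the three sample batches are made up; the real 473-batch data set is not reproduced here.

```python
from typing import Dict, List

# Defect categories named in the slides: X3 through X12.
DEFECT_KEYS = [f"X{i}" for i in range(3, 13)]


def near_zero_defect_batches(
    batches: List[Dict[str, float]], min_zero_categories: int = 9
) -> List[Dict[str, float]]:
    """Keep batches with zero defects in at least `min_zero_categories` categories."""
    selected = []
    for batch in batches:
        zero_count = sum(1 for key in DEFECT_KEYS if batch[key] == 0)
        if zero_count >= min_zero_categories:
            selected.append(batch)
    return selected


def make_batch(batch_id: int, **defects: float) -> Dict[str, float]:
    """Hypothetical helper: a batch with zero defects except those given."""
    batch = {key: 0 for key in DEFECT_KEYS}
    batch.update(defects)
    batch["id"] = batch_id
    return batch


# Made-up sample batches, for illustration only.
batches = [
    make_batch(1),              # zero defects in all 10 categories
    make_batch(2, X6=2),        # zero defects in 9 categories
    make_batch(3, X3=1, X6=1),  # zero defects in 8 categories
]
selected_ids = [b["id"] for b in near_zero_defect_batches(batches)]
```

One would then look at X1 and X2 of the selected batches to check whether they really have high yields and quality.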

8
Figure 3: Our assumption is challenged.
  • The nine batches have poor yields and low
    quality.
  • Here's another visual cue: X6. The process is much
    more sensitive to variations in X6 than to the other
    defects.
  • Treat X6 differently: select those batches with 0
    X6 defects; the very best batch is included (as
    shown in Figure 4).

9
Figure 5 and Figure 6: Test The Assumption
  • Figure 5 shows those batches which do not have
    zeros for X3 and X6.
  • Figure 6 shows the cluster of batches with top
    yields (notice there's a gap in X1 between them
    and the remaining batches, as seen in Figure 1).
  • The finding: small amounts of X3 and X6 type
    defects are essential for high yields and
    quality.
  • Besides, back to Fig.2, we can see X15's
    relationship with X1/X2.
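
The gap in X1 that separates the top-yield cluster is spotted visually in the slides. As a rough numerical stand-in, one could split the sorted yields at the largest jump between neighbours. The largest-gap rule, the name `top_cluster_by_gap`, and the yield values below are assumptions for illustration, not the author's method or data.

```python
from typing import List, Sequence


def top_cluster_by_gap(values: Sequence[float]) -> List[float]:
    """Return the values lying above the largest gap in the sorted sequence."""
    ordered = sorted(values)
    if len(ordered) < 2:
        return list(ordered)
    # Index i marks the gap between ordered[i] and ordered[i + 1].
    split = max(range(len(ordered) - 1), key=lambda i: ordered[i + 1] - ordered[i])
    return ordered[split + 1:]


# Made-up yields (percent), for illustration only.
yields = [55, 82, 50, 85, 58, 52, 80]
top = top_cluster_by_gap(yields)
```

Such a rule only proposes a cut; the slides rely on the analyst looking at the display and choosing the cluster interactively.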

10
Our Conclusion For VLSI Chip Problem
  • Small ranges of X3, X6 close to (but not equal
    to) zero, together with the lower range of X15
    provide necessary conditions for high yields and
    quality.
  • Fig.9 shows the result of constraining only X1
    and the resulting gap in X15.
  • Fig.10 shows that constraining only X2 does not
    yield a gap in X15.

11
Other Insights and The Lesson We Learned From
VLSI Example
  • Fig.11 shows that except for two batches, the
    others all have very high X2. So we isolate these
    two batches in Fig.12 and find that their high
    yields but lower quality may be due to the ranges of
    X6, X13, X14, X15.
  • So it suggests that we can further partition this
    multivariate problem into sub-problems pertaining
    to individual objectives.

12
The Economic Model Example
  • This example illustrates how to use an interior
    point algorithm with the model to do trade-off
    analyses, understand the impact of constraints,
    and in some cases do optimizations.
  • Interior point algorithm: we can use it to find a
    point that is interior to a region and satisfies
    all the constraints simultaneously, so in this
    case it represents a feasible economic policy
    for a country.
  • It is done interactively by sequentially choosing
    values of the variables (Fig.13).

13
Result of Choosing The First Variable
  • Once a value of the first variable (Agriculture
    output) is chosen, the dimensionality of
    the region is reduced by one. We can see the
    relationship between Agriculture and Fishing (low
    ranges correspond to each other).
  • So it's possible to find a policy that favors
    Agriculture but not Fishing, and vice
    versa.
  • Mining and Fishing: from the lower lines of
    Fishing in Fig.13, we find competition
    between them.

14
Neighborhood
  • Fig.15 shows a 20-dimensional model. The
    intermediate curves provide useful insights.
  • Note the steep strips in X13, X14 and X15. These 3
    are critical variables, where the point is bumping
    the boundary.

15
Boundary Point and Exterior Point
  • Boundary point: if the polygonal line is tangent
    to any one of the intermediate curves, then it
    represents a boundary point.
  • Exterior point: if it crosses any of the
    intermediate curves.
  • An exterior point enables us to see the first
    variable for which the construction failed and
    what is needed to make corrections.
  • By changing variables interactively, we can
    discover sensitive regions and other patterns.
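
The interior / boundary / exterior distinction can be sketched numerically. The slides do not give the constraints of the economic model, so this sketch assumes a stand-in region, a ball of radius r centred at the origin, because the feasible range of each coordinate given the earlier choices is then known in closed form. The names `axis_range` and `classify` and the tolerance are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple


def axis_range(radius: float, chosen: Sequence[float]) -> Tuple[float, float]:
    """Feasible interval for the next coordinate, given the earlier choices.

    Stand-in region (an assumption): the ball sum(x_i ** 2) <= radius ** 2.
    """
    remaining = max(0.0, radius * radius - sum(v * v for v in chosen))
    half = math.sqrt(remaining)
    return (-half, half)


def classify(
    radius: float, point: Sequence[float], tol: float = 1e-9
) -> Tuple[str, Optional[int]]:
    """Choose coordinates one at a time and classify the resulting point.

    Returns ("exterior", k) with k the 0-based index of the first variable
    that leaves its feasible range, ("boundary", None) if some variable sits
    on an end of its range, and ("interior", None) otherwise.
    """
    touched = False
    chosen: List[float] = []
    for k, value in enumerate(point):
        low, high = axis_range(radius, chosen)
        if value < low - tol or value > high + tol:
            return ("exterior", k)
        if abs(value - low) <= tol or abs(value - high) <= tol:
            touched = True
        chosen.append(value)
    return ("boundary", None) if touched else ("interior", None)


inside = classify(5.0, [1.0, 1.0, 1.0])
on_edge = classify(5.0, [3.0, 4.0, 0.0])
outside = classify(5.0, [3.0, 5.0, 0.0])
```

Each chosen value shrinks the range available to the variables that follow, and for an exterior point the returned index names the first variable where the construction fails, which is the kind of feedback the slides describe.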

16
Before We Come To Conclusion
  • Is this model merely a model, or is it used (with
    the intuitive functionalities and high
    interactivity) in any software products?
  • Is this model accurate enough?
  • Is it sufficient to come to any conclusion about
    a problem using this technique when the data set is
    very large?
  • How does one become a skillful detective? Can any
    software substitute for people?

17
Conclusion
  • Each multivariate dataset and problem has its own
    personality, so it requires substantial
    variations in the discovery scenarios and calls
    for considerable ingenuity (a characteristic of
    a detective).
  • An effort to automate the exploration process
    is under way. It will have a number of new
    features, like intelligent agents, which will
    learn from gathered experience.