Searching by Authority - PowerPoint PPT Presentation

About This Presentation

Title:

Searching by Authority

Slides: 37

Provided by: classesCs

Learn more at: https://www.classes.cs.uchicago.edu

Transcript and Presenter's Notes

Title: Searching by Authority

1
Searching by Authority

Artificial Intelligence
CMSC 25000
February 12, 2008

2
A Conversation with Students

Speaker: Bill Gates
Title: Bill Gates Unplugged: On Software,
Innovation, Entrepreneurship, and Giving Back
Date: February 20, 2008
Tickets: By lottery
http://studentactivities.uchicago.edu/billgates

3
Authoritative Sources

Based on vector space alone, what would you
expect to get when searching for "search engine"?
Would you expect to get Google?
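A minimal sketch of the point behind this question, assuming a plain term-frequency vector space with cosine similarity. The two page texts and the names `cosine_similarity`, `pages`, and `scores` are made up for illustration; they are not from the lecture.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two term-frequency vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[word] * b[word] for word in a)  # Counter returns 0 for missing words
    norm = (math.sqrt(sum(c * c for c in a.values()))
            * math.sqrt(sum(c * c for c in b.values())))
    return dot / norm if norm else 0.0

query = "search engine"
# Hypothetical page texts: one repeats the query words, one never mentions them.
pages = {
    "seo_tips": "search engine tips for every search engine",
    "minimal_home": "web images news maps",
}
scores = {name: cosine_similarity(query, text) for name, text in pages.items()}
print(scores)
```

Under these assumptions the page that merely repeats the query words outranks a sparse home page that never uses them, which is why content matching alone may not surface the most authoritative site.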

4
Conferring Authority

Authorities rarely link to each other
Competition
Hubs
Relevant sites point to prominent sites on topic
Often not prominent themselves
Professional or amateur
Good hubs <-> good authorities (each reinforces the other)
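The mutual reinforcement between hubs and authorities can be sketched as an iteration. This assumes the standard hubs-and-authorities (HITS) update, which the slide only hints at; the toy link graph and all names are hypothetical.

```python
import math

def hubs_and_authorities(links, iterations=20):
    """links maps each page to the list of pages it points to."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages pointing here.
        auth = {p: sum(hub[q] for q in pages if p in links.get(q, []))
                for p in pages}
        # Hub score: sum of authority scores of the pages pointed to.
        hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}
        # Normalize so the scores stay bounded across iterations.
        a_norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth

# Two hypothetical directory pages pointing at two hypothetical search engines.
toy_links = {"dir1": ["engineX", "engineY"], "dir2": ["engineX"]}
hub, auth = hubs_and_authorities(toy_links)
print(max(auth, key=auth.get), max(hub, key=hub.get))
```

In this toy graph the engines never link to each other, yet the directories that point at them are enough to single out the strongest authority.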

5
Google's PageRank

Identifies authorities
Important pages are those pointed to by many
other pages
Better pointers, higher rank
Ranks search results
$PR(A) = (1 - d) + d \sum_{t} \frac{PR(t)}{C(t)}$
t: page pointing to A; C(t): number of outbound links from t
d: damping measure
Actual ranking on logarithmic scale
Iterate
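A minimal sketch of the "iterate" step, assuming the commonly published form of the update, PR(A) = (1 - d) + d * sum over pages t pointing to A of PR(t) / C(t). The value d = 0.85, the fixed iteration count, and the toy graph are assumptions for illustration, not taken from the slide.

```python
def pagerank(links, d=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    rank = {p: 1.0 for p in pages}
    for _ in range(iterations):
        new_rank = {}
        for page in pages:
            # Sum PR(t) / C(t) over every page t that points to `page`.
            incoming = sum(rank[t] / len(links[t])
                           for t in pages
                           if page in links.get(t, []))
            new_rank[page] = (1 - d) + d * incoming
        rank = new_rank
    return rank

# Hypothetical four-page web: C is pointed to by A, B, and D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
print(sorted(ranks, key=ranks.get, reverse=True))
```

Page C collects the most rank because many pages point to it, and A inherits a high rank because its single pointer (C) is itself highly ranked: "better pointers, higher rank".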

6
Contrasts

Internal links
Large sites carry more weight
If well-designed
HA (hubs and authorities) ignores site-internal links
Outbound links explicitly penalized
Lots of tweaks.

7
Web Search

Search by content
Vector space model
Word-based representation
Aboutness and Surprise
Enhancing matches
Simple learning model
Search by structure
Authorities identified by link structure of web
Hubs confer authority

8
Medical Decision Making: Learning Decision Trees

Artificial Intelligence
CMSC 25000
February 12, 2008

9
Agenda

Decision Trees
Motivation: Medical Experts: Mycin
Basic characteristics
Sunburn example
From trees to rules
Learning by minimizing heterogeneity
Analysis: Pros and Cons

10
Expert Systems

Classic example of classical AI
Narrow but very deep knowledge of a field
E.g. Diagnosis of bacterial infections
Manual knowledge engineering
Elicit detailed information from human experts

11
Expert Systems

Knowledge representation
If-then rules
Antecedent: conjunction of conditions
Consequent: conclusion to be drawn
Axioms: initial set of assertions
Reasoning process
Forward chaining
From assertions and rules, generate new
assertions
Backward chaining
From rules and goal assertions, derive evidence
of assertion
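Forward chaining can be sketched in a few lines. The rule format (a set of antecedent facts plus one consequent) and the sunburn-flavored facts below are illustrative assumptions, not rules from any real expert system.

```python
def forward_chain(rules, assertions):
    """Repeatedly fire rules whose antecedents all hold until nothing new is added.

    rules: list of (antecedents, consequent), where antecedents is a set of facts.
    assertions: the initial axioms.
    """
    known = set(assertions)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            # A rule fires when the conjunction of its conditions is satisfied.
            if antecedents <= known and consequent not in known:
                known.add(consequent)
                changed = True
    return known

# Hypothetical rules and axioms, for illustration only.
rules = [
    ({"blonde", "no_lotion"}, "burn_risk"),
    ({"burn_risk", "sunny"}, "recommend_lotion"),
]
facts = forward_chain(rules, {"blonde", "no_lotion", "sunny"})
print(sorted(facts))
```

Backward chaining would run the same rules in the other direction: start from a goal such as `recommend_lotion` and look for rules whose consequent matches it, then try to establish their antecedents.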

12
Medical Expert Systems: Mycin

Mycin
Rule-based expert system
Diagnosis of blood infections
450 rules: comparable to experts, better than junior MDs
Rules acquired by extensive expert interviews
Captures some elements of uncertainty

13
Medical Expert Systems: Issues

Works well, but...
Only diagnoses blood infections
NARROW
Requires extensive expert interviews
EXPENSIVE to develop
Difficult to update, can't handle new cases
BRITTLE

14
Modern AI Approach

Machine learning
Learn diagnostic rules from examples
Use general learning mechanism
Integrate new rules, less elicitation
Decision Trees
Learn rules
Duplicate MYCIN-style diagnosis
Automatically acquired
Readily interpretable
cf. Neural Nets / Nearest Neighbor

15
Learning Identification Trees

(aka Decision Trees)
Supervised learning
Primarily classification
Rectangular decision boundaries
More restrictive than nearest neighbor
Robust to irrelevant attributes, noise
Fast prediction

16
Sunburn Example
17
Learning about Sunburn

Goal
Train on labeled examples
Predict Burn/None for new instances
Solution??
Exact match: same features, same output
Problem: $2 \times 3^3 = 54$ feature combinations
Could be much worse
Nearest Neighbor style
Problem: What's close? Which features matter?
Many match on two features but differ on result

18
Learning about Sunburn

Better Solution
Identification tree
Training
Divide examples into subsets based on feature
tests
Sets of samples at leaves define classification
Prediction
Route NEW instance through tree to leaf based on
feature tests
Assign same value as samples at leaf
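Prediction amounts to following one branch per feature test until a leaf is reached. A minimal sketch, assuming the tree is stored as nested `(feature, branches)` tuples with string leaves; this representation and the names `SUNBURN_TREE` and `classify` are hypothetical, while the tree itself follows the sunburn identification tree in these slides.

```python
# Internal node: (feature name, {feature value: subtree}); leaf: class label.
SUNBURN_TREE = ("hair", {
    "blonde": ("lotion", {"no": "Burn", "yes": "None"}),
    "red": "Burn",
    "brown": "None",
})

def classify(tree, instance):
    """Route an instance (dict of feature values) through the tree to a leaf."""
    while not isinstance(tree, str):
        feature, branches = tree
        tree = branches[instance[feature]]
    return tree

new_person = {"hair": "blonde", "height": "tall", "weight": "light", "lotion": "no"}
print(classify(SUNBURN_TREE, new_person))
```

Only the features tested along the path matter: height and weight are present in the instance but never consulted.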

19
Sunburn Identification Tree
Hair color?
  Blonde: Lotion used?
    No: Sarah (Burn), Annie (Burn)
    Yes: Katie (None), Dana (None)
  Red: Emily (Burn)
  Brown: Alex (None), John (None), Pete (None)
20
Simplicity

Occam's Razor
Simplest explanation that covers the data is best
Occam's Razor for ID trees
Smallest tree consistent with samples will be
best predictor for new data
Problem
Finding all trees and then the smallest: expensive!
Solution
Greedily build a small tree

21
Building ID Trees

Goal: Build a small tree such that all samples at
leaves have same class
Greedy solution
At each node, pick test such that branches are
closest to having same class
Split into subsets with least disorder
(Disorder = Entropy)
Find test that minimizes disorder

22
Minimizing Disorder
(B = Burn, N = None)
Hair color
  Blonde: Sarah B, Dana N, Annie B, Katie N
  Red: Emily B
  Brown: Alex N, Pete N, John N
Height
  Tall: Dana N, Pete N
  Average: Sarah B, Emily B, John N
  Short: Alex N, Annie B, Katie N
Lotion
  No: Sarah B, Annie B, Emily B, Pete N, John N
  Yes: Dana N, Alex N, Katie N
Weight
  Heavy: Emily B, Pete N, John N
  Average: Dana N, Alex N, Annie B
  Light: Sarah B, Katie N
23
Minimizing Disorder
(Blonde subset only; B = Burn, N = None)
Height
  Tall: Dana N
  Average: Sarah B
  Short: Annie B, Katie N
Lotion
  No: Sarah B, Annie B
  Yes: Dana N, Katie N
Weight
  Heavy: (none)
  Average: Dana N, Annie B
  Light: Sarah B, Katie N
24
Measuring Disorder

Problem
In general, tests on large DBs don't yield
homogeneous subsets
Solution
General information theoretic measure of disorder
Desired features
Homogeneous set: least disorder = 0
Even split: most disorder = 1

25
Measuring Entropy

If we split $m$ objects into 2 bins of sizes $m_1$ and $m_2$, what
is the entropy?

26
Measuring Disorder: Entropy
$p_i = m_i / m$: the probability of being in bin $i$
Entropy (disorder) of a split: $-\sum_i p_i \log_2 p_i$
Assume $0 \log_2 0 = 0$
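A small sketch of the measure, assuming the standard base-2 entropy of the bin proportions, $-\sum_i p_i \log_2 p_i$ with $p_i = m_i / m$. The function name `entropy` and the count-list input are choices made for this example.

```python
import math

def entropy(counts):
    """Disorder of a split, given the number of objects in each bin."""
    total = sum(counts)
    result = 0.0
    for count in counts:
        if count:  # empty bins contribute nothing: 0 * log2(0) is taken as 0
            p = count / total
            result -= p * math.log2(p)
    return result

print(entropy([4, 0]))  # homogeneous set
print(entropy([2, 2]))  # even split
print(entropy([3, 1]))  # somewhere in between
```

This meets both desired features: a homogeneous set scores 0 and an even two-way split scores 1.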
27
Computing Disorder
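[Figure: N instances split into Branch 1 ($N_1$ instances: $N_{1a}$ of class a, $N_{1b}$ of class b) and Branch 2 ($N_2$ instances: $N_{2a}$ of class a, $N_{2b}$ of class b)]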
28
Entropy in Sunburn Example
Hair color: $\frac{4}{8}\left(-\frac{2}{4}\log_2\frac{2}{4} - \frac{2}{4}\log_2\frac{2}{4}\right) + \frac{1}{8}\cdot 0 + \frac{3}{8}\cdot 0 = 0.5$
Height: 0.69
Weight: 0.94
Lotion: 0.61
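These numbers can be checked with a short script. The sample table below is reconstructed from the per-feature groupings shown on the "Minimizing Disorder" slide; the tuple layout and the names `SAMPLES`, `COLUMN`, and `average_disorder` are assumptions made for this sketch. Average disorder is taken as the branch-size-weighted sum of branch entropies, matching the hair color computation above.

```python
import math
from collections import Counter, defaultdict

# (name, hair, height, weight, lotion, result)
SAMPLES = [
    ("Sarah", "blonde", "average", "light", "no", "Burn"),
    ("Dana", "blonde", "tall", "average", "yes", "None"),
    ("Alex", "brown", "short", "average", "yes", "None"),
    ("Annie", "blonde", "short", "average", "no", "Burn"),
    ("Emily", "red", "average", "heavy", "no", "Burn"),
    ("Pete", "brown", "tall", "heavy", "no", "None"),
    ("John", "brown", "average", "heavy", "no", "None"),
    ("Katie", "blonde", "short", "light", "yes", "None"),
]
COLUMN = {"hair": 1, "height": 2, "weight": 3, "lotion": 4}

def entropy(labels):
    """Base-2 entropy of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())

def average_disorder(samples, feature):
    """Weight each branch's entropy by the fraction of samples it receives."""
    branches = defaultdict(list)
    for row in samples:
        branches[row[COLUMN[feature]]].append(row[-1])
    return sum(len(b) / len(samples) * entropy(b) for b in branches.values())

disorder = {f: round(average_disorder(SAMPLES, f), 2) for f in COLUMN}
print(disorder)  # {'hair': 0.5, 'height': 0.69, 'weight': 0.94, 'lotion': 0.61}
```

Hair color has the lowest average disorder, so it is the test chosen at the root.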
29
Entropy in Sunburn Example
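Height: $\frac{2}{4}\left(-\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{2}\log_2\frac{1}{2}\right) + \frac{1}{4}\cdot 0 + \frac{1}{4}\cdot 0 = 0.5$
Weight: $\frac{2}{4}\left(-\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{2}\log_2\frac{1}{2}\right) + \frac{2}{4}\left(-\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{2}\log_2\frac{1}{2}\right) = 1$
Lotion: 0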
30
Building ID Trees with Disorder

Until each leaf is as homogeneous as possible
Select an inhomogeneous leaf node
Replace that leaf node by a test node creating
subsets with least average disorder
Effectively creates set of rectangular regions
Repeatedly draws lines in different axes
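The loop above can be sketched recursively. This is one possible greedy implementation under stated assumptions, not the lecture's own code: the sample rows are reconstructed from the groupings on the "Minimizing Disorder" slide, ties between features are broken by list order, and the dict-based rows and `(feature, branches)` tree format are hypothetical.

```python
import math
from collections import Counter

COLUMNS = ("hair", "height", "weight", "lotion", "result")
RAW = [
    ("blonde", "average", "light", "no", "Burn"),    # Sarah
    ("blonde", "tall", "average", "yes", "None"),    # Dana
    ("brown", "short", "average", "yes", "None"),    # Alex
    ("blonde", "short", "average", "no", "Burn"),    # Annie
    ("red", "average", "heavy", "no", "Burn"),       # Emily
    ("brown", "tall", "heavy", "no", "None"),        # Pete
    ("brown", "average", "heavy", "no", "None"),     # John
    ("blonde", "short", "light", "yes", "None"),     # Katie
]
ROWS = [dict(zip(COLUMNS, r)) for r in RAW]

def entropy(labels):
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())

def average_disorder(rows, feature):
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row["result"])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

def build_tree(rows, features):
    labels = [row["result"] for row in rows]
    # Stop when the leaf is homogeneous or no tests remain (then use majority).
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: pick the test with the least average disorder.
    best = min(features, key=lambda f: average_disorder(rows, f))
    rest = [f for f in features if f != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, rest)
    return (best, branches)

tree = build_tree(ROWS, ["hair", "height", "weight", "lotion"])
print(tree)
```

On this data the sketch tests hair color first and then lotion for the blonde subset, which matches the sunburn identification tree shown in these slides.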

31
Features in ID Trees: Pros

Feature selection
Tests features that yield low disorder
I.e., selects features that are important!
Ignores irrelevant features
Feature type handling
Discrete type: 1 branch per value
Continuous type: branch on > value
Need to search to find best breakpoint
Absent features: distribute uniformly
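The breakpoint search for a continuous feature can be sketched as follows. It assumes candidate thresholds are the midpoints between consecutive distinct values and that each candidate is scored by the average disorder of the two resulting branches; the function name and the toy numbers are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, average disorder) for the best 'value > threshold' test."""
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = None
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        below = [label for value, label in pairs if value <= threshold]
        above = [label for value, label in pairs if value > threshold]
        score = (len(below) / len(pairs) * entropy(below)
                 + len(above) / len(pairs) * entropy(above))
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical measurements with two cleanly separated classes.
values = [1, 2, 3, 10, 11, 12]
labels = ["a", "a", "a", "b", "b", "b"]
print(best_threshold(values, labels))  # (6.5, 0.0)
```

Only midpoints between observed values need to be tried, since any threshold between the same two neighbors splits the samples identically.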

32
Features in ID Trees: Cons

Features
Assumed independent
If a group effect is wanted, it must be modeled explicitly
E.g. make new feature AorB
Feature tests conjunctive

33
From Trees to Rules

Tree
Branches from root to leaves
Tests => classifications
Tests = if antecedents; leaf labels = consequent
All ID trees -> rules; not all rules can be written as trees

34
From ID Trees to Rules
Hair color?
  Blonde: Lotion used?
    No: Sarah (Burn), Annie (Burn)
    Yes: Katie (None), Dana (None)
  Red: Emily (Burn)
  Brown: Alex (None), John (None), Pete (None)
(if (equal haircolor blonde) (equal lotionused yes) (then None))
(if (equal haircolor blonde) (equal lotionused no) (then Burn))
(if (equal haircolor red) (then Burn))
(if (equal haircolor brown) (then None))
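The conversion is mechanical: each root-to-leaf path becomes one rule. A minimal sketch, assuming the tree is stored as nested `(feature, branches)` tuples with string leaves; that representation and the helper names are hypothetical, while the output format imitates the rules on this slide.

```python
TREE = ("haircolor", {
    "blonde": ("lotionused", {"yes": "None", "no": "Burn"}),
    "red": "Burn",
    "brown": "None",
})

def tree_to_rules(tree, conditions=()):
    """Collect (conditions, label) pairs, one per root-to-leaf path."""
    if isinstance(tree, str):  # leaf: the path so far is the antecedent
        return [(list(conditions), tree)]
    feature, branches = tree
    rules = []
    for value, subtree in branches.items():
        rules.extend(tree_to_rules(subtree, conditions + ((feature, value),)))
    return rules

def format_rule(conditions, label):
    tests = " ".join(f"(equal {feature} {value})" for feature, value in conditions)
    return f"(if {tests} (then {label}))"

rules = tree_to_rules(TREE)
for conditions, label in rules:
    print(format_rule(conditions, label))
```

Every rule's antecedent is a conjunction of the tests along one path, which is why a tree always converts to rules while an arbitrary rule set need not fit a single tree.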
35
Identification Trees

Train
Build tree by forming subsets of least disorder
Predict
Traverse tree based on feature tests
Assign leaf node sample label
Pros: Robust to irrelevant features, some noise,
fast prediction, perspicuous rule reading
Cons: Poor feature combination, dependency,
optimal tree build intractable

36
C4.5 vs Mycin