Transcript and Presenter's Notes

Title: Hubs and Authorities


1
Hubs and Authorities; Learning Perceptrons
  • Artificial Intelligence
  • CMSC 25000
  • February 3, 2004

2
Roadmap
  • Problem
  • Matching Topics and Documents
  • Methods
  • Classic Vector Space Model
  • Challenge I: Beyond literal matching
  • Expansion Strategies
  • Challenge II: Authoritative sources
  • Hubs & Authorities
  • PageRank

3
Authoritative Sources
  • Based on the vector space model alone, what would you
    expect to get when searching for "search engine"?
  • Would you expect to get Google?

4
Issue
  • Text isn't always the best indicator of content
  • Example: "search engine"
  • Text search -> reviews of search engines
  • The term doesn't appear on search engine pages themselves
  • The term probably appears on many pages that point to
    many search engines

5
Hubs & Authorities
  • Not all sites are created equal
  • Finding better sites
  • Question: What defines a good site?
  • Authoritative: not just content, but connections!
  • A good site is one that many other sites think is good
  • A site that is pointed to by many other sites is an Authority

6
Conferring Authority
  • Authorities rarely link to each other
  • Competition
  • Hubs:
  • Relevant sites that point to prominent sites on a topic
  • Often not prominent themselves
  • Professional or amateur
  • Good Hubs point to Good Authorities

7
Computing HITS
  • Finding Hubs and Authorities
  • Two steps
  • Sampling
  • Find potential authorities
  • Weight-propagation
  • Iteratively estimate best hubs and authorities

8
Sampling
  • Identify potential hubs and authorities
  • Connected subsections of web
  • Select root set with standard text query
  • Construct base set
  • All nodes pointed to by root set
  • All nodes that point to root set
  • Drop within-domain links
  • 1000-5000 pages
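
A minimal sketch of this sampling step in Python, assuming hypothetical helpers text_search(query) (returns the ranked root-set pages), out_links(page), and in_links(page); none of these names come from the slides.

  def build_base_set(query, text_search, out_links, in_links, root_size=200):
      # Root set: top pages returned by a standard text query
      root = set(text_search(query)[:root_size])
      base = set(root)
      for page in root:
          base.update(out_links(page))   # all nodes pointed to by the root set
          base.update(in_links(page))    # all nodes that point to the root set
      # (within-domain links are dropped later, when the link graph is built)
      return base                        # typically 1000-5000 pages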

9
Weight-propagation
  • Weights
  • Authority weight
  • Hub weight
  • All weights are relative
  • Updating
  • Converges
  • Pages with high x good authorities y good hubs
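
A minimal sketch of the update loop, assuming the base set's link structure is given as dicts in_links and out_links mapping each page to the pages that point to it / that it points to (hypothetical names):

  import math

  def hits(pages, in_links, out_links, iterations=50):
      x = {p: 1.0 for p in pages}   # authority weights
      y = {p: 1.0 for p in pages}   # hub weights
      for _ in range(iterations):
          # Authority weight: sum of hub weights of pages pointing here
          x = {p: sum(y[q] for q in in_links.get(p, [])) for p in pages}
          # Hub weight: sum of authority weights of pages pointed to
          y = {p: sum(x[q] for q in out_links.get(p, [])) for p in pages}
          # Normalize so the weights stay relative
          xn = math.sqrt(sum(v * v for v in x.values())) or 1.0
          yn = math.sqrt(sum(v * v for v in y.values())) or 1.0
          x = {p: v / xn for p, v in x.items()}
          y = {p: v / yn for p, v in y.items()}
      return x, y   # high x: good authority; high y: good hub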

10
Google's PageRank
  • Identifies authorities
  • Important pages are those pointed to by many
    other pages
  • Better pointers, higher rank
  • Ranks search results
  • t: a page pointing to A; C(t): number of outbound
    links from t
  • d: damping measure
  • Actual ranking is on a logarithmic scale
  • Iterate until the ranks converge (see the formula below)
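
The variable definitions above match the standard PageRank formula from Brin and Page; with t1, ..., tn the pages pointing to page A, it reads:

  PR(A) = (1 - d) + d * ( PR(t1)/C(t1) + ... + PR(tn)/C(tn) )

Each iteration recomputes PR for every page from the previous iteration's values until the ranks stabilize.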

11
Contrasts
  • Internal links
  • Large sites carry more weight
  • If well-designed
  • H&A ignores site-internal links
  • Outbound links explicitly penalized
  • Lots of tweaks.

12
Web Search
  • Search by content
  • Vector space model
  • Word-based representation
  • Aboutness and Surprise
  • Enhancing matches
  • Simple learning model
  • Search by structure
  • Authorities identified by link structure of web
  • Hubs confer authority

13
Efficient Implementation: K-D Trees
  • Divide instances into sets based on features
  • Binary branching: e.g., > some value
  • 2^d leaves, where d is the length of the split path
  • For n instances, d = O(log n)
  • To split cases into sets (see the sketch below):
  • If there is one element in the set, stop
  • Otherwise pick a feature to split on
  • Find the average position of the two middle objects on
    that dimension
  • Split the remaining objects based on that average position
  • Recursively split the subsets
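
A minimal sketch of this construction in Python, assuming each instance is a tuple of numeric feature values (the choice of split feature by cycling through dimensions is an illustrative assumption, not from the slides):

  def build_kdtree(instances, depth=0):
      # One element (or none) in the set: stop
      if len(instances) <= 1:
          return {"leaf": list(instances)}
      # Pick a feature to split on (here: cycle through features by depth)
      f = depth % len(instances[0])
      ordered = sorted(instances, key=lambda inst: inst[f])
      # Average position of the two middle objects on that dimension
      mid = len(ordered) // 2
      split = (ordered[mid - 1][f] + ordered[mid][f]) / 2.0
      left = [inst for inst in ordered if inst[f] <= split]
      right = [inst for inst in ordered if inst[f] > split]
      if not right:                       # all values equal on this feature: stop
          return {"leaf": ordered}
      # Recursively split the subsets
      return {"feature": f, "split": split,
              "left": build_kdtree(left, depth + 1),
              "right": build_kdtree(right, depth + 1)}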

14
K-D Trees Classification
[Figure: example k-d tree classification; internal nodes branch Yes/No on feature tests, leaves are labeled Good or Poor]
15
Efficient Implementation: Parallel Hardware
  • Classification cost:
  • n distance computations
  • Constant time with O(n) processors
  • Cost of finding the closest:
  • Compute pairwise minimums, successively (see the sketch below)
  • O(log n) time
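
A sequential sketch of the successive pairwise-minimum idea; on parallel hardware each round's comparisons run at the same time, so n candidates shrink to 1 in O(log n) rounds:

  def pairwise_min(values):
      candidates = list(values)
      while len(candidates) > 1:
          # Compare disjoint pairs; each round halves the number of candidates
          nxt = [min(candidates[i], candidates[i + 1])
                 for i in range(0, len(candidates) - 1, 2)]
          if len(candidates) % 2:        # odd element carries over
              nxt.append(candidates[-1])
          candidates = nxt
      return candidates[0]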

16
Nearest Neighbor Summary
17
Nearest Neighbor Issues
  • Prediction can be expensive if there are many features
  • Affected by classification noise and feature noise
  • One entry can change the prediction
  • Definition of the distance metric
  • How to combine different features
  • Different types and ranges of values
  • Sensitive to feature selection

18
Nearest Neighbor Analysis
  • Problem
  • Ambiguous labeling, Training Noise
  • Solution
  • K-nearest neighbors
  • Not just single nearest instance
  • Compare to K nearest neighbors
  • Label according to majority of K
  • What should K be?
  • Often 3, can train as well
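
A minimal K-nearest-neighbor sketch, assuming the training data is a list of (feature_vector, label) pairs and dist is some distance function (e.g., Euclidean); the names are illustrative:

  from collections import Counter

  def knn_predict(query, training, dist, k=3):
      # Find the K training instances closest to the query
      nearest = sorted(training, key=lambda item: dist(query, item[0]))[:k]
      # Label according to the majority of the K
      votes = Counter(label for _, label in nearest)
      return votes.most_common(1)[0][0]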

19
Nearest Neighbor Analysis
  • Issue:
  • What is a good distance metric?
  • How should features be combined?
  • Strategy:
  • (Typically weighted) Euclidean distance
  • Feature scaling: normalization
  • Good starting point:
  • (Feature - Feature_mean) / Feature_standard_deviation
  • Rescales all values: centered on 0 with standard deviation 1
    (see the sketch below)
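
A minimal sketch of that normalization applied column-wise to a list of feature vectors:

  import statistics

  def zscore(rows):
      cols = list(zip(*rows))
      means = [statistics.mean(c) for c in cols]
      stds = [statistics.pstdev(c) or 1.0 for c in cols]   # guard constant features
      # (feature - feature_mean) / feature_standard_deviation
      return [[(v - m) / s for v, m, s in zip(row, means, stds)]
              for row in rows]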

20
Nearest Neighbor Analysis
  • Issue
  • What features should we use?
  • E.g., credit rating: many possible features
  • Tax bracket, debt burden, retirement savings, etc.
  • Nearest neighbor uses ALL of them
  • Irrelevant feature(s) could mislead
  • A fundamental problem with nearest neighbor

21
Nearest Neighbor Advantages
  • Fast training
  • Just record the (feature vector, output value) set
  • Can model wide variety of functions
  • Complex decision boundaries
  • Weak inductive bias
  • Very generally applicable

22
Summary
  • Machine learning
  • Acquire function from input features to value
  • Based on prior training instances
  • Supervised vs Unsupervised learning
  • Classification and Regression
  • Inductive bias
  • Representation of function to learn
  • Complexity, Generalization, Validation

23
Summary Nearest Neighbor
  • Nearest neighbor:
  • Training: record input vectors and output values
  • Prediction: use the closest training instance to the new data
  • Efficient implementations
  • Pros: fast training, very general, little bias
  • Cons: distance metric (scaling), sensitivity to
    noise and extraneous features

24
Learning Perceptrons
  • Artificial Intelligence
  • CMSC 25000
  • February 3, 2003

25
Agenda
  • Neural Networks
  • Biological analogy
  • Perceptrons: single-layer networks
  • Perceptron training
  • Perceptron convergence theorem
  • Perceptron limitations
  • Conclusions

26
Neurons The Concept
[Figure: neuron with dendrites, cell body, nucleus, and axon]
Neurons receive inputs from other neurons (via synapses).
When the input exceeds a threshold, the neuron fires, sending
output along its axon to other neurons.
Brain: ~10^11 neurons, ~10^16 synapses.
27
Artificial Neural Nets
  • Simulated Neuron:
  • Node connected to other nodes via links
  • Links play the role of axon + synapse
  • Each link has an associated weight (like a synapse)
  • Multiplied by the output of the source node
  • Node combines its inputs via an activation function
  • E.g., sum of weighted inputs passed through a
    threshold (see the sketch below)
  • Simpler than real neuronal processes
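
A minimal sketch of such a node: inputs multiplied by link weights, summed, and passed through a threshold:

  def node_output(inputs, weights, threshold=0.0):
      # Activation function: thresholded sum of weighted inputs
      total = sum(w * x for w, x in zip(weights, inputs))
      return 1 if total > threshold else 0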

28
Artificial Neural Net
[Figure: inputs x, each multiplied by a weight w, fed into a Sum / Threshold unit]
29
Perceptrons
  • Single neuron-like element
  • Binary inputs
  • Binary outputs
  • Fires when the weighted sum of inputs > threshold

30
Perceptron Structure
[Figure: perceptron with inputs x0 = 1, x1, x2, x3, ..., xn, weights w0, w1, ..., wn, and output y; the fixed input x0 = 1 with weight w0 compensates for the threshold]
31
Perceptron Convergence Procedure
  • Straightforward training procedure
  • Learns linearly separable functions
  • Until the perceptron yields the correct output for all
    training examples (see the sketch below):
  • If the perceptron is correct, do nothing
  • If the perceptron is wrong:
  • If it incorrectly says yes,
  • subtract the input vector from the weight vector
  • Otherwise, add the input vector to the weight vector
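
A minimal sketch of this procedure (binary inputs and outputs, threshold fixed at 0, as in the example on the next slide):

  def train_perceptron(samples, n_features, max_passes=100):
      # samples: list of (input_vector, desired_output) pairs
      w = [0.0] * n_features
      for _ in range(max_passes):
          errors = 0
          for x, desired in samples:
              output = 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
              if output == desired:
                  continue                        # correct: do nothing
              errors += 1
              if output == 1:                     # incorrectly said yes
                  w = [wi - xi for wi, xi in zip(w, x)]
              else:                               # incorrectly said no
                  w = [wi + xi for wi, xi in zip(w, x)]
          if errors == 0:                         # correct on all examples
              return w
      return w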

32
Perceptron Convergence Example
  • LOGICAL-OR
  • Sample   x1   x2   x3   Desired output
  •   1       0    0    1        0
  •   2       0    1    1        1
  •   3       1    0    1        1
  •   4       1    1    1        1
  • Initial w = (0, 0, 0); Pass 1: after S2, w = w + s2 = (0, 1, 1)
  • Pass 2: S1: w = w - s1 = (0, 1, 0); S3: w = w + s3 = (1, 1, 1)
  • Pass 3: S1: w = w - s1 = (1, 1, 0)
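
Feeding this data to the train_perceptron sketch from the previous slide follows the same trace; x3 = 1 serves as the bias input:

  or_samples = [((0, 0, 1), 0), ((0, 1, 1), 1), ((1, 0, 1), 1), ((1, 1, 1), 1)]
  print(train_perceptron(or_samples, n_features=3))   # -> [1.0, 1.0, 0.0]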

33
Perceptron Convergence Theorem
  • If there exists a weight vector v such that v.x > 0 for every
    positive example x, perceptron training will find such a vector
  • Proof sketch: assume |v| = 1 and v.x >= delta > 0
    for all positive examples x
  • |w|^2 increases by at most |x|^2 in each iteration
    (updates happen only when w.x <= 0):
    |w + x|^2 <= |w|^2 + |x|^2, so after k updates |w|^2 <= k * max|x|^2
  • v.w increases by at least delta per update, so v.w >= k * delta
  • Since v.w / |w| <= 1, we get k * delta / (sqrt(k) * max|x|) <= 1
  • Converges in k <= O( (max|x| / delta)^2 ) steps

34
Perceptron Learning
  • Perceptrons learn linear decision boundaries
  • E.g., classes separable by a line in the (x1, x2) plane
[Figure: a linearly separable example in the x1-x2 plane, versus the XOR pattern, which is not]
  • But not XOR:
  • x1 = -1, x2 = -1: need w1*x1 + w2*x2 < 0
  • x1 =  1, x2 = -1: need w1*x1 + w2*x2 > 0, implies w1 > 0
  • x1 = -1, x2 =  1: need w1*x1 + w2*x2 > 0, implies w2 > 0
  • x1 =  1, x2 =  1: then w1*x1 + w2*x2 > 0, but XOR(1, 1) should be
    false; no weights satisfy all four constraints, so XOR is not
    linearly separable
35
Perceptron Example
  • Digit recognition
  • Assume a display of 8 lightable bars
  • Inputs: on/off; threshold activation
  • 65 steps to recognize "8"

36
Perceptron Summary
  • Motivated by neuron activation
  • Simple training procedure
  • Guaranteed to converge
  • IF linearly separable