VECTOR SPACE MODEL - PowerPoint PPT Presentation
Provided by: Arj12 (https://ranger.uta.edu)
Transcript and Presenter's Notes

1
VECTOR SPACE MODEL
  • Its Applications and Implementations in Information Retrieval
  • Lecture 3

2
Slides for Lecture 3
  • The Vector Space Model (VSM) is a way of
    representing documents through the words that
    they contain
  • It is a standard technique in Information
    Retrieval
  • The VSM allows decisions to be made about which
    documents are similar to each other and to
    keyword queries

3
Slides for Lecture 3
  • The Vector Space Model
  • Documents and queries are both vectors:
  • Di = (w_i1, w_i2, ..., w_it)
  • each w_ij is the weight of term j in document i
  • similarity is measured as the cosine of the angle between the two vectors
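The vector representation above can be sketched in a few lines. This is a minimal illustration with a made-up two-document corpus; here the weight w_ij is simply the raw count of term j in document i (tf-idf weighting comes later in the lecture).

```python
# Minimal sketch: documents as term-weight vectors (toy corpus, raw counts).

def build_vocabulary(docs):
    """Collect the sorted set of all terms across the corpus."""
    return sorted({term for doc in docs for term in doc.split()})

def to_vector(doc, vocab):
    """Map a document to a vector; position j holds the weight of term j."""
    words = doc.split()
    return [words.count(term) for term in vocab]

docs = ["information retrieval model", "vector space model model"]
vocab = build_vocabulary(docs)
vectors = [to_vector(d, vocab) for d in docs]
```

Position j of every vector corresponds to the same term, so vectors from different documents are directly comparable.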

4
Slides for lecture 3
  • Documents and queries are represented as vectors
  • Position 1 corresponds to term 1, position 2 to term 2, ..., position t to term t

5
Slides for Lecture 3
  • Cosine Similarity measure
  • sim(d, q) = cos θ
  • (since x · y = |x| |y| cos θ)
  • sim(d, q) = Σ_{j=1..m} w_ij q_j / ( sqrt(Σ_{j=1..m} w_ij²) · sqrt(Σ_{j=1..m} q_j²) )
  • Cosine is a normalized dot product
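The formula above translates directly into code. A minimal sketch (vector names and values are illustrative):

```python
import math

def cosine(d, q):
    """cos(theta) = (d . q) / (|d| |q|); returns 0.0 if either vector is all zeros."""
    dot = sum(w * v for w, v in zip(d, q))
    norm_d = math.sqrt(sum(w * w for w in d))
    norm_q = math.sqrt(sum(v * v for v in q))
    return dot / (norm_d * norm_q) if norm_d and norm_q else 0.0
```

Identical vectors give cosine 1, orthogonal vectors (no shared terms) give 0, matching the properties stated on the next slides.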

6
Slides for lecture 3
  • TF-IDF Normalization
  • Normalize the term weights (so longer documents
    are not unfairly given more weight)
  • The longer the document, the more likely it is
    for a given term to appear in it, and the more
    often a given term is likely to appear in it. So,
    we want to reduce the importance attached to a
    term appearing in a document based on the length
    of the document
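The effect of length normalization can be seen numerically. In this made-up example, a "long" document is the short one repeated twice, so every term count doubles: the raw dot product with the query doubles, but the cosine is unchanged.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Normalized dot product."""
    na = math.sqrt(dot(a, a))
    nb = math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

query = [1, 1, 0]
short_doc = [2, 1, 0]
long_doc = [4, 2, 0]  # same content duplicated: every raw count doubles
```

The raw score unfairly favors the long document; the cosine treats both documents identically, which is exactly what the slide argues for.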

7
Slides for lecture 3
  • Cosine is the normalized dot product
  • Documents are ranked in decreasing order of cosine value
  • sim(d, q) = 1 when d and q are identical (point in the same direction)
  • sim(d, q) = 0 when d and q share no terms

8
Slides for lecture 3
  • A user enters a query
  • The query is compared to all documents using a similarity measure
  • The similarity (vector distance) between the query and each document is used to rank the retrieved pages
  • The user is shown the documents in decreasing order of similarity to the query
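The retrieval loop described above can be sketched as follows (document ids and vectors are made up for illustration):

```python
import math

def cosine(d, q):
    """Normalized dot product between two term-weight vectors."""
    dot = sum(x * y for x, y in zip(d, q))
    nd = math.sqrt(sum(x * x for x in d))
    nq = math.sqrt(sum(y * y for y in q))
    return dot / (nd * nq) if nd and nq else 0.0

def rank(docs, query):
    """Score every document against the query; return (id, score) pairs
    in decreasing order of similarity."""
    scored = [(doc_id, cosine(vec, query)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

docs = {"d1": [1, 1, 0], "d2": [0, 1, 1], "d3": [1, 0, 0]}
ranking = rank(docs, [1, 0, 0])
```

Note this naive version scores every document; the inverted-list approach later in the lecture avoids touching documents that share no terms with the query.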

9
Slides for lecture 3
  • How to weight terms?
  • The higher a term's weight, the higher its impact on the cosine
  • What terms are important?
  • - if a term is present in the query, then its presence in the document is relevant to the query
  • - infrequent in other documents
  • - frequent in document A
  • So the cosine needs to be modified in this respect

10
Slides for lecture 3
  • Modeling and Implementation
  • Example: suppose the user fires a query specifying three particular terms T1, T2, T3
  • Query q = (T1, T2, T3)
  • let there be n documents with a total of m terms
  • Now for the implementation

11
Slides for Lecture 3
  • Document Ranking
  • A user enters a query
  • The query is compared to all documents using a similarity measure
  • The user is shown the documents in decreasing order of similarity to the query

12
Slides for lecture 3
  • Example: one inverted list per term; each column lists the documents containing that term

        T1    T2    T3   ...  Tm
        d1    d2    d1   ...  d1
        d2    d4    d7   ...  d7
        d3    d8    d9   ...  d10
        d9    d10   d6   ...  d11
        ...   ...   ...  ...  ...
        d7    d89             d65
        d76

  • we can arrange the documents in descending order of the corresponding score that is computed from tf-idf

13
Slides for lecture 3
tf-idf measure = term frequency (tf) × inverse document frequency (idf)
14
Slides for lecture 3
  • For a multi-term query, we start with the smallest of the document lists corresponding to the query terms T1, T2, T3
  • FA and TA algorithms are used for merging the lists
  • FA - Fagin's Algorithm
  • TA - Threshold Algorithm

        T1    T2    T3
        d1    d2    d1
        d2    d3    d3
        d3    d4    d2
        d5    d2    d4
        ...   ...   ...
        d4
        ...   d6
        d7
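A simplified sketch of the Threshold Algorithm (TA) over per-term score lists. The lists and scores here are made up; each list is assumed sorted by descending score, and the total score of a document is the sum of its per-term scores. Real TA interleaves sorted and random access exactly as below, stopping once the top-k scores reach the threshold formed from the last scores seen under sorted access.

```python
def threshold_algorithm(lists, k=1):
    """lists: one [(doc, score), ...] list per term, sorted by descending score.
    Returns the top-k (doc, total_score) pairs by summed score across terms."""
    lookup = [dict(lst) for lst in lists]  # random-access tables: doc -> score
    seen = {}                              # doc -> total score across all terms
    max_depth = max(len(lst) for lst in lists)
    top = []
    for depth in range(max_depth):
        last_seen = []
        for lst in lists:
            # one round of sorted access on each list (re-read last entry if exhausted)
            doc, score = lst[min(depth, len(lst) - 1)]
            last_seen.append(score)
            if doc not in seen:
                # random access: fetch this doc's score in every list
                seen[doc] = sum(tbl.get(doc, 0.0) for tbl in lookup)
        threshold = sum(last_seen)  # no unseen doc can beat this total
        top = sorted(seen.items(), key=lambda kv: kv[1], reverse=True)[:k]
        if len(top) == k and top[-1][1] >= threshold:
            break  # early termination: the answer cannot change
    return top

lists = [
    [("d1", 9.0), ("d2", 7.0), ("d3", 4.0)],   # list for T1
    [("d2", 8.0), ("d3", 6.0), ("d4", 2.0)],   # list for T2
    [("d1", 5.0), ("d3", 3.0), ("d2", 1.0)],   # list for T3
]
result = threshold_algorithm(lists, k=1)
```

The point of TA over a naive merge is the early stop: here the winner is certified after two rounds of sorted access, without scanning the lists to the end.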

15
Slides for lecture 3
  • After the intersected list for the multi-term query has been found, take the tf-idf score of each document for each term, add them up, and arrange the documents in decreasing order of the total

              T1        T2        T3
           (tf-idf)  (tf-idf)  (tf-idf)
      d1     106       106        0     (total 212 - this will be ranked higher)
      d2       4         4        4     (total 12)
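The aggregation step above is a one-liner per document; this sketch reproduces the slide's illustrative scores:

```python
# Per-term tf-idf scores for each candidate document (values from the example).
per_term_scores = {
    "d1": [106, 106, 0],  # scores for T1, T2, T3
    "d2": [4, 4, 4],
}

# Sum the per-term scores and rank documents by the total, descending.
totals = {doc: sum(scores) for doc, scores in per_term_scores.items()}
ranked = sorted(totals, key=totals.get, reverse=True)
```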

16
Slides for lecture 3
  • DATABASE CONTEXT
  • All distinct values are words or terms
  • A tuple is taken as a document
  • Important Points
  • The vector space model does not force any broad conditions
  • No search engine uses the pure vector space model
  • It is implemented, but with some constraints

17
Slides for lecture 3
  • Advantages
  • - Ranked retrieval
  • - Terms are weighted according to importance
  • Disadvantages
  • - Terms are taken as independent
  • - Weighting is not very formal

18
Slides for Lecture 3
  • Thank you
  • Slides Made By Arjun Saraswat

19
Slides for lecture 3
  • References
  • www.scit.wlv.ac.uk/jphb/cp4040/mtnotes
  • http://www.cs.wisc.edu/dbbook/openAccess/thirdEdition/slides
  • http://krakow.lti.cs.cmu.edu
  • http://www.cs.utexas.edu/users/mooney/ir-course/slides
  • http://db.uwaterloo.ca/tozsu/courses