Web Search Engines PowerPoint PPT Presentation

Title: Web Search Engines


1
Web Search Engines
  • by Greg R. Notess
  • notess_at_imt.net
  • imt.net/notess/search

2
Overview
  • Comparing the database content
  • Change
  • Comparative Size
  • Overlap
  • Looking towards future developments
  • Portal or Destination
  • Output sorting

3
Results are limited by
  • Database content
  • The Web sites included
  • The depth to which they are indexed

4
  • If it's not in the database, even the best search
    engine will not be able to find the Web page

5
So what are they like?
  • Very large databases
  • Most index all words on page
  • None index words in images
  • Let's see how the databases compare to the real
    Web

6
Change over time?
7
Overall Size Change
  • Is the Web in general
  • Growing?
  • Shrinking?
  • Remaining the same?

8
Excite 6 Searches 10/96-8/98
9
What about the rest?
  • Who's the biggest?
  • How to measure?
  • Actual search results
  • Verified hits

10
(No Transcript)
11
And over time?
  • 8/98 -- AltaVista, Northern Light, HotBot
  • 5/98 -- AltaVista, HotBot, Northern Light
  • 2/98 -- HotBot, AltaVista, Northern Light
  • 10/97 -- AltaVista, HotBot, Northern Light
  • 9/97 -- Northern Light, Excite, HotBot
  • 6/97 -- HotBot, AltaVista, Infoseek
  • 10/96 -- HotBot, Excite, AltaVista

12
Back to change in size
  • Let's look at six search engines
  • Over the course of two years

13
(No Transcript)
14
But at least
  • They have a high degree of duplication between
    them
  • Right?

15
Try 4 small searches
  • Using five search engines
  • How many pages are found by all five or at least
    by four of them?
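
The overlap question on this slide can be sketched in a few lines of Python. This is only an illustration, not the method used for the presentation: the engine names, the URLs, and the helper names overlap_counts and found_by_at_least are all hypothetical, and the sketch assumes each engine's results are already available as a list of URLs.

```python
from collections import Counter


def overlap_counts(results_by_engine):
    """Map each URL to the number of engines that returned it."""
    counts = Counter()
    for urls in results_by_engine.values():
        # set() so that a URL listed twice by one engine counts once
        counts.update(set(urls))
    return counts


def found_by_at_least(results_by_engine, k):
    """Return the sorted URLs returned by at least k engines."""
    counts = overlap_counts(results_by_engine)
    return sorted(url for url, n in counts.items() if n >= k)


# Hypothetical result sets, for illustration only (not real search data).
sample = {
    "engine_a": ["http://example.com/a", "http://example.com/b"],
    "engine_b": ["http://example.com/b", "http://example.com/c"],
    "engine_c": ["http://example.com/d"],
}

if __name__ == "__main__":
    print(found_by_at_least(sample, 2))
    print(found_by_at_least(sample, len(sample)))
```

Calling found_by_at_least with k equal to the number of engines answers "found by all of them"; k one lower answers "by all but one".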

16
ZERO
17
Overlap
18
And they exclude most
  • Content of Adobe PDF and formatted files
  • The content in most sites requiring a login
  • CGI output data requested by a form
  • Other dynamically produced data
  • Pages protected by a robots.txt file
  • Intranets, pages not linked from anywhere else
  • Commercial resources with domain limitations
  • Non-Web resources
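
As an illustration of the robots.txt item in the list above, here is a minimal sketch of how a crawler might decide whether a page is off limits, using Python's standard urllib.robotparser module. The robots.txt rules, the crawler name ExampleBot, and the example.com URLs are assumptions made up for this sketch; the rules are parsed from a string so that nothing is fetched over the network.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_text):
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def may_index(parser, url, user_agent="ExampleBot"):
    """Return True if the rules allow this (hypothetical) crawler to fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for page in ("http://example.com/index.html",
                 "http://example.com/private/report.html"):
        print(page, may_index(rules, page))
```

A page under a disallowed path never reaches the search engine's database, so no query can retrieve it from there.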

19
Scope Summary
  • Inconsistent growth
  • Not full coverage
  • Surprisingly low duplication

20
Positive Side?
  • Essential for searching the Net
  • Can be used effectively
  • Phrase search
  • Use more than one
  • Smart searching

21
  • Incredibly popular
  • Even when they fail
  • But then, since when is finding information
    always easy?

22
Overview
  • Comparing the database content
  • Change
  • Comparative Size
  • Overlap
  • Looking towards future developments
  • Portal or Destination
  • Output sorting

23
What is a search engine?
  • Portal?
  • Gateway?
  • Destination?

24
Search Engine
  • the software that searches a database

25
Development
  • Database of Web pages
  • adds Supplementary Database
  • Phone numbers, reference, businesses, news
  • then adds Subject directory
  • then Services
  • email, ISP, shopping, travel agent
  • now Communities

26
Portal to Destination?
  • Driving force
  • advertising revenue
  • Keep users longer for more
  • Conflicts with portal and gateway principle

27
Future possibilities?
  • Smaller databases
  • Less pointing to external pages
  • Paid advertising or sponsorship for visibility
  • Rise of search only sites?

28
Output Development
  • Initially, Relevance ranking
  • Crude
  • Not site or URL based
  • Some site sorting from Excite
  • No date sorting

29
Site Sorting
  • Infoseek, then Lycos, now HotBot
  • Group together by site
  • More relevant than prior algorithms
  • Northern Light includes it in
  • Custom Folders
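
The "group together by site" idea can be sketched as follows. This is an assumption about the general technique, not a description of how Infoseek, Lycos, HotBot, or Northern Light implement it. The function name group_by_site and the URLs are hypothetical, and the input is assumed to be a list of result URLs already in relevance order.

```python
from urllib.parse import urlsplit


def group_by_site(ranked_urls):
    """Group result URLs by host name.

    Sites appear in the order of their best-ranked page, and pages
    within a site keep their original relevance order.
    """
    groups = {}
    for url in ranked_urls:
        host = urlsplit(url).hostname or ""
        groups.setdefault(host, []).append(url)
    return groups


# Hypothetical ranked results, for illustration only.
ranked = [
    "http://example.com/search/tips.html",
    "http://example.org/engines.html",
    "http://example.com/search/reviews.html",
]

if __name__ == "__main__":
    for site, pages in group_by_site(ranked).items():
        print(site)
        for page in pages:
            print("   ", page)
```

Grouping this way keeps one site with many matching pages from filling the whole first screen of results.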

30
Other Output
  • RealName on AltaVista
  • Direct Hit on HotBot
  • Subject Directory Categories
  • News
  • Books, CDs, etc. about search term

31
Search Engine Showdown
  • imt.net/notess/search
  • Search engine features
  • See also
  • www.searchenginewatch.com
  • See also
  • Rich Wiggins, Coming up next . . .

32
Web Search Engines
  • by Greg R. Notess
  • notess_at_imt.net
  • imt.net/notess/search