How Search Engines Work General Search Strategies

Provided by: DaniaB

Transcript and Presenter's Notes


1
How Search Engines Work General Search Strategies
  • Dr. Dania Bilal
  • IS 587
  • SIS Fall 2007

2
Fun Quiz
  • Take the search engine quiz located at
  • http://websearch.about.com/library/quizzes/search_engine_quiz/blsearchenginequiz.htm
  • Record the number of incorrect answers
  • Share the results of the quiz with a classmate.

3
How Do Search Engines Work?
  • They collect information from selected web sites
  • They employ special software robots, called
    spiders, to crawl web pages
  • Spiders build lists of the words found on Web
    sites.
  • While a spider is building its lists, it is
    said to be Web crawling.
  • Spiders store the lists in the engine's database
  • The engine's indexing software builds an index of
    words
  • The query input is matched against the index and
    matching information is retrieved (processing algorithm)
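
The index-then-match steps above can be sketched in a few lines. This is a minimal illustration, not how any particular engine is built: the page addresses and texts in `PAGES` are made up, and `build_index` and `search` are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical mini-corpus standing in for crawled pages (not from the slides).
PAGES = {
    "a.example/home": "search engines crawl the web",
    "b.example/faq": "spiders crawl web pages and build word lists",
    "c.example/about": "an index maps words to pages",
}

def build_index(pages):
    """Map each word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

index = build_index(PAGES)
print(sorted(search(index, "crawl web")))
```

Looking a word up in the index is much cheaper than rescanning every page for each query, which is why the index is built ahead of time.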

4
How Do Spiders and Crawlers Work?
  • They begin with popular and heavily used web
    servers.
  • They begin with a popular site, collect the words
    on its pages and follow every link found within
    the site.
  • Spiders travel across pages and the most widely
    used portions of the Web
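
The "start at a popular site and follow every link" behaviour can be sketched as a breadth-first traversal. The link graph below is invented for illustration and lives in memory, so nothing is fetched from the network; `crawl` is a hypothetical helper, not a real crawler.

```python
from collections import deque

# Hypothetical link graph: each page lists the pages it links to.
LINKS = {
    "popular.example": ["popular.example/news", "other.example"],
    "popular.example/news": ["popular.example"],
    "other.example": ["other.example/deep"],
    "other.example/deep": [],
    "isolated.example": [],
}

def crawl(seed, links):
    """Visit pages breadth-first from the seed, following each link once."""
    visited = []
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        visited.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited

order = crawl("popular.example", LINKS)
print(order)
```

A page that nothing links to (`isolated.example` here) is never reached, which is one reason crawlers favour the most widely linked portions of the Web.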

5
How Do Spiders and Crawlers Work?
  • A dedicated server of URLs is built by a search
    engine company (e.g., Google) so that spiders
    collect information quickly
  • More than one spider is used to crawl web pages at
    a time
  • Google uses 3-4 spiders and collects over 100
    pages per second

6
How Do Spiders and Crawlers Work?
  • When no dedicated URL server is used, the search
    engine company relies on an ISP to translate domain
    names into addresses to use for crawling the web
  • Delay in gathering information
  • Delay in updating information
  • Lack of control over URL addresses

7
Google Spider and How it Works
  • A spider looks at the HTML, XML, or other markup
    used to build a web page and collects information
    from the meta-tags
  • It indexes words within the actual text of a
    page
  • It indicates where the words were found (URL,
    title, headings, etc.)
  • It disregards initial articles (a, an, the)
  • It disregards pages that should not be crawled or
    indexed

8
Google Spider and How it Works
  • It uses the Robots Exclusion Protocol to decide
    which pages to disregard
  • The protocol is implemented in the meta-tag section
    at the beginning of a Web page
  • It tells a spider to leave the page alone: neither
    index the words on the page nor try to follow its
    links
  • Franklin, C. How Internet Search Engines Work.
    http://computer.howstuffworks.com/search-engine.htm
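
As a sketch of how a spider might honour the meta-tag form of this protocol, the snippet below reads a page's robots meta tag with Python's standard `html.parser` module. The `noindex` and `nofollow` values are the conventional directives for this tag; the class name, the `robots_policy` helper, and the sample page are assumptions made for illustration.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())

def robots_policy(html):
    """Report whether a spider may index the page and follow its links."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return {
        "index": "noindex" not in parser.directives,
        "follow": "nofollow" not in parser.directives,
    }

# Hypothetical page that asks spiders to leave it alone.
page = ('<html><head><meta name="robots" content="noindex, nofollow">'
        '</head><body>Hi</body></html>')
print(robots_policy(page))
```

A page without the tag is treated as indexable and followable in this sketch, which matches the default behaviour the slide implies.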

9
How Do Search Engines Store Indexed Words?
  • The process varies among engines
  • Words are stored with the number of times they
    appear on a page (posting)
  • Weight is assigned to each word.
  • Words appearing near top of a page may have more
    weight than those appearing in subheadings, in
    links, in meta tags, in title, etc.
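
A posting with a count and a weight per word might look like the sketch below. The weight values in `FIELD_WEIGHTS` are invented for illustration (engines differ in which part of a page counts most, and the values can be swapped freely); `postings` is a hypothetical helper.

```python
from collections import Counter

# Assumed weights for illustration only; real engines choose their own.
FIELD_WEIGHTS = {"title": 3.0, "heading": 2.0, "body": 1.0}

def postings(page):
    """Return {word: (count, weighted_score)} for a page given as {field: text}."""
    counts = Counter()
    scores = Counter()
    for field, text in page.items():
        weight = FIELD_WEIGHTS.get(field, 1.0)
        for word in text.lower().split():
            counts[word] += 1
            scores[word] += weight
    return {word: (counts[word], scores[word]) for word in counts}

# Hypothetical page split into the places where words were found.
page = {
    "title": "search engines",
    "heading": "how spiders crawl",
    "body": "spiders crawl pages and engines index pages",
}
result = postings(page)
print(result["engines"], result["pages"])
```

Two words with the same count can end up with different scores because of where they appear on the page.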

10
How Do Search Engines Store Indexed Words?
  • Information is encoded to save space
  • Information is indexed
  • An index of words is built by the automatic
    indexer (indexing software)
  • A hash table is created with an assigned weight
    or value for each word indexed
  • Hashing evens out the distribution of popular
    entries (e.g., letter M) and those that are less
    popular (e.g., letter X) for quick retrieval
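
A hash table of word weights can be sketched as follows. This is a teaching toy, assuming a fixed number of buckets and using `zlib.crc32` only because it returns the same value on every run; the `WordTable` class and the sample weights are hypothetical.

```python
import zlib

class WordTable:
    """A tiny chained hash table mapping word -> weight."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, word):
        # The bucket depends on a hash of the whole word, not its first letter.
        position = zlib.crc32(word.encode("utf-8")) % len(self.buckets)
        return self.buckets[position]

    def put(self, word, weight):
        bucket = self._bucket(word)
        for i, (existing, _) in enumerate(bucket):
            if existing == word:
                bucket[i] = (word, weight)  # overwrite an existing entry
                return
        bucket.append((word, weight))

    def get(self, word, default=None):
        for existing, weight in self._bucket(word):
            if existing == word:
                return weight
        return default

table = WordTable()
for word, weight in [("map", 2.0), ("music", 1.5), ("xml", 3.0)]:
    table.put(word, weight)
table.put("map", 2.5)
print(table.get("map"), table.get("xml"), table.get("zebra"))
```

Because the bucket is chosen from a hash of the whole word, words starting with a common letter such as M are not forced into one crowded section the way they would be in an alphabetical list.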

11
Using General Directories
  • Yahoo and its family
  • Browsing directory
  • Directory database
  • Small and human-selected and indexed
  • Searching using keywords
  • Search database
  • Larger and non-selective database
  • Spider and machine indexing

12
Yahoo
  • Yahoo.com
  • Works like a search engine rather than a
    directory
  • Searches the web
  • Exercise: search under my name and see how Yahoo
    processes the query while you're typing it in
  • Directory found under More or at
  • http://search.yahoo.com/dir

13
Yahoo Search Engine
  • Search
  • Web
  • Images
  • Videos
  • Local information
  • Shopping
  • More

14
Yahoo Advanced Search
  • Advanced Search feature
  • Shown on screen after you perform a search, or by
    going directly to
  • http://search.yahoo.com/web/advanced?ei=UTF-8&p=dr dania bilal&fr=yfp-t-471
  • Lots of search features to explore
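
The advanced-search address carries its settings in the query string. A minimal sketch of building and reading such an address with Python's standard `urllib.parse` follows; the parameter names `ei`, `p`, and `fr` come from the address given for this presentation, while reading `p` as the search terms is an assumption.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE = "http://search.yahoo.com/web/advanced"

# Parameter names taken from the address in the presentation; their exact
# meaning (p as the query text, for instance) is assumed here.
params = {"ei": "UTF-8", "p": "dr dania bilal", "fr": "yfp-t-471"}

# urlencode escapes the spaces in the query so the address is valid.
url = f"{BASE}?{urlencode(params)}"
print(url)

# Reading the parameters back out of the address.
decoded = parse_qs(urlsplit(url).query)
print(decoded["p"][0])
```

Encoding matters because a raw space is not allowed in a URL; `urlencode` writes each space as `+`.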

15
Yahoo Advanced Search Features
  • Boolean
  • Phrase
  • Currency
  • Domain
  • File format
  • Country
  • Language
  • Other

16
Yahoo Advanced Search Features
  • Exercise
  • Perform a search on a topic of your choice
  • Use Boolean equivalents
  • All the words = AND
  • The exact phrase = phrase proximity search
  • Any of these words = OR
  • None of these words = NOT
  • Choose part of page to search
  • Choose language other than English
  • Report results in class
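
The Boolean equivalents in this exercise map naturally onto set operations. The sketch below uses three made-up documents and hypothetical helper names; it illustrates the logic only and says nothing about how Yahoo implements it.

```python
# Hypothetical documents, keyed by an id.
DOCS = {
    1: "web search engines index the web",
    2: "search strategies for library catalogs",
    3: "boolean search strategies for web engines",
}

def docs_with(word):
    """Ids of the documents that contain the word."""
    return {doc_id for doc_id, text in DOCS.items() if word in text.split()}

def all_words(*words):
    """All the words = AND (expects at least one word)."""
    return set.intersection(*(docs_with(w) for w in words))

def any_words(*words):
    """Any of these words = OR (expects at least one word)."""
    return set.union(*(docs_with(w) for w in words))

def none_of(candidates, *words):
    """None of these words = NOT, applied to an existing result set."""
    return candidates - any_words(*words)

def exact_phrase(phrase):
    """The exact phrase: the words must appear next to each other, in order."""
    return {doc_id for doc_id, text in DOCS.items()
            if f" {phrase} " in f" {text} "}

print(all_words("search", "web"))
print(any_words("library", "boolean"))
print(none_of(all_words("search"), "library"))
print(exact_phrase("search strategies"))
```

AND narrows a result set, OR widens it, and NOT removes documents from a set that has already been retrieved.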

17
Yahoo Search Services
  • For searching specific content areas, such as:
  • Search Services
  • Web Search: Find anything from across the Web
  • Answers: Ask questions and get answers from real
    people
  • Audio Search: Find over 50 million audio files
    from across the Web
  • Creative Commons Search: Find Creative Commons
    content that you can share or re-use in your own
    works
  • Directory Search: Search or browse Yahoo!'s
    categorized guide to the Web
  • Image Search: Find over 1.6 billion photos and
    illustrations from all over the Web
  • Job Search: Search for jobs, post your resume and
    more on Yahoo! HotJobs
  • Local: Find everything in your area from dry
    cleaners to day spas
  • Maps: Find maps and driving directions for
    anywhere you want to go
  • Mobile Search: Find whatever, wherever you are
  • My Web (Beta): The newest way to save, share and
    organize any page you want on the Web
  • News Search: Search for news stories and related
    photos, videos and audio clips

18
Yahoo Next
  • http://next.yahoo.com/
  • Cutting edge technology at Yahoo
  • Blogs, Web 2.0, use of alltheweb, Yahoo Maps,
    Podcasts, audio and all other features that are
    in Beta testing

19
Yahoo Preferences
  • Customize Yahoo to fit your needs
  • Go to Preferences from the Web search page
  • Edit preferences based on your needs
  • Edited preferences are saved in the browser on
    your desktop

20
General Search Strategies in Search Engines

21
Strategies
  • Boolean
  • Boolean equivalents
  • Proximity and phrase searching
  • Searching within a field
  • Search limits

22
Yahoo Search Strategies
  • Explore Yahoo's help page
  • Read the Search Tips
  • Read the search limit parameters such as
  • intitle
  • url
  • inurl
  • Read how to use Boolean equivalents and other
    search parameters
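
To make the idea of a search limit concrete, here is a toy filter that understands `intitle:` and `inurl:` prefixes. The pages, the `parse_query`, `matches`, and `find` helpers, and the matching rules are all assumptions for illustration; Yahoo's own handling of these parameters is described on its help page.

```python
# Hypothetical pages with the two fields the limits refer to.
PAGES = [
    {"url": "http://example.edu/search-tips", "title": "Search Tips for Students"},
    {"url": "http://example.com/blog/tips", "title": "Gardening Tips"},
    {"url": "http://example.org/engines", "title": "How Search Engines Work"},
]

def parse_query(query):
    """Split a query into (field, term) pairs; bare terms get field None."""
    pairs = []
    for token in query.split():
        field, sep, term = token.partition(":")
        if sep and term and field.lower() in ("intitle", "inurl"):
            pairs.append((field.lower(), term.lower()))
        else:
            pairs.append((None, token.lower()))
    return pairs

def matches(page, pairs):
    """True when every term is found in the field it was limited to."""
    for field, term in pairs:
        if field == "intitle":
            haystack = page["title"]
        elif field == "inurl":
            haystack = page["url"]
        else:
            haystack = page["title"] + " " + page["url"]
        if term not in haystack.lower():
            return False
    return True

def find(query):
    pairs = parse_query(query)
    return [page["url"] for page in PAGES if matches(page, pairs)]

print(find("intitle:search"))
print(find("inurl:tips intitle:search"))
```

Limiting a term to the title or the address usually returns fewer but more focused results than an unrestricted search.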

23
General Search Engines Besides Yahoo Search

24
Engines and Information Need
  • Several general search engines on the Web
  • Select engine(s) that best fit your need
  • Visit the Web Search Guide for latest
    information
  • http://websearch.about.com/od/generalsearchengines/General_AllPurpose_Search_Engines.htm

25
Hands-on Activity
  • Browse the list of general search engines in Web
    Search Guide
  • Explore 4 of the engines listed
  • Wisenut, Snap.com, Lycos, Exalead
  • Search under my name in each engine
  • Compare the results by viewing the first two
    pages retrieved
  • How many overlaps were found among the four
    engines
  • How many unique results were found in each engine
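
Once the first two pages of results from each engine are recorded, the overlap and the unique results can be tallied with set operations. The result lists below are placeholders, not real search results; substitute the addresses you actually retrieved.

```python
from collections import Counter

# Placeholder result sets; replace with the addresses each engine returned.
RESULTS = {
    "Wisenut": {"example.org/a", "example.org/b", "example.org/c"},
    "Snap.com": {"example.org/b", "example.org/c", "example.org/d"},
    "Lycos": {"example.org/c", "example.org/e"},
    "Exalead": {"example.org/c", "example.org/f", "example.org/b"},
}

# Results returned by every engine.
overlap_all = set.intersection(*RESULTS.values())

# How many engines returned each address.
counts = Counter(url for urls in RESULTS.values() for url in urls)

# Results that only one engine returned.
unique = {
    engine: {url for url in urls if counts[url] == 1}
    for engine, urls in RESULTS.items()
}

print(overlap_all)
for engine, urls in unique.items():
    print(engine, sorted(urls))
```

Compare addresses in a normalized form (for example, without a trailing slash) so that the same page is not counted twice.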

26
Specialized Search Engines
  • Web Search Guide has a listing of specialized
    search engines
  • Web companion to the textbook, chapter 3
    describes a variety of specialized engines
  • Explore chapter 3 and familiarize yourself with
    the engines described

27
Hands-on Activity
  • Find the answer or relevant information for these
    two queries using an appropriate, specialized
    search engine
  • Do squirrels hibernate?
  • Find me a list of foreign-owned companies based
    in the U.S., organized by state.