Search Engine Project - PowerPoint PPT Presentation

Provided by: CDVP

Transcript and Presenter's Notes


1
Search Engine Project
  • Cathal Gurrin
  • Paul Ferguson
  • Georgina Gaughan
  • Peter Wilkins

2
Objectives
  • Create a high-quality search engine
  • Provide highly relevant content in response to user queries
  • Index distributed data from hyper collections
  • Deliver content in a timely manner.

3
Project Context
  • Funded by SFI.
  • Building an adaptive media retrieval system.
  • Using text-based retrieval as a first step
    towards this.
  • Allow collaborative user-centric evaluations with
    UCD.
  • From this, learn how to do the video evaluation.

4
SPIRIT Collection
  • Approximately 1.2 terabytes of data.
  • 94,500,000 documents.
  • Consists of a collection of web pages crawled
    straight from the Web.
  • 1/30th the size of Google content.
  • 1/5000th the hardware.
  • Only available to 2 universities in UK.
  • No other European universities.

5
System Architecture
[Diagram: Web Browser Interface, User Query, Formatted Results, Aggregate Search Engine, Indexes 1 ... n]
6
System Architecture
[Diagram: Aggregate Search Engine, Indexes 1 ... n]
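
The architecture slides show an aggregate search engine sitting between the user query and indexes 1 to n. The slides do not say how queries are dispatched or how results are merged, so the following is only a minimal sketch under stated assumptions: the index nodes are hypothetical in-memory dictionaries, each node returns (doc_id, score) pairs, and scores are assumed to be comparable across nodes. The names INDEX_NODES, query_node and aggregate_search are illustrative, not taken from the project.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the n index nodes: each maps a term to
# (doc_id, score) pairs. A real node would sit behind a network call.
INDEX_NODES = [
    {"search": [("doc-a", 0.9), ("doc-b", 0.4)]},
    {"search": [("doc-c", 0.7)]},
    {"engine": [("doc-d", 0.8)], "search": [("doc-e", 0.2)]},
]


def query_node(node, term):
    """Return this node's hits for a single-term query (empty if none)."""
    return node.get(term, [])


def aggregate_search(term, nodes=INDEX_NODES, top_k=3):
    """Send the query to every node in parallel and merge the top_k hits."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        per_node_hits = list(pool.map(lambda n: query_node(n, term), nodes))
    all_hits = [hit for hits in per_node_hits for hit in hits]
    # Highest score first; assumes scores are comparable across nodes.
    return heapq.nlargest(top_k, all_hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    print(aggregate_search("search"))
```

Merging by raw score is the simplest choice; if each node scored documents on a different scale, the scores would need normalising before the merge.
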
7
Work Done
  • Acquired SPIRIT collection and necessary hardware
  • Verified collection integrity
  • Automatic unpacking of collection
  • Hardware resource deployment and configuration
  • Pre-processor Development
  • Requirements gathering
  • R&D prototyping stage
  • 95% implemented and tested
  • To be operational by Christmas
  • Process collection over Christmas
  • Establishment and maintenance of project corporate
    memory

8
(No Transcript)
9
Next Phase
  • Indexer
  • Retrieval Engine
  • User Interface Component
  • Aggregation Engine
  • Dynamic Collection distributor

10
Search Capabilities
  • Integrate PageRank/SiteRank Algorithm
  • Extend search capabilities to include:
    • Phrase searching
    • Boolean searching
    • "More like this" searching
  • Clever multi-tiered caching to enhance retrieval
    performance
  • Query distribution operating in conjunction with
    caching to further enhance retrieval performance
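
The first item in the list above names PageRank. The slides give no detail on how it would be integrated, so the following is a generic power-iteration sketch of the published algorithm rather than the project's implementation. The link graph, the damping factor of 0.85 and the fixed iteration count are assumptions chosen for illustration; SiteRank is not covered.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of page -> list of outgoing links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical four-page graph: "c" is linked to by every other page.
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 4))
```

In a search engine a score like this would typically be combined with the text-match score at ranking time; how the two are weighted is a design decision the slides leave open.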

11
Gold Plating
  • Structured Search Capabilities
  • Result Fusion
  • Search Engine Discovery

12
Project Management
13
Conclusions
  • Beta release
  • Bag-of-words searching
  • Distributed framework operational, with result
    aggregation
  • Preliminary UI operational
  • Supporting user search
  • Beta release Friday 13th February
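
Bag-of-words searching, as listed above, treats each document as an unordered set of terms with counts. A minimal sketch of the idea follows, assuming a simple in-memory inverted index and ranking by summed term frequency. The tokenizer, the scoring rule and the function names are illustrative assumptions; the slides do not describe the beta's actual ranking function.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: term frequency}, ignoring word order."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, query):
    """Rank documents by summed frequency of the query terms they contain."""
    scores = Counter()
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    return [doc_id for doc_id, _ in scores.most_common()]


if __name__ == "__main__":
    # Hypothetical three-document collection.
    docs = {
        "d1": "search engine project",
        "d2": "web search and more web search",
        "d3": "video retrieval",
    }
    print(search(build_index(docs), "web search"))
```

Because word order is discarded, this index cannot answer phrase queries; that would need term positions stored alongside the counts.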

14
  • Thank You!