Site Explorer Server: an integrated, client-server, query system for Web sites. Giancarlo Bongiovanni, Flavio Fontana, Stefano Borghetti, Dept. of Computer Science ...

1
Site Explorer Server: an integrated,
client-server, query system for Web sites
Giancarlo Bongiovanni, Flavio Fontana, Stefano Borghetti
Dept. of Computer Science, University of Rome "La Sapienza"
ENEA's Usability Lab
2
  • Summary
  • Introduction
  • Information Retrieval Systems and keyword score
  • Search engines
  • Internet now and the future
  • Java
  • Site Explorer Server v2.0
  • Conclusion and experimental results
  • Future work

3
Information search on the Internet
The Internet is the largest and most widespread
network

4
Issue: can information search on the Internet be a
problem for particular types of users?

Today: a better scenario
  • Users' problems related to information search
  • Many users don't know the Web information model
  • Users have trouble finding valid tools able to
    locate the relevant information
  • Users have trouble describing the information
    they seek in correct and concise terms
  • Users have trouble using advanced search tools
    (e.g. Site Explorer Server is more difficult to
    use than a browser)

5
New search and exploration tools
A new approach to the Web, alternative to the
traditional browser
Implementation of a Client/Server tool able to
perform Web IR using Java, tried out and tested
at ENEA

Tool integrated with browser
Network service
6
Gerard Salton, Introduction to Modern Information
Retrieval, McGraw-Hill, Inc., 1983.
IRS
  • Result formulation
  • Query formulation by user
  • Indexing process

7
A query formulation is a list of terms that expresses
and summarizes the topic being searched
  • Boolean systems combine the terms using Boolean
    operators
  • and
  • or
  • andnot

Examples
Information and retrieval
Information or retrieval
Information andnot retrieval
Boolean operators
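
The three Boolean operators can be read as set operations over the documents that contain each term. A minimal sketch in Python, assuming a tiny made-up collection; the document texts and the names `docs` and `containing` are illustrative and do not come from the slides:

```python
# Hypothetical three-document collection, used only for illustration.
docs = {
    1: "information retrieval systems",
    2: "information theory",
    3: "image retrieval",
}

def containing(term):
    """Return the set of document ids whose text contains the term."""
    return {doc_id for doc_id, text in docs.items() if term in text.split()}

info = containing("information")
retr = containing("retrieval")

both = info & retr        # Information and retrieval
either = info | retr      # Information or retrieval
only_info = info - retr   # Information andnot retrieval
```

Here `and` narrows the result to documents containing both terms, `or` widens it to documents containing either, and `andnot` keeps documents with the first term but not the second.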
  • Extended Boolean systems use additional
    operators
  • proximity of terms
  • truncation of terms
  • search within a particular field

Examples
Information adj retrieval
Inform
Information in title
Extended operators
In ranking systems the query is formulated
using natural-language phrases
Example
Human influence in Information Retrieval systems
Ranking
8
Indexing is the process of analysing documents to
produce a short representation of their contents.
The representation is based on a keyword vector.
The keywords are chosen manually or
extracted by an automatic process
Example
Information Retrieval: Data Structures and
Algorithms

Term vector
<information, retrieval, data-structure,
algorithms>
Example
List, tree, index file, etc.
Data structures that hold the document representation
Data structures
Example
A file in which each record associates a particular
term with the related records
Inverted indexing
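
The inverted file described above can be sketched as a dictionary from each term to the documents that contain it. This is a minimal in-memory illustration, not the storage format of any particular system; `build_inverted_index` and the sample documents are hypothetical:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

# Hypothetical collection, used only for illustration.
docs = {
    1: "information retrieval",
    2: "data structures",
    3: "information structures",
}
index = build_inverted_index(docs)
```

Looking up a term is then a single dictionary access, for example `index["information"]`, instead of a scan over every document.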
9
In a traditional IRS the result is a list of
potentially relevant documents
Gerard Salton, Introduction to Modern Information
Retrieval, McGraw-Hill, Inc., 1983.
William B. Frakes, Ricardo Baeza-Yates,
Information Retrieval: Data Structures and
Algorithms, Prentice Hall, Inc., 1992.

Documents ordered by relevance level
Result ordering
Explicit measure of relevance level (score)
Dynamic presentation (result manipulation)
Graphical and direct presentation methods
New features
Multimedia integration
Use of windows (different ways to present the
results)
10
Information Retrieval Systems
Score computation
Score computation aims to measure the relevance
of specific terms in specific documents
Key points in score computation
Example
A method for weighting a term's relevance across
the whole document collection

(Sparck Jones, 1972)
(Dennis, 1967)
Example
Frequency normalization for a particular document
collection
(Croft, 1983)
(Harman, 1986)
Computing a term's weight for a document: the term
frequency in the document combined with the term's
relevance weight in the collection
  • Computing the score
  • Boolean systems use the SOP method
  • Ranking systems use a specific formula.
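
One widely used weighting that fits this description multiplies the term frequency in a document by an inverse document frequency computed over the collection. The sketch below assumes that tf-idf form. The slides do not state which formula the cited ranking systems or Site Explorer Server actually use, and the function names and sample collection are hypothetical:

```python
import math

def term_weight(term, doc_tokens, collection):
    """Assumed tf-idf weight: frequency in the document times log(N / df)."""
    tf = doc_tokens.count(term)
    df = sum(1 for tokens in collection if term in tokens)
    if tf == 0 or df == 0:
        return 0.0
    return tf * math.log(len(collection) / df)

def score(query_terms, doc_tokens, collection):
    """Ranking score: sum of the weights of the query terms in the document."""
    return sum(term_weight(t, doc_tokens, collection) for t in query_terms)

# Hypothetical tokenized collection, used only for illustration.
collection = [
    ["information", "retrieval", "information"],
    ["information", "theory"],
    ["image", "retrieval"],
]
scores = [score(["information", "retrieval"], d, collection) for d in collection]
```

With this weighting the first document ranks highest because it contains both query terms and repeats one of them, while the other two documents tie.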

11
Web interface (Query and results)
Index DB
SIMILAR
Web pages
Automatic indexing system
  • New functionality in the most popular search
    engines
  • Site classification
  • Integration of new advanced search services for
    finding information in particular formats
    (pictures, sounds, MP3, e-mail, etc.)
  • Few search engines provide a document score
  • Migration from search services to on-line seller
    guides

Media Matrix - June 1999
12
Internet
Source: FIND/ITPD, III, January 1999 - NII
project, supported by DOIT, MOEA

13
Internet
Source: FIND/ITPD, III, January 1999 - NII
project, supported by DOIT, MOEA

14
Internet
Source: FIND/ITPD, III, January 1999 - NII
project, supported by DOIT, MOEA

15
Internet
Source: FIND/ITPD, III, January 1999 - NII
project, supported by DOIT, MOEA

16
Internet
  • Future tracks
  • Research and technology
  • Education
  • Public administration
  • E-commerce

17
Main features
Technologies
Applet
Suited to implementing graphical user interfaces
Multithreaded
Client

Object-oriented
Site Explorer Server v2.0
Suited to implementing Client/Server systems
Dynamic
Portable
Rich networking functionality
Platform independence
Server
18
Goals: to implement a new system
able to work directly on the Web
able to help the user find interesting
documents on the Web

with a high degree of usability
  • able to integrate
  • search functions
  • an approach alternative to the browser
  • management functions
  • a user workstation for accessing heterogeneous
    Web data in a single, uniform way.

19
Site Explorer Server v2.0: a Client/Server
system, implemented in Java, that automatically
analyses a Web site and returns, as its
result, the site's tree structure, in which the root
node represents the site's home page.
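
One way to obtain such a tree from a site's link graph is a breadth-first visit starting at the home page, attaching each page to the first page that links to it. The sketch below assumes the links have already been extracted into a dictionary. The page names and `build_site_tree` are hypothetical, and the slides do not say which traversal Site Explorer Server itself uses:

```python
from collections import deque

def build_site_tree(home, links):
    """Return {page: [children]} for a tree rooted at the home page.

    `links` maps each page to the pages it links to. Each page is attached
    to the first page that reaches it in breadth-first order, so cycles and
    repeated links do not produce duplicate nodes.
    """
    tree = {home: []}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in tree:
                tree[target] = []
                tree[page].append(target)
                queue.append(target)
    return tree

# Hypothetical link graph, used only for illustration.
links = {
    "/": ["/about", "/products"],
    "/about": ["/", "/contact"],
    "/products": ["/contact", "/about"],
}
tree = build_site_tree("/", links)
```

Breadth-first order keeps every page as close to the root as its shortest link path allows, which tends to mirror how a visitor would reach it from the home page.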
  • Focused on information search and retrieval using
    a keyword-search approach
  • an easy information-filtering service
  • a score computation service
  • user management

Additional features

User
Client
A network service
An accessible (open to everybody), open,
multi-platform service
Interface
INTERNET
Web site
Site Explorer Server
20
  • Client/Server system
  • The Server (SES) is a Java application
  • The Client (SEJA) is a Java applet
  • SES and SEJA communicate using a dedicated
    application-layer protocol (SEP)

Technical features
21
Query selector process
Query
USER
Web sites
HTTP connection process

Links extraction process
Contents extraction process
Keyword analysis process
Score process
Result builder
Next sites page
Client user interface
Result-display process
Result
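
The links extraction step in the pipeline above can be illustrated with Python's standard HTML parser. This is only a sketch of the general technique, not the Java code of Site Explorer Server. The class and function names, the sample page and the URLs are hypothetical, and the page is passed in as a string so that the HTTP connection step stays separate:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkExtractor(HTMLParser):
    """Collect the absolute URLs of all <a href="..."> links in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))

def same_site_links(base_url, html):
    """Return the links that stay on the same host as base_url."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    host = urlparse(base_url).netloc
    return [url for url in parser.links if urlparse(url).netloc == host]

# Hypothetical page content, used only for illustration.
page = '<a href="/docs/index.html">Docs</a> <a href="http://other.example/">Other</a>'
found = same_site_links("http://site.example/home.html", page)
```

Filtering on the host keeps the analysis inside a single site, which matches the idea of exploring one Web site at a time.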
22
Site Explorer Server v2.0
  • Full-text document analysis
  • Link checking using connection requests
  • HTML 4 oriented

Features
23
  • Three score levels
  • Level 1 score. It is based only on the keyword
    occurrences inside the Web page.
  • Level 2 score. It is also based on the keyword
    distribution across the whole Web site.
  • Level 3 score. It is also based on the position of
    keyword occurrences inside the Web page structure.
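
The three levels can be pictured as successive refinements of one score. The slides do not give the actual formulas, so everything in the sketch below is an assumption made for illustration: the logarithmic site factor, the position weights, the page model (a dictionary from structural part to word list) and all names are hypothetical:

```python
import math

# Hypothetical weights for where a keyword occurs in the page structure.
POSITION_WEIGHT = {"title": 3.0, "heading": 2.0, "body": 1.0}

def level1(keyword, page):
    """Level 1: number of keyword occurrences inside the page."""
    return sum(tokens.count(keyword) for tokens in page.values())

def site_factor(keyword, site):
    """Assumed factor that grows when fewer pages of the site use the keyword."""
    pages_with = sum(1 for p in site if level1(keyword, p) > 0)
    if pages_with == 0:
        return 0.0
    return math.log(1 + len(site) / pages_with)

def level2(keyword, page, site):
    """Level 2: level 1 adjusted by the keyword distribution in the site."""
    return level1(keyword, page) * site_factor(keyword, site)

def level3(keyword, page, site):
    """Level 3: like level 2, with occurrences weighted by their position."""
    weighted = sum(POSITION_WEIGHT[part] * tokens.count(keyword)
                   for part, tokens in page.items())
    return weighted * site_factor(keyword, site)

# Hypothetical two-page site, used only for illustration.
home = {"title": ["java", "tools"], "heading": [], "body": ["java", "applet"]}
news = {"title": ["news"], "heading": [], "body": ["java"]}
site = [home, news]
```

Under these assumptions a keyword in the title counts for more at level 3 than the same keyword in the body, while levels 1 and 2 treat both occurrences alike.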

24
Menu bar
Toolbar
Displayed result
Tree structure area
Retrieved object in Web site
Textual area
Multimedia area
Status bar
Status indicator
25
(No Transcript)
26
Connection to the server
27
Active connection indicator
28
New site analysis request
29
Site analysis request using a favorite site
30
Site analysis request using a pre-defined site
31
Receiving result
32
Results navigation
33
Results browsing
34
(No Transcript)
35
The Usability Lab (Ulab) was established in 1992
at the pilot center of the ESPRIT III VENUS
project and carries out research and development
in the field of advanced visual interfaces to
databases and networked multimedia information
systems.
  • Development and test machines
  • Intel Pentium II 350 MHz / Windows 98 (Netlab)
  • Intel Pentium MMX 166 MHz / Windows 95
    (Fontanaulab)
  • AMD K6 300 MHz / Windows 98 (Ulab)
  • Sun Sparc Station 5 / Unix Solaris 2.5 (Venus)
  • Sun Sparc Station 10 / Unix Solaris 2.5 (Dafne)
  • Software tools
  • JDK v1.1.6, JDK v1.1.7, JDK v1.1.7a, JDK v1.1.7b,
    JDK v1.1.8
  • Edit, Netbeans
  • Java Swing v1.0.3, Java Media Framework v1.1

36
  • A robust system
  • Good/excellent degree of usability
  • A good response time (analysis and result building)
  • 50 users selected using the ENEA/VENUS methodology
  • Random users: occasional system use.
  • Professional users: system use related to their
    work.
  • Expert users.

37
  • G7 Global-Inventory project
  • A project data card collection
  • Site search engine vs Site Explorer Server

Plus - Prosoma LinkUp Service: a multimedia data
card collection
Experimental sites: ULAB sites
  • Future testing
  • Virtual Lab Site
  • FAD

38
Link exploration
LinkBot - Link analysis
Site Explorer - Builds a tree for a
single site
SurfMap JavaNavigator - Applet for map-based navigation

Searching a site
PersonalSearch - Applet acting as a search engine for
a site
Virgilio - Search function for a site
Exploring and representing a site
HyperSystem Net40 - Explores a site and gives a
tree representation of it, allowing
navigation
Map navigation and search function
MerzeScope - Graph-navigation applet with a
search function for a single site
39
A fully modular internal architecture, so that
new modules and new functions can be added in the
simplest and most dynamic way.

The implementation of a user-profile system based
on the user's interests, continuously updated
through a feedback technique.
The addition of a new system agent able to perform
automatic off-line Web site analysis in order to
suggest to the user, based on their profile
information, a set of queries on specific themes.