HyKSS: Hybrid Keyword and Semantic Search

Learn more at: http://www.deg.byu.edu

1
HyKSS: Hybrid Keyword and Semantic Search
Andrew Zitzelberger
2
Keyword Search
3
Form Based Search
4
What about?
over 8,000 meters in elevation
less than 100K miles
faster than 100 mph
5
6
HyKSS
  • Hybrid Keyword and Semantic Search
  • Semantics: extracted annotations
  • Multiple ontologies
  • Keywords: text

7
Thesis Statement
  • HyKSS (hybrid search)
    • Outperforms keyword and semantic search
    • Dynamic query weighting outperforms various other hybrid search approaches
    • Allows queries over multiple ontologies
    • Allows pay-as-you-go improvement

8
Extraction Ontologies
9
Data Frames
10
Indexing Architecture
Document Collection -> Keyword Indexer -> Keyword Index
Document Collection -> Semantic Indexer -> Semantic Index
11
Indexing Architecture Implementation
Ontology Library
Lucene
OntoES
Sesame
12
Query Processing
Free Form Query
Keyword Processing: Pre-Process Query -> Execute Query -> Post-Process Query
Semantic Processing: Pre-Process Query -> Execute Query -> Post-Process Query
Combine Results
13
Keyword Query Pre-Processing
  • Remove Lucene special characters (except quotes)
  • Remove (inequality) comparison constraints
  • Remove non-phrase stopwords
  • Example query: hondas in "excellent condition" in orem for under 12 grand
  • Result: hondas excellent condition orem
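
The three pre-processing steps above can be sketched in Python as follows. This is an illustrative sketch, not the HyKSS implementation: the stopword list and the `COMPARISON` regular expression are hypothetical stand-ins (HyKSS would presumably take comparison constraints from its ontology matches), and the function name `preprocess_keyword_query` is invented for this example. The sketch keeps quoted phrases, including their quotes, untouched.

```python
import re

# Characters with special meaning in Lucene's classic query syntax,
# minus the double quote, which is kept so phrase queries survive.
LUCENE_SPECIALS = set('+-&|!(){}[]^~*?:\\/')

# Hypothetical, deliberately tiny stopword list (an assumption).
STOPWORDS = {"in", "for", "the", "a", "an", "of", "and", "or", "to"}

# Hypothetical pattern for inequality constraints such as
# "under 12 grand" or "less than 100K" (an assumption).
COMPARISON = re.compile(
    r"\b(?:under|over|less than|more than|at least|at most)"
    r"\s+\d[\d,.]*k?(?:\s+grand)?\b",
    re.IGNORECASE,
)


def preprocess_keyword_query(query: str) -> str:
    # 1. Drop Lucene special characters, keeping double quotes.
    cleaned = "".join(ch for ch in query if ch not in LUCENE_SPECIALS)
    # 2. Drop inequality comparison constraints.
    cleaned = COMPARISON.sub(" ", cleaned)
    # 3. Drop stopwords, but only outside quoted phrases.
    parts = re.findall(r'"[^"]*"|\S+', cleaned)
    kept = [p for p in parts
            if p.startswith('"') or p.lower() not in STOPWORDS]
    return " ".join(kept)


example = 'hondas in "excellent condition" in orem for under 12 grand'
print(preprocess_keyword_query(example))
```

For the example query, this sketch prints `hondas "excellent condition" orem`, leaving the phrase quotes in place so the phrase can still be run as a Lucene phrase query.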

14
Keyword Query Execution and Post-Processing
  • Executed by Lucene
  • Empty Post-Processing step

15
Semantic Query Pre-Processing: Individual Ontology Scoring
hondas in "excellent condition" in orem for under 12 grand
16
Semantic Query Pre-Processing: Ontology Set Creation
  • For each ontology, sorted by score
  • For each remaining ontology
  • Add a point for each new or subsuming match
  • If added points > 0, add the ontology
  • Completely subsumed ontologies are removed during query generation
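
One possible reading of this selection loop is sketched below in Python. It rests on assumptions the slide does not state: each match is modeled as a character span in the query, a match counts as "new" when it overlaps no span already covered, and "subsuming" when it strictly contains a covered span. The function name, the scores, and the spans are all made up for illustration. The sketch makes a single greedy pass from the top-scoring ontology.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # [start, end) character offsets in the query


def overlaps(a: Span, b: Span) -> bool:
    return a[0] < b[1] and b[0] < a[1]


def subsumes(a: Span, b: Span) -> bool:
    # True when a contains b and is strictly larger than b.
    return a[0] <= b[0] and b[1] <= a[1] and a != b


def build_ontology_set(scores: Dict[str, float],
                       matches: Dict[str, List[Span]]) -> List[str]:
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    chosen: List[str] = []
    covered: List[Span] = []
    for name in ranked:
        added_points = 0
        for span in matches.get(name, []):
            is_new = not any(overlaps(span, c) for c in covered)
            is_subsuming = any(subsumes(span, c) for c in covered)
            if is_new or is_subsuming:
                added_points += 1
        if added_points > 0:
            chosen.append(name)
            covered.extend(matches.get(name, []))
    return chosen


# Illustrative data for:
# hondas in "excellent condition" in orem for under 12 grand
scores = {"Vehicle": 3.0, "Location": 1.0, "ContractualServices": 0.5}
matches = {
    "Vehicle": [(0, 6), (44, 58)],         # "hondas", "under 12 grand"
    "Location": [(35, 39)],                # "orem"
    "ContractualServices": [(44, 58)],     # "under 12 grand" only
}
print(build_ontology_set(scores, matches))
```

With this made-up data, ContractualServices contributes nothing new once Vehicle has matched the price constraint, so the sketch selects Vehicle and Location.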

17
Semantic Query Pre-Processing: Ontology Set Creation
Diagram: ontology set creation for the example query, showing the Vehicle, Location, and ContractualServices ontologies with their matches (Price < 12000, US_City orem) and the Vehicle_Score and ContractualServices_Score tallies.
18
Semantic Query Pre-Processing: Structured Query Generation
  • Open world assumption
  • SPARQL query
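
As a rough illustration of how an open world assumption might show up in a generated SPARQL query, the Python sketch below wraps every constraint in an `OPTIONAL` block, so a document that lacks a value is not excluded outright. Whether HyKSS generates queries this way is an assumption, and the `ex:` prefix, the `ex:Document` class, and the property names are hypothetical.

```python
from typing import List, Optional, Tuple

# (property name, comparison operator or None, literal value or None)
Constraint = Tuple[str, Optional[str], Optional[str]]


def build_sparql(constraints: List[Constraint],
                 prefix: str = "http://example.org/onto#") -> str:
    variables = ["?doc"] + [f"?{prop.lower()}" for prop, _, _ in constraints]
    lines = [
        f"PREFIX ex: <{prefix}>",
        f"SELECT {' '.join(variables)}",
        "WHERE {",
        "  ?doc a ex:Document .",
    ]
    for prop, op, value in constraints:
        var = f"?{prop.lower()}"
        pattern = f"?doc ex:{prop} {var} ."
        if op is not None:
            pattern += f" FILTER ({var} {op} {value})"
        # OPTIONAL keeps documents that do not satisfy this constraint.
        lines.append(f"  OPTIONAL {{ {pattern} }}")
    lines.append("}")
    return "\n".join(lines)


query = build_sparql([
    ("Make", "=", '"honda"'),
    ("Price", "<", "12000"),
    ("US_City", "=", '"orem"'),
])
print(query)
```

Because every constraint is optional, each returned row binds only the variables whose constraints the document satisfies, which is what a later ranking step can count.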

19
Semantic Query Execution and Post-Processing
  • Sesame query execution
  • Semantic ranking
    • 1 point for each requested projection satisfied
    • Normalized by the number of projections requested
  • Example: hondas in "excellent condition" in orem for under 12 grand
    • Projections on Make, Price, and US_City
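
The ranking rule above is simple enough to write out directly. In this minimal Python sketch, a result is modeled as a mapping from projection name to bound value, with missing or `None` entries treated as unsatisfied; that representation and the function name are assumptions made for illustration.

```python
from typing import Mapping, Sequence


def semantic_score(requested: Sequence[str],
                   result: Mapping[str, object]) -> float:
    """One point per requested projection the result satisfies,
    divided by the number of projections requested."""
    if not requested:
        return 0.0
    satisfied = sum(1 for name in requested
                    if result.get(name) is not None)
    return satisfied / len(requested)


requested = ["Make", "Price", "US_City"]
full_match = {"Make": "Honda", "Price": 9500, "US_City": "Orem"}
partial_match = {"Make": "Honda", "Price": 9500}

print(semantic_score(requested, full_match))     # 1.0
print(semantic_score(requested, partial_match))  # 2 of 3 satisfied
```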

20
Hybrid Query Processing
  • Linear interpolation
    • (kw_weight × kw_score) + (sm_weight × sm_score)
  • Dynamic solution
    • keywords remaining (kw)
    • concept match score (cms)
      • ½ × (selections + projections)
    • kw_weight = kw / (kw + cms)
    • sm_weight = cms / (kw + cms)
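
A small Python sketch of the dynamic weighting follows. The slide's operators were lost in transcription, so this sketch assumes that the concept match score is half the sum of selections and projections, and that the two weights are each term's share of kw plus cms, so that they sum to 1. The function names and the equal-weight fallback for an empty query are additions made for this example.

```python
from typing import Tuple


def dynamic_weights(kw: int, selections: int,
                    projections: int) -> Tuple[float, float]:
    """Return (kw_weight, sm_weight) for one query."""
    cms = 0.5 * (selections + projections)  # concept match score
    total = kw + cms
    if total == 0:
        # Nothing to weigh; fall back to equal weights (an assumption).
        return 0.5, 0.5
    return kw / total, cms / total


def hybrid_score(kw_score: float, sm_score: float,
                 kw_weight: float, sm_weight: float) -> float:
    # Linear interpolation of the keyword and semantic scores.
    return kw_weight * kw_score + sm_weight * sm_score


# Made-up query: 2 keywords remaining, 3 selections, 3 projections.
kw_weight, sm_weight = dynamic_weights(kw=2, selections=3, projections=3)
print(kw_weight, sm_weight)
print(hybrid_score(kw_score=0.5, sm_score=1.0,
                   kw_weight=kw_weight, sm_weight=sm_weight))
```

Under these assumptions, a query dominated by recognized concepts shifts weight toward the semantic score, and a query with no recognized concepts falls back to pure keyword ranking.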

21
Basic Search
22
Results Display
23
Form Based Search
24
Results Display
25
Experimental Setup Ontology Libraries
  • 5 Ontology Levels
  • Number
  • Generic Units
  • Vehicle Units
  • Vehicle
  • Vehicle

26
Experimental Setup Query Sets
  • 113 syntactically unique queries from database students
  • 60 syntactically unique queries from linguistics students

27
Experimental Setup Document Collection
  • 250 vehicle advertisements (Craigslist)
    • 100 training, 50 validation, 100 test
  • 318 mountain pages (Wikipedia)
  • 66 roller coaster pages (Wikipedia)
  • 88 video game advertisements (Craigslist)

28
Experiments
  1. Training queries over test vehicle documents
  2. Test queries over test vehicle documents
  3. Training queries over test vehicle documents plus additional noise
  4. Test queries over test vehicle documents plus additional noise
  5. 5 queries over noisy data (Generic Units only)

29
Experiments - Metric
  • Mean Average Precision
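
For reference, mean average precision can be computed as in the Python sketch below, which uses the standard definition: the average precision of each query's ranked list is averaged over all queries. The document IDs are made up, and each ranked list is assumed to contain no duplicates.

```python
from typing import Iterable, List, Set, Tuple


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of precision@k taken at each rank k holding a relevant doc."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
        runs: Iterable[Tuple[List[str], Set[str]]]) -> float:
    """Average of average_precision over (ranking, relevant set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


runs = [
    (["d1", "d2", "d3"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d2", "d1"], {"d1"}),              # AP = 1/2
]
print(mean_average_precision(runs))
```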

30
Experimental Results
31
Experimental Results
32
Experimental Results
33
Conclusions
  • Hybrid search outperforms keyword and semantic search
  • HyKSS's dynamic query weighting approach outperforms various other weighting techniques
  • Using multiple ontologies does not outperform selecting and using a single ontology

34
External Image Citations
  • Slide 2: Google search screenshot, http://www.google.com (07/30/11)
  • Slide 3: partial car search form screenshots, http://autotrader.com/fyc (07/30/11)
  • Slide 4: mountain image, http://en.wikipedia.org/wiki/Lhotse (04/26/11)
  • Slide 4: car image, http://en.wikipedia.org/wiki/Honda (04/26/11)
  • Slide 4: roller coaster image, http://en.wikipedia.org/wiki/Kingda_Ka (04/26/11)
  • Slide 4: Wikipedia logo, http://en.wikipedia.org/wiki/Main_Page (04/26/11)
  • Slide 4: craigslist logo, http://provo.craigslist.org/ (04/26/11)
