Lucene/Solr Architecture - PowerPoint PPT Presentation

Description:

(potentially completely custom architecture not using tokenizer/filters) ... Custom response writer. Request Handler (non-component based) /admin/luke ...

Slides: 6
Provided by: yon56
Learn more at: http://people.apache.org
Transcript and Presenter's Notes

1
Lucene/Solr Architecture
  • Request Handlers: /admin, /select, /spell
  • Update Handlers: XML, CSV, binary
  • Response Writers: XML, Binary, JSON
  • Extracting Request Handler (PDF/Word), using Apache Tika
  • Data Import Handler (SQL/RSS)
  • Search Components: Query, Highlighting, Spelling,
    Statistics, Faceting, Debug, More like this, Clustering
  • Update Processors: Signature, Logging, Indexing
  • Schema, Config, Query Parsing, Distributed Search
  • Analysis, Caching, Faceting, Highlighting, Filtering,
    Search, Index Replication
  • Apache Lucene: Core Search (IndexReader/Searcher),
    Indexing (IndexWriter), Text Analysis
2
Lucene/Solr plugins
  • RequestHandlers: handle a request at a URL such as
    /select
  • SearchComponents: parts of a SearchHandler, a
    componentized request handler
      • Include Query, Facet, Highlight, Debug, Stats
      • Distributed search capable
  • UpdateHandlers: handle an indexing request
  • Update Processor Chains: per-handler
    componentized chains that handle updates
  • Query Parser plugins
      • Mix and match query types in a single request
  • Function plugins for Function Query
  • Text Analysis plugins: Analyzers, Tokenizers,
    TokenFilters
  • ResponseWriters: serialize and stream the response to
    the client
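
The relationship between a request handler and its search components can be pictured with a small toy model. This is only a sketch of the idea, not Solr's Java API: the class `SearchHandler`, the component functions, and the three-document corpus are all hypothetical names invented for illustration.

```python
from typing import Callable, Dict, List

Request = Dict[str, str]
Response = Dict[str, object]
Component = Callable[[Request, Response], None]


def query_component(req: Request, rsp: Response) -> None:
    # Toy corpus (an assumption for illustration); a real engine would search an index.
    corpus = ["cheddar cheese", "blue cheese", "plain bread"]
    term = req.get("q", "")
    rsp["docs"] = [doc for doc in corpus if term in doc.split()]


def facet_component(req: Request, rsp: Response) -> None:
    # Runs only when requested, and counts the first word of each matched doc.
    if req.get("facet") != "true":
        return
    counts: Dict[str, int] = {}
    for doc in rsp.get("docs", []):
        key = doc.split()[0]
        counts[key] = counts.get(key, 0) + 1
    rsp["facets"] = counts


def debug_component(req: Request, rsp: Response) -> None:
    # Echoes the query back when debugging is switched on.
    if req.get("debug") == "true":
        rsp["debug"] = {"q": req.get("q", "")}


class SearchHandler:
    """Hypothetical componentized handler: each component sees the shared response."""

    def __init__(self, components: List[Component]) -> None:
        self.components = list(components)

    def handle(self, req: Request) -> Response:
        rsp: Response = {}
        for component in self.components:
            component(req, rsp)
        return rsp


# Handlers are registered by URL path, as in the /select example above.
handlers = {"/select": SearchHandler([query_component, facet_component, debug_component])}
result = handlers["/select"].handle({"q": "cheese", "facet": "true"})
```

Each component reads the request and adds its own section to one shared response, which is why components can be added or removed without touching the others.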

3
Lucene/Solr Query Plugin Architecture
Declarative Analysis per-field
  • Tokenizer to split text
  • TokenFilter to transform tokens
  • Analyzer for completely custom analysis
  • Separate query / index analyzer
QParser plugins
  • Support different query syntaxes
  • Support different query execution
  • Function Query supports pluggable custom functions
  • Excellent support for nesting/mixing different query
    types in the same request

    // declaratively defines types
    // and analyzers for fields
    <fieldType name="text1">
      <filter="whitespace">
      <filter="customFilter" >
      <filter="synonyms" file=..>
      <filter="porter" except=..>
    <field name="title" type="text1"
    <field name="cust1" class=

    < index configuration />
    < caching configuration />
    < request handler config />
    < search component config />
    < update processor config />
    < misc HTTP cache, JMX >
    <parser name="mycustom"
    <func name="custom" class=
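
The per-field analysis chain described on this slide (a tokenizer that splits text, followed by token filters that transform the tokens) can be sketched in a few lines. This is a toy model under stated assumptions, not Lucene's analysis API: every function name is hypothetical, the synonym map is made up, and `strip_plural_filter` is a deliberately crude stand-in for a Porter stemmer with an exception list.

```python
from typing import Callable, Dict, List, Set

TokenFilter = Callable[[List[str]], List[str]]


def whitespace_tokenizer(text: str) -> List[str]:
    # Tokenizer: split the raw text into tokens.
    return text.split()


def lowercase_filter(tokens: List[str]) -> List[str]:
    return [t.lower() for t in tokens]


def synonym_filter(synonyms: Dict[str, List[str]]) -> TokenFilter:
    # Emits each token followed by any synonyms configured for it.
    def apply(tokens: List[str]) -> List[str]:
        out: List[str] = []
        for t in tokens:
            out.append(t)
            out.extend(synonyms.get(t, []))
        return out
    return apply


def strip_plural_filter(protected: Set[str]) -> TokenFilter:
    # Crude stand-in for a stemmer: drops a trailing "s" unless the word is protected.
    def apply(tokens: List[str]) -> List[str]:
        return [
            t[:-1] if t.endswith("s") and len(t) > 3 and t not in protected else t
            for t in tokens
        ]
    return apply


def make_analyzer(tokenizer: Callable[[str], List[str]],
                  filters: List[TokenFilter]) -> Callable[[str], List[str]]:
    # An analyzer is the tokenizer followed by the filters, applied in order.
    def analyze(text: str) -> List[str]:
        tokens = tokenizer(text)
        for f in filters:
            tokens = f(tokens)
        return tokens
    return analyze


text1 = make_analyzer(
    whitespace_tokenizer,
    [lowercase_filter, synonym_filter({"big": ["large"]}), strip_plural_filter({"news"})],
)
tokens = text1("Big Cats read News")
# tokens == ["big", "large", "cat", "read", "news"]
```

Filter order matters here: the synonym map is keyed on lowercase words, so it only matches because the lowercase filter runs first.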
4
Lucene/Solr Request Plugins
  • Request: http://.../select?q=cheese&wt=json
  • Request handlers by path
      • /select: RequestHandler built from search components
        (Query Component, Facet Component, Highlight Component,
        Debug Component)
      • /admin/luke: Request Handler (non-component based)
      • /mypath: Request Handler (custom)
  • Query Response goes through a response writer
      • XML response writer
      • XSLT response writer
      • Binary response writer
      • JSON response writer
      • Custom response writer
  • Response docs are returned to the client
  • Additional plug-n-play search components
      • Spellcheck
      • TermVector
      • QueryElevation
      • MoreLikeThis
      • Statistics
      • Terms
      • My Custom
      • Clustering
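
The request URL on this slide combines a handler path (`/select`), a query (`q`), and a response writer choice (`wt`). A minimal sketch of assembling such a URL with Python's standard library follows. The helper name `select_url` is hypothetical, and the base URL `http://localhost:8983/solr` is an assumption, since the slide elides the host.

```python
from urllib.parse import urlencode


def select_url(base: str, query: str, writer: str = "json", **extra: str) -> str:
    """Build a /select URL; `q` is the query and `wt` names the response writer."""
    params = {"q": query, "wt": writer}
    params.update(extra)  # e.g. facet="true" to switch on another component
    return f"{base}/select?{urlencode(params)}"


# Base URL is an assumed local install, not taken from the slides.
url = select_url("http://localhost:8983/solr", "cheese")
# url == "http://localhost:8983/solr/select?q=cheese&wt=json"
```

Changing `writer` to `"xml"` would ask for the XML response writer instead, leaving the query itself untouched.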
5
Lucene/Solr Indexing
  • Input: PDF files or <doc> <title> documents, sent by HTTP POST
  • Update paths
      • /update: XML Update Handler
      • /update/csv: CSV Update Handler
      • /update/xml: XML Update with custom processor chain
      • /update/extract: Extracting RequestHandler (PDF, Word, ...)
  • Update Processor Chain (per handler)
  • Text Index Analyzers
  • Data Import Handler: database pull, RSS pull, simple transforms
      • RSS feed (pull)
      • SQL DB (pull)
  • Lucene: Lucene Index
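
As a rough illustration of the HTTP POST path into `/update`, the sketch below builds an XML add message and wraps it in a request object without sending it. The `<add><doc><field name="...">` layout is Solr's XML update format; the slide's `<doc> <title>` appears to be shorthand for it. The helper names and the base URL `http://localhost:8983/solr` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict
from urllib.request import Request


def build_add_xml(fields: Dict[str, str]) -> bytes:
    """Serialize one document as an <add><doc>...</doc></add> message."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="utf-8")


def build_update_request(base: str, fields: Dict[str, str]) -> Request:
    """Prepare (but do not send) an HTTP POST to the /update handler."""
    return Request(
        f"{base}/update",
        data=build_add_xml(fields),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


# Assumed local base URL; pass the request to urllib.request.urlopen to send it.
req = build_update_request("http://localhost:8983/solr", {"id": "1", "title": "Cheese guide"})
```

Nothing is transmitted until the request is opened, so the body and headers can be inspected first. Documents added this way typically become searchable only after a commit.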