Title: Lucene/Solr Architecture
1. Lucene/Solr Architecture
Architecture diagram (components shown):
- Request Handlers: /admin, /select, /spell
- Update Handlers: XML, CSV, binary
- Response Writers: XML, Binary, JSON
- Extracting Request Handler (PDF/Word)
- Apache Tika
- Search Components: Query, Highlighting, Spelling, Statistics, Faceting, Debug, More like this, Clustering
- Update Processors: Signature, Logging, Indexing
- Query Parsing
- Distributed Search
- Schema
- Config
- Data Import Handler (SQL/RSS)
- Analysis, Caching, Faceting, Highlighting, Filtering, Search, Index Replication
- Apache Lucene: Core Search (IndexReader/Searcher), Indexing (IndexWriter), Text Analysis
2. Lucene/Solr plugins
- RequestHandlers: handle a request at a URL like /select
  - SearchComponents: part of a SearchHandler, a componentized request handler
    - Includes Query, Facet, Highlight, Debug, Stats
    - Distributed Search capable
- UpdateHandlers: handle an indexing request
- Update Processor Chains: per-handler componentized chain that handles updates
- Query Parser plugins
  - Mix and match query types in a single request
  - Function plugins for Function Query
- Text Analysis plugins: Analyzers, Tokenizers, TokenFilters
- ResponseWriters: serialize and stream the response to the client
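To make "componentized request handler" concrete, here is a minimal Python sketch. It is not Solr's Java API: the component functions, the `handle_request` helper, and the in-memory `DOCS` list are all hypothetical names invented for illustration. The only idea taken from the slide is that a handler runs a configurable list of components (Query, Facet, and so on), each contributing its own section of the response.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a search component: it receives the request
# parameters and the response built so far, and adds its own section.
Component = Callable[[Dict[str, str], Dict[str, object]], None]

# Toy in-memory "index" (illustrative data, not from the slides).
DOCS = [
    {"id": "1", "title": "cheddar cheese", "type": "food"},
    {"id": "2", "title": "blue cheese", "type": "food"},
    {"id": "3", "title": "cheese knife", "type": "tool"},
]


def query_component(params: Dict[str, str], response: Dict[str, object]) -> None:
    # Naive substring match; a real engine would consult an inverted index.
    term = params.get("q", "")
    response["docs"] = [d for d in DOCS if term in d["title"]]


def facet_component(params: Dict[str, str], response: Dict[str, object]) -> None:
    # Count matching documents per value of the requested facet field.
    field = params.get("facet.field")
    if field is None:
        return
    counts: Dict[str, int] = {}
    for doc in response.get("docs", []):
        counts[doc[field]] = counts.get(doc[field], 0) + 1
    response["facets"] = {field: counts}


def handle_request(params: Dict[str, str],
                   components: List[Component]) -> Dict[str, object]:
    # The handler itself only runs the configured components in order.
    response: Dict[str, object] = {}
    for component in components:
        component(params, response)
    return response


result = handle_request({"q": "cheese", "facet.field": "type"},
                        [query_component, facet_component])
print(result["facets"])  # {'type': {'food': 2, 'tool': 1}}
```

Adding a new capability in this sketch means appending another function to the component list, which is the plug-in property the slide describes.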
3. Lucene/Solr Query Plugin Architecture
Declarative Analysis per-field
- Tokenizer to split text
- TokenFilter to transform tokens
- Analyzer for completely custom analysis
- Separate query / index analyzer
QParser plugins
- Support different query syntaxes
- Support different query execution
- Function Query supports pluggable custom functions
- Excellent support for nesting/mixing different query types in the same request
// declaratively defines types
// and analyzers for fields
<fieldType name="text1">
  <filter="whitespace">
  <filter="customFilter">
  <filter="synonyms" file=..>
  <filter="porter" except=..>
<field name="title" type="text1"
<field name="cust1" class=

< index configuration />
< caching configuration />
< request handler config />
< search component config />
< update processor config />
< misc HTTP cache, JMX >
<parser name="mycustom"
<func name="custom" class=
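The declarative chain above (split on whitespace, then apply token filters such as synonyms and a stemmer with exceptions) can be mimicked in a few lines of Python. This is only a sketch of the tokenizer-plus-filters idea, not Lucene's analysis API. Every function name here is hypothetical, the lowercase filter stands in for the slide's unspecified custom filter, and the suffix stripper is a toy replacement for the Porter stemmer, which it does not implement.

```python
from typing import Callable, Dict, List, Set

TokenFilter = Callable[[List[str]], List[str]]


def whitespace_tokenizer(text: str) -> List[str]:
    # Tokenizer: split raw text into tokens on runs of whitespace.
    return text.split()


def lowercase_filter(tokens: List[str]) -> List[str]:
    # Stand-in for a custom filter: normalize case.
    return [t.lower() for t in tokens]


def make_synonym_filter(synonyms: Dict[str, str]) -> TokenFilter:
    # Replace each token by its mapped synonym, if one is configured.
    def synonym_filter(tokens: List[str]) -> List[str]:
        return [synonyms.get(t, t) for t in tokens]
    return synonym_filter


def make_suffix_stripper(exceptions: Set[str]) -> TokenFilter:
    # Toy stemmer (NOT Porter): drop a trailing "s" from tokens longer
    # than three characters, unless the token is listed as an exception.
    def strip(tokens: List[str]) -> List[str]:
        out = []
        for t in tokens:
            if t not in exceptions and len(t) > 3 and t.endswith("s"):
                t = t[:-1]
            out.append(t)
        return out
    return strip


def analyze(text: str, tokenizer: Callable[[str], List[str]],
            filters: List[TokenFilter]) -> List[str]:
    # Run the tokenizer, then each filter in the declared order.
    tokens = tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


stemmer = make_suffix_stripper({"news"})
# Separate chains for index time and query time, as the slide allows.
index_chain = [lowercase_filter, make_synonym_filter({"tv": "television"}), stemmer]
query_chain = [lowercase_filter, stemmer]

print(analyze("TV News Shows", whitespace_tokenizer, index_chain))
# ['television', 'news', 'show']
print(analyze("TV shows", whitespace_tokenizer, query_chain))
# ['tv', 'show']
```

Note that the two chains disagree on "TV": expanding synonyms on only one side is a design decision with real consequences for matching, which is why index-time and query-time analyzers are configured separately.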
4. Lucene/Solr Request Plugins
Example request: http://.../select?q=cheese&wt=json
Request handlers by path:
- /select: RequestHandler
- /admin/luke: Request Handler (non-component based)
- /mypath: Request Handler (custom)
Search components: Query Component, Facet Component, Highlight Component, Debug Component
Response writers: XML, XSLT, Binary, JSON, Custom
Data flow: query in, response (docs) out through a response writer
Additional plug-n-play search components: Spellcheck, TermVector, QueryElevation, MoreLikeThis, Statistics, Terms, Clustering, My Custom
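The request on this slide, `/select?q=cheese&wt=json`, can be assembled and its JSON result read with the Python standard library. The sketch below makes several assumptions: the host in `BASE_URL` is a placeholder (the slide elides it), `build_select_url` is a hypothetical helper, and `SAMPLE_RESPONSE` is a hand-written example of the general shape of a JSON response, not output captured from a server. Nothing is sent over the network.

```python
import json
from typing import Dict, Optional
from urllib.parse import urlencode

# Placeholder host: the slide elides the real one.
BASE_URL = "http://solr.example.invalid/solr"


def build_select_url(base_url: str, query: str, writer: str = "json",
                     extra: Optional[Dict[str, str]] = None) -> str:
    # "q" carries the query; "wt" selects the response writer.
    params = {"q": query, "wt": writer}
    params.update(extra or {})
    return base_url + "/select?" + urlencode(params)


url = build_select_url(BASE_URL, "cheese")
print(url)  # http://solr.example.invalid/solr/select?q=cheese&wt=json

# Extra parameters switch on further search components, e.g. faceting.
faceted = build_select_url(BASE_URL, "cheese",
                           extra={"facet": "true", "facet.field": "type"})
print(faceted)

# Hand-written sample of a JSON response body (illustrative only).
SAMPLE_RESPONSE = """
{"responseHeader": {"status": 0},
 "response": {"numFound": 1, "start": 0,
              "docs": [{"id": "1", "title": "cheddar cheese"}]}}
"""
titles = [doc["title"] for doc in json.loads(SAMPLE_RESPONSE)["response"]["docs"]]
print(titles)  # ['cheddar cheese']
```

Changing only the `wt` value (for example to `xml`) selects a different response writer while the query handling stays the same, which is the separation the diagram shows.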
5. Lucene/Solr Indexing
Indexing inputs and endpoints:
- PDF and XML (<doc> <title>) documents arrive via HTTP POST
- /update: XML Update Handler
- /update/csv: CSV Update Handler
- /update/xml: XML Update with custom processor chain
- /update/extract: Extracting RequestHandler (PDF, Word, ...)
- Data Import Handler: database pull, RSS pull, simple transforms
  - Sources: RSS feed (pull), SQL DB (pull)
Pipeline: Update Processor Chain (per handler), then Text Index Analyzers, then Lucene, which writes the Lucene Index
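As an illustration of the XML path into `/update`, the sketch below builds an `<add><doc>` message and wraps it in an HTTP POST request object using only the Python standard library. The host in `UPDATE_URL` is a placeholder, the helper names are hypothetical, and the document fields are made up. The request is constructed but never sent.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import Dict, List

# Placeholder endpoint; substitute a real server to actually post.
UPDATE_URL = "http://solr.example.invalid/solr/update"


def build_add_message(docs: List[Dict[str, str]]) -> str:
    # One <doc> per document, one <field name="..."> per field.
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    # ElementTree escapes special characters such as "&" in field text.
    return ET.tostring(add, encoding="unicode")


def build_update_request(url: str, body: str) -> urllib.request.Request:
    # Builds the request object only; no network traffic happens here.
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


body = build_add_message([{"id": "1", "title": "Cheese & crackers"}])
print(body)
# <add><doc><field name="id">1</field><field name="title">Cheese &amp; crackers</field></doc></add>

request = build_update_request(UPDATE_URL, body)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would perform the POST. After that, the update processor chain and index analyzers shown on the slide take over on the server side.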