SQL Server FullText Search Using fulltext search in SQL Server 2005 - PowerPoint PPT Presentation

1 / 30
About This Presentation
Title:

SQL Server FullText Search Using fulltext search in SQL Server 2005

Description:

Catalogs can be set to populate continuously, track changes, or index when the ... can be used with data stored in BLOB fields, such as Word and PDF documents ... – PowerPoint PPT presentation

Number of Views:354
Avg rating:3.0/5.0
Slides: 31
Provided by: edward129
Category:

less

Transcript and Presenter's Notes

Title: SQL Server FullText Search Using fulltext search in SQL Server 2005


1
SQL Server Full-Text SearchUsing full-text
search in SQL Server 2005
  • Edward Forgács ltEdwardForgacs_at_ssw.com.augt

2
What is full text search?
  • Allows language-specific queries of text data
    stored in SQL Server
  • char
  • varchar
  • nchar
  • nvarchar
  • image (containing DOC or PDF data)
  • Supported by all editions ofSQL Server (except
    Express)

3
  • What is wrong with LIKE queries for searching?

4
Results of LIKE
  • Only one result returned
  • Does not match
  • "email"
  • "e-mails"
  • "emails"

5
Why use full text search?
  • Much more powerful than LIKE
  • Used correctly, will produce more specific,
    relevant results
  • Better performance LIKE queries are designed
    for small amounts of text data, full-text search
    scales to huge documents
  • Provides ranking of results
  • Common uses
  • Search through the content in a text-intensive,
    database driven website, e.g. a knowledge base
  • Search the contents of documents stored in BLOB
    fields
  • Perform advanced searches
  • e.g. with exact phrases - "to be or not to be"
    (however needs care!)
  • e.g. Boolean operators - AND, OR, NOT, NEAR

6
New for FTS in 2005
  • XML support
  • Linked server support for queries
  • Integrated backup, restore and recovery
  • Faster queries and index building
  • Data definition language (DDL) statements for
    creating and altering indexes
  • System stored procedures deprecated
  • Noise Insensitivity noise words no longer break
    the query
  • Accent Insensitivity (optional) e.g. café and
    cafe are the same
  • Multiple columns can be included in full-text
    searches
  • Pre-computed ranking optimizations when using
    FREETEXTTABLE
  • Improved ranking algorithm
  • Catalogs can be set to populate continuously,
    track changes, or index when the CPU is idle

7
Setting up full-text search
  • Enable FTS for the database check "Use
    full-text indexing" on the Files page, orEXEC
    sp_fulltext_database 'enable'
  • Create catalogs for the required tables and
    select columns to index

8
  • Demo Setting up FTSon a database

9
FREETEXT query
  • SyntaxSELECT ColumnsFROM TableWHEREFREETEXT((C
    olumn1, Column2), 'word1 word2')orFREETEXT(,
    'word1, word2')
  • Returns rows where the specified columns contain
    ANY form of ANY of the specified words like
    AltaVista
  • Usually returns hundreds of irrelevant results,
    often poorly ranked
  • Quick and dirty really not suitable for
    production usage

10
FREETEXTTABLE query
  • SyntaxSELECT Columns, KEY_TBL.RANKFROM
    CategoriesINNER JOIN FREETEXTTABLE(Categories,
    Column1, Column2, 'word1 word2') AS KEY_TBL ON
    Categories.CategoryID KEY_TBL.KEYORDER BY
    KEY_TBL.RANK DESC
  • Produces the same results as a FREETEXT query,
    but additionally provides ranking information

11
CONTAINS query
  • SyntaxSELECT Column1, Column2FROM TableWHERE
    CONTAINS((Column1, Column2), '"exact phrase" AND
    FORMSOF(INFLECTIONAL, "word")')
  • Special full-text query syntax allows you to
    utilize the power of full-text search
  • Allows far more sophistocated queries, with
    boolean operators and exact phrases
  • Customizable word matching
  • Produces far more specific results than FREETEXT

12
CONTAINSTABLE query
  • SyntaxSELECT ColumnsFROM CategoriesINNER JOIN
    CONTAINSTABLE(Categories, Column, 'FORMSOF(INFLEC
    TIONAL, "word")') AS KEY_TBLON
    Categories.CategoryID KEY_TBL.KEYORDER BY
    KEY_TBL.RANK DESC
  • Returns the same results as a CONTAINS query, but
    with ranking information
  • Necessary to sort results by relevance

13
  • Demo Running a simple
  • full-text search query

14
Writing FTS terms
  • The power of FTS is in the expression which is
    passed to the CONTAINS or CONTAINSTABLE function
  • Several different types of terms
  • Simple terms
  • Prefix terms
  • Generation terms
  • Proximity terms
  • Weighted terms

15
Simple terms
  • Either words or phrases
  • Quotes are optional, but recommended
  • Matches columns which contain the exact words or
    phrases specified
  • Case insensitive
  • Punctuation is ignored
  • e.g.
  • CONTAINS(Column, 'SQL')
  • CONTAINS(Column, ' "SQL" ')
  • CONTAINS(Column, 'Microsoft SQL Server')
  • CONTAINS(Column, ' "Microsoft SQL Server" ')

16
Prefix terms
  • Matches words beginning with the specified text
  • e.g.
  • CONTAINS(Column, ' "local" ')
  • matches local, locally, locality
  • CONTAINS(Column, ' "local wine" ')
  • matches "local winery", "locally wined"

17
Generation terms
  • Two types
  • Inflectional
  • FORMSOF(INFLECTIONAL, "expression")
  • Thesaurus
  • FORMSOF(THESAURUS, "expression")
  • Both return variants of the specified word, but
    variants are determined differently

18
Inflectional
  • Matches plurals and words which share the same
    stem
  • When vague words such as "best" are used, doesn't
    match the exact word, only "good"

19
Thesaurus
  • Supposed to match synonyms of search terms but
    the thesaurus seems to be very limited
  • Does not match plurals
  • Not particularly useful

20
Proximity terms
  • SyntaxCONTAINS(Column, 'local NEAR
    winery')CONTAINS(Column, ' "local" NEAR "winery"
    ')
  • Matches words which are NEAR each other
  • Terms on either side of NEAR must be either
    simple or proximity terms

21
Weighted terms
  • Each word can be given a rank
  • Can be combined with simple, prefix, generation
    and proximity terms
  • e.g.
  • CONTAINS(Column, 'ISABOUT( performance
    weight(.8), comfortable weight(.4))')
  • CONTAINS(Column, 'ISABOUT( FORMSOF(INFLECTIONAL,
    "performance") weight (.8), FORMSOF(INFLECTIONAL,
    "comfortable") weight (.4))')

22
  • Demo Using simple, prefix and generation terms

23
  • Demo Using CONTAINSTABLE

24
FTS with BLOB fields
  • Full text search can be used with data stored in
    BLOB fields, such as Word and PDF documents
  • Two columns are necessary in the table
  • A BLOB column, such as IMAGE or VARBINARY, which
    holds the data to search
  • A data type column, e.g. VARCHAR(3), which holds
    the file extension to specify the type
  • When the catalog is created, the data type field
    also needs to be specified
  • Needs to be used with caution to avoid a massive
    performance hit

25
  • Demo Creating and searching FTS catalog with a
    BLOB field

26
Managing Population Schedules
  • In SQL 2000, full text catalogs could only be
    populated on specified schedules
  • SQL 2005 can track database changes and keep the
    catalog up to date, with a minor performance hit
  • Additionally, schedules can now be set to run
    when the CPU is idle
  • Immediate population only accessible from the
    Catalog Properties dialog

27
  • Demo Creating and managing population schedules

28
Backing up full text catalogs
  • Full text catalogs are now included in SQL
    backups by default, and are retained when a
    database is detached and re-attached
  • Option in detach dialog to include keep the full
    text catalog

29
Disadvantages
  • Full text catalogs consume large amounts of disk
    space and must be kept up-to-date
  • If catalogs arent set to update continuously,
    results can be out-of-date
  • Continuous updating results in a performance hit
  • Queries are complicated to generate and must be
    generated as a string on the client
  • Requires care in stored procedures full text
    clauses are always evaluated, regardless of OR
    operators
  • Contextual information not provided with any
    full-text search functions

30
SQL Server Full-Text SearchUsing full-text
search in SQL Server 2005
  • Edward Forgács ltEdwardForgacs_at_ssw.com.augt
Write a Comment
User Comments (0)
About PowerShow.com