Skip to content

Generate MS SQL server full-text-search syntax from more user-friendly search syntax. Attempts to gracefully handle syntax errors.

Notifications You must be signed in to change notification settings

marcwittke/Easy-Full-Text-Search

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Easy Full Text Search

Easy Full Text Search is a .NET class library to convert a user-friendly, Google-like search term into a valid Microsoft SQL Server full-text-search query. The code attempts to gracefully handle all syntax errors.

Details

Microsoft SQL Server provides a powerful full-text search feature. However, the syntax is rather cryptic, especially for non-programmers. Moreover, there are many conditions that can cause SQL Server to throw up an error if things aren't exactly right.

Easy Full Text Search converts a user-friendly, Google-like search term to the corresponding full-text search SQL query. Its goal is to never throw exceptions on badly formed input. It attempts to simply construct a valid query as best it can, regardless of the input.

Input Syntax

The following list shows how various input syntaxes are interpreted.

Term Meaning
abc Find inflectional forms of abc.
~abc Find thesaurus variations of abc.
+abc Find exact term abc.
"abc" Find exact term abc.
abc* Finds words that start with abc.
-abc Do not include results that contain inflectional forms of abc.
abc def Find inflectional forms of both abc and def.
abc or def Find inflectional forms of either abc or def.
<abc def> Find inflectional forms of abc near def.
abc and (def or ghi) Find inflectional forms of both abc and either def or ghi.

Prevent SQL Server Errors

Another goal of Easy Full Text Search is to always produce a valid SQL query. While the expression tree may be properly constructed, it may represent a query that is not supported by SQL Server. After constructing the expression tree, the code traverse the tree and takes steps to correct any conditions that would cause SQL Server to throw an error

Term Action Taken
NOT term1 AND term2 Subexpressions swapped.
NOT term1 Expression discarded.
NOT term1 AND NOT term2 Expression discarded if node is grouped (parenthesized) or is the root node; otherwise, the parent node may contain another subexpression that will make this one valid.
term1 OR NOT term2 Expression discarded.
term1 NEAR NOT term2 NEAR conjunction changed to AND.

Example

EasyFullTextSearch fts = new EasyFullTextSearch();
string query = fts.ToFtsQuery(searchTerm);

Stop Words (Noise Words)

One thing to be aware of is SQL Server's handling of stop words. Stop words are words such as a, and, and the. These words are not included in the full-text index. SQL Server does not index these words because they are very common and don't really add to the quality of the search. Since these words are not indexed, SQL Server will never find a match for these words. The result is that a search for a stop word will return no results, even though that stop word may appear in your articles.

The best way to handle this seems to be to exclude these words from the SQL query. Easy Full Text Search allows you to do this by adding stop words to the StopWords collection property. Stop words will not be included in the resulting query unless they are quoted, thereby preventing stop words in the query from blocking all results.

Alternatively, SQL Server provides an option for preventing the issue described above. The transform noise words option can be used to enable SQL Server to return matches even when the query contains a stop word (noise word). Set this option to 1 to enable noise word transformation.

More Information

For more information and a discussion of the code, please see my article Easy Full-Text Search Queries.

About

Generate MS SQL server full-text-search syntax from more user-friendly search syntax. Attempts to gracefully handle syntax errors.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages