Help:Full-text search

Semantic MediaWiki adds experimental support for accessing the full-text capabilities of the relational database (SQL back-end) for properties whose datatypes store their values as strings of characters or text in the database tables.

Requirements

 * Semantic MediaWiki 2.5.0+
 * MySQL 5.5+CiteRef::gh:smw:1481, MariaDB 10.0.5+CiteRef::gh:smw:1481, or SQLite 3.8+CiteRef::gh:smw:1801 as the database back-end
 * PHP 5.5+

Features and limitations

 * General notes
 * A dedicated search table aggregates the content of the supported datatypes; searches are executed against the index of this table.
 * Supported operations rely on the relational backend database (MySQL, MariaDB and SQLite)
 * For MySQL and MariaDB databases,  is used as default search mode
 * Relevance scores are not used for sorting, so results are not ordered by best match
 * The feature relies on the "onoi/tesa" libraryCiteRef::url:onoi-tesa to sanitize text and string elements, to provide some text manipulation support, and to offer language detection if enabled. This library is pre-installed for use by Semantic MediaWiki.
 * Custom stopwords are applied by the "onoi/tesa" libraryCiteRef::url:onoi-tesa only if language detection is enabled. MySQL and MariaDB, however, provide their own standard stopword listCiteRef::mysql:fulltext:stopwords, which is enabled by default.
 * Starting with :
 * If the  option to  is enabled the full-text search only comes into effect for selections using the comparators   and  .CiteRef::gh:smw:2499:307624826
 * is used instead of a socket connection via a special page to invoke extra "work" after an update has been completed as part of an independent transaction.CiteRef::gh:smw:3318 See also.
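
As a rough illustration of what relying on the database's own full-text capabilities means for an SQLite back-end, the following sketch builds a small full-text table and queries it. It is not Semantic MediaWiki's schema or code: the table name `ft_demo`, its columns, and the choice of SQLite's FTS4 module are assumptions made for the example. Results are ordered by subject rather than by relevance, in line with the note above that scores are not used for sorting.

```python
import sqlite3
from typing import Iterable, List, Tuple


def build_demo_index(rows: Iterable[Tuple[str, str]]) -> sqlite3.Connection:
    """Create an in-memory FTS4 table (hypothetical name: ft_demo) and fill it."""
    conn = sqlite3.connect(":memory:")
    # Assumption: the SQLite build bundled with Python has the FTS4 module enabled.
    conn.execute("CREATE VIRTUAL TABLE ft_demo USING fts4(subject, body)")
    conn.executemany("INSERT INTO ft_demo (subject, body) VALUES (?, ?)", rows)
    return conn


def search(conn: sqlite3.Connection, term: str) -> List[str]:
    """Return the subjects whose body matches the term, sorted by subject."""
    cursor = conn.execute(
        "SELECT subject FROM ft_demo WHERE body MATCH ? ORDER BY subject",
        (term,),
    )
    return [row[0] for row in cursor]


rows = [
    ("Page A", "Full-text search in relational databases"),
    ("Page B", "Semantic annotations and properties"),
    ("Page C", "A search index speeds up text lookups"),
]
conn = build_demo_index(rows)
print(search(conn, "search"))
```

The matching itself is delegated entirely to the database engine, which is why the supported operations differ between MySQL, MariaDB and SQLite.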


 * Notes on Chinese, Japanese, and Korean support (CJK)
 * General CJK support is challenging because text has to be broken into tokens that are not separated by spaces
 * The "onoi/tesa" libraryCiteRef::url:onoi-tesa provides some simple 's which do not require language detection and try to provide rudimentary CJK search out of the box. This, however, requires ICU 54+, which is still not used by MediaWiki as of version 1.29-alpha.
 * Mroonga is a MySQL storage engine that is said to provide CJK-ready full-text search and a column store
 * MySQL comes with an optional ngram Full-Text Parser and MeCab Full-Text Parser Plugin.
 * According to this issue, MariaDB is missing those parser plug-ins
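
The tokenization problem described in the CJK notes can be illustrated with a minimal n-gram sketch. This is only an illustration of the general idea behind n-gram parsing, not the implementation used by MySQL's ngram parser or by the "onoi/tesa" library; the function name `ngram_tokens` and the default size of 2 are assumptions for the example.

```python
from typing import List


def ngram_tokens(text: str, n: int = 2) -> List[str]:
    """Split text into overlapping character n-grams.

    Whitespace still separates chunks; within a chunk, every run of
    n consecutive characters becomes one token. Chunks shorter than n
    are kept whole (a choice made for this sketch).
    """
    tokens: List[str] = []
    for chunk in text.split():
        if len(chunk) < n:
            tokens.append(chunk)
            continue
        for start in range(len(chunk) - n + 1):
            tokens.append(chunk[start:start + n])
    return tokens


print(ngram_tokens("全文検索"))
```

Because the tokens overlap, a query for any two adjacent characters can be matched against the index without knowing where the word boundaries are.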

Configuration

 * − Allows to enable the feature
 * − Allows to throttle the number of expected index updates
 * − Allows to set database related options
 * − Allows to describe the minimum word/token length
 * − Allows to detect a language (experimental setting)
 * − Allows to list datatypes that should be indexed
 * − Allows to list properties that should not be indexed
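
To give a feel for how a minimum token length and a stopword list affect what ends up in a search index, here is a small sketch. It does not use Semantic MediaWiki's settings or the "onoi/tesa" API: the function `index_tokens`, the parameter `min_token_size`, and the stopword set are hypothetical names and values chosen for the example.

```python
import re
from typing import List, Set

# Hypothetical stopword list, for illustration only.
STOPWORDS: Set[str] = {"a", "and", "the", "of"}


def index_tokens(text: str, min_token_size: int = 3,
                 stopwords: Set[str] = STOPWORDS) -> List[str]:
    """Lowercase the text, split it into word tokens, and drop tokens
    that are stopwords or shorter than min_token_size."""
    words = re.findall(r"\w+", text.lower())
    return [w for w in words
            if len(w) >= min_token_size and w not in stopwords]


print(index_tokens("The art of full-text search in a wiki"))
```

Tokens removed at this stage cannot be found later, so a query consisting only of short words or stopwords would return no matches in such a setup.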

Usage and instructions

 * for users
 * Searching contains some examples and descriptions about the available search syntax
 * for system administrators
 * Indexing describes some methods on how to manually create and update the index table
 * for developers
 * Technical notes provides some information on the technical implementation, fine-tuning, and performance