Fulltext/Indexing

From semantic-mediawiki.org

The index is only updated when a change is propagated, so the initial index has to be built manually with the rebuildFulltextSearchTable.php script [1], which prepares and creates the initial index content.

Manual indexing

The simplest way to do that is by running:

php maintenance/rebuildFulltextSearchTable.php --quick -v

This produces output similar to the example below, in which the script reports the configuration in use, including which datatypes are enabled and which properties are excluded. It also reports the version of onoi/tesa, the library that provides some of the text sanitization functions.

$ php maintenance/rebuildFulltextSearchTable.php --quick -v

The script rebuilds the search index from property tables that
support a fulltext search. Any change of the index rules (altered
stopwords, new stemmer etc.) and/or a newly added or altered table
requires to run this script again to ensure that the index complies
with the rules set forth by the DB or Sanitizer.

## Configuration

- ICU (Intl) PHP-extension         54.1
- Tesa::Sanitizer                  0.2
- Tesa::Transliterator             0.2
- Tesa::LanguageDetector           (Disabled)
- DataTypes (Indexable)            BLOB, URI, WIKIPAGE

The following properties are exempted from the indexing process.

- _ASKFO, _ASKST, _IMPO, _LCODE, _UNIT, _CONV, _TYPE, _ERRT, _INST
- _ASK, _SOBJ, ___EUSER, ___CUSER, ___SUBP, ___EXIFDATA, __sci_cite
- __sil_iwl_lang, __sil_ill_lang

## Indexing

The entire index table is going to be purged first and
it may take a moment before the rebuild is completed due to
dependencies on table content including varying options.

The index table was purged.

Rebuilding the text index from (rows finished/expected):

- smw_di_blob                       100% (2387/2387)
- smw_di_uri                        100% (1990/1990)

You may also run the script with the --optimize option, which in the case of MySQL ensures that "... builds the table to update index statistics and free unused space in the clustered index ..." [2]:

php maintenance/rebuildFulltextSearchTable.php --quick --optimize
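
The two commands above can be chained in a small wrapper. The following is only a sketch: the function name rebuild_fulltext_index is hypothetical, it assumes the shell is in the directory that contains maintenance/, and it simply runs the two invocations shown on this page one after the other, which may be more than a given setup needs.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Semantic MediaWiki): rebuild the
# full-text index, then ask the database to optimize the index table.
# Assumes the current directory contains maintenance/.

rebuild_fulltext_index() {
    script="maintenance/rebuildFulltextSearchTable.php"

    # Fail early if the script is not where the commands above expect it.
    if [ ! -f "$script" ]; then
        echo "Not found: $script" >&2
        return 1
    fi

    # Step 1: purge and rebuild the index, with verbose output.
    php "$script" --quick -v || return 1

    # Step 2: run the optimize step exactly as shown above.
    php "$script" --quick --optimize
}
```

Calling rebuild_fulltext_index stops with a non-zero status if the script is missing or if the rebuild step fails, so the optimize step only runs after a successful rebuild.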

References

  1. Semantic MediaWiki: GitHub pull request #2142
  2. Optimizing InnoDB Full-Text Indexes notes "Running OPTIMIZE TABLE on a table with a full-text index rebuilds the full-text index, removing deleted Document IDs and consolidating multiple entries for the same word, where possible."