Help:Configuration parameter "$smwgFulltextLanguageDetection"


Title: $smwgFulltextLanguageDetection
Description: Sets which languages to detect for the full-text search from an indexable text
Default setting:
Software: Semantic MediaWiki
First version supported:
Last version supported: still available
Configuration: Full-text search
Keyword: full-text search

$smwgFulltextLanguageDetection is a configuration parameter that determines which languages the full-text search tries to detect in an indexable text. It was introduced in Semantic MediaWiki

Important note: Using the feature connected to this configuration parameter is experimental!
Note: This setting only takes effect if the full-text search feature has been enabled.

Default setting

$smwgFulltextLanguageDetection = array( );

This means that language detection is disabled by default.

Available language detectors

  • TextCatLanguageDetector: Allows for "N-Gram-Based Text Categorization" via TextCat [2] and relies on the "wikimedia-textcat" utility [3].
  • CdbNGramLanguageDetector: Allows for "N-Gram-Based Text Categorization" via the "constant database" [2][4].

Changing the default setting

Important note: After changing the content of this configuration parameter, you must run the "rebuildFulltextSearchTable.php" maintenance script.

To modify this configuration setting, add one of the following assignments to your "LocalSettings.php" file after the enableSemantics() call:

Allow major Western European languages to be detected:
$smwgFulltextLanguageDetection = array(
	'TextCatLanguageDetector' => array( 'en', 'de', 'fr', 'es' )
);

Allow major East Asian languages to be detected (MySQL 5.7+):
$smwgFulltextLanguageDetection = array(
	'TextCatLanguageDetector' => array( 'ja', 'zh', 'ko' )
);

  • A long list of languages slows down language detection on free text, so add languages sparingly.
  • This configuration parameter should only hold one language detector at a time.
  • Stopwords are only applied after language detection has been enabled.
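
The notes above can be put together in a minimal "LocalSettings.php" sketch. It assumes that the full-text search is switched on through the $smwgEnabledFulltextSearch parameter; the domain passed to enableSemantics() and the choice of languages are placeholders for illustration only, not recommended values.

```php
<?php
// Sketch of a LocalSettings.php fragment; adapt the values to your wiki.

// Placeholder domain, replace with your own.
enableSemantics( 'example.org' );

// Language detection only takes effect when the full-text search
// feature is enabled (assumed here to be done via this parameter).
$smwgEnabledFulltextSearch = true;

// Exactly one detector, with a deliberately short language list
// to keep the detection overhead low.
$smwgFulltextLanguageDetection = array(
	'TextCatLanguageDetector' => array( 'en', 'de' )
);
```

After saving the change, run the "rebuildFulltextSearchTable.php" maintenance script so that the existing index reflects the new detection settings. The script is typically found in the "maintenance" directory of the Semantic MediaWiki extension, although the exact path depends on the installation.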

See also