ElasticStore

From semantic-mediawiki.org
The ElasticStore is the component that establishes a connection between Semantic MediaWiki and an Elasticsearch cluster.

The ElasticStore was introduced as part of Semantic MediaWiki 3.0 [1] to provide a powerful and scalable QueryEngine that better serves enterprise and wiki-farm users by moving query-heavy computation to an external entity (that is, one separated from the main DB master/replica) known as Elasticsearch [2].

Requirements | Features | Setup | Usage | Settings | Technical notes | FAQ

The ElasticStore provides a framework that replicates Semantic MediaWiki related data to an Elasticsearch (ES) cluster and enables the QueryEngine to send #ask requests to, and retrieve information from, Elasticsearch instead of the default SQLStore.

The objective is to provide an interface to Elasticsearch to:

  • improve structured (and allow unstructured) content searches
  • extend and improve full-text query support (including sorting of results by relevance)
  • provide a path to scalability by relying on the ES infrastructure

Requirements

  • Elasticsearch: 6.1+ recommended, tested with 5.6.6
  • Semantic MediaWiki: 3.0+
  • elasticsearch/elasticsearch (PHP client library): ~6.0 with PHP ^7.0, or ~5.3 with PHP ^5.6.6

The ElasticStore relies on the elasticsearch/elasticsearch PHP API to communicate with Elasticsearch and is therefore independent of any other vendor or MediaWiki extension that may use ES as a search backend (e.g. CirrusSearch).

It is recommended to use:

  • ES 6+, due to improvements in its handling of sparse fields
  • ES hardware sized according to the Elasticsearch guide, which notes that a "... machine with 64 GB of RAM is the ideal sweet spot, but 32 GB and 16 GB machines are also common ..."

Features

  • Property type changes are handled without rebuilding the entire index, provided that all ChangePropagation jobs have been processed
  • Inverse queries are supported (e.g. [[-Foo::Bar]])
  • Property chain and path queries are supported (e.g. [[Foo.Bar::Foobar]])
  • Category and property hierarchies are supported

Setup

Before the ElasticStore (and with it Elasticsearch) can be used as a drop-in replacement for the existing SQLStore-based QueryEngine, the following settings and operations are necessary:

  • Set $GLOBALS['smwgDefaultStore'] = 'SMWElasticStore';
  • Set $GLOBALS['smwgElasticsearchEndpoints'] = [ ... ];
  • Run php setupStore.php or php update.php
  • Rebuild the index using php rebuildElasticIndex.php
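
The two settings above could look as follows in LocalSettings.php. This is only a minimal sketch: the endpoint shown (a single node on 127.0.0.1, port 9200, plain HTTP) is a hypothetical placeholder rather than a value taken from this page, and the host/port/scheme array form is assumed to be accepted by the setting; consult the settings section for the authoritative format.

```php
// LocalSettings.php (fragment), placed after Semantic MediaWiki is enabled.

// Select the ElasticStore as the default store.
$GLOBALS['smwgDefaultStore'] = 'SMWElasticStore';

// Hypothetical endpoint (assumption): one local Elasticsearch node on the
// default HTTP port. Replace with the host(s) of your own cluster.
$GLOBALS['smwgElasticsearchEndpoints'] = [
	[ 'host' => '127.0.0.1', 'port' => 9200, 'scheme' => 'http' ],
];
```

With these in place, the remaining steps are the maintenance scripts listed above: setupStore.php (or update.php) first, then rebuildElasticIndex.php.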

For a more detailed introduction, see the usage and settings sections.

General notes

Elasticsearch is not expected to be used as a data store replacement, and it is therefore not assumed that ES will return all _source fields during a request.

The ElasticStore provides a customized serialization format to transform and transfer its data. A DSL interpreter (see domain language) allows existing #ask queries to be answered by an ES instance without any change in syntax when switching from the SQLStore (or SPARQLStore).

References

  1. Semantic MediaWiki: GitHub pull request gh:smw:3054
  2. Elasticsearch is a highly scalable open-source full-text search and analytics engine