Help:Query result cache

From semantic-mediawiki.org

Semantic MediaWiki 2.5.0 adds an experimental feature that supports a simple query cache[1], with the objective of minimizing the computational effort for queries that share the same query signature. A query signature is a fingerprint that identifies a query by the characteristics that influence its result set (e.g. condition, limit, offset, and sort/order). This feature is disabled by default.
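The idea of a query signature can be illustrated with a minimal sketch. It is not Semantic MediaWiki's actual implementation: the function name `query_signature`, the choice of SHA-1, and the JSON serialization are all assumptions made for illustration. It only shows that two queries with the same condition, limit, offset, and sort order map to the same fingerprint.

```python
import hashlib
import json


def query_signature(condition, limit, offset, sort):
    """Return a stable fingerprint for the parameters that shape a result set.

    Hypothetical helper; the real signature computation may differ.
    """
    # Serialize with sorted keys so that dict ordering cannot change the hash.
    payload = json.dumps(
        {"condition": condition, "limit": limit, "offset": offset, "sort": sort},
        sort_keys=True,
    )
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


# Two identical queries share a signature; changing the limit yields a new one.
a = query_signature("[[Category:City]]", 50, 0, {"Population": "DESC"})
b = query_signature("[[Category:City]]", 50, 0, {"Population": "DESC"})
c = query_signature("[[Category:City]]", 100, 0, {"Population": "DESC"})
```

Here `a` and `b` are equal while `c` differs, so only the first two queries could be served from the same cache entry.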

  • Queries that have the same signature are answered from the cache instead of opening an active QueryEngine connection, which reduces DB connections and computational effort.
  • Only the subject list is cached as the result of query answering. Individual PrintRequests are left to the respective printer, which avoids conflicts with existing or future ResultPrinters when generating the output representation.
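The two points above can be sketched as a small lookup wrapper. This is an illustration under stated assumptions, not the extension's code: the class `SubjectListCache`, its in-memory dict store, and the `run_query` callable (standing in for a QueryEngine call) are hypothetical. Only the list of subjects is stored; formatting is left to the caller.

```python
class SubjectListCache:
    """Hypothetical cache keyed by query signature, storing subject lists only."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_subjects(self, signature, run_query):
        # Cache hit: return the stored subjects without touching the engine.
        if signature in self._store:
            self.hits += 1
            return self._store[signature]
        # Cache miss: ask the engine once and remember the subject list.
        self.misses += 1
        subjects = list(run_query())
        self._store[signature] = subjects
        return subjects

    def evict(self, signature):
        # Drop a cached entry; a later lookup will hit the engine again.
        self._store.pop(signature, None)


engine_calls = []


def fake_engine():
    # Stand-in for a real query execution; records that it was called.
    engine_calls.append(1)
    return ["Berlin", "Paris"]


cache = SubjectListCache()
first = cache.get_subjects("sig-1", fake_engine)
second = cache.get_subjects("sig-1", fake_engine)
cache.evict("sig-1")
third = cache.get_subjects("sig-1", fake_engine)
```

In this run the engine is called twice: once for the initial miss and once after the eviction, while the second lookup is served from the cache.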

When the query result cache is enabled, current queries are reassigned a new query ID[2][3] to increase the pool of queries that potentially share the same signature, which broadens the range of cache reuse and minimizes cache fragmentation.

The eviction[4] of a cache is normally triggered by:

A recommended cache provider is Redis[5], which offers good response times and storage capacity and is independent of the DB master. The cache statistics, being relatively small, are stored using CACHE_DB to improve the durability of the collected data and keep it independent of query cache operations.

Operational relevance and statistics

To help decide whether the query result cache has operational relevance (meaning the benefits outweigh the cost of keeping an additional cache instance, or the median response time shows a generalisable advantage), cache statistics are kept for the duration of the query cache lifetime, gathering data about performance and use.

Example
{
    "misses": 545,
    "deletes": 56,
    "hits": {
        "nonEmbedded": 636,
        "embedded": 39
    },
    "medianRetrievalResponseTime": {
        "uncached": 0.030940875770042,
        "cached": 0.0069486399265745
    },
    "noCache": {
        "byLimit": 30
    },
    "ratio": {
        "hit": 0.5533,
        "miss": 0.4467
    },
    "meta": {
        "version": "0.2",
        "cacheLifetime": {
            "embedded": 86400,
            "nonEmbedded": 600
        },
        "collectionDate": {
            "start": "2016-11-30 14:17:21",
            "update": "2016-12-07 10:13:54"
        }
    }
}
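As a rough aid for reading such statistics, the sketch below recomputes the hit and miss ratios from the raw counters and compares the two median response times. The function name `summarize` is hypothetical, and treating "ratio" as hits (embedded plus non-embedded) over hits plus misses is an assumption, although it reproduces the values in the example.

```python
# Subset of the example statistics shown above.
stats = {
    "misses": 545,
    "hits": {"nonEmbedded": 636, "embedded": 39},
    "medianRetrievalResponseTime": {
        "uncached": 0.030940875770042,
        "cached": 0.0069486399265745,
    },
}


def summarize(stats):
    """Derive hit/miss ratios and the cached-vs-uncached speedup factor."""
    hits = sum(stats["hits"].values())
    total = hits + stats["misses"]
    times = stats["medianRetrievalResponseTime"]
    return {
        "hit": round(hits / total, 4),
        "miss": round(stats["misses"] / total, 4),
        # How many times faster the median cached retrieval is.
        "speedup": times["uncached"] / times["cached"],
    }


summary = summarize(stats)
```

For the example data this gives a hit ratio of 0.5533 and a miss ratio of 0.4467, with the median cached retrieval a little over four times faster than the uncached one.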

Configuration parameters

Note: The default values of these configuration parameters may not suit every situation. It is therefore suggested to monitor the statistics and choose appropriate values for embedded and non-embedded cached queries.


References

  1. ^  Semantic MediaWiki: GitHub pull request #1251
  2. ^  Semantic MediaWiki: GitHub pull request #2099
  3. ^  Semantic MediaWiki: GitHub pull request #2176
  4. ^  "Chapter 4. Cache Eviction", which describes eviction as "... the process by which old, relatively unused, or excessively voluminous data can be dropped from the cache ..."
  5. ^  Why Redis beats Memcached for caching "... Memcached and Redis serve as in-memory, key-value data stores ... Redis gives you much greater flexibility regarding the objects you can cache. While Memcached limits key names to 250 bytes and works with plain strings only, Redis allows key names and values to be as large as 512MB ..."