Help:Embedded query update


The embedded query update feature provided by "QueryDependencyLinksStore" was added with Semantic MediaWiki 2.3.0 as part of the query management to track and store dependencies of embedded queries (a.k.a. inline queries).[1]

To enable this feature, it is required to:

  1. Set $smwgEnabledQueryDependencyLinksStore to true
  2. Run the "update.php" maintenance script followed by the "rebuildData.php" maintenance script.
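The two steps above amount to a small LocalSettings.php change followed by two maintenance-script runs; a minimal sketch (the extension path is an assumption and may differ on your installation):

```php
<?php
// LocalSettings.php (after Semantic MediaWiki has been enabled)
$GLOBALS['smwgEnabledQueryDependencyLinksStore'] = true;

// Afterwards, run from the MediaWiki installation directory:
//   php maintenance/update.php
//   php extensions/SemanticMediaWiki/maintenance/rebuildData.php
```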

Features and limitations

  • Dependencies are resolved for properties, categories, concepts (non cached), and hierarchies
  • Namespace queries ([[Help:+]]) are not tracked, as this would significantly impact update performance when a single namespace dependency is altered
  • Queries with arbitrary conditions (e.g. [[~Issue/*]]) cannot be tracked, as they are not distinguishable in terms of an object description
  • limit=0 queries are not tracked (as they return an empty result list) and only represent a simple link to Special:Ask
  • Queries via Special:Ask are not tracked (those are not embedded)
  • An invalidation (done by ParserCachePurgeJob) is only triggered by entities that have been altered or added
  • $GLOBALS['smwgPropertyDependencyExemptionlist'] = array( '_MDAT', '_SOBJ' ) contains property keys that are excluded from detection


The main obstacle for queries to display "fresh" results is the ParserCache, by which MediaWiki determines whether to serve a page from cache or to re-parse its content (and with it the embedded parser functions such as #ask).

The QueryDependencyLinksStore does not update the query result itself; it manages the invalidation of the ParserCache for selected articles that have been identified as requiring an update, using the ParserCachePurgeJob. Once the ParserCache is outdated, MediaWiki starts a re-parse on the next article view, by which new results for embedded queries are requested from the QueryEngine.
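The invalidation itself boils down to marking affected pages as outdated in the ParserCache. An illustrative sketch (not SMW's actual implementation) of the net effect, using the core MediaWiki Title API:

```php
<?php
// Illustrative only: the net effect of ParserCachePurgeJob on a single page.
// Title::invalidateCache() bumps page_touched, so the cached parser output is
// considered stale and the page is re-parsed (re-running #ask) on the next view.
$title = Title::newFromText( 'PageEmbeddingAQuery' ); // hypothetical page name
if ( $title !== null ) {
    $title->invalidateCache();
}
```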

Computing dependencies

Computing dependencies and managing them in a performant manner is crucial to avoid bottlenecks, which is why dependencies (fetched from a QueryResult object) are only resolved after a change to an object has occurred.

Keeping track of each query and its dependencies is important, but identifying when to initiate a ParserCachePurgeJob is even more so. To avoid any lag in the update process, the DeferredRequestDispatchManager starts a background process to inform the ParserCachePurgeJob about all changed entities (properties, categories, and objects) of a subject, and only those that have changed, based on a diff taken from the CompositePropertyTableDiffIterator.

As updates happen solely on a diff, one reason to run rebuildData.php shortly after the QueryDependencyLinksStore has been enabled is to build a baseline of dependencies; otherwise only newly added queries are tracked and added to the QUERY_LINKS table.
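The baseline can be built by running the maintenance script once after enabling the store; for example (the extension path is an assumption and may differ on your installation):

```shell
# Register existing embedded queries and their dependencies in the
# QUERY_LINKS table (run from the MediaWiki installation directory)
php extensions/SemanticMediaWiki/maintenance/rebuildData.php
```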

Embedded queries that are no longer used (which includes deleted or changed queries) are removed from the table together with all their associated dependencies.

A list of selected properties can be blacklisted and thereby exempted from tracking because they have an impact on performance or are unlikely to be used directly (see the $GLOBALS['smwgPropertyDependencyDetectionBlacklist'] setting).

Invalidation, updates, and the job queue

The actual invalidation is handled by ParserCachePurgeJob. Despite its name, the execution is done as part of a deferred update process; in case of a denied or failed request (the DeferredRequestDispatchManager logs something like "wasCompleted":false,"connectionFailure":2), the job is pushed into the job queue as a safety measure.

Depending on the number of subjects that need invalidation, the ParserCachePurgeJob works in batches and may require some processing time before it finishes.

The decision not to rely on the job queue as the primary execution handler was made so that the QueryDependencyLinksStore can act independently of a possibly delayed job schedule, allowing queries to be updated almost instantly (depending on the update size) by the next page view. Nevertheless, any leftover ParserCachePurgeJob jobs that were not processed in time and were pushed into the job queue should be executed or scheduled promptly with the help of runJobs.php.
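Leftover jobs can be processed with MediaWiki's standard job runner; for example (running all pending jobs rather than filtering by type, since the exact job type name may vary between versions):

```shell
# Process pending jobs, including any queued ParserCachePurgeJob instances
php maintenance/runJobs.php --maxjobs 1000
```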

Only an acknowledged "HTTP/1.0 202 Accepted" response from the DeferredRequestDispatchManager allows for an immediate update; any other response code pushes the updates into the job queue.

For an altered value that is at the same time part of a query on the same page (where the update was made), the query update may not be immediately visible because of the deferred update process MediaWiki runs on an article save (depending on the MediaWiki version, ParserCachePurgeJob may run too early, before MediaWiki has finished its updates, or, depending on the database lag, the update information may not yet be available for the DeferredRequestDispatchManager to act upon).

If for some reason the DeferredRequestDispatchManager is unable to complete the request and waiting on the job scheduler is not an option,[2] then setting SMW_HTTP_DEFERRED_SYNC_JOB[3] may be an option, but it can slightly increase the update time since transaction updates are done synchronously.
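If the synchronous route is chosen, the LocalSettings.php change might look as follows; treat this as a sketch, since the setting name used here ($smwgEnabledHttpDeferredJobRequest) is an assumption and depends on the Semantic MediaWiki version:

```php
<?php
// LocalSettings.php: run the HTTP deferred job synchronously.
// NOTE: the setting name below is an assumption and may differ per SMW version;
// only SMW_HTTP_DEFERRED_SYNC_JOB is taken from the text above.
$GLOBALS['smwgEnabledHttpDeferredJobRequest'] = SMW_HTTP_DEFERRED_SYNC_JOB;
```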


  • sandbox:math contains an example that demonstrates the query update process

See also