Search operators

From semantic-mediawiki.org

Semantic MediaWiki provides different search operators to enable a user to refine search conditions and criteria.

Wildcards

Wildcards are written as "+" and allow any value for a given condition. For example, [[Born in::+]] returns all pages that have any value for the property "Born in". Please note that "+" can only be used by itself. See the section Like, not like for the wildcards "*" and "?".

Comparators

Comparators are special symbols like < or >. They are placed after :: in property conditions.

  • >> and << "greater than" and "less than"
  • > and < "greater than or equal" and "less than or equal" by default, but "greater than" and "less than" if the configuration parameter $smwStrictComparators is set to true
  • ≥ and ≤ "greater than or equal" and "less than or equal"
  • ! "not" ("unequal")
  • ~ and !~ "like" and "not like" comparison for strings
  • like: and nlike: "like" and "not like" comparison for strings (alternative)

For example,

 [[Area code::>>415]]

Comparators work only for property values and not for conditions on categories. A wiki installation can limit which comparators are available; the administrator does this by modifying the configuration parameter $smwgQComparators, which sets the list of comparator characters supported by queries for use in a regex.

When comparators are applied to pages, the title of the page (without namespace prefix) is used. However, this can be changed by setting another MediaWiki sortkey for that page, e.g. {{DEFAULTSORTKEY:custom key}}. Please note that this applies to all comparators, including ! and ~. It is not possible to have multiple sortkeys for one page. In particular, redirect pages are not taken into account when applying comparators.

Not equal

You can select pages that have a property value which is unequal to a given value. For example,

 [[Area code::!415]]

will select pages that have an area code which is not "415". Note that this query description does not look for pages which do not have an area code 415. Rather, it looks for all pages that (also) have a code unequal to 415. In particular, pages that have no area code at all cannot be the result of the above query.
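
The difference between "has a value unequal to 415" and "does not have the value 415" can be sketched in a few lines of Python. This is only a model of the semantics described above, not SMW code; the function name matches_not_equal and the sample pages are made up for illustration.

```python
def matches_not_equal(values, excluded):
    """True if the page has at least one value that differs from `excluded`."""
    return any(v != excluded for v in values)

# Hypothetical pages and their "Area code" values.
pages = {
    "Page A": ["415"],          # only the excluded code
    "Page B": ["415", "650"],   # also has another code
    "Page C": ["212"],
    "Page D": [],               # no area code at all
}

selected = [name for name, codes in pages.items()
            if matches_not_equal(codes, "415")]
print(selected)  # ['Page B', 'Page C']
```

Page B is selected although it has the code 415, and Page D is not selected because it has no value to compare.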

As with the (default) equality comparator, the use of custom units may require rounding in numeric conversions that can lead to unexpected results. For example, [[Height::!6.00 ft]] may still select someone whose height displays as «6.00 feet» simply because the exact numeric value is not really 6. In such situations, it might be more useful to query for pages that have a property value outside a certain range, expressed by taking a disjunction (see below) of conditions with < and >.

Like, not like

This section only describes the standard behaviour of ~/!~. This syntax (but not like:/nlike:) does not work the same way if full-text search is enabled; see the guide on full-text searching. The default behaviour cannot be demonstrated on the present wiki or the Sandbox site as both sites use full-text search.

Semantic MediaWiki supports the use of "like" and "not like" conditions in queries. Two different notations, with slightly different behaviours, are available for use:

  • ~ and !~
  • like: and nlike: (since Semantic MediaWiki 3.0.0, released on 11 October 2018 and compatible with MW 1.27.0 - 1.31.x), which rely on the standard LIKE / NOT LIKE SQL syntax for a pattern match.1

In a LIKE condition, one uses "*" wildcards to match any sequence of characters and "?" to match any single character. For example, one could ask [[Address::~*Park Place*]] to select addresses containing the string "Park Place", or [[Honorific::~M?.]] to select both "Mr." and "Ms.".
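
To make the two wildcards concrete, the following Python sketch translates such a pattern into a regular expression. It is an approximation for illustration only: the helper names like_to_regex and like are hypothetical, the match is case-sensitive here, and the real behaviour depends on the database backend.

```python
import re

def like_to_regex(pattern):
    """Translate a "~" pattern into a regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any sequence of characters
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.DOTALL)

def like(value, pattern):
    """True if the whole value matches the pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None

print(like("12 Park Place, London", "*Park Place*"))  # True
print(like("Ms.", "M?."))                             # True
print(like("Mrs.", "M?."))                            # False
```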

These comparators work only for properties of the following datatypes:[1]

  • "Text" (holds text of arbitrary length)
  • "Page" (holds names of wiki pages, and displays them as a link)
  • "Email" (holds e-mail addresses)
  • "Telephone number" (holds international telephone numbers based on the RFC 3966 standard)
  • "Date" (holds particular points in time)

Wildcards do not work on namespaces. So something like [[~Proje?k::Park Place]] will not work.

String length

The searchable string length depends on a number of factors:

  • For datatype "Page" (all SMW versions), all 255 characters are searchable.
  • For datatype "Text" (SMW ≥ 1.8.x)[2], the searchable string is limited to
    • the first 40 characters if more than 72 characters were stored as property value,
    • all characters if at most 72 characters were stored as property value,
    • or 300 characters if the searchable length is configured by setting $smwgFieldTypeFeatures to SMW_FIELDT_CHAR_LONG.
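
The limits listed above can be written down as a small Python function. This is only one reading of the list: the function name searchable_part is hypothetical, and treating the 300-character setting as "the first 300 characters" is an assumption.

```python
def searchable_part(value, datatype, char_long=False):
    """Return the part of a stored value that a "~" search can see."""
    if datatype == "Page":
        return value[:255]
    if datatype == "Text":
        if char_long:
            # Assumption: SMW_FIELDT_CHAR_LONG makes the first 300 characters searchable.
            return value[:300]
        # Longer values are searchable only through their first 40 characters.
        return value[:40] if len(value) > 72 else value
    raise ValueError("datatype not covered by this sketch: " + datatype)

print(len(searchable_part("x" * 72, "Text")))   # 72
print(len(searchable_part("x" * 73, "Text")))   # 40
```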

Special characters

Note that some special characters need masking when used in combination with comparator "~".

For example, property "Path" of datatype "Text" holds something like "n:\path\morepath" as data value. To query for all pages that contain "n:\path\..." in property "Path", you need to mask the backslashes "\" in your query like this:

{{#ask:
 [[Path::~n:\\path\\*]]
  ...
}}
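
If query conditions are generated by a script, the masking step can be automated. The Python helper below is a minimal sketch; the name mask_backslashes is hypothetical, and it handles only the backslash, not any other special character.

```python
def mask_backslashes(value):
    """Double every backslash so it survives as a literal in a "~" condition."""
    return value.replace("\\", "\\\\")

prefix = "n:\\path\\"             # the literal text n:\path\
print(mask_backslashes(prefix))   # n:\\path\\
```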

Ranges

Loose and strict comparison

By default, < and > serve as ‘loose’ comparison operators: they mean "less than or equal" and "greater than or equal". They are functionally equivalent to ≤ and ≥.

This may be confusing if you are accustomed to the strict mathematical meaning of < and > as "less than" and "greater than". As an administrator, you can choose to have SMW interpret these operators in the strict sense instead by setting the configuration parameter $smwStrictComparators to true. The different behaviours are documented on the page about strict comparators.
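
The effect of the setting can be modelled in Python as follows. The functions greater and less and their strict flag are illustrative stand-ins for the ">" and "<" comparators and the configuration parameter; they are not part of SMW.

```python
def greater(value, bound, strict=False):
    """Model of the ">" comparator: loose by default, strict on request."""
    return value > bound if strict else value >= bound

def less(value, bound, strict=False):
    """Model of the "<" comparator: loose by default, strict on request."""
    return value < bound if strict else value <= bound

print(greater(415, 415))               # True  (loose: greater than or equal)
print(greater(415, 415, strict=True))  # False (strict: greater than)
```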

Greater than or equal, less than or equal

With numeric values, you often want to select pages with property values within a certain range. For example

 [[Category:Actor]] [[height::>6 ft]] [[height::<7 ft]]

asks for all actors that are between 6 feet and 7 feet tall. Note that this takes advantage of the automatic unit conversion: even if the height of the actor was set with [[Height::195cm]], it would be recognized as a correct answer (provided that the datatype for height understands both units, see Custom units). Note that the comparator means greater/less than or equal; the equality symbol "=" is not needed.
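
The idea behind the unit conversion can be sketched in Python: all values are brought to a common unit before the loose range check is applied. The conversion table and the function names are assumptions made for this example, and the sketch expects a space between number and unit.

```python
# Centimetres per unit; 1 ft is exactly 30.48 cm.
CM_PER_UNIT = {"cm": 1.0, "m": 100.0, "ft": 30.48}

def to_cm(text):
    """Convert a value such as "6 ft" to centimetres."""
    number, unit = text.split()
    return float(number) * CM_PER_UNIT[unit]

def in_range(value, low, high):
    """Loose range check: low <= value <= high, after conversion."""
    return to_cm(low) <= to_cm(value) <= to_cm(high)

print(in_range("195 cm", "6 ft", "7 ft"))  # True
print(in_range("170 cm", "6 ft", "7 ft"))  # False
```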

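Conditions on the same property can also be combined with OR (||).2 Because a disjunction is satisfied as soon as one of its conditions holds, [[height::>6 ft||<7 ft]] would match every actor that has a height at all; a disjunction is instead suited to selecting values outside a range. For example, the following asks for all actors that are at most 6 feet or at least 7 feet tall:

 [[Category:Actor]] [[height::<6 ft||>7 ft]]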

Such range conditions on property values are mostly relevant if values can be ordered in a natural way. For example, it makes sense to ask [[Start date::>May 6 2006]] but it is not really helpful to say [[Homepage URL::>https://www.example.org]].

If a datatype has no natural linear ordering, Semantic MediaWiki will just apply the alphabetical order to the normalised datavalues as they are used in the RDF export. You can thus use greater than and less than to select alphabetic ranges of a property of datatype "String". For example, you could ask [[Surname::>Do]] [[Surname::<G]] to select surnames from "Do" up to "G". For wiki pages, the comparator refers to the name of the given page (without the namespace prefix).
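
An alphabetic range behaves like a comparison of strings. The Python sketch below uses Python's own string ordering (by code point) as a stand-in; the actual ordering in a wiki depends on the store and its collation, and the surnames are made up.

```python
def in_alpha_range(name, low, high):
    """Loose alphabetic range check, modelling [[Surname::>low]] [[Surname::<high]]."""
    return low <= name <= high

surnames = ["Davis", "Doe", "Evans", "G", "Garcia"]
print([s for s in surnames if in_alpha_range(s, "Do", "G")])
# ['Doe', 'Evans', 'G']
```

"Garcia" is not selected because it sorts after "G"; to include all surnames starting with "G", the upper bound would have to be something like "H".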

Here and in all other uses of comparators, it might happen that the value searched for really starts with a symbol like <. In this case, SMW can be prevented from interpreting the symbol as a comparator if a space is inserted after ::. For example, [[Property:: <br>]] really searches for pages with the value "<br>" for the given property.

Greater than, less than

At times you might want to exclude the precise value from the result itself, e.g. to find an actor taller than Hugh Laurie (1.89m), you can query using a combination of the ">" comparator and the "!" comparator:

 [[Category:Actor]] [[height::>1.89m]] [[height::!1.89m]]
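
Why this combination excludes the boundary value can be checked with a short Python model, assuming the default (loose) meaning of ">". The function name taller_than is hypothetical.

```python
def taller_than(height_m, bound_m):
    """Combine a loose ">" with "!" to get a strict "greater than"."""
    at_least = height_m >= bound_m    # models [[height::>1.89m]]
    not_equal = height_m != bound_m   # models [[height::!1.89m]]
    return at_least and not_equal

print(taller_than(1.89, 1.89))  # False
print(taller_than(1.90, 1.89))  # True
```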

Values containing operator symbols

Cases may occur where the value of a given property starts with one of the following symbols: <, ≤, >, ≥, =, ! and ~. To ensure that Semantic MediaWiki can handle those cases, value notations are white-space sensitive: to keep a leading symbol from being interpreted as a comparator, insert a space after ::. For example, [[Property:: <br>]] really searches for pages with the value "<br>" for the given property.
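
The whitespace rule can be pictured with a small Python parser. This is a simplified sketch, not SMW's parser: the function name parse_value is hypothetical, and only the single-character symbols listed above are recognised (not ">>", "<<" or "!~").

```python
COMPARATORS = ("<", "≤", ">", "≥", "=", "!", "~")

def parse_value(raw):
    """Return (comparator, value) for the text that follows "::"."""
    if raw.startswith(" "):
        # A leading space marks everything that follows as a literal value.
        return "", raw.strip()
    for symbol in COMPARATORS:
        if raw.startswith(symbol):
            return symbol, raw[len(symbol):]
    return "", raw

print(parse_value(" <br>"))  # ('', '<br>')
print(parse_value("<br>"))   # ('<', 'br>')
```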

Case insensitivity

Starting with Semantic MediaWiki 3.0.0 (released on 11 October 2018 and compatible with MW 1.27.0 - 1.31.x)34, it is possible to enable case-insensitive matching for properties of e.g. datatype "Page", datatype "Text", datatype "Code" and datatype "URL" (blob types, i.e. strings or text) using the SMW_FIELDT_CHAR_NOCASE option of the configuration parameter $smwgFieldTypeFeatures.

The full-text search feature is in most cases preferable to use over the feature described here.

The following four examples will provide an overview of the general differences between having this feature enabled or not:

Example 1

Property "Has text" of datatype "Text" holds "CAseInSensitiveSearch" as data value on page "Example". Queries No. 1 to 3 will select page "Example" both if the feature is enabled and if it is not (default):

  1. {{#ask: [[Has text::~CASEIN*]] |?Has text }}
  2. {{#ask: [[Has text::~casein*]] |?Has text }}
  3. {{#ask: [[Has text::~CAseIn*]] |?Has text }}

Example 2

Property "Has page" of datatype "Page" holds "CAseInSensitiveSearch" as data value on page "Example". If the feature is enabled, queries No. 1 and 2 will select page "Example"; if not (default), query No. 3 will:

  1. {{#ask: [[Has page::~CASEIN*]] |?Has page }}
  2. {{#ask: [[Has page::~casein*]] |?Has page }}
  3. {{#ask: [[Has page::~CAseIn*]] |?Has page }}

Example 3

Property "Has text" of datatype "Text" holds "CAseIn" as data value on page "Example". If the feature is enabled, queries No. 1 to 3 will select page "Example"; if not (default), query No. 3 will:

  1. {{#ask: [[Has text::CASEIN]] |?Has text }}
  2. {{#ask: [[Has text::casein]] |?Has text }}
  3. {{#ask: [[Has text::CAseIn]] |?Has text }}

Example 4

Property "Has page" of datatype "Page" holds "CAseIn" as data value on page "Example". Query No. 3 will select page "Example" both if the feature is enabled and if it is not (default):

  1. {{#ask: [[Has page::CASEIN]] |?Has page }}
  2. {{#ask: [[Has page::casein]] |?Has page }}
  3. {{#ask: [[Has page::CAseIn]] |?Has page }}
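
As a rough model of Example 3 (exact match on a Text value), the Python sketch below compares the stored value with each query value, with and without case folding. The function name equals and its nocase flag are illustrative; the sketch does not capture the datatype- and comparator-specific differences shown in the other examples.

```python
def equals(stored, queried, nocase=False):
    """Exact match, optionally ignoring case."""
    if nocase:
        return stored.casefold() == queried.casefold()
    return stored == queried

stored = "CAseIn"
queries = ["CASEIN", "casein", "CAseIn"]

print([equals(stored, q, nocase=True) for q in queries])  # [True, True, True]
print([equals(stored, q) for q in queries])               # [False, False, True]
```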

  1. In SMW ≤ 1.7.x, these comparators used to work for properties of datatype "String" (rather than Text) and datatype "Page". Support for datatype "Email"1, datatype "Telephone number"1 and datatype "Date"2 was introduced in Semantic MediaWiki 1.3.0 (released on 7 September 2008 and compatible with MW 1.12.x - 1.15.x).
  2. In case of datatype "String" (SMW ≤ 1.7.x), all 255 storable characters are searchable. See this mailing list post for detailed information.

References

  1. Semantic MediaWiki: GitHub pull request gh:smw:1129
  2. Semantic MediaWiki: GitHub pull request gh:smw:1178