Help:Selecting pages

From semantic-mediawiki.org

The most important part of the semantic search features in Semantic MediaWiki is a simple format for describing which pages should be displayed as the search result. Queries select wiki pages based on the information that has been specified for them using categories, properties, and possibly other MediaWiki features such as a page's namespace. The following paragraphs introduce the main query features in SMW.

Categories and property values

In the introductory example, we gave the single condition [[Located in::Germany]] to describe which pages we were interested in. The markup text is exactly what you would otherwise write to assert that some page has this property and value. Putting it in a semantic query makes SMW return all such pages. This is a general scheme: The syntax for asking for pages that satisfy some condition is exactly the syntax for explicitly asserting that this condition holds.

The following queries show what this means:

  1. [[Category:Actor]] gives all pages directly or indirectly (through a sub-, subsub-, etc. category) in the category.
  2. [[born in::Boston]] gives all pages annotated as being about someone born in Boston.
  3. [[height::180cm]] gives all pages annotated as being about someone having a height of 180cm.

By using other categories or properties than above, we can already ask for pages which have certain annotations. Next let us combine those requirements:

[[Category:Actor]] [[born in::Boston]] [[height::180cm]] 

asks for everybody who is an actor and was born in Boston and is 180cm tall. In other words: when many conditions are written into one query, the result is narrowed down to those pages that meet all the requirements. Thus we have a logical AND. By the way: queries can also include line breaks in order to make them more readable. So we could as well write:

  [[Category:Actor]] 
  [[born in::Boston]] 
  [[height::180cm]]

to get the same result as above. Note that queries only return the articles that are positively known to satisfy the required properties: if there is no property for the height of some actor, that actor will not be selected.
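
As a sketch, the same conditions could be embedded in a wiki page as an inline query using the #ask parser function (see Inline queries). The printout ?height is an illustrative assumption, and the category and property names are the hypothetical ones used in the examples above:

  {{#ask:
   [[Category:Actor]]
   [[born in::Boston]]
   [[height::180cm]]
   |?height
  }}

Actors without any height annotation would not appear in such a result.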

When specifying property values, SMW will usually ignore any leading and trailing whitespace, so the two conditions [[height::180cm]] and [[height:: 180cm ]] have the same meaning. Some datatypes, such as Number, may have additional features, for example ignoring commas that might be used to separate thousands. SMW will also treat synonymous page names the same, just as MediaWiki would usually consider Semantic wiki, Semantic_wiki, and semantic wiki to refer to the same page.

If you are using certain condition patterns frequently, you might create a Concept as a shorthand. Concepts form a kind of virtual category and can thus be used similarly to category conditions.

Distance queries

A special case of the syntax can be used for finding all pages with a value of type Geographic coordinate within a certain distance of a set location, if one has the Semantic Maps extension installed; see the distance query page within the Semantic Maps documentation for more information. Note: distance queries are currently broken with Maps v2.x. See this bugzilla report.

Property values: wildcards and comparators

In the examples above, we gave very concrete property conditions, using «Boston» and «180cm» as values for properties. In many cases, one does not look for only one particular value, but for a whole range of values, such as all actors that are taller than 180cm. In some cases one may even just look for all pages that have any value at all for a given property. For example, deceased people could be selected as those pages which have a value for the property «date of death». Such general conditions are possible with the help of comparators and wildcards.

Wildcards

Wildcards are written as "+" and allow any value for a given condition. For example, [[born in::+]] returns all pages that have any value for the property «born in». Please note that "+" can only be used by itself. See the section Like, not like for the wildcards "*" and "?".
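
For instance, assuming a wiki that uses a category «Actor» and a property «date of death» as in the examples above, a query along the following lines would list the actors that have any value for that property:

  [[Category:Actor]] [[date of death::+]]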

Comparators

Comparators are special symbols like < or >. They are placed after :: in property conditions.

  • >> and <<: "greater than" and "less than"
  • > and <: "greater than or equal" and "less than or equal" by default, but "greater than" and "less than" if $smwStrictComparators = true;
  • ≥ and ≤: "greater than or equal" and "less than or equal"
  • !: "not" ("unequal")
  • ~: «like» comparison for strings
  • !~: «not like» comparison for strings

Comparators work only for property values and not for conditions on categories. A wiki installation can limit which comparators are available, which is done by the administrator by modifying the value of $smwgQComparators as explained in Help:Configuration.

Depending on the value of $smwStrictComparators, interpretation of > and < can differ; the different behaviours are documented on the page about strict comparators.

When comparators are applied to pages, the title of the page (without namespace prefix) is used. However, this can be changed by setting another MediaWiki sortkey for that page, e.g. {{DEFAULTSORTKEY:custom key}}. Please note that this applies to all comparators, including ! and ~. It is not possible to have multiple sortkeys for one page. In particular, redirect pages are not taken into account when applying comparators.

Not equal

You can select pages that have a property value which is unequal to a given value. For example,

  [[Area code::!415]]

will select pages that have an area code which is not «415». Note that this query description does not look for pages which do not have an area code 415. Rather, it looks for all pages that (also) have a code unequal to 415. In particular, pages that have no area code at all cannot be the result of the above query.

As with the (default) equality comparator, the use of custom units may require rounding in numeric conversions that can lead to unexpected results. For example, [[height::!6.00 ft]] may still select someone whose height displays as «6.00 feet» simply because the exact numeric value is not really 6. In such situations, it might be more useful to query for pages that have a property value outside a certain range, expressed by taking a disjunction (see below) of conditions with < and >.
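
A minimal sketch of such a range exclusion, assuming a property «height» that understands feet. The bounds 5.99 ft and 6.01 ft are arbitrary illustrative choices, and the operator OR is explained in the section on disjunctions:

  [[height::<5.99 ft]] OR [[height::>6.01 ft]]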

Greater than or equal, less than or equal

With numeric values, you often want to select pages with property values within a certain range. For example

[[Category:Actor]] [[height::>6 ft]] [[height::<7 ft]]

asks for all actors that are between 6 feet and 7 feet tall. Note that this takes advantage of the automatic unit conversion: even if the height of the actor was set with [[height::195cm]] it would be recognized as a correct answer (provided that the datatype for height understands both units, see Help:custom units). Note that the comparators mean greater/less than or equal by default; the equality symbol = is not needed.

Such range conditions on property values are mostly relevant if values can be ordered in a natural way. For example, it makes sense to ask [[start date::>May 6 2006]] but it is not really helpful to say [[homepage URL::>http://www.somewhere.org]].

If a datatype has no natural linear ordering, Semantic MediaWiki will just apply the alphabetical order to the normalised datavalues as they are used in the RDF export. You can thus use greater than and less than to select alphabetic ranges of a string property. For example, you could ask [[surname::>Do]] [[surname::<G]] to select surnames from «Do» up to «G». For wiki pages, the comparator refers to the name of the given page (without the namespace prefix).

Here and in all other uses of comparators, it might happen that a searched for value really starts with a symbol like <. In this case, SMW can be prevented from interpreting the symbol as a comparator if a space is inserted after ::. For example, [[property:: <br>]] really searches for pages with the value «<br>» for the given property.

Greater than, less than

At times you might want to exclude the precise value from the result itself, e.g. to find an actor taller than Hugh Laurie (1.89m), you can query using a combination of the > comparator and the ! comparator:

[[Category:Actor]] [[height::>1.89m]] [[height::!1.89m]]
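
Where the comparators >> and << from the list above are available (this may depend on the SMW version and on the wiki's $smwgQComparators setting), the same condition could presumably be written more directly. A sketch under that assumption:

  [[Category:Actor]] [[height::>>1.89m]]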

Like, not like

The comparators ~ and !~ work only for properties of datatype String and Page (SMW ≤ 1.7.x) or Text and Page (SMW ≥ 1.8.x). In a like condition, one uses '*' wildcards to match any sequence of characters and '?' to match any single character. For example, one could ask "[[Address::~*Park Place*]]" to select addresses containing the string "Park Place", or "[[Honorific::~M?.]]" to select both "Mr." and "Ms.".

Note that for datatype String (SMW ≤ 1.7.x) all 255 storable characters are searchable. For datatype Text (SMW ≥ 1.8.x), all characters are searchable if the stored property value has at most 72 characters; if it has more than 72 characters, only the first 40 characters are searchable.[1] For datatype Page, all 255 characters are searchable (all SMW versions).
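
Like conditions can be combined with other conditions in the usual way. As an illustration, assuming properties «Address» and «Honorific» as in the examples above, the following sketch would select pages with an address containing "Park Place" that have an honorific not starting with "Dr":

  [[Address::~*Park Place*]] [[Honorific::!~Dr*]]

As with the ! comparator, pages without any value for «Honorific» would not be expected in the result.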

Strict comparators

The default behaviour of SMW, where the comparators < and > mean "less than or equal" and "greater than or equal", can be somewhat confusing for people who are familiar with the mathematical meaning of < and >. You can therefore choose to have SMW interpret < and > strictly, as explained here.

Properties starting with certain symbols

Cases may occur where the value of a given property starts with one of the following symbols: <, ≤, >, ≥, =, ! and ~. SMW would normally interpret these symbols as the built-in comparators. To prevent this, property conditions are whitespace-sensitive: a space inserted after :: keeps SMW from treating the symbol as a comparator. For example, [[property:: <br>]] really searches for pages with the value "<br>" for the given property.

Unions of query results: disjunctions

Disjunctions are OR-conditions that admit several alternative conditions on query results. A disjunction requires that at least one (but maybe more than one) of the possible alternatives is satisfied (logical OR).

SMW has two ways of writing disjunctions in queries:

  • The operator OR is used for taking the union of two queries.
  • The operator || is used for disjunctions in property values, page names and category names.

For example, a query that describes pages of actors in [[Category:Musical actor]] or [[Category:Theatre actor]], or in both of them, can be written as:

  • [[Category:Musical actor]] OR [[Category:Theatre actor]]
  • or more concisely, [[Category:Musical actor||Theatre actor]]

Similarly, a query that describes all pages of people born in Boston or New York can be written using one of these operators:

  • [[born in::Boston]] OR [[born in::New York]]
  • or again more concisely, [[born in::Boston||New York]]

Note that || does not always offer an alternative to OR. For example,

  • [[born in::Boston]] OR [[Category:Actor]]
  • cannot be expressed with ||.

OR operates on the query, not on a single element of the query. Thus in the following query, which is intended to list actors born in Boston or in New York, the category name needs to be repeated:

  • [[Category:Actor]] [[born in::Boston]] OR [[Category:Actor]] [[born in::New York]]
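
In this particular case the repetition can be avoided, because both alternatives concern values of the same property. A sketch using || instead:

  [[Category:Actor]] [[born in::Boston||New York]]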

To combine multiple queries, you can also use the extension Semantic Compound Queries.

Describing single pages

So far, all conditions depended on some annotation given within a page. But there are also conditions that directly select particular pages, or pages from a given namespace.

Directly giving some page title (possibly including a namespace prefix), or a list of such page titles separated by ||, selects the pages with those names. An example is the query

[[Brazil||France||User:John Doe]]

which has three results (at least if the pages exist). Note that the result does not display any namespace prefixes; see the hover box or status bar of the browser, or follow the links to determine the namespace. By restricting such a set based on a property value, one could ask, e.g., «Which of Bill Murray, Dan Aykroyd, Harold Ramis and Ernie Hudson is taller than 6ft?». But direct selection of articles is most useful if further properties of those articles are asked for, e.g. to simply print the height of Bill Murray.

To select a category in this way, a : must be put before the category name. This avoids confusing [[Category:Actor]] (return all actors) and [[:Category:Actor]] (return the category «Actor»).

Restricting results to a namespace

A less strict way of selecting given pages is via namespaces. The default is to return pages in every namespace. To return pages in a particular namespace, specify the namespace with a «wildcard», e.g. write [[Help:+]] to return every page in the «Help» namespace. Since the main namespace usually has no prefix, write [[:+]] to select only pages in the main namespace.

Disjunctions work again with the || syntax as above. For example, to return pages in either the main or «User» namespace, write [[:+||User:+]]. To return pages in the «Category» namespace, a : is again needed in front of the namespace label to prevent confusion, e.g. [[:Category:+]].
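
Namespace conditions can be combined with other conditions like any other part of a query. For example, assuming the category «Actor» from the earlier examples, the following sketch would restrict the result to actor pages in the main namespace:

  [[Category:Actor]] [[:+]]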

Subqueries and property chains

Enumerating multiple pages for a property is cumbersome and hard to maintain. For instance, to select all actors that are born in an Italian city one could write:

[[Category:Actor]] [[born in::Rome||Milan||Turin||Florence||...]]

To generate a list of all these Italian cities one could run another query

[[Category:City]] [[located in::Italy]]

and copy and paste the results into the first query. What one would like to do is to use the city query as a subquery within the actor query to obtain the desired result directly. Instead of a fixed list of page names for the property's value, a new query enclosed in <q> and </q> is inserted within the property condition. In this example, one can thus write:

[[Category:Actor]] [[born in::<q>[[Category:City]] [[located in::Italy]]</q>]]

(limitation: you cannot add more than one category between <q> and </q>, except in the case of disjunctions)

Arbitrary levels of nesting are possible, though nesting might be restricted for a particular site to ensure performance. For another example, to select all cities of the European Union you could write:

  [[Category:Cities]]
  [[located in::<q>[[member of::European Union]]</q>]]

(no results within this wiki)

In the above example, we essentially have constructed a chain of properties «located in» and «member of» to find things that are located in something which is a member of the EU. Queries can be written in a shorter form for this common case:

[[Category:Cities]] [[located in.member of::European Union]]

This query has the same meaning as above, but requires far fewer special symbols. In general, chains of properties are created by listing all properties separated by dots. In the rare case that a property should contain a dot in its name, one may start the query with a space to prevent SMW from interpreting this dot in a special way.
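
Chains are not limited to two properties. As a sketch, assuming the properties «born in», «located in» and «member of» from the examples above, actors born in a place that is located in a member of the European Union could be described with nested subqueries:

  [[Category:Actor]] [[born in::<q>[[located in::<q>[[member of::European Union]]</q>]]</q>]]

or with a chain of three properties:

  [[Category:Actor]] [[born in.located in.member of::European Union]]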

NOTE: It is not possible to use a subquery to obtain a list of properties that is then used in a query. See #Subqueries for properties below.

Using templates and variables

Arbitrary templates and variables can be used in a query. An example is a selection criterion that displays all future events based on the current date:

 [[Category:Event]]
 [[end date::>{{CURRENTYEAR}}-{{CURRENTMONTH}}-{{CURRENTDAY}}]]

Another particularly useful variable for inline queries is {{FULLPAGENAME}} for the current page with namespace, which allows you to reuse a generic query on many pages. For an example of this, see Property:Population. Read about inline queries for more information.
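
A sketch of such a generic query, which could be placed in a template and reused on many pages. The category «City» and the properties «located in» and «population» are assumptions taken from the examples on this page:

  {{#ask:
   [[Category:City]]
   [[located in::{{FULLPAGENAME}}]]
   |?population
  }}

On a page named «Italy», this would ask for the cities annotated as located in Italy.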

Sorting results

It is often helpful to present query results in a suitable order, for example to present a list of cities ordered by population. Special page "Ask" has a simple interface to add one or more sorting conditions to a query. The name of the property to sort by is entered into a text input, and ascending or descending order can be selected. SMW will usually attempt to sort results by the natural order that the values of the selected property may have: "numbers" are sorted numerically, "text" is sorted alphabetically and "dates" are sorted chronologically. The order therefore is the same as in the case of the < and > comparators in queries. If no specific sorting condition is provided, results will be ordered by their page name.

It is possible to provide more than one sorting condition. If multiple results turn out to be equal regarding the first sorting condition, the next condition is used to order them and so on. For example we could get a list of cities by their average number of rainy days per year, but grouped by the country they are in, with the following query:

{{#ask:
 [[Category:City]]
 |?Located in=Country
 |?Average rainy days
 |sort=Located in,Average rainy days
 |order=asc,desc
}}

resulting in

             Country                                 Average rainy days
  Sydney     Australia                               143.7
  Karlsruhe  Baden-Württemberg                       124
  San Diego  California, United States of America    29.4
  London     England                                 145
  Paris      France                                  111.5
  Munich     Germany                                 129.4
  Stuttgart  Germany, Baden-Württemberg              112.4
  Frankfurt  Germany                                 108.9
  Berlin     Germany                                 106.3
  Rome       Italy                                   84.5
  Amsterdam  Netherlands                             234
  Warsaw     Poland                                  159
  Porto      Portugal                                140

Sorting a query also influences the result of a query, because it is only possible to sort by property values that a page actually has. Therefore, if a query is ordered by a property (say «Population»), SMW will usually restrict the query results to those pages that have at least one value for this property (i.e. only pages with a specified population appear). If the query does not already require that the property is present in each query result, SMW will silently add this condition. However, SMW will always try to find the ordering property within the given query first, and it is even possible to order query results by subproperties. Some examples should illustrate this:

  • [[Category:City]] [[Population::+]] ordered by "Population" will present the cities with population in ascending order. The query result is the same as without the sorting.
  • [[Category:City]] ordered by "Population" will again present the cities with population in ascending order. The query result may be modified due to the sorting condition: if there are cities without a population provided, then these will no longer appear in the result.
  • [[Category:City]] [[has location country.population::+]] ordered by "Population" will present the cities ordered by the population of the countries they are located in. The query result is not changed, but "population" now refers to a property used in a subquery. Again the population must be annotated for the countries to avoid it being omitted due to the sorting condition.

If a property that is used for sorting has more than one value for some page, then this page will still appear only once in the result list. The position that the page takes in this case is not defined by SMW and may correspond to any of the property values. In the above examples, this would occur if one city had multiple population numbers specified, or if one city were located in multiple countries each of which has a population. It is suggested to avoid such situations.

Query results displayed in a result table can also be ordered dynamically by clicking on the small sort icons found in the table heading of each column. This function requires JavaScript to be enabled in the browser and will sort only the displayed results. So if, e.g., a query has retrieved the twenty world-largest cities by population, it is possible to sort these twenty cities alphabetically or in reverse order of population, but the query will certainly not show the twenty world-smallest cities when reversing the order of the population column. The dynamic sorting of tables attempts to use the same order as used in SMW queries, and in particular orders numbers and dates in a natural way. However, the alphabetical order of strings and page names may slightly vary from the wiki's alphabetic order, simply because there are many international alphabets that can be ordered in different ways depending on the language preference.
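
A sketch of a query like the one just described, assuming a category «City» and a property «Population»; limit is the inline query parameter that restricts the number of results:

  {{#ask:
   [[Category:City]]
   |?Population
   |sort=Population
   |order=desc
   |limit=20
  }}

Reversing the Population column in the resulting table reorders these twenty cities only; to obtain the twenty smallest cities, the query itself would need order=asc.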

Linking to Semantic Search Results

Links to semantic query results on Special:Ask can be created by means of the inline query feature in SMW as explained in its documentation. It is not recommended to create links directly, since they are very lengthy and use a specific encoding. Developers who create extensions that link to Special:Ask should also use SMW's internal functions for building links. Understanding the details of SMW's encoding of queries in links is therefore not required for using SMW.

Things that are not possible

Subqueries for properties

It is not possible to use a subquery to obtain a list of properties that is then used in a query. One can, however, use a query that returns a list of properties, and copy and paste the result into another query. Alternatively, one can use the template results format to pass properties directly to another query.

Querying for the absence of a property

It is not possible to query for the absence of a property.

Queries with special properties

SMW currently does not support queries for the values of any of SMW's built-in special properties such as "Has type", "Allows value" or "Equivalent URI".

Filtering Categories and Concepts

It is not possible to ask for all pages of a query that are not in a specific category or concept. For example, the query {{#ask: [[Category:!City]] }} will query every page in category "!City" instead of every page that is not in category "City".

Notes
  1. See this mailing list post for detailed information


This documentation page applies to all SMW versions from 1.5.3 to the most current version.