Localization and multilingual content

From semantic-mediawiki.org

Localization [1] and support for multilingual environments [2] in Semantic MediaWiki were improved in the 2.4 release.


Multilingual support in Semantic MediaWiki rests on the assumption that each content page belongs to one specific content language, and that pages with similar content in different languages link to each other (by means of a semantic relation, not a simple literal declaration).

Whether this conceptual framework is an acceptable practice is left to users to decide, but Semantic MediaWiki can support multilingual content scenarios to the degree outlined here.

Technical exposition

Global assumptions about the user and content language were removed from internal data providers so that values can be formatted for an explicit language (which can be either user or content oriented).

The monolingual text type allows SMW to store a text segment together with a specific language. It is used by:

  • property descriptions, which are stored as property annotations to provide a localized description of a property. This gives users a clear idea of what a property represents and helps them apply it correctly. For example, dwc:kingdom shows its description in the selected user language, if available.
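
As a rough illustration of the idea, the following Python sketch models a monolingual text value as a text/language pair and selects a property description for a given user language. It is not SMW's implementation (SMW is written in PHP); the names `MonolingualText`, `parse_monolingual` and `describe`, the `text@lang` input notation, and the sample descriptions are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class MonolingualText:
    """A text segment stored together with the language it is written in."""
    text: str
    lang: str


def parse_monolingual(raw: str) -> MonolingualText:
    """Split 'text@lang' at the last '@' (notation assumed for this sketch)."""
    text, sep, lang = raw.rpartition("@")
    if not sep or not text or not lang:
        raise ValueError(f"expected 'text@lang', got {raw!r}")
    return MonolingualText(text=text.strip(), lang=lang.strip().lower())


def describe(descriptions: Sequence[MonolingualText], user_lang: str) -> Optional[str]:
    """Return the description in the user's language, or None if unavailable."""
    for item in descriptions:
        if item.lang == user_lang.lower():
            return item.text
    return None


# Hypothetical descriptions attached to a single property.
descriptions = [
    parse_monolingual("The scientific name of the kingdom@en"),
    parse_monolingual("Le nom scientifique du règne@fr"),
]
```

Calling `describe(descriptions, "fr")` picks the French text, while a language without a stored description yields `None`, which mirrors the "if available" behaviour described above.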

Multilingual content

In the past, the only way to achieve a multilingual wiki was to split it on a per-language basis and loosely interconnect the content via interwikilinks. While this may work for large installations, it seems unreasonable to expect maintainers of smaller sites to apply the same concept.

A general obstacle to providing localizable content has always been that the content language (or site language) is global and determines the rules by which content is interpreted editorially (e.g. . vs. , as separator in numbers, fonts, ltr vs. rtl etc.).

The Semantic Interlanguage Links extension (in connection with SMW) can help mitigate this limitation by allowing pages not only to link to each other "semantically" by means of so-called interwikilinks but also to declare a specific page content language. Because the global content language no longer takes precedence over the content of a particular page (the page content language can be entirely different from the global content language), a page gains more editorial leverage to apply different rules.

The concept of an individual page content language is important because each page (and thereby its content) can declare a dependency on a selected language (and its rules). For example, a page that declares itself to be in French can create annotations using French rules, so users can keep the writing style of the denoted content language.

The sandbox demonstrates this concept more clearly: with the site language set to French (which would require all numeric annotations to carry , as decimal separator), pages that denote their own page content language (e.g. Berlin) are no longer restricted to a "French" content interpretation.

Interlanguagelinks and content language annotations

When Semantic Interlanguage Links is used to interlink pages that refer to the same subject (e.g. {{interlanguagelink:en|Berlin}} on pages that use Berlin as their interlanguage reference) and en is explicitly denoted as the language, the page content is free to follow the rules of that language rather than the site-wide editorial preference.

As of SMW 2.4, Semantic MediaWiki understands that the page content language takes precedence over the global content language. In the earlier example, . is now identified as the numeric decimal separator, so annotations such as [[Has area::891.85 km²]] are interpreted in the denoted language instead of the global one (which is French).
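
The precedence rule can be sketched in a few lines of Python. This is a simplified model, not SMW's parser: the function names and the `DECIMAL_SEPARATOR` table are assumptions for illustration, only the two languages from the example are covered, and thousands separators are ignored (a string containing the other language's separator is simply rejected).

```python
from typing import Optional

# Decimal separators for the two languages used in the example.
DECIMAL_SEPARATOR = {"en": ".", "fr": ","}


def effective_language(site_lang: str, page_lang: Optional[str]) -> str:
    """A declared page content language wins over the global site language."""
    return page_lang or site_lang


def parse_number(raw: str, site_lang: str, page_lang: Optional[str] = None) -> float:
    """Interpret a numeric string by the rules of the effective content language."""
    separator = DECIMAL_SEPARATOR[effective_language(site_lang, page_lang)]
    other = "," if separator == "." else "."
    if other in raw:
        raise ValueError(f"{raw!r} does not follow the rules of the effective language")
    return float(raw.replace(separator, "."))
```

With the site language set to `fr`, `parse_number("891.85", site_lang="fr", page_lang="en")` accepts the English notation because the page declares `en`, whereas the same string on a page without its own language is rejected in this sketch.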


Supporting multilingual content from an editorial perspective is important, but localization is another significant factor in providing a better reading experience. Improvements were made here as well: when a user chooses a specific user language, the formatting of query results is (where allowed and possible) available in a localized version.

  • #LOCL was added as an output format option to signal that the display attributes of a value are to be shown in a localized context.
  • Special:Browse and Special:SearchByProperty have been made aware of the user context and show localized values if available.
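
The effect of such a localization flag can be modelled as follows. This Python sketch is only an analogy: the `localized` parameter stands in for #LOCL, the fallback to the content language when the flag is absent is an assumption of this example, and the separator and boolean label tables are hypothetical stand-ins rather than SMW's actual language data.

```python
# Hypothetical per-language display rules (not SMW's real language files).
DECIMAL_SEPARATOR = {"en": ".", "fr": ","}
BOOLEAN_LABELS = {"en": ("true", "false"), "fr": ("vrai", "faux")}


def format_value(value, content_lang: str, user_lang: str, localized: bool = False) -> str:
    """Format a boolean or numeric value for display.

    With localized=True (standing in for #LOCL) the user language is used;
    otherwise this sketch assumes the content language applies.
    """
    lang = user_lang if localized else content_lang
    # bool must be checked before numbers because bool is a subclass of int.
    if isinstance(value, bool):
        yes, no = BOOLEAN_LABELS[lang]
        return yes if value else no
    return f"{value:.2f}".replace(".", DECIMAL_SEPARATOR[lang])
```

For a user who selected French on an English content page, `format_value(891.85, "en", "fr", localized=True)` uses the French decimal separator, while the same call without the flag keeps the content language notation.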


See also


  1. ...the process of translating a product into different languages or adapting a product for a specific country or region
  2. https://github.com/SemanticMediaWiki/SemanticMediaWiki/issues/594
  3. #1591 Localization of numeric and quantity values
  4. #1580 Localization of boolean values
  5. #1533 Localization of date values
  6. Property:Dwc:kingdom Localization of property descriptions in a user context
  7. Berlin to demonstrate the difference between page content and global content language.