A data item represents data from the system's perspective: it is the form in which SMW exchanges data with a database so that the data can be managed, stored, and queried.

This class and its subclasses are the basic building blocks of all SMW elements. Their purpose is to provide a unified interface for all semantic entities that SMW deals with, e.g., numbers, dates, geo coordinates, wiki pages, and properties. It might be surprising that not only values but also subjects and properties are represented by subclasses of SMWDataItem. This makes sense since wiki pages can be both subjects and values, and since properties have many similarities with wiki pages (in particular they have associated articles).

Objects of class SMWDataItem represent very simple pieces of data. A dataitem is like a primitive type (e.g. a PHP string or number): its identity is determined by its contents and nothing else. Dataitems should thus be thought of as "primitive values" that are merely a bit more elaborate than the primitive types in PHP. Their main characteristics are:

  • Immutable: Once created, a dataitem cannot be changed.
  • Context independent: The meaning of a dataitem is only based on its content, not on any contextual information (such as the information about the property it is assigned to).
  • Limited shape: The kinds of dataitems (numbers, URLs, pages, ...) that SMW supports are limited and fixed. Extensions cannot add new kinds of dataitems, and programmers only need to handle a fixed list of possible kinds of dataitems.

Being immutable is essential for dataitems to behave like simple values. It imposes a restriction on programmers, but it also simplifies programming a lot, since one does not have to be concerned about dataitems being changed by code that happens to hold a reference to them.
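
The two properties "identity by content" and "immutable" can be illustrated with a small language-neutral sketch. It is written in Python rather than PHP, and the class `NumberItem` is a hypothetical stand-in, not SMW's actual SMWDINumber class:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class NumberItem:
    """Hypothetical stand-in for a numeric dataitem (not SMW's SMWDINumber)."""
    number: float


a = NumberItem(42.0)
b = NumberItem(42.0)

# Identity is determined by content alone: two separately created
# items with the same content compare equal and hash equally.
same_content = (a == b) and (hash(a) == hash(b))

# Immutability: an attempt to change the content after creation fails.
try:
    a.number = 7.0
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because such an object can never change, it can be handed to any other code without making a defensive copy first.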

The available kinds of dataitems correspond to subclasses of SMWDataItem. For convenience, each kind of dataitem is also associated with a PHP constant called its "DIType". For example, instead of using a nested if-then-else statement with many instanceof checks, one can use a switch over this DIType to handle different cases. The following table gives all dataitems:

| Class | DIType | Description |
| --- | --- | --- |
| SMWDIWikiPage | SMWDataItem::TYPE_WIKIPAGE | Dataitems that represent a page in a wiki or a "subobject" of such a page. They are determined by the page title (string in MediaWiki DBkey format), namespace, interwiki code, and a subobject name (can be empty). |
| SMWDIProperty | SMWDataItem::TYPE_PROPERTY | Dataitems that represent an SMW property. They are determined by the property key (which is the page DBkey string for user-defined properties), and the information whether or not they are inverted. |
| SMWDINumber | SMWDataItem::TYPE_NUMBER | Dataitems that represent some number. |
| SMWDIString | SMWDataItem::TYPE_STRING | Dataitems that represent a string that is not longer than MediaWiki titles (255 characters). |
| SMWDIBlob | SMWDataItem::TYPE_BLOB | Dataitems that represent a string (of any length). |
| SMWDIBoolean | SMWDataItem::TYPE_BOOLEAN | Dataitems that represent a truth value (true or false). |
| SMWDIUri | SMWDataItem::TYPE_URI | Dataitems that represent a URI (or IRI) according to RFC 3987. |
| SMWDITime | SMWDataItem::TYPE_TIME | Dataitems that represent a point in time in human or geological history. They are determined by a year, month, day, hour, minute, and (decimal) second, as well as a calendar model to interpret these values in (Julian or Gregorian). |
| SMWDIGeoCoord | SMWDataItem::TYPE_GEO | Dataitems that represent a location on earth, represented by latitude and longitude. |
| SMWDIContainer | SMWDataItem::TYPE_CONTAINER | Dataitems that represent a set of SMW facts, represented by an object of type SMWSemanticData (see below). |
| SMWDIConcept | SMWDataItem::TYPE_CONCEPT | Dataitems that represent the input and feature information for some SMW concept (query, description, features in query, size and depth). |
| SMWDIError | SMWDataItem::TYPE_ERROR | Dataitems that represent a list of errors (array of string). Used to gently pass on errors when dataitem return types are expected. |
| (no class) | SMWDataItem::TYPE_NOTYPE | Additional DIType constant that is used to indicate that the type is not known at all. |
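
The idea of switching over a type constant instead of chaining instanceof checks can be sketched as follows. This is an illustrative model in Python, not SMW code: the enum `DIType`, the class `Item`, and the function `describe` are hypothetical names, and only a small subset of the kinds listed in the table is covered.

```python
from enum import Enum, auto


class DIType(Enum):
    """Hypothetical subset of type constants, loosely modelled on SMWDataItem::TYPE_*."""
    NUMBER = auto()
    STRING = auto()
    BOOLEAN = auto()
    NOTYPE = auto()


class Item:
    """Minimal stand-in for a dataitem: a type constant plus some content."""

    def __init__(self, di_type, content):
        self.di_type = di_type
        self.content = content


def describe(item):
    """Dispatch on the type constant rather than on a chain of isinstance checks."""
    handlers = {
        DIType.NUMBER: lambda c: f"number {c}",
        DIType.STRING: lambda c: f"string '{c}'",
        DIType.BOOLEAN: lambda c: "true" if c else "false",
    }
    handler = handlers.get(item.di_type)
    if handler is None:
        # Fallback branch, comparable to the default case of a switch.
        return "unhandled kind of dataitem"
    return handler(item.content)
```

Since the list of kinds is fixed, a dispatch table like this can be exhaustive, with the fallback branch reserved for cases such as an unknown type.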

The restriction to these types of dataitem may at first look like a major limitation, since it means that SMW can only represent limited forms of data. For example, there is no dataitem for storing the structure of chemical formulae. Doesn't this mean that SMW can never handle such data? No, because the existing dataitems can be used to keep all required information (for example by representing chemical formulae as strings). The task of interpreting this basic data as a chemical formula has to be handled on higher levels that deal with user input and output (the user view is explained in later sections). There is one kind of dataitem, SMWDIContainer, that represents "values" that consist of many SMW facts (subject-property-value triples); almost all complex forms of data that SMW does not have a dataitem for could be accurately represented in this format.
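
The division of labour in the chemical formula example can be sketched roughly as follows. This is a Python illustration under stated assumptions: `StringItem` and `interpret_formula` are hypothetical names, and the parser only handles flat formulae such as "H2O" (no brackets or charges).

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class StringItem:
    """Hypothetical stand-in for a string dataitem: it only stores the text."""
    string: str


def interpret_formula(item):
    """Higher-level interpretation, kept outside the dataitem itself.

    Returns a mapping from element symbol to atom count.
    """
    counts = {}
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", item.string):
        # A missing count means a single atom of that element.
        counts[element] = counts.get(element, 0) + int(count or 1)
    return counts


water = StringItem("H2O")
atoms = interpret_formula(water)
```

The dataitem stays a plain string; everything that knows about chemistry lives in the interpreting function.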

Creating dataitems is very easy: just call the constructor of the dataitem with the required values. Note that dataitems are strict about data quality: they are not meant to show the error-tolerance of the SMW user interface. For a programmer, it is more useful to see a clear error than to have SMW use some "repaired" or partly "guessed" value when a problem occurred. When trying to create dataitems from illegal data (e.g. trying to create a wiki page dataitem with an empty page title), an exception will be thrown. Usually dataitems only implement basic data validation, to avoid complex computations. If strict validation of, say, a URI string is needed, callers have to implement their own validation methods.
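
The "fail loudly instead of repairing" behaviour can be sketched like this. The sketch is in Python and uses hypothetical names (`DataItemError`, `PageItem`); it only mirrors the empty-title example from the text and is not SMW's actual constructor logic.

```python
class DataItemError(ValueError):
    """Hypothetical exception type for illegal dataitem content."""


class PageItem:
    """Hypothetical stand-in for a wiki page dataitem."""

    def __init__(self, title, namespace=0):
        # Basic validation only: reject clearly illegal input instead of
        # guessing a "repaired" value.
        if title == "":
            raise DataItemError("page title must not be empty")
        self._title = title
        self._namespace = namespace

    @property
    def title(self):
        return self._title


try:
    PageItem("")
    rejected = False
except DataItemError:
    rejected = True

valid = PageItem("Berlin")
```

Code that accepts raw user input would catch the exception and report a friendly message; the dataitem itself does not attempt any recovery.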

Dataitems implement a standard interface that allows useful operations like serialization and unserialization (a second way to create them, namely from serialized strings). They can also generate a string hash code to efficiently compare their contents. Each dataitem also implements basic get methods to access the data, and sometimes other helper methods that are useful for the given kind of data. See the online documentation for details. The important thing is to keep dataitems lean and simple data containers: complex parsing or formatting functions are implemented elsewhere.
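
A serialization round trip and a content hash might look roughly like the following. This is a Python sketch with assumed details: the class `PageItem`, its method names, and the "#"-separated string format are chosen for illustration and should not be read as SMW's actual interface.

```python
class PageItem:
    """Hypothetical stand-in for a wiki page dataitem."""

    def __init__(self, dbkey, namespace, interwiki="", subobject=""):
        self.dbkey = dbkey
        self.namespace = namespace
        self.interwiki = interwiki
        self.subobject = subobject

    def serialize(self):
        """Flatten the content into one string (assumed '#'-separated format)."""
        return "#".join(
            [self.dbkey, str(self.namespace), self.interwiki, self.subobject]
        )

    @classmethod
    def unserialize(cls, serialization):
        """Second way to create an item: from its serialized string."""
        parts = serialization.split("#", 3)
        if len(parts) != 4:
            raise ValueError("malformed serialization")
        dbkey, namespace, interwiki, subobject = parts
        return cls(dbkey, int(namespace), interwiki, subobject)

    def get_hash(self):
        """String hash for cheap content comparison; here simply the serialization."""
        return self.serialize()


original = PageItem("Berlin", 0)
copy = PageItem.unserialize(original.serialize())
same = original.get_hash() == copy.get_hash()
```

Comparing the two hash strings is enough to decide whether two items carry the same content, without inspecting each field separately.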