Here are some frequently asked questions about Semantic MediaWiki.
- 1 What is Semantic MediaWiki?
- 2 Who develops Semantic MediaWiki?
- 3 How popular is it?
- 4 How is Semantic MediaWiki's performance?
- 5 How does SMW store its data?
- 6 Why doesn't SMW use SPARQL as its query language?
- 7 What knowledge can be inferred in SMW?
- 8 Why doesn't data I have just added show up in queries?
- 9 What is the relationship between Semantic MediaWiki and Wikidata?
- 10 What additional features are planned for SMW?
- 11 I would like to contribute a bug fix/new feature/new extension. How do I do that?
- 12 How else can I help the SMW effort?
- 13 What are the alternatives to SMW?
- 14 Are there any SMW-related events or conferences?
- 15 There's so much documentation! How do I get started?
What is Semantic MediaWiki?
Semantic MediaWiki (abbreviated SMW) is an extension to MediaWiki, the wiki application best known for powering Wikipedia; it lets users store data in wiki pages, and query it elsewhere, thus turning a wiki that uses it into a semantic wiki. A large number of other extensions exist that are meant to be used in conjunction with Semantic MediaWiki (there are currently around 30 active ones); the term "Semantic MediaWiki" is sometimes used to refer to this entire family of extensions. These extensions cover everything from helping users to enter data, to visualizing data (in formats such as maps, calendars and graphs), to importing and exporting SMW data, to using SMW for workflow purposes.
The extension is sometimes referred to simply as "Semantic", perhaps because most of its spinoff extensions have names that start with that word, like "Semantic Result Formats"; this usage is incorrect, though.
Semantic MediaWiki, like almost all other MediaWiki extensions, is both free and open source.
Who develops Semantic MediaWiki?
SMW was originally developed by several programmers at the Karlsruhe Institute of Technology (KIT). Since then, others have taken over the development effort (and the original developers have all left KIT). See SMW Project for more information on SMW's developers and development history. SMW-based extensions are developed by a large number of contributors, some of whom are also core SMW developers.
How popular is it?
SMW is currently in use on over 1,800 active public wikis. It is impossible to know on how many private wikis it runs, but a rough guess, based on the frequency of public and private wikis asked about in bug reports and the like, is that it's used on about the same number of private wikis as public wikis.
Some of the better-known public wikis that use Semantic MediaWiki include OpenEI and SNPedia. Companies that use SMW internally include Johnson & Johnson, Philips and Pfizer. Other organizations that use SMW include NASA, NATO and the United Nations. SMW is also used by government agencies in various countries, including the United States, Canada and Austria.
How is Semantic MediaWiki's performance?
Various tests have already been done to determine SMW's performance, and there is also experience from fairly large-scale real SMW installations. Unfortunately, none of the findings of these tests have been published yet. However, we do know some of the conclusions: SMW has been used successfully even with millions of rows of data, and the limiting factor is usually overly complex queries. Various standard performance improvements have been shown to help SMW performance, including the use of caching tools like APC and memcached, and MySQL adjustments like increasing the buffer size and using a separate database server. The page "Speeding up Semantic MediaWiki" includes more such tips.
How does SMW store its data?
Semantic MediaWiki stores its data via about 10 additional tables within the database that MediaWiki uses (which is usually a MySQL database). SMW can additionally store its data within an RDF triplestore, such as 4store and Virtuoso, although even in such cases the standard database is also used. See Using SPARQL and RDF stores for more information on triplestore storage.
Why doesn't SMW use SPARQL as its query language?
SPARQL is the main query language of the semantic web, and there are a number of big advantages to using SPARQL to query SMW data: it's more flexible overall than SMW's native query language; it would allow for querying semantic data from other sources in addition to the wiki's own data; and it would save users from having to learn yet another query language.
The use of SPARQL is considered infeasible for querying the data stored directly within the main SMW database tables, though, mostly due to that same flexibility: there are many SPARQL queries that can't be handled by SMW's current querying system.
However, SPARQL querying is easy to do on data that's stored via an RDF triplestore. It is possible to store SMW's data in RDF, and to issue SPARQL queries on that data from the wiki, in various ways — see the question above.
What knowledge can be inferred in SMW?
One of the strengths of a semantic system is that not every piece of data has to be stated explicitly; some can be inferred. Currently SMW supports four types of inferencing: subcategory (querying for pages in a certain category will also get the pages in all its subcategories), sub-property (properties can be declared as sub-properties of other properties, and can be queried in that way), equality (a property pointing to a page that redirects to another page will transfer its value to the other page) and inverse properties (you can query on properties in the reverse direction). See Inferencing for more information.
There are two additional approaches for inferencing: if you're using templates to store semantic data, you can create custom inferencing within the template — for example, by using the #if parser function, you could add a call to a page about a person so that, if the person has at least one child and the person is male, the page gets added to the category "Fathers". More expressive inferencing can also be done by using RDF triple stores and SPARQL querying.
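The subcategory inferencing described above can be pictured with a small Python sketch. This is only a toy model under stated assumptions, not SMW's actual implementation: the category tree, the page names and the function `pages_in_category` are all hypothetical, invented purely for illustration.

```python
# Hypothetical toy data (not taken from any real wiki):
# each category maps to its direct subcategories and to its direct member pages.
SUBCATEGORIES = {
    "Settlements": ["Cities", "Villages"],
    "Cities": ["Capital cities"],
}
PAGES = {
    "Settlements": ["Springfield"],
    "Cities": ["Hamburg"],
    "Capital cities": ["Berlin"],
    "Villages": ["Little Snoring"],
}


def pages_in_category(category, subcategories=SUBCATEGORIES, pages=PAGES):
    """Return pages in `category` plus, transitively, in all its subcategories."""
    seen = set()
    stack = [category]
    result = []
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles in the category tree
        seen.add(current)
        result.extend(pages.get(current, []))
        stack.extend(subcategories.get(current, []))
    return sorted(result)


# "Berlin" is only stated to be in "Capital cities", but a query for
# "Cities" still finds it because "Capital cities" is a subcategory.
print(pages_in_category("Cities"))  # ['Berlin', 'Hamburg']
```

The point of the sketch is that "Berlin" never has to be placed in "Cities" explicitly; its membership follows from the category hierarchy.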
Why doesn't data I have just added show up in queries?
There is sometimes a lag between when SMW data gets created or modified, and when that new data shows up in queries; that is due to MediaWiki's own page caching. Some people, not knowing any alternatives, get around this by re-saving the page containing the query, but this is not necessary: you can refresh the query just by doing a MediaWiki refresh/purge on the page. If you are a MediaWiki administrator, you can do this by simply hitting the "refresh" tab (not to be confused with the browser's "reload" button, which will not have any effect). If you are not an administrator, going to the URL that ends with "&action=purge" for that page will have the same effect. Or you can simply wait: cached pages usually get refreshed within 24 hours.
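As a small illustration of the purge URL mentioned above, the following Python sketch builds one for a given page. The base URL `https://wiki.example.org/w/index.php` and the function name `purge_url` are hypothetical, and the sketch assumes the wiki serves pages through `index.php?title=...`; a wiki with a different URL layout would need a different base.

```python
from urllib.parse import urlencode


def purge_url(index_php_url: str, page_title: str) -> str:
    """Build the URL that asks MediaWiki to purge the cached copy of a page."""
    # MediaWiki page titles use underscores in place of spaces.
    query = urlencode({"title": page_title.replace(" ", "_"), "action": "purge"})
    return f"{index_php_url}?{query}"


# Hypothetical wiki address, used only for illustration.
print(purge_url("https://wiki.example.org/w/index.php", "Main Page"))
# https://wiki.example.org/w/index.php?title=Main_Page&action=purge
```

Opening the resulting URL in a browser is the manual equivalent of the "refresh" tab described above.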
If there are certain wiki pages that you want to never be cached, you can install the "MagicNoCache" extension (MediaWiki.org), and add the string (behavior switch) "__NOCACHE__" anywhere within those pages.
Finally, if you run a small- or medium-sized wiki, you can simply disable MediaWiki page caching altogether, which probably will not have a huge impact on performance. This can be done in one easy step.
What is the relationship between Semantic MediaWiki and Wikidata?
Wikidata is a project that began in 2012, coordinated by Wikimedia Deutschland, with the goal of creating a single massive data store that can be used by all the individual language Wikipedias to populate their infoboxes, as well as by the outside world. It has a number of relationships with Semantic MediaWiki.
Wikidata is, in some sense, the fulfillment of the original dream of Semantic MediaWiki. SMW began as a proposal to allow for a "semantic Wikipedia", that could query and export its own data; and much of the early development of SMW was motivated by that goal. However, Wikipedia has a number of special requirements that SMW by itself is unable to fulfill, and which made something like Wikidata necessary: these include the fact that the same data is meant to be displayed across a wide variety of languages, and the fact that every piece of information requires a citation.
Some of the Wikidata team, most notably Wikidata founder Denny Vrandečić, began as members of the SMW community.
The software that powers Wikidata is a set of MediaWiki extensions collectively known as Wikibase, and though Wikibase has similarities to Semantic MediaWiki, it is a distinct set of software. However, some of SMW's backend code was spun off into a separate library, called "DataValues", which is used by both SMW and Wikibase as a framework for storing data.
There is the potential that Wikibase and Semantic MediaWiki will compete against one another as software, with some wikis choosing to use Wikibase instead of SMW as their data storage system. This seems doubtful, however: the Wikibase user interface is geared for a highly multilingual, highly general knowledge base like Wikipedia. Wikis with a specific focus and only one or a handful of languages would be better off with the greater structure and simplicity of Semantic MediaWiki.
There is a similar potential that Wikidata and Semantic MediaWiki will compete against one another, with some groups opting to store their data directly in Wikidata, instead of creating a wiki for it in the first place. Whether this can happen or not depends in large part on the extent to which Wikidata administrators (who are a different group from Wikidata's developers) allow for non-Wikipedia data on the site.
Finally, though Wikidata is intended for use by multi-language wikis like Wikipedia, there are various Wikimedia Foundation projects that are English-only, or majority-English, that could potentially benefit from directly using Semantic MediaWiki, including Wikimedia Commons, Wikispecies and MediaWiki.org. SMW is already in use on a WMF-affiliated wiki, Translatewiki.net (see map).
What additional features are planned for SMW?
You can see the roadmap page for a listing of some of the planned new features for both SMW and the extensions based on it.
I would like to contribute a bug fix/new feature/new extension. How do I do that?
First of all, we appreciate the enthusiasm of every potential new contributor to the SMW project. Helpful features and bug fixes have been contributed by many users, and nearly every SMW developer started out as just contributing small bits of code.
If what you would like to contribute is just a bug fix, the best way to do it is to create a patch for your code, and submit it via Phabricator.
If the contribution is a new feature, or even a possible new extension, it is strongly recommended to first write an email about it to the semediawiki-user mailing list — it could be that such a feature or extension already exists, or that someone is already working on it, or that it's been discussed before and is considered infeasible, or at the very least that others will have helpful ideas about how to implement it.
If you are planning to start doing development, please see the developer's support page for further information and resources.
How else can I help the SMW effort?
Even if you're not a developer, there are various things you can do to help the SMW project. First, you can donate financially. The Open Semantic Data Association (OSDA) is a non-profit charity organization that is focused on helping Semantic MediaWiki development. Any money sent to OSDA will be directly used to help further the cause of SMW. You can donate, via PayPal, at the bottom of the OSDA homepage.
If Semantic MediaWiki has been helpful to you or your organization, you could write a testimonial about it — just a short description of how it's been helpful, sent to email@example.com, would be greatly appreciated. You can also help answer the questions of other users, both on the semediawiki-user mailing list and on the #semantic-mediawiki IRC channel.
If you have a blog or a Twitter account, you could write something there about Semantic MediaWiki. And finally, if you have any connections in the media, or you're in the media yourself, we think SMW makes for a compelling subject — the amount of press coverage SMW has received so far, both in print and online, has been woefully small.
What are the alternatives to SMW?
In the MediaWiki world, the Cargo extension bills itself as an alternative to Semantic MediaWiki, though there are a number of differences between the two. The DynamicPageList (DPL) extension is also sometimes compared to SMW. It, too, allows for querying pages, although based only on categories and other standard MediaWiki attributes, like the date a page was last revised. There's no rule against using more than one of these extensions, though, and some wikis do use a combination of them.
Outside of MediaWiki, though, we truly believe that there's no other software, either free or proprietary, that enables flexible, collaborative data structures in the way that Semantic MediaWiki does. Nonetheless, within corporations, Microsoft SharePoint comes up fairly often as an alternative option (see here for one view of the advantages of SMW over SharePoint).
There are various other semantic wiki applications, although none of them have achieved anything close to the user base and name recognition of Semantic MediaWiki.
SMW shares some of the characteristics of document-oriented databases like MongoDB, although SMW functions more like a front-end application than such databases usually do, so they're rarely seen as alternatives.
In the big picture, the real competitor to Semantic MediaWiki is every so-called "turnkey" application, meant to store a specific type of data. We would like to see the users of many of these applications consider switching to SMW as a cheap, flexible alternative.
Are there any SMW-related events or conferences?
There is an annual meeting of SMW users and developers, called "SMWCon", or the Semantic MediaWiki Conference, which occurs in Europe in the fall.
From 2010 to 2015, SMWCon was a twice-yearly event, which was also held in the spring in North America. In 2016, the North American event was renamed to EMWCon, or the Enterprise MediaWiki Conference. Despite its name, and more general topic, EMWCon also has a big focus on Semantic MediaWiki and its related extensions.
The annual Wikimania conference usually has a contingent of SMW and/or Wikidata developers (SMW was in fact first proposed at the first Wikimania, in 2005). The OpenSym event series also has SMW-related topics from time to time.
There's so much documentation! How do I get started?
One way to get started using Semantic MediaWiki is to figure out which additional extensions you want to use — those can have a big effect on your data structure, and on what you use SMW for on your wiki. You can see the full list of SMW extensions here.
You can also check out Extension:Semantic Bundle, a package of SMW-based extensions that you can download, that offers a "curated" set of software around SMW.
Also, you can check out the set of SMW wikis of the month, for some examples of wikis that have successfully used Semantic MediaWiki for a variety of purposes.