Programmer's guide

From semantic-mediawiki.org

This page helps developers who want to support Semantic MediaWiki development, or develop extensions to SMW, find the resources they need to get started. It is strongly recommended to be familiar with the usage of SMW as well.

Moreover, there is an SMW architecture guide that provides a basic introduction to the main ideas and concepts in SMW from a developer perspective, which should be useful for reading and writing SMW-related code.

Development policies and practices | Architecture guide | Technical insights | Testing | Pull request

Objective

This document should help newcomers and developers to navigate around Semantic MediaWiki and its development environment.

The main objective of the Semantic MediaWiki software is to provide "semantic" functions on top of MediaWiki that enable machine-reading of wiki content and allow structured content to be queried and displayed using different backends, including:

  • SQLStore, the default storage and query engine for small and mid-size wikis
  • ElasticStore, recommended for large wiki farms that need to scale or for users who need to combine structured and unstructured searches
  • SPARQLStore, for advanced users who need to work with a triple store and linked data

Development policies and practices

Policies

The general policy of the Semantic MediaWiki software and the development thereof is:

  • No MediaWiki tables are modified or altered; any data that needs to be stored persistently relies on Semantic MediaWiki's own database schema (writing to the cache is an exception)
  • No MediaWiki classes are modified, patched, or otherwise changed
  • Only publicly available Hooks and API interfaces are used to extend MediaWiki with Semantic MediaWiki functions
  • Classes and public methods (i.e. those declared using the public visibility attribute) marked as @private are not intended for public consumption and are not part of the public API; users should not rely on them, as their signatures may change or they may be removed at any time without prior notice
  • Tables created and managed by Semantic MediaWiki should not be accessed directly; instead, a user (or extension) should use the publicly available API to fetch relevant information

Conventions

These conventions help developers and the project maintain a consistent product and create testable components, where classes have a small footprint and a dedicated responsibility.

  • The top-level namespace is SMW and each component should be placed in a namespace that represents the main responsibility of the component
  • PSR-4 is used for resolving classes and namespaces in the src directory (includes is the legacy folder that doesn't necessarily follow any of the policies or conventions mentioned in this document)
  • Development happens against the master branch (see also the release process) and changes are released according to the available release plan; backports should be cherry-picked and merged into the targeted branch
  • Semantic MediaWiki tries to depend only on a selected pool of MediaWiki core classes (Title, WikiPage, ParserOutput, RevisionRecord, Language ...) to minimize the potential for breakage during release changes
  • It is expected that each new class and functionality is covered by corresponding unit tests and if the functionality spans into different components integration tests are required as well to ensure that the behaviour is tested across components and produces deterministic and observable outputs.
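
The PSR-4 convention above maps a fully qualified class name to a file path. The following is a minimal, language-neutral sketch in Python (SMW itself is PHP); it assumes that the `SMW\` prefix is mapped to the `src/` directory, and the class name used in the usage note is hypothetical. It is an illustration of the mapping rule, not the autoloader SMW actually uses.

```python
from typing import Dict, Optional

# Assumption for illustration: the top-level SMW namespace maps to "src/".
PSR4_PREFIXES: Dict[str, str] = {"SMW\\": "src/"}


def resolve_psr4(class_name: str, prefixes: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Return the relative file path for a fully qualified class name, or None."""
    mapping = PSR4_PREFIXES if prefixes is None else prefixes
    name = class_name.lstrip("\\")
    # Try longer prefixes first so that a more specific prefix takes precedence.
    for prefix in sorted(mapping, key=len, reverse=True):
        if name.startswith(prefix):
            # The remaining namespace segments become sub-directories.
            relative = name[len(prefix):].replace("\\", "/")
            return mapping[prefix] + relative + ".php"
    # Classes outside the mapped namespaces are not resolved by this rule.
    return None
```

Under these assumptions, a hypothetical class `SMW\Query\ExampleClass` would be looked up in `src/Query/ExampleClass.php`, which is why the namespace of a component should mirror its place in the directory tree.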

Best practices

  • A class has a defined responsibility and boundary
  • Dependency injection goes before inheritance, meaning that all objects used in a class should be injected.
  • Instance creation (e.g. new Foo( ... )) is delegated to a factory service
  • Object interaction with MediaWiki objects should be done using accessors in the SMW\MediaWiki namespace
  • A factory service should avoid using conditionals (if ... then ...) to create an instance
  • Instance creation and dependency injection are done using a service locator or dependency builder
  • Using type hinting consistently throughout a repository is vital to ensure class contracts can be appropriately followed
  • Following the single responsibility principle and applying inversion of control (e.g. dependency injection, factory pattern, service locator pattern) is a best-practice approach
  • Newly added functionality is expected to be accompanied by unit and integration tests to ensure that its operation is verifiable and doesn't interfere with existing services
  • Newly introduced features (or enhancements) that alter existing behaviour need to be guarded by a behaviour switch (or flag) that allows the previous behaviour to be restored, and need to be accompanied by integration tests
  • To improve the readability of classes in terms of what is public and what is internal (not to be exposed outside of the class boundary), class methods are ordered by visibility: public methods come before protected ones, which come before private ones
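
Several of the practices above (injected dependencies, instance creation in a factory without conditionals, a behaviour switch, and public-before-internal method ordering) can be shown together in a small sketch. It is written in Python for brevity, whereas SMW is PHP, and every name in it (`Store`, `InMemoryStore`, `ValueLister`, `ServiceFactory`, the `sort_values` flag) is hypothetical and not part of SMW.

```python
from typing import Dict, List, Protocol


class Store(Protocol):
    """Contract the consumer depends on; any backend may satisfy it."""

    def lookup(self, subject: str) -> List[str]:
        ...


class InMemoryStore:
    """Hypothetical backend used only for this sketch."""

    def __init__(self, data: Dict[str, List[str]]) -> None:
        self._data = data

    def lookup(self, subject: str) -> List[str]:
        return list(self._data.get(subject, []))


class ValueLister:
    """Single responsibility: format the values stored for a subject."""

    def __init__(self, store: Store, sort_values: bool = False) -> None:
        # The dependency and the behaviour switch are injected, not created here.
        self._store = store
        self._sort_values = sort_values

    # Public methods come first ...
    def list_values(self, subject: str) -> str:
        return ", ".join(self._ordered(self._store.lookup(subject)))

    # ... followed by internals.
    def _ordered(self, values: List[str]) -> List[str]:
        # The new behaviour is guarded by the flag; False keeps the previous output.
        return sorted(values) if self._sort_values else values


class ServiceFactory:
    """Owns instance creation so that consumers never call constructors directly."""

    def __init__(self, store: Store, settings: Dict[str, bool]) -> None:
        self._store = store
        self._settings = settings

    def new_value_lister(self) -> ValueLister:
        # No conditionals: the setting is passed through as-is.
        return ValueLister(self._store, self._settings.get("sort_values", False))


store = InMemoryStore({"Berlin": ["Germany", "City"]})
factory = ServiceFactory(store, {"sort_values": True})
print(factory.new_value_lister().list_values("Berlin"))  # prints "City, Germany"
```

Because `ValueLister` only knows the `Store` contract, a test can hand it any object with a `lookup` method, and switching `sort_values` off restores the earlier, unsorted output.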

Architecture guide

Technical insights

Testing

All tests are required to pass before changes can be merged into the repository.

Tests are commonly divided into unit and integration tests. Unit tests cover an isolated unit (or component) and normally don't require a database or other repository connection (e.g. triple store). Integration tests verify the interplay between components by interacting with MediaWiki and its services directly. About 80% of CI running time is spent on integration tests as they run a full cycle (parsing, storing, reading, HTML generation, etc.).
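
To illustrate what "isolated" means for a unit test, here is a minimal sketch using Python's `unittest` and a test double. SMW's own tests are written with PHPUnit; the `PropertyCounter` class and its `lookup` dependency are hypothetical and exist only for this example.

```python
import unittest
from unittest.mock import Mock


class PropertyCounter:
    """Hypothetical component under test; it depends only on an injected store."""

    def __init__(self, store) -> None:
        self._store = store

    def count(self, subject: str) -> int:
        return len(self._store.lookup(subject))


class PropertyCounterTest(unittest.TestCase):
    def test_count_uses_injected_store(self) -> None:
        # The store is a test double, so no database or triple store is needed.
        store = Mock()
        store.lookup.return_value = ["Has population", "Has area"]

        self.assertEqual(PropertyCounter(store).count("Berlin"), 2)
        # Verify the interaction with the dependency as well as the result.
        store.lookup.assert_called_once_with("Berlin")
```

An integration test for the same behaviour would instead create real pages, let them be parsed and stored, and then query the result, which is why such tests take far longer to run.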

For an introduction on "How to use PHPUnit" and "How to write integration tests using JSONScript" see the relevant section in this document.

Continuous integration (CI)

The project uses GitHub Actions to run its tests across multiple MediaWiki and PHP versions. CI uses the same Docker-based setup as local development.

Create a pull request

Before creating a pull request it is recommended to:

First PR

  • Send a PR with subject [first pr] to the Semantic MediaWiki repository and verify that your git setup works and you are able to replicate changes against the master branch
  • Observe how a PR triggers CI jobs and review the output of those jobs (important when a job doesn't pass and you need to find the cause for a failure)

Preparing a PR

  • Create a PR with your changes and send it to the Semantic MediaWiki repository
  • Observe whether tests are failing and, if any fail, identify what caused the failure
  • If your PR went green without breaking any existing tests, go back to your original PR and add tests that cover the newly introduced behaviour (see the difference between unit and integration tests)
  • Rebase and re-post your PR with the newly added tests and verify that they pass on all voting CI jobs

If you encounter a problem, ask or create an issue.

See also

Security aspects

Web applications with open user communities are specifically threatened by security vulnerabilities. SMW developers are responsible for taking specific care to avoid vulnerabilities of all kinds. Every developer should carefully read the MediaWiki security guidelines for developers.

For more information, please read security and software vulnerabilities.