SMWCon Fall 2021
Semantic wiki as a data curation system helping in structuring CMC knowledge
Talk details
Speaker(s): Laura Di Rocco
Type: Talk
Event start: 2021/12/10 15:00:00
Event finish: 2021/12/10 15:30:00
Length: 30 minutes
Video: not available
In Chemistry, Manufacturing and Controls (CMC), one of the crucial parts of pharmaceutical research and development, a large amount of important data about the physicochemical characteristics of active substances is generated. One example of such a characteristic is an active substance's solubility, together with information about its solid state.

Most of this internal experimental data exists in both numeric and text form and is stored in internal manufacturing systems. In addition, the scientific interpretation of the data takes place in meetings, and the submission-relevant physicochemical data is summarized as human-readable text in so-called technical documents for registration. CMC experimental data therefore comes in a variety of formats, even though we sometimes want to compare it with data from other experiments on a specific active substance. All of the data is GMP data.

Within CMC, we see a need to curate the data in historical technical documents by reviewing their contents with the help of a system that can both present the content visually to human users on the frontend and structure it technically as a linked data graph in the backend. We are testing the deployment of a semantic wiki that serves as a service backend for other systems. The data stored in this wiki are no longer simple documents but data that can be ingested by an algorithm. As a result, our new wiki offers a set of curated, clean CMC data that is simple to use for further analysis.