Talk details
Description: The ReplicationWiki is undergoing a massive expansion and is being used for the first time as the forum for a webinar series. We present the use case and our experience with the technical tools and challenges.
Speaker(s): Jan H. Höffler
Type: Talk, Discussion
Event start: 2022/10/27 11:30:00
Event finish: 2022/10/27 12:00:00
Length: 30 minutes
Video: not available
Keywords: bots, forms, multiple instance templates, search form, web scraping
The ReplicationWiki is a platform on the replication of empirical studies in the social sciences. It currently hosts a database of 5,637 studies with information on data and code availability, methods, data and data type, software used, the geographic origin of the data, and whether replications are available, along with their types and results. It has more than 26,000 pages and 367 registered users, and its pages have been viewed nearly 7.4 million times since its start in 2013.

At the beginning of this year, a massive expansion was started based on web-scraped data. We present the techniques we used for this, what worked well, and the challenges we faced, not only from the technical perspective but also from competitors violating our Creative Commons BY licence by using and redistributing our data without proper attribution.[1][2][3]

For the first time, the wiki is being used as the forum for a webinar series that will start in September. This has already led to an increase in the number of registered users, and we report on our experience with using the wiki for this purpose.

Please also check the separate hackathon, for which we suggest helping with an improved search form, which we regard as generally necessary for Semantic MediaWiki. Other possible projects include a module to implement user feedback for the training of algorithms, web scraping of further content, the introduction of multiple-instance templates for our forms, and bug fixing; cf. our list of open tasks. To those who join forces on these or other projects, we offer a share of the funding that we raise as well as co-authorship of resulting publications.

We suggest 15 minutes of presentation followed by 15 minutes of discussion.