SMWCon Fall 2013/Breaking News Detection with Wikidata and Wikipedia

We have developed an application called Wikipedia Live Monitor that monitors article edits on different language versions of Wikipedia as they happen in real time.

Wikipedia articles in different languages are highly interlinked. For example, the English article “en:2013_Russian_meteor_event”, on the meteoroid that exploded over Chelyabinsk Oblast, Russia, on February 15, 2013, is interlinked with “ru:Падение_метеорита_на_Урале_в_2013_году”, the Russian article on the same topic. Because we monitor multiple language versions of Wikipedia and, additionally, Wikidata in parallel, we can exploit this fact to detect concurrent edit spikes on articles covering the same topic, whether the spike occurs in a single language version, across several language versions, or on the corresponding Wikidata fact sheets. We treat such concurrent edit spikes as signals of potential breaking news events, whose plausibility we then check with full-text cross-language searches on multiple social networks. Compared with the reverse approach of monitoring social networks first and, potentially, checking plausibility on Wikipedia second, the approach proposed in our paper is less prone to false-positive alerts while being equally sensitive to true-positive events, at only a fraction of the processing cost.
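The idea of a concurrent edit spike can be illustrated with a minimal Python sketch. It is not the code of Wikipedia Live Monitor: the `Edit` record, the `SpikeDetector` class, the `topic` key (standing in for whatever identifier groups interlinked articles, for example a Wikidata item ID), and all thresholds are assumptions chosen only for illustration.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    topic: str        # hypothetical cluster key shared by interlinked articles
    language: str     # e.g. "en", "ru", or "wikidata"
    editor: str
    timestamp: float  # seconds since some reference point


class SpikeDetector:
    """Flags a topic once enough edits by enough editors fall in one window.

    All default thresholds are illustrative assumptions, not the tool's values.
    """

    def __init__(self, window_seconds=300.0, min_edits=5, min_editors=2):
        self.window_seconds = window_seconds
        self.min_edits = min_edits
        self.min_editors = min_editors
        self._recent = defaultdict(deque)  # topic -> edits inside the window

    def add(self, edit):
        """Record an edit (in timestamp order); return True on a spike."""
        recent = self._recent[edit.topic]
        recent.append(edit)
        # Drop edits that have fallen out of the sliding time window.
        while edit.timestamp - recent[0].timestamp > self.window_seconds:
            recent.popleft()
        editors = {e.editor for e in recent}
        return (len(recent) >= self.min_edits
                and len(editors) >= self.min_editors)

    def languages(self, topic):
        """Language versions that contributed edits in the current window."""
        return sorted({e.language for e in self._recent[topic]})


if __name__ == "__main__":
    detector = SpikeDetector(window_seconds=300, min_edits=3, min_editors=2)
    stream = [
        Edit("meteor", "en", "alice", 0.0),
        Edit("meteor", "ru", "boris", 60.0),
        Edit("meteor", "wikidata", "carol", 120.0),
    ]
    for edit in stream:
        if detector.add(edit):
            print("candidate:", edit.topic, detector.languages(edit.topic))
```

In this sketch, a flagged topic is only a candidate. The plausibility check against social networks described above would be a separate step and is not modeled here.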

A live demo of our application is available online at http://wikipedia-irc.herokuapp.com/, and the source code is available under the terms of the Apache 2.0 license at https://github.com/tomayac/wikipedia-irc. In this talk, we mainly focus on the integration of Wikidata into the monitoring loop.

In this talk, we will report on our work on incorporating Wikidata into our tool, Wikipedia Live Monitor.

The presentation slides can be found at http://bit.ly/smwconf.