Breaking News Detection with Wikidata and Wikipedia

SMWCon Fall 2013
Talk details
Description: This talk reports on the work of incorporating Wikidata into the "Wikipedia Live Monitor" tool.
Speaker(s): Thomas Steiner
Slides: http://bit.ly/smwconf
Type: Talk
Audience: Everyone
Event start: 2013/10/29 04:30:00 PM
Event finish: 2013/10/29 05:00:00 PM
Length: 30 minutes
Video: click here
Keywords: breaking news detection, wikidata, wikipedia

We have developed an application called Wikipedia Live Monitor that monitors article edits on different language versions of Wikipedia in real time, as they happen.

Wikipedia articles in different languages are highly interlinked. For example, the English article “en:2013_Russian_meteor_event”, on the February 15, 2013 meteoroid that exploded over the region of Chelyabinsk Oblast, Russia, is interlinked with “ru:Падение_метеорита_на_Урале_в_2013_году”, the Russian article on the same topic. Because we monitor multiple language versions of Wikipedia, and additionally Wikidata, in parallel, we can exploit this fact to detect concurrent edit spikes on Wikipedia articles covering the same topic, whether the spike occurs within a single language version, across several language versions, or on the corresponding Wikidata fact sheet. We treat such concurrent edit spikes as signals of potential breaking news events, whose plausibility we then check with full-text cross-language searches on multiple social networks. Unlike the reverse approach of monitoring social networks first and potentially checking plausibility on Wikipedia second, the approach proposed in our paper is less prone to false-positive alerts while being equally sensitive to true-positive events, and it requires only a fraction of the processing cost.
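The idea of a concurrent edit spike can be illustrated with a minimal sketch. It is not the code of Wikipedia Live Monitor: the `Edit` and `SpikeDetector` names, the sliding-window logic, and all threshold values are assumptions made for illustration. The sketch also assumes that every edit has already been mapped to a language-independent topic key (for instance via interlanguage links or a Wikidata item ID) and that edits arrive in chronological order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """One observed edit (hypothetical structure, not the tool's data model)."""
    topic: str        # language-independent topic key, e.g. a Wikidata item ID
    language: str     # e.g. "en", "ru", or "wikidata"
    timestamp: float  # seconds since an arbitrary epoch


class SpikeDetector:
    """Flags topics with many recent edits, optionally across languages.

    All thresholds are illustrative assumptions, not the tool's settings.
    """

    def __init__(self, window_seconds=300.0, min_edits=5, min_languages=1):
        self.window_seconds = window_seconds
        self.min_edits = min_edits
        self.min_languages = min_languages
        self._recent = defaultdict(deque)  # topic -> edits inside the window

    def observe(self, edit):
        """Record an edit; return True if its topic currently shows a spike."""
        recent = self._recent[edit.topic]
        recent.append(edit)
        # Drop edits that have fallen out of the sliding time window.
        cutoff = edit.timestamp - self.window_seconds
        while recent and recent[0].timestamp < cutoff:
            recent.popleft()
        languages = {e.language for e in recent}
        return (len(recent) >= self.min_edits
                and len(languages) >= self.min_languages)
```

With `min_languages=1` the detector reports spikes within a single language version; raising it restricts alerts to topics edited concurrently in several language versions or on Wikidata. A topic flagged this way would only be a candidate, since the plausibility check against social networks still has to follow.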

A live demo of our application is available online at http://wikipedia-irc.herokuapp.com/, and the source code is available under the terms of the Apache 2.0 license at https://github.com/tomayac/wikipedia-irc.

In this talk, we report on our work of incorporating Wikidata into Wikipedia Live Monitor, focusing mainly on its integration into the monitoring loop.

The presentation slides can be found at http://bit.ly/smwconf.