Wikilink

The Wikilink tool helps program organisers and organisations track external links on Wikimedia projects. While MediaWiki can search existing links, at the time of writing there is no easy way to monitor link additions and removals over time. The tool was built primarily for The Wikipedia Library's use case: publishers donate access to Wikipedia editors, and while it was possible to monitor the total number of links over time, there was no simple way to investigate that data further - to find out where links were being added, who was adding them, or, in the case of a drop in link numbers, why those links were removed.


Using the tool

There are two primary views into the data - the 'program' level and 'organisation' level.

Programs

Programs are collections of organisations. Program pages provide a high-level overview of the link additions and removals for many organisations in one place. If you have partnerships with multiple organisations, the program pages can provide aggregate data about them for reporting purposes.

Organisations

Organisation pages provide data relevant to an individual organisation. Organisations can have multiple collections of tracked URLs - these could be different websites or simply different URL patterns. Results for each collection are presented individually. Additionally, each collection can have multiple URLs. This is mainly useful when a website has moved: both the old and new URLs can continue to be tracked in the same place.
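The relationship between programs, organisations, collections and URLs described above can be pictured with a small data model. This is only an illustrative sketch: the class names, field names and the substring matching rule are assumptions made for the example, not Wikilink's actual models.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """A named group of URL patterns whose results are reported together."""
    name: str
    url_patterns: list[str] = field(default_factory=list)

    def matches(self, link: str) -> bool:
        # Hypothetical rule: plain substring match against each pattern.
        return any(pattern in link for pattern in self.url_patterns)


@dataclass
class Organisation:
    """An organisation with one or more collections of tracked URLs."""
    name: str
    collections: list[Collection] = field(default_factory=list)


@dataclass
class Program:
    """A collection of organisations, used for aggregate reporting."""
    name: str
    organisations: list[Organisation] = field(default_factory=list)

    def all_collections(self) -> list[Collection]:
        # Flatten every organisation's collections into one list.
        return [c for org in self.organisations for c in org.collections]


# Example: a website that has moved keeps both URLs in the same collection.
publisher = Organisation(
    name="Example Publisher",
    collections=[
        Collection("Journals", ["journals.example.org", "example.org/journals"]),
        Collection("Archive", ["archive.example.org"]),
    ],
)
program = Program("Example Program", [publisher])
```

Keeping several patterns in one collection means links to the old and new addresses are counted together, while separate collections keep their results apart.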


Data collection

Two sets of data are collected: link events and link totals.

Link events

A script continuously monitors the page-links-change event stream; when a link tracked by Wikilink is added or removed, the data is stored in Wikilink's database.

The event stream reports link additions and removals from all Wikimedia projects and languages, and tracks events from all namespaces. If a link is changed, it will register both an addition (the new URL) and a removal (the old URL). Changing the same URL multiple times in one edit sends only a single event.

Please be aware that there is currently a known bug with the event stream whereby some additional events related to template transclusions are being sent.
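As a rough illustration of how a single event from the stream might be checked against tracked URLs, the sketch below filters one event for matching additions and removals. The field names (`added_links`, `removed_links`, `link`, `external`) are assumptions about the event payload, and `match_link_events` is a hypothetical helper rather than Wikilink's own code.

```python
def match_link_events(event, tracked_patterns):
    """Return (action, link) pairs for tracked external links in one event."""
    matches = []
    for action, key in (("added", "added_links"), ("removed", "removed_links")):
        for entry in event.get(key, []):
            # Skip internal wiki links; only external URLs are of interest.
            if not entry.get("external"):
                continue
            link = entry.get("link", "")
            # Assumed rule: a link is tracked if it contains any pattern.
            if any(pattern in link for pattern in tracked_patterns):
                matches.append((action, link))
    return matches


# A changed link appears as one addition (new URL) and one removal (old URL).
sample_event = {
    "database": "enwiki",
    "page_title": "Example",
    "added_links": [
        {"link": "https://journals.example.org/new", "external": True},
    ],
    "removed_links": [
        {"link": "https://journals.example.org/old", "external": True},
        {"link": "Some_internal_page", "external": False},
    ],
}

results = match_link_events(sample_event, ["journals.example.org"])
print(results)
```

In a real consumer, each returned pair would be stored along with details such as the project, page and user, which is what makes it possible to investigate where and why links changed.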

Link totals

The tool also tracks the total number of links to each tracked URL on a weekly basis. These totals are retrieved from the externallinks table. Currently, these totals only consider Wikipedia projects, although they do cover every language. Unlike with the event stream, queries have to be made against each project's database individually, so collecting totals for every Wikimedia project would be prohibitively expensive.
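The per-project querying described above can be sketched as follows. To keep the example self-contained it uses in-memory SQLite databases as stand-ins for each project's database. The `el_to` column name, the `LIKE` pattern and the helper functions are assumptions for illustration and may not match the current externallinks schema or Wikilink's actual queries.

```python
import sqlite3


def count_links(connection, url_pattern):
    """Count rows in one project's externallinks table matching a pattern."""
    cursor = connection.execute(
        "SELECT COUNT(*) FROM externallinks WHERE el_to LIKE ?",
        (f"%{url_pattern}%",),
    )
    return cursor.fetchone()[0]


def total_across_projects(connections, url_pattern):
    """Sum per-project counts; each project needs its own query."""
    return sum(count_links(conn, url_pattern) for conn in connections.values())


def make_demo_db(links):
    """Build a stand-in project database holding the given link targets."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE externallinks (el_from INTEGER, el_to TEXT)")
    conn.executemany("INSERT INTO externallinks VALUES (?, ?)", enumerate(links))
    return conn


# Two hypothetical Wikipedia language editions, each with its own database.
connections = {
    "enwiki": make_demo_db([
        "https://journals.example.org/a",
        "https://journals.example.org/b",
        "https://other.example.com/c",
    ]),
    "dewiki": make_demo_db(["https://journals.example.org/a"]),
}

weekly_total = total_across_projects(connections, "journals.example.org")
print(weekly_total)
```

Because the cost grows with the number of project databases queried, restricting the totals to Wikipedia projects keeps the weekly job manageable.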