Mostafa Dehghani, Hosein Azarbonyad and Alex Olieman
Entity recognition and disambiguation is a powerful tool for enriching parliamentary debates and linking them to external sources. However, general-purpose entity linkers often fail to accurately detect the entities that are most relevant in the parliamentary domain. Therefore, in the Talk of Europe event, Creative Camp #3, we focused on designing and developing a specialized entity linker for political debates. As an application of this entity linker, we developed an interactive tool that uses the extracted entities to let users explore the debates through their relation to the concepts of an external knowledge base, namely Wikipedia. This system helps users dig into the parliamentary debates from their own perspective and explore which aspects of the concepts they are interested in are discussed most in parliament.
Our work at the event had two parts, each of which resulted in an independent system:
● Entity Linker for Political Conversations
● ExPoSe WikiCat Browser
The following provides a brief explanation of each of these two systems:
Discovering Links in Political Conversations:
The goal in this part of our work was to enrich parliamentary proceedings by linking mentioned concepts and named entities to corresponding external resources, such as biographies or encyclopedia entries. To do so, we developed a customized system for entity recognition and disambiguation that targets the most salient entity types in this domain: politicians and political parties. We combined the specialized linker with general-purpose entity linking systems to achieve promising precision as well as acceptable recall.
At the moment, the linker can be applied to Dutch proceedings. The system generates the links in the PoliticalMashup format.
The implementation of the aforementioned system is available here: https://bitbucket.org/aolieman/pm_el_tools
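To illustrate how a specialized linker might be combined with a general-purpose one, here is a minimal sketch. It is not taken from pm_el_tools. The Link type, the merge_links function, the rule "specialized links win on overlapping mentions", and all names and identifiers in the example are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    start: int   # character offset where the mention begins (inclusive)
    end: int     # character offset where the mention ends (exclusive)
    target: str  # identifier of the linked resource (hypothetical scheme)
    source: str  # which linker produced the link


def overlaps(a: Link, b: Link) -> bool:
    """True if the two mentions share at least one character."""
    return a.start < b.end and b.start < a.end


def merge_links(specialized: list[Link], general: list[Link]) -> list[Link]:
    """Keep every specialized link; add general links only where they
    do not overlap a specialized one (assumed merging rule)."""
    merged = list(specialized)
    for link in general:
        if not any(overlaps(link, kept) for kept in specialized):
            merged.append(link)
    return sorted(merged, key=lambda link: link.start)


# Made-up sentence and identifiers, for illustration only.
text = "De heer Jansen (ABC) sprak over de Europese Unie."
specialized = [
    Link(8, 14, "politician:jansen", "specialized"),
    Link(16, 19, "party:abc", "specialized"),
]
general = [
    Link(8, 14, "wiki:Jansen_(surname)", "general"),  # overlaps, so dropped
    Link(35, 48, "wiki:Europese_Unie", "general"),    # no overlap, so kept
]
merged = merge_links(specialized, general)
for link in merged:
    print(text[link.start:link.end], "->", link.target, f"({link.source})")
```

Under this assumed rule, the specialized linker decides the politician and party mentions, while the general-purpose linker contributes the remaining concepts, which is one way to add recall without giving up the precision of the specialized links.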
ExPoSe WikiCat Browser
As the second part of our work, we developed a system named “ExPoSe WikiCat Browser”. It enables users to browse and investigate the data from a particular point of view. The system uses the entities extracted from the data to project the data onto an arbitrary ontology. The structure of the selected ontology determines the point of view from which users see the data.
As an example from the parliamentary domain, assume a user is interested in analysing the relation between the national laws of a European country and EU legislation. One approach would be to investigate when EU legislation was discussed in the national parliament, and in the context of which (proposed) national laws. In other words, it is desirable to see how debates within the parliament of that country can be projected onto the topics related to the European Parliament in terms of both subject matter and time.
In ExPoSe WikiCat Browser, we use the entity linker’s output to determine the topics of the data, and we let the user select a sub-hierarchy from Wikipedia’s category hierarchy onto which the entities mentioned in the text are projected. In the selected hierarchy, each node is a category that represents a topic, and its descendant nodes are its sub-topics.
In the parliamentary data example, the ExPoSe WikiCat Browser is provided with the entities extracted from national parliamentary debates, and “European Union” is selected as the category at the root of the hierarchy.
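The selection of a sub-hierarchy and the projection of entities onto it can be sketched as follows. This is a simplified illustration, not the actual ExPoSe implementation. The function names, the max_depth cut-off, and the toy category data are assumptions and do not reflect the real Wikipedia category graph.

```python
from collections import deque


def collect_subhierarchy(subcategories, root, max_depth=3):
    """Breadth-first walk from the root category.

    Returns {category: depth}. The depth limit is an assumed safeguard
    that keeps the selected hierarchy small.
    """
    depths = {root: 0}
    queue = deque([root])
    while queue:
        category = queue.popleft()
        if depths[category] >= max_depth:
            continue
        for child in subcategories.get(category, []):
            if child not in depths:  # category graphs may contain cycles
                depths[child] = depths[category] + 1
                queue.append(child)
    return depths


def project_entities(entity_categories, hierarchy):
    """Keep only entities with at least one category inside the hierarchy."""
    projected = {}
    for entity, categories in entity_categories.items():
        matched = sorted(c for c in categories if c in hierarchy)
        if matched:
            projected[entity] = matched
    return projected


# Toy data for illustration only.
subcategories = {
    "European Union": ["European Union law", "Eurozone"],
    "European Union law": ["European Union directives", "European Union"],
}
entity_categories = {
    "Directive X": ["European Union directives", "Law"],
    "Euro": ["Eurozone", "Currencies"],
    "Amsterdam": ["Cities in the Netherlands"],
}

hierarchy = collect_subhierarchy(subcategories, "European Union")
projected = project_entities(entity_categories, hierarchy)
print(hierarchy)
print(projected)
```

In this toy example, "Amsterdam" is filtered out because none of its categories lies under the chosen root, while the other two entities are attached to the nodes they belong to.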
Subsequently, the system extracts all the categories in the selected hierarchy and maps the extracted entities to these Wikipedia categories. It then filters the entities, keeping only those that represent concepts related to the extracted categories. Next, given the debates and the related entities they mention, the system calculates the importance and recency of each category (that is, each node in the hierarchy). The importance of a node reflects how much the topic of that category is addressed in the national parliamentary debates, based on the frequency of entities related to the category. The recency of a node shows how recently the topic of the category was discussed in the debates.
We provide a graphical user interface in which the user can traverse the paths in the hierarchy and see which categories are discussed more in the national parliamentary debates (the importance of a node is shown by its size), which categories were discussed recently (the recency of a node is shown by its color), and which debates are related to which categories at different levels of abstraction. Links to all the debates related to a given category are provided, grouped by the entities that belong to the category.
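One possible way to compute importance and recency is sketched below. The exact formulas used by the system are not given here, so this sketch assumes that importance is the number of entity mentions falling under a category or any of its descendants, and that recency is the date of the most recent such mention. The function names and the toy data are hypothetical.

```python
from collections import Counter
from datetime import date


def ancestors_and_self(category, parent):
    """Walk up the hierarchy; assumes it is a tree (one parent, no cycles)."""
    chain = []
    while category is not None:
        chain.append(category)
        category = parent.get(category)
    return chain


def score_categories(mentions, entity_categories, parent):
    """Return (importance, last_seen) per category.

    importance: number of mentions under the category (assumed definition).
    last_seen:  date of the most recent mention (assumed recency measure).
    """
    importance = Counter()
    last_seen = {}
    for entity, debate_date in mentions:
        touched = set()  # count each mention at most once per category
        for category in entity_categories.get(entity, []):
            touched.update(ancestors_and_self(category, parent))
        for category in touched:
            importance[category] += 1
            if category not in last_seen or debate_date > last_seen[category]:
                last_seen[category] = debate_date
    return importance, last_seen


# Toy data for illustration only.
parent = {"European Union law": "European Union", "Eurozone": "European Union"}
entity_categories = {"Directive X": ["European Union law"], "Euro": ["Eurozone"]}
mentions = [
    ("Directive X", date(2014, 3, 1)),
    ("Euro", date(2015, 6, 10)),
    ("Euro", date(2013, 1, 5)),
    ("Unrelated", date(2015, 7, 1)),  # not in the hierarchy, so ignored
]

importance, last_seen = score_categories(mentions, entity_categories, parent)
for category in sorted(importance):
    print(category, importance[category], last_seen[category])
```

Propagating each mention to the ancestors of its category is what lets a parent node summarize its sub-topics, so the same scores can be read at different levels of abstraction.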