1. Visualising Co-occurrence networks of the plenary debates of the European Parliament
2. Exploring the potential of multimodal graphs for modelling European Plenary Debates
3. Linking the European debates to Italian Parliament data: research opportunities and tools
4. Towards Deep Syntactico-Semantic Annotation of the European Parliament Proceedings
5. WhatTheySaid: Enriching UK parliament Debates with Semantic Web

Visualising Co-occurrence networks of the plenary debates of the European Parliament

Geert Kessels1, Pim van Bree1, Frédéric Allemand2, Marten Düring2
1 Lab1100/nodegoat, The Netherlands
2 CVCE, Luxembourg

Network analysis provides tools for exploring constellations of relations between entities that are too complex for humans to comprehend unaided. Networks are composed of nodes (also known as entities) and the ties or links between them. The collaboration will proceed in two phases: (1) extending nodegoat (http://nodegoat.net/) so that it can import and transform the data of Talk of Europe (TOE), and (2) a visual analysis of the political networks manifested in the plenary debates of the European Parliament through information visualisation and knowledge construction. The nodegoat research environment facilitates this process through diachronic analysis and visualisation options, as well as various complex filtering functionalities for humanities datasets. nodegoat can already import a variety of standard data formats such as CSV, JSON and XML, but it cannot currently import RDF, the standard data structure of TOE. The first phase of the project therefore proposes to extend nodegoat's existing infrastructure to enable the import of the TOE RDF dataset. This will be achieved by establishing communication between the TOE RDF dataset and nodegoat through a new module that will [1] query the SPARQL endpoint and [2] store the response in nodegoat. Once this module has been developed, the second phase can begin: the network graph of the plenary debates will be developed and analysed by designing a data model that transforms the text into a social graph representing the political debates, including the networks, connections and affiliations of the participants.
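As a rough illustration of what happens after the SPARQL endpoint has answered, the sketch below flattens a response and derives weighted co-occurrence ties between speakers. It assumes the response follows the W3C SPARQL 1.1 Query Results JSON Format. The variable names `debate` and `speaker`, the example.org URIs and both function names are hypothetical; they reflect neither the TOE schema nor the nodegoat API, and the network request itself is left out.

```python
from collections import Counter
from itertools import combinations

# Hypothetical response, shaped like the SPARQL 1.1 Query Results JSON Format.
SAMPLE_RESPONSE = {
    "head": {"vars": ["debate", "speaker"]},
    "results": {"bindings": [
        {"debate": {"type": "uri", "value": "http://example.org/debate/1"},
         "speaker": {"type": "uri", "value": "http://example.org/mep/A"}},
        {"debate": {"type": "uri", "value": "http://example.org/debate/1"},
         "speaker": {"type": "uri", "value": "http://example.org/mep/B"}},
        {"debate": {"type": "uri", "value": "http://example.org/debate/1"},
         "speaker": {"type": "uri", "value": "http://example.org/mep/C"}},
        {"debate": {"type": "uri", "value": "http://example.org/debate/2"},
         "speaker": {"type": "uri", "value": "http://example.org/mep/A"}},
        {"debate": {"type": "uri", "value": "http://example.org/debate/2"},
         "speaker": {"type": "uri", "value": "http://example.org/mep/B"}},
    ]},
}


def flatten_bindings(response):
    """Turn SPARQL JSON bindings into plain dicts of variable -> value."""
    variables = response["head"]["vars"]
    return [
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in response["results"]["bindings"]
    ]


def cooccurrence_edges(rows):
    """Count, for each pair of speakers, the debates in which both appear."""
    speakers_by_debate = {}
    for row in rows:
        speakers_by_debate.setdefault(row["debate"], set()).add(row["speaker"])
    edges = Counter()
    for speakers in speakers_by_debate.values():
        # Sorting gives every pair one canonical (a, b) ordering.
        for a, b in combinations(sorted(speakers), 2):
            edges[(a, b)] += 1
    return edges


rows = flatten_bindings(SAMPLE_RESPONSE)
for (a, b), weight in sorted(cooccurrence_edges(rows).items()):
    print(a, b, weight)
```

Each edge weight is the number of debates two speakers share, which is one simple way to define a co-occurrence tie; the data model actually adopted by the project may differ.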

Exploring the potential of multimodal graphs for modelling European Plenary Debates

Fintan McGee1, Frédéric Allemand2, Marten Düring2
1 CRP Lippman, Luxembourg
2 CVCE, Luxembourg

This proposal builds on similar premises as the proposal “Visualising Co-occurrence networks of the plenary debates of the European Parliament” by Centre virtuel de la connaissance sur l’Europe (CVCE) and LAB1100 but goes beyond that approach by exploring the impact of multimodal network visualisations on both network analysis and research in European Integration studies. The data model described by ToE for the modelling of European debates consists of many different types of entities and relationships. This level of complexity makes a straightforward node-link graph visualisation difficult. For practical reasons, complex graphs such as this are often reduced to simpler ones by means of projection, where nodes of one mode are removed from the graph and edges are introduced between all of their neighbours. However, this may create ambiguities, such as an inability to distinguish between politicians who contribute to a large number of debates and politicians who contribute to a small number of debates that each include a larger number of other participants. A multimodal approach to graph visualisation has the potential to improve both the visualisation and the understanding of the underlying relations and interactions, such as those of the participants of the plenary debates. Multimodal graphs are graphs which model more than one type of entity, with each type of entity represented by nodes of a different mode.
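The ambiguity introduced by projection can be shown on a toy two-mode graph. In the sketch below, the debate and politician labels are invented for illustration: P1 speaks in three small debates and P2 in one large debate, yet both end up with the same number of neighbours once the debate nodes are projected away.

```python
from itertools import combinations

# Hypothetical two-mode data: debate -> set of participating politicians.
DEBATES = {
    "d1": {"P1", "Q1"},
    "d2": {"P1", "Q2"},
    "d3": {"P1", "Q3"},
    "d4": {"P2", "R1", "R2", "R3"},
}


def project(debates):
    """One-mode projection: drop debate nodes and link all co-participants."""
    neighbours = {}
    for participants in debates.values():
        for a, b in combinations(participants, 2):
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    return neighbours


def debate_counts(debates):
    """Number of debates each politician takes part in (two-mode degree)."""
    counts = {}
    for participants in debates.values():
        for politician in participants:
            counts[politician] = counts.get(politician, 0) + 1
    return counts


projected = project(DEBATES)
counts = debate_counts(DEBATES)
for politician in ("P1", "P2"):
    print(politician, "projected degree:", len(projected[politician]),
          "debates:", counts[politician])
```

In the projection P1 and P2 are indistinguishable by degree, while the two-mode graph retains the difference in the number of debates they contributed to.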

Linking the European debates to Italian Parliament data: research opportunities and tools

Silvia Giannini
Politecnico di Bari

The Information Technology department of the Italian Parliament Chamber of Deputies has started publishing on the web the content of all digital files and documents related to the legislatures of the Italian Republic, following Linked Open Data best practices (Candia, 2012). A suitable OWL ontology has been developed that enables, among other things, the description of persons, parliamentary bodies, elections, draft bills and debates. This source of information, accessible through a public SPARQL endpoint, lays the basis for interesting analyses of historical data (see http://data.camera.it/data/en/apps). The first goal for the creative camp is to investigate the possibility of interlinking the datasets offered by the Italian Chamber of Deputies with the one developed under the Talk of Europe (ToE) project, in order to further increase the reuse of the data, its integration with other sources of information and the internationalization of the Italian Parliament dataset. As a second goal, we offer to test the effectiveness of a clustering technique recently proposed for RDF data (Colucci, Giannini, Donini, & Di Sciascio, 2014) as a tool for political scientists' investigations. The algorithm relies on: (i) the selection of two resources; (ii) the extraction of the information considered relevant for the description of the two resources; (iii) the evaluation of the common knowledge shared by the two RDF descriptions, as described in (Colucci, Donini, & Di Sciascio, 2013); (iv) the extraction of all other resources conveying the same informative content as the first two. Interested researchers in political science and the humanities can help greatly by suggesting which properties are relevant for each type of resource in a particular dimension of analysis. Moreover, the technique can be applied to data quality issues, such as detecting missing values in RDF descriptions.
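The four steps above can be illustrated with a deliberately simplified sketch. It assumes that each resource is described only by its direct (predicate, object) pairs and uses a plain set intersection as a stand-in for step (iii); the cited papers define that step, and this sketch does not reproduce their method. The `ex:` identifiers, the triples and all function names are hypothetical.

```python
# Hypothetical RDF-like data as (subject, predicate, object) tuples.
TRIPLES = [
    ("ex:dep1", "ex:party", "ex:PartyX"),
    ("ex:dep1", "ex:legislature", "16"),
    ("ex:dep1", "ex:bornIn", "Bari"),
    ("ex:dep2", "ex:party", "ex:PartyX"),
    ("ex:dep2", "ex:legislature", "16"),
    ("ex:dep2", "ex:bornIn", "Roma"),
    ("ex:dep3", "ex:party", "ex:PartyX"),
    ("ex:dep3", "ex:legislature", "16"),
    ("ex:dep4", "ex:party", "ex:PartyY"),
    ("ex:dep4", "ex:legislature", "16"),
]


def description(triples, resource, relevant=None):
    """Step (ii): (predicate, object) pairs of a resource, optionally
    restricted to the predicates judged relevant for the analysis."""
    return {
        (p, o) for s, p, o in triples
        if s == resource and (relevant is None or p in relevant)
    }


def common_description(triples, a, b, relevant=None):
    """Step (iii), simplified: pairs shared by both descriptions."""
    return description(triples, a, relevant) & description(triples, b, relevant)


def resources_sharing(triples, a, b, relevant=None):
    """Step (iv): other resources whose description contains the shared part."""
    common = common_description(triples, a, b, relevant)
    if not common:
        return []  # nothing in common, so nothing to cluster on
    others = {s for s, _, _ in triples} - {a, b}
    return sorted(s for s in others if common <= description(triples, s, relevant))


# Step (i): the two resources are chosen by the analyst.
print(resources_sharing(TRIPLES, "ex:dep1", "ex:dep2"))
```

The optional `relevant` argument is where the properties suggested by domain researchers would enter: restricting it to, say, `{"ex:party"}` changes which resources end up grouped together.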

Candia, E. (2012). Linked Open Data for transparency, interoperability and efficiency. World e-Parliament Conference 2012. Roma, Italy. Available online: http://www.ictparliament.org/sites/default/files/a4_e.f._candia.pdf

Colucci, S., Donini, F., & Di Sciascio, E. (2013). Common Subsumers in RDF. AI*IA 2013: Advances in Artificial Intelligence (pp. 348-359). Torino, Italy: Springer International Publishing.

Colucci, S., Giannini, S., Donini, F., & Di Sciascio, E. (2014). Finding Commonalities in Linked Open Data. Proceedings of the 29th Italian Conference on Computational Logic (CILC 2014). Torino, Italy.

Towards Deep Syntactico-Semantic Annotation of the European Parliament Proceedings

Stephan Oepen1,2, Erik Velldal1, Emanuele Lapponi1, and Bjørn Høyland1
1 University of Oslo
2 Potsdam University

We propose to contribute to the creation of a high-quality, RDF-encoded multi-lingual corpus of European Parliament Proceedings, coupled with state-of-the-art syntactico-semantic analyses at the token and sentence levels. In particular, our contribution will target the textual (transcribed) content of European Parliament speeches, where we will leverage in-house experience and technology for syntactic and semantic parsing of running text, primarily in English, but also other European languages. We expect that making available a standardized collection of (automatically acquired) linguistic annotations of the raw text in the corpus, related to the resource at large through RDF links, will enable a variety of downstream analytical tasks and aid replicability and comparability of results.

WhatTheySaid: Enriching UK parliament Debates with Semantic Web

Yunjia Li, Chaohai Ding, and Mike Wald
University of Southampton

The publicising of UK Parliament debates, for example through BBC Parliament, has exerted a tremendous influence on the transparency of politics in the UK. Political figures need to be responsible for what they say in the debates, as they are monitored by the public. However, it is still difficult to analyse the debate archives automatically in order to answer questions such as how debates held months or even years apart are related to each other. For this purpose, we have developed WhatTheySaid, which uses Semantic Web and natural language processing (NLP) technologies to automatically enrich the UK Parliament debates and categorise them for searching, visualisation and comparison.
In this demo, we build more advanced features to fulfil the following requirements: (R1) calculate the similarities between debates so that users can easily navigate through similar debates; (R2) categorise debates into different topics and extract the key statements, so that users can easily spot statements that contradict each other; (R3) based on R2, link the debates to fragments of the debate video archive, so that users can watch the video fragment as proof of the statement; (R4) analyse the speeches of a particular MP and find out how the sentiment changes over time.
To demonstrate the implementation of the requirements above, we have taken the UK House of Commons debate data for 2013 from TheyWorkForYou as the sample dataset. We visualise the enriched debate data in various ways. Firstly, we use a heat map and a line chart to visualise the sentiment scores of each MP's speeches on a yearly and a monthly basis respectively. We also provide a timeline visualisation of the statements made by a given MP on different topics. Finally, we have designed a replay page in which the transcript and named entities are aligned with the fragments of debate video.
We want to enrich the EU Parliament data in a similar way to WhatTheySaid and compare the two sets of debates over the same period of time. We will first develop a general debate ontology that covers the debate structures of both the UK and EU Parliaments, and structure the debate data according to this ontology. We will then further process the debate speeches with topic detection, text summarisation, named entity recognition and sentiment analysis. The results will be visualised as heat maps and timeline canvases. All of these enrichments will be presented alongside the debate videos, aligned in time. Finally, the debate data and enrichments will be published as Linked Data.