By Marten Düring
The Talk of Europe Creative Camp was organised by Max Kemman and Astrid van Aggelen and featured research projects and presentations by 16 researchers in Computer Science and the Humanities – a big thanks for all their hard work!
What do you do when all the debates in the EU parliament from 1992 onwards are digitised, available as RDF, and just one SPARQL query away? And when you have one week to work with specialists in network data visualisation and management? First of all, you need to bridge the gap between domain experts and ask how this data can be used to answer research questions in European studies and Computer Science. This turned out to be as hard as expected.
This week nevertheless yielded a method which helps us to detect unexpected speaker appointments by Members of the European Parliament (created by Fintan McGee of our neighbouring research institute CRP Lippmann) and an import functionality for Nodegoat which makes it easy to pull and visualise data from the Talk of Europe’s SPARQL endpoint (created by Pim van Bree and Geert Kessels of Lab1100). For CVCE this was an excellent opportunity to experiment with new ways of extracting and visualising information from structured data repositories.
For starters: A little bit on networks
To explain our work, a short intro is needed. Graphs like the one below are so-called 1-mode networks. This means that there is only one type of node (in this case people), and these nodes are connected to each other. Imagine that the people in this network are connected to each other because they are all members of the same sports club.
In this example, a 1-mode network visualisation represents group affiliations by showing links between all members of a group. This works fine here but very quickly becomes cluttered when a group has a large number of members.
You can also represent this kind of information using a 2-mode network: a network which has two types of actors (people, sports clubs). It is important to mention that in 2-mode networks ties can only exist between actors of different types (person -> sports club), never within one type of actor (person -> person).
2-mode networks are also sometimes called affiliation networks or bimodal networks and are particularly well suited to identifying overlaps between groups, as illustrated in the graph above. In both examples it is clear that Laura is the person who is part of both groups. 2-mode visualisations are, however, often leaner than 1-mode networks, since they represent each affiliation of a person by a single tie and merely imply the links between members of a group.
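The difference in size between the two representations can be sketched in a few lines of Python. This is only an illustration: the club names and all members other than Laura are made up, and the helper functions are hypothetical rather than part of any tool used during the camp.

```python
from itertools import combinations

# Hypothetical memberships; only Laura belongs to both clubs.
memberships = {
    "Tennis club": ["Anna", "Ben", "Carla", "Laura"],
    "Chess club": ["Laura", "David", "Emma", "Felix"],
}

def two_mode_edges(memberships):
    """2-mode view: one tie per (person, club) affiliation."""
    return {(person, club)
            for club, members in memberships.items()
            for person in members}

def one_mode_edges(memberships):
    """1-mode view: one tie per pair of people who share a club."""
    edges = set()
    for members in memberships.values():
        # Sorting gives each pair a single canonical orientation.
        for a, b in combinations(sorted(members), 2):
            edges.add((a, b))
    return edges

def overlapping_members(memberships):
    """People who are affiliated with more than one club."""
    counts = {}
    for members in memberships.values():
        for person in members:
            counts[person] = counts.get(person, 0) + 1
    return sorted(p for p, n in counts.items() if n > 1)

print(len(two_mode_edges(memberships)))  # 8 ties
print(len(one_mode_edges(memberships)))  # 12 ties
print(overlapping_members(memberships))  # ['Laura']
```

With only four members per club the 1-mode view already needs 12 ties against 8 in the 2-mode view. The gap widens quickly, because a club with n members needs n(n-1)/2 ties in the 1-mode view but only n in the 2-mode view.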
Both 1-mode and 2-mode networks can be described mathematically as well. There are numerous ways to describe how central a node is in a network and how different nodes cluster together. Fintan’s work focuses on algorithms which are used for 2-mode networks and networks with more than 2 types of actors, so-called multimodal networks. 2-mode networks can be projected into 1-mode networks. This means that instead of 2 types of nodes we end up with only one. In our example, a tie between two people is added when they are members of the same club. This works just fine in the above examples. But as Fintan highlights, projections can lead to a loss of information: one actor may be a member of three clubs and end up with the same connections as someone who only attends two clubs. Redundancy is one way to measure this loss of information.
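A minimal sketch of such a projection, and of the information it can hide, might look as follows. The names, clubs and the `project` helper are all hypothetical and only serve to illustrate the point about members of three versus two clubs.

```python
from itertools import combinations

def project(memberships):
    """Project a person-club network onto people.

    Two people are tied if they share at least one club. How many
    clubs they share is not recorded, which is where information
    gets lost.
    """
    neighbours = {}
    for members in memberships.values():
        for person in members:
            neighbours.setdefault(person, set())
        for a, b in combinations(members, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours

# Hypothetical data: Xavier attends three clubs, Yara only two.
memberships = {
    "Club A": ["Xavier", "Yara", "Zoe"],
    "Club B": ["Xavier", "Yara", "Zoe"],
    "Club C": ["Xavier", "Zoe"],
}

projected = project(memberships)
club_counts = {
    person: sum(person in members for members in memberships.values())
    for person in projected
}

print(club_counts["Xavier"], len(projected["Xavier"]))  # 3 clubs, 2 ties
print(club_counts["Yara"], len(projected["Yara"]))      # 2 clubs, 2 ties
```

After the projection Xavier and Yara have the same number of ties, although Xavier is a member of one more club. The 2-mode network keeps that difference, the 1-mode projection does not.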
Redundancy may reveal irregularities
Getting the opportunity to speak is a privilege in the EU parliament, and very often the same people are chosen to speak on their area of expertise. But sometimes someone speaks on a topic outside their usual area. Domain experts Frédéric Allemand (CVCE) and Bjørn Høyland (Oslo University) were interested in detecting these cases. It turned out that Redundancy, a concept developed in graph theory, might do just that, as Fintan discovered during the week.
Drastically simplified, Redundancy describes how irreplaceable a node is in a 2-mode graph. In other words: Nodes which can be removed from a graph without changing its overall structure when it is projected into a single mode are considered to be highly redundant.
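To make the idea more concrete, here is a small Python sketch of one common formalisation, the redundancy coefficient for bipartite graphs: the fraction of pairs of a node's neighbours that are also linked through some other node. Whether this is exactly the measure Fintan used is an assumption on our side, and the speakers and topics below are invented.

```python
from itertools import combinations

def build_graph(speeches):
    """Build an undirected 2-mode graph from a speaker -> topics mapping.

    Assumes that speaker names and topic names never coincide.
    """
    graph = {}
    for speaker, topics in speeches.items():
        graph.setdefault(speaker, set()).update(topics)
        for topic in topics:
            graph.setdefault(topic, set()).add(speaker)
    return graph

def redundancy(graph, node):
    """Fraction of neighbour pairs of `node` that share another neighbour."""
    pairs = list(combinations(sorted(graph[node]), 2))
    if not pairs:
        return None  # undefined for nodes with fewer than two neighbours
    linked_elsewhere = sum(
        1 for a, b in pairs if (graph[a] & graph[b]) - {node}
    )
    return linked_elsewhere / len(pairs)

# Hypothetical data: who spoke on which agenda item.
speeches = {
    "MEP A": {"Fisheries", "Whaling"},
    "MEP B": {"Fisheries", "Whaling"},
    "MEP C": {"Whaling", "Railways"},
}

graph = build_graph(speeches)
scores = {mep: redundancy(graph, mep) for mep in speeches}
atypical = sorted(m for m, r in scores.items() if r is not None and r < 0.5)

print(scores["MEP A"])  # 1.0
print(scores["MEP C"])  # 0.0
print(atypical)         # ['MEP C']
```

MEP A is fully redundant here: removing A would not change which topics are linked, because MEP B covers the same pair. MEP C is the only speaker connecting whaling and railways, so C's score is 0 and C would show up as an atypical speaker.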
The graph below was created by Fintan and shows connections between Members of the European Parliament who spoke on certain agenda items in the year 2010 (527 nodes, 595 edges). An interactive version of this graph, realized with the sigmajs plugin for Gephi, is also available.
Only MEPs with a low level of redundancy (<0.5) are shown, which means that the graph highlights speakers with very atypical combinations of topics. An example is Mike Nattrass (United Kingdom), who spoke on contrasting subjects such as:
- Ban on commercial whaling (debate)
- Implementation of the first railway package directives (debate)
- Progress made on resettling Guantanamo detainees and on closing Guantanamo (debate)
- Protection of animals used for scientific purposes (debate)
- Welfare of laying hens (debate)
This is where our journey ends and domain experts need to take a close look at the data to identify the meaning of the graph and link their observations back to the contextual knowledge which is and remains essential for the humanist analysis of the parliamentary debates.
From SPARQL to Network via Nodegoat
For our second project, Geert Kessels and Pim van Bree of Lab1100 have developed a new feature for Nodegoat which allows users to load data directly from the Talk of Europe SPARQL endpoint (or any other). This will make it significantly easier to explore the debate data for anyone who can write a query. Nodegoat might well be the easiest way to date to set up a relational database, fill it with data, visualise the data as a social or spatial network and observe changes over time.
That said, writing such a query is no trivial task. The advantage of a query lies in the precision with which one can download a very particular constellation of data points.
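For readers who have never talked to a SPARQL endpoint, the following Python sketch shows the two generic steps involved: encoding a query as an HTTP GET request and flattening the standard SPARQL JSON result format into rows. The endpoint URL is a placeholder, the query is deliberately generic and uses none of the Talk of Europe vocabulary, and the sample response is invented. The sketch also says nothing about how Nodegoat implements its import.

```python
import json
from urllib.parse import urlencode

# Placeholder: substitute the real endpoint URL here.
ENDPOINT = "https://example.org/sparql"

# Generic query that works against any endpoint: ten arbitrary triples.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
"""

def build_request_url(endpoint, query):
    """Encode the query as a GET request (SPARQL 1.1 Protocol)."""
    return endpoint + "?" + urlencode({"query": query})

def rows_from_results(text):
    """Flatten a SPARQL JSON results document into a list of dicts."""
    doc = json.loads(text)
    variables = doc["head"]["vars"]
    return [
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in doc["results"]["bindings"]
    ]

# Invented response in the standard SPARQL JSON results layout.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["s", "p", "o"]},
    "results": {"bindings": [
        {
            "s": {"type": "uri", "value": "https://example.org/speech/1"},
            "p": {"type": "uri", "value": "https://example.org/speaker"},
            "o": {"type": "uri", "value": "https://example.org/mep/42"},
        }
    ]},
})

url = build_request_url(ENDPOINT, QUERY)
rows = rows_from_results(SAMPLE_RESPONSE)
print(url.startswith(ENDPOINT + "?query="))  # True
print(rows[0]["o"])  # https://example.org/mep/42
```

To run this against a live endpoint you would fetch `url` with an `Accept: application/sparql-results+json` header and pass the response body to `rows_from_results`. Each row can then be treated as one tie between two nodes.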
Users can define custom import templates which determine how the data will be organised and linked in Nodegoat.
Once these two steps are sorted, the data is ready to be visualised. Nodegoat supports the visualisation of social graphs but also combinations of social graphs and maps. You can check out a few examples on their website and use this User Guide to set up your own project. Sadly we ran out of time to test this new feature and develop queries together with our domain expert.
All in all, this was a great opportunity to start a collaboration between the CVCE DH Lab, CVCE EIS and Fintan of CRP Lippmann and Geert and Pim of Lab1100.