Karin van Leeuwen, Bart Vredebregt, Hilde Reiding and Radboud Winkels
How can researchers analyse the enormous impact of the case law of the European Court of Justice (ECJ) in, for example, the European Parliament (EP)? By using digital techniques!
As part of the Talk of Europe Creative Camp #3, this project aimed to link the dataset of EP proceedings 1999-2014 provided by Talk of Europe to the case law of the European Court of Justice as found in EUR-lex. This should make it possible to analyse trends in how, and how often, the European Parliament talks about ECJ rulings. In the longer term, the participating historians aim to apply similar methods to larger datasets, for example the proceedings of national parliaments or news media. For the Talk of Europe Creative Camp #3, however, the project focuses on how Members of the European Parliament (MEPs) regard the legitimacy of these rulings, which are often of a highly political nature.
Linking legal data to political speeches turned out to be as challenging as expected: unlike legal professionals, MEPs do not use specific codes or vocabulary to refer to court cases, and they sometimes even mix up courts in the heat of debate. Nonetheless, this week produced the groundwork for a method to link both datasets. Moreover, preliminary queries on the linked data showed the value of this kind of digital methodology for analysing the reception of ECJ case law.
Results: collecting and linking the data
For this project, we used the RDF/XML data dump of English speeches from the Talk of Europe website and combined it with ECJ case law retrieved from EUR-lex through its SOAP data service. For each pair of a case and a speech, we collected the overlapping keywords and summed their word probabilities. Normalizing the summed probabilities gave the link strength between the case and the speech.
Because the links are not binary, we cannot use them to link cases and speeches directly. Further research is needed to determine which links are strong enough. Still, the current links already allow some research, since they let us find relations such as which speech is most likely to mention a given case.
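To make the scoring step concrete, here is a minimal Python sketch of one way such a link strength could be computed. It is not the project's actual code. Several details are assumptions made for illustration: the "word probability" is taken to be a keyword's relative frequency in the case text, keyword extraction is reduced to tokenizing and dropping a tiny stopword list, and normalization divides by the total over all speeches so that the strengths for one case sum to 1. The function names are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real pipeline would use a proper one.
STOPWORDS = frozenset({"the", "of", "and", "a", "to", "in", "is", "on", "for", "that"})


def keywords(text):
    """Lowercase word tokens minus stopwords (a stand-in for real keyword extraction)."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def word_probabilities(text):
    """Relative frequency of each keyword in the text (assumed meaning of 'word probability')."""
    counts = Counter(keywords(text))
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def link_strengths(case_text, speeches):
    """Return {speech_id: strength} for one case, normalized to sum to 1 over the speeches."""
    case_probs = word_probabilities(case_text)
    raw = {}
    for speech_id, speech_text in speeches.items():
        # Keywords that occur in both the case and the speech.
        overlap = set(case_probs) & set(keywords(speech_text))
        raw[speech_id] = sum(case_probs[w] for w in overlap)
    total = sum(raw.values())
    return {sid: (score / total if total else 0.0) for sid, score in raw.items()}


if __name__ == "__main__":
    case = "The Court ruled on free movement of workers and posted workers"
    speeches = {
        "s1": "Posted workers deserve protection after the Court ruling",
        "s2": "Fisheries quotas must be revised",
    }
    strengths = link_strengths(case, speeches)
    # The speech with the highest strength is the most likely to mention the case.
    print(max(strengths, key=strengths.get), strengths)
```

Because the result is a graded score rather than a yes/no link, a threshold would still have to be chosen before any pair could be treated as a true link.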
Linked dataset: challenges and possibilities
The Creative Camp proved a great opportunity to start bridging the gap between history and computer science in the field of legal data. Based on the results of this week, we plan to look into ways to solve our largest problem, namely the different vocabularies used in the ECJ and the EP to describe cases.
The current results are promising, yet we cannot determine which links are strong enough to be seen as true links. Hence, it is not possible to incorporate the links directly into the ToE dataset for now. Still, the current results do allow interesting queries (e.g. “Which country is most likely to mention case X?”), which could tell us more about the context in which a case is referred to.
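A query such as “Which country is most likely to mention case X?” could be answered by aggregating link strengths per country. The Python sketch below is illustrative only. It assumes the link strengths for one case are available as a mapping from speech identifier to score, and that each speech can be mapped to the speaker's country. The function name, the identifiers and the numbers are hypothetical, not taken from the ToE dataset.

```python
from collections import defaultdict


def country_ranking(strengths, speech_country):
    """Sum link strengths per country and return (country, total) pairs, strongest first.

    strengths: {speech_id: link strength for one case}
    speech_country: {speech_id: country of the speaker}
    """
    totals = defaultdict(float)
    for speech_id, strength in strengths.items():
        country = speech_country.get(speech_id)
        if country is not None:  # skip speeches without a known country
            totals[country] += strength
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up scores and countries, for illustration only.
    strengths = {"s1": 0.6, "s2": 0.1, "s3": 0.3}
    speech_country = {"s1": "NL", "s2": "DE", "s3": "DE"}
    for country, total in country_ranking(strengths, speech_country):
        print(country, round(total, 2))
```

Summing favours countries whose members speak often. Averaging per speech, or counting only links above a threshold, are alternative choices that would need to be evaluated.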