
Using NLP to understand trends in political and social debate: a constructive visit to the GATE/CLARIN team at the University of Sheffield, UK.

by Tasos Galanopoulos

How can NLP tools and large language models be used to understand trends in political and social debate around major issues of the day?

What is the relationship between ‘distant reading’, the layered understanding that these tools offer across large volumes of data, and ‘close reading’, the detailed understanding of particular aspects of these topical issues?

What role can these modern tools play in the humanities and in everyday journalistic practice?

Questions such as these arose from a project on “Analysis of textual data from newspapers on the agreement of Greece’s accession to the European Economic Community (EEC) (1961)”, carried out as part of my postgraduate studies in Digital Humanities at the Open University of Greece. They brought me to the School of Computer Science at the University of Sheffield from the end of November to early December (23/11/2024 - 7/12/2024), to collaborate with members of the GATE team as part of an ATRIUM TNA research visit.

A landscape shot of the modern building of the University of Sheffield with trees in the background.

School of Computer Science at the University of Sheffield

Despite the short stay, my impressions could not have been better. The patience and goodwill of the whole team, with Dr Maynard at the forefront, helped me to “navigate” the tools offered by the GATE Cloud and the European Language Grid, to understand the required processes and the wider field a little better, and to learn a bit more about its “alphabet” and requirements. At the same time, the team’s regular meetings gave me a “glimpse” of the modern, specialised, and valuable research being carried out at the university.

A sign outside an old English university building saying 'University of Sheffield'

University of Sheffield

As for the research itself, processing the newspaper texts with tools such as Named Entity Recognition, N-gram detection (visualised with wordclouds), Topic Classification, Sentiment Analysis, multidimensional analysis with LIWC-22, and persuasion technique detection produced very interesting findings. They offered answers and insights for our central aim: developing a methodology to identify, document and frame named entities when investigating public discourse, covering newspapers of different political orientations and the political rhetoric around critical events in political life, with reference to the economic and social environment inside and outside the country. Identifying and categorising arguments for and against, and ‘bias’ for or against, in the Press of that time also enabled us, at a subsequent level, to explore ways to link entities to key concepts in argumentation.

A wordcloud of the research with certain words larger such as 'in Greece will', 'Greece will be' 'will be discussed' 'French investments in' 'investments in Greece'

Wordcloud pertaining to research
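For readers curious about how the three-word phrases in a wordcloud like this one can be obtained, here is a minimal sketch of N-gram counting in plain Python. It is not the GATE pipeline used during the visit. The function names `ngrams` and `top_ngrams` and the two sample sentences are invented purely for illustration, and the tokenisation (lowercasing and splitting on word characters) is a simplifying assumption.

```python
import re
from collections import Counter


def ngrams(text, n=3):
    """Return all runs of n consecutive words in text, lowercased."""
    # Simplifying assumption: a "word" is any run of Unicode word characters.
    tokens = re.findall(r"\w+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(documents, n=3, k=5):
    """Count n-grams across all documents and return the k most frequent."""
    counts = Counter()
    for doc in documents:
        counts.update(ngrams(doc, n))
    return counts.most_common(k)


# Hypothetical sample sentences, not taken from the newspaper corpus.
sample_articles = [
    "French investments in Greece will be discussed.",
    "Investments in Greece will be discussed next week.",
]

if __name__ == "__main__":
    for phrase, count in top_ngrams(sample_articles, n=3, k=4):
        print(count, phrase)
```

The resulting frequencies are the kind of input a wordcloud tool needs: the more often a phrase occurs, the larger it is drawn.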

Overall, I came away from this constructive visit with the best of impressions. On a personal level it gave me inspiration and opened new horizons, and it also brought new contacts with remarkable people.

A scenic view of the quad of the University of Sheffield from above.

Scenic view of University of Sheffield