CLARIN:EL, Greek National Infrastructure for Language Resources & Technologies

Location: Institute for Language and Speech Processing, ATHENA RC; Athens, Greece

Contact: Maria Gavriilidou (maria@athenarc.gr)

CLARIN:EL is the National Infrastructure for Language Resources and Technologies in Greece. It is a unique research infrastructure (RI) whose mission is to collect, document, curate and distribute digital language resources, language technology tools and certified online language processing services. It supports researchers, academics, students, language professionals, citizen scientists and the general public whose activities fall into the fields of Language Studies, Digital Humanities and Social Sciences, Cultural Heritage, Language Technology, Artificial Intelligence, Computer Science, Cognitive Science, etc. CLARIN:EL belongs to the National Roadmap for Research Infrastructures of Greece and is the Greek national node of the CLARIN ERIC European Infrastructure. Access to the CLARIN:EL Infrastructure and services is open to the entire academic and research community, to industry and to the public, in accordance with Open Data Principles and FAIR Data Principles, while existing data restrictions are respected.

CLARIN:EL consists of 14 top research and academic organisations that provide data and/or processing services in the domain of humanities and arts in Greece. The core technology of the infrastructure is developed by ILSP/ATHENA RC, which is also responsible for hosting the infrastructure, supporting its users, and providing educational and training activities.

Facilities offered by the CLARIN:EL infrastructure include large computing clusters, a digital data repository (certified as a B-type CLARIN Technical Centre and awarded the CoreTrustSeal in 2022) and language processing web services, which users can run remotely before downloading the results locally. Additionally, it offers expertise in language technology via NLP:EL, the specially designed CLARIN Knowledge Centre for Natural Language Processing in Greece. NLP:EL’s main objective is to support users of Language Technology tools and services and Sign Language Technologies, and to provide information on studies and curricula, educational and training material and scientific publications in the domains of Language Resources and Technologies and Digital Humanities. At the same time, NLP:EL acts as a network link with Natural Language Processing and Sign Language Technologies teams active in Greece, as well as with other certified CLARIN Knowledge Centres and with the National and European Language Resources and Technologies Infrastructures.

Services currently offered by the infrastructure

The CLARIN:EL Research Infrastructure offers access to more than 800 resources, i.e. digital language data of various language modalities (written, spoken, multimodal, sign, lexical/conceptual, etc.) and in various media (text, audio, video, etc.); language processing tools, web services and workflows (such as tokenizers, part-of-speech taggers, dependency parsers, terminology extractors, information extractors, named entity recognizers, etc.) for the processing of digital language resources (either available through CLARIN:EL or user-owned data); and the metadata of all resources made available through the Infrastructure. The infrastructure receives approximately 40,000 visits per year.

CLARIN:EL ensures digital preservation of the hosted language data and processing services; it supports the creation, curation, sharing and reuse of Greek language resources, tools and services, as well as multilingual resources and services containing Greek. The services are provided remotely, while some tools are also available for download and local use. The use of data and services is governed by appropriate licences ensuring lawful use; restrictions on personal and sensitive data are also respected.

CLARIN:EL organises a variety of training and dissemination activities, including summer schools and focused seminars and workshops for university students and researchers, language professionals and other stakeholders (journalists, translators, historians, archaeologists, political and social scientists), promoting digital methodologies in their research and professional domains.

What this TNA offers

The CLARIN:EL research infrastructure, represented by ATHENA RC, offers to host up to 36 person-weeks in total, that is, up to 12 person-weeks per year (up to 6 different people per year, for 2 weeks each, and max. 3 at the same time) in the domain of language technologies, such as language analysis (including named entity recognition), machine translation, machine learning, and automatic speech recognition. CLARIN:EL will provide hands-on seminars and mentoring to the hosted researchers, in order to introduce them to digital methods in their field as regards the collection and curation of data as well as the use of digital tools for data processing. The applicants will ideally propose their own specific research projects; alternatively, CLARIN:EL members may assign them specific tasks aimed at familiarising them with the infrastructure’s data and services. In both cases, CLARIN:EL members will guide and support them towards the successful completion of their tasks, and will also provide workspace and access to the infrastructure, its computing resources, and the data, tools and web services available through it.