LINDAT/CLARIAH-CZ (Digital Research Infrastructure for the Arts and Humanities)
Location: Charles University, Prague
Contact: Jana Hamrlová (hamrlova@ufal.mff.cuni.cz)
LINDAT/CLARIAH-CZ brings together three research infrastructure (RI) pillars: LINDAT/CLARIN, DARIAH-CZ and EHRI-CZ. It is a unique RI that provides language data and other digital data, software tools and services to researchers and other users in language technology, the humanities and the arts. LINDAT/CLARIAH-CZ is the Czech national node of CLARIN ERIC (Common Language Resources and Technology Infrastructure), DARIAH ERIC (Digital Research Infrastructure for the Arts and Humanities) and EHRI (European Holocaust Research Infrastructure). It consists of 15 top research organisations active in the humanities and arts in the Czech Republic: in linguistics, history (including oral history), historical bibliography, culture and cultural research, the history of arts, philosophy, film culture, visual arts, musicology and the history of music, ethnology, folklore, archaeology, and some cross-disciplinary domains.
LINDAT/CLARIAH-CZ takes part in international cooperation between RIs of a similar type, as well as directly between the relevant research institutions in all branches of the humanities. It emphasises digital and interdisciplinary processing methods, including modern methods of machine learning and artificial intelligence. An integral part of LINDAT/CLARIAH-CZ is the analysis of legal aspects relevant to the use of its resources, such as copyright and other intellectual property laws, in order to minimise the impact of their restrictions on research activities. It also develops basic language technologies for use in industry and in the public services sector. Facilities include a large computing cluster, a digital data repository (CTS- and CLARIN-certified as a B-type CLARIN Technical Centre), and expertise in language technologies (at Charles University) and in the above-listed disciplines (all partners).
Services currently offered by the infrastructure
The aim of LINDAT/CLARIAH-CZ is to provide open access to the digitised data resources of each discipline for the broad research community, including students, in Czechia, the EU and worldwide, and, at the same time, to obtain access to similar resources available through CLARIN, DARIAH and EHRI. LINDAT/CLARIAH-CZ also offers know-how and provides software tools for processing language resources and other digital data, in the form of downloadable software and models, Application Programming Interfaces (APIs), and web applications and forms for immediate, interactive work with users’ own data. In the area of digital data, it offers a deposit and long-term preservation service to anyone working in the digital humanities and arts. These services are provided mostly remotely, with tens of thousands of accesses to the data and millions of accesses to the services (in all forms) every month. In addition, LINDAT/CLARIAH-CZ offers the knowledge and expertise of its staff and collaborators to others, from students to international researchers, in the areas of language technologies, machine learning, data preparation, access, distribution and preservation, and in specific areas related mostly to linguistic data processing. Among the most popular datasets and related services are the multilingual Universal Dependencies annotated collection and the UDPipe tool, both used worldwide, as well as the machine translation service covering multiple languages and further NLP tools, including Named Entity Recognition.
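As an illustration of API-based access, the following is a minimal Python sketch of sending text to the UDPipe web service and reading back the annotated tokens. The endpoint URL, the parameter names (`data`, `model`, `tokenizer`, `tagger`, `parser`), the model name `english` and the `result` field of the JSON response are assumptions here and should be checked against the current service documentation; the helper names `build_request`, `parse_conllu` and `annotate` are hypothetical and not part of any LINDAT/CLARIAH-CZ library.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the public UDPipe REST service; verify before use.
UDPIPE_URL = "https://lindat.mff.cuni.cz/services/udpipe/api/process"


def build_request(text, model="english"):
    """Build a POST request asking for tokenisation, tagging and parsing."""
    params = {
        "data": text,
        "model": model,      # assumed model name
        "tokenizer": "",     # empty value = run this step with default options
        "tagger": "",
        "parser": "",
    }
    body = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(UDPIPE_URL, data=body)


def parse_conllu(conllu):
    """Extract (form, lemma, UPOS) triples from CoNLL-U formatted text."""
    tokens = []
    for line in conllu.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank separator lines and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword ranges (e.g. 1-2) and empty nodes (e.g. 1.1)
        tokens.append((cols[1], cols[2], cols[3]))
    return tokens


def annotate(text, model="english"):
    """Send text to the service and return the parsed tokens (needs network)."""
    with urllib.request.urlopen(build_request(text, model), timeout=30) as resp:
        payload = json.load(resp)
    # Assumption: the CoNLL-U output is returned in the "result" field.
    return parse_conllu(payload["result"])
```

Calling `annotate("Dogs bark.")` would perform the network request; `parse_conllu` can also be used offline on any CoNLL-U file, for example one downloaded from the Universal Dependencies collection.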
What this TNA offers
LINDAT/CLARIAH-CZ (in cooperation with AIS/ARUP) offers TNA visitors access to language technology tools, services and language resources, so that they can become familiar with these tools and apply them to their own data and projects. Expertise will be provided by LINDAT and AIS mentors, with easy access to many other language technology researchers at LINDAT/CLARIAH-CZ’s host institution (Institute of Formal and Applied Linguistics), covering further multilingual, multimodal and interdisciplinary technologies. AIS/ARUP offers comparative samples of archaeological datasets and expertise on their processing, FAIRification, preservation and sharing.