Project | CLARIN

Common Language Resources and Technology Infrastructure

The CLARIN project is a large-scale pan-European collaborative effort to create and coordinate language resources and technology and to make them available and readily usable.

By language resources and technology we mean all knowledge sources based on language (written or spoken) and the tools to carry out operations on such language material.

CLARIN offers scholars tools for computer-aided language processing, addressing one or more of the multiple roles language plays in the Humanities and Social Sciences (i.e. carrier of cultural content and knowledge, instrument of communication, component of identity and object of study).

CLARIN Key Points

  • Comprehensive service to the humanities disciplines with respect to language resources and technology.
  • Technology that overcomes the many boundaries currently fragmenting the landscape of resources and tools, boundaries that arise from institutional, structural and semantic interoperability problems.
  • Tools and resources that will be interoperable across languages and domains, thus addressing the issue of preserving and supporting the multilingual and multicultural European heritage.
  • Comprehensive training and education programs that include university education in the different member states.
  • Improvement and extension of web-based collaborations, i.e. creating virtual working groups that cross discipline boundaries.
  • Development or improvement of standards for language resource maintenance.
  • A persistent and stable infrastructure that researchers can rely on for the next decades.

CLARIN Key Technologies

  • It includes Data Grid technology to connect the repositories, as implemented in the DAM-LR pilot project, and the web services that the various centres provide;
  • It builds on ideas launched by the Digital Library community to create Live Archives, and will further such initiatives;
  • It incorporates, and contributes to, Semantic Web technology to overcome the structural and semantic encoding problems;
  • It incorporates advanced multi-lingual language processing technology that supports cultural and linguistic integration.

Partners

  • Universiteit Utrecht (NL)
  • Max-Planck-Gesellschaft (DE)
  • Research Institute for Linguistics, Hungarian Academy of Sciences (HU)
  • The Chancellor, Masters and Scholars of the University of Oxford (UK)
  • Institutul de Cercetari pentru Inteligenta Artificiala (RO)
  • Fundacao da Faculdade de Ciencias da Universidade de Lisboa (PT)
  • Universitat Pompeu Fabra (ES)
  • Institute for Parallel Processing of the Bulgarian Academy of Sciences (BG)
  • Centre National de la Recherche Scientifique (FR)
  • University of Zagreb, Faculty of Humanities and Social Sciences (HR)
  • University of Copenhagen (DK)
  • Universitetet i Bergen (NO)
  • Eberhard Karls Universität Tübingen (DE)
  • University of Malta (MT)
  • Univerzita Karlova v Praze (CZ)
  • Instituut voor Nederlandse Lexicologie (NL)
  • Lunds universitet (SE)
  • Helsingin yliopisto (FI)
  • University Al.I. Cuza of Iasi (RO)
  • Institute for Language and Speech Processing - Athena Research and Innovation Centre in Information, Communication and Knowledge Technologies (GR)
  • Consiglio Nazionale delle Ricerche - Istituto di Linguistica Computazionale (IT)
  • Politechnika Wroclawska (PL)
  • Kungliga Tekniska Hoegskolan (SE)
  • Deutsches Forschungszentrum für Künstliche Intelligenz GmbH (DE)
  • The University of Sheffield (UK)
  • Instytut Podstaw Informatyki Polskiej Akademii Nauk (PL)
  • European Language Resources Distribution Agency S.A. (FR)
  • Lancaster University (UK)
  • Universität Wien (AT)
  • University of Tartu (EE)
  • CSC-Tieteellinen laskenta Oy (FI)
  • Katholieke Universiteit Leuven (BE)

Publications about the project

  1. Linguistic and Semantic Representation of the Thompson's Motif-Index of Folk-Literature

    Thierry Declerck; Piroska Lendvai

    In: Stefan Gradmann; Francesca Borri; Carlo Meghini; Heiko Schuldt (eds.). Research and Advanced Technology for Digital Libraries - International Conference on Theory and Practice of Digital Libraries. International Conference on Theory and Practice of Digital Libraries (TPDL-2011), Research and Advanced Technology for Digital Libraries, September 26-28, Berlin, Germany, Lecture Notes in Computer…
  2. A Text Technology Infrastructure for Annotating Corpora in the eHumanities

    Thierry Declerck; Ulrike Czeitschner; Karlheinz Moerth; Claudia Resch; Gerhard Budin

    In: Stefan Gradmann; Francesca Borri; Carlo Meghini; Heiko Schuldt (eds.). Research and Advanced Technology for Digital Libraries - International Conference on Theory and Practice of Digital Libraries. International Conference on Theory and Practice of Digital Libraries (TPDL-2011), September 26-28, Berlin, Germany, LNCS, No. 6966, ISBN 978-3-642-24468-1, Springer, 9/2011.
  3. Proppian Content Descriptors in an Integrated Annotation Schema for Fairy Tales

    Thierry Declerck; Antonia Scheidel; Piroska Lendvai

    In: Caroline Sporleder; Antal van den Bosch; Kalliopi Zervanou. Language Technology for Cultural Heritage. Selected Papers from the LaTeCH Workshop Series. Pages 155-169, Theory and Applications of Natural Language Processing, ISBN 978-3-642-20226-1, Springer, Heidelberg, 2011.

Sponsors

EU - European Union

CP-CSA_INFRA-2007-2.2.01
