Language Data Space Workshop Report

The Language Data Space aims to give stakeholders the opportunity to monetise their language resources (data, tools, services, models, etc.), while also supporting the deployment of language models and language technology services for their businesses, in a single marketplace. Its objective is to create an interconnected and competitive European data economy for the promotion and re-use of language resources.

In line with the European Data Strategy and the launch of the DIGITAL Programme, the European Commission organised a series of eight workshops. They targeted several sectors (the news, broadcasting, advertising, publishing, language technology and telecommunication industries, as well as libraries, archives and public administrations). The goal was not only to present the ‘Language Data Space’ concept, but also to gather insights from the different stakeholder groups.

Over 100 participants representing organisations including the Federation of European Publishers, the Federation of European Data and Marketing, the European Telecommunication Network and the European Broadcasting Union attended these workshops. They identified several opportunities, for instance monetising language data, counteracting the fragmentation of the European Language Technologies landscape, and enriching it with high-quality data covering different procedures, business domains and use cases. In addition, stakeholders maintained that indispensable ‘enabling conditions’ must be implemented and that certain challenges have yet to be overcome at a technical (e.g., promoting standards, normalising metadata, designing and developing the architecture), legal (e.g., complying with GDPR, implementing IPR clearance and correct licensing) or operational (e.g., defining governance, fostering sustainability and interoperability) level.

The Language Data Space will be financed as follows:

Framework Programme: Digital Europe Work Programme 2021-2022;
Type of Action: PROCUREMENT;
Indicative Budget: €6 million;
Indicative Time of the Call Opening: July – September 2022;
Indicative Starting Date: Early 2023;
Link to the LDS procurement text on the TED eTendering website.

The workshop material, including the workshop report and the EC presentation on the Language Data Space project, is available in the attachments: