Project | Medinym

Duration: 12/15/2022 - 12/14/2025

AI-based anonymization of personal patient data in clinical text and voice databases

Research Topics

Application fields

Health & Medicine

In the Medinym project, we pursue the goal of completely anonymising a speaker's identity, at both the voice level and the statement/semantic level, without losing emotional or diagnostic information. From a data protection perspective, processing speech data in this way opens up enormous application potential.

The Speech and Language Technology (SLT) department contributes significant expertise in the areas of:

NLP/IE for anonymisation, e.g. recognising and obfuscating relevant entities and/or relations, and synthetic data generation for AI learning processes in the medical field.
Speech Synthesis, e.g. voice conversion (VC), speech-to-text (STT), voice cloning, zero-shot learning.
Speech Recognition, e.g. Automatic Speech Recognition (ASR), Multi-Lingual Speech Recognition.
Speaker Recognition, e.g. Automatic Speaker Recognition and Verification (ASV), Multi-Lingual Speaker Recognition.
Emotion Recognition from Speech, Text, Video/Images, and Multimodal Input, e.g. Transformer-based models, acoustic models, linguistic models (language models), and visual models (facial expression, landmarks).
Crowd-based AI support, e.g. automated, online-orchestrated hybrid AI+human workflows that combine crowdsourcing and expert sourcing for high-quality data acquisition.
AI in the area of pre-trained language models, transfer learning, cross-lingual learning, continuous learning, frugal AI, LLMs, RLHF.

Motivation

The ongoing scientific development of technologies based on artificial intelligence (AI) opens up new potential for medical applications. The practical use of these technologies by a large number of users, such as citizens, public authorities, health care workers and small and medium-sized enterprises, is hampered by data security and data protection requirements. Particularly in the automated processing of medical data, innovative technologies often cannot be used, as the protection of identity is rightly a high priority due to the sensitive content. The protection of clinical data and the resulting difficulty in accessing it also mean that machine learning (ML), for example for clinical diagnoses, prognoses, therapy or decision support, cannot be developed without major hurdles.

Aims and approach

The project "AI-based Anonymisation of Personal Patient Data in Clinical Text and Speech Datasets" (Medinym) investigates whether sensitive data can be made available for further use by removing the sensitive information through anonymisation. Two medical use cases are implemented as examples: text-based data from the electronic patient file and voice data from diagnostic doctor-patient conversations. To this end, open technologies for anonymisation are being investigated, further developed and applied to real data in the project. The researchers are also investigating how the informative value of such anonymised data can be preserved for further use. In addition, methods will be considered that prevent or impede misuse of the technology outside of the intended use case.
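As a rough illustration of the text use case, the following minimal Python sketch replaces a few kinds of identifying expressions with typed placeholder tags, so that the kind of information removed stays visible in the text. It is not the project's method: the hand-written patterns, the labels, the `obfuscate` function and the sample note are all assumptions made for this example, and an AI-based system would typically rely on trained entity recognition instead of fixed rules.

```python
import re

# Hypothetical patterns for illustration only; they cover just a few
# surface forms and would miss many real identifiers.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}[./]\d{1,2}[./]\d{2,4}\b"),
    "PHONE": re.compile(r"\+?\d[\d /-]{7,}\d"),
    "NAME": re.compile(r"\b(?:Dr\.|Mr\.|Mrs\.|Ms\.)\s+[A-Z][a-z]+\b"),
}


def obfuscate(text: str) -> str:
    """Replace each matched entity with a placeholder tag such as [DATE]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# Invented sample note, not real patient data.
note = "Mrs. Example was seen on 03.02.2021 and can be reached at +49 30 1234567."
print(obfuscate(note))
# prints: [NAME] was seen on [DATE] and can be reached at [PHONE].
```

Using typed tags rather than deleting the matched text is one simple way to keep part of the structure of a note while hiding the identifying values themselves.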

Innovations and perspectives

Information-preserving anonymisation should make it possible to further process clinical data, since de-anonymisation is no longer possible. These data sets can then be used to train AI models on clinical data in a privacy-compliant manner or be extended to other cohorts. This would make a cumulative collection of corresponding data sets possible even for small and medium-sized enterprises: sensitive data could be combined across several application purposes and used for AI training routines, always assuming appropriate anonymisation. The intended anonymisation should also increase the willingness of patients to consent to participation in studies, data analyses and general donations of health data. Finally, information-preserving anonymisation allows the technology to be integrated into common development methods and diagnostic systems and thus strengthens Germany as a science and business location in the areas of diagnostics, treatment and health care in general.

Lead: Dr. Tim Polzehl

Dr. Tim Polzehl leads the AI-based developments in the area of speech-based applications in the Speech and Language Technology department at DFKI. In addition, he leads the area of "Next Generation Crowdsourcing and Open Data" and is an active member of the "Speech Technology" group of the Quality and Usability Labs (QU-Labs) at the Technical University of Berlin.

Profile DFKI: https://www-live.dfki.de/web/ueber-uns/mitarbeiter/person/tipo02

Profile QU-Labs TU-Berlin: https://www.tu.berlin/index.php?id=29499/

Contact: tim.polzehl@dfki.de

Contact Person

Dr.-Ing. Tim Polzehl

Tim.Polzehl@dfki.de
Phone: +49 30 23895 1863

Dr. Roland Roller

Roland.Roller@dfki.de
Phone: +49 30 23895 1847

Keyfacts

Involved research areas

Speech and Language Technology

Head

Dr.-Ing. Tim Polzehl

Publications

Exploring Foundation Model Fusion Effectiveness and Explainability for Stylistic Analysis of Emotional Podcast Data
Arnab Das; Carlos Franzreb; Tim Polzehl; Sebastian Möller
In: Advances in Information and Communication. Future of Information and Communication Conference (FICC-2025), March 4-5, Berlin, Germany, Springer Nature, Switzerland, 2025.

Speecher: Towards Privacy Ensuring Decoder Only Speech Reconstruction Through Disentanglement for German Speech Anonymization Using Any-to-Many Voice Conversion
Arnab Das; Carlos Franzreb; Suhita Ghosh; Tim Polzehl; Sebastian Möller
In: ISCA Archive. Symposium on Security and Privacy in Speech Communication (SPSC-2024), ISCA, 9/2024.

Comparing Speech Anonymization Efficacy by Voice Conversion Using KNN and Disentangled Speaker Feature Representations
Arnab Das; Carlos Franzreb; Tim Herzig; Philipp Pirlet; Tim Polzehl
In: ISCA Archive. Symposium on Security and Privacy in Speech Communication (SPSC-2024), Voice Privacy Challenge (VPC), ISCA, 2024.

Funding Authorities

BMBF - Federal Ministry of Education, Science, Research and Technology
