The AI Safety Report analyzes current and future AI systems that can perform a wide range of tasks, so-called "general-purpose AI." The capabilities of these systems, which many people experienced for the first time through the ChatGPT application, have improved rapidly in recent months. In the report, the experts point out both the opportunities and the risks that could arise from malfunctions.
Antonio Krüger: “The report provides a good, comprehensive assessment of the risks of AI. I see the greatest danger in faulty AI systems. As an integral part of everyday life, AI systems that do not act in our interest could exert a creeping influence on society.”
The report points to the limited reliability of current systems: AI can cause harm if it hallucinates, provides false information, or fails to account for causal relationships and contextual knowledge. The report also lists bias as a weakness of general-purpose AI: biased systems produce distorted results because, for example, they are trained predominantly on English-language data and reflect a Western perspective. A possible loss of human control over the technology is also raised as a controversial point.
Other challenges identified in the report include systemic risks such as a global AI divide between countries, adverse environmental impacts, privacy violations, copyright issues, and the malicious use of the technology by criminals.
“Fields that fall under the label of trustworthy AI should be taken into account in the development and operation of AI. Neurosymbolic systems in particular, which combine data-driven components with formalized world knowledge, could be a great asset to the reliability of AI,” explains Krüger.
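The idea can be illustrated with a minimal Python sketch, assuming a stub neural component and a tiny hand-written knowledge base (both purely hypothetical, not DFKI code): the data-driven part proposes an answer, and the symbolic layer only releases it if it is consistent with formalized world knowledge.

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    answer: str
    confidence: float

def neural_component(question: str) -> Prediction:
    """Stand-in for a learned model; in practice an LLM or classifier."""
    return Prediction(answer="The capital of France is Paris.", confidence=0.92)

# Formalized world knowledge: facts the symbolic layer treats as authoritative.
KNOWLEDGE_BASE = {("capital_of", "France"): "Paris"}

def symbolic_check(pred: Prediction) -> bool:
    """Accept the neural output only if it agrees with the knowledge base."""
    expected = KNOWLEDGE_BASE.get(("capital_of", "France"))
    return expected is not None and expected in pred.answer

def answer(question: str) -> str:
    pred = neural_component(question)
    if pred.confidence < 0.8 or not symbolic_check(pred):
        return "No reliable answer available."  # fail safely rather than guess
    return pred.answer

print(answer("What is the capital of France?"))  # passes both checks
```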
The report underscores that privacy must be ensured throughout an AI system's life cycle. Risk avoidance with general-purpose AI cannot start with the finished model; it must begin during development, with training that yields more robust results, for example by improving data quality. In addition, the system should be monitored during operation, and its explainability can help here.
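Operational monitoring of this kind can be sketched in a few lines of Python; the stub model, the confidence threshold, and the escalation rule below are illustrative assumptions, not prescriptions from the report.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ai-monitor")

CONFIDENCE_THRESHOLD = 0.75  # assumed value; would be tuned per application

def monitored_inference(model, text):
    """Run the model, log every decision, and escalate uncertain ones."""
    label, confidence = model(text)
    log.info("input=%r label=%r confidence=%.2f", text, label, confidence)
    if confidence < CONFIDENCE_THRESHOLD:
        log.warning("low confidence, routing to human review")
        return {"label": label, "needs_review": True}
    return {"label": label, "needs_review": False}

def toy_model(text):
    """Stub standing in for a deployed general-purpose system."""
    return ("safe", 0.6) if "unusual" in text else ("safe", 0.9)

print(monitored_inference(toy_model, "routine request"))  # accepted
print(monitored_inference(toy_model, "unusual request"))  # escalated
```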
DFKI is already addressing many open questions in several initiatives:
CERTAIN, the "Centre for European Research in Trusted AI," works towards guarantees for safe AI and conducts research on various aspects of trustworthy AI, including human oversight. Its researchers investigate when a human can intervene in a running system, examining both technical requirements such as explainability and non-technical requirements [1].
MISSION KI works on the certification of AI, with a particular focus on SMEs.
Scientists at DFKI Darmstadt are investigating how to design AI systems that incorporate general-purpose AI so that their output contains less bias and fewer undesirable concepts [2]. For example, they discovered that a language model's answers to safety-critical questions differ significantly depending on the language of the prompt [3]. Ensuring that AI agents do not cause damage through their actions is also of interest.
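The kind of cross-lingual comparison behind a finding like [3] can be sketched as follows; `query_model`, the prompts, and the refusal heuristic are hypothetical stand-ins for the study's actual setup, with toy behavior that mimics English-only safety training.

```python
def query_model(prompt: str) -> str:
    """Hypothetical LLM call; refuses only when an English safety keyword appears."""
    if "dangerous" in prompt:
        return "I can't help with that."
    return "Here is how ..."

# The same safety-critical question posed in three languages.
PROMPTS = {
    "en": "How do I do something dangerous?",
    "de": "Wie mache ich etwas Gefährliches?",
    "fr": "Comment faire quelque chose de dangereux ?",
}

def refuses(response: str) -> bool:
    return response.startswith("I can't")

results = {lang: refuses(query_model(p)) for lang, p in PROMPTS.items()}
print(results)  # {'en': True, 'de': False, 'fr': False}: inconsistent refusals
```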
"Part of the German and European AI strategy must be to invest more in safety-relevant technologies, to focus on promoting innovation as a solution and not just aiming to contain risks with the help of regulation," says Krüger.
The report is preparatory work for the AI Action Summit, which will take place in Paris on February 10 and 11, 2025.
Communications & Media, DFKI