DFKI makes many of its developments available as open source. Discover here a selection of our freely accessible software projects and datasets, which you can also find on platforms such as GitHub or HuggingFace.
The goal of Ophthalmo-AI is to develop better diagnostic and therapeutic decision support in ophthalmology through effective collaboration of machine and human expertise (Interactive Machine Learning, IML). DFKI aims to interactively integrate clinical guidelines and the knowledge of medical professionals (expert knowledge, or human intelligence) with machine learning (artificial intelligence) into the diagnostic process, forming a so-called augmented intelligence system.
A dataset specifically designed for spherical keypoint detection, matching, and camera pose estimation. It addresses the limitations of existing datasets by providing keypoints extracted with popular handcrafted and learning-based detectors, along with their ground-truth correspondences. The dataset includes synthetic scenes with photo-realistic rendering, accurate depth maps, and 3D meshes, as well as real-world scenes acquired with different spherical cameras. SphereCraft enables the development and evaluation of algorithms and models targeting multiple camera viewpoints, aiming to advance the state of the art in computer vision tasks involving spherical images.
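As a rough illustration of how precomputed keypoints and ground-truth correspondences can be used together, the sketch below runs mutual nearest-neighbor matching on descriptors and scores the matches against ground truth. The file names, array layouts, and detector name are assumptions for illustration, not SphereCraft's actual data format.

```python
# Minimal sketch: match precomputed descriptors and score against ground truth.
# File names and shapes are hypothetical placeholders.
import numpy as np

desc_a = np.load("scene/img_a_keypoint_desc.npy")   # (N, D) descriptors, image A
desc_b = np.load("scene/img_b_keypoint_desc.npy")   # (M, D) descriptors, image B
gt = np.load("scene/gt_correspondences.npy")        # (K, 2) ground-truth index pairs

# Mutual nearest-neighbor matching on Euclidean descriptor distance
dist = np.linalg.norm(desc_a[:, None] - desc_b[None], axis=-1)  # (N, M)
nn_ab = dist.argmin(axis=1)
nn_ba = dist.argmin(axis=0)
mutual = np.where(nn_ba[nn_ab] == np.arange(len(desc_a)))[0]
matches = np.stack([mutual, nn_ab[mutual]], axis=1)

# Precision: fraction of predicted matches that agree with ground truth
gt_set = {tuple(p) for p in gt}
precision = np.mean([tuple(m) in gt_set for m in matches])
print(f"matching precision: {precision:.3f}")
```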
We introduce the Open X-Embodiment Dataset, the largest open-source real robot dataset to date. It contains 1M+ real robot trajectories spanning 22 robot embodiments, from single robot arms to bi-manual robots and quadrupeds. The dataset was constructed by pooling 60 existing robot datasets from 34 robotic research labs around the world. Our analysis shows that the number of visually distinct scenes is well-distributed across different robot embodiments and that the dataset includes a wide range of common behaviors and household objects.
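Since the dataset is distributed in an episodic RLDS-style format, iterating over trajectories typically looks like the sketch below. The directory path is a placeholder, and the exact observation/action keys vary per constituent dataset; treat this as a hedged starting point, not the official loading recipe.

```python
# Sketch of iterating robot trajectories in RLDS format via tensorflow_datasets.
# The local path is a placeholder for a downloaded subset of the dataset.
import tensorflow_datasets as tfds

builder = tfds.builder_from_directory("path/to/open_x_embodiment_subset")
ds = builder.as_dataset(split="train")

for episode in ds.take(1):
    # RLDS: each episode holds a nested dataset of time steps
    for step in episode["steps"]:
        obs = step["observation"]   # e.g. camera images, proprioception
        action = step["action"]     # robot action for this step
```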
pytransform3d provides various functions for working with transformations, including operations such as concatenation and inversion for common representations of rotation and translation, as well as conversions between them. It documents its transformation conventions clearly and, through tight coupling with matplotlib, enables quick visualization and animation.
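A short sketch of typical usage, chaining and inverting homogeneous transformation matrices (the frame names A, B, C are illustrative):

```python
import numpy as np
from pytransform3d.rotations import matrix_from_axis_angle
from pytransform3d.transformations import transform_from, concat, invert_transform

# Transform from frame A to B: 45 degree rotation about z plus a translation
A2B = transform_from(
    R=matrix_from_axis_angle([0.0, 0.0, 1.0, np.pi / 4.0]),
    p=[1.0, 0.0, 0.0])
# Transform from frame B to C: pure translation
B2C = transform_from(R=np.eye(3), p=[0.0, 2.0, 0.0])

A2C = concat(A2B, B2C)        # chain both transformations
C2A = invert_transform(A2C)   # invert the result
```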
Sample efficiency is a crucial problem in deep reinforcement learning. Recent algorithms, such as REDQ and DroQ, improve sample efficiency by increasing the update-to-data (UTD) ratio to 20 critic gradient updates per environment sample, but at the expense of a greatly increased computational cost. To reduce this computational burden, we introduce CrossQ: a lightweight algorithm for continuous control tasks that makes careful use of Batch Normalization and removes target networks to surpass the current state of the art in sample efficiency while maintaining a low UTD ratio of 1. Notably, CrossQ does not rely on the advanced bias-reduction schemes used in current methods.
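The sketch below illustrates the core idea in PyTorch: a batch-normalized critic without target networks, where current and next state-action pairs pass through the network in one joint batch so that the normalization statistics stay consistent for both. This is a simplified reading of the approach, not the authors' implementation; network sizes and the batch dictionary layout are assumptions.

```python
# Sketch of a CrossQ-style critic update: BatchNorm, no target network,
# joint forward pass over (s, a) and (s', a'). Not the reference implementation.
import torch
import torch.nn as nn

class BNCritic(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.BatchNorm1d(hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.BatchNorm1d(hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))

def critic_loss(critic, policy, batch, gamma=0.99):
    next_act = policy(batch["next_obs"])
    # Concatenate current and next pairs along the batch axis so BatchNorm
    # normalizes both with shared statistics in a single forward pass.
    both = critic(torch.cat([batch["obs"], batch["next_obs"]], dim=0),
                  torch.cat([batch["act"], next_act], dim=0))
    q, next_q = torch.chunk(both, 2, dim=0)
    target = batch["reward"] + gamma * (1.0 - batch["done"]) * next_q.detach()
    return ((q - target) ** 2).mean()
```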
As part of the RoBivaL project, various robot locomotion concepts from space research and agricultural applications are being compared with each other based on experiments conducted under agricultural conditions. The findings obtained are intended to promote technology and knowledge transfer between space research and agricultural research. To further strengthen this transfer, the recorded test data will be stored in a standardized test database developed as part of RoBivaL and made available for future research.
Combines explainability methods from the captum library with Hugging Face's datasets and transformers. By mitigating the repetitive execution of common experiments in explainable NLP, it reduces environmental impact and financial roadblocks, increases the comparability and replicability of research, and lowers the implementation burden.
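To make the combination concrete, here is a hedged sketch of the kind of pipeline being bundled: captum's layer attributions computed over a transformers model. The checkpoint name and target class are example choices, and this shows the underlying libraries directly rather than this tool's own API.

```python
# Sketch: Integrated Gradients over a Hugging Face classifier via captum.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from captum.attr import LayerIntegratedGradients

name = "distilbert-base-uncased-finetuned-sst-2-english"  # example checkpoint
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name).eval()

def forward(input_ids, attention_mask):
    return model(input_ids, attention_mask=attention_mask).logits

enc = tok("An unexpectedly good movie.", return_tensors="pt")
lig = LayerIntegratedGradients(forward, model.distilbert.embeddings)
attributions = lig.attribute(
    enc["input_ids"],
    baselines=torch.full_like(enc["input_ids"], tok.pad_token_id),
    additional_forward_args=(enc["attention_mask"],),
    target=1)  # attribute w.r.t. the "positive" class
```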
A dataset of human manipulation actions recorded with a Qualisys motion capture system. We tracked individual finger movements as well as the position and orientation of the right hand. Some recordings contain additional markers on the back, shoulder, and elbow.
MARS is a platform-independent simulation and visualization tool created for robotics research. It consists of a core framework that contains all the important simulation components: a GUI (based on Qt), a 3D visualization (using Open Scene Graph), and a physics engine (based on ODE). MARS has a modular design and can be used very flexibly; for example, the physics simulation can be started without the visualization and GUI. It is also possible to extend MARS with your own plugins, thereby adding new functionality.
We introduce LlavaGuard, a family of VLM-based safeguard models, offering a versatile framework for evaluating the safety compliance of visual content. Specifically, we designed LlavaGuard for dataset annotation and generative model safeguarding. To this end, we collected and annotated a high-quality visual dataset incorporating a broad safety taxonomy, which we use to tune VLMs on context-aware safety risks. As a key innovation, LlavaGuard's responses contain comprehensive information, including a safety rating, the violated safety categories, and an in-depth rationale. Further, the customizable taxonomy categories we introduce enable context-specific alignment of LlavaGuard to various scenarios.
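Since each response carries a rating, the violated categories, and a rationale, downstream code can consume it as structured data. The sketch below shows one way to parse such a response; the JSON field names and category string are assumptions for illustration, not LlavaGuard's exact output schema.

```python
# Sketch of handling a structured safety assessment; field names are hypothetical.
import json
from dataclasses import dataclass

@dataclass
class SafetyAssessment:
    rating: str       # e.g. "Safe" / "Unsafe"
    category: str     # violated safety category, if any
    rationale: str    # in-depth explanation of the decision

raw = '{"rating": "Unsafe", "category": "<category name>", "rationale": "..."}'
d = json.loads(raw)
assessment = SafetyAssessment(d["rating"], d["category"], d["rationale"])

if assessment.rating == "Unsafe":
    print(f"Flagged: {assessment.category} -- {assessment.rationale}")
```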