As wearable sensors become increasingly ubiquitous, one would expect human activity recognition (HAR) systems to move out of the lab and into long-term, real-life scenarios. While the availability of sensor data as such is growing rapidly with the spread of sensor-enabled personal devices, labelling such data remains a challenge. Recently, the idea of using labelled videos to generate "synthetic" sensor data has been proposed as a solution to the labelled-data problem in HAR. When developing a new HAR application, it would then no longer be necessary to go through the difficult process of recording large amounts of real sensor data from users actually wearing sensors and annotating their own sensor data. Instead, one could first collect videos related to the activities that need to be recognised from appropriate online sources. If not already assigned to the respective activities (which is often the case with online videos), the videos would then be labelled (which, as described above, is much easier than labelling sensor data) and converted to synthetic sensor data that can be used to train the system.
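To make the conversion step concrete, one common way to sketch it is to extract 3D joint trajectories from the video with a pose estimator and double-differentiate the trajectory of the joint where the sensor would be worn to approximate accelerometer readings. The snippet below is a minimal illustration of that idea only, not the project's actual pipeline; the function name synthetic_accelerometer, the fixed frame rate, and the simplified world-frame gravity handling are all assumptions made for the example.

# Minimal sketch (assumptions as stated above): approximating a wrist-worn
# accelerometer signal from a 3D joint trajectory, given in metres at a
# fixed frame rate, as produced by an off-the-shelf pose estimator.
import numpy as np

GRAVITY = np.array([0.0, 0.0, 9.81])  # gravity in the world frame (m/s^2)

def synthetic_accelerometer(joint_positions: np.ndarray, fps: float) -> np.ndarray:
    """Approximate accelerometer readings from a (T, 3) joint trajectory.

    Double-differentiates the position track to obtain linear acceleration
    and adds gravity; a more complete pipeline would also rotate the signal
    into the (estimated) sensor-local frame and model sensor noise and bias.
    """
    dt = 1.0 / fps
    velocity = np.gradient(joint_positions, dt, axis=0)   # (T, 3) in m/s
    acceleration = np.gradient(velocity, dt, axis=0)       # (T, 3) in m/s^2
    return acceleration + GRAVITY                           # world-frame approximation

# Toy usage: a wrist moving sinusoidally along the x-axis for 5 s at 30 fps.
t = np.arange(0, 5, 1 / 30.0)
wrist = np.stack([0.2 * np.sin(2 * np.pi * t),
                  np.zeros_like(t),
                  np.ones_like(t)], axis=1)                 # (150, 3) positions in m
acc = synthetic_accelerometer(wrist, fps=30.0)
print(acc.shape)                                            # (150, 3)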
In this project we aim to make a systematic and focused effort to ease access to labelled training data for wearable and ubiquitous HAR systems by:
- Improving the initial methods for the generation of IMU data from online videos;
- Extending the methods to
  - leverage multimodal data (video, sound, textual description),
  - incorporate physical models of the sensor and the relevant activity,
  - explore the use of additional semantic information, either extracted from the videos or given by background knowledge;
- Investigating the transferability of the resulting methods to other HAR sensing modalities such as our HeadSense systems, textile pressure sensor mats, or auditory scene analysis.