Pornography Detection in Video Benefits (a lot) from a Multi-modal Approach
Adrian Ulges; Christian Schulze; Damian Borth; Armin Stahl
In: Proceedings of the International Conference on Multimedia. ACM International Workshop on Audio and Multimedia Methods for Large-Scale Video Analysis (AMVA-2012), located at ACM Multimedia 2012, October 29 - November 2, Nara, Japan, ACM, 10/2012.
We address the challenge of detecting pornographic content in video streams. On offensive material crawled from different pornographic websites and non-offensive clips from YouTube (a total of 500 hours of video), we first study a compressed-domain activity descriptor based on MPEG motion compensation vectors. We show that the approach offers an interesting alternative but generalizes poorly between videos compressed with different codecs, a problem that can be overcome to some extent by adding noise to the image data prior to video compression.
Our main contribution is an evaluation that benchmarks the above motion-based descriptor as well as three other widely used features (audio-based MFCC features, skin color detection, and visual words). Here, we show that a multi-modal approach is a key strategy for an accurate detection of adult content: A combination of the different features gives considerable improvements in accuracy, reducing equal error by 36-56% compared to the best uni-modal system.
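As an illustration of the evaluation idea only, the following minimal sketch combines per-modality classifier scores and measures the equal error rate (the operating point where false positive and false negative rates coincide). The abstract does not say how the features are combined, so the weighted-average late fusion here is an assumption, as are the function names `fuse_scores` and `equal_error_rate` and the toy scores, which are made up and are not results from the paper.

```python
from typing import Dict, List, Sequence


def fuse_scores(scores_by_modality: Dict[str, Sequence[float]],
                weights: Dict[str, float]) -> List[float]:
    """Weighted average of per-modality scores for each clip.

    Assumption: late fusion by weighted averaging; the paper's actual
    fusion scheme is not described in the abstract.
    """
    total = sum(weights[m] for m in scores_by_modality)
    n_clips = len(next(iter(scores_by_modality.values())))
    return [
        sum(weights[m] * scores_by_modality[m][i] for m in scores_by_modality) / total
        for i in range(n_clips)
    ]


def equal_error_rate(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Approximate equal error rate by sweeping thresholds over observed scores.

    A clip is flagged when its score is >= the threshold. The function returns
    the mean of the false positive and false negative rates at the threshold
    where the two rates are closest.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_gap, best_eer = float("inf"), 1.0
    for t in sorted(set(scores)):
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        fpr, fnr = fp / n_neg, fn / n_pos
        gap = abs(fpr - fnr)
        if gap < best_gap:
            best_gap, best_eer = gap, (fpr + fnr) / 2
    return best_eer


# Toy, made-up scores for six clips (1 = offensive, 0 = non-offensive).
labels = [1, 1, 1, 0, 0, 0]
audio = [0.9, 0.4, 0.8, 0.5, 0.2, 0.1]   # hypothetical MFCC-based scores
visual = [0.6, 0.9, 0.3, 0.2, 0.7, 0.1]  # hypothetical visual-words scores

fused = fuse_scores({"audio": audio, "visual": visual},
                    {"audio": 1.0, "visual": 1.0})

print("audio EER: ", equal_error_rate(audio, labels))
print("visual EER:", equal_error_rate(visual, labels))
print("fused EER: ", equal_error_rate(fused, labels))
```

In this toy setting each modality misranks a different clip, so averaging the scores separates the two classes better than either modality alone. That mirrors the abstract's argument qualitatively and says nothing about the size of the reported improvement.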