Data-efficient deep learning: age, expression, pornography, and sexual activity detection for crime prevention

  1. Abhishek Gangwar Kumar
Supervised by:
  1. Víctor González Castro Director
  2. Enrique Alegre Gutiérrez Director

Defence university: Universidad de León

Year of defence: 2022

Committee:
  1. Laura Fernández Robles Chair
  2. Ana Lucila Sandoval Orozco Secretary
  3. Sandra Eliza Fontes de Ávila Committee member

Type: Thesis

Abstract

The growing number of cases of possession and distribution of Child Sexual Abuse (CSA) material poses a significant challenge for Law Enforcement Agencies (LEAs). Pornographic content detection is likewise an active area of research. In this Thesis, we propose several novel methods for recognising CSA and pornographic material. Our strategy divides the problem of automatically detecting CSA into two simpler ones: (i) pornographic content detection and (ii) age-group classification of a person as a minor or an adult. This approach not only simplifies the problem but also makes it feasible to create the massive labelled datasets needed to train deep neural networks. With this goal in mind, we first propose a deep Convolutional Neural Network (CNN) architecture with a novel attention mechanism and metric learning, denoted AttMCNN, for these two tasks. The pornography detection and age-group classification networks are then combined for CSA detection using two different strategies: decision-level fusion for binary CSA classification and score-level fusion for the re-arrangement of suspicious images. We also introduce two new datasets: (i) Pornographic-2M, which contains two million pornographic images, and (ii) Juvenile-80k, which contains 80k images manually labelled with apparent facial age. Our approach outperformed other state-of-the-art methods on a test dataset comprising one million adult pornographic images, one million non-pornographic images, and 5,000 real CSA images provided to us by Police Forces. The methods and datasets introduced in this Thesis are useful for the research community, and our work extends the state of the art in pornography detection, age estimation, and CSA detection. Together, these methods can help LEAs detect CSA material automatically and more accurately.
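
The difference between the two fusion strategies named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fusion rules, so everything here is an assumption made for illustration: the `ImageScores` container, the 0.5 thresholds, the AND rule for decision-level fusion, the product rule for score-level fusion, and the reading of "re-arrangement" as sorting by fused score. It is not the AttMCNN implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ImageScores:
    """Hypothetical per-image outputs of the two networks."""
    image_id: str
    porn_score: float   # assumed probability in [0, 1] of pornographic content
    minor_score: float  # assumed probability in [0, 1] that a person is a minor


def decision_level_fusion(s: ImageScores,
                          porn_threshold: float = 0.5,
                          minor_threshold: float = 0.5) -> bool:
    """Binary decision: threshold each network separately, then AND the decisions.

    The thresholds and the AND rule are illustrative assumptions.
    """
    return s.porn_score >= porn_threshold and s.minor_score >= minor_threshold


def score_level_fusion(s: ImageScores) -> float:
    """Continuous fused score: product of the two scores (an assumed rule)."""
    return s.porn_score * s.minor_score


def rank_by_fused_score(items: List[ImageScores]) -> List[ImageScores]:
    """Order images so that the highest fused scores come first."""
    return sorted(items, key=score_level_fusion, reverse=True)


# Synthetic scores, used only to exercise the functions.
items = [
    ImageScores("a", porn_score=0.9, minor_score=0.2),
    ImageScores("b", porn_score=0.8, minor_score=0.7),
    ImageScores("c", porn_score=0.3, minor_score=0.9),
]
flags = {s.image_id: decision_level_fusion(s) for s in items}
ranking = [s.image_id for s in rank_by_fused_score(items)]
```

In this sketch, decision-level fusion yields a yes/no label per image, whereas score-level fusion keeps a continuous value, which is what allows a set of suspicious images to be ordered for review.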