Color SIFT Descriptors to Categorize Illegal Activities in Images of Onion Domains
- David Matilla 1
- Víctor González-Castro 1
- Laura Fernández-Robles 1
- Eduardo Fidalgo 1
- Mhd Wesam Al-Nabk 1
1 Universidad de León
- Inés Tejado Balsera (coord.)
- Emiliano Pérez Hernández (coord.)
- Antonio José Calderón Godoy (coord.)
- Isaías González Pérez (coord.)
- Pilar Merchán García (coord.)
- Jesús Lozano Rogado (coord.)
- Santiago Salamanca Miño (coord.)
- Blas M. Vinagre Jara (coord.)
Publisher: Universidad de Extremadura
ISBN: 978-84-9749-756-5, 978-84-09-04460-3
Year of publication: 2018
Pages: 991-997
Congress: Jornadas de Automática (39. 2018. Badajoz)
Type: Conference paper
Abstract
The Dark Web, i.e. the portion of the Web whose content is neither indexed nor accessible by standard web browsers, comprises several darknets. The Onion Router (Tor) is the best known of them, thanks to the anonymity it provides to its users, which has led to the creation of domains, or hidden services, that host illegal activities. In this work, we explored the possibility of identifying illegal domains on the Tor darknet based on their visual content. After crawling and filtering the images of 500 hidden services, we sorted them into five different illegal categories and trained a classifier using the Bag of Visual Words (BoVW) model. In this model, SIFT (Scale Invariant Feature Transform) or dense SIFT descriptors were computed on the image patches to obtain the visual words. However, SIFT only works with gray-scale images; thus, the color information in an image is not captured. To overcome this drawback, we implemented and assessed the performance of three different variants of SIFT descriptors that can be used on color images, among them HSV-SIFT and RGB-SIFT, within the BoVW model for image classification. The results showed the usefulness of color-SIFT descriptors over SIFT: in our experiments, SIFT achieved an accuracy of 57.52%, whereas the HSV-SIFT descriptor achieved an accuracy of up to 59.44%.
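The two ideas the abstract combines, per-channel color descriptors and BoVW histograms, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common construction in which a color variant such as HSV-SIFT or RGB-SIFT describes each channel separately and concatenates the results. A toy intensity histogram (`toy_descriptor`, a hypothetical stand-in) replaces real SIFT so the example stays self-contained, and the visual vocabulary is taken as given, whereas in practice it is usually learned by clustering training descriptors.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
Patch = Sequence[Sequence[float]]  # one channel of an image patch, values in [0, 1]


def toy_descriptor(channel_patch: Patch, bins: int = 4) -> Vector:
    """Hypothetical stand-in for SIFT: a normalized intensity histogram."""
    hist = [0.0] * bins
    for row in channel_patch:
        for value in row:
            index = min(int(value * bins), bins - 1)
            hist[index] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def color_descriptor(
    patch_channels: Sequence[Patch],
    describe: Callable[[Patch], Vector] = toy_descriptor,
) -> Vector:
    """Describe each color channel separately and concatenate the results.

    Assumption: this mirrors how per-channel color variants of SIFT are
    commonly built (e.g. three 128-dimensional SIFT vectors giving 384 values).
    """
    descriptor: Vector = []
    for channel in patch_channels:
        descriptor.extend(describe(channel))
    return descriptor


def nearest_word(descriptor: Vector, vocabulary: Sequence[Vector]) -> int:
    """Index of the visual word closest to the descriptor (Euclidean distance)."""
    distances = [math.dist(descriptor, word) for word in vocabulary]
    return distances.index(min(distances))


def bovw_histogram(descriptors: Sequence[Vector], vocabulary: Sequence[Vector]) -> Vector:
    """Normalized count of how often each visual word occurs in one image."""
    counts = [0.0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]


# Usage: two synthetic single-channel patches and a two-word vocabulary.
dark = [[0.0, 0.1], [0.1, 0.0]]
bright = [[0.9, 1.0], [1.0, 0.9]]
vocabulary = [
    color_descriptor([dark, dark, dark]),
    color_descriptor([bright, bright, bright]),
]
image_descriptors = [
    color_descriptor([dark, dark, dark]),
    color_descriptor([dark, dark, dark]),
    color_descriptor([bright, bright, bright]),
    color_descriptor([dark, dark, bright]),
]
image_histogram = bovw_histogram(image_descriptors, vocabulary)
```

The resulting histogram is the fixed-length feature vector that a classifier would be trained on. Swapping `toy_descriptor` for a real SIFT extractor changes only the `describe` argument.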