Detecting Textual Information in Images from Onion Domains Using Text Spotting

Pablo Blanco; Eduardo Fidalgo; Enrique Alegre; Mhd Wesam Al-Nabki

doi:10.17979/SPUDC.9788497497565.0975

Universidad de León

León, Spain

ROR https://ror.org/02tzt0b78

Book:

XXXIX Jornadas de Automática: actas. Badajoz, 5-7 de septiembre de 2018

Inés Tejado Balsera (coord.)
Emiliano Pérez Hernández (coord.)
Antonio José Calderón Godoy (coord.)
Isaías González Pérez (coord.)
Pilar Merchán García (coord.)
Jesús Lozano Rogado (coord.)
Santiago Salamanca Miño (coord.)
Blas M. Vinagre Jara (coord.)

Publisher: Universidad de Extremadura

ISBN: 978-84-9749-756-5, 978-84-09-04460-3

Year of publication: 2018

Pages: 975-982

Congress: Jornadas de Automática (39. 2018. Badajoz)

Type: Conference paper

Abstract

Due to the efforts of different authorities in the fight against illegal activities on the Tor network, traders have developed new ways of circumventing the monitoring tools used to obtain evidence of those activities. In particular, embedding textual content in graphical objects prevents text analysis based on Natural Language Processing (NLP) algorithms from being used to monitor such onion web content. In this paper, we present a Text Spotting framework dedicated to detecting and recognizing textual information within images hosted in onion domains. We found that the Connectionist Text Proposal Network combined with a Convolutional Recurrent Neural Network achieves an F-measure of 0.57 when the full pipeline is run on a manually labeled subset of 100 images from the TOIC dataset. We also identified the parameters that have a critical influence on the Text Spotting results. The proposed technique might support tools that help the authorities detect these activities.
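
As an illustration of the F-measure reported above, the following minimal sketch scores recognized words against manually labeled words for a single image. It is not the authors' evaluation protocol, which this record does not describe: the function names `count_matches` and `f_measure`, the case-insensitive matching, and the decision to ignore bounding-box overlap are all assumptions made for the example, and the sample words are invented.

```python
from collections import Counter


def count_matches(predicted_words, reference_words):
    """Return (true_positives, false_positives, false_negatives).

    Assumption: a predicted word counts as correct if the same word,
    compared case-insensitively, appears in the reference labels.
    Localization (bounding-box overlap) is deliberately ignored.
    """
    predicted = Counter(w.lower() for w in predicted_words)
    reference = Counter(w.lower() for w in reference_words)
    # Multiset intersection keeps the minimum count of each shared word.
    true_positives = sum((predicted & reference).values())
    false_positives = sum(predicted.values()) - true_positives
    false_negatives = sum(reference.values()) - true_positives
    return true_positives, false_positives, false_negatives


def f_measure(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall; 0.0 when nothing matches."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: words read from one image vs. its manual labels.
predicted = ["Bitcoin", "wallet", "sale"]
reference = ["bitcoin", "wallet", "escrow", "market", "sale"]
tp, fp, fn = count_matches(predicted, reference)
score = f_measure(tp, fp, fn)
```

In this invented example all three predicted words are correct but two labeled words are missed, so precision is 1.0, recall is 0.6, and the F-measure is 0.75. A dataset-level score would typically accumulate the three counts over all images before calling `f_measure`.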

Data source: Dialnet
