Oculus-Crawl, a Software Tool for Building Datasets for Computer Vision Tasks

  1. Iván de Paz Centeno
  2. Eduardo Fidalgo Fernández
  3. Enrique Alegre Gutiérrez
  4. Wesam Al Nabki
Book:
Actas de las XXXVIII Jornadas de Automática
  1. Hilario López García (coord.)

Publisher: Servicio de Publicaciones ; Universidad de Oviedo

ISBN: 978-84-16664-74-0

Year of publication: 2017

Pages: 991-998

Conference: Jornadas de Automática (38. 2017. Gijón)

Type: Conference contribution

Abstract

Building datasets for computer vision tasks requires a source of a large number of images, such as those provided by Internet search engines, combined with automated scraping tools, so that the datasets can be constructed in a reasonable time. This paper presents Oculus-Crawl, a modular, scalable and portable tool designed to crawl and scrape images from the Google and Yahoo Images search engines to build datasets of pictures. It also discusses a benchmark for this crawler and an internal feature for storing and sharing large datasets, which makes the tool suitable for computer vision and machine learning tasks. In our tests we were able to crawl and fetch 11,555 images in less than 14 minutes, including their metadata descriptions, which suggests that the tool may be well suited for retrieving large datasets.