Botnet Detection on TCP Traffic Using Supervised Machine Learning

  1. Javier Velasco-Mata 1,2
  2. Eduardo Fidalgo 1,2
  3. Víctor González-Castro 1,2
  4. Enrique Alegre 1,2
  5. Pablo Blanco-Medina 1,2
  1. 1 Universidad de León, León, Spain (ROR: https://ror.org/02tzt0b78)
  2. 2 INCIBE (Spanish National Cybersecurity Institute, León)
Book:
Hybrid Artificial Intelligent Systems. 14th International Conference, HAIS 2019: León, Spain, September 4–6, 2019. Proceedings
  1. Hilde Pérez García (coord.)
  2. Lidia Sánchez González (coord.)
  3. Manuel Castejón Limas (coord.)
  4. Héctor Quintián Pardo (coord.)
  5. Emilio Corchado Rodríguez (coord.)

Publisher: Springer (Switzerland)

ISBN: 978-3-030-29859-3, 978-3-030-29858-6

Year of publication: 2019

Pages: 444-455

Conference: Hybrid Artificial Intelligent Systems (14th, 2019, León)

Type: Conference contribution

Abstract

The growing presence of botnets on the Internet has made it necessary to detect their activity in order to prevent them from attacking and spreading. The main methods for detecting botnets are traffic classifiers and sinkhole servers, which are special servers designed as traps for botnets. However, sinkholes also receive non-malicious automated online traffic, and therefore they too need traffic classifiers. For these reasons, we have created two new datasets to evaluate classifiers: the TCP-Int dataset, built from publicly available TCP Internet traces of normal traffic and of three botnets (Kelihos, Miuref and Sality); and the TCP-Sink dataset, based on traffic from a private sinkhole server with traces of the Conficker botnet and of automated normal traffic. We used the two datasets to test four well-known Machine Learning classifiers: Decision Tree, k-Nearest Neighbours, Support Vector Machine and Naïve Bayes. On the TCP-Int dataset, we used the F1 score to measure the capability to identify the type of traffic, i.e., whether a trace is normal or comes from one of the three botnets considered, while on the TCP-Sink dataset we used ROC curves and the corresponding AUC score, since that dataset has only two classes: non-malicious or botnet traffic. The best performance was achieved by Decision Tree, with a 0.99 F1 score on the TCP-Int dataset and a 0.99 AUC score on the TCP-Sink dataset.
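
To make the two evaluation metrics named in the abstract concrete, the following is a minimal sketch in plain Python of a multi-class F1 score and a binary ROC AUC. Several things here are assumptions and not taken from the paper: the abstract does not say how F1 is averaged across the four traffic classes, so macro-averaging is used for illustration; the function names `f1_per_class`, `macro_f1` and `roc_auc` are hypothetical; and the labels and scores are toy values, not samples from TCP-Int or TCP-Sink.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1}, treating each label one-vs-rest."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive sample scores higher
    than a random negative one; ties count as one half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy four-class example in the style of TCP-Int (made-up labels).
int_true = ["normal", "normal", "kelihos", "kelihos",
            "miuref", "miuref", "sality", "sality"]
int_pred = ["normal", "normal", "kelihos", "miuref",
            "miuref", "miuref", "sality", "normal"]
print("macro F1:", round(macro_f1(int_true, int_pred), 4))

# Toy two-class example in the style of TCP-Sink (1 = botnet, 0 = normal).
sink_true = [0, 0, 0, 1, 1, 1]
sink_score = [0.10, 0.40, 0.35, 0.80, 0.30, 0.90]
print("AUC:", round(roc_auc(sink_true, sink_score), 4))
```

The pairwise AUC above is quadratic in the number of samples, which is fine for illustration; on real traffic traces a library routine such as scikit-learn's `roc_auc_score` would be the more practical choice.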