Pattern recognition in hate speech detection: a neural network and ensemble approach

Authors

DOI:

https://doi.org/10.33448/rsd-v14i5.48633

Keywords:

Hate Speech, Machine Learning, Natural Language Processing.

Abstract

Hate speech on online platforms is a growing problem with significant social impacts. This work proposes an approach for binary classification of hate speech in Portuguese texts using machine learning and deep learning algorithms. The experiments were conducted on an annotated dataset, with textual representations generated by pre-trained GloVe word embeddings. The voting-based model, which combines the outputs of the base classifiers, achieved the best overall performance, reaching an F1-score of 0.76. The results demonstrate the effectiveness of neural networks, especially in capturing complex textual patterns, and highlight the potential of combined approaches for the hate speech classification task. This study reinforces the importance of exploring diverse architectures and preprocessing techniques tailored to the specific characteristics of the Portuguese language.
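The voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not say whether the vote is hard or soft, nor which base classifiers were used, so this example assumes hard (majority) voting over three hypothetical binary classifiers. The names `model_a`, `model_b`, `model_c` and all labels are invented toy data, not results from the study.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample (hard voting)."""
    n_samples = len(predictions_per_model[0])
    combined = []
    for i in range(n_samples):
        votes = Counter(model_preds[i] for model_preds in predictions_per_model)
        # With an odd number of binary classifiers a tie cannot occur.
        combined.append(votes.most_common(1)[0][0])
    return combined


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (1 = hate speech, 0 = not hate speech); purely illustrative.
y_true = [1, 0, 1, 1, 0, 0]
model_a = [1, 0, 0, 1, 0, 1]  # hypothetical base classifier outputs
model_b = [1, 1, 1, 1, 0, 0]
model_c = [0, 0, 1, 1, 0, 0]

ensemble = majority_vote([model_a, model_b, model_c])

for name, preds in [("model_a", model_a), ("model_b", model_b),
                    ("model_c", model_c), ("ensemble", ensemble)]:
    print(f"{name}: F1 = {f1_score(y_true, preds):.2f}")
```

In this toy case each base classifier errs on a different sample, so the majority vote corrects every individual mistake. Real classifiers tend to make correlated errors, which is why ensemble gains in practice are usually smaller.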

Published

2025-05-29

Issue

Vol. 14 No. 5 (2025)

Section

Exact and Earth Sciences

How to Cite

Pattern recognition in hate speech detection: a neural network and ensemble approach. Research, Society and Development, [S. l.], v. 14, n. 5, p. e10814548633, 2025. DOI: 10.33448/rsd-v14i5.48633. Available at: https://ojs34.rsdjournal.org/index.php/rsd/article/view/48633. Accessed: 28 Jun. 2025.