TY - GEN
T1 - Fake News Detection in a Real-World Spanish Dataset
T2 - 17th Mexican Conference on Artificial Intelligence, COMIA 2025
AU - Nina, Fiorella
AU - Arana, Angelina
AU - Escobedo, Edwin
AU - Dávila, Guillermo
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
AB - The rise of fake news on digital platforms has led to a growing interest in the development of automatic detection models. This research evaluates the effectiveness of different natural language processing (NLP) algorithms to identify fake news in Spanish. For this purpose, we implemented five encoding models: GloVe, BART, XLNet, DeBERTa, and RoBERTa, combined with six classifier models: LSTM, GRU, DNC, CNN, BETO, and ANN. Additionally, we developed a proprietary Spanish-language dataset that includes real and fake news collected from digital newspapers and social media. The models were evaluated using accuracy, recall, F1-score, and AUC-ROC metrics. The GloVe-DNC model achieved the best overall performance, with an accuracy of 0.96, an F1-score of 0.97, a recall of 0.98, and an AUC-ROC of 0.98. In contrast, the BART-LSTM and RoBERTa-ANN models showed the lowest overall results, with AUC-ROC scores of 0.63 and 0.71, respectively. These findings contribute to the field of fake news detection in Spanish and highlight potential directions for further research, such as expanding the dataset and exploring new model architectures to enhance detection accuracy and effectiveness.
UR - https://www.scopus.com/pages/publications/105018905881
U2 - 10.1007/978-3-031-97913-2_10
DO - 10.1007/978-3-031-97913-2_10
M3 - Article (Conference contribution)
AN - SCOPUS:105018905881
SN - 9783031979125
T3 - Communications in Computer and Information Science
SP - 120
EP - 132
BT - Artificial Intelligence, COMIA 2025 - 17th Mexican Congress, Proceedings
A2 - Martínez-Villaseñor, Lourdes
A2 - Martínez-Seis, Bella
A2 - Pichardo, Obdulia
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 12 May 2025 through 16 May 2025
ER -