Facial Expressions Recognition in Sign Language Based on a Two-Stream Swin Transformer Model Integrating RGB and Texture Map Images

Research output: Contribution to journal › Article › peer-review

Abstract

The study of facial expressions in sign language has become a significant research area, as these expressions not only convey personal states but also enhance the meaning of signs within specific contexts. The absence of facial expressions during communication can lead to misinterpretation, underscoring the need for datasets that include facial expressions in sign language. To address this, we present the Facial-BSL dataset, which consists of videos capturing eight distinct facial expressions used in Brazilian Sign Language. Additionally, we propose a two-stream model designed to classify facial expressions in a sign language context. This model uses RGB images to capture local facial information and texture map images to record facial movements. We assessed the performance of several deep learning architectures within this two-stream framework, including Convolutional Neural Networks (CNNs) and Vision Transformers. In addition, experiments were conducted on public datasets such as CK+, KDEF-dyn, and LIBRAS. The two-stream architecture based on the Swin Transformer model achieved the best performance on the KDEF-dyn and LIBRAS datasets and ranked second on the CK+ dataset, with an accuracy of 97% and an F1-score of 95%.
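The two-stream design described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the backbone here is a tiny stand-in CNN (the actual model uses Swin Transformer backbones), and the late-fusion-by-concatenation head is an assumption, since the abstract does not specify how the streams are combined.

```python
# Hypothetical sketch of a two-stream expression classifier in PyTorch.
# Assumptions: tiny CNN stand-in backbones (the paper uses Swin Transformers)
# and late fusion by feature concatenation (fusion strategy not given here).
import torch
import torch.nn as nn


class TinyStream(nn.Module):
    """Stand-in feature extractor for one input stream."""

    def __init__(self, out_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, out_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class TwoStreamClassifier(nn.Module):
    """RGB stream (local facial appearance) + texture-map stream (facial
    movement), fused and mapped to the eight expression classes."""

    def __init__(self, num_classes: int = 8, feat_dim: int = 64):
        super().__init__()
        self.rgb_stream = TinyStream(feat_dim)
        self.texture_stream = TinyStream(feat_dim)
        self.head = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, rgb: torch.Tensor, texture: torch.Tensor) -> torch.Tensor:
        feats = torch.cat(
            [self.rgb_stream(rgb), self.texture_stream(texture)], dim=1
        )
        return self.head(feats)


model = TwoStreamClassifier(num_classes=8)
rgb = torch.randn(2, 3, 64, 64)      # batch of RGB face crops
texture = torch.randn(2, 3, 64, 64)  # batch of texture-map images
logits = model(rgb, texture)
print(tuple(logits.shape))  # (2, 8): one score per expression class
```

Swapping `TinyStream` for a pretrained Swin backbone (e.g. via `torchvision` or `timm`) would bring the sketch closer to the architecture the paper evaluates.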

Original language: English
Pages (from-to): 773-792
Number of pages: 20
Journal: Computación y Sistemas
Volume: 29
Issue number: 2
DOIs
State: Published - 2025
