TY - JOUR
T1 - Multimodal hand gesture recognition combining temporal and pose information based on CNN descriptors and histogram of cumulative magnitudes
AU - Escobedo Cardenas, Edwin Jonathan
AU - Chavez, Guillermo Camara
N1 - Funding Information:
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001, the Postgraduate Program in Computer Science (PPGCC) at the Federal University of Ouro Preto (UFOP), the FAPEMIG (Fundação de Amparo à Pesquisa do Estado de Minas Gerais, research project APQ 01517-17) and the Brazilian funding agency CNPq.
Publisher Copyright:
© 2020 Elsevier Inc.
PY - 2020/8
Y1 - 2020/8
N2 - In this paper, we present a new approach for dynamic hand gesture recognition. Our goal is to integrate spatiotemporal features extracted from multimodal data captured by the Kinect sensor. In case the skeleton data is not provided, we apply a novel skeleton estimation method to compute temporal features. Furthermore, we introduce an effective method to extract a fixed number of keyframes to reduce the processing time. To extract pose features from RGB-D data, we take advantage of two different approaches: (1) Convolutional Neural Networks and (2) Histogram of Cumulative Magnitudes. We test different integration methods to fuse the extracted spatiotemporal features to boost recognition performance in a linear SVM classifier. Extensive experiments prove the effectiveness and feasibility of the proposed framework for hand gesture recognition.
AB - In this paper, we present a new approach for dynamic hand gesture recognition. Our goal is to integrate spatiotemporal features extracted from multimodal data captured by the Kinect sensor. In case the skeleton data is not provided, we apply a novel skeleton estimation method to compute temporal features. Furthermore, we introduce an effective method to extract a fixed number of keyframes to reduce the processing time. To extract pose features from RGB-D data, we take advantage of two different approaches: (1) Convolutional Neural Networks and (2) Histogram of Cumulative Magnitudes. We test different integration methods to fuse the extracted spatiotemporal features to boost recognition performance in a linear SVM classifier. Extensive experiments prove the effectiveness and feasibility of the proposed framework for hand gesture recognition.
KW - Convolutional neural networks
KW - Fusion schemes
KW - Hand gesture recognition
KW - Histogram of cumulative magnitudes
KW - Keyframe extraction
KW - Pose and motion information
KW - Spherical coordinates
UR - http://www.scopus.com/inward/record.url?scp=85086633883&partnerID=8YFLogxK
U2 - 10.1016/j.jvcir.2020.102772
DO - 10.1016/j.jvcir.2020.102772
M3 - Article (Contribution to Journal)
AN - SCOPUS:85086633883
SN - 1047-3203
VL - 71
JO - Journal of Visual Communication and Image Representation
JF - Journal of Visual Communication and Image Representation
M1 - 102772
ER -