Abstract:
Emotion recognition through gait, which is more difficult to imitate than other biological characteristics, has recently attracted extensive attention. Although several deep-learning studies have addressed this task, two challenges remain. First, it is hard to extract representative gait features from video effectively. Second, the input body-joint sequences contain noise introduced during dataset collection and feature production. In this work, we propose a global link, which extends the existing skeleton graph (the natural link) to capture the overall state of the gait based on spatial-temporal convolution. In addition, we use soft thresholding to reduce noise, with the thresholds learned automatically by a block called the shrinkage block. Combining the global link and the shrinkage block, we further propose the global graph convolution shrinkage network (G-GCSN) to capture emotion-related features. We validate the effectiveness of the proposed method on a public dataset, the Emotion-Gait dataset. The proposed G-GCSN achieves improvements over state-of-the-art methods on this dataset.
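
To make the denoising step concrete, the following is a minimal sketch of soft thresholding in plain Python. It is an illustration under stated assumptions, not the G-GCSN implementation: the function names `soft_threshold` and `shrink_channel` are hypothetical, and the rule `tau = alpha * mean(|x|)` with a fixed `alpha` is a stand-in, borrowed from common shrinkage-network designs, for the threshold that the shrinkage block would learn.

```python
from typing import List


def soft_threshold(x: float, tau: float) -> float:
    """Shrink x toward zero by tau; any x with |x| <= tau maps to 0."""
    if tau < 0:
        raise ValueError("tau must be non-negative")
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def shrink_channel(values: List[float], alpha: float) -> List[float]:
    """Soft-threshold one feature channel.

    Assumption: tau = alpha * mean(|values|), where alpha in (0, 1) is a
    fixed stand-in for the scale a learned shrinkage block would output.
    """
    if not values:
        return []
    tau = alpha * sum(abs(v) for v in values) / len(values)
    return [soft_threshold(v, tau) for v in values]


# Small-magnitude entries (treated as noise) are zeroed; larger ones shrink.
features = [2.0, -2.0, 0.5, -0.5]
print(shrink_channel(features, alpha=0.5))  # -> [1.375, -1.375, 0.0, 0.0]
```

In this example the mean absolute value is 1.25, so the threshold is 0.625: the two entries of magnitude 0.5 are suppressed, while the two entries of magnitude 2.0 are kept but reduced by the threshold.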