Dense-Scale Feature Learning in Person Re-Identification
Li Wang (Inspur), Baoyu Fan (Inspur Electronic Information Industry Co., Ltd.)*, Zhenhua Guo (Inspur Electronic Information Industry Co., Ltd.), Yaqian Zhao (Inspur), Runze Zhang (Inspur Electronic Information Industry Co., Ltd.), Rengang Li (Inspur), Weifeng Gong (Inspur Electronic Information Industry Co., Ltd.)
Keywords: Applications of Computer Vision, Vision for X
Abstract:
For large-scale pedestrian re-identification (Re-ID), models must be capable of representing extremely complex and diverse multi-scale features. However, existing models learn only limited multi-scale features in a multi-branch manner, and directly increasing the number of scale branches to cover more scales confuses discrimination and degrades performance, because for a specific input image only a few scale features are critical. To achieve rich scale representation for person Re-ID and resolve the contradiction that excessive scales degrade performance, we propose a novel Dense-Scale Feature Learning Network (DSLNet) consisting of two core components: a Dense Connection Group (DCG), which provides abundant scale features, and a Channel-Wise Scale Selection (CSS) module, which dynamically selects the most discriminative scale features for each input image. DCG is composed of a densely connected convolutional stream; the receptive field gradually increases as features flow along the stream, and the dense shortcut connections provide far more fused multi-scale features than existing methods. CSS is a novel attention module, unlike any existing design, that computes attention along the branch direction. By enhancing or suppressing specific scale branches, truly channel-wise multi-scale selection is realized. To the best of our knowledge, DSLNet is the most lightweight model of its kind and achieves state-of-the-art performance among lightweight models on four commonly used Re-ID datasets, surpassing most large-scale models.
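The branch-direction attention idea behind CSS can be illustrated with a minimal numpy sketch. This is a hypothetical stand-in, not the paper's implementation: it assumes each of K scale branches has been pooled to a C-dimensional descriptor, uses the descriptors themselves as gating logits, and applies a softmax across the branch axis so that every channel independently weights (enhances or suppresses) the K scales before fusion.

```python
import numpy as np

def channel_wise_scale_selection(branch_feats):
    """Toy channel-wise selection over scale branches.

    branch_feats: array of shape (K, C) -- pooled descriptors from
    K scale branches, C channels each (shapes are an assumption).
    Returns (weights, fused): per-channel branch weights summing to 1
    along the branch axis, and the weighted fusion of the branches.
    """
    # Stand-in gating logits; a learned projection would go here.
    logits = branch_feats
    # Numerically stable softmax over the branch axis (axis 0):
    # each channel picks how much of each scale to keep.
    w = np.exp(logits - logits.max(axis=0, keepdims=True))
    w = w / w.sum(axis=0, keepdims=True)
    # Fuse branches with the per-channel weights -> (C,) descriptor.
    fused = (w * branch_feats).sum(axis=0)
    return w, fused

# Example: 3 scale branches, 2 channels.
feats = np.array([[1.0, 2.0],
                  [3.0, 0.5],
                  [0.0, 1.0]])
weights, fused = channel_wise_scale_selection(feats)
```

Because the softmax runs along the branch axis rather than the spatial or channel axis, each channel can favor a different scale, which is the "attention along the branch direction" the abstract describes.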