Multi-Task Learning for Simultaneous Video Generation and Remote Photoplethysmography Estimation
Yun-Yun Tsou (National Tsing Hua University)*, Yi-An Lee (National Tsing Hua University), Chiou-Ting Hsu (National Tsing Hua University)
Keywords: Face, Pose, Action, and Gesture
Abstract:
Remote photoplethysmography (rPPG) is a contactless method for estimating physiological signals from facial videos. Without large supervised datasets, learning a robust rPPG estimation model is extremely challenging. Instead of merely focusing on model learning, we believe data augmentation may be of greater importance for this task. In this paper, we propose a novel multi-task learning framework that augments the training data while simultaneously learning the rPPG estimation model. We design three joint-learning networks: an rPPG estimation network, an Image-to-Video network, and a Video-to-Video network, which respectively estimate rPPG signals from face videos, generate synthetic videos from a source image and a specified rPPG signal, and generate synthetic videos from a source video and a specified rPPG signal. Experimental results on three benchmark datasets, COHFACE, UBFC, and PURE, show that our method successfully generates photo-realistic videos and outperforms existing methods by a large margin.
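To make the task concrete, the following is a minimal, classical baseline for the rPPG estimation step described above: spatially averaging the green channel of each frame and reading off the dominant frequency in the heart-rate band. This is a hedged illustration of what "estimating rPPG signals from face videos" means, not the paper's learned multi-task model; the function name and synthetic clip are purely for demonstration.

```python
import numpy as np

def estimate_rppg_hr(frames, fps):
    """Estimate heart rate (BPM) from a face-video clip.

    A simple green-channel baseline (NOT the learned model in the paper):
    spatially average the green channel per frame, remove the mean, then
    take the dominant frequency within a plausible heart-rate band.

    frames: array of shape (T, H, W, 3), RGB.
    fps: frame rate of the clip.
    """
    # Per-frame spatial mean of the green channel -> 1-D temporal signal.
    signal = frames[..., 1].reshape(len(frames), -1).mean(axis=1)
    signal = signal - signal.mean()
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    power = np.abs(np.fft.rfft(signal)) ** 2
    # Restrict to ~42-240 BPM, a typical physiological range.
    band = (freqs >= 0.7) & (freqs <= 4.0)
    peak = freqs[band][np.argmax(power[band])]
    return 60.0 * peak

# Synthetic 10 s clip at 30 fps with a 1.2 Hz (72 BPM) pulse in green.
fps, T = 30, 300
t = np.arange(T) / fps
pulse = 0.5 * np.sin(2 * np.pi * 1.2 * t)
frames = np.full((T, 8, 8, 3), 128.0)
frames[..., 1] += pulse[:, None, None]
print(estimate_rppg_hr(frames, fps))  # close to 72 BPM
```

The paper's contribution is precisely to replace such hand-crafted pipelines with a learned estimator, and to use the two generation networks to synthesize additional training videos carrying specified rPPG signals.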
Similar Papers
Condensed Movies: Story Based Retrieval with Contextual Embeddings
Max Bain (University of Oxford)*, Arsha Nagrani (University of Oxford), Andrew Brown (University of Oxford), Andrew Zisserman (University of Oxford)

Play Fair: Frame Contributions in Video Models
Will Price (University of Bristol)*, Dima Damen (University of Bristol)

RealSmileNet: A Deep End-To-End Network for Spontaneous and Posed Smile Recognition
Yan Yang (Australian National University)*, Md Zakir Hossain (The Australian National University), Tom Gedeon (The Australian National University), Shafin Rahman (North South University)
