Multi-Task Learning for Simultaneous Video Generation and Remote Photoplethysmography Estimation
Yun-Yun Tsou (National Tsing Hua University)*, Yi-An Lee (National Tsing Hua University), Chiou-Ting Hsu (National Tsing Hua University)
Keywords: Face, Pose, Action, and Gesture
Abstract:
Remote photoplethysmography (rPPG) is a contactless method for estimating physiological signals from facial videos. Without large supervised datasets, learning a robust rPPG estimation model is extremely challenging. Instead of merely focusing on model learning, we believe data augmentation may be of greater importance for this task. In this paper, we propose a novel multi-task learning framework that augments the training data while simultaneously learning the rPPG estimation model. We design three joint-learning networks: an rPPG estimation network, an Image-to-Video network, and a Video-to-Video network, which respectively estimate rPPG signals from face videos, generate synthetic videos from a source image and a specified rPPG signal, and generate synthetic videos from a source video and a specified rPPG signal. Experimental results on three benchmark datasets, COHFACE, UBFC, and PURE, show that our method successfully generates photo-realistic videos and outperforms existing methods by a large margin.
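To make the task concrete, the following is a minimal, classical baseline for the rPPG estimation step described above: spatially averaging the green channel of each frame and reading off the dominant frequency in the heart-rate band. This is a hedged illustration of what "estimating rPPG signals from face videos" means, not the paper's learned multi-task model; the function name and synthetic clip are purely for demonstration.

```python
import numpy as np

def estimate_rppg_hr(frames, fps):
    """Estimate heart rate (BPM) from a face-video clip.

    A simple green-channel baseline (NOT the learned model in the paper):
    spatially average the green channel per frame, remove the mean, then
    take the dominant frequency within a plausible heart-rate band.

    frames: array of shape (T, H, W, 3), RGB.
    fps: frame rate of the clip.
    """
    # Per-frame spatial mean of the green channel -> 1-D temporal signal.
    signal = frames[..., 1].reshape(len(frames), -1).mean(axis=1)
    signal = signal - signal.mean()
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    power = np.abs(np.fft.rfft(signal)) ** 2
    # Restrict to ~42-240 BPM, a typical physiological range.
    band = (freqs >= 0.7) & (freqs <= 4.0)
    peak = freqs[band][np.argmax(power[band])]
    return 60.0 * peak

# Synthetic 10 s clip at 30 fps with a 1.2 Hz (72 BPM) pulse in green.
fps, T = 30, 300
t = np.arange(T) / fps
pulse = 0.5 * np.sin(2 * np.pi * 1.2 * t)
frames = np.full((T, 8, 8, 3), 128.0)
frames[..., 1] += pulse[:, None, None]
print(estimate_rppg_hr(frames, fps))  # close to 72 BPM
```

The paper's contribution is precisely to replace such hand-crafted pipelines with a learned estimator, and to use the two generation networks to synthesize additional training videos carrying specified rPPG signals.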
Similar Papers
Condensed Movies: Story Based Retrieval with Contextual Embeddings
Max Bain (University of Oxford)*, Arsha Nagrani (University of Oxford), Andrew Brown (University of Oxford), Andrew Zisserman (University of Oxford)

Play Fair: Frame Contributions in Video Models
Will Price (University of Bristol)*, Dima Damen (University of Bristol)

RealSmileNet: A Deep End-To-End Network for Spontaneous and Posed Smile Recognition
Yan Yang (Australian National University)*, Md Zakir Hossain (The Australian National University), Tom Gedeon (The Australian National University), Shafin Rahman (North South University)
