Real-Time Neural Style Transfer for Videos

Title: Real-Time Neural Style Transfer for Videos
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Huang, H., Wang, H., Luo, W., Ma, L., Jiang, W., Zhu, X., Li, Z., Liu, W.
Conference Name: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Date Published: July 2017
ISBN Number: 978-1-5386-0457-1
Keywords: content information, fast style transfer, feed-forward convolutional neural networks, feed-forward network, feedforward neural nets, hybrid loss, image colour analysis, image sequences, image style, image texture, input frames, learning (artificial intelligence), Metrics, neural nets, neural style transfer, Optical imaging, optical losses, Optimization, pubcrawl, real-time neural style transfer, Real-time Systems, resilience, Resiliency, Scalability, style information, stylized video frames, temporal consistency, temporal information, temporal loss, temporally consistent stylized videos, trained network, Training, training stage, two-frame synergic training mechanism, video signal processing, video style transfer method, Videos

Recent research endeavors have shown the potential of using feed-forward convolutional neural networks to accomplish fast style transfer for images. In this work, we take one step further to explore the possibility of exploiting a feed-forward network to perform style transfer for videos and simultaneously maintain temporal consistency among stylized video frames. Our feed-forward network is trained by enforcing the outputs of consecutive frames to be both well stylized and temporally consistent. More specifically, a hybrid loss is proposed to capitalize on the content information of input frames, the style information of a given style image, and the temporal information of consecutive frames. To calculate the temporal loss during the training stage, a novel two-frame synergic training mechanism is proposed. Compared with directly applying an existing image style transfer method to videos, our proposed method employs the trained network to yield temporally consistent stylized videos which are much more visually pleasant. In contrast to the prior video style transfer method which relies on time-consuming optimization on the fly, our method runs in real time while generating competitive visual results.
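The hybrid loss described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the Gram-matrix style term (a common choice in the style-transfer literature), the masked squared-difference temporal term, the idea that the previous stylized frame has already been warped into the current frame, and the weights `alpha`, `beta`, and `lam`. All function names are hypothetical. Features are plain nested lists (rows are channels, columns are flattened spatial positions) so the example runs without external libraries.

```python
from typing import List, Sequence

Matrix = List[List[float]]  # rows = channels, columns = flattened positions


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def flatten(m: Matrix) -> List[float]:
    """Concatenate the rows of a matrix into one flat list."""
    return [v for row in m for v in row]


def gram(features: Matrix) -> Matrix:
    """Channel-by-channel Gram matrix, normalized by the number of entries."""
    n = len(features) * len(features[0])
    return [[sum(p * q for p, q in zip(r1, r2)) / n for r2 in features]
            for r1 in features]


def content_loss(out_feat: Matrix, in_feat: Matrix) -> float:
    """Assumed content term: feature MSE between stylized and input frame."""
    return mse(flatten(out_feat), flatten(in_feat))


def style_loss(out_feat: Matrix, style_feat: Matrix) -> float:
    """Assumed style term: MSE between Gram matrices."""
    return mse(flatten(gram(out_feat)), flatten(gram(style_feat)))


def temporal_loss(curr_out: Sequence[float],
                  warped_prev_out: Sequence[float],
                  mask: Sequence[float]) -> float:
    """Assumed temporal term: masked mean squared difference between the
    current stylized frame and the previous stylized frame, where the
    previous frame is assumed to be already warped into the current one.
    mask is 1.0 where the comparison is trusted and 0.0 elsewhere."""
    if not (len(curr_out) == len(warped_prev_out) == len(mask)):
        raise ValueError("inputs must have the same length")
    total = sum(m * (c - p) ** 2
                for c, p, m in zip(curr_out, warped_prev_out, mask))
    return total / len(curr_out)


def hybrid_loss(out_feats: Sequence[Matrix], in_feats: Sequence[Matrix],
                style_feat: Matrix, curr_out: Sequence[float],
                warped_prev_out: Sequence[float], mask: Sequence[float],
                alpha: float = 1.0, beta: float = 10.0,
                lam: float = 100.0) -> float:
    """Content and style terms summed over both consecutive frames, plus a
    weighted temporal term. The weights are placeholder values."""
    spatial = sum(alpha * content_loss(o, i) + beta * style_loss(o, style_feat)
                  for o, i in zip(out_feats, in_feats))
    return spatial + lam * temporal_loss(curr_out, warped_prev_out, mask)
```

In this sketch, two consecutive frames pass through the same network during training, and each contributes its own content and style terms, while the temporal term couples the two outputs. At inference time only the feed-forward pass would be needed, which is consistent with the real-time claim in the abstract.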

Citation Key: huang_real-time_2017