An Internal Learning Approach to Video Inpainting. Zhang H, Mai L, Xu N, et al. Keyword: [Deep Image Prior].

Motivation & Design. In this work, we approach video inpainting with an internal learning formulation. Video inpainting has also been used as a self-supervised task for deep feature learning, which has a different goal from ours. A novel deep learning architecture is proposed that contains two subnetworks: a temporal structure inference network and a spatial detail recovering network. A convolutional encoder–decoder network is developed.

Internal Learning. The perceptual loss is

$L_p(\hat{I}_i) = \sum_{k \in K} \| \psi_k(M_i) \odot (\phi_k(\hat{I}_i) - \phi_k(I_i)) \|_2^2$,

where $\phi_k$ denotes the features of layer $k$ and $K$ is the set of 3 layers {relu1_2, relu2_2, relu3_3} of a pre-trained VGG16.

The consistency loss is

$L_c(\hat{I}_j, \hat{F}_{i,j}) = \| (1 - M^f_{i,j}) \odot (\hat{I}_j(\hat{F}_{i,j}) - \hat{I}_i) \|_2^2$,

where $\hat{I}_j(\hat{F}_{i,j})$ is frame $\hat{I}_j$ warped with the flow $\hat{F}_{i,j}$ from frame $I_i$ to frame $I_j$, and $M^f_{i,j} = M_i \cap M_j(F_{i,j})$.

Arjovsky, S. Chintala, and L. Bottou (2017). Wasserstein GAN.
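The perceptual and consistency losses above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Frames and masks are single-channel `(H, W)` arrays. The warp is a nearest-neighbour backward warp that reads pixel `p + F(p)` from frame `j`, which is an assumed convention standing in for whatever differentiable sampling a real pipeline would use. The feature extractors $\phi_k$ and mask transforms $\psi_k$ are passed in as plain callables (`phis`, `psis`) instead of actual VGG16 layers. All function names are hypothetical.

```python
import numpy as np

def masked_sq_l2(mask, a, b):
    """Return || mask * (a - b) ||_2^2 for arrays of equal shape."""
    diff = mask * (a - b)
    return float(np.sum(diff ** 2))

def warp_nearest(img, flow):
    """Backward-warp a (H, W) array: out[y, x] = img[y + v, x + u].

    flow has shape (H, W, 2) with flow[..., 0] = u (x displacement) and
    flow[..., 1] = v (y displacement). Lookups are rounded to the nearest
    pixel and clamped to the image border (an assumption of this sketch).
    """
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return img[src_y, src_x]

def flow_mask(m_i, m_j, flow_ij):
    """M^f_{i,j} = M_i ∩ M_j(F_{i,j}) for binary {0, 1} masks."""
    return m_i * warp_nearest(m_j, flow_ij)

def consistency_loss(i_hat_i, i_hat_j, flow_hat_ij, m_f):
    """L_c = || (1 - M^f) * (I_hat_j warped by F_hat_ij - I_hat_i) ||_2^2."""
    warped = warp_nearest(i_hat_j, flow_hat_ij)
    return masked_sq_l2(1.0 - m_f, warped, i_hat_i)

def perceptual_loss(i_hat, i_true, mask, phis, psis):
    """L_p = sum_k || psi_k(M) * (phi_k(I_hat) - phi_k(I)) ||_2^2.

    phis and psis are lists of callables standing in for the feature
    layers and the matching mask transforms.
    """
    return sum(
        masked_sq_l2(psi(mask), phi(i_hat), phi(i_true))
        for phi, psi in zip(phis, psis)
    )
```

In a training setup the same structure would be written in an autodiff framework with real VGG16 features and bilinear sampling so that gradients can flow through the warp; the NumPy version only shows how the masks, the warp, and the squared norms fit together.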