
Continuous conditional video synthesis by neural processes

Xi Ye and Guillaume-Alexandre Bilodeau

Article (2025)

Open Access document in PolyPublie and at official publisher
Open Access to the full text of this document
Published Version
Terms of Use: Creative Commons Attribution Non-commercial

Abstract

Different conditional video synthesis tasks, such as frame interpolation and future frame prediction, are typically addressed individually by task-specific models, despite their shared underlying characteristics. Additionally, most conditional video synthesis models are limited to discrete frame generation at specific integer time steps. This paper presents a unified model that tackles both challenges simultaneously. We demonstrate that conditional video synthesis can be formulated as a neural process, where input spatio-temporal coordinates are mapped to target pixel values by conditioning on context spatio-temporal coordinates and pixel values. Our approach leverages a Transformer-based non-autoregressive conditional video synthesis model that takes the implicit neural representation of coordinates and context pixel features as input. Our task-specific models outperform previous methods for future frame prediction and frame interpolation across multiple datasets. Importantly, our model enables temporally continuous video synthesis at arbitrarily high frame rates, outperforming the previous state of the art.
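
The neural-process formulation in the abstract can be illustrated by how the context and target sets might be assembled. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the helper names `make_coords` and `split_context_target`, the (t, y, x) coordinate layout, and the normalization of spatial coordinates to [-1, 1] are all hypothetical, and the Transformer that maps these inputs to pixel values is omitted. It shows that interpolation (fractional times between observed frames) and future prediction (times beyond the last observed frame) differ only in which target times are queried.

```python
import numpy as np

def make_coords(times, height, width):
    """Return an (N, 3) array of (t, y, x) coordinates, one row per pixel.

    Hypothetical helper: spatial coordinates are normalized to [-1, 1],
    and times may be any real values, not only integers.
    """
    t = np.asarray(times, dtype=np.float64)
    y = np.linspace(-1.0, 1.0, height)
    x = np.linspace(-1.0, 1.0, width)
    tt, yy, xx = np.meshgrid(t, y, x, indexing="ij")
    return np.stack([tt, yy, xx], axis=-1).reshape(-1, 3)

def split_context_target(video, context_times, target_times):
    """Build context coordinates, context pixel values, and target coordinates.

    video: (T, H, W, C) array whose T frames were observed at context_times.
    A model (not shown) would predict pixel values at target_x given the
    context pairs (context_x, context_y).
    """
    num_frames, height, width, channels = video.shape
    assert num_frames == len(context_times)
    context_x = make_coords(context_times, height, width)
    context_y = video.reshape(-1, channels)  # same row order as context_x
    target_x = make_coords(target_times, height, width)
    return context_x, context_y, target_x

# Two observed frames at t=0 and t=1 (random data, for illustration only).
video = np.random.default_rng(0).random((2, 4, 4, 3))
# Query two in-between times (interpolation) and one later time (prediction).
ctx_x, ctx_y, tgt_x = split_context_target(video, [0.0, 1.0], [0.25, 0.5, 1.5])
print(ctx_x.shape, ctx_y.shape, tgt_x.shape)
```

Because the target times are plain real numbers in this sketch, requesting a higher frame rate only means passing a denser list of `target_times`.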

Department: Department of Computer Engineering and Software Engineering
Research Center: LITIV - Images and video processing laboratory
Funders: NSERC, FRQ-NT
Grant number: RGPIN-2020-04633
PolyPublie URL: https://publications.polymtl.ca/66039/
Journal Title: Computer Vision and Image Understanding (vol. 259)
Publisher: Elsevier BV
DOI: 10.1016/j.cviu.2025.104387
Official URL: https://doi.org/10.1016/j.cviu.2025.104387
Date Deposited: 10 Jun 2025 09:42
Last Modified: 11 Feb 2026 04:15
Cite in APA 7: Ye, X., & Bilodeau, G.-A. (2025). Continuous conditional video synthesis by neural processes. Computer Vision and Image Understanding, 259, 104387 (11 pages). https://doi.org/10.1016/j.cviu.2025.104387
