
Depth-Aware Video Frame Interpolation

Supplementary Material

Wenbo Bao^1, Wei-Sheng Lai^3, Chao Ma^2, Xiaoyun Zhang^{1*}, Zhiyong Gao^1, Ming-Hsuan Yang^{3,4}
^1 Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University
^2 MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University
^3 University of California, Merced    ^4 Google
https://sites.google.com/view/wenbobao/dain

1. Overview
In this supplementary document, we present additional results to complement the paper. First, we provide the algorithmic
details and network configuration of the proposed model. Second, we conduct additional analysis on the adaptive warping
layer and depth-aware flow projection layer and evaluate the performance of the depth estimation network. Finally, we
present more experimental results on interpolating arbitrary intermediate frames, qualitative comparisons with the state-of-
the-art video frame interpolation methods on the Middlebury and Vimeo90K datasets, as well as discussions of the limitations
on the HD video dataset. More video results are provided on our project website.

2. Algorithm Details and Network Configurations


We provide the derivation of the back-propagation in the proposed depth-aware flow projection layer, the algorithmic
details of the adaptive warping layer, and the configuration of our kernel estimation and frame synthesis networks.
2.1. Back-Propagation in Depth-Aware Flow Projection
Given the input flows $F_{0\to 1}$ and $F_{1\to 0}$ and the depth maps $D_0$ and $D_1$, our depth-aware flow projection layer generates the intermediate flows $F_{t\to 0}$ and $F_{t\to 1}$. The proposed model jointly optimizes the flow estimation and depth estimation networks to achieve better performance for video frame interpolation. The gradient of $F_{t\to 0}$ with respect to the input optical flow $F_{0\to 1}$ is calculated by

$$\frac{\partial F_{t\to 0}(\mathbf{x})}{\partial F_{0\to 1}(\mathbf{y})} =
\begin{cases}
-t \cdot \dfrac{w_0(\mathbf{y})}{\sum_{\mathbf{y}' \in S(\mathbf{x})} w_0(\mathbf{y}')}, & \text{for } \mathbf{y} \in S(\mathbf{x}), \\[1mm]
0, & \text{for } \mathbf{y} \notin S(\mathbf{x}).
\end{cases}
\qquad (1)$$

The gradient of $F_{t\to 0}$ with respect to the depth $D_0$ is calculated by

$$\frac{\partial F_{t\to 0}(\mathbf{x})}{\partial D_0(\mathbf{y})} =
\frac{\partial F_{t\to 0}(\mathbf{x})}{\partial w_0(\mathbf{y})} \cdot \frac{\partial w_0(\mathbf{y})}{\partial D_0(\mathbf{y})},
\qquad (2)$$

where

$$\frac{\partial F_{t\to 0}(\mathbf{x})}{\partial w_0(\mathbf{y})} =
\begin{cases}
-t \cdot \dfrac{F_{0\to 1}(\mathbf{y}) \cdot \sum_{\mathbf{y}' \in S(\mathbf{x})} w_0(\mathbf{y}') - \sum_{\mathbf{y}' \in S(\mathbf{x})} w_0(\mathbf{y}')\, F_{0\to 1}(\mathbf{y}')}{\Big( \sum_{\mathbf{y}' \in S(\mathbf{x})} w_0(\mathbf{y}') \Big)^{2}}, & \text{for } \mathbf{y} \in S(\mathbf{x}), \\[1mm]
0, & \text{for } \mathbf{y} \notin S(\mathbf{x}),
\end{cases}
\qquad (3)$$

and

$$\frac{\partial w_0(\mathbf{y})}{\partial D_0(\mathbf{y})} = -D_0(\mathbf{y})^{-2}.
\qquad (4)$$

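For concreteness, the following is a minimal NumPy sketch of the forward soft-blending projection to which these gradients correspond: each source pixel y is shifted by t·F_{0→1}(y), and the target pixel it lands on accumulates the reversed flow with weight w_0(y) = 1/D_0(y). The function name `project_flow`, the rounding-based definition of S(x), and the loop-based formulation are our own simplifications for readability, not the paper's implementation.

```python
import numpy as np

def project_flow(flow_01, depth_0, t):
    """Depth-aware flow projection (forward-pass sketch).

    flow_01: (H, W, 2) optical flow from frame 0 to frame 1
    depth_0: (H, W)    depth map of frame 0
    t:       target time in (0, 1)
    Returns F_{t->0}: (H, W, 2). Each target pixel x averages the reversed
    flows of the source pixels y in S(x), weighted by w_0(y) = 1 / D_0(y),
    so that closer pixels dominate the blend.
    """
    H, W, _ = flow_01.shape
    num = np.zeros((H, W, 2))   # sum of w_0(y) * F_{0->1}(y) over y in S(x)
    den = np.zeros((H, W))      # sum of w_0(y) over y in S(x)
    for yy in range(H):
        for xx in range(W):
            u, v = flow_01[yy, xx]
            # source pixel y = (xx, yy) lands near x = y + t * F_{0->1}(y)
            tx, ty = int(round(xx + t * u)), int(round(yy + t * v))
            if 0 <= tx < W and 0 <= ty < H:
                w = 1.0 / max(depth_0[yy, xx], 1e-6)
                num[ty, tx] += w * flow_01[yy, xx]
                den[ty, tx] += w
    out = np.zeros_like(num)
    mask = den > 0
    out[mask] = -t * num[mask] / den[mask, None]
    return out
```

Because the forward pass is a differentiable weighted average, the gradients in Eqs. (1)-(4) follow directly from the quotient rule.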
2.2. Adaptive Warping Layer
The adaptive warping layer [2] warps images or features based on the estimated optical flow and local interpolation kernels.
Let $I(\mathbf{x}) : \mathbb{Z}^2 \to \mathbb{R}^3$ denote the RGB image, where $\mathbf{x} \in [1, H] \times [1, W]$, let $\mathbf{f}(\mathbf{x}) := (u(\mathbf{x}), v(\mathbf{x}))$ represent the optical flow field, and let $k^l(\mathbf{x}) = [k^l_{\mathbf{r}}(\mathbf{x})]_{H \times W}$, with $\mathbf{r} \in [-R+1, R]^2$, indicate the interpolation kernels, where $R = 2$ is the kernel size. The adaptive warping layer synthesizes an output image by

$$\hat{I}(\mathbf{x}) = \sum_{\mathbf{r} \in [-R+1, R]^2} k_{\mathbf{r}}(\mathbf{x})\, I\big(\mathbf{x} + \lfloor \mathbf{f}(\mathbf{x}) \rfloor + \mathbf{r}\big),
\qquad (5)$$

where the weight $k_{\mathbf{r}} = k^l_{\mathbf{r}} \, k^d_{\mathbf{r}}$ is determined by both the interpolation kernel $k^l_{\mathbf{r}}$ and the bilinear coefficient $k^d_{\mathbf{r}}$. The bilinear coefficient is defined by

$$k^d_{\mathbf{r}} =
\begin{cases}
[1 - \theta(u)]\,[1 - \theta(v)], & r_u \le 0,\ r_v \le 0, \\
\theta(u)\,[1 - \theta(v)], & r_u > 0,\ r_v \le 0, \\
[1 - \theta(u)]\,\theta(v), & r_u \le 0,\ r_v > 0, \\
\theta(u)\,\theta(v), & r_u > 0,\ r_v > 0,
\end{cases}
\qquad (6)$$
where $\theta(u) = u - \lfloor u \rfloor$ denotes the fractional part of a floating-point number, and the subscripts $u$ and $v$ of the 2-D vector $\mathbf{r}$ denote its horizontal and vertical components, respectively. The bilinear coefficient allows the layer to back-propagate gradients to the optical flow estimation network. The interpolation kernels $k^l_{\mathbf{r}}$ have the same spatial resolution as the input image with a channel size of $(2R)^2 = 16$, as listed in the last row of Table 1.
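To illustrate Eqs. (5)-(6), the snippet below is a per-pixel NumPy sketch of the adaptive warping at a single output location. The function name `adaptive_warp_pixel` and the assumed kernel layout `k_l` of shape (H, W, 2R, 2R) are ours; the actual layer operates on whole images and feature maps in a vectorized implementation rather than Python loops.

```python
import numpy as np

def adaptive_warp_pixel(I, flow, k_l, x, y, R=2):
    """Evaluate Eq. (5) at one output pixel (x, y).

    I:    (H, W, 3) input image
    flow: (H, W, 2) optical flow (u, v) per pixel
    k_l:  (H, W, 2*R, 2*R) learned interpolation kernel per pixel
          (assumed layout; offsets r range over [-R+1, R]^2, i.e. 4x4 for R=2)
    """
    H, W, _ = I.shape
    u, v = flow[y, x]
    iu, iv = int(np.floor(u)), int(np.floor(v))      # integer parts of the flow
    tu, tv = u - iu, v - iv                          # theta(u), theta(v) in Eq. (6)
    out = np.zeros(3)
    for j, rv in enumerate(range(-R + 1, R + 1)):        # vertical offset r_v
        for i, ru in enumerate(range(-R + 1, R + 1)):    # horizontal offset r_u
            # bilinear coefficient k^d_r from Eq. (6)
            kd = (tu if ru > 0 else 1.0 - tu) * (tv if rv > 0 else 1.0 - tv)
            sx = int(np.clip(x + iu + ru, 0, W - 1))     # clamp to image borders
            sy = int(np.clip(y + iv + rv, 0, H - 1))
            out += k_l[y, x, j, i] * kd * I[sy, sx]      # k_r = k^l_r * k^d_r
    return out
```

As noted above, the same operation is applied to feature maps as well as RGB frames.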
2.3. Network Architectures
Our kernel estimation network generates two separable 1D interpolation kernels for each pixel. We use a U-Net architecture and provide the configuration details in Table 1. In our frame synthesis network, we use 3 residual blocks to predict the residual between the blended warped frames and the ground-truth frame. Table 2 provides the configuration details of the frame synthesis network; a minimal sketch of this architecture is given after Table 2.
Table 1. Detailed configuration of the kernel estimation network.

Stage   | Input               | Output            | Kernel size | #in ch. | #out ch. | Stride | Activation | Output size
in      | ——                  | RGBs              | ——          | ——      | 6        | ——     | ——         | H × W
encoder | RGBs                | enc_conv1         | 3 × 3       | 6       | 16       | 1      | ReLU       | H × W
        | enc_conv1           | enc_conv2         | 3 × 3       | 16      | 32       | 1      | ReLU       | H × W
        | enc_conv2           | enc_pool1         | 2 × 2       | 32      | 32       | 2      | ——         | H/2 × W/2
        | enc_pool1           | enc_conv3         | 3 × 3       | 32      | 64       | 1      | ReLU       | H/2 × W/2
        | enc_conv3           | enc_pool2         | 2 × 2       | 64      | 64       | 2      | ——         | H/4 × W/4
        | enc_pool2           | enc_conv4         | 3 × 3       | 64      | 128      | 1      | ReLU       | H/4 × W/4
        | enc_conv4           | enc_pool3         | 2 × 2       | 128     | 128      | 2      | ——         | H/8 × W/8
        | enc_pool3           | enc_conv5         | 3 × 3       | 128     | 256      | 1      | ReLU       | H/8 × W/8
        | enc_conv5           | enc_pool4         | 2 × 2       | 256     | 256      | 2      | ——         | H/16 × W/16
        | enc_pool4           | enc_conv6         | 3 × 3       | 256     | 512      | 1      | ReLU       | H/16 × W/16
        | enc_conv6           | enc_pool5         | 2 × 2       | 512     | 512      | 2      | ——         | H/32 × W/32
decoder | enc_pool5           | dec_conv6         | 3 × 3       | 512     | 512      | 1      | ReLU       | H/32 × W/32
        | dec_conv6           | dec_up5           | 4 × 4       | 512     | 512      | 1/2    | ——         | H/16 × W/16
        | enc_conv6 + dec_up5 | dec_conv5         | 3 × 3       | 512     | 256      | 1      | ReLU       | H/16 × W/16
        | dec_conv5           | dec_up4           | 4 × 4       | 256     | 256      | 1/2    | ——         | H/8 × W/8
        | enc_conv5 + dec_up4 | dec_conv4         | 3 × 3       | 256     | 128      | 1      | ReLU       | H/8 × W/8
        | dec_conv4           | dec_up3           | 4 × 4       | 128     | 128      | 1/2    | ——         | H/4 × W/4
        | enc_conv4 + dec_up3 | dec_conv3         | 3 × 3       | 128     | 64       | 1      | ReLU       | H/4 × W/4
        | dec_conv3           | dec_up2           | 4 × 4       | 64      | 64       | 1/2    | ——         | H/2 × W/2
        | enc_conv3 + dec_up2 | dec_conv2         | 3 × 3       | 64      | 32       | 1      | ReLU       | H/2 × W/2
        | dec_conv2           | dec_up1           | 4 × 4       | 32      | 32       | 1/2    | ——         | H × W
        | enc_conv2 + dec_up1 | dec_conv1         | 3 × 3       | 32      | 16       | 1      | ReLU       | H × W
out     | dec_conv1           | out_conv1         | 3 × 3       | 16      | 16       | 1      | ReLU       | H × W
        | out_conv1           | kernel_horizontal | 3 × 3       | 16      | 16       | 1      | ——         | H × W
        | dec_conv1           | out_conv2         | 3 × 3       | 16      | 16       | 1      | ReLU       | H × W
        | out_conv2           | kernel_vertical   | 3 × 3       | 16      | 16       | 1      | ——         | H × W
Table 2. Detailed configuration of the frame synthesis network.

Stage     | Input                  | Output     | Kernel size | #in ch. | #out ch. | Stride | Activation | Output size
in        | ——                     | features   | ——          | ——      | 428      | ——     | ——         | H × W
          | features               | in_conv    | 7 × 7       | 428     | 128      | 1      | ReLU       | H × W
resblocks | in_conv                | res1_conv1 | 3 × 3       | 128     | 128      | 1      | ReLU       | H × W
          | res1_conv1             | res1_conv2 | 3 × 3       | 128     | 128      | 1      | ——         | H × W
          | in_conv + res1_conv2   | resblock1  | ——          | 128     | 128      | 1      | ReLU       | H × W
          | resblock1              | res2_conv1 | 3 × 3       | 128     | 128      | 1      | ReLU       | H × W
          | res2_conv1             | res2_conv2 | 3 × 3       | 128     | 128      | 1      | ——         | H × W
          | resblock1 + res2_conv2 | resblock2  | ——          | 128     | 128      | 1      | ReLU       | H × W
          | resblock2              | res3_conv1 | 3 × 3       | 128     | 128      | 1      | ReLU       | H × W
          | res3_conv1             | res3_conv2 | 3 × 3       | 128     | 128      | 1      | ——         | H × W
          | resblock2 + res3_conv2 | resblock3  | ——          | 128     | 128      | 1      | ReLU       | H × W
out       | resblock3              | out_conv   | 3 × 3       | 128     | 3        | 1      | ——         | H × W
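As referenced in Section 2.3, the following is a minimal PyTorch-style sketch of the frame synthesis network described by Table 2: a 7×7 input convolution on the 428-channel stack of warped frames and features, three residual blocks, and a 3×3 output convolution that predicts the RGB residual added to the blended warped frames. The class names `ResBlock` and `FrameSynthesis` are our own; this is an illustration of the configuration in Table 2, not the released code.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """One residual block of the frame synthesis network (see Table 2):
    3x3 conv + ReLU, 3x3 conv, identity skip, then ReLU on the sum."""
    def __init__(self, channels=128):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        return self.relu(x + out)   # e.g. in_conv + res1_conv2 -> resblock1

class FrameSynthesis(nn.Module):
    """Frame synthesis network sketch: 7x7 input conv on the 428-channel
    stack of warped frames/features, 3 residual blocks, and a 3x3 output
    conv predicting the RGB residual."""
    def __init__(self, in_channels=428):
        super().__init__()
        self.in_conv = nn.Conv2d(in_channels, 128, kernel_size=7, padding=3)
        self.blocks = nn.Sequential(ResBlock(), ResBlock(), ResBlock())
        self.out_conv = nn.Conv2d(128, 3, kernel_size=3, padding=1)

    def forward(self, features):
        x = torch.relu(self.in_conv(features))
        return self.out_conv(self.blocks(x))
```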

3. Additional Analysis
We conduct an additional evaluation to compare the proposed depth-aware flow projection layer and adaptive warping
layer with their alternatives. We also evaluate the accuracy of the depth estimation network.
3.1. Depth-aware flow projection layer
In our depth-aware flow projection layer, we use the inverse of the depth value as the weight for aggregating flow vectors, which amounts to a soft blending of flows. A straightforward baseline is to project only the flow with the smallest depth value, which we refer to as a hard selection scheme. We show a quantitative comparison between these two schemes in Table 3, where the soft blending scheme obtains better performance on all the datasets. As shown in Figure 1, the hard selection scheme produces a broken skateboard in both the black and blue close-ups. We note that soft blending can better account for the uncertainty of depth estimation. In addition, our flow projection layer allows the network to back-propagate gradients to the depth estimation module for fine-tuning, which leads to a performance gain, as shown in the main paper.

Figure 1. Effect of the depth-aware flow projection (panels: Hard selection, Soft blending (ours), Ground-truth).

Table 3. Analysis of the depth-aware flow projection. M.B. is short for the OTHER set of the Middlebury dataset. The proposed soft blending scheme shows a substantial improvement over the hard selection scheme, which uses the flow with the smallest depth.

Method               | UCF101 [10] PSNR | UCF101 [10] SSIM | Vimeo90K [11] PSNR | Vimeo90K [11] SSIM | M.B. [1] IE
Hard selection       | 34.89            | 0.9680           | 34.35              | 0.9740             | 2.11
Soft blending (ours) | 34.99            | 0.9683           | 34.71              | 0.9756             | 2.04
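For reference, the hard selection baseline in Table 3 amounts to a small change to the soft-blending sketch in Section 2.1: instead of accumulating inverse-depth-weighted flows, each target pixel keeps only the flow of the source pixel with the smallest depth. The sketch below reuses the hypothetical names from that earlier snippet.

```python
import numpy as np

def project_flow_hard(flow_01, depth_0, t):
    """Hard selection baseline: per target pixel, keep the flow whose source
    pixel has the smallest depth, instead of the proposed soft blending."""
    H, W, _ = flow_01.shape
    out = np.zeros((H, W, 2))
    best_depth = np.full((H, W), np.inf)
    for yy in range(H):
        for xx in range(W):
            u, v = flow_01[yy, xx]
            tx, ty = int(round(xx + t * u)), int(round(yy + t * v))
            if 0 <= tx < W and 0 <= ty < H and depth_0[yy, xx] < best_depth[ty, tx]:
                best_depth[ty, tx] = depth_0[yy, xx]   # closer source pixel wins
                out[ty, tx] = -t * flow_01[yy, xx]
    return out
```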
3.2. Adaptive warping layer
To understand the effect of the adaptive warping layer, we remove the kernel estimation network from the proposed model
and replace the adaptive warping layer with a bilinear warping layer. The results are presented in Table 4. The adaptive
warping layer consistently provides a significant performance improvement over the bilinear warping layer across the UCF101 [10],
Vimeo90K [11], and Middlebury [1] datasets. The results in Figure 2 demonstrate that the adaptive warping layer generates
clearer textures than bilinear warping.

Figure 2. Effect of the adaptive warping layer (panels: Bilinear warping, Adaptive warping (ours), Ground-truth).

Table 4. Comparison between bilinear and adaptive warping layers. M.B. is short for the OTHER set of the Middlebury dataset.
Method                  | UCF101 [10] PSNR | UCF101 [10] SSIM | Vimeo90K [11] PSNR | Vimeo90K [11] SSIM | M.B. [1] IE
Bilinear warping        | 34.73            | 0.9672           | 33.81              | 0.9680             | 2.58
Adaptive warping (ours) | 34.99            | 0.9683           | 34.71              | 0.9756             | 2.04
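For completeness, the bilinear warping baseline in Table 4 is standard backward warping, which can be written with PyTorch's `grid_sample`. The sketch below (the function name `bilinear_warp` is ours) is a minimal formulation of that baseline, not the exact code used in the experiments.

```python
import torch
import torch.nn.functional as F

def bilinear_warp(img, flow):
    """Backward-warp img with a per-pixel flow using bilinear sampling.

    img:  (N, C, H, W) source image or feature map
    flow: (N, 2, H, W) flow in pixels; channel 0 is horizontal (x), 1 is vertical (y)
    """
    n, _, h, w = img.shape
    # base sampling grid in pixel coordinates
    ys, xs = torch.meshgrid(torch.arange(h, dtype=img.dtype, device=img.device),
                            torch.arange(w, dtype=img.dtype, device=img.device),
                            indexing="ij")
    x_new = xs.unsqueeze(0) + flow[:, 0]          # (N, H, W)
    y_new = ys.unsqueeze(0) + flow[:, 1]
    # normalize to [-1, 1] as expected by grid_sample
    grid = torch.stack([2.0 * x_new / (w - 1) - 1.0,
                        2.0 * y_new / (h - 1) - 1.0], dim=-1)  # (N, H, W, 2)
    return F.grid_sample(img, grid, mode="bilinear", align_corners=True)
```

Unlike the adaptive warping layer, every output pixel here is a fixed bilinear combination of only four neighbors, which is consistent with the blurrier textures observed in Figure 2.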

4. Depth Estimation
Our model learns the relative depth order of objects instead of absolute depth values. Therefore, we use the SfM Disagreement Rate (SDR) [4] to measure how well the depth order is preserved. SDR= and SDR≠ denote the disagreement rates for pairs of pixels with similar and different depth orders, respectively. We compare the depth maps from our depth estimation network with those of MegaDepth [4] on the dataset of [4] and show the results in Table 5. Our method improves SDR≠ substantially, as the depth-aware flow projection layer is most effective on motion boundaries, where objects have different depth orders.

Table 5. Evaluation of depth estimation.

Method        | SDR= (%) | SDR≠ (%) | SDR (%)
MegaDepth [4] | 33.4     | 26.0     | 29.2
Ours          | 59.0     | 18.9     | 36.4
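SDR is defined in [4]; the snippet below is only a schematic of the pairwise ordinal check it is based on, under our simplifying assumptions (a relative tolerance `tau` decides when two depths count as equal, and ground-truth relations come from SfM point pairs). Refer to [4] for the exact evaluation protocol.

```python
import numpy as np

def ordinal_label(d_i, d_j, tau=0.1):
    """Return +1 if point i is farther, -1 if closer, 0 if roughly equal.
    tau is an assumed relative-depth tolerance; the exact threshold follows [4]."""
    if d_i / d_j > 1.0 + tau:
        return 1
    if d_j / d_i > 1.0 + tau:
        return -1
    return 0

def sdr(pred_depth, gt_depth, pairs, tau=0.1):
    """Schematic SfM disagreement rate over sampled pixel pairs.

    pairs: iterable of ((y1, x1), (y2, x2)) pixel pairs with SfM ground truth.
    Returns (SDR_eq, SDR_neq): disagreement rates over pairs whose
    ground-truth ordinal relation is 'equal' / 'not equal'.
    """
    stats = {0: [0, 0], 1: [0, 0]}   # key: gt relation differs? -> [disagreements, total]
    for (p, q) in pairs:
        gt = ordinal_label(gt_depth[p], gt_depth[q], tau)
        pr = ordinal_label(pred_depth[p], pred_depth[q], tau)
        key = int(gt != 0)
        stats[key][1] += 1
        stats[key][0] += int(pr != gt)
    sdr_eq = stats[0][0] / max(stats[0][1], 1)
    sdr_neq = stats[1][0] / max(stats[1][1], 1)
    return sdr_eq, sdr_neq
```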
5. Experimental Results
5.1. Arbitrary Frame Interpolation
In Figure 3, we demonstrate that our method can generate arbitrary intermediate frames to create 10× slow-motion videos. Adobe Acrobat Reader is recommended to play the embedded videos.

Figure 3. Videos with 10× slow motion of the inputs. Please view in Adobe Acrobat Reader to play the videos.
5.2. Qualitative Comparisons
We provide more visual comparisons with state-of-the-art methods on the Middlebury and Vimeo90K datasets.

5.2.1 Middlebury Dataset

Figure 4. Visual comparisons on the Middlebury [1] EVALUATION set (panels: Overlaid inputs, ToFlow [11], SepConv-L1 [7], EpicFlow [9], SuperSlomo [3], CtxSyn [6], MEMC-Net [2], DAIN (Ours)). Our method reconstructs a clearer roof and sharper edges than the state-of-the-art algorithms.
Figure 5. Visual comparisons on the Middlebury [1] EVALUATION set (panels: Overlaid inputs, ToFlow [11], SepConv-L1 [7], EpicFlow [9], SuperSlomo [3], CtxSyn [6], MEMC-Net [2], DAIN (Ours)). Our method reconstructs a straight and complete lamppost, while the state-of-the-art approaches cannot reconstruct the lamppost well. In addition, our model generates clearer texture behind the moving car.
Figure 6. Visual comparisons on the Middlebury [1] EVALUATION set (panels: Overlaid inputs, ToFlow [11], SepConv-L1 [7], EpicFlow [9], SuperSlomo [3], CtxSyn [6], MEMC-Net [2], DAIN (Ours)). The proposed method reconstructs the falling ball with a clear shape and generates fewer artifacts on the foot.
Figure 7. Visual comparisons on the Middlebury [1] EVALUATION set (panels: Overlaid inputs, ToFlow [11], SepConv-L1 [7], EpicFlow [9], SuperSlomo [3], CtxSyn [6], MEMC-Net [2], DAIN (Ours)). Our method preserves the fine textures of the basketball well and does not produce blockiness or ghosting artifacts.
Figure 8. Visual comparisons on the Middlebury [1] EVALUATION set (panels: Overlaid inputs, ToFlow [11], SepConv-L1 [7], EpicFlow [9], SuperSlomo [3], CtxSyn [6], MEMC-Net [2], DAIN (Ours)). Our method generates favorable results in the highly textured region.
Figure 9. Visual comparisons on the Middlebury [1] OTHER set (panels: MIND [5], ToFlow [11], EpicFlow [9], SPyNet [8], SepConv-L1 [7], MEMC-Net [2], DAIN (Ours), Ground-truth). Our method preserves the shapes of the balls well.
Figure 10. Visual comparisons on the Middlebury [1] OTHER set (panels: MIND [5], ToFlow [11], EpicFlow [9], SPyNet [8], SepConv-L1 [7], MEMC-Net [2], DAIN (Ours), Ground-truth). The fine structure around the shadow of the lid is reconstructed more consistently with the ground truth by our method than by the state-of-the-art approaches.
Figure 11. Visual comparisons on the Middlebury [1] OTHER set (panels: MIND [5], ToFlow [11], EpicFlow [9], SPyNet [8], SepConv-L1 [7], MEMC-Net [2], DAIN (Ours), Ground-truth). Our method reconstructs the fine details of the fur and preserves the shape of the shoe well.
5.2.2 Vimeo90K Dataset

Figure 12. Visual comparisons on the Vimeo90K [11] test set (panels: Overlaid inputs, MIND [5], ToFlow [11], SepConv-Lf [7], SepConv-L1 [7], MEMC-Net [2], DAIN (Ours), Ground-truth). Our method reconstructs the legs well.
Figure 13. Visual comparisons on the Vimeo90K [11] test set (panels: Overlaid inputs, MIND [5], ToFlow [11], SepConv-Lf [7], SepConv-L1 [7], MEMC-Net [2], DAIN (Ours), Ground-truth). Our method maintains the structures of both the gloved fingers and the steel bar of the device well.
5.3. HD video results
The HD video results are available on our project website. Although we show in the main paper that our DAIN model achieves better PSNR and SSIM values than the MEMC-Net [2] algorithm on the HD dataset, we observe noticeable artifacts in the Bluesky and Sunflower videos. Specifically, we find that these artifacts are introduced by the frame synthesis network. In Figure 14, we present the results for the 4th frame of the Bluesky video. The three images are the output of the adaptive warping layer, the output of the frame synthesis network, and the corresponding ground-truth frame, respectively. The artifacts in the flat sky area of the synthesized result suggest that a more robust synthesis network is needed to handle high-resolution images.

Figure 14. Limitations of the proposed method on the HD dataset (panels: Warped Results, Synthesized Results, Ground-Truth).
References
[1] S. Baker, D. Scharstein, J. Lewis, S. Roth, M. J. Black, and R. Szeliski. A database and evaluation methodology for optical flow. IJCV, 2011.
[2] W. Bao, W.-S. Lai, X. Zhang, Z. Gao, and M.-H. Yang. MEMC-Net: Motion estimation and motion compensation driven neural network for video interpolation and enhancement. arXiv, 2018.
[3] H. Jiang, D. Sun, V. Jampani, M.-H. Yang, E. Learned-Miller, and J. Kautz. Super SloMo: High quality estimation of multiple intermediate frames for video interpolation. In CVPR, 2018.
[4] Z. Li and N. Snavely. MegaDepth: Learning single-view depth prediction from internet photos. In CVPR, 2018.
[5] G. Long, L. Kneip, J. M. Alvarez, H. Li, X. Zhang, and Q. Yu. Learning image matching by simply watching video. In ECCV, 2016.
[6] S. Niklaus and F. Liu. Context-aware synthesis for video frame interpolation. In CVPR, 2018.
[7] S. Niklaus, L. Mai, and F. Liu. Video frame interpolation via adaptive separable convolution. In ICCV, 2017.
[8] A. Ranjan and M. J. Black. Optical flow estimation using a spatial pyramid network. In CVPR, 2017.
[9] J. Revaud, P. Weinzaepfel, Z. Harchaoui, and C. Schmid. EpicFlow: Edge-preserving interpolation of correspondences for optical flow. In CVPR, 2015.
[10] K. Soomro, A. R. Zamir, and M. Shah. UCF101: A dataset of 101 human actions classes from videos in the wild. CRCV-TR-12-01, 2012.
[11] T. Xue, B. Chen, J. Wu, D. Wei, and W. T. Freeman. Video enhancement with task-oriented flow. arXiv, 2017.
