add edvr doc (PaddlePaddle#282)

live0717 · Apr 23, 2021 · b2ede95
1 parent a693a7d
commit b2ede95

Showing 6 changed files with 177 additions and 7 deletions.
diff --git a/README.md b/README.md
@@ -44,7 +44,8 @@ GAN-Generative Adversarial Network, was praised by "the Father of Convolutional
 * [U-GAT-IT](./docs/en_US/tutorials/ugatit.md)
 * [Photo2Cartoon](./docs/en_US/tutorials/photo2cartoon.md)
 * [Wav2Lip](./docs/en_US/tutorials/wav2lip.md)
-* [Super_Resolution](./docs/en_US/tutorials/super_resolution.md)
+* [Single Image Super Resolution(SISR)](./docs/en_US/tutorials/single_image_super_resolution.md)
+* [Video Super Resolution(VSR)](./docs/en_US/tutorials/video_super_resolution.md)
 * [StyleGAN2](./docs/en_US/tutorials/styleganv2.md)
 * [Pixel2Style2Pixel](./docs/en_US/tutorials/pixel2style2pixel.md)
 

diff --git a/README_cn.md b/README_cn.md
@@ -77,7 +77,8 @@ GAN--生成对抗网络，被“卷积网络之父”**Yann LeCun（杨立昆）
 * [U-GAT-IT](./docs/zh_CN/tutorials/ugatit.md)
 * [Photo2Cartoon](docs/zh_CN/tutorials/photo2cartoon.md)
 * [Wav2Lip](docs/zh_CN/tutorials/wav2lip.md)
-* [Super_Resolution](./docs/zh_CN/tutorials/super_resolution.md)
+* [Single Image Super Resolution(SISR)](./docs/zh_CN/tutorials/single_image_super_resolution.md)
+* [Video Super Resolution(VSR)](./docs/zh_CN/tutorials/video_super_resolution.md)
 * [StyleGAN2](./docs/zh_CN/tutorials/styleganv2.md)
 * [Pixel2Style2Pixel](./docs/zh_CN/tutorials/pixel2style2pixel.md)
 

diff --git a/docs/en_US/tutorials/super_resolution.md → ...utorials/single_image_super_resolution.md b/docs/en_US/tutorials/super_resolution.md → ...utorials/single_image_super_resolution.md
@@ -1,4 +1,4 @@
-# 1 Super Resolution
+# 1 Single Image Super Resolution(SISR)
 
 ## 1.1 Principle
 
@@ -133,8 +133,8 @@ The metrics are PSNR / SSIM.
 <!-- ![](../../imgs/horse2zebra.png) -->
 
 
-## 1.4 模型下载
-| 模型 | 数据集 | 下载地址 |
+## 1.4 Model Download
+| Method | Dataset | Download Link |
 |---|---|---|
 | realsr_df2k  | df2k | [realsr_df2k](https://paddlegan.bj.bcebos.com/models/realsr_df2k.pdparams)
 | realsr_dped  | dped | [realsr_dped](https://paddlegan.bj.bcebos.com/models/realsr_dped.pdparams)

diff --git a/docs/en_US/tutorials/video_super_resolution.md b/docs/en_US/tutorials/video_super_resolution.md
@@ -0,0 +1,84 @@
+
+# 1 Video Super Resolution (VSR)
+
+## 1.1 Principle
+
+  Video super-resolution grew out of image super-resolution, which aims to recover high-resolution (HR) images from one or more low-resolution (LR) images. The difference is that a video consists of multiple frames, so video super-resolution usually exploits inter-frame information for restoration. Here we provide the video super-resolution model [EDVR](https://arxiv.org/pdf/1905.02716.pdf).
+
+  [EDVR](https://arxiv.org/pdf/1905.02716.pdf) won all four tracks of the NTIRE19 video restoration and enhancement challenges, outperforming the second place by a large margin. Video super-resolution poses two main difficulties: (1) how to align multiple frames given large motions, and (2) how to effectively fuse different frames with diverse motion and blur. First, to handle large motions, EDVR devises a Pyramid, Cascading and Deformable (PCD) alignment module, in which frame alignment is done at the feature level using deformable convolutions in a coarse-to-fine manner. Second, EDVR proposes a Temporal and Spatial Attention (TSA) fusion module, in which attention is applied both temporally and spatially to emphasize the features that matter for subsequent restoration.
+
+
+
+## 1.2 How to use  
+
+### 1.2.1 Prepare Datasets
+
+  REDS ([download](https://seungjunnah.github.io/Datasets/reds.html)) is a high-quality (720p) video dataset proposed for the NTIRE19 competition. It consists of 240 training clips, 30 validation clips and 30 testing clips (each with 100 consecutive frames). Since the test ground truth is not available, we select four representative clips ('000', '011', '015' and '020', with diverse scenes and motions) as our test set, denoted REDS4. The remaining training and validation clips are regrouped as our training dataset (266 clips in total).
+
+  The structure of the processed REDS is as follows:
+  ```
+    PaddleGAN
+      ├── data
+          ├── REDS
+                ├── train_sharp
+                |    └──X4
+                ├── train_sharp_bicubic
+                |    └──X4
+                ├── REDS4_test_sharp
+                |    └──X4
+                └── REDS4_test_sharp_bicubic
+                     └──X4
+              ...
+  ```
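
The regrouping described above can be sketched in Python. This is a minimal sketch, not PaddleGAN's preprocessing script: the function name `regroup_reds` is hypothetical, and renumbering the 30 validation clips to follow the training clips (240 to 269) is an assumption made so that both sets can share one folder.

```python
# Clip ids held out as the REDS4 test set, as listed in the text above.
REDS4 = ("000", "011", "015", "020")


def regroup_reds(num_train=240, num_val=30, test_clips=REDS4):
    """Return (train_ids, test_ids) for the regrouped REDS split.

    Assumption: validation clips are renumbered to follow the training
    clips (240..269); the original tutorial does not state the naming.
    """
    train_ids = [f"{i:03d}" for i in range(num_train)]
    val_ids = [f"{num_train + i:03d}" for i in range(num_val)]
    held_out = set(test_clips)
    # Merge training and validation clips, dropping the four test clips.
    merged = [clip for clip in train_ids + val_ids if clip not in held_out]
    return merged, list(test_clips)


if __name__ == "__main__":
    train, test = regroup_reds()
    print(len(train), "training clips;", len(test), "test clips")
```

The count works out as 240 + 30 - 4 = 266 training clips, which matches the total given above.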
+
+### 1.2.2 Train/Test
+
+  The commands to train and test the EDVR model on the processed REDS dataset are as follows:
+
+  Train a model:
+  ```
+     python -u tools/main.py --config-file configs/edvr.yaml
+  ```
+
+  Test the model:
+  ```
+     python tools/main.py --config-file configs/edvr.yaml --evaluate-only --load ${PATH_OF_WEIGHT}
+  ```
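
If you prefer to launch these runs from a script, the two commands can be assembled programmatically. The sketch below only uses the flags shown above (`--config-file`, `--evaluate-only`, `--load`); the helper names `edvr_command` and `run` are hypothetical, and the script is assumed to be started from the PaddleGAN repository root.

```python
import subprocess


def edvr_command(config="configs/edvr.yaml", weights=None):
    """Build the argv list: training by default, evaluation when weights is given."""
    # "-u" only disables Python's output buffering; it does not affect the run.
    cmd = ["python", "-u", "tools/main.py", "--config-file", config]
    if weights is not None:
        cmd += ["--evaluate-only", "--load", weights]
    return cmd


def run(cmd):
    """Run the command and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is side-effect free.
    print(" ".join(edvr_command()))
    print(" ".join(edvr_command(weights="EDVR_M_wo_tsa_SRx4.pdparams")))
```

Passing the list to `run` starts the actual job; building the list separately keeps the command easy to inspect or log first.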
+
+## 1.3 Results
+The experimental results are evaluated on the RGB channels.
+
+The metrics are PSNR / SSIM.
+
+| Method | REDS4 | 
+|---|---|
+| EDVR_M_wo_tsa_SRx4  | 30.4429 / 0.8684 |
+| EDVR_M_w_tsa_SRx4  | 30.5169 / 0.8699 |
+| EDVR_L_wo_tsa_SRx4  | 30.8649 / 0.8761 |
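
For reference, PSNR is defined as 10 * log10(MAX^2 / MSE). The sketch below computes it in plain Python over flattened RGB pixel values in the 0 to 255 range. It illustrates the metric only: the function name `psnr` is hypothetical, and details of PaddleGAN's own evaluation (such as border cropping or per-frame averaging) are not reproduced here, so it is not guaranteed to reproduce the numbers in the table.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """PSNR in dB between two equally sized sequences of pixel values.

    Both inputs are flat sequences covering all three RGB channels.
    """
    if len(reference) != len(restored):
        raise ValueError("inputs must have the same number of values")
    # Mean squared error over every value in every channel.
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [52, 60, 61, 200, 198, 190]
    output = [50, 61, 63, 198, 199, 188]
    print(f"PSNR: {psnr(ground_truth, output):.2f} dB")
```

Higher values mean the restored frame is closer to the ground truth; identical inputs give infinity.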
+
+
+## 1.4 Model Download
+| Method | Dataset | Download Link |
+|---|---|---|
+| EDVR_M_wo_tsa_SRx4  | REDS | [EDVR_M_wo_tsa_SRx4](https://paddlegan.bj.bcebos.com/models/EDVR_M_wo_tsa_SRx4.pdparams) |
+| EDVR_M_w_tsa_SRx4  | REDS | [EDVR_M_w_tsa_SRx4](https://paddlegan.bj.bcebos.com/models/EDVR_M_w_tsa_SRx4.pdparams) |
+| EDVR_L_wo_tsa_SRx4  | REDS | [EDVR_L_wo_tsa_SRx4](https://paddlegan.bj.bcebos.com/models/EDVR_L_wo_tsa_SRx4.pdparams) |
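
The links in the table follow one URL pattern, so the weights can also be fetched from a script. This is a minimal sketch using only the Python standard library; the helper names `weight_url` and `download_weight` are hypothetical, and the default destination directory is an assumption.

```python
import os
import urllib.request

# Base URL and model names are taken from the table above.
BASE_URL = "https://paddlegan.bj.bcebos.com/models"
EDVR_MODELS = ("EDVR_M_wo_tsa_SRx4", "EDVR_M_w_tsa_SRx4", "EDVR_L_wo_tsa_SRx4")


def weight_url(name):
    """Return the download URL for one of the EDVR models listed above."""
    if name not in EDVR_MODELS:
        raise ValueError(f"unknown model: {name}")
    return f"{BASE_URL}/{name}.pdparams"


def download_weight(name, dest_dir="."):
    """Download the weight file unless it already exists; return its local path."""
    os.makedirs(dest_dir, exist_ok=True)
    path = os.path.join(dest_dir, f"{name}.pdparams")
    if not os.path.exists(path):
        urllib.request.urlretrieve(weight_url(name), path)
    return path


if __name__ == "__main__":
    # Only print the URLs here; call download_weight(...) to fetch a file.
    for model in EDVR_MODELS:
        print(weight_url(model))
```

The path returned by `download_weight` can be passed as `${PATH_OF_WEIGHT}` to the `--load` option of the test command.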
+
+
+
+
+
+# References
+
+- 1. [EDVR: Video Restoration with Enhanced Deformable Convolutional Networks](https://arxiv.org/pdf/1905.02716.pdf)
+
+  ```
+  @InProceedings{wang2019edvr,
+    author = {Wang, Xintao and Chan, Kelvin C.K. and Yu, Ke and Dong, Chao and Loy, Chen Change},
+    title = {EDVR: Video Restoration with Enhanced Deformable Convolutional Networks},
+    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
+    month = {June},
+    year = {2019}
+    }
+  ```
+
diff --git a/docs/zh_CN/tutorials/super_resolution.md → ...utorials/single_image_super_resolution.md b/docs/zh_CN/tutorials/super_resolution.md → ...utorials/single_image_super_resolution.md
@@ -1,4 +1,4 @@
-# 1 Super Resolution
+# 1 Single Image Super Resolution
 
 ## 1.1 Principle
 
@@ -89,7 +89,7 @@
      python -u tools/main.py --config-file configs/realsr_bicubic_noise_x4_df2k.yaml
   ```
 
-  Train the model:
+  Test the model:
   ```
      python tools/main.py --config-file configs/realsr_bicubic_noise_x4_df2k.yaml --evaluate-only --load ${PATH_OF_WEIGHT}
   ```

diff --git a/docs/zh_CN/tutorials/video_super_resolution.md b/docs/zh_CN/tutorials/video_super_resolution.md
@@ -0,0 +1,84 @@
+
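+# 1 Video Super Resolution
+
+## 1.1 Principle
+
+  Video super-resolution grew out of image super-resolution, which aims to recover high-resolution (HR) images from one or more low-resolution (LR) images. The difference between them is clear: a video consists of multiple frames, so video super-resolution usually exploits inter-frame information for restoration. Here we provide the video super-resolution model [EDVR](https://arxiv.org/pdf/1905.02716.pdf).
+
+  The [EDVR](https://arxiv.org/pdf/1905.02716.pdf) model won all four tracks of the NTIRE19 video restoration and enhancement challenges, outperforming the second place by a large margin. The main difficulties of video super-resolution are (1) how to align multiple frames given large motions, and (2) how to effectively fuse different frames with diverse motion and blur. First, to handle large motions, EDVR designs a Pyramid, Cascading and Deformable (PCD) alignment module, in which deformable convolutions perform feature-level frame alignment in a coarse-to-fine manner. Second, EDVR uses a Temporal and Spatial Attention (TSA) fusion module, which applies attention both temporally and spatially to emphasize the features that matter for subsequent restoration.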
+
+
+
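+## 1.2 How to Use
+
+### 1.2.1 Data Preparation
+
+  REDS ([download](https://seungjunnah.github.io/Datasets/reds.html)) is a high-quality (720p) video dataset proposed for the NTIRE19 competition. It consists of 240 training clips, 30 validation clips and 30 testing clips (each with 100 consecutive frames). Since the test ground truth is not available, four representative clips from the training set ('000', '011', '015' and '020', with diverse scenes and motions) are selected as the test set, denoted REDS4. The remaining training and validation clips are regrouped as the training dataset (266 clips in total).
+
+  The structure of the processed REDS dataset is as follows: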
+  ```
+    PaddleGAN
+      ├── data
+          ├── REDS
+                ├── train_sharp
+                |    └──X4
+                ├── train_sharp_bicubic
+                |    └──X4
+                ├── REDS4_test_sharp
+                |    └──X4
+                └── REDS4_test_sharp_bicubic
+                     └──X4
+              ...
+  ```
+
+### 1.2.2 Train/Test
+
+  The commands to train and test the EDVR model on the processed REDS dataset are as follows:
+
+  Train the model:
+  ```
+     python -u tools/main.py --config-file configs/edvr.yaml
+  ```
+
+  Test the model:
+  ```
+     python tools/main.py --config-file configs/edvr.yaml --evaluate-only --load ${PATH_OF_WEIGHT}
+  ```
+
+## 1.3 Results
+The numerical results are evaluated on the RGB channels.
+
+The metrics are PSNR / SSIM.
+
+| Method | REDS4 |
+|---|---|
+| EDVR_M_wo_tsa_SRx4  | 30.4429 / 0.8684 |
+| EDVR_M_w_tsa_SRx4  | 30.5169 / 0.8699 |
+| EDVR_L_wo_tsa_SRx4  | 30.8649 / 0.8761 |
+
+
+## 1.4 Model Download
+| Method | Dataset | Download Link |
+|---|---|---|
+| EDVR_M_wo_tsa_SRx4  | REDS | [EDVR_M_wo_tsa_SRx4](https://paddlegan.bj.bcebos.com/models/EDVR_M_wo_tsa_SRx4.pdparams) |
+| EDVR_M_w_tsa_SRx4  | REDS | [EDVR_M_w_tsa_SRx4](https://paddlegan.bj.bcebos.com/models/EDVR_M_w_tsa_SRx4.pdparams) |
+| EDVR_L_wo_tsa_SRx4  | REDS | [EDVR_L_wo_tsa_SRx4](https://paddlegan.bj.bcebos.com/models/EDVR_L_wo_tsa_SRx4.pdparams) |
+
+
+
+
+
+# References
+
+- 1. [EDVR: Video Restoration with Enhanced Deformable Convolutional Networks](https://arxiv.org/pdf/1905.02716.pdf)
+
+  ```
+  @InProceedings{wang2019edvr,
+    author = {Wang, Xintao and Chan, Kelvin C.K. and Yu, Ke and Dong, Chao and Loy, Chen Change},
+    title = {EDVR: Video Restoration with Enhanced Deformable Convolutional Networks},
+    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
+    month = {June},
+    year = {2019}
+    }
+  ```
+