GitHub - audaci0us/MVDet-CNN-experiments: [ECCV 2020] Codes and MultiviewX dataset for "Multiview Detection with Feature Perspective Transformation".

My personal experiments on CNN behind "Multiview Detection with Feature Perspective Transformation" [Website] [arXiv]

@inproceedings{hou2020multiview,
  title={Multiview Detection with Feature Perspective Transformation},
  author={Hou, Yunzhong and Zheng, Liang and Gould, Stephen},
  booktitle={ECCV},
  year={2020}
}

Overview

This repo is for my practice and trials in learning Convolutional Neural Networks. The changes carried out are probably simple ones. The primary goal here is to get familiar with a code base that's a good implementation of a Neural Network architecture.

Here I'm focusing on MVDet model, which has been described and analyzed in the paper linked at the top. I got my understandings by reading through this paper (still learning from it) and also by going through the code that the paper's authors provided (this repo is a fork of theirs). With the intention of learning, I'm trying out different things here. Hoping to document enough of the learnings! We'll see!

Content

Original setup

The original architecture of MVDet is given below.

Their code implementation (and also my modified ones) of MVDet uses CUDA as well as the following libraries

python 3.7+
pytorch 1.4+ & tochvision
numpy
matplotlib
pillow
opencv-python
kornia
matlab & matlabengine (required for evaluation) (see this link for detailed guide)

Experiment 1

This experiment is only to get some basic familiarity with the code logics.

Trials attempted in exp 1

Importantly, MVDet uses a Resnet18 architecture as its model-core. I've modified the code to use Resnet34 instead.
Some setup related changes have been carried out in order to run the code in my Windows desktop with one GPU instead of 2 GPUs as run by the paper's authors.
Also, minor changes are done to the main script where the parameters are explicitly hard-coded (this is for my convenience).

Changes carried out in exp 1

"persp_trans_detector.py" script has a "PerspTransDetector" class. In its constructor (__init__ method), Resnet34 is added as an additional option.
In the same script, the __init__ method and the forward method, both establish the device in which the model stays and the data gets loaded onto.
- The original implementation splits the model between two GPUs. I remove that portion and instead put the whole model into the single available GPU (in __init__ method).
- Then, the data before it gets put through the model (in the forward method), it is loaded into GPU. Since the original model is split, this happens multiple times for the dataset. At those instances, they are just loaded into the same GPU.
In the "main.py" script, the argparse library is used to apply the parameters. I've modified to use a simple class with properties instead. This is just for my convenience.

Experiment 2

This one is a work in progress. Will be committing the changes soon. I've tried to modify the architecture to add an additional model in between the single-view results and the multi-view aggregation layers. These modifications will primarily be in "res_proj_variant". Hoping to complete them soon and commit them here. Fingers crossed!!

Trials attempted in exp 2

The existing architecture has multiple variants. One such variant passes the result of the single-view detection, on to the next steps instead of the feature mappings. Refer image below where the red-line highlights the alternative path in this variant. I've modified the CNN model (CNN block highlighted in red as well) used in this alternate pathway here. The idea was to understand what it takes to integrate other single-view CNN models into this architecture.

Changes carried out in exp 2

The model definition in the __init__ method of the ResProjVariant class in the "res_proj_variant.py" script was modified. The self.image_classifier property was modified to include an additional resnet18 CNN block in its middle.
In order to integrate this ResNet18 model inside this block, the kernel size of the first CNN layer was converted to kernel_size=1, so that the channel size can be suitably adjusted for the input of ResNet18 block.
The last few layers of the ResNet18 block added was omitted, in order to avoid flattening the layers. This way it will remain suitable for the next steps. Ideally, some of the first few layers should also be omitted so that the channel size need not be drastically reduced in order to input here (reduced 512 channels to 3 channels). May be in one of the future experiments.

Experiment 3

This one is a work in progress. Will be committing the changes soon. I'll try to make changes that avoid the drastic reduction in the channels count, when modified for the experiment-2 above. These modifications will primarily be in "res_proj_variant" as well. Hoping to complete them soon and commit them here. Fingers crossed!!

Trials attempted in exp 3

*** Work in progress ***

Changes carried out in exp 3

*** Work in progress ***

Name		Name	Last commit message	Last commit date
Latest commit History 69 Commits
multiview_detector		multiview_detector
.gitattributes		.gitattributes
.gitignore		.gitignore
README - MVDet authors'.md		README - MVDet authors'.md
README.md		README.md
grid_visualize.py		grid_visualize.py
img_grid_visualize.png		img_grid_visualize.png
main.py		main.py
map_grid_visualize.png		map_grid_visualize.png
run_gpu01.sh		run_gpu01.sh
video_visualize.py		video_visualize.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Overview

Content

Original setup

Experiment 1

Trials attempted in exp 1

Changes carried out in exp 1

Experiment 2

Trials attempted in exp 2

Changes carried out in exp 2

Experiment 3

Trials attempted in exp 3

Changes carried out in exp 3

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Overview

Content

Original setup

Experiment 1

Trials attempted in exp 1

Changes carried out in exp 1

Experiment 2

Trials attempted in exp 2

Changes carried out in exp 2

Experiment 3

Trials attempted in exp 3

Changes carried out in exp 3

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages