diff --git a/docs/en/index.rst b/docs/en/index.rst index 285954487bb..32c5952a4ae 100644 --- a/docs/en/index.rst +++ b/docs/en/index.rst @@ -24,7 +24,7 @@ Welcome to MMDetection's documentation! :maxdepth: 1 :caption: Migration - migration.md + migration/migration.md .. toctree:: :maxdepth: 1 diff --git a/docs/en/migration/api_and_registry_migration.md b/docs/en/migration/api_and_registry_migration.md new file mode 100644 index 00000000000..72bfd3aec8e --- /dev/null +++ b/docs/en/migration/api_and_registry_migration.md @@ -0,0 +1 @@ +# Migrate API and Registry from MMDetection 2.x to 3.x diff --git a/docs/en/migration/config_migration.md b/docs/en/migration/config_migration.md new file mode 100644 index 00000000000..20fe0bb7e0f --- /dev/null +++ b/docs/en/migration/config_migration.md @@ -0,0 +1,819 @@ +# Migrate Configuration File from MMDetection 2.x to 3.x + +The configuration file of MMDetection 3.x has undergone significant changes in comparison to the 2.x version. This document explains how to migrate 2.x configuration files to 3.x. + +In the previous tutorial [Learn about Configs](../user_guides/config.md), we used Mask R-CNN as an example to introduce the configuration file structure of MMDetection 3.x. Here, we will follow the same structure to demonstrate how to migrate 2.x configuration files to 3.x. + +## Model Configuration + +There have been no major changes to the model configuration in 3.x compared to 2.x. For the model's backbone, neck, head, as well as train_cfg and test_cfg, the parameters remain the same as in version 2.x. + +On the other hand, we have added the `DataPreprocessor` module in MMDetection 3.x. The configuration for the `DataPreprocessor` module is located in `model.data_preprocessor`. It is used to preprocess the input data, such as normalizing input images and padding images of different sizes into batches, and loading images from memory to VRAM. This configuration replaces the `Normalize` and `Pad` modules in `train_pipeline` and `test_pipeline` of the earlier version. + + + + + + + + + +
2.x Config + +```python +# Image normalization parameters +img_norm_cfg = dict( + mean=[123.675, 116.28, 103.53], + std=[58.395, 57.12, 57.375], + to_rgb=True) +pipeline=[ + ..., + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size_divisor=32), # Padding the image to multiples of 32 + ... +] +``` + +
2.x Config + +```python +model = dict( + data_preprocessor=dict( + type='DetDataPreprocessor', + # Image normalization parameters + mean=[123.675, 116.28, 103.53], + std=[58.395, 57.12, 57.375], + bgr_to_rgb=True, + # Image padding parameters + pad_mask=True, # In instance segmentation, the mask needs to be padded + pad_size_divisor=32) # Padding the image to multiples of 32 +) + +``` + +
+ +## Dataset and Evaluator Configuration + +The dataset and evaluator configurations have undergone major changes compared to version 2.x. We will introduce how to migrate from version 2.x to version 3.x from three aspects: Dataloader and Dataset, Data transform pipeline, and Evaluator configuration. + +### Dataloader and Dataset Configuration + +In the new version, we set the data loading settings consistent with PyTorch's official DataLoader, +making it easier for users to understand and get started with. +We put the data loading settings for training, validation, and testing separately in `train_dataloader`, `val_dataloader`, and `test_dataloader`. +Users can set different parameters for these dataloaders. +The input parameters are basically the same as those required by [PyTorch DataLoader](https://pytorch.org/docs/stable/data.html?highlight=dataloader#torch.utils.data.DataLoader). + +This way, we put the unconfigurable parameters in version 2.x, such as `sampler`, `batch_sampler`, and `persistent_workers`, in the configuration file, so that users can set dataloader parameters more flexibly. + +Users can set the dataset configuration through `train_dataloader.dataset`, `val_dataloader.dataset`, and `test_dataloader.dataset`, which correspond to `data.train`, `data.val`, and `data.test` in version 2.x. + + + + + + + + + +
2.x Config + +```python +data = dict( + samples_per_gpu=2, + workers_per_gpu=2, + train=dict( + type=dataset_type, + ann_file=data_root + 'annotations/instances_train2017.json', + img_prefix=data_root + 'train2017/', + pipeline=train_pipeline), + val=dict( + type=dataset_type, + ann_file=data_root + 'annotations/instances_val2017.json', + img_prefix=data_root + 'val2017/', + pipeline=test_pipeline), + test=dict( + type=dataset_type, + ann_file=data_root + 'annotations/instances_val2017.json', + img_prefix=data_root + 'val2017/', + pipeline=test_pipeline)) +``` + +
3.x Config + +```python +train_dataloader = dict( + batch_size=2, + num_workers=2, + persistent_workers=True, # Avoid recreating subprocesses after each iteration + sampler=dict(type='DefaultSampler', shuffle=True), # Default sampler, supports both distributed and non-distributed training + batch_sampler=dict(type='AspectRatioBatchSampler'), # Default batch_sampler, used to ensure that images in the batch have similar aspect ratios, so as to better utilize graphics memory + dataset=dict( + type=dataset_type, + data_root=data_root, + ann_file='annotations/instances_train2017.json', + data_prefix=dict(img='train2017/'), + filter_cfg=dict(filter_empty_gt=True, min_size=32), + pipeline=train_pipeline)) +# In version 3.x, validation and test dataloaders can be configured independently +val_dataloader = dict( + batch_size=1, + num_workers=2, + persistent_workers=True, + drop_last=False, + sampler=dict(type='DefaultSampler', shuffle=False), + dataset=dict( + type=dataset_type, + data_root=data_root, + ann_file='annotations/instances_val2017.json', + data_prefix=dict(img='val2017/'), + test_mode=True, + pipeline=test_pipeline)) +test_dataloader = val_dataloader # The configuration of the testing dataloader is the same as that of the validation dataloader, which is omitted here + +``` + +
+ +### Data Transform Pipeline Configuration + +As mentioned earlier, we have separated the normalization and padding configurations for images from the `train_pipeline` and `test_pipeline`, and have placed them in `model.data_preprocessor` instead. Hence, in the 3.x version of the pipeline, we no longer require the `Normalize` and `Pad` transforms. + +At the same time, we have also refactored the transform responsible for packing the data format, and have merged the `Collect` and `DefaultFormatBundle` transforms into `PackDetInputs`. This transform is responsible for packing the data from the data pipeline into the input format of the model. For more details on the input format conversion, please refer to the [data flow documentation](../advanced_guides/data_flow.md). + +Below, we will use the `train_pipeline` of Mask R-CNN as an example, to demonstrate how to migrate from the 2.x configuration to the 3.x configuration: + + + + + + + + + +
2.x Config + +```python +img_norm_cfg = dict( + mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) +train_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='LoadAnnotations', with_bbox=True), + dict(type='Resize', img_scale=(1333, 800), keep_ratio=True), + dict(type='RandomFlip', flip_ratio=0.5), + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size_divisor=32), + dict(type='DefaultFormatBundle'), + dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']), +] +``` + +
3.x Config + +```python +train_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='LoadAnnotations', with_bbox=True), + dict(type='Resize', scale=(1333, 800), keep_ratio=True), + dict(type='RandomFlip', prob=0.5), + dict(type='PackDetInputs') +] +``` + +
+ +For the `test_pipeline`, apart from removing the `Normalize` and `Pad` transforms, we have also separated the data augmentation for testing (TTA) from the normal testing process, and have removed `MultiScaleFlipAug`. For more information on how to use the new TTA version, please refer to the [TTA documentation](../advanced_guides/tta.md). + +Below, we will again use the `test_pipeline` of Mask R-CNN as an example, to demonstrate how to migrate from the 2.x configuration to the 3.x configuration: + + + + + + + + + +
2.x Config + +```python +test_pipeline = [ + dict(type='LoadImageFromFile'), + dict( + type='MultiScaleFlipAug', + img_scale=(1333, 800), + flip=False, + transforms=[ + dict(type='Resize', keep_ratio=True), + dict(type='RandomFlip'), + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size_divisor=32), + dict(type='ImageToTensor', keys=['img']), + dict(type='Collect', keys=['img']), + ]) +] +``` + +
3.x Config + +```python +test_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='Resize', scale=(1333, 800), keep_ratio=True), + dict( + type='PackDetInputs', + meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', + 'scale_factor')) +] +``` + +
+ +In addition, we have also refactored some data augmentation transforms. The following table lists the mapping between the transforms used in the 2.x version and the 3.x version: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Name2.x Config3.x Config
Resize + +```python +dict(type='Resize', + img_scale=(1333, 800), + keep_ratio=True) +``` + + + +```python +dict(type='Resize', + scale=(1333, 800), + keep_ratio=True) +``` + +
RandomResize + +```python +dict( + type='Resize', + img_scale=[ + (1333, 640), (1333, 800)], + multiscale_mode='range', + keep_ratio=True) +``` + + + +```python +dict( + type='RandomResize', + scale=[ + (1333, 640), (1333, 800)], + keep_ratio=True) +``` + +
RandomChoiceResize + +```python +dict( + type='Resize', + img_scale=[ + (1333, 640), (1333, 672), + (1333, 704), (1333, 736), + (1333, 768), (1333, 800)], + multiscale_mode='value', + keep_ratio=True) +``` + + + +```python +dict( + type='RandomChoiceResize', + scales=[ + (1333, 640), (1333, 672), + (1333, 704), (1333, 736), + (1333, 768), (1333, 800)], + keep_ratio=True) +``` + +
RandomFlip + +```python +dict(type='RandomFlip', flip_ratio=0.5) +``` + + + +```python +dict(type='RandomFlip', prob=0.5) +``` + +
+ +### 评测器配置 + +In version 3.x, model accuracy evaluation is no longer tied to the dataset, but is instead accomplished through the use of an Evaluator. +The Evaluator configuration is divided into two parts: `val_evaluator` and `test_evaluator`. The `val_evaluator` is used for validation dataset evaluation, while the `test_evaluator` is used for testing dataset evaluation. +This corresponds to the `evaluation` field in version 2.x. + +The following table shows the corresponding relationship between Evaluators in version 2.x and 3.x. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Metric Name2.x Config3.x Config
COCO + +```python +data = dict( + val=dict( + type='CocoDataset', + ann_file=data_root + 'annotations/instances_val2017.json')) +evaluation = dict(metric=['bbox', 'segm']) +``` + + + +```python +val_evaluator = dict( + type='CocoMetric', + ann_file=data_root + 'annotations/instances_val2017.json', + metric=['bbox', 'segm'], + format_only=False) +``` + +
Pascal VOC + +```python +data = dict( + val=dict( + type=dataset_type, + ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt')) +evaluation = dict(metric='mAP') +``` + + + +```python +val_evaluator = dict( + type='VOCMetric', + metric='mAP', + eval_mode='11points') +``` + +
OpenImages + +```python +data = dict( + val=dict( + type='OpenImagesDataset', + ann_file=data_root + 'annotations/validation-annotations-bbox.csv', + img_prefix=data_root + 'OpenImages/validation/', + label_file=data_root + 'annotations/class-descriptions-boxable.csv', + hierarchy_file=data_root + + 'annotations/bbox_labels_600_hierarchy.json', + meta_file=data_root + 'annotations/validation-image-metas.pkl', + image_level_ann_file=data_root + + 'annotations/validation-annotations-human-imagelabels-boxable.csv')) +evaluation = dict(interval=1, metric='mAP') +``` + + + +```python +val_evaluator = dict( + type='OpenImagesMetric', + iou_thrs=0.5, + ioa_thrs=0.5, + use_group_of=True, + get_supercategory=True) +``` + +
CityScapes + +```python +data = dict( + val=dict( + type='CityScapesDataset', + ann_file=data_root + + 'annotations/instancesonly_filtered_gtFine_val.json', + img_prefix=data_root + 'leftImg8bit/val/', + pipeline=test_pipeline)) +evaluation = dict(metric=['bbox', 'segm']) +``` + + + +```python +val_evaluator = [ + dict( + type='CocoMetric', + ann_file=data_root + + 'annotations/instancesonly_filtered_gtFine_val.json', + metric=['bbox', 'segm']), + dict( + type='CityScapesMetric', + ann_file=data_root + + 'annotations/instancesonly_filtered_gtFine_val.json', + seg_prefix=data_root + '/gtFine/val', + outfile_prefix='./work_dirs/cityscapes_metric/instance') +] +``` + +
+ +## Configuration for Training and Testing + + + + + + + + + +
2.x Config + +```python +runner = dict( + type='EpochBasedRunner', # Type of training loop + max_epochs=12) # Maximum number of training epochs +evaluation = dict(interval=2) # Interval for evaluation, check the performance every 2 epochs +``` + +
3.x Config + +```python +train_cfg = dict( + type='EpochBasedTrainLoop', # Type of training loop, please refer to https://github.com/open-mmlab/mmengine/blob/main/mmengine/runner/loops.py + max_epochs=12, # Maximum number of training epochs + val_interval=2) # Interval for validation, check the performance every 2 epochs +val_cfg = dict(type='ValLoop') # Type of validation loop +test_cfg = dict(type='TestLoop') # Type of testing loop +``` + +
+ +## Optimization Configuration + +The configuration for optimizer and gradient clipping is moved to the `optim_wrapper` field. +The following table shows the correspondences for optimizer configuration between 2.x version and 3.x version: + + + + + + + + + +
2.x Config + +```python +optimizer = dict( + type='SGD', # Optimizer: Stochastic Gradient Descent + lr=0.02, # Base learning rate + momentum=0.9, # SGD with momentum + weight_decay=0.0001) # Weight decay +optimizer_config = dict(grad_clip=None) # Configuration for gradient clipping, set to None to disable +``` + +
3.x Config + +```python +optim_wrapper = dict( # Configuration for the optimizer wrapper + type='OptimWrapper', # Type of optimizer wrapper, you can switch to AmpOptimWrapper to enable mixed precision training + optimizer=dict( # Optimizer configuration, supports various PyTorch optimizers, please refer to https://pytorch.org/docs/stable/optim.html#algorithms + type='SGD', # SGD + lr=0.02, # Base learning rate + momentum=0.9, # SGD with momentum + weight_decay=0.0001), # Weight decay + clip_grad=None, # Configuration for gradient clipping, set to None to disable. For usage, please see https://mmengine.readthedocs.io/en/latest/tutorials/optimizer.html + ) +``` + +
+ +The configuration for learning rate is also moved from the `lr_config` field to the `param_scheduler` field. The `param_scheduler` configuration is more similar to PyTorch's learning rate scheduler and more flexible. The following table shows the correspondences for learning rate configuration between 2.x version and 3.x version: + + + + + + + + + +
2.x Config + +```python +lr_config = dict( + policy='step', # Use multi-step learning rate strategy during training + warmup='linear', # Use linear learning rate warmup + warmup_iters=500, # End warmup at iteration 500 + warmup_ratio=0.001, # Coefficient for learning rate warmup + step=[8, 11], # Learning rate decay at which epochs + gamma=0.1) # Learning rate decay coefficient + +``` + +
3.x Config + +```python +param_scheduler = [ + dict( + type='LinearLR', # Use linear learning rate warmup + start_factor=0.001, # Coefficient for learning rate warmup + by_epoch=False, # Update the learning rate during warmup at each iteration + begin=0, # Starting from the first iteration + end=500), # End at the 500th iteration + dict( + type='MultiStepLR', # Use multi-step learning rate strategy during training + by_epoch=True, # Update the learning rate at each epoch + begin=0, # Starting from the first epoch + end=12, # Ending at the 12th epoch + milestones=[8, 11], # Learning rate decay at which epochs + gamma=0.1) # Learning rate decay coefficient +] + +``` + +
+ +For information on how to migrate other learning rate adjustment policies, please refer to the [learning rate migration document of MMEngine](https://mmengine.readthedocs.io/zh_CN/latest/migration/param_scheduler.html). + +## Migration of Other Configurations + +### Configuration for Saving Checkpoints + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Function2.x Config3.x Config
Set Save Interval + +```python +checkpoint_config = dict( + interval=1) +``` + + + +```python +default_hooks = dict( + checkpoint=dict( + type='CheckpointHook', + interval=1)) +``` + +
Save Best Model + +```python +evaluation = dict( + save_best='auto') +``` + + + +```python +default_hooks = dict( + checkpoint=dict( + type='CheckpointHook', + save_best='auto')) +``` + +
Keep Latest Model + +```python +checkpoint_config = dict( + max_keep_ckpts=3) +``` + + + +```python +default_hooks = dict( + checkpoint=dict( + type='CheckpointHook', + max_keep_ckpts=3)) +``` + +
+ +### Logging Configuration + +In MMDetection 3.x, the logging and visualization of the log are carried out respectively by the logger and visualizer in MMEngine. The following table shows the comparison between the configuration of printing logs and visualizing logs in MMDetection 2.x and 3.x. + + + + + + + + + + + + + + + + + + + + + + + + +
Function2.x Config3.x Config
Set Log Printing Interval + +```python +log_config = dict(interval=50) +``` + + + +```python +default_hooks = dict( + logger=dict(type='LoggerHook', interval=50)) +# Optional: set moving average window size +log_processor = dict( + type='LogProcessor', window_size=50) +``` + +
Use TensorBoard or WandB to visualize logs + +```python +log_config = dict( + interval=50, + hooks=[ + dict(type='TextLoggerHook'), + dict(type='TensorboardLoggerHook'), + dict(type='MMDetWandbHook', + init_kwargs={ + 'project': 'mmdetection', + 'group': 'maskrcnn-r50-fpn-1x-coco' + }, + interval=50, + log_checkpoint=True, + log_checkpoint_metadata=True, + num_eval_images=100) + ]) +``` + + + +```python +vis_backends = [ + dict(type='LocalVisBackend'), + dict(type='TensorboardVisBackend'), + dict(type='WandbVisBackend', + init_kwargs={ + 'project': 'mmdetection', + 'group': 'maskrcnn-r50-fpn-1x-coco' + }) +] +visualizer = dict( + type='DetLocalVisualizer', + vis_backends=vis_backends, + name='visualizer') +``` + +
+ +For visualization-related tutorials, please refer to [Visualization Tutorial](../user_guides/visualization.md) of MMDetection. + +### Runtime Configuration + +The runtime configuration fields in version 3.x have been adjusted, and the specific correspondence is as follows: + + + + + + + + + + + + + + + + +
2.x Config3.x Config
+ +```python +cudnn_benchmark = False +opencv_num_threads = 0 +mp_start_method = 'fork' +dist_params = dict(backend='nccl') +log_level = 'INFO' +load_from = None +resume_from = None + + +``` + + + +```python +env_cfg = dict( + cudnn_benchmark=False, + mp_cfg=dict(mp_start_method='fork', + opencv_num_threads=0), + dist_cfg=dict(backend='nccl')) +log_level = 'INFO' +load_from = None +resume = False +``` + +
diff --git a/docs/en/migration/dataset_migration.md b/docs/en/migration/dataset_migration.md new file mode 100644 index 00000000000..75d093298e0 --- /dev/null +++ b/docs/en/migration/dataset_migration.md @@ -0,0 +1 @@ +# Migrate dataset from MMDetection 2.x to 3.x diff --git a/docs/en/migration/migration.md b/docs/en/migration/migration.md new file mode 100644 index 00000000000..ec6a2f891b1 --- /dev/null +++ b/docs/en/migration/migration.md @@ -0,0 +1,12 @@ +# Migrating from MMDetection 2.x to 3.x + +MMDetection 3.x is a significant update that includes many changes to API and configuration files. This document aims to help users migrate from MMDetection 2.x to 3.x. +We divided the migration guide into the following sections: + +- [Configuration file migration](./config_migration.md) +- [API and Registry migration](./api_and_registry_migration.md) +- [Dataset migration](./dataset_migration.md) +- [Model migration](./model_migration.md) +- [Frequently Asked Questions](./migration_faq.md) + +If you encounter any problems during the migration process, feel free to raise an issue. We also welcome contributions to this document. diff --git a/docs/en/migration/migration_faq.md b/docs/en/migration/migration_faq.md new file mode 100644 index 00000000000..a6e3c356c27 --- /dev/null +++ b/docs/en/migration/migration_faq.md @@ -0,0 +1 @@ +# Migration FAQ diff --git a/docs/en/migration/model_migration.md b/docs/en/migration/model_migration.md new file mode 100644 index 00000000000..04e280879fc --- /dev/null +++ b/docs/en/migration/model_migration.md @@ -0,0 +1 @@ +# Migrate models from MMDetection 2.x to 3.x diff --git a/docs/en/overview.md b/docs/en/overview.md index f78b658f017..39fb6f51564 100644 --- a/docs/en/overview.md +++ b/docs/en/overview.md @@ -50,3 +50,5 @@ Here is a detailed step-by-step guide to learn more about MMDetection: - [Basic Concepts](https://mmdetection.readthedocs.io/en/dev-3.x/advanced_guides/index.html#basic-concepts) - [Component Customization](https://mmdetection.readthedocs.io/en/dev-3.x/advanced_guides/index.html#component-customization) + +4. For users of MMDetection 2.x version, we provide a guide to help you adapt to the new version. You can find it in the [migration guide](./migration/migration.md). diff --git a/docs/zh_cn/index.rst b/docs/zh_cn/index.rst index 280e1ecacf6..58a4d8a52d3 100644 --- a/docs/zh_cn/index.rst +++ b/docs/zh_cn/index.rst @@ -24,7 +24,7 @@ Welcome to MMDetection's documentation! :maxdepth: 1 :caption: 迁移版本 - migration.md + migration/migration.md .. toctree:: :maxdepth: 1 diff --git a/docs/zh_cn/migration/api_and_registry_migration.md b/docs/zh_cn/migration/api_and_registry_migration.md new file mode 100644 index 00000000000..66e1c340806 --- /dev/null +++ b/docs/zh_cn/migration/api_and_registry_migration.md @@ -0,0 +1 @@ +# 将 API 和注册器从 MMDetection 2.x 迁移至 3.x diff --git a/docs/zh_cn/migration/config_migration.md b/docs/zh_cn/migration/config_migration.md new file mode 100644 index 00000000000..c4f9c8e3d2d --- /dev/null +++ b/docs/zh_cn/migration/config_migration.md @@ -0,0 +1,814 @@ +# 将配置文件从 MMDetection 2.x 迁移至 3.x + +MMDetection 3.x 的配置文件与 2.x 相比有较大变化,这篇文档将介绍如何将 2.x 的配置文件迁移到 3.x。 + +在前面的[配置文件教程](../user_guides/config.md)中,我们以 Mask R-CNN 为例介绍了 MMDetection 3.x 的配置文件结构,这里我们将按同样的结构介绍如何将 2.x 的配置文件迁移至 3.x。 + +## 模型配置 + +模型的配置与 2.x 相比并没有太大变化,对于模型的 backbone,neck,head,以及 train_cfg 和 test_cfg,它们的参数与 2.x 版本的参数保持一致。 + +不同的是,我们在 3.x 版本的模型中新增了 `DataPreprocessor` 模块。 +`DataPreprocessor` 模块的配置位于 `model.data_preprocessor` 中,它用于对输入数据进行预处理,例如对输入图像进行归一化,将不同大小的图片进行 padding 从而组成 batch,将图像从内存中读取到显存中等。这部分配置取代了原本存在于 train_pipeline 和 test_pipeline 中的 `Normalize` 和 `Pad`。 + + + + + + + + + +
原配置 + +```python +# 图像归一化参数 +img_norm_cfg = dict( + mean=[123.675, 116.28, 103.53], + std=[58.395, 57.12, 57.375], + to_rgb=True) +pipeline=[ + ..., + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size_divisor=32), # 图像 padding 到 32 的倍数 + ... +] +``` + +
新配置 + +```python +model = dict( + data_preprocessor=dict( + type='DetDataPreprocessor', + # 图像归一化参数 + mean=[123.675, 116.28, 103.53], + std=[58.395, 57.12, 57.375], + bgr_to_rgb=True, + # 图像 padding 参数 + pad_mask=True, # 在实例分割中,需要将 mask 也进行 padding + pad_size_divisor=32) # 图像 padding 到 32 的倍数 +) +``` + +
+ +## 数据集和评测器配置 + +数据集和评测部分的配置相比 2.x 版本有较大的变化。我们将从 Dataloader 和 Dataset,Data transform pipeline,以及评测器配置三个方面介绍如何将 2.x 版本的配置迁移到 3.x 版本。 + +### Dataloader 和 Dataset 配置 + +在新版本中,我们将数据加载的设置与 PyTorch 官方的 DataLoader 保持一致,这样可以使用户更容易理解和上手。 +我们将训练、验证和测试的数据加载设置分别放在 `train_dataloader`,`val_dataloader` 和 `test_dataloader` 中,用户可以分别对这些 dataloader 设置不同的参数,其输入参数与 [PyTorch 的 Dataloader](https://pytorch.org/docs/stable/data.html?highlight=dataloader#torch.utils.data.DataLoader) 所需要的参数基本一致。 + +通过这种方式,我们将 2.x 版本中不可配置的 `sampler`,`batch_sampler`,`persistent_workers` 等参数都放到了配置文件中,使得用户可以更加灵活地设置数据加载的参数。 + +用户可以通过 `train_dataloader.dataset`,`val_dataloader.dataset` 和 `test_dataloader.dataset` 来设置数据集的配置,它们分别对应 2.x 版本中的 `data.train`,`data.val` 和 `data.test`。 + + + + + + + + + +
原配置 + +```python +data = dict( + samples_per_gpu=2, + workers_per_gpu=2, + train=dict( + type=dataset_type, + ann_file=data_root + 'annotations/instances_train2017.json', + img_prefix=data_root + 'train2017/', + pipeline=train_pipeline), + val=dict( + type=dataset_type, + ann_file=data_root + 'annotations/instances_val2017.json', + img_prefix=data_root + 'val2017/', + pipeline=test_pipeline), + test=dict( + type=dataset_type, + ann_file=data_root + 'annotations/instances_val2017.json', + img_prefix=data_root + 'val2017/', + pipeline=test_pipeline)) +``` + +
新配置 + +```python +train_dataloader = dict( + batch_size=2, + num_workers=2, + persistent_workers=True, # 避免每次迭代后 dataloader 重新创建子进程 + sampler=dict(type='DefaultSampler', shuffle=True), # 默认的 sampler,同时支持分布式训练和非分布式训练 + batch_sampler=dict(type='AspectRatioBatchSampler'), # 默认的 batch_sampler,用于保证 batch 中的图片具有相似的长宽比,从而可以更好地利用显存 + dataset=dict( + type=dataset_type, + data_root=data_root, + ann_file='annotations/instances_train2017.json', + data_prefix=dict(img='train2017/'), + filter_cfg=dict(filter_empty_gt=True, min_size=32), + pipeline=train_pipeline)) +# 在 3.x 版本中可以独立配置验证和测试的 dataloader +val_dataloader = dict( + batch_size=1, + num_workers=2, + persistent_workers=True, + drop_last=False, + sampler=dict(type='DefaultSampler', shuffle=False), + dataset=dict( + type=dataset_type, + data_root=data_root, + ann_file='annotations/instances_val2017.json', + data_prefix=dict(img='val2017/'), + test_mode=True, + pipeline=test_pipeline)) +test_dataloader = val_dataloader # 测试 dataloader 的配置与验证 dataloader 的配置相同,这里省略 +``` + +
+ +### Data transform pipeline 配置 + +上文中提到,我们将图像 normalize 和 padding 的配置从 `train_pipeline` 和 `test_pipeline` 中独立出来,放到了 `model.data_preprocessor` 中,因此在 3.x 版本的 pipeline 中,我们不再需要 `Normalize` 和 `Pad` 这两个 transform。 + +同时,我们也对负责数据格式打包的 transform 进行了重构,将 `Collect` 和 `DefaultFormatBundle` 这两个 transform 合并为了 `PackDetInputs`,它负责将 data pipeline 中的数据打包成模型的输入格式,关于输入格式的转换,详见[数据流文档](../advanced_guides/data_flow.md)。 + +下面以 Mask R-CNN 1x 的 train_pipeline 为例,介绍如何将 2.x 版本的配置迁移到 3.x 版本: + + + + + + + + + +
原配置 + +```python +img_norm_cfg = dict( + mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) +train_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='LoadAnnotations', with_bbox=True), + dict(type='Resize', img_scale=(1333, 800), keep_ratio=True), + dict(type='RandomFlip', flip_ratio=0.5), + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size_divisor=32), + dict(type='DefaultFormatBundle'), + dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']), +] +``` + +
新配置 + +```python +train_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='LoadAnnotations', with_bbox=True), + dict(type='Resize', scale=(1333, 800), keep_ratio=True), + dict(type='RandomFlip', prob=0.5), + dict(type='PackDetInputs') +] +``` + +
+ +对于 test_pipeline,除了将 `Normalize` 和 `Pad` 这两个 transform 去掉之外,我们也将测试时的数据增强(TTA)与普通的测试流程分开,移除了 `MultiScaleFlipAug`。关于新版的 TTA 如何使用,详见[TTA 文档](../advanced_guides/tta.md)。 + +下面同样以 Mask R-CNN 1x 的 test_pipeline 为例,介绍如何将 2.x 版本的配置迁移到 3.x 版本: + + + + + + + + + +
原配置 + +```python +test_pipeline = [ + dict(type='LoadImageFromFile'), + dict( + type='MultiScaleFlipAug', + img_scale=(1333, 800), + flip=False, + transforms=[ + dict(type='Resize', keep_ratio=True), + dict(type='RandomFlip'), + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size_divisor=32), + dict(type='ImageToTensor', keys=['img']), + dict(type='Collect', keys=['img']), + ]) +] +``` + +
新配置 + +```python +test_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='Resize', scale=(1333, 800), keep_ratio=True), + dict( + type='PackDetInputs', + meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', + 'scale_factor')) +] +``` + +
+ +除此之外,我们还对一些数据增强进行了重构,下表列出了 2.x 版本中的 transform 与 3.x 版本中的 transform 的对应关系: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
名称原配置新配置
Resize + +```python +dict(type='Resize', + img_scale=(1333, 800), + keep_ratio=True) +``` + + + +```python +dict(type='Resize', + scale=(1333, 800), + keep_ratio=True) +``` + +
RandomResize + +```python +dict( + type='Resize', + img_scale=[ + (1333, 640), (1333, 800)], + multiscale_mode='range', + keep_ratio=True) +``` + + + +```python +dict( + type='RandomResize', + scale=[ + (1333, 640), (1333, 800)], + keep_ratio=True) +``` + +
RandomChoiceResize + +```python +dict( + type='Resize', + img_scale=[ + (1333, 640), (1333, 672), + (1333, 704), (1333, 736), + (1333, 768), (1333, 800)], + multiscale_mode='value', + keep_ratio=True) +``` + + + +```python +dict( + type='RandomChoiceResize', + scales=[ + (1333, 640), (1333, 672), + (1333, 704), (1333, 736), + (1333, 768), (1333, 800)], + keep_ratio=True) +``` + +
RandomFlip + +```python +dict(type='RandomFlip', + flip_ratio=0.5) +``` + + + +```python +dict(type='RandomFlip', + prob=0.5) +``` + +
+ +### 评测器配置 + +在 3.x 版本中,模型精度评测不再与数据集绑定,而是通过评测器(Evaluator)来完成。 +评测器配置分为 val_evaluator 和 test_evaluator 两部分,其中 val_evaluator 用于验证集评测,test_evaluator 用于测试集评测,对应 2.x 版本中的 evaluation 字段。 +下表列出了 2.x 版本与 3.x 版本中的评测器的对应关系: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
评测指标名称原配置新配置
COCO + +```python +data = dict( + val=dict( + type='CocoDataset', + ann_file=data_root + 'annotations/instances_val2017.json')) +evaluation = dict(metric=['bbox', 'segm']) +``` + + + +```python +val_evaluator = dict( + type='CocoMetric', + ann_file=data_root + 'annotations/instances_val2017.json', + metric=['bbox', 'segm'], + format_only=False) +``` + +
Pascal VOC + +```python +data = dict( + val=dict( + type=dataset_type, + ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt')) +evaluation = dict(metric='mAP') +``` + + + +```python +val_evaluator = dict( + type='VOCMetric', + metric='mAP', + eval_mode='11points') +``` + +
OpenImages + +```python +data = dict( + val=dict( + type='OpenImagesDataset', + ann_file=data_root + 'annotations/validation-annotations-bbox.csv', + img_prefix=data_root + 'OpenImages/validation/', + label_file=data_root + 'annotations/class-descriptions-boxable.csv', + hierarchy_file=data_root + + 'annotations/bbox_labels_600_hierarchy.json', + meta_file=data_root + 'annotations/validation-image-metas.pkl', + image_level_ann_file=data_root + + 'annotations/validation-annotations-human-imagelabels-boxable.csv')) +evaluation = dict(interval=1, metric='mAP') +``` + + + +```python +val_evaluator = dict( + type='OpenImagesMetric', + iou_thrs=0.5, + ioa_thrs=0.5, + use_group_of=True, + get_supercategory=True) +``` + +
CityScapes + +```python +data = dict( + val=dict( + type='CityScapesDataset', + ann_file=data_root + + 'annotations/instancesonly_filtered_gtFine_val.json', + img_prefix=data_root + 'leftImg8bit/val/', + pipeline=test_pipeline)) +evaluation = dict(metric=['bbox', 'segm']) +``` + + + +```python +val_evaluator = [ + dict( + type='CocoMetric', + ann_file=data_root + + 'annotations/instancesonly_filtered_gtFine_val.json', + metric=['bbox', 'segm']), + dict( + type='CityScapesMetric', + ann_file=data_root + + 'annotations/instancesonly_filtered_gtFine_val.json', + seg_prefix=data_root + '/gtFine/val', + outfile_prefix='./work_dirs/cityscapes_metric/instance') +] +``` + +
+ +## 训练和测试的配置 + + + + + + + + + +
原配置 + +```python +runner = dict( + type='EpochBasedRunner', # 训练循环的类型 + max_epochs=12) # 最大训练轮次 +evaluation = dict(interval=2) # 验证间隔。每 2 个 epoch 验证一次 +``` + +
新配置 + +```python +train_cfg = dict( + type='EpochBasedTrainLoop', # 训练循环的类型,请参考 https://github.com/open-mmlab/mmengine/blob/main/mmengine/runner/loops.py + max_epochs=12, # 最大训练轮次 + val_interval=2) # 验证间隔。每 2 个 epoch 验证一次 +val_cfg = dict(type='ValLoop') # 验证循环的类型 +test_cfg = dict(type='TestLoop') # 测试循环的类型 +``` + +
+ +## 优化相关配置 + +优化器以及梯度裁剪的配置都移至 optim_wrapper 字段中。下表列出了 2.x 版本与 3.x 版本中的优化器配置的对应关系: + + + + + + + + + +
原配置 + +```python +optimizer = dict( + type='SGD', # 随机梯度下降优化器 + lr=0.02, # 基础学习率 + momentum=0.9, # 带动量的随机梯度下降 + weight_decay=0.0001) # 权重衰减 +optimizer_config = dict(grad_clip=None) # 梯度裁剪的配置,设置为 None 关闭梯度裁剪 +``` + +
新配置 + +```python +optim_wrapper = dict( # 优化器封装的配置 + type='OptimWrapper', # 优化器封装的类型。可以切换至 AmpOptimWrapper 来启用混合精度训练 + optimizer=dict( # 优化器配置。支持 PyTorch 的各种优化器。请参考 https://pytorch.org/docs/stable/optim.html#algorithms + type='SGD', # 随机梯度下降优化器 + lr=0.02, # 基础学习率 + momentum=0.9, # 带动量的随机梯度下降 + weight_decay=0.0001), # 权重衰减 + clip_grad=None, # 梯度裁剪的配置,设置为 None 关闭梯度裁剪。使用方法请见 https://mmengine.readthedocs.io/en/latest/tutorials/optimizer.html + ) +``` + +
+ +学习率的配置也从 lr_config 字段中移至 param_scheduler 字段中。param_scheduler 的配置更贴近 PyTorch 的学习率调整策略,更加灵活。下表列出了 2.x 版本与 3.x 版本中的学习率配置的对应关系: + + + + + + + + + +
原配置 + +```python +lr_config = dict( + policy='step', # 在训练过程中使用 multi step 学习率策略 + warmup='linear', # 使用线性学习率预热 + warmup_iters=500, # 到第 500 个 iteration 结束预热 + warmup_ratio=0.001, # 学习率预热的系数 + step=[8, 11], # 在哪几个 epoch 进行学习率衰减 + gamma=0.1) # 学习率衰减系数 +``` + +
新配置 + +```python +param_scheduler = [ + dict( + type='LinearLR', # 使用线性学习率预热 + start_factor=0.001, # 学习率预热的系数 + by_epoch=False, # 按 iteration 更新预热学习率 + begin=0, # 从第一个 iteration 开始 + end=500), # 到第 500 个 iteration 结束 + dict( + type='MultiStepLR', # 在训练过程中使用 multi step 学习率策略 + by_epoch=True, # 按 epoch 更新学习率 + begin=0, # 从第一个 epoch 开始 + end=12, # 到第 12 个 epoch 结束 + milestones=[8, 11], # 在哪几个 epoch 进行学习率衰减 + gamma=0.1) # 学习率衰减系数 +] +``` + +
+ +关于其他的学习率调整策略的迁移,请参考 MMEngine 的[学习率迁移文档](https://mmengine.readthedocs.io/zh_CN/latest/migration/param_scheduler.html)。 + +## 其他配置的迁移 + +### 保存 checkpoint 的配置 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
功能原配置新配置
设置保存间隔 + +```python +checkpoint_config = dict( + interval=1) +``` + + + +```python +default_hooks = dict( + checkpoint=dict( + type='CheckpointHook', + interval=1)) +``` + +
保存最佳模型 + +```python +evaluation = dict( + save_best='auto') +``` + + + +```python +default_hooks = dict( + checkpoint=dict( + type='CheckpointHook', + save_best='auto')) +``` + +
只保留最新的几个模型 + +```python +checkpoint_config = dict( + max_keep_ckpts=3) +``` + + + +```python +default_hooks = dict( + checkpoint=dict( + type='CheckpointHook', + max_keep_ckpts=3)) +``` + +
+ +### 日志的配置 + +3.x 版本中,日志的打印和可视化由 MMEngine 中的 logger 和 visualizer 分别完成。下表列出了 2.x 版本与 3.x 版本中的日志配置的对应关系: + + + + + + + + + + + + + + + + + + + + + + + + +
功能原配置新配置
设置日志打印间隔 + +```python +log_config = dict( + interval=50) +``` + + + +```python +default_hooks = dict( + logger=dict( + type='LoggerHook', + interval=50)) +# 可选: 配置日志打印数值的平滑窗口大小 +log_processor = dict( + type='LogProcessor', + window_size=50) +``` + +
使用 TensorBoard 或 WandB 可视化日志 + +```python +log_config = dict( + interval=50, + hooks=[ + dict(type='TextLoggerHook'), + dict(type='TensorboardLoggerHook'), + dict(type='MMDetWandbHook', + init_kwargs={ + 'project': 'mmdetection', + 'group': 'maskrcnn-r50-fpn-1x-coco' + }, + interval=50, + log_checkpoint=True, + log_checkpoint_metadata=True, + num_eval_images=100) + ]) +``` + + + +```python +vis_backends = [ + dict(type='LocalVisBackend'), + dict(type='TensorboardVisBackend'), + dict(type='WandbVisBackend', + init_kwargs={ + 'project': 'mmdetection', + 'group': 'maskrcnn-r50-fpn-1x-coco' + }) +] +visualizer = dict( + type='DetLocalVisualizer', vis_backends=vis_backends, name='visualizer') +``` + +
+ +关于可视化相关的教程,请参考 MMDetection 的[可视化教程](../user_guides/visualization.md)。 + +### Runtime 的配置 + +3.x 版本中 runtime 的配置字段有所调整,具体的对应关系如下: + + + + + + + + + + + + + + + + +
原配置新配置
+ +```python +cudnn_benchmark = False +opencv_num_threads = 0 +mp_start_method = 'fork' +dist_params = dict(backend='nccl') +log_level = 'INFO' +load_from = None +resume_from = None + + +``` + + + +```python +env_cfg = dict( + cudnn_benchmark=False, + mp_cfg=dict(mp_start_method='fork', + opencv_num_threads=0), + dist_cfg=dict(backend='nccl')) +log_level = 'INFO' +load_from = None +resume = False +``` + +
diff --git a/docs/zh_cn/migration/dataset_migration.md b/docs/zh_cn/migration/dataset_migration.md new file mode 100644 index 00000000000..c379b9f1b7b --- /dev/null +++ b/docs/zh_cn/migration/dataset_migration.md @@ -0,0 +1 @@ +# 将数据集从 MMDetection 2.x 迁移至 3.x diff --git a/docs/zh_cn/migration/migration.md b/docs/zh_cn/migration/migration.md new file mode 100644 index 00000000000..d706856fa82 --- /dev/null +++ b/docs/zh_cn/migration/migration.md @@ -0,0 +1,12 @@ +# 从 MMDetection 2.x 迁移至 3.x + +MMDetection 3.x 版本是一个重大更新,包含了许多 API 和配置文件的变化。本文档旨在帮助用户从 MMDetection 2.x 版本迁移到 3.x 版本。 +我们将迁移指南分为以下几个部分: + +- [配置文件迁移](./config_migration.md) +- [API 和 Registry 迁移](./api_and_registry_migration.md) +- [数据集迁移](./dataset_migration.md) +- [模型迁移](./model_migration.md) +- [常见问题](./migration_faq.md) + +如果您在迁移过程中遇到任何问题,欢迎在 issue 中提出。我们也欢迎您为本文档做出贡献。 diff --git a/docs/zh_cn/migration/migration_faq.md b/docs/zh_cn/migration/migration_faq.md new file mode 100644 index 00000000000..208a138b25d --- /dev/null +++ b/docs/zh_cn/migration/migration_faq.md @@ -0,0 +1 @@ +# 迁移 FAQ diff --git a/docs/zh_cn/migration/model_migration.md b/docs/zh_cn/migration/model_migration.md new file mode 100644 index 00000000000..d7992440228 --- /dev/null +++ b/docs/zh_cn/migration/model_migration.md @@ -0,0 +1 @@ +# 将模型从 MMDetection 2.x 迁移至 3.x diff --git a/docs/zh_cn/overview.md b/docs/zh_cn/overview.md index b27ead66357..deaa5a2c173 100644 --- a/docs/zh_cn/overview.md +++ b/docs/zh_cn/overview.md @@ -50,3 +50,5 @@ MMDetection 由 7 个主要部分组成,apis、structures、datasets、models - [基础概念](https://mmdetection.readthedocs.io/zh_CN/dev-3.x/advanced_guides/index.html#basic-concepts) - [组件定制](https://mmdetection.readthedocs.io/zh_CN/dev-3.x/advanced_guides/index.html#component-customization) + +4. 对于 MMDetection 2.x 版本的用户,我们提供了[迁移指南](./migration/migration.md),帮助您完成新版本的适配。