Training mmdetection (1): a custom VOC-format dataset

admin · 2024-05-15

  • Preparation
  • I. The VOC dataset
  • II. Modifying the config code for training (pay close attention!)
    • 1. Dataset-related changes
    • 2. Building a custom config file
  • III. Training and evaluation
  • Summary

Preparation

A VOC-format dataset and the mmdetection code base.

I. The VOC dataset

You need data organized into the following three folders
(it does not have to follow the exact VOCdevkit/VOC2007 layout; the changes below show how to adapt the paths):

  • Annotations: the XML label files
  • ImageSets/Main: the train.txt, test.txt, and val.txt split lists
  • JPEGImages: the original images

II. Modifying the config code for training (pay close attention!)

Note: all changes in this section are made in a config file you create yourself. As a rule, do not edit the original code: modifying it directly is tedious, easy to mix up, and risks breaking the overall structure of the code base. Everything below lives in a self-created config file; adapt it to your needs (this is my preferred way of configuring).

1. Dataset-related changes

(1) In configs/_base_/datasets/voc0712.py
Change the paths to match your VOC dataset (in this file, change only the paths, nothing else):

# dataset settings
dataset_type = 'VOCDataset'
# data_root = 'data/VOCdevkit/'
data_root = '/home/ubuntu/data/Official-SSDD-OPEN/BBox_SSDD/'

# Example of using a different file client
# Method 1: simply set the data root and let the file I/O module
# automatically infer from the prefix (LMDB and Memcached are not yet supported)

# data_root = 's3://openmmlab/datasets/detection/segmentation/VOCdevkit/'

# Method 2: Use `backend_args`, `file_client_args` in versions before 3.0.0rc6
# backend_args = dict(
#     backend='petrel',
#     path_mapping=dict({
#         './data/': 's3://openmmlab/datasets/segmentation/',
#         'data/': 's3://openmmlab/datasets/segmentation/'
#     }))
backend_args = None

### Data augmentation pipeline
train_pipeline = [
    dict(type='LoadImageFromFile', backend_args=backend_args),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', scale=(1000, 600), keep_ratio=True),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PackDetInputs')
]
test_pipeline = [
    dict(type='LoadImageFromFile', backend_args=backend_args),
    dict(type='Resize', scale=(1000, 600), keep_ratio=True),
    # avoid bboxes being resized
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor'))
]

### Data loading
train_dataloader = dict(
    batch_size=2,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    batch_sampler=dict(type='AspectRatioBatchSampler'),
    dataset=dict(
        type='RepeatDataset',
        times=3,
        dataset=dict(
            type='ConcatDataset',
            # VOCDataset will add different `dataset_type` in dataset.metainfo,
            # which will get error if using ConcatDataset. Adding
            # `ignore_keys` can avoid this error.
            ignore_keys=['dataset_type'],
            datasets=[
                dict(
                    type=dataset_type,
                    data_root=data_root,
                    # ann_file='VOC2007/ImageSets/Main/trainval.txt',
                    ann_file='voc_style/ImageSets/Main/train.txt',
                    data_prefix=dict(sub_data_root='voc_style/'),
                    filter_cfg=dict(
                        filter_empty_gt=True, min_size=32, bbox_min_size=32
                        ),
                    pipeline=train_pipeline,
                    backend_args=backend_args),
                # dict(
                #     type=dataset_type,
                #     data_root=data_root,
                #     ann_file='VOC2012/ImageSets/Main/trainval.txt',
                #     data_prefix=dict(sub_data_root='VOC2012/'),
                #     filter_cfg=dict(
                #         filter_empty_gt=True, min_size=32, bbox_min_size=32),
                #     pipeline=train_pipeline,
                #     backend_args=backend_args)
            ])))

val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='voc_style/ImageSets/Main/test.txt',
        data_prefix=dict(sub_data_root='voc_style/'),
        test_mode=True,
        pipeline=test_pipeline,
        backend_args=backend_args))
test_dataloader = val_dataloader

# Pascal VOC2007 uses `11points` as the default evaluation mode, while
# Pascal VOC2012 defaults to `area`.
val_evaluator = dict(type='VOCMetric', metric='mAP', eval_mode='11points')
test_evaluator = val_evaluator
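To make the difference between the two evaluation modes concrete, here is a plain-Python sketch of both AP definitions (the function names are mine, and the recall/precision lists are assumed to be sorted by increasing recall):

```python
def ap_11points(recalls, precisions):
    """VOC2007-style 11-point interpolated AP: average the best
    precision achievable at recall >= t, for t = 0.0, 0.1, ..., 1.0."""
    ap = 0.0
    for t in [i / 10 for i in range(11)]:
        candidates = [p for r, p in zip(recalls, precisions) if r >= t]
        ap += (max(candidates) if candidates else 0.0) / 11
    return ap


def ap_area(recalls, precisions):
    """VOC2012-style 'area' AP: area under the monotone
    (non-increasing) precision envelope of the PR curve."""
    mrec = [0.0] + list(recalls) + [1.0]
    mpre = [0.0] + list(precisions) + [0.0]
    # Make precision non-increasing, sweeping from right to left.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    return sum((mrec[i] - mrec[i - 1]) * mpre[i]
               for i in range(1, len(mrec)))


# A toy precision-recall curve (sorted by increasing recall)
recalls = [0.2, 0.4, 0.6]
precisions = [1.0, 0.8, 0.5]
```

On the same toy curve the two modes give noticeably different values, which is why the eval_mode setting matters when comparing mAP numbers across papers.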

(2) Modify mmdet/datasets/voc.py
Change the class names and box colors to those of your own dataset, and be sure to comment out the VOC2007/VOC2012 dataset-type check, so that your own dataset path can be used later.

# Copyright (c) OpenMMLab. All rights reserved.
from mmdet.registry import DATASETS
from .xml_style import XMLDataset


@DATASETS.register_module()
class VOCDataset(XMLDataset):
    """Dataset for PASCAL VOC."""


    # Standard VOC class metadata
    # METAINFO = {
    #     'classes':
    #     ('aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
    #      'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
    #      'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor'),
    #     # palette is a list of color tuples, which is used for visualization.
    #     'palette': [(106, 0, 228), (119, 11, 32), (165, 42, 42), (0, 0, 192),
    #                 (197, 226, 255), (0, 60, 100), (0, 0, 142), (255, 77, 255),
    #                 (153, 69, 1), (120, 166, 157), (0, 182, 199),
    #                 (0, 226, 252), (182, 182, 255), (0, 0, 230), (220, 20, 60),
    #                 (163, 255, 0), (0, 82, 0), (3, 95, 161), (0, 80, 100),
    #                 (183, 130, 88)]
    # }


    ### Modified class metadata for the custom dataset
    METAINFO = {
        'classes':
        ('ship', ),
        # palette is a list of color tuples, which is used for visualization.
        'palette': [(106, 0, 228)]
    }


    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # if 'VOC2007' in self.sub_data_root:
        #     self._metainfo['dataset_type'] = 'VOC2007'
        # elif 'VOC2012' in self.sub_data_root:
        #     self._metainfo['dataset_type'] = 'VOC2012'
        # else:
        #     self._metainfo['dataset_type'] = None
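A mismatch between 'classes' and 'palette' is an easy mistake when editing METAINFO by hand. A small sanity check can catch it early (this helper is my own, not part of mmdetection):

```python
def check_metainfo(metainfo):
    """Sanity-check a custom METAINFO dict: every class needs exactly
    one palette color, and each color must be a valid RGB triple."""
    classes, palette = metainfo['classes'], metainfo['palette']
    if len(classes) != len(palette):
        raise ValueError(
            f'{len(classes)} classes but {len(palette)} palette colors')
    for color in palette:
        if len(color) != 3 or not all(0 <= c <= 255 for c in color):
            raise ValueError(f'invalid RGB color: {color}')
    return True


# The single-class METAINFO from the snippet above passes the check.
check_metainfo({'classes': ('ship', ), 'palette': [(106, 0, 228)]})
```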

(3) Change the number of output classes in the model config (optional)
configs/_base_/models/faster-rcnn_r50_fpn.py

2. Building a custom config file

(1) Create a file named myconfig.py in the repository root.
(2) Copy the content below into it.
The new config file consists of three main parts:
1. Importing the corresponding base config (_base_).
2. Model settings: be sure to set num_classes to the number of classes in your dataset.
3. Dataset settings: copy them directly from configs/_base_/datasets/voc0712.py.

# The new config inherits the base config and makes the necessary changes
# _base_ = './configs/faster_rcnn/mask-rcnn_r50-caffe_fpn_ms-poly-1x_coco.py'
_base_ = './configs/faster_rcnn/faster-rcnn_r50_fpn_1x_voc.py'
# We also need to change num_classes in the head to match the number of
# classes in the dataset
########------ Model settings --------#########
model = dict(
    roi_head=dict(
        bbox_head=dict(num_classes=1)))
 
########------ Dataset settings --------#########
backend_args = None

# dataset settings

dataset_type = 'VOCDataset'
# data_root = 'data/VOCdevkit/'
data_root = '/home/ubuntu/data/Official-SSDD-OPEN/BBox_SSDD/'

### Data augmentation pipeline
train_pipeline = [
    dict(type='LoadImageFromFile', backend_args=backend_args),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', scale=(1000, 600), keep_ratio=True),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PackDetInputs')
]
test_pipeline = [
    dict(type='LoadImageFromFile', backend_args=backend_args),
    dict(type='Resize', scale=(1000, 600), keep_ratio=True),
    # avoid bboxes being resized
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor'))
]

### Data loading
train_dataloader = dict(
    batch_size=2,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    batch_sampler=dict(type='AspectRatioBatchSampler'),
    dataset=dict(
        type='RepeatDataset',
        times=3,
        dataset=dict(
            type='ConcatDataset',
            # VOCDataset will add different `dataset_type` in dataset.metainfo,
            # which will get error if using ConcatDataset. Adding
            # `ignore_keys` can avoid this error.
            ignore_keys=['dataset_type'],
            datasets=[
                dict(
                    type=dataset_type,
                    data_root=data_root,
                    # ann_file='VOC2007/ImageSets/Main/trainval.txt',
                    ann_file='voc_style/ImageSets/Main/train.txt',
                    data_prefix=dict(sub_data_root='voc_style/'),
                    filter_cfg=dict(
                        filter_empty_gt=True, min_size=32, bbox_min_size=32
                        ),
                    pipeline=train_pipeline,
                    backend_args=backend_args),
                # dict(
                #     type=dataset_type,
                #     data_root=data_root,
                #     ann_file='VOC2012/ImageSets/Main/trainval.txt',
                #     data_prefix=dict(sub_data_root='VOC2012/'),
                #     filter_cfg=dict(
                #         filter_empty_gt=True, min_size=32, bbox_min_size=32),
                #     pipeline=train_pipeline,
                #     backend_args=backend_args)
            ])))

val_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    drop_last=False,
    sampler=dict(type='DefaultSampler', shuffle=False),
    dataset=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file='voc_style/ImageSets/Main/test.txt',
        data_prefix=dict(sub_data_root='voc_style/'),
        test_mode=True,
        pipeline=test_pipeline,
        backend_args=backend_args))
test_dataloader = val_dataloader
 
# Initializing from pretrained Mask R-CNN weights can improve model performance
# load_from = 'https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'

III. Training and evaluation

The training command:

```bash
python tools/train.py myconfig.py
```

The testing command (pass the config file plus the checkpoint to evaluate):

```bash
python tools/test.py myconfig.py <checkpoint-file>
```

Summary

Problems encountered during training
A record of an issue I ran into: when training on the SSDD dataset, COCO-format training worked fine, but VOC-format training produced a very low mAP that would not go up. Training the same VOC-format data with mmdetection 2.x worked without problems. The fix is to delete bbox_min_size=32 in configs/_base_/datasets/voc0712.py. Original source: https://blog.csdn.net/Pliter/article/details/134389961
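To see why bbox_min_size=32 hurts on SSDD: many SAR ship targets are narrower than 32 pixels, so the filter throws away their ground-truth boxes during training. A rough sketch of the effect, with made-up box coordinates and a simplified version of what filter_cfg does:

```python
# Hypothetical ground-truth boxes (x1, y1, x2, y2) for a SAR ship image;
# the coordinates are made up, but ships this small are common in SSDD.
gt_bboxes = [
    (10, 10, 30, 25),    # 20 x 15 px -> dropped by the filter
    (50, 40, 120, 100),  # 70 x 60 px -> kept
    (200, 5, 215, 40),   # 15 x 35 px -> dropped (width < 32)
]


def filter_small(bboxes, bbox_min_size):
    """Simplified sketch of the bbox_min_size filter: drop any box
    whose width or height is below the threshold."""
    return [
        (x1, y1, x2, y2) for x1, y1, x2, y2 in bboxes
        if x2 - x1 >= bbox_min_size and y2 - y1 >= bbox_min_size
    ]
```

With the threshold at 32, two of the three ground truths disappear; the detector is then trained as if those ships were background, which explains the stubbornly low mAP.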
