Commit a416edb6 authored by Jas

add docs about datasets

parent dca589a0
# Tutorial 2: Adding New Dataset
## Customize datasets by reorganizing data
## Customize datasets by reorganizing data to COCO format
### Reorganize dataset to existing format
The simplest way to use a custom dataset is to convert your annotations to the existing COCO dataset format.
The simplest way to use a custom dataset is to convert your annotations to the COCO dataset format.
Annotation JSON files in COCO format have the following necessary keys:
......@@ -62,7 +60,232 @@ There are three necessary keys in the json file:
- `annotations`: contains the list of instance annotations.
- `categories`: contains the category name ('person') and its ID (1).
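As a rough illustration of the three necessary keys, the sketch below builds a tiny annotation structure by hand and checks that the top-level keys are present. The `images` key is taken from the COCO format itself, since the diff context above elides it; the helper name `missing_coco_keys` and all sample values are made up for this example.

```python
import json

# The three top-level keys a COCO-style annotation file is expected to carry.
REQUIRED_KEYS = ("images", "annotations", "categories")


def missing_coco_keys(coco: dict) -> list:
    """Return the required top-level keys that are absent from `coco`."""
    return [key for key in REQUIRED_KEYS if key not in coco]


# A tiny hand-written annotation structure; all values are illustrative only.
example = {
    "images": [{"id": 1, "file_name": "000001.jpg", "width": 640, "height": 480}],
    "annotations": [{
        "id": 1,
        "image_id": 1,
        "category_id": 1,
        "bbox": [100.0, 120.0, 80.0, 200.0],               # [x, y, width, height]
        "keypoints": [140, 130, 2, 150, 125, 2, 0, 0, 0],  # (x, y, visibility) triplets
        "num_keypoints": 2,
    }],
    "categories": [{"id": 1, "name": "person"}],
}

# Round-trip through JSON to mimic loading a file from disk.
loaded = json.loads(json.dumps(example))
print(missing_coco_keys(loaded))          # []
print(missing_coco_keys({"images": []}))  # ['annotations', 'categories']
```

A check like this only confirms that the keys exist; it does not validate the contents of each entry.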
After pre-processing the data, users need to modify the config files to use the dataset.
## Create a custom dataset_info config file for the dataset
Add a new dataset info config file.
```
configs/_base_/datasets/custom.py
```
An example of the dataset config is as follows.
`keypoint_info` contains the information about each keypoint.
1. `name`: the keypoint name. The keypoint name must be unique.
2. `id`: the keypoint id.
3. `color`: ([R, G, B]) is used for keypoint visualization.
4. `type`: 'upper' or 'lower', used in data augmentation.
5. `swap`: indicates the 'swap pair' (also known as 'flip pair'). When applying image horizontal flip, the left part will become the right part. We need to flip the keypoints accordingly.
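To make the role of `swap` concrete, here is a minimal sketch of a horizontal flip on a hypothetical three-keypoint subset. The helper names `build_flip_index` and `flip_keypoints` are invented for illustration and are not claimed to match how the library implements flipping.

```python
# Hypothetical three-keypoint subset, using the same fields as `keypoint_info`.
keypoint_info = {
    0: dict(name='nose', id=0, swap=''),
    1: dict(name='left_eye', id=1, swap='right_eye'),
    2: dict(name='right_eye', id=2, swap='left_eye'),
}


def build_flip_index(info):
    """For each keypoint id, return the id of its mirrored counterpart."""
    name_to_id = {kpt['name']: kpt['id'] for kpt in info.values()}
    # An empty `swap` means the keypoint maps to itself.
    return [name_to_id[info[i]['swap'] or info[i]['name']] for i in sorted(info)]


def flip_keypoints(keypoints, image_width, flip_index):
    """Mirror x coordinates, then reorder so left/right labels stay correct."""
    mirrored = [(image_width - 1 - x, y) for x, y in keypoints]
    return [mirrored[j] for j in flip_index]


flip_index = build_flip_index(keypoint_info)
keypoints = [(50, 40), (60, 30), (40, 30)]  # nose, left_eye, right_eye
flipped = flip_keypoints(keypoints, image_width=100, flip_index=flip_index)
print(flip_index)  # [0, 2, 1]
print(flipped)     # [(49, 40), (59, 30), (39, 30)]
```

Without the reordering step, the point labeled `left_eye` would end up on the wrong side of the face after the flip.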
`skeleton_info` contains the information about the keypoint connectivity, which is used for visualization.
`joint_weights` assigns different loss weights to different keypoints.
`sigmas` is used to calculate the OKS score. Please read [keypoints-eval](https://cocodataset.org/#keypoints-eval) to learn more about it.
```
dataset_info = dict(
dataset_name='coco',
paper_info=dict(
author='Lin, Tsung-Yi and Maire, Michael and '
'Belongie, Serge and Hays, James and '
'Perona, Pietro and Ramanan, Deva and '
r'Doll{\'a}r, Piotr and Zitnick, C Lawrence',
title='Microsoft coco: Common objects in context',
container='European conference on computer vision',
year='2014',
homepage='http://cocodataset.org/',
),
keypoint_info={
0:
dict(name='nose', id=0, color=[51, 153, 255], type='upper', swap=''),
1:
dict(
name='left_eye',
id=1,
color=[51, 153, 255],
type='upper',
swap='right_eye'),
2:
dict(
name='right_eye',
id=2,
color=[51, 153, 255],
type='upper',
swap='left_eye'),
3:
dict(
name='left_ear',
id=3,
color=[51, 153, 255],
type='upper',
swap='right_ear'),
4:
dict(
name='right_ear',
id=4,
color=[51, 153, 255],
type='upper',
swap='left_ear'),
5:
dict(
name='left_shoulder',
id=5,
color=[0, 255, 0],
type='upper',
swap='right_shoulder'),
6:
dict(
name='right_shoulder',
id=6,
color=[255, 128, 0],
type='upper',
swap='left_shoulder'),
7:
dict(
name='left_elbow',
id=7,
color=[0, 255, 0],
type='upper',
swap='right_elbow'),
8:
dict(
name='right_elbow',
id=8,
color=[255, 128, 0],
type='upper',
swap='left_elbow'),
9:
dict(
name='left_wrist',
id=9,
color=[0, 255, 0],
type='upper',
swap='right_wrist'),
10:
dict(
name='right_wrist',
id=10,
color=[255, 128, 0],
type='upper',
swap='left_wrist'),
11:
dict(
name='left_hip',
id=11,
color=[0, 255, 0],
type='lower',
swap='right_hip'),
12:
dict(
name='right_hip',
id=12,
color=[255, 128, 0],
type='lower',
swap='left_hip'),
13:
dict(
name='left_knee',
id=13,
color=[0, 255, 0],
type='lower',
swap='right_knee'),
14:
dict(
name='right_knee',
id=14,
color=[255, 128, 0],
type='lower',
swap='left_knee'),
15:
dict(
name='left_ankle',
id=15,
color=[0, 255, 0],
type='lower',
swap='right_ankle'),
16:
dict(
name='right_ankle',
id=16,
color=[255, 128, 0],
type='lower',
swap='left_ankle')
},
skeleton_info={
0:
dict(link=('left_ankle', 'left_knee'), id=0, color=[0, 255, 0]),
1:
dict(link=('left_knee', 'left_hip'), id=1, color=[0, 255, 0]),
2:
dict(link=('right_ankle', 'right_knee'), id=2, color=[255, 128, 0]),
3:
dict(link=('right_knee', 'right_hip'), id=3, color=[255, 128, 0]),
4:
dict(link=('left_hip', 'right_hip'), id=4, color=[51, 153, 255]),
5:
dict(link=('left_shoulder', 'left_hip'), id=5, color=[51, 153, 255]),
6:
dict(link=('right_shoulder', 'right_hip'), id=6, color=[51, 153, 255]),
7:
dict(
link=('left_shoulder', 'right_shoulder'),
id=7,
color=[51, 153, 255]),
8:
dict(link=('left_shoulder', 'left_elbow'), id=8, color=[0, 255, 0]),
9:
dict(
link=('right_shoulder', 'right_elbow'), id=9, color=[255, 128, 0]),
10:
dict(link=('left_elbow', 'left_wrist'), id=10, color=[0, 255, 0]),
11:
dict(link=('right_elbow', 'right_wrist'), id=11, color=[255, 128, 0]),
12:
dict(link=('left_eye', 'right_eye'), id=12, color=[51, 153, 255]),
13:
dict(link=('nose', 'left_eye'), id=13, color=[51, 153, 255]),
14:
dict(link=('nose', 'right_eye'), id=14, color=[51, 153, 255]),
15:
dict(link=('left_eye', 'left_ear'), id=15, color=[51, 153, 255]),
16:
dict(link=('right_eye', 'right_ear'), id=16, color=[51, 153, 255]),
17:
dict(link=('left_ear', 'left_shoulder'), id=17, color=[51, 153, 255]),
18:
dict(
link=('right_ear', 'right_shoulder'), id=18, color=[51, 153, 255])
},
joint_weights=[
1., 1., 1., 1., 1., 1., 1., 1.2, 1.2, 1.5, 1.5, 1., 1., 1.2, 1.2, 1.5,
1.5
],
sigmas=[
0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072, 0.062,
0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089
])
```
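A config of this size is easy to get subtly wrong, so a small consistency check can help before training. The sketch below encodes checks that seem implied by the example above (unique names, symmetric swap pairs, skeleton links that refer to known keypoints, one weight and one sigma per keypoint). The function `check_dataset_info` is hypothetical and not part of the library.

```python
def check_dataset_info(info):
    """Return a list of human-readable problems found in a dataset_info dict."""
    problems = []
    kpts = info['keypoint_info']
    names = [k['name'] for k in kpts.values()]
    by_name = {k['name']: k for k in kpts.values()}

    if len(set(names)) != len(names):
        problems.append('keypoint names are not unique')
    for key, k in kpts.items():
        if key != k['id']:
            problems.append(f"key {key} does not match id {k['id']}")
        partner = k['swap']
        # A non-empty swap must point at a keypoint that points back.
        if partner and by_name.get(partner, {}).get('swap') != k['name']:
            problems.append(f"swap of {k['name']!r} is not symmetric")
    for link in info['skeleton_info'].values():
        for end in link['link']:
            if end not in by_name:
                problems.append(f'skeleton link uses unknown keypoint {end!r}')
    for field in ('joint_weights', 'sigmas'):
        if len(info[field]) != len(kpts):
            problems.append(
                f'{field} has {len(info[field])} entries, expected {len(kpts)}')
    return problems


# Toy config with three keypoints; values are illustrative only.
toy = dict(
    keypoint_info={
        0: dict(name='nose', id=0, swap=''),
        1: dict(name='left_eye', id=1, swap='right_eye'),
        2: dict(name='right_eye', id=2, swap='left_eye'),
    },
    skeleton_info={0: dict(link=('nose', 'left_eye'), id=0)},
    joint_weights=[1., 1., 1.],
    sigmas=[0.026, 0.025, 0.025],
)
print(check_dataset_info(toy))     # []
broken = dict(toy, sigmas=[0.026, 0.025])
print(check_dataset_info(broken))  # ['sigmas has 2 entries, expected 3']
```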
## Create a custom dataset class
1. First create a package inside the mmpose/datasets/datasets folder.
2. Create a class definition of your dataset in the package folder and register it in the registry with a name. If the name used in the config is not found in the registry, building the dataset raises `KeyError: 'XXXXX is not in the dataset registry'`.
```
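# Register the class under the name that `dataset_type` in the config refers to.
@DATASETS.register_module(name='MyCustomDataset')
class MyCustomDataset(SomeOtherBaseClassAsPerYourNeed):
    ...  # dataset implementation goes here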
```
3. Update the `__init__.py` of your package folder so that it imports the new class.
4. Update `mmpose/datasets/__init__.py` of the dataset package folder as well.
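The registration steps above can be illustrated with a toy registry. This is a simplified stand-in written for this example, not the real registry class used by the library; it only shows why a config that names an unregistered (or never imported) class ends in a `KeyError`.

```python
class Registry:
    """Toy stand-in for a name-to-class registry (not the real library class)."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self, name=None):
        def decorator(cls):
            # Fall back to the class name when no explicit name is given.
            self._modules[name or cls.__name__] = cls
            return cls
        return decorator

    def build(self, cfg):
        args = dict(cfg)
        type_name = args.pop('type')
        if type_name not in self._modules:
            raise KeyError(f'{type_name} is not in the {self.name} registry')
        return self._modules[type_name](**args)


DATASETS = Registry('dataset')


@DATASETS.register_module(name='MyCustomDataset')
class MyCustomDataset:
    def __init__(self, ann_file):
        self.ann_file = ann_file


dataset = DATASETS.build(dict(type='MyCustomDataset', ann_file='train.json'))
print(type(dataset).__name__)  # MyCustomDataset

try:
    DATASETS.build(dict(type='Unregistered', ann_file='train.json'))
except KeyError as err:
    print(err)  # 'Unregistered is not in the dataset registry'
```

In this toy version the decorator runs when the module is imported, which is why keeping the `__init__.py` files up to date matters: a class that is never imported is never registered.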
## Create a custom training config file
Create a training config file in the configs folder for the model/architecture you want to use. You may also modify an existing config file to use the new custom dataset.
In `configs/my_custom_config.py`:
......@@ -70,7 +293,6 @@ In `configs/my_custom_config.py`:
...
# dataset settings
dataset_type = 'MyCustomDataset'
classes = ('a', 'b', 'c', 'd', 'e')
...
data = dict(
samples_per_gpu=2,
......@@ -92,3 +314,5 @@ data = dict(
...))
...
```
Make sure you have provided all the paths correctly.
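One way to catch path mistakes early is to walk the `data` dict and test each path before launching training. The sketch below is hypothetical: the helper `find_missing_paths` is invented here, and the key names `ann_file` and `img_prefix`, as well as the `train`/`val` entries, are assumptions about what the config contains.

```python
import os
import tempfile


def find_missing_paths(cfg, path_keys=('ann_file', 'img_prefix')):
    """Recursively collect (key, value) pairs whose path does not exist."""
    missing = []
    if isinstance(cfg, dict):
        for key, value in cfg.items():
            if key in path_keys and isinstance(value, str):
                if not os.path.exists(value):
                    missing.append((key, value))
            else:
                missing.extend(find_missing_paths(value, path_keys))
    elif isinstance(cfg, (list, tuple)):
        for item in cfg:
            missing.extend(find_missing_paths(item, path_keys))
    return missing


with tempfile.TemporaryDirectory() as root:
    ann_file = os.path.join(root, 'train.json')
    open(ann_file, 'w').close()  # create an empty placeholder file
    data = dict(
        samples_per_gpu=2,
        train=dict(type='MyCustomDataset', ann_file=ann_file, img_prefix=root),
        # val.json is deliberately not created, so it should be reported.
        val=dict(type='MyCustomDataset',
                 ann_file=os.path.join(root, 'val.json'), img_prefix=root),
    )
    missing = find_missing_paths(data)
    print([key for key, _ in missing])  # ['ann_file']
```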
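# Tutorial 2: Adding New Dataset
## Customize datasets by reorganizing data to an existing format
## Convert the dataset to COCO format
The simplest way to use a custom dataset is to convert it to the existing COCO dataset format.
We first need to convert the custom dataset to the COCO dataset format.
The JSON annotation file in COCO format has the following keys:
......@@ -60,15 +60,239 @@ The JSON file must contain the following three keys:
- `annotations`: contains the list of instance annotations.
- `categories`: contains the category name ('person') and its ID (1).
After the data pre-processing is complete, users need to modify the config files to use the dataset.
## Create a dataset_info config file for the custom dataset
In `configs/my_custom_config.py`, make the following modifications:
Add a dataset config file at the following location.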
```
configs/_base_/datasets/custom.py
```
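An example of the dataset config file is as follows:
`keypoint_info` contains the information about each keypoint:
1. `name`: the keypoint name. Each keypoint name must be unique within a dataset.
2. `id`: the keypoint ID.
3. `color`: ([R, G, B]) used for keypoint visualization.
4. `type`: either 'upper' or 'lower', used in data augmentation.
5. `swap`: the name of the keypoint that mirrors the current keypoint.
`skeleton_info` contains the connections between keypoints and is mainly used for visualization.
`joint_weights` sets different loss weights for different keypoints during training.
`sigmas` is used to calculate the OKS score. See [keypoints-eval](https://cocodataset.org/#keypoints-eval) for details.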
```
dataset_info = dict(
dataset_name='coco',
paper_info=dict(
author='Lin, Tsung-Yi and Maire, Michael and '
'Belongie, Serge and Hays, James and '
'Perona, Pietro and Ramanan, Deva and '
r'Doll{\'a}r, Piotr and Zitnick, C Lawrence',
title='Microsoft coco: Common objects in context',
container='European conference on computer vision',
year='2014',
homepage='http://cocodataset.org/',
),
keypoint_info={
0:
dict(name='nose', id=0, color=[51, 153, 255], type='upper', swap=''),
1:
dict(
name='left_eye',
id=1,
color=[51, 153, 255],
type='upper',
swap='right_eye'),
2:
dict(
name='right_eye',
id=2,
color=[51, 153, 255],
type='upper',
swap='left_eye'),
3:
dict(
name='left_ear',
id=3,
color=[51, 153, 255],
type='upper',
swap='right_ear'),
4:
dict(
name='right_ear',
id=4,
color=[51, 153, 255],
type='upper',
swap='left_ear'),
5:
dict(
name='left_shoulder',
id=5,
color=[0, 255, 0],
type='upper',
swap='right_shoulder'),
6:
dict(
name='right_shoulder',
id=6,
color=[255, 128, 0],
type='upper',
swap='left_shoulder'),
7:
dict(
name='left_elbow',
id=7,
color=[0, 255, 0],
type='upper',
swap='right_elbow'),
8:
dict(
name='right_elbow',
id=8,
color=[255, 128, 0],
type='upper',
swap='left_elbow'),
9:
dict(
name='left_wrist',
id=9,
color=[0, 255, 0],
type='upper',
swap='right_wrist'),
10:
dict(
name='right_wrist',
id=10,
color=[255, 128, 0],
type='upper',
swap='left_wrist'),
11:
dict(
name='left_hip',
id=11,
color=[0, 255, 0],
type='lower',
swap='right_hip'),
12:
dict(
name='right_hip',
id=12,
color=[255, 128, 0],
type='lower',
swap='left_hip'),
13:
dict(
name='left_knee',
id=13,
color=[0, 255, 0],
type='lower',
swap='right_knee'),
14:
dict(
name='right_knee',
id=14,
color=[255, 128, 0],
type='lower',
swap='left_knee'),
15:
dict(
name='left_ankle',
id=15,
color=[0, 255, 0],
type='lower',
swap='right_ankle'),
16:
dict(
name='right_ankle',
id=16,
color=[255, 128, 0],
type='lower',
swap='left_ankle')
},
skeleton_info={
0:
dict(link=('left_ankle', 'left_knee'), id=0, color=[0, 255, 0]),
1:
dict(link=('left_knee', 'left_hip'), id=1, color=[0, 255, 0]),
2:
dict(link=('right_ankle', 'right_knee'), id=2, color=[255, 128, 0]),
3:
dict(link=('right_knee', 'right_hip'), id=3, color=[255, 128, 0]),
4:
dict(link=('left_hip', 'right_hip'), id=4, color=[51, 153, 255]),
5:
dict(link=('left_shoulder', 'left_hip'), id=5, color=[51, 153, 255]),
6:
dict(link=('right_shoulder', 'right_hip'), id=6, color=[51, 153, 255]),
7:
dict(
link=('left_shoulder', 'right_shoulder'),
id=7,
color=[51, 153, 255]),
8:
dict(link=('left_shoulder', 'left_elbow'), id=8, color=[0, 255, 0]),
9:
dict(
link=('right_shoulder', 'right_elbow'), id=9, color=[255, 128, 0]),
10:
dict(link=('left_elbow', 'left_wrist'), id=10, color=[0, 255, 0]),
11:
dict(link=('right_elbow', 'right_wrist'), id=11, color=[255, 128, 0]),
12:
dict(link=('left_eye', 'right_eye'), id=12, color=[51, 153, 255]),
13:
dict(link=('nose', 'left_eye'), id=13, color=[51, 153, 255]),
14:
dict(link=('nose', 'right_eye'), id=14, color=[51, 153, 255]),
15:
dict(link=('left_eye', 'left_ear'), id=15, color=[51, 153, 255]),
16:
dict(link=('right_eye', 'right_ear'), id=16, color=[51, 153, 255]),
17:
dict(link=('left_ear', 'left_shoulder'), id=17, color=[51, 153, 255]),
18:
dict(
link=('right_ear', 'right_shoulder'), id=18, color=[51, 153, 255])
},
joint_weights=[
1., 1., 1., 1., 1., 1., 1., 1.2, 1.2, 1.5, 1.5, 1., 1., 1.2, 1.2, 1.5,
1.5
],
sigmas=[
0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072, 0.062,
0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089
])
```
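To show where `sigmas` enters the evaluation, here is a minimal sketch of the object keypoint similarity (OKS) as defined on the COCO keypoint evaluation page: each labeled keypoint contributes `exp(-d^2 / (2 * area * k^2))` with `k = 2 * sigma`, and the contributions are averaged. The function name `oks` and the sample coordinates are made up for this example, and the small epsilon used by the reference evaluation code is omitted.

```python
import math


def oks(pred, gt, visible, sigmas, area):
    """Object keypoint similarity between predicted and ground-truth keypoints.

    pred, gt: lists of (x, y); visible: list of flags (> 0 means labeled);
    sigmas: per-keypoint constants; area: object area in squared pixels.
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visible, sigmas):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k2 = (2 * sigma) ** 2  # COCO uses k_i = 2 * sigma_i
        total += math.exp(-d2 / (2 * area * k2))
        count += 1
    return total / count if count else 0.0


sigmas = [0.026, 0.025, 0.025]           # nose, left_eye, right_eye
gt = [(50, 40), (60, 30), (40, 30)]
visible = [2, 2, 2]
shifted = [(x + 5, y) for x, y in gt]    # every prediction is 5 px off

print(oks(gt, gt, visible, sigmas, area=10000.0))  # 1.0
print(0.0 < oks(shifted, gt, visible, sigmas, area=10000.0) < 1.0)  # True
```

A smaller sigma makes the score fall off faster for the same pixel error, which is why tightly localized keypoints such as the eyes use smaller values than the hips.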
## Create a custom dataset class
1. First create a package in the mmpose/datasets/datasets folder, for example named custom.
2. Define the dataset class and register it.
```
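# Register the class under the name that `dataset_type` in the config refers to.
@DATASETS.register_module(name='MyCustomDataset')
class MyCustomDataset(SomeOtherBaseClassAsPerYourNeed):
    ...  # dataset implementation goes here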
```
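3. Create `mmpose/datasets/datasets/custom/__init__.py` for your custom class.
4. Update `mmpose/datasets/__init__.py`.
## Create and modify the training config file
Create and modify the training config file to use your custom dataset.
In `configs/my_custom_config.py`, modify the following lines.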
```python
...
# dataset settings
dataset_type = 'MyCustomDataset'
classes = ('a', 'b', 'c', 'd', 'e')
...
data = dict(
samples_per_gpu=2,
......