Downloading a YOLOv4 dataset from the Google Open Images Dataset and creating the annotation file

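This post covers how to download data for YOLOv4 from the Google Open Images Dataset.

The Google Open Images Dataset offers high-quality data for tasks ranging from object detection to segmentation, and it has been released in versions v1 through v6.

Downloading it directly is fairly tedious (working out how to do it is a hassle).

So these are my notes on downloading object-detection data for one specific class.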

Downloading the data

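I use OIDv4_ToolKit.

It can download the images and annotation data for only the classes you choose from Open Images Dataset V4.

The annotation data covers object detection only. Segmentation is not supported.

Bounding boxes come as .txt files in the format [name_of_the_class, left, top, right, bottom], so a conversion may be needed depending on what you feed them into.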
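If the training code expects Darknet-style labels (a class id followed by a normalized center x, center y, width, and height), the pixel-coordinate boxes from the toolkit need converting. Below is a minimal sketch, assuming the image size is known. `to_yolo_normalized` is a hypothetical helper name, not part of OIDv4_ToolKit.

```python
def to_yolo_normalized(left, top, right, bottom, img_w, img_h):
    """Convert a pixel box (left, top, right, bottom) to normalized
    (center_x, center_y, width, height), each in the range 0..1."""
    center_x = (left + right) / 2.0 / img_w
    center_y = (top + bottom) / 2.0 / img_h
    width = (right - left) / img_w
    height = (bottom - top) / img_h
    return center_x, center_y, width, height


# Example with made-up numbers: a 50x100 box in the top-left corner
# of a 100x200 image.
print(to_yolo_normalized(0, 0, 50, 100, 100, 200))  # (0.25, 0.25, 0.5, 0.5)
```

The annotation format used later in this post keeps pixel corners, so this conversion is only needed for tools that expect normalized labels.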

$ git clone https://github.com/EscVM/OIDv4_ToolKit.git
$ cd OIDv4_ToolKit
$ pip3 install -r requirements.txt

This time I download class == Knife.
The available classes can be looked up from the search box on the Google Open Images Dataset site.

--type_csv accepts one of [train/validation/test/all].
I want everything, so I specify all.

The other arguments can be specified as described on GitHub:

・IsOccluded: Indicates that the object is occluded by another object in the image.
・IsTruncated: Indicates that the object extends beyond the boundary of the image.
・IsGroupOf: Indicates that the box spans a group of objects (e.g., a bed of flowers or a crowd of people). We asked annotators to use this tag for cases with more than 5 instances which are heavily occluding each other and are physically touching.
・IsDepiction: Indicates that the object is a depiction (e.g., a cartoon or drawing of the object, not a real physical instance).
・IsInside: Indicates a picture taken from the inside of the object (e.g., a car interior or inside of a building).
・n_threads: Select how many threads you want to use. The ToolKit will take care of downloading multiple images in parallel, considerably speeding up the downloading process.
・limit: Limit the number of images being downloaded. Useful if you want to restrict the size of your dataset.
・y: Answer yes when prompted to download missing csv files.

$ python3 main.py downloader --classes Knife --type_csv all
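
For repeated downloads it can help to assemble the command in a script. This is a rough sketch, assuming the options listed above are spelled `--limit`, `--n_threads`, and `-y` on the command line (check the toolkit's README before relying on this). `build_download_command` is a hypothetical helper.

```python
import shlex


def build_download_command(classes, type_csv="all", limit=None, n_threads=None, yes=False):
    """Build the argument list for OIDv4_ToolKit's downloader.

    The flag spellings (--limit, --n_threads, -y) are assumptions based on
    the option names listed in the post.
    """
    cmd = ["python3", "main.py", "downloader", "--classes", *classes, "--type_csv", type_csv]
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if n_threads is not None:
        cmd += ["--n_threads", str(n_threads)]
    if yes:
        cmd.append("-y")
    return cmd


cmd = build_download_command(["Knife"], type_csv="all", limit=100, yes=True)
print(shlex.join(cmd))
# To actually run it from inside the OIDv4_ToolKit directory:
# import subprocess; subprocess.run(cmd, check=True)
```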

Answer [Y] to every prompt and the download proceeds.

Folder structure

$ tree OID
>>>>

OID
|-- Dataset
|   |-- test
|   |   `-- Knife
|   |       |-- ~.jpg  (Knife images)
|   |       `-- Label
|   |           `-- ~.txt  (bounding-box label text)
|   |-- train
|   |   `-- Knife
|   |       |-- ~.jpg  (Knife images)
|   |       `-- Label
|   |           `-- ~.txt  (bounding-box label text)
|   `-- validation
|       `-- Knife
|           |-- ~.jpg  (Knife images)
|           `-- Label
|               `-- ~.txt  (bounding-box label text)
`-- csv_folder
    |-- class-descriptions-boxable.csv
    |-- test-annotations-bbox.csv
    |-- train-annotations-bbox.csv
    `-- validation-annotations-bbox.csv

Dataset details and contents

# Contents of a txt file in OIDv4_ToolKit/OID/Dataset/train/Knife/Label
# (validation and test use the same format)

$ cat 870eb1cdddbcce5a.txt
Knife 24.320256 23.04 767.360256 849.92

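# Number of label files in OIDv4_ToolKit/OID/Dataset/train/Knife/Label
# (ls -l also prints a "total" line, so 611 means 610 label files)
$ ls -l | wc -l
611

# Number of images in OIDv4_ToolKit/OID/Dataset/train/Knife
# (the Label directory is counted too, so 611 means 610 images)
$ ls | wc -l
611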

# Number of images and labels in OIDv4_ToolKit/OID/Dataset/test/Knife
161
# Number of images and labels in OIDv4_ToolKit/OID/Dataset/validation/Knife
56
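
Counting with `ls | wc -l` is easy to get off by one, because the `Label` directory and the `total` line from `ls -l` are counted as well. Here is a small sketch that counts only the files themselves, assuming the folder layout shown earlier. `count_images_and_labels` is a hypothetical helper.

```python
from pathlib import Path


def count_images_and_labels(class_dir):
    """Return (number of .jpg files, number of .txt label files) for one
    class folder such as OID/Dataset/train/Knife."""
    class_dir = Path(class_dir)
    n_images = sum(1 for _ in class_dir.glob("*.jpg"))
    n_labels = sum(1 for _ in (class_dir / "Label").glob("*.txt"))
    return n_images, n_labels


if __name__ == "__main__":
    for split in ("train", "validation", "test"):
        folder = Path("OID/Dataset") / split / "Knife"
        if folder.is_dir():  # skip splits that were not downloaded
            print(split, count_images_and_labels(folder))
```

If the two numbers differ for a split, some image is missing its label file (or the other way round), which is worth checking before training.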

# Contents of the files in csv_folder
$ cat class-descriptions-boxable.csv

~~
/m/0pcr,Alpaca
/m/0pg52,Taxi
/m/0ph39,Canoe
/m/0qjjc,Remote control
/m/0qmmr,Wheelchair
/m/0wdt60w,Rugby ball
/m/0xfy,Armadillo
/m/0xzly,Maracas
/m/0zvk5,Helmet

$ cat test-annotations-bbox.csv
>>>
fffc6543b32da1dd,freeform,/m/0jbk,1,0.013794,0.999996,0.388438,0.727906,0,0,1,0,0
fffd0258c243bbea,freeform,/m/01g317,1,0.000120,0.999896,0.000000,1.000000,1,0,1,0,0

$ cat validation-annotations-bbox.csv

>>>
ffff21932da3ed01,freeform,/m/0c9ph5,1,0.540223,0.624863,0.493633,0.577892,1,0,1,0,0
ffff21932da3ed01,freeform,/m/0cgh4,1,0.002521,1.000000,0.000000,0.998685,0,0,0,0,1
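
The columns of the bbox CSV files are not labeled in the output above. Below is a sketch of parsing one row, assuming the standard Open Images column order (ImageID, Source, LabelName, Confidence, XMin, XMax, YMin, YMax, IsOccluded, IsTruncated, IsGroupOf, IsDepiction, IsInside) and coordinates normalized to 0..1. The image size passed in is made up for illustration.

```python
import csv
import io

# Assumed column order of the *-annotations-bbox.csv files.
COLUMNS = ["ImageID", "Source", "LabelName", "Confidence",
           "XMin", "XMax", "YMin", "YMax",
           "IsOccluded", "IsTruncated", "IsGroupOf", "IsDepiction", "IsInside"]


def parse_bbox_row(row):
    """Turn one CSV row (a list of strings) into a dict with typed values."""
    rec = dict(zip(COLUMNS, row))
    for key in ("Confidence", "XMin", "XMax", "YMin", "YMax"):
        rec[key] = float(rec[key])
    for key in COLUMNS[8:]:
        rec[key] = int(rec[key])
    return rec


def to_pixel_box(rec, img_w, img_h):
    """Scale normalized coordinates to pixels: (left, top, right, bottom)."""
    return (rec["XMin"] * img_w, rec["YMin"] * img_h,
            rec["XMax"] * img_w, rec["YMax"] * img_h)


line = "ffff21932da3ed01,freeform,/m/0cgh4,1,0.002521,1.000000,0.000000,0.998685,0,0,0,0,1"
rec = parse_bbox_row(next(csv.reader(io.StringIO(line))))
# 1024x768 is a hypothetical image size, not taken from the dataset.
print(rec["LabelName"], rec["IsInside"], to_pixel_box(rec, img_w=1024, img_h=768))
```

The pixel-unit values in the Label txt files shown earlier look like the result of this kind of scaling, although I have not checked the toolkit's source to confirm it.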

Knife image data

[Image: sample Knife image]

Loading the data into YOLOv4

Put the Knife folder inside the data folder.

Then create the text file for YOLOv4.

import os

classes = ['Knife']
# Map each class name to an integer id, e.g. {'Knife': 0}
classes_dicts = {key: idx for idx, key in enumerate(classes)}

def main(label_path, jpg_path_name, save_filetxt_name):
    """Write one line per image: <image path> x_min,y_min,x_max,y_max,class_id ..."""
    with open(save_filetxt_name, 'w') as f:
        for path in os.listdir(label_path):
            if not path.endswith('.txt'):
                continue
            # Swap only the extension, not every 'txt' inside the name
            filename = os.path.splitext(path)[0] + '.jpg'
            f.write(os.path.join(jpg_path_name, filename))

            with open(os.path.join(label_path, path), 'r', encoding='utf-8') as loadf:
                for line in loadf:
                    if not line.strip():
                        continue
                    # Split from the right so class names containing spaces still parse
                    cls, x_min, y_min, x_max, y_max = line.strip().rsplit(" ", 4)
                    x_min, y_min, x_max, y_max = (
                        int(float(v)) for v in (x_min, y_min, x_max, y_max))
                    box_info = " %d,%d,%d,%d,%d" % (
                        x_min, y_min, x_max, y_max, classes_dicts[cls])
                    f.write(box_info)
            f.write('\n')

if __name__ == '__main__':
    data_type = 'test'
    assert data_type in ['train', 'validation', 'test'], 'data_type must be one of [train, validation, test]'
    jpg_path_name = 'data/Knife/Dataset/{}/Knife'.format(data_type)
    save_filetxt_name = 'data/pytorch_yolov4_{}.txt'.format(data_type)
    label_path = 'data/Knife/Dataset/{}/Knife/Label'.format(data_type)
    main(label_path, jpg_path_name, save_filetxt_name)

Let's open the resulting pytorch_yolov4_validation.txt (generated by running the script above with data_type='validation').

# Loader function
data_type = 'validation'
save_filetxt_name = 'data/pytorch_yolov4_{}.txt'.format(data_type)
label_path = save_filetxt_name

def open_txtfile(label_path):
    """Parse the annotation file into {image_path: [[x_min, y_min, x_max, y_max, class_id], ...]}."""
    truth = {}
    with open(label_path, 'r', encoding='utf-8') as f:
        for line in f:
            data = line.strip().split(" ")
            if not data[0]:
                continue
            truth[data[0]] = [[int(float(j)) for j in i.split(',')] for i in data[1:]]
    return truth

truth = open_txtfile(label_path)
print(truth)

>>>

data/Knife/Dataset/validation/Knife/2497ac78d31d89d5.jpg 15,166,942,489,0
data/Knife/Dataset/validation/Knife/09a9a9d1fe0a592a.jpg 55,313,333,1024,0
data/Knife/Dataset/validation/Knife/f2a2a1a0095f5d79.jpg 108,481,1024,636,0
data/Knife/Dataset/validation/Knife/4b6d3c391753e5ce.jpg 225,59,372,219,0 539,242,1024,292,0 611,478,1024,720,0 776,179,1024,244,0
data/Knife/Dataset/validation/Knife/4b1fc77d58646a7e.jpg 65,66,983,744,0
〜〜〜

This matches the format shown in the README.md on GitHub, so the annotation txt file for YOLOv4 is ready.
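
Before training, a quick sanity check on the generated file can catch malformed lines early. Here is a minimal sketch, assuming the `path x_min,y_min,x_max,y_max,class_id ...` layout shown above and image paths without spaces. `check_annotation_line` is a hypothetical helper.

```python
def check_annotation_line(line, num_classes=1):
    """Return (image_path, boxes, problems) for one annotation line."""
    parts = line.split()
    image_path, boxes, problems = parts[0], [], []
    for token in parts[1:]:
        values = [int(v) for v in token.split(",")]
        if len(values) != 5:
            problems.append("expected 5 values: %s" % token)
            continue
        x_min, y_min, x_max, y_max, cls = values
        if x_min >= x_max or y_min >= y_max:
            problems.append("empty or inverted box: %s" % token)
        if not 0 <= cls < num_classes:
            problems.append("class id out of range: %s" % token)
        boxes.append(values)
    return image_path, boxes, problems


sample = ("data/Knife/Dataset/validation/Knife/4b6d3c391753e5ce.jpg "
          "225,59,372,219,0 539,242,1024,292,0 611,478,1024,720,0 776,179,1024,244,0")
path, boxes, problems = check_annotation_line(sample)
print(len(boxes), problems)  # 4 []
```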

Changes made to the YOLOv4 files

To load the data, I changed the following points.

In the Yolo_dataset class in dataset.py, I removed the os.path.join that was applied when loading an image.

# dataset.py
class Yolo_dataset(Dataset):
〜〜
    def __getitem__(self, index):
        if not self.train:
            return self._get_val_item(index)
        img_path = self.imgs[index]
        # np.float was removed in NumPy 1.24; the builtin float is equivalent
        bboxes = np.array(self.truth.get(img_path), dtype=float)
        img_path = img_path  # os.path.join removed here
        use_mixup = self.cfg.mixup
        if random.randint(0, 1):
            use_mixup = 0

        for i in range(use_mixup + 1):
            if i != 0:
                img_path = random.choice(list(self.truth.keys()))
                bboxes = np.array(self.truth.get(img_path), dtype=float)
                img_path = img_path  # os.path.join removed here
            img = cv2.imread(img_path)
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
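
Dropping `os.path.join` presumably works because the annotation file already stores paths such as `data/Knife/Dataset/...`, so prefixing a dataset directory again would point at a file that does not exist. An alternative sketch that keeps both behaviours is shown here. `resolve_image_path` and `dataset_dir` are hypothetical names used for illustration, not part of the YOLOv4 code.

```python
import os


def resolve_image_path(img_path, dataset_dir=None):
    """Return img_path unchanged when it is absolute or already contains a
    directory part; otherwise join it onto dataset_dir."""
    if dataset_dir is None or os.path.isabs(img_path) or os.path.dirname(img_path):
        return img_path
    return os.path.join(dataset_dir, img_path)


# A path that already includes its directories is left alone.
print(resolve_image_path("data/Knife/Dataset/train/Knife/870eb1cdddbcce5a.jpg", "data/images"))
# A bare file name gets the dataset directory prepended.
print(resolve_image_path("870eb1cdddbcce5a.jpg", "data/images"))
```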

To control memory usage, I added a num_worker argument to train() in train.py.

def train(model, device, config, epochs=5, batch_size=1, save_cp=True, num_worker = 0, log_step=20, img_scale=0.5):
    train_dataset = Yolo_dataset(config.train_label, config, train=True)
    val_dataset = Yolo_dataset(config.val_label, config, train=False)
    〜〜〜〜
    writer.close()

# train.py => run
# Adjust memory usage with num_worker

try:
    train(model=model,
          device=device,
          config=cfg,
          epochs=cfg.TRAIN_EPOCHS,
          num_worker=0)
except KeyboardInterrupt:
    torch.save(model.state_dict(), 'INTERRUPTED.pth')
    logging.info('Saved interrupt')
    try:
        sys.exit(0)
    except SystemExit:
        os._exit(0)

>>>>>

2021-05-11 07:57:09,935 <ipython-input-3-6cfc1c1d5a28>[line:36] INFO: Starting training:
        Epochs:          300
        Batch size:      64
        Subdivisions:    16
        Learning rate:   0.001
        Training size:   610
        Validation size: 56
        Checkpoints:     True
        Device:          cpu
        Images size:     608
        Optimizer:       adam
        Dataset classes: 1
        Train label path:data/pytorch_yolov4_train.txt
        Pretrained:
    
Epoch 1/300:   0%|       | 0/610 [00:07<?, ?img/s]