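This post covers how to download data from the Google Open Images Dataset for use with YOLOv4.
The Google Open Images Dataset provides good-quality data for tasks ranging from object detection to segmentation, and it exists in versions v1 through v6.
Downloading it directly is fairly tedious (working out how to do it is a pain).
So this is a memo on how to download object detection data for one specific class.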
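Downloading the data
I use OIDv4_ToolKit.
- It can download the images and annotation data of just the classes you choose from Open Images Dataset V4
- The annotation data covers object detection only. Segmentation is not supported.
- Bounding boxes come as .txt files in the format [name_of_the_class, left, top, right, bottom], so a conversion may be needed depending on your use case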
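As an illustration of the kind of conversion mentioned in the last bullet, here is a minimal sketch that turns pixel corner coordinates into the normalized (center x, center y, width, height) form that some YOLO tools expect. The function name `to_yolo_format` and the image size arguments are assumptions made for this example; the toolkit itself does not provide them, and the annotation file built later in this post keeps corner coordinates instead.

```python
def to_yolo_format(left, top, right, bottom, img_w, img_h):
    """Convert pixel corner coordinates to normalized (cx, cy, w, h).

    img_w and img_h are the image width and height in pixels; they must be
    read from the image itself because the label file does not store them.
    """
    cx = (left + right) / 2.0 / img_w
    cy = (top + bottom) / 2.0 / img_h
    w = (right - left) / img_w
    h = (bottom - top) / img_h
    return cx, cy, w, h


# Hypothetical box on a hypothetical 1000x800 image.
converted = to_yolo_format(100, 200, 300, 600, img_w=1000, img_h=800)
print(converted)
```

Whether this step is needed at all depends on the training code you feed the data to.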
$ git clone https://github.com/EscVM/OIDv4_ToolKit.git
$ cd OIDv4_ToolKit
$ pip3 install -r requirements.txt
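This time I download class == Knife.
Class names can be looked up from the search box on the Google Open Images Dataset site.
With --type_csv you can choose from [train/validation/test/all].
I want everything, so I specify all.
The other arguments can be given as described on GitHub: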
・IsOccluded: Indicates that the object is occluded by another object in the image.
・IsTruncated: Indicates that the object extends beyond the boundary of the image.
・IsGroupOf: Indicates that the box spans a group of objects (e.g., a bed of flowers or a crowd of people). We asked annotators to use this tag for cases with more than 5 instances which are heavily occluding each other and are physically touching.
・IsDepiction: Indicates that the object is a depiction (e.g., a cartoon or drawing of the object, not a real physical instance).
・IsInside: Indicates a picture taken from the inside of the object (e.g., a car interior or inside of a building).
・n_threads: Select how many threads you want to use. The ToolKit will take care for you to download multiple images in parallel, considerably speeding up the downloading process.
・limit: Limit the number of images being downloaded. Useful if you want to restrict the size of your dataset.
・y: Answer yes when prompted to download missing csv files.
$ python3 main.py downloader --classes Knife --type_csv all
Answer [Y] to every prompt and the download runs.
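If you want to script the download, for example with a limit on the number of images, the command line can be assembled in Python. This is only a sketch: the helper `build_downloader_command` is made up for this example, and the flag spellings `--limit`, `--n_threads` and `-y` are assumed from the option names listed above, so check them against the toolkit's README before relying on them.

```python
def build_downloader_command(classes, type_csv="all", limit=None,
                             n_threads=None, assume_yes=False):
    """Build the argument list for OIDv4_ToolKit's downloader.

    Flag spellings other than --classes and --type_csv are assumptions
    based on the option names in the toolkit's documentation.
    """
    cmd = ["python3", "main.py", "downloader",
           "--classes", *classes, "--type_csv", type_csv]
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if n_threads is not None:
        cmd += ["--n_threads", str(n_threads)]
    if assume_yes:
        cmd.append("-y")
    return cmd


cmd = build_downloader_command(["Knife"], limit=100, n_threads=4, assume_yes=True)
print(" ".join(cmd))

# To actually run it, execute this from inside the OIDv4_ToolKit directory:
# import subprocess
# subprocess.run(cmd, check=True)
```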
Folder structure
$ tree OID
>>>>
OID
|-- Dataset
|   |-- test
|   |   `-- Knife
|   |       |-- 〜.jpg (Knife images)
|   |       `-- Label
|   |           `-- ~.txt (label text for the boxes)
|   |-- train
|   |   `-- Knife
|   |       |-- 〜.jpg (Knife images)
|   |       `-- Label
|   |           `-- ~.txt (label text for the boxes)
|   `-- validation
|       `-- Knife
|           |-- 〜.jpg (Knife images)
|           `-- Label
|               `-- ~.txt (label text for the boxes)
`-- csv_folder
    |-- class-descriptions-boxable.csv
    |-- test-annotations-bbox.csv
    |-- train-annotations-bbox.csv
    `-- validation-annotations-bbox.csv
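Data details and contents
# Contents of a txt file in OIDv4_ToolKit/OID/Dataset/train/Knife/Label
# (validation and test use the same format)
$ cat 870eb1cdddbcce5a.txt
Knife 24.320256 23.04 767.360256 849.92

# Number of entries in OIDv4_ToolKit/OID/Dataset/train/Knife/Label
# (ls -l also prints a "total" line, so this means 610 label files)
$ ls -l | wc -l
611

# Number of entries in OIDv4_ToolKit/OID/Dataset/train/Knife
# (the count includes the Label directory, so this means 610 images)
$ ls | wc -l
611

# Number of images and labels in OIDv4_ToolKit/OID/Dataset/test/Knife
161

# Number of images and labels in OIDv4_ToolKit/OID/Dataset/validation/Knife
56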
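Counting with `ls | wc -l` is easy to get off by one, because the `Label` directory and the `total` line of `ls -l` are counted too. A small Python sketch that counts only `.jpg` and `.txt` files avoids that. The function name `count_images_and_labels` is just for this example, and the paths assume the folder layout shown above.

```python
import os


def count_images_and_labels(class_dir):
    """Return (number of .jpg images, number of .txt labels) for one class folder."""
    label_dir = os.path.join(class_dir, "Label")
    n_images = sum(1 for name in os.listdir(class_dir)
                   if name.lower().endswith(".jpg"))
    n_labels = sum(1 for name in os.listdir(label_dir)
                   if name.lower().endswith(".txt"))
    return n_images, n_labels


if __name__ == "__main__":
    # Run from inside OIDv4_ToolKit; splits that do not exist are skipped.
    for split in ("train", "validation", "test"):
        class_dir = os.path.join("OID", "Dataset", split, "Knife")
        if os.path.isdir(class_dir):
            print(split, count_images_and_labels(class_dir))
```

If the two numbers differ for a split, some image is missing its label file or the other way around.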
# Contents of the files in csv_folder
$ cat class-descriptions-boxable.csv
~~
/m/0pcr,Alpaca
/m/0pg52,Taxi
/m/0ph39,Canoe
/m/0qjjc,Remote control
/m/0qmmr,Wheelchair
/m/0wdt60w,Rugby ball
/m/0xfy,Armadillo
/m/0xzly,Maracas
/m/0zvk5,Helmet

$ cat test-annotations-bbox.csv
>>>
fffc6543b32da1dd,freeform,/m/0jbk,1,0.013794,0.999996,0.388438,0.727906,0,0,1,0,0
fffd0258c243bbea,freeform,/m/01g317,1,0.000120,0.999896,0.000000,1.000000,1,0,1,0,0

$ cat validation-annotations-bbox.csv
>>>
ffff21932da3ed01,freeform,/m/0c9ph5,1,0.540223,0.624863,0.493633,0.577892,1,0,1,0,0
ffff21932da3ed01,freeform,/m/0cgh4,1,0.002521,1.000000,0.000000,0.998685,0,0,0,0,1
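To make these CSV rows easier to read, here is a rough sketch that maps a label ID to its class name and scales the normalized box of one annotation row to pixels. The column order (ImageID, Source, LabelName, Confidence, XMin, XMax, YMin, YMax, then the five Is* flags) follows the Open Images bounding-box CSV layout. The helper names and the 1024x768 image size are assumptions for illustration only.

```python
import csv
import io

BBOX_COLUMNS = ["ImageID", "Source", "LabelName", "Confidence",
                "XMin", "XMax", "YMin", "YMax",
                "IsOccluded", "IsTruncated", "IsGroupOf", "IsDepiction", "IsInside"]


def load_class_names(fileobj):
    """Read class-descriptions-boxable.csv (no header) into {label_id: name}."""
    return {label_id: name for label_id, name in csv.reader(fileobj)}


def parse_bbox_row(row, img_w, img_h):
    """Turn one annotation row into pixel coordinates plus boolean flags."""
    rec = dict(zip(BBOX_COLUMNS, row))
    return {
        "image_id": rec["ImageID"],
        "label": rec["LabelName"],
        "left": float(rec["XMin"]) * img_w,
        "top": float(rec["YMin"]) * img_h,
        "right": float(rec["XMax"]) * img_w,
        "bottom": float(rec["YMax"]) * img_h,
        "is_occluded": rec["IsOccluded"] == "1",
        "is_inside": rec["IsInside"] == "1",
    }


# Sample lines copied from the files shown above.
names = load_class_names(io.StringIO("/m/0ph39,Canoe\n/m/0zvk5,Helmet\n"))
row = next(csv.reader(io.StringIO(
    "ffff21932da3ed01,freeform,/m/0cgh4,1,0.002521,1.000000,0.000000,0.998685,0,0,0,0,1")))
box = parse_bbox_row(row, img_w=1024, img_h=768)  # image size is hypothetical
print(names["/m/0zvk5"], box)
```

Note the column order: XMin and XMax come before YMin and YMax, which differs from the left, top, right, bottom order of the toolkit's label files.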
Knife image data
Loading the data into YOLOv4
Put the Knife folder into the data folder, then create the text file for YOLOv4.
import os

classes = ['Knife']
classes_dicts = {key: idx for idx, key in enumerate(classes)}


def main(label_path, jpg_path_name, save_filetxt_name):
    with open(save_filetxt_name, 'w') as f:
        for path in os.listdir(label_path):
            # The label file <id>.txt belongs to the image <id>.jpg
            filename = os.path.splitext(path)[0] + '.jpg'
            f.write(os.path.join(jpg_path_name, filename))
            with open(os.path.join(label_path, path), 'r', encoding='utf-8') as loadf:
                for line in loadf:
                    # Each line is: class x_min y_min x_max y_max
                    cls, x_min, y_min, x_max, y_max = line.split()
                    x_min, y_min, x_max, y_max = int(float(x_min)), int(float(y_min)), int(float(x_max)), int(float(y_max))
                    cls = classes_dicts[cls]
                    # Append " x_min,y_min,x_max,y_max,class_id" for every box
                    box_info = " %d,%d,%d,%d,%d" % (x_min, y_min, x_max, y_max, int(cls))
                    f.write(box_info)
            # One output line per image
            f.write('\n')


if __name__ == '__main__':
    data_type = 'test'
    assert data_type in ['train', 'validation', 'test'], 'choose one of [train, validation, test]'
    jpg_path_name = 'data/Knife/Dataset/{}/Knife'.format(data_type)
    save_filetxt_name = 'data/pytorch_yolov4_{}.txt'.format(data_type)
    label_path = 'data/Knife/Dataset/{}/Knife/Label'.format(data_type)
    main(label_path, jpg_path_name, save_filetxt_name)
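One thing to watch out for: splitting a label line on spaces works for Knife, but it would break for class names that contain a space, such as "Remote control" or "Rugby ball" in the class list above. A minimal sketch of a more tolerant parser, assuming such names are written with the space intact in the label files (I have not checked this), splits from the right instead. `parse_label_line` is a name made up for this example.

```python
def parse_label_line(line):
    """Split 'class name x_min y_min x_max y_max' into its five fields.

    Splitting from the right keeps any spaces inside the class name.
    """
    cls, x_min, y_min, x_max, y_max = line.strip().rsplit(maxsplit=4)
    return cls, float(x_min), float(y_min), float(x_max), float(y_max)


single = parse_label_line("Knife 24.320256 23.04 767.360256 849.92\n")
multi = parse_label_line("Remote control 10.0 20.0 30.5 40.0\n")  # made-up values
print(single)
print(multi)
```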
Let's open the resulting "pytorch_yolov4_validation.txt".
# Loader function
data_type = 'validation'
label_path = 'data/pytorch_yolov4_{}.txt'.format(data_type)


def open_txtfile(label_path):
    truth = {}
    with open(label_path, 'r', encoding='utf-8') as f:
        for line in f:
            data = line.split()
            if not data:
                continue  # skip blank lines
            # data[0] is the image path, the rest are "x_min,y_min,x_max,y_max,class_id"
            truth[data[0]] = []
            for i in data[1:]:
                truth[data[0]].append([int(float(j)) for j in i.split(',')])
    print(truth)


open_txtfile(label_path)
>>>
data/Knife/Dataset/validation/Knife/2497ac78d31d89d5.jpg 15,166,942,489,0
data/Knife/Dataset/validation/Knife/09a9a9d1fe0a592a.jpg 55,313,333,1024,0
data/Knife/Dataset/validation/Knife/f2a2a1a0095f5d79.jpg 108,481,1024,636,0
data/Knife/Dataset/validation/Knife/4b6d3c391753e5ce.jpg 225,59,372,219,0 539,242,1024,292,0 611,478,1024,720,0 776,179,1024,244,0
data/Knife/Dataset/validation/Knife/4b1fc77d58646a7e.jpg 65,66,983,744,0
〜〜〜
This more or less matches the format described in the GitHub README.md, so the annotation txt file for YOLOv4 is ready.
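Before training it can be worth running a quick sanity check on the generated file. The sketch below assumes the line format seen in this post (an image path followed by space-separated `x_min,y_min,x_max,y_max,class_id` items) and flags boxes that collapsed to zero size after the `int()` truncation, as well as class IDs out of range. The function names are made up for this example, and paths containing spaces are not handled.

```python
def parse_annotation_line(line):
    """Split one annotation line into (image_path, list of box tuples)."""
    parts = line.split()
    boxes = [tuple(int(v) for v in item.split(",")) for item in parts[1:]]
    return parts[0], boxes


def find_problems(lines, n_classes):
    """Return (line number, image path, message) for every suspicious entry."""
    problems = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        path, boxes = parse_annotation_line(line)
        if not boxes:
            problems.append((lineno, path, "no boxes"))
        for box in boxes:
            if len(box) != 5:
                problems.append((lineno, path, "expected 5 values per box"))
                continue
            x_min, y_min, x_max, y_max, cls = box
            if x_min >= x_max or y_min >= y_max:
                problems.append((lineno, path, "zero-size or inverted box"))
            if not 0 <= cls < n_classes:
                problems.append((lineno, path, "class id out of range"))
    return problems


# Two lines taken from the validation file in this post.
sample = [
    "data/Knife/Dataset/validation/Knife/2497ac78d31d89d5.jpg 15,166,942,489,0\n",
    "data/Knife/Dataset/validation/Knife/4b6d3c391753e5ce.jpg 225,59,372,219,0 539,242,1024,292,0\n",
]
problems = find_problems(sample, n_classes=1)
print(problems)
```

To check a real file, pass the open file object instead of the sample list, for example `find_problems(open('data/pytorch_yolov4_train.txt'), n_classes=1)`.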
Changes made to the YOLOv4 files
To get the data loaded, I changed the following points.
- In dataset.py, removed the os.path.join that the Yolo_dataset class applied when loading an image.
# dataset.py
class Yolo_dataset(Dataset):
    〜〜
    def __getitem__(self, index):
        if not self.train:
            return self._get_val_item(index)
        img_path = self.imgs[index]
        bboxes = np.array(self.truth.get(img_path), dtype=np.float)
        img_path = img_path  # os.path.join removed here
        use_mixup = self.cfg.mixup
        if random.randint(0, 1):
            use_mixup = 0
        for i in range(use_mixup + 1):
            if i != 0:
                img_path = random.choice(list(self.truth.keys()))
                bboxes = np.array(self.truth.get(img_path), dtype=np.float)
                img_path = img_path  # os.path.join removed here
            img = cv2.imread(img_path)
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
- Added a num_worker argument to train() in train.py so that memory usage can be adjusted
def train(model, device, config, epochs=5, batch_size=1, save_cp=True, num_worker=0, log_step=20, img_scale=0.5):
    train_dataset = Yolo_dataset(config.train_label, config, train=True)
    val_dataset = Yolo_dataset(config.val_label, config, train=False)
    〜〜〜〜
    writer.close()
# train.py => run
# adjust memory usage with num_worker
try:
    train(model=model, device=device, config=cfg, epochs=cfg.TRAIN_EPOCHS, num_worker=0)
except KeyboardInterrupt:
    torch.save(model.state_dict(), 'INTERRUPTED.pth')
    logging.info('Saved interrupt')
    try:
        sys.exit(0)
    except SystemExit:
        os._exit(0)
>>>>>
2021-05-11 07:57:09,935 <ipython-input-3-6cfc1c1d5a28>[line:36] INFO: Starting training:
    Epochs: 300
    Batch size: 64
    Subdivisions: 16
    Learning rate: 0.001
    Training size: 610
    Validation size: 56
    Checkpoints: True
    Device: cpu
    Images size: 608
    Optimizer: adam
    Dataset classes: 1
    Train label path:data/pytorch_yolov4_train.txt
    Pretrained:
Epoch 1/300: 0%| | 0/610 [00:07<?, ?img/s]
It ran without problems.