本文主要是介绍yolov4训练自己的数据集,基于darknet框架,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!
目录
一:安装darknet
二:首先以VOC的数据格式准备好自己的数据
三:制作darknet需要的label以及txt文件。
四:准备data文件
五:准备names文件
六:修改cfg文件
七:开始训练
八:单张图片测试
一:安装darknet
git clone https://github.com/AlexeyAB/darknet/
修改makefile里面的值,
GPU=1
CUDNN=1
CUDNN_HALF=1
OPENCV=1
OPENMP=1
LIBSO=1
DEBUG=1
然后进行编译
cd darknet
make
二:首先以VOC的数据格式准备好自己的数据
VOCdevkit
VOC2007
Annotations
ImageSets
Main
JPEGImages
其中Annotations里面存放的是xml文件,ImagesSets下面的Main文件夹里面存的是VOC数据格式里面的txt,JPEGImages里面是所有的图片。具体将自己的数据生成VOC格式的方式见如下博客:caffe检测网络训练及测试步骤_RefineDet_陈 洪 伟的博客-CSDN博客
三:制作darknet需要的label以及txt文件。
上面把自己的数据按照VOC的数据格式准备好之后,接下来用脚本生成darknet需要的标注文件以及train.txt val.txt等文件;在darknet里面找到build/darknet/x64/data/voc/voc_label.py。
# -*- coding: UTF-8 -*-import xml.etree.ElementTree as ET
import pickle
import os
from os import listdir, getcwd
from os.path import joinsets=[('2007', 'train'), ('2007', 'val'), ('2007', 'test')]classes = ["aaaaaa", "bbbbbb", "cccccc", "dddddd", "eeeeee", "ffffff", "gggggg", "hhhhhh", "iiiiii", "jjjjjj"]def convert(size, box): #size是整张照片的w,h; box里面是xmin, xmax, ymin, ymax.dw = 1./size[0] #1/wdh = 1./size[1] #1/hx = (box[0] + box[1])/2.0 #框的中心点的x坐标。y = (box[2] + box[3])/2.0 #框的中心点的y坐标。w = box[1] - box[0] #框的w,h = box[3] - box[2] #框的h。x = x*dww = w*dwy = y*dhh = h*dhreturn (x,y,w,h)def convert_annotation(year, image_id):in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id)) #根据图片名字找到xml文件,out_file = open('VOCdevkit/VOC%s/labels/%s.txt'%(year, image_id), 'w') #根据xml文件生成darknet需要的txt标注文件。tree=ET.parse(in_file)root = tree.getroot()size = root.find('size')w = int(size.find('width').text)h = int(size.find('height').text)for obj in root.iter('object'):difficult = obj.find('difficult').textcls = obj.find('name').textif cls not in classes or int(difficult) == 1:continuecls_id = classes.index(cls)xmlbox = obj.find('bndbox')b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))bb = convert((w,h), b)out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n') #生成darknet需要的label文件。wd = getcwd() #获取当前工作目录。for year, image_set in sets:if not os.path.exists('VOCdevkit/VOC%s/labels/'%(year)):os.makedirs('VOCdevkit/VOC%s/labels/'%(year))##.read()是把文件所有的内容全都保存到一个大的字符串中,其中包含换行符\n,.strip()是删除字符串头尾指定的字符,传入参数为空默认删除像\n \t \r这种,split是根据字符对字符串进行切割,不传入参数也是默认根据\n 空格等进行切割。image_ids里面是图片的名字,不包含路径,也不包含.jpg后缀。image_ids = open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt'%(year, image_set)).read().strip().split() list_file = open('%s_%s.txt'%(year, image_set), 'w') #生成darknet需要的txt文件。for image_id in image_ids:list_file.write('%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg\n'%(wd, year, image_id))convert_annotation(year, image_id)list_file.close()
我们需要把这个脚本进行修改,先把地7行的关于2012的去掉,只保留2007的,然后把第9行改成自己的类别,修改完成之后执行该脚本,该脚本会生成2007_train.txt,2007_test.txt,还会在VOCdevkit/VOC2007/labels文件夹下面生成标注文件。其中标注文件的格式如下:
1 0.55546875 0.265972222222 0.1140625 0.2125
其中第一个表示类别的编号,后面分别是坐标框中心点的坐标x,y,以及坐标框的宽高,但是这里的坐标以及宽高都是除以图像的宽高之后得到的值。
四:准备data文件
classes= 10
train = /data/chw/darknet_yolov4/2007_train.txt
valid = /data/chw/darknet_yolov4/2007_val.txt
names = ./cfg/yolov4_chw.names
backup = /data/chw/darknet_yolov4/models
第一行表示类别数,第二行表示训练图片所在路径,第三行表示验证图片所在路径,第四行表示names文件路径,最后一行是模型文件保存路径。
五:准备names文件
aaaaaa
bbbbbb
cccccc
dddddd
eeeeee
ffffff
gggggg
hhhhhh
iiiiii
jjjjjj
这里面就是类别名,每一行一个类别。
六:修改cfg文件
cfg文件要根据自己的类别数量进行修改:
1.首先将width和height修改为416.
2.max_batches一般修改为2000*类别数,然后step就是max_batches的80%和90%.
如果没用预训练模型,那么这里的max_batches如果是2000*类别数有点小,这样训练出来的模型效果不够好,可以设置成15000*类别数。
3.每个yololayer里面的classes修改为自己的数量,
4.每个yololayer前面的conv里面的filter修改为(num_classes + 5)*3.
注意上面3,4步是要修改三个地方,因为有三个yolo头。
[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=32
width=416
height=416
channels=3
momentum=0.949
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1learning_rate=0.001305
burn_in=2000
max_batches = 20000
policy=steps
steps=16000,18000
scales=.1,.1#cutmix=1
mosaic=1#:104x104 54:52x52 85:26x26 104:13x13 for 416[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=mish# Downsample[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish[route]
layers = -2[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish[route]
layers = -1,-7[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish# Downsample[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish[route]
layers = -2[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=mish[route]
layers = -1,-10[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish# Downsample[convolutional]
batch_normalize=1
filters=256
size=3
stride=2
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish[route]
layers = -2[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=mish[route]
layers = -1,-28[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish# Downsample[convolutional]
batch_normalize=1
filters=512
size=3
stride=2
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish[route]
layers = -2[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=mish[route]
layers = -1,-28[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish# Downsample[convolutional]
batch_normalize=1
filters=1024
size=3
stride=2
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish[route]
layers = -2[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish [convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=mish[shortcut]
from=-3
activation=linear[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=mish[route]
layers = -1,-16[convolutional]
batch_normalize=1
filters=1024
size=1
stride=1
pad=1
activation=mish##########################[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky### SPP ###
[maxpool]
stride=1
size=5[route]
layers=-2[maxpool]
stride=1
size=9[route]
layers=-4[maxpool]
stride=1
size=13[route]
layers=-1,-3,-5,-6
### End SPP ###[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky[upsample]
stride=2[route]
layers = 85[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky[route]
layers = -1, -3[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky[upsample]
stride=2[route]
layers = 54[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky[route]
layers = -1, -3[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky##########################[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=256
activation=leaky[convolutional]
size=1
stride=1
pad=1
filters=45
activation=linear[yolo]
mask = 0,1,2
anchors = 12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401
classes=10
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
scale_x_y = 1.2
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6[route]
layers = -4[convolutional]
batch_normalize=1
size=3
stride=2
pad=1
filters=256
activation=leaky[route]
layers = -1, -16[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=leaky[convolutional]
size=1
stride=1
pad=1
filters=45
activation=linear[yolo]
mask = 3,4,5
anchors = 12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401
classes=10
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
scale_x_y = 1.1
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6[route]
layers = -4[convolutional]
batch_normalize=1
size=3
stride=2
pad=1
filters=512
activation=leaky[route]
layers = -1, -37[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=leaky[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky[convolutional]
size=1
stride=1
pad=1
filters=45
activation=linear[yolo]
mask = 6,7,8
anchors = 12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401
classes=10
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
scale_x_y = 1.05
iou_thresh=0.213
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
nms_kind=greedynms
beta_nms=0.6
七:开始训练
./darknet detector train cfg/yolov4_chw.data cfg/yolov4_chw.cfg -dont_show -gpus 0,1
先利用上面的命令跑一下,看下有没有错误,没有错误再用下面的命令改为后台运行。
nohup ./darknet detector train cfg/yolov4_chw.data cfg/yolov4_chw.cfg -dont_show -gpus 0,1 >&log.txt &
八:单张图片测试
./darknet detector test cfg/yolov4_chw.data cfg/yolov4_chw.cfg ../models/yolov4_chw_20000.weights ../3.jpg
这篇关于yolov4训练自己的数据集,基于darknet框架的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!