This article walks through fine-tuning a Caffe model for image classification.
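Fine-tuning workflow:
1. Prepare the dataset (training, validation, and test sets);
2. Convert the data and generate the dataset's mean file;
3. Change the number of output classes and the name of the last layer, raise the learning rate of the last layer's parameters, and adjust the solver configuration;
4. Load the pretrained model's parameters and start training;
5. Pick images and test.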
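Preparing the dataset
Sort the images into their folders and record the ground-truth labels in the corresponding txt files. Split your dataset into training, validation, and test sets, and place the images of each set in its own folder. Then generate three txt files that list each set's images and their ground truth, as shown below (I am doing single-character recognition, so the classes are the digits 0-9):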
0000000.jpg 0
0000035.jpg 7
0000054.jpg 1
0000071.jpg 0
0000074.jpg 1
0000080.jpg 0
0000083.jpg 0
0000090.jpg 0
0000100.jpg 0
0000103.jpg 0
0000161.jpg 0
0000173.jpg 0
0000195.jpg 3
0000210.jpg 0
0000221.jpg 0
0000231.jpg 0
0000252.jpg 0
0000283.jpg 4
Each line contains two fields, the image name and its class, separated by a space.
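To make the list-file format concrete, here is a minimal sketch of a parser for such lines. The function name `parse_list_lines` is hypothetical and not part of Caffe; it only illustrates the "image name, space, label" layout described above.

```python
from typing import List, Tuple


def parse_list_lines(lines: List[str]) -> List[Tuple[str, int]]:
    """Parse 'image_name label' lines into (name, label) pairs."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last run of whitespace so the label is the final field.
        name, label = line.rsplit(None, 1)
        entries.append((name, int(label)))
    return entries


sample = ["0000000.jpg 0", "0000035.jpg 7", "0000054.jpg 1"]
entries = parse_list_lines(sample)
print(entries)
```

A check like this can be run over train.txt, valid.txt, and test.txt before conversion to catch malformed lines early.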
The code for splitting the dataset is as follows:
# -*- coding: utf-8 -*-
__author__ = 'XYZ'

import os
import random

DataPath = "E:\\XYZ\\digital number recognization\\watermeter_data\\singleCharacters\\"
fileListPath = "E:\\XYZ\\digital number recognization\\watermeter_data\\SingleCharactersFileList\\"
DataSavedPath = 'E:\\XYZ\\digital number recognization\\watermeter_data\\CharactersSavedPath\\'

trainval_percent = 0.8  # fraction of all images used for training + validation
train_percent = 0.8     # fraction of training + validation used for training

labels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

if not os.path.exists(fileListPath):
    os.makedirs(fileListPath)
if not os.path.exists(DataSavedPath):
    os.makedirs(DataSavedPath)

# Source files are named "<name>_<label>.jpg"; strip the extension.
names = [tmp.split(".")[0] for tmp in os.listdir(DataPath)]

totalSize = len(names)
trainval_names = random.sample(names, int(trainval_percent * totalSize))
train_names = random.sample(trainval_names, int(train_percent * len(trainval_names)))

# Sets make the membership tests below fast.
trainval_set = set(trainval_names)
train_set = set(train_names)
test_names = [tmp for tmp in names if tmp not in trainval_set]
valid_names = [tmp for tmp in trainval_names if tmp not in train_set]


def write_list(list_path, subset_names):
    """Write "<name>.jpg <label>" lines and move each image to DataSavedPath."""
    with open(list_path, 'w') as list_file:
        for tmp in subset_names:
            name = tmp.split("_")[0]
            label = tmp.split("_")[1]
            list_file.write(name + '.jpg ' + label + '\n')
            os.rename(DataPath + tmp + ".jpg", DataSavedPath + name + ".jpg")


write_list(fileListPath + "train.txt", train_names)
write_list(fileListPath + "test.txt", test_names)
write_list(fileListPath + "valid.txt", valid_names)
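As a rough illustration of what the two 0.8 ratios in the split script imply, the sketch below repeats the same two-stage sampling on made-up names. The `seed` argument and the function name `split_names` are assumptions added here for reproducibility; the original script does not seed the random generator.

```python
import random


def split_names(names, trainval_percent=0.8, train_percent=0.8, seed=0):
    """Two-stage random split: first train+val vs. test, then train vs. val."""
    rng = random.Random(seed)  # seeding is an assumption, not in the original script
    trainval = rng.sample(names, int(trainval_percent * len(names)))
    train = rng.sample(trainval, int(train_percent * len(trainval)))
    trainval_set, train_set = set(trainval), set(train)
    test = [n for n in names if n not in trainval_set]
    valid = [n for n in trainval if n not in train_set]
    return train, valid, test


# Hypothetical file stems in the "<name>_<label>" form the script expects.
names = ["%07d_%d" % (i, i % 10) for i in range(1000)]
train, valid, test = split_names(names)
print(len(train), len(valid), len(test))  # 640 160 200
```

With both ratios at 0.8, roughly 64% of the images end up in training, 16% in validation, and 20% in test.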
[Figure: directory structure of the resulting data]
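Data conversion and mean file generation
The data conversion in this step can be done with a script from the examples bundled with Caffe.
Create your own folder, watermeter, under caffe-root/examples/.
Copy caffe-root/examples/imagenet/create_imagenet.sh into caffe-root/examples/watermeter and rename it to create_watermeter.sh. Modify its contents as follows: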
#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
set -e

EXAMPLE=examples/watermeter # lmdb saved path
DATA=~/DataDir/watermeter_characters/ # image path
TOOLS=build/tools

TRAIN_DATA_ROOT=~/DataDir/watermeter_characters/train/
VAL_DATA_ROOT=~/DataDir/watermeter_characters/valid/
TEST_DATA_ROOT=~/DataDir/watermeter_characters/test/

# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=256
  RESIZE_WIDTH=256
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$VAL_DATA_ROOT" ]; then
  echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
  echo "Set the VAL_DATA_ROOT variable in create_imagenet.sh to the path" \
       "where the ImageNet validation data is stored."
  exit 1
fi

echo "Creating train lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/watermeter_train_lmdb # changed

echo "Creating val lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $VAL_DATA_ROOT \
    $DATA/valid.txt \
    $EXAMPLE/watermeter_val_lmdb

echo "Creating test lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TEST_DATA_ROOT \
    $DATA/test.txt \
    $EXAMPLE/watermeter_test_lmdb

echo "Done."
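From top to bottom, the variables mean the following: EXAMPLE is the path where the converted lmdb data is stored, DATA is the directory holding the raw data, and TOOLS is the directory containing the tools that perform the conversion, i.e. build/tools. The script is run from the Caffe root directory. The three ..._DATA_ROOT variables point to the training, validation, and test set directories. In other words, these are the datasets prepared in step one, and their annotation files are read from DATA.
Because the image mean is computed later, all images need to be resized to the same size; setting RESIZE to true does this.
Run the following from the Caffe root directory: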
./examples/watermeter/create_watermeter.sh
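After the script finishes, three folders are generated under the EXAMPLE directory: watermeter_train_lmdb, watermeter_val_lmdb, and watermeter_test_lmdb.
Next, generate the mean file. Machine learning algorithms generally subtract the mean from the data, and this mean file is used during network training. As before, copy caffe-root/examples/imagenet/make_imagenet_mean.sh into caffe-root/examples/watermeter and rename it to make_watermeter_mean.sh, with the following contents: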
#!/usr/bin/env sh
# Compute the mean image from the imagenet training lmdb
# N.B. this is available in data/ilsvrc12

EXAMPLE=~/CaffeDir/caffe/examples/watermeter
DATA=~/DataDir/watermeter_characters/
TOOLS=build/tools

$TOOLS/compute_image_mean $EXAMPLE/watermeter_train_lmdb \
  $DATA/watermeter_mean.binaryproto

echo "Done."
Run from the Caffe root directory:
./examples/watermeter/make_watermeter_mean.sh
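DATA specifies where the mean file is stored, and the script names it watermeter_mean.binaryproto. Running the script from the Caffe root directory produces this file.
Copy watermeter_mean.binaryproto to the caffe-root/examples/watermeter directory.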
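To illustrate why the images must share one size before the mean is computed, here is a small NumPy sketch of an element-wise mean image. It is not the implementation of compute_image_mean; the function name `mean_image` and the synthetic images are assumptions made for the example.

```python
import numpy as np


def mean_image(images):
    """Average a list of H x W x C arrays element-wise."""
    shapes = {img.shape for img in images}
    if len(shapes) != 1:
        # A per-pixel mean is only defined when every image has the same shape.
        raise ValueError("all images must share one shape, got %s" % sorted(shapes))
    return np.mean(np.stack(images).astype(np.float64), axis=0)


# Two synthetic 256x256 images with constant pixel values 10 and 30.
a = np.full((256, 256, 3), 10, dtype=np.uint8)
b = np.full((256, 256, 3), 30, dtype=np.uint8)
mean = mean_image([a, b])
print(mean.shape, mean[0, 0])
```

Passing images of different sizes raises an error here, which mirrors the reason RESIZE is set to true during conversion.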
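Modifying the network
This article fine-tunes with Caffe, using CaffeNet as the example. In Caffe, a network structure is ultimately defined by a .prototxt file. You can define the network in code, but a .prototxt file still has to be generated in the end to run it.
Create a character_classification folder under caffe-root/models, copy the three files deploy.prototxt, solver.prototxt, and train_val.prototxt from caffe-root/models/bvlc_reference_caffenet into it, and modify them.
deploy.prototxt is modified as follows: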
name: "CaffeNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param { engine: CAFFE num_output: 96 kernel_size: 11 stride: 4 }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  convolution_param { engine: CAFFE num_output: 256 pad: 2 kernel_size: 5 group: 2 }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  convolution_param { engine: CAFFE num_output: 384 pad: 1 kernel_size: 3 }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  convolution_param { engine: CAFFE num_output: 384 pad: 1 kernel_size: 3 group: 2 }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  convolution_param { engine: CAFFE num_output: 256 pad: 1 kernel_size: 3 group: 2 }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param { num_output: 4096 }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  inner_product_param { num_output: 4096 }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc8-watermeter" # name of the last fully connected layer; must match train_val.prototxt
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8-watermeter"
  inner_product_param {
    num_output: 10 # number of classes
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc8-watermeter" # name of the last fully connected layer; must match train_val.prototxt
  top: "prob"
}
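As a sanity check on the layer parameters in deploy.prototxt, the sketch below traces the spatial size from the 227x227 input down to pool5. It assumes Caffe's usual rounding (convolution rounds down, pooling rounds up); the helper names `conv_out` and `pool_out` are hypothetical.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Output size of a convolution layer (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    """Output size of a pooling layer (rounds up)."""
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1


size = 227
size = conv_out(size, 11, stride=4)  # conv1 -> 55
size = pool_out(size, 3, stride=2)   # pool1 -> 27
size = conv_out(size, 5, pad=2)      # conv2 -> 27
size = pool_out(size, 3, stride=2)   # pool2 -> 13
size = conv_out(size, 3, pad=1)      # conv3 -> 13
size = conv_out(size, 3, pad=1)      # conv4 -> 13
size = conv_out(size, 3, pad=1)      # conv5 -> 13
size = pool_out(size, 3, stride=2)   # pool5 -> 6

fc6_inputs = 256 * size * size       # conv5 has 256 output channels
print(size, fc6_inputs)              # 6 9216
```

Only fc8-watermeter depends on the number of classes, which is why it is the only layer whose num_output changes for fine-tuning.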
train_val.prototxt:
name: "CaffeNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "/home/wupengfei/DataDir/watermeter_characters/watermeter_mean.binaryproto" # mean file generated in the previous step; use the full path
  }
  data_param {
    source: "examples/watermeter/watermeter_train_lmdb" # lmdb folder of the training data
    batch_size: 128
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TEST }
  transform_param {
    mirror: false
    crop_size: 227
    mean_file: "/home/wupengfei/DataDir/watermeter_characters/watermeter_mean.binaryproto" # full path of the mean file
  }
  data_param {
    source: "examples/watermeter/watermeter_test_lmdb" # lmdb path of the test data
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    engine: CAFFE
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    engine: CAFFE
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 1 }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param { local_size: 5 alpha: 0.0001 beta: 0.75 }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    engine: CAFFE
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    engine: CAFFE
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 1 }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    engine: CAFFE
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 1 }
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param { pool: MAX kernel_size: 3 stride: 2 }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 1 }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.005 }
    bias_filler { type: "constant" value: 1 }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param { dropout_ratio: 0.5 }
}
layer {
  name: "fc8-watermeter" # this layer is fine-tuned, so it must be renamed; otherwise an error is raised
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8-watermeter" # renamed
  param { lr_mult: 10 decay_mult: 1 }
  param { lr_mult: 20 decay_mult: 0 }
  inner_product_param {
    num_output: 10 # number of classes
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8-watermeter" # renamed layer
  bottom: "label"
  top: "loss"
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8-watermeter" # renamed layer
  bottom: "label"
  top: "accuracy"
  include { phase: TEST }
}
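The lr_mult values in train_val.prototxt scale the solver's base_lr per parameter blob. The sketch below works out the resulting rates, assuming the base_lr of 0.001 used in this article's solver.prototxt; the dictionary lists only a few representative layers, and the function name `effective_lr` is hypothetical.

```python
BASE_LR = 0.001  # base_lr from solver.prototxt

# (weight lr_mult, bias lr_mult) as set in train_val.prototxt
LR_MULTS = {
    "conv1": (1, 2),
    "fc7": (1, 2),
    "fc8-watermeter": (10, 20),
}


def effective_lr(base_lr, lr_mult):
    """Learning rate actually applied to a parameter blob."""
    return base_lr * lr_mult


for layer, (w_mult, b_mult) in LR_MULTS.items():
    print("%-15s weights %.3f  biases %.3f"
          % (layer, effective_lr(BASE_LR, w_mult), effective_lr(BASE_LR, b_mult)))
```

The renamed last layer therefore learns ten times faster than the pretrained layers, which suits a layer that starts from random weights. Because Caffe copies pretrained weights by layer name, the new name also keeps the 1000-way fc8 weights of the reference model from being loaded into a 10-way layer.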
solver.prototxt:
net: "models/character_classification/train_val.prototxt" # path to the network definition
test_iter: 100
test_interval: 1000
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 5000
display: 20
max_iter: 3000 # maximum number of iterations
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "models/character_classification/caffenet_watermeter_train" # path prefix for the saved model parameters
solver_mode: GPU
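To see how the "step" policy in solver.prototxt behaves with these values, here is a short sketch of the usual step rule, base_lr * gamma^floor(iter / stepsize). The function name `step_lr` is hypothetical; the defaults are the values listed in the solver file.

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=5000):
    """Learning rate under the 'step' policy at a given iteration."""
    return base_lr * gamma ** (iteration // stepsize)


for it in (0, 2999, 5000):
    print(it, step_lr(it))

# Images evaluated per test pass: test_iter * TEST-phase batch_size.
test_images = 100 * 64
print(test_images)  # 6400
```

With stepsize at 5000 and max_iter at 3000, the rate stays at 0.001 for the whole run, so the step decay never takes effect unless stepsize is lowered or max_iter is raised.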
Training
./build/tools/caffe train -solver models/character_classification/solver.prototxt -weights models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel -gpu 0
Testing on selected images
Converting mean.binaryproto
import caffe
import numpy as np

MEAN_PROTO_PATH = 'mean.binaryproto'  # path of the protobuf-format image mean file to convert
MEAN_NPY_PATH = 'mean.npy'            # path of the converted numpy-format image mean file

blob = caffe.proto.caffe_pb2.BlobProto()  # create a protobuf blob
with open(MEAN_PROTO_PATH, 'rb') as f:
    data = f.read()                       # read the contents of mean.binaryproto
blob.ParseFromString(data)                # parse the contents into the blob

# Convert the mean in the blob to numpy format;
# array has shape (mean_number, channel, height, width).
array = np.array(caffe.io.blobproto_to_array(blob))
mean_npy = array[0]  # an array may hold several means, so select one by index
np.save(MEAN_NPY_PATH, mean_npy)
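The shapes involved in the conversion can be checked without Caffe. The sketch below builds a stand-in array with the (mean_number, channel, height, width) layout described in the comments and reduces it to one value per channel, as the test program does with `mu.mean(1).mean(1)`. The channel values 104, 117, and 123 are made up for illustration.

```python
import numpy as np

# Stand-in for the array returned by the conversion: one mean image, 3 channels.
array = np.zeros((1, 3, 256, 256), dtype=np.float32)
array[0, 0] = 104.0  # made-up channel means, for illustration only
array[0, 1] = 117.0
array[0, 2] = 123.0

mean_npy = array[0]             # select the first mean: shape (3, 256, 256)
mu = mean_npy.mean(1).mean(1)   # average over height, then width: shape (3,)
print(mean_npy.shape, mu)
```

The per-channel vector is convenient at test time because it applies regardless of the crop size fed to the network.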
The test program is as follows:
import os
import sys

import numpy as np

caffe_root = '/home/wupengfei/CaffeDir/caffe/'
sys.path.insert(0, caffe_root + 'python')  # must be set before importing caffe
import caffe

MODEL_FILE = caffe_root + 'models/character_classification/deploy.prototxt'
caffemodel = caffe_root + 'models/character_classification/caffenet_watermeter_train_iter_3000.caffemodel'

synset_words = caffe_root + 'data/watermeter_test/words.txt'
labels = np.loadtxt(synset_words, str, delimiter='\t')

caffe.set_mode_gpu()
net = caffe.Net(MODEL_FILE, caffemodel, caffe.TEST)

# This loads the ImageNet mean shipped with Caffe. To match training,
# the mean.npy converted above could be loaded here instead.
mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
mu = mu.mean(1).mean(1)  # average over pixels to get one mean value per channel

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))     # HWC -> CHW
transformer.set_mean('data', mu)                 # subtract the per-channel mean
transformer.set_raw_scale('data', 255)           # [0, 1] -> [0, 255]
transformer.set_channel_swap('data', (2, 1, 0))  # RGB -> BGR

img_root = caffe_root + 'data/watermeter_test/'
# img = img_root + '0001233.jpg'

images = os.listdir(img_root)
for img in images:
    if img.split('.')[-1] == 'jpg':
        img_path = img_root + img
        input_image = caffe.io.load_image(img_path)
        net.blobs['data'].data[...] = transformer.preprocess('data', input_image)
        out = net.forward()
        prob = net.blobs['prob'].data[0].flatten()
        top_k = prob.argsort()[-1:-6:-1]  # indices of the five highest probabilities
        print(img, " class:", labels[top_k[0]], prob[top_k[0]])
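The preprocessing and top-k steps of the test program can be sketched in plain NumPy. This is an approximation of what the Transformer settings amount to, assuming the image is already at the network input size and that the steps are applied in the order transpose, channel swap, raw scale, mean subtraction; it is not Caffe's own code, and the names `preprocess` and `top_k` are hypothetical.

```python
import numpy as np


def preprocess(image, mean_bgr):
    """image: H x W x 3 RGB float array in [0, 1]; returns a 3 x H x W BGR array."""
    x = image.astype(np.float32).transpose(2, 0, 1)  # HWC -> CHW
    x = x[[2, 1, 0], :, :]                           # RGB -> BGR
    x = x * 255.0                                    # [0, 1] -> [0, 255]
    x = x - mean_bgr[:, np.newaxis, np.newaxis]      # subtract per-channel mean
    return x


def top_k(prob, k=5):
    """Indices of the k largest probabilities, highest first."""
    return prob.argsort()[::-1][:k]


# Synthetic image: R = 1.0, G = 0.5, B = 0.0 everywhere.
image = np.zeros((227, 227, 3), dtype=np.float32)
image[..., 0] = 1.0
image[..., 1] = 0.5
mean_bgr = np.array([104.0, 117.0, 123.0], dtype=np.float32)  # made-up values

x = preprocess(image, mean_bgr)
print(x.shape, x[0, 0, 0], x[2, 0, 0])

prob = np.array([0.05, 0.70, 0.10, 0.15])
print(top_k(prob, k=2))  # [1 3]
```

The slice `argsort()[-1:-6:-1]` in the test program selects the same indices as `top_k(prob, 5)` here.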
Errors and solutions
1. Check failed: error == cudaSuccess (2 vs. 0) out of memory
Reduce batch_size.
2. Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM
Add engine: CAFFE to the convolution_param of the convolution layers.
3. "Incorrect data field size"
This error can occur when generating the mean file, so RESIZE must be set to true during data conversion.
4. libcaffe.so.1.0.0 symbol cudnnSetActivationDescriptor, version libcudnn.so.7 not defined in file libcudnn.so.7 with link time reference
Reinstall cuDNN: http://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html
5. Unknown database backend
Solution: set the OPENCV and LMDB flags back to 1 and recompile.
Edit Makefile.config:
USE_OPENCV := 1
USE_LEVELDB := 1
USE_LMDB := 1
Recompile Caffe:
make clean
make all
make test
make runtest
make pycaffe
6. make pycaffe fails with:
python/caffe/_caffe.hpp:8:31: fatal error: numpy/arrayobject.h: No such file or directory
You may need to first locate the file numpy/arrayobject.h on your computer using "find / -name arrayobject.h", and then modify PYTHON_INCLUDE in Makefile.config.
Perhaps it's in /usr/local/lib/python2.7 instead of /usr/lib/python2.7
7. python can't import _caffe module
Make sure you have done
make pycaffe
8. I1220 14:47:21.014974 402 solver.cpp:449] Snapshotting to binary proto file ./snapshots/split_iter_2500.caffemodel
F1220 14:47:23.816285 402 io.cpp:67] Check failed: proto.SerializeToOstream(&output)
check if you have any more space on disk.
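Two of the problems above (the missing numpy/arrayobject.h header and the snapshot failure on a full disk) can be checked from Python before rebuilding or retraining. This is a minimal sketch that assumes Python 3 with NumPy installed; the function names are hypothetical.

```python
import os
import shutil

import numpy as np


def numpy_header_path():
    """Expected location of arrayobject.h for the installed NumPy."""
    return os.path.join(np.get_include(), "numpy", "arrayobject.h")


def free_gib(path="."):
    """Free disk space, in GiB, on the filesystem containing path."""
    return shutil.disk_usage(path).free / float(1024 ** 3)


print("NumPy include dir:", np.get_include())
print("arrayobject.h found:", os.path.isfile(numpy_header_path()))
print("Free space (GiB): %.1f" % free_gib("."))
```

The directory printed by `np.get_include()` is a candidate for PYTHON_INCLUDE in Makefile.config, provided it belongs to the same Python installation that Caffe is built against.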