Age and gender estimation based on Convolutional Neural Network and TensorFlow

This post walks through an age and gender estimation pipeline built on a convolutional neural network in TensorFlow: training data preparation, model architecture, training, and evaluation.

Training data processing

IMDB data extraction

gender: 0 for female and 1 for male, NaN if unknown

age: divided into 101 classes, one per year from 0 to 100.
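Note that the raw imdb.mat does not store an age field directly; in the standard IMDB-WIKI layout it stores dob (a MATLAB serial date number) and photo_taken (a year), from which age is derived. A minimal sketch, assuming that layout:

from datetime import datetime, timedelta

import numpy as np
from scipy.io import loadmat

def matlab_datenum_to_year(datenum):
    # MATLAB serial dates are offset from Python ordinals by 366 days
    return (datetime.fromordinal(int(datenum)) - timedelta(days=366)).year

meta = loadmat("imdb.mat")["imdb"][0, 0]
dob = meta["dob"][0]                  # birth dates, MATLAB serial numbers
photo_taken = meta["photo_taken"][0]  # year each photo was taken
ages = np.array([pt - matlab_datenum_to_year(d)
                 for d, pt in zip(dob, photo_taken)])
genders = meta["gender"][0]           # 0.0 female, 1.0 male, NaN unknown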

Convert the training data to TFRecords format with:

python convert_to_records_multiCPU.py --imdb --nworks 8 --imdb_db /home/research/data/hjimce/classifyData/age_gender/imdb_crop/imdb.mat --base_path /home/research/data/hjimce/classifyData/age_gender/

The imdb.mat file provides:

data = {"file_name": full_path, "gender": gender, "age": age, "score": face_score, "second_score": second_face_score}

The data is quite noisy, so after loading the .mat file and building this dictionary, the images are filtered by requiring face_score and second_face_score to meet certain thresholds:

if face_score[index] < 1:
    continue
# if (~np.isnan(second_face_score[index])) and second_face_score[index] > 0.0:
#     continue
# note: 'not' (rather than '~') is needed on a plain Python bool here
if not (0 <= ages[index] <= 100):
    continue
if np.isnan(genders[index]):
    continue

Extracting face images

Preprocessing detects the face bounding box in each input image and aligns the face with an affine transform:

# load the input image, resize it, and convert it to grayscale
image = cv2.imread(os.path.join(image_base_dir, str(file_name[index][0])), cv2.IMREAD_COLOR)
# image = imutils.resize(image, width=256)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
rects = detector(gray, 2)
if len(rects) != 1:
    continue
image_raw = fa.align(image, gray, rects[0])
image_raw = image_raw.tostring()

Finally, the age, gender, aligned face image, and file name are serialized into a tf.train.Example and written out:

# image_raw = images[index].tostring()
example = tf.train.Example(features=tf.train.Features(feature={
    # 'height': _int64_feature(rows),
    # 'width': _int64_feature(cols),
    # 'depth': _int64_feature(depth),
    'age': _int64_feature(int(ages[index])),
    'gender': _int64_feature(int(genders[index])),
    'image_raw': _bytes_feature(image_raw),
    'file_name': _bytes_feature(str(file_name[index][0]))}))
writer.write(example.SerializeToString())
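For reference, a hedged sketch of how these records might be read back with the TF 1.x dataset API ('train.tfrecords' is a placeholder file name; the 256×256×3 shape follows the FaceAligner default used below):

import tensorflow as tf

def parse_example(serialized):
    features = tf.parse_single_example(serialized, features={
        'age': tf.FixedLenFeature([], tf.int64),
        'gender': tf.FixedLenFeature([], tf.int64),
        'image_raw': tf.FixedLenFeature([], tf.string),
        'file_name': tf.FixedLenFeature([], tf.string)})
    # the aligned faces were stored as raw uint8 bytes
    image = tf.decode_raw(features['image_raw'], tf.uint8)
    image = tf.reshape(image, [256, 256, 3])
    return image, features['age'], features['gender']

dataset = tf.data.TFRecordDataset(['train.tfrecords']).map(parse_example)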

The face alignment uses imutils' FaceAligner:

from imutils.face_utils import FaceAligner
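FaceAligner is built around a dlib landmark predictor; a typical setup (using dlib's standard 68-point model file, downloaded separately) would be:

import dlib
from imutils.face_utils import FaceAligner

# HOG-based frontal face detector plus the 68-point landmark predictor
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
fa = FaceAligner(predictor, desiredFaceWidth=256)

This is where the detector and fa objects used in the extraction snippet above come from.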

For each face, we can specify the desired eye position and output face size:

class FaceAligner:
    def __init__(self, predictor, desiredLeftEye=(0.35, 0.35),
                 desiredFaceWidth=256, desiredFaceHeight=None):
        # store the facial landmark predictor, desired output left
        # eye position, and desired output face width + height
        self.predictor = predictor
        self.desiredLeftEye = desiredLeftEye
        self.desiredFaceWidth = desiredFaceWidth
        self.desiredFaceHeight = desiredFaceHeight

        # if the desired face height is None, set it to be the
        # desired face width (normal behavior)
        if self.desiredFaceHeight is None:
            self.desiredFaceHeight = self.desiredFaceWidth

The align method then computes the affine transform parameters and warps the face accordingly:

def align(self, image, gray, rect):
    # convert the landmark (x, y)-coordinates to a NumPy array
    shape = self.predictor(gray, rect)
    shape = shape_to_np(shape)

    # extract the left and right eye (x, y)-coordinates
    (lStart, lEnd) = FACIAL_LANDMARKS_IDXS["left_eye"]
    (rStart, rEnd) = FACIAL_LANDMARKS_IDXS["right_eye"]
    leftEyePts = shape[lStart:lEnd]
    rightEyePts = shape[rStart:rEnd]

    # compute the center of mass for each eye
    leftEyeCenter = leftEyePts.mean(axis=0).astype("int")
    rightEyeCenter = rightEyePts.mean(axis=0).astype("int")

    # compute the angle between the eye centroids
    dY = rightEyeCenter[1] - leftEyeCenter[1]
    dX = rightEyeCenter[0] - leftEyeCenter[0]
    angle = np.degrees(np.arctan2(dY, dX)) - 180

    # compute the desired right eye x-coordinate based on the
    # desired x-coordinate of the left eye
    desiredRightEyeX = 1.0 - self.desiredLeftEye[0]

    # determine the scale of the new resulting image by taking
    # the ratio of the distance between eyes in the *current*
    # image to the ratio of distance between eyes in the
    # *desired* image
    dist = np.sqrt((dX ** 2) + (dY ** 2))
    desiredDist = (desiredRightEyeX - self.desiredLeftEye[0])
    desiredDist *= self.desiredFaceWidth
    scale = desiredDist / dist

    # compute center (x, y)-coordinates (i.e., the median point)
    # between the two eyes in the input image
    eyesCenter = ((leftEyeCenter[0] + rightEyeCenter[0]) // 2,
                  (leftEyeCenter[1] + rightEyeCenter[1]) // 2)

    # grab the rotation matrix for rotating and scaling the face
    M = cv2.getRotationMatrix2D(eyesCenter, angle, scale)

    # update the translation component of the matrix
    tX = self.desiredFaceWidth * 0.5
    tY = self.desiredFaceHeight * self.desiredLeftEye[1]
    M[0, 2] += (tX - eyesCenter[0])
    M[1, 2] += (tY - eyesCenter[1])

    # apply the affine transformation
    (w, h) = (self.desiredFaceWidth, self.desiredFaceHeight)
    output = cv2.warpAffine(image, M, (w, h),
                            flags=cv2.INTER_CUBIC)

    # return the aligned face
    return output

The processed data is stored under /home/research/data/hjimce/classifyData/age_gender/.

Model architecture

The backbone is Inception-ResNet-v1 (the block35/block17/block8 structure below, as in FaceNet).

The input size is 160×160×3.

The stem consists of six convolutional layers plus one max-pooling layer. (The shape comments in the snippet are carried over from the original 299×299 Inception code; with a 160×160 input the actual feature maps are smaller, ending at 17×17×256.)

# 149 x 149 x 32
net = slim.conv2d(inputs, 32, 3, stride=2, padding='VALID',scope='Conv2d_1a_3x3')
end_points['Conv2d_1a_3x3'] = net
# 147 x 147 x 32
net = slim.conv2d(net, 32, 3, padding='VALID',scope='Conv2d_2a_3x3')
end_points['Conv2d_2a_3x3'] = net
# 147 x 147 x 64
net = slim.conv2d(net, 64, 3, scope='Conv2d_2b_3x3')
end_points['Conv2d_2b_3x3'] = net
# 73 x 73 x 64
net = slim.max_pool2d(net, 3, stride=2, padding='VALID',scope='MaxPool_3a_3x3')
end_points['MaxPool_3a_3x3'] = net
# 73 x 73 x 80
net = slim.conv2d(net, 80, 1, padding='VALID',scope='Conv2d_3b_1x1')
end_points['Conv2d_3b_1x1'] = net
# 71 x 71 x 192
net = slim.conv2d(net, 192, 3, padding='VALID',scope='Conv2d_4a_3x3')
end_points['Conv2d_4a_3x3'] = net
# 35 x 35 x 256
net = slim.conv2d(net, 256, 3, stride=2, padding='VALID',scope='Conv2d_4b_3x3')
end_points['Conv2d_4b_3x3'] = net

This is followed by five block35 (Inception-ResNet-A) layers at 17×17×256:

# Inception-Resnet-A
def block35(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None):
    """Builds the 35x35 resnet block."""
    with tf.variable_scope(scope, 'Block35', [net], reuse=reuse):
        with tf.variable_scope('Branch_0'):
            tower_conv = slim.conv2d(net, 32, 1, scope='Conv2d_1x1')
        with tf.variable_scope('Branch_1'):
            tower_conv1_0 = slim.conv2d(net, 32, 1, scope='Conv2d_0a_1x1')
            tower_conv1_1 = slim.conv2d(tower_conv1_0, 32, 3, scope='Conv2d_0b_3x3')
        with tf.variable_scope('Branch_2'):
            tower_conv2_0 = slim.conv2d(net, 32, 1, scope='Conv2d_0a_1x1')
            tower_conv2_1 = slim.conv2d(tower_conv2_0, 32, 3, scope='Conv2d_0b_3x3')
            tower_conv2_2 = slim.conv2d(tower_conv2_1, 32, 3, scope='Conv2d_0c_3x3')
        mixed = tf.concat([tower_conv, tower_conv1_1, tower_conv2_2], 3)
        up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None,
                         activation_fn=None, scope='Conv2d_1x1')
        net += scale * up
        if activation_fn:
            net = activation_fn(net)
    return net

Reduction-A:

def reduction_a(net, k, l, m, n):
    with tf.variable_scope('Branch_0'):
        tower_conv = slim.conv2d(net, n, 3, stride=2, padding='VALID',
                                 scope='Conv2d_1a_3x3')
    with tf.variable_scope('Branch_1'):
        tower_conv1_0 = slim.conv2d(net, k, 1, scope='Conv2d_0a_1x1')
        tower_conv1_1 = slim.conv2d(tower_conv1_0, l, 3,
                                    scope='Conv2d_0b_3x3')
        tower_conv1_2 = slim.conv2d(tower_conv1_1, m, 3,
                                    stride=2, padding='VALID',
                                    scope='Conv2d_1a_3x3')
    with tf.variable_scope('Branch_2'):
        tower_pool = slim.max_pool2d(net, 3, stride=2, padding='VALID',
                                     scope='MaxPool_1a_3x3')
    net = tf.concat([tower_conv, tower_conv1_2, tower_pool], 3)
    return net
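In the FaceNet-style Inception-ResNet-v1 this code follows, Reduction-A is typically called with k, l, m, n = 192, 192, 256, 384 (an assumption worth verifying against the repo), which accounts for the 896 channels quoted below:

with tf.variable_scope('Mixed_6a'):
    net = reduction_a(net, 192, 192, 256, 384)  # hedged parameter values
# output channels: 384 (Branch_0) + 256 (Branch_1) + 256 (pooled input) = 896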

Ten Inception-ResNet-B layers (8×8×896):

net = slim.repeat(net, 10, block17, scale=0.10)
end_points['Mixed_6b'] = net

def block17(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None):
    """Builds the 17x17 resnet block."""
    with tf.variable_scope(scope, 'Block17', [net], reuse=reuse):
        with tf.variable_scope('Branch_0'):
            tower_conv = slim.conv2d(net, 128, 1, scope='Conv2d_1x1')
        with tf.variable_scope('Branch_1'):
            tower_conv1_0 = slim.conv2d(net, 128, 1, scope='Conv2d_0a_1x1')
            tower_conv1_1 = slim.conv2d(tower_conv1_0, 128, [1, 7],
                                        scope='Conv2d_0b_1x7')
            tower_conv1_2 = slim.conv2d(tower_conv1_1, 128, [7, 1],
                                        scope='Conv2d_0c_7x1')
        mixed = tf.concat([tower_conv, tower_conv1_2], 3)
        up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None,
                         activation_fn=None, scope='Conv2d_1x1')
        net += scale * up
        if activation_fn:
            net = activation_fn(net)
    return net

Reduction-B:

def reduction_b(net):
    with tf.variable_scope('Branch_0'):
        tower_conv = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')
        tower_conv_1 = slim.conv2d(tower_conv, 384, 3, stride=2,
                                   padding='VALID', scope='Conv2d_1a_3x3')
    with tf.variable_scope('Branch_1'):
        tower_conv1 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')
        tower_conv1_1 = slim.conv2d(tower_conv1, 256, 3, stride=2,
                                    padding='VALID', scope='Conv2d_1a_3x3')
    with tf.variable_scope('Branch_2'):
        tower_conv2 = slim.conv2d(net, 256, 1, scope='Conv2d_0a_1x1')
        tower_conv2_1 = slim.conv2d(tower_conv2, 256, 3,
                                    scope='Conv2d_0b_3x3')
        tower_conv2_2 = slim.conv2d(tower_conv2_1, 256, 3, stride=2,
                                    padding='VALID', scope='Conv2d_1a_3x3')
    with tf.variable_scope('Branch_3'):
        tower_pool = slim.max_pool2d(net, 3, stride=2, padding='VALID',
                                     scope='MaxPool_1a_3x3')
    net = tf.concat([tower_conv_1, tower_conv1_1,
                     tower_conv2_2, tower_pool], 3)
    return net
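On an 8×8×896 input, the three stride-2 convolution branches and the pooling passthrough give 384 + 256 + 256 + 896 = 1792 channels, matching the shape quoted next:

with tf.variable_scope('Mixed_7a'):
    net = reduction_b(net)
# output channels: 384 + 256 + 256 + 896 (pooled input) = 1792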

Five Inception-ResNet-C layers (3×3×1792):

net = slim.repeat(net, 5, block8, scale=0.20)
# Inception-Resnet-C
def block8(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None):
    """Builds the 8x8 resnet block."""
    with tf.variable_scope(scope, 'Block8', [net], reuse=reuse):
        with tf.variable_scope('Branch_0'):
            tower_conv = slim.conv2d(net, 192, 1, scope='Conv2d_1x1')
        with tf.variable_scope('Branch_1'):
            tower_conv1_0 = slim.conv2d(net, 192, 1, scope='Conv2d_0a_1x1')
            tower_conv1_1 = slim.conv2d(tower_conv1_0, 192, [1, 3],
                                        scope='Conv2d_0b_1x3')
            tower_conv1_2 = slim.conv2d(tower_conv1_1, 192, [3, 1],
                                        scope='Conv2d_0c_3x1')
        mixed = tf.concat([tower_conv, tower_conv1_2], 3)
        up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None,
                         activation_fn=None, scope='Conv2d_1x1')
        net += scale * up
        if activation_fn:
            net = activation_fn(net)
    return net

Plus one final Inception-ResNet-C layer without activation (3×3×1792):

net = block8(net, activation_fn=None)
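After the last block, the feature map is pooled into a vector and two classification heads are attached: a 101-way softmax over ages and a binary softmax for gender. A hedged sketch with illustrative names (not necessarily the repo's actual variables; tf, slim, and np imported as in the snippets above):

# 'features' stands for the pooled Inception-ResNet-v1 output
age_logits = slim.fully_connected(features, 101, activation_fn=None,
                                  scope='age_logits')
gender_logits = slim.fully_connected(features, 2, activation_fn=None,
                                     scope='gender_logits')
# a common age estimate: the softmax-weighted expectation over 0..100
age_bins = tf.constant(np.arange(101), dtype=tf.float32)
age_estimate = tf.reduce_sum(tf.nn.softmax(age_logits) * age_bins, axis=1)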

Training

CUDA_VISIBLE_DEVICES=0 python train.py --images /home/research/data/hjimce/classifyData/age_gender/train --lr 1e-3 --weight_decay 1e-5 --epoch 6 --batch_size 128 --keep_prob 0.8 --cuda
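The objective is presumably the sum of softmax cross-entropies on the two heads plus the L2 penalty set by --weight_decay; roughly, in TF 1.x terms:

age_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=age_labels, logits=age_logits))
gender_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=gender_labels, logits=gender_logits))
# slim's weight_decay argument registers per-layer L2 regularizers here
# (tf.add_n requires the collection to be non-empty)
reg_loss = tf.add_n(tf.losses.get_regularization_losses())
total_loss = age_loss + gender_loss + reg_loss
# optimizer choice is illustrative; the repo's train.py may differ
train_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(total_loss)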

Testing

Single-image prediction

Download the pretrained model, put it under models/, and run:

CUDA_VISIBLE_DEVICES=0 python eval.py --I "./demo/demo.jpg" --M "./models/" --font_scale 1 --thickness 1

Sample results: the original post included two demo screenshots, omitted here.

Evaluating accuracy on multiple images

CUDA_VISIBLE_DEVICES=0 python test.py --images /home/research/data/hjimce/classifyData/age_gender/test
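Age_MAE and Gender_Acc below are presumably the mean absolute error of the expected-value age estimate and the argmax accuracy of the gender head, roughly:

# gender_labels assumed int64 to match tf.argmax's output type
age_mae = tf.reduce_mean(tf.abs(age_estimate - tf.cast(age_labels, tf.float32)))
correct = tf.equal(tf.argmax(gender_logits, axis=1), gender_labels)
gender_acc = tf.reduce_mean(tf.cast(correct, tf.float32))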

a. Accuracy of the provided pretrained model:

Age_MAE: 7.21, Gender_Acc: 80.32%, Loss: 4.37

b. Accuracy of a self-trained model:

Age_MAE: 7.55, Gender_Acc: 79.24%, Loss: 4.33

c. After modifying the model structure (model size: 24 MB), test accuracy:

Age_MAE: 7.57, Gender_Acc: 78.85%, Loss: 4.29
