Deep Learning with TensorFlow 2: Building DNN Models (Construction Approaches: Custom Functions, keras.Sequential, Compile & Fit, Custom Layer, Custom Model)

This article walks through building DNN models in TensorFlow 2 using several construction approaches: hand-written parameters and functions, keras.Sequential, the integrated compile & fit workflow, custom Layers, and custom Models. Hopefully it serves as a useful reference for developers working on similar problems.

1. Creating parameters and functions by hand, then optimizing with tf.GradientTape() (handwritten-digit recognition)

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
from tensorflow.keras import datasets

# 1. Load the MNIST handwritten-digit dataset
(X_train, Y_train), (X_val, Y_val) = datasets.mnist.load_data()  # X_train: [60k, 28, 28], Y_train: [60k]; X_val: [10k, 28, 28], Y_val: [10k]
print('X_train.shape = {0},Y_train.shape = {1}------------type(X_train) = {2},type(Y_train) = {3}------------X_train.dtype = {4},Y_train.dtype = {5}'.format(X_train.shape, Y_train.shape, type(X_train), type(Y_train), X_train.dtype, Y_train.dtype))

# 2. Data processing
# 2.1 Convert the numpy arrays to tensors
X_train = tf.convert_to_tensor(X_train, dtype=tf.float32) / 255.  # X_train: [0~255] => [0~1.]
Y_train = tf.convert_to_tensor(Y_train, dtype=tf.int32)
X_val = tf.convert_to_tensor(X_val, dtype=tf.float32) / 255.  # X_val: [0~255] => [0~1.]
Y_val = tf.convert_to_tensor(Y_val, dtype=tf.int32)
print('X_train.shape = {0},Y_train.shape = {1}------------type(X_train) = {2},type(Y_train) = {3}------------X_train.dtype = {4},Y_train.dtype = {5}'.format(X_train.shape, Y_train.shape, type(X_train), type(Y_train), X_train.dtype, Y_train.dtype))
print('min of X_train: tf.reduce_min(X_train) = {0}, max of X_train: tf.reduce_max(X_train) = {1}'.format(tf.reduce_min(X_train), tf.reduce_max(X_train)))
print('min of Y_train: tf.reduce_min(Y_train) = {0}, max of Y_train: tf.reduce_max(Y_train) = {1}'.format(tf.reduce_min(Y_train), tf.reduce_max(Y_train)))
# 2.2 Build Dataset objects and use their batch feature to process many samples in parallel
batch_size_train = 20000  # a batch size of 100-200 is usually more appropriate
batch_size_val = 5000  # a batch size of 100-200 is usually more appropriate
dataset_train = tf.data.Dataset.from_tensor_slices((X_train, Y_train))  # training dataset object
dataset_val = tf.data.Dataset.from_tensor_slices((X_val, Y_val))  # validation dataset object
dataset_train = dataset_train.shuffle(len(X_train))  # shuffle so the original image order cannot bias the network
dataset_val = dataset_val.shuffle(len(X_val))
dataset_batch_train = dataset_train.batch(batch_size_train)  # batch_size_train samples per batch
dataset_batch_val = dataset_val.batch(batch_size_val)  # batch_size_val samples per batch
# 2.3 Inspect the training dataset through an iterator
train_iter = iter(dataset_batch_train)
batch01 = next(train_iter)  # fetch one batch
print('\nbatch01:\n', batch01)
print('\nbatch01[0].shape = {0},\nbatch01[0] = \n{1}'.format(batch01[0].shape, batch01[0]))
print('\nbatch01[1].shape = {0},\nbatch01[1] = \n{1}'.format(batch01[1].shape, batch01[1]))

# 3. Create the weight parameters [W1, B1, W2, B2, W3, B3]
# Parameter shapes: [dim_in, dim_out] for weights, [dim_out] for biases
# Shape flow of the input: [b, 784] --·[784,256]--> [b, 256] --·[256,128]--> [b, 128] --·[128,10]--> [b, 10]
# Initialize a [784, 256] matrix from a truncated normal distribution with stddev 0.1, wrapped in
# tf.Variable so it can serve as a parameter during gradient descent.
W1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1))  # choosing the stddev of the sampling distribution helps guard against exploding/vanishing gradients
B1 = tf.Variable(tf.zeros([256]))  # bias initialized to a zero vector of shape [256], also a tf.Variable so it is trainable
W2 = tf.Variable(tf.random.truncated_normal([256, 128], stddev=0.1))
B2 = tf.Variable(tf.zeros([128]))
W3 = tf.Variable(tf.random.truncated_normal([128, 10], stddev=0.1))
B3 = tf.Variable(tf.zeros([10]))

# 4. Learning rate
learning_rate = 1e-3

# 5. Training: one pass over the whole dataset is one epoch. Each epoch consists of several
# batch steps; each step performs one gradient update on one batch of samples.
def train_epoch(epoch_no):
    print('++++++++++++++++++++ Epoch {0} --> Training phase: start ++++++++++++++++++++'.format(epoch_no))
    for batch_step_no, (X_batch, Y_batch) in enumerate(dataset_batch_train):  # one batch per step; the loop covers one full pass over the dataset
        X_batch = tf.reshape(X_batch, [-1, 28 * 28])  # [b, 28, 28] => [b, 28*28]
        Y_batch_one_hot = tf.one_hot(Y_batch, depth=10)  # one-hot encoding
        # tf.GradientTape is a context manager that ties the "function" (the loss expression)
        # to the "variables" (all network parameters) so gradients can be traced.
        with tf.GradientTape() as tape:  # automatic differentiation over tf.Variable objects
            # tape.watch([W1, W2, W3, B1, B2, B3])  # only needed if the parameters were NOT created as tf.Variable
            # Step 1. Forward pass: compute the prediction under the current parameters
            # h1 = X@W1 + B1 = [b, 784]@[784, 256] + [256] => [b, 256]
            h1 = X_batch @ W1 + tf.broadcast_to(B1, [X_batch.shape[0], 256])  # tf.broadcast_to can be omitted; the + op broadcasts by default
            h1 = tf.nn.relu(h1)  # nonlinear activation on the hidden-layer output
            # h2 = h1@W2 + B2 = [b, 256]@[256, 128] + [128] => [b, 128]
            h2 = h1 @ W2 + B2
            h2 = tf.nn.relu(h2)
            # out = h2@W3 + B3 = [b, 128]@[128, 10] + [10] => [b, 10]
            out = h2 @ W3 + B3
            out = tf.nn.softmax(out)  # output layer: softmax activation
            # Step 2. Loss between prediction and ground truth
            MSE_Loss = tf.square(Y_batch_one_hot - out)  # mse = mean((Y - out)^2), shape [b, 10]
            MSE_Loss = tf.reduce_mean(MSE_Loss)  # reduce to a scalar
        print('epoch_no = {0}, batch_step_no = {1}, X_batch.shape = {2}, Y_batch.shape = {3}'.format(epoch_no, batch_step_no + 1, X_batch.shape, Y_batch.shape))
        print('\tEpoch {0} --> batch step {1} initial MSE_Loss = {2}'.format(epoch_no, batch_step_no + 1, MSE_Loss))
        # Step 3. Backward pass: gradients of MSE_Loss w.r.t. all trainable parameters [W1, B1, W2, B2, W3, B3] at X_batch
        grads = tape.gradient(MSE_Loss, [W1, B1, W2, B2, W3, B3])
        grads, _ = tf.clip_by_global_norm(grads, 15)  # gradient clipping against exploding/vanishing gradients
        # One gradient-descent update
        print('\tGradient-descent step --> θ = θ - learning_rate * grad: start')
        # In-place update: assign_sub() keeps the variable type and the reference unchanged.
        # Do NOT write W1 = W1 - learning_rate * grads[0]; that would turn W1 from a tf.Variable
        # into a plain Tensor, which tf.GradientTape can no longer optimize.
        W1.assign_sub(learning_rate * grads[0])
        B1.assign_sub(learning_rate * grads[1])
        W2.assign_sub(learning_rate * grads[2])
        B2.assign_sub(learning_rate * grads[3])
        W3.assign_sub(learning_rate * grads[4])
        B3.assign_sub(learning_rate * grads[5])
        print('\tGradient-descent step --> θ = θ - learning_rate * grad: end')
    print('Model parameters obtained: W1, B1, W2, B2, W3, B3')
    print('++++++++++++++++++++ Epoch {0} --> Training phase: end ++++++++++++++++++++'.format(epoch_no))
    return W1, B1, W2, B2, W3, B3

# 6. Model evaluation (test)
def evaluation(epoch_no, W1, B1, W2, B2, W3, B3):
    print('++++++++++++++++++++ Epoch {0} --> Evaluation phase: start ++++++++++++++++++++'.format(epoch_no))
    total_correct, total_num = 0, 0
    for batch_step_no, (X_batch, Y_batch) in enumerate(dataset_batch_val):
        print('epoch_no = {0}, batch_step_no = {1}, X_batch.shape = {2}, Y_batch.shape = {3}'.format(epoch_no, batch_step_no + 1, X_batch.shape, Y_batch.shape))
        X_batch = tf.reshape(X_batch, [-1, 28 * 28])  # [b, 28, 28] => [b, 28*28]
        # Forward pass with the trained parameters: [b, 784] => [b, 256] => [b, 128] => [b, 10]
        h1 = tf.nn.relu(X_batch @ W1 + B1)
        h2 = tf.nn.relu(h1 @ W2 + B2)
        out = h2 @ W3 + B3  # out: [b, 10] ~ R
        # softmax() makes the per-class probabilities sum to 1
        out_prob = tf.nn.softmax(out, axis=1)  # out_prob: [b, 10] ~ [0, 1]
        out_prob_int = tf.argmax(out_prob, axis=1)  # [b, 10] => [b], dtype int64
        out_prob_int = tf.cast(out_prob_int, dtype=tf.int32)
        print('\tPredictions: out_prob_int = {0},\tGround truth: Y_batch = {1}'.format(out_prob_int, Y_batch))
        is_correct_boolean = tf.equal(out_prob_int, Y_batch.numpy())
        print('\tis_correct_boolean = {0}'.format(is_correct_boolean))
        is_correct_int = tf.cast(is_correct_boolean, dtype=tf.float32)
        print('\tis_correct_int = {0}'.format(is_correct_int))
        is_correct_count = tf.reduce_sum(is_correct_int)
        print('\tis_correct_count = {0}'.format(is_correct_count))
        total_correct += int(is_correct_count)
        total_num += X_batch.shape[0]
    print('total_correct = {0}---total_num = {1}'.format(total_correct, total_num))
    acc = total_correct / total_num
    print('Accuracy after epoch {0}: acc = {1}'.format(epoch_no, acc))
    print('++++++++++++++++++++ Epoch {0} --> Evaluation phase: end ++++++++++++++++++++'.format(epoch_no))

# 7. Iterate gradient descent over the whole dataset several times
def train():
    epoch_count = 3  # number of epochs over the full dataset
    for epoch_no in range(1, epoch_count + 1):
        print('\n\nEpoch {0} over the full dataset: start **********'.format(epoch_no))
        W1, B1, W2, B2, W3, B3 = train_epoch(epoch_no)
        evaluation(epoch_no, W1, B1, W2, B2, W3, B3)
        print('Epoch {0} over the full dataset: end **********'.format(epoch_no))

if __name__ == '__main__':
    train()
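The comment about assign_sub() above is easy to verify in isolation. The following minimal sketch (my addition, not part of the original program) shows why plain reassignment must be avoided: it replaces the tf.Variable with an EagerTensor, which tf.GradientTape no longer treats as a trainable parameter, while assign_sub() updates the variable in place:

import tensorflow as tf

w = tf.Variable(1.0)
with tf.GradientTape() as tape:
    loss = w * w
grad = tape.gradient(loss, w)  # d(w^2)/dw = 2w = 2.0

w_bad = w - 0.1 * grad         # plain subtraction creates a new EagerTensor
print(type(w_bad))             # <class '...EagerTensor'> -- no longer a trainable variable

w.assign_sub(0.1 * grad)       # in-place update keeps the tf.Variable intact
print(type(w), w.numpy())      # <class '...ResourceVariable'> 0.8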

Output:

X_train.shape = (60000, 28, 28),Y_train.shape = (60000,)------------type(X_train) = <class 'numpy.ndarray'>,type(Y_train) = <class 'numpy.ndarray'>------------X_train.dtype = uint8,Y_train.dtype = uint8
X_train.shape = (60000, 28, 28),Y_train.shape = (60000,)------------type(X_train) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_train) = <class 'tensorflow.python.framework.ops.EagerTensor'>------------X_train.dtype = <dtype: 'float32'>,Y_train.dtype = <dtype: 'int32'>
min of X_train: tf.reduce_min(X_train) = 0.0, max of X_train: tf.reduce_max(X_train) = 1.0
min of Y_train: tf.reduce_min(Y_train) = 0, max of Y_train: tf.reduce_max(Y_train) = 9

batch01:
 (<tf.Tensor: shape=(20000, 28, 28), dtype=float32, numpy=
array([[[0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        ...,
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.]],

       ...,

       [[0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.],
        ...,
        [0., 0., 0., ..., 0., 0., 0.],
        [0., 0., 0., ..., 0., 0., 0.]]], dtype=float32)>, <tf.Tensor: shape=(20000,), dtype=int32, numpy=array([7, 8, 9, ..., 7, 3, 9])>)

batch01[0].shape = (20000, 28, 28),
batch01[0] = 
[[[0. 0. 0. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]
  ...
  [0. 0. 0. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]]

 ...

 [[0. 0. 0. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]
  ...
  [0. 0. 0. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]]]

batch01[1].shape = (20000,),
batch01[1] = 
[7 8 9 ... 7 3 9]


Epoch 1 over the full dataset: start **********
++++++++++++++++++++ Epoch 1 --> Training phase: start ++++++++++++++++++++
epoch_no = 1, batch_step_no = 1, X_batch.shape = (20000, 784), Y_batch.shape = (20000,)
	Epoch 1 --> batch step 1 initial MSE_Loss = 0.09286290407180786
	Gradient-descent step --> θ = θ - learning_rate * grad: start
	Gradient-descent step --> θ = θ - learning_rate * grad: end
epoch_no = 1, batch_step_no = 2, X_batch.shape = (20000, 784), Y_batch.shape = (20000,)
	Epoch 1 --> batch step 2 initial MSE_Loss = 0.09292884916067123
	Gradient-descent step --> θ = θ - learning_rate * grad: start
	Gradient-descent step --> θ = θ - learning_rate * grad: end
epoch_no = 1, batch_step_no = 3, X_batch.shape = (20000, 784), Y_batch.shape = (20000,)
	Epoch 1 --> batch step 3 initial MSE_Loss = 0.09291936457157135
	Gradient-descent step --> θ = θ - learning_rate * grad: start
	Gradient-descent step --> θ = θ - learning_rate * grad: end
Model parameters obtained: W1, B1, W2, B2, W3, B3
++++++++++++++++++++ Epoch 1 --> Training phase: end ++++++++++++++++++++
++++++++++++++++++++ Epoch 1 --> Evaluation phase: start ++++++++++++++++++++
epoch_no = 1, batch_step_no = 1, X_batch.shape = (5000, 28, 28), Y_batch.shape = (5000,)
	Predictions: out_prob_int = [6 6 0 ... 1 0 6],	Ground truth: Y_batch = [5 4 0 ... 4 4 7]
	is_correct_boolean = [False False  True ... False False False]
	is_correct_int = [0. 0. 1. ... 0. 0. 0.]
	is_correct_count = 586.0
epoch_no = 1, batch_step_no = 2, X_batch.shape = (5000, 28, 28), Y_batch.shape = (5000,)
	Predictions: out_prob_int = [7 6 6 ... 6 6 6],	Ground truth: Y_batch = [7 1 2 ... 9 2 0]
	is_correct_boolean = [ True False False ... False False False]
	is_correct_int = [1. 0. 0. ... 0. 0. 0.]
	is_correct_count = 581.0
total_correct = 1167---total_num = 10000
Accuracy after epoch 1: acc = 0.1167
++++++++++++++++++++ Epoch 1 --> Evaluation phase: end ++++++++++++++++++++
Epoch 1 over the full dataset: end **********


Epoch 2 over the full dataset: start **********
++++++++++++++++++++ Epoch 2 --> Training phase: start ++++++++++++++++++++
epoch_no = 2, batch_step_no = 1, X_batch.shape = (20000, 784), Y_batch.shape = (20000,)
	Epoch 2 --> batch step 1 initial MSE_Loss = 0.09297237545251846
	Gradient-descent step --> θ = θ - learning_rate * grad: start
	Gradient-descent step --> θ = θ - learning_rate * grad: end
epoch_no = 2, batch_step_no = 2, X_batch.shape = (20000, 784), Y_batch.shape = (20000,)
	Epoch 2 --> batch step 2 initial MSE_Loss = 0.09285857528448105
	Gradient-descent step --> θ = θ - learning_rate * grad: start
	Gradient-descent step --> θ = θ - learning_rate * grad: end
epoch_no = 2, batch_step_no = 3, X_batch.shape = (20000, 784), Y_batch.shape = (20000,)
	Epoch 2 --> batch step 3 initial MSE_Loss = 0.0928683876991272
	Gradient-descent step --> θ = θ - learning_rate * grad: start
	Gradient-descent step --> θ = θ - learning_rate * grad: end
Model parameters obtained: W1, B1, W2, B2, W3, B3
++++++++++++++++++++ Epoch 2 --> Training phase: end ++++++++++++++++++++
++++++++++++++++++++ Epoch 2 --> Evaluation phase: start ++++++++++++++++++++
epoch_no = 2, batch_step_no = 1, X_batch.shape = (5000, 28, 28), Y_batch.shape = (5000,)
	Predictions: out_prob_int = [6 7 6 ... 6 6 6],	Ground truth: Y_batch = [5 2 5 ... 2 1 3]
	is_correct_boolean = [False False False ... False False False]
	is_correct_int = [0. 0. 0. ... 0. 0. 0.]
	is_correct_count = 568.0
epoch_no = 2, batch_step_no = 2, X_batch.shape = (5000, 28, 28), Y_batch.shape = (5000,)
	Predictions: out_prob_int = [6 6 0 ... 6 7 6],	Ground truth: Y_batch = [8 5 1 ... 8 5 8]
	is_correct_boolean = [False False False ... False False False]
	is_correct_int = [0. 0. 0. ... 0. 0. 0.]
	is_correct_count = 600.0
total_correct = 1168---total_num = 10000
Accuracy after epoch 2: acc = 0.1168
++++++++++++++++++++ Epoch 2 --> Evaluation phase: end ++++++++++++++++++++
Epoch 2 over the full dataset: end **********


Epoch 3 over the full dataset: start **********
++++++++++++++++++++ Epoch 3 --> Training phase: start ++++++++++++++++++++
epoch_no = 3, batch_step_no = 1, X_batch.shape = (20000, 784), Y_batch.shape = (20000,)
	Epoch 3 --> batch step 1 initial MSE_Loss = 0.09299982339143753
	Gradient-descent step --> θ = θ - learning_rate * grad: start
	Gradient-descent step --> θ = θ - learning_rate * grad: end
epoch_no = 3, batch_step_no = 2, X_batch.shape = (20000, 784), Y_batch.shape = (20000,)
	Epoch 3 --> batch step 2 initial MSE_Loss = 0.09280283004045486
	Gradient-descent step --> θ = θ - learning_rate * grad: start
	Gradient-descent step --> θ = θ - learning_rate * grad: end
epoch_no = 3, batch_step_no = 3, X_batch.shape = (20000, 784), Y_batch.shape = (20000,)
	Epoch 3 --> batch step 3 initial MSE_Loss = 0.09288491308689117
	Gradient-descent step --> θ = θ - learning_rate * grad: start
	Gradient-descent step --> θ = θ - learning_rate * grad: end
Model parameters obtained: W1, B1, W2, B2, W3, B3
++++++++++++++++++++ Epoch 3 --> Training phase: end ++++++++++++++++++++
++++++++++++++++++++ Epoch 3 --> Evaluation phase: start ++++++++++++++++++++
epoch_no = 3, batch_step_no = 1, X_batch.shape = (5000, 28, 28), Y_batch.shape = (5000,)
	Predictions: out_prob_int = [6 6 8 ... 6 7 6],	Ground truth: Y_batch = [9 7 1 ... 3 4 4]
	is_correct_boolean = [False False False ... False False False]
	is_correct_int = [0. 0. 0. ... 0. 0. 0.]
	is_correct_count = 596.0
epoch_no = 3, batch_step_no = 2, X_batch.shape = (5000, 28, 28), Y_batch.shape = (5000,)
	Predictions: out_prob_int = [7 6 7 ... 7 6 7],	Ground truth: Y_batch = [5 3 0 ... 4 1 5]
	is_correct_boolean = [False False False ... False False False]
	is_correct_int = [0. 0. 0. ... 0. 0. 0.]
	is_correct_count = 572.0
total_correct = 1168---total_num = 10000
Accuracy after epoch 3: acc = 0.1168
++++++++++++++++++++ Epoch 3 --> Evaluation phase: end ++++++++++++++++++++
Epoch 3 over the full dataset: end **********

Process finished with exit code 0
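Note that accuracy plateaus around 0.11, barely above the 10% random-guess baseline. That is expected: with a batch size of 20,000 each epoch takes only 3 gradient steps at a small learning rate of 1e-3, so the MSE loss barely moves. A plausible adjustment, in line with the code's own comment that 100-200 samples per batch is more appropriate (a sketch, not verified here):

batch_size_train = 128                               # instead of 20000 -> ~469 gradient steps per epoch
dataset_batch_train = dataset_train.batch(batch_size_train)
# More update steps per epoch let the same training loop make real progress;
# switching to a cross-entropy loss (as in the next section) would help further.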

2. Building a DNN with keras.Sequential (handwritten-digit recognition)

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # must be set before "import tensorflow as tf" to take effect
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, optimizers, datasets

# 1. Load the dataset
(X_train, Y_train), (X_val, Y_val) = datasets.mnist.load_data()
print('X_train.shape = {0},Y_train.shape = {1}------------type(X_train) = {2},type(Y_train) = {3}'.format(X_train.shape, Y_train.shape, type(X_train), type(Y_train)))

# 2. Data processing
# Preprocessing function: convert numpy data to tensors
def preprocess(x, y):
    x = tf.cast(x, dtype=tf.float32) / 255.
    y = tf.cast(y, dtype=tf.int32)
    return x, y

# 2.1 Training set
dataset_train = tf.data.Dataset.from_tensor_slices((X_train, Y_train))  # converts the numpy data to tensors automatically
dataset_train = dataset_train.map(preprocess)  # map() applies the dtype conversion to every element
dataset_train = dataset_train.shuffle(len(X_train))  # shuffle so the original image order cannot bias the network
print('dataset_train = {0},type(dataset_train) = {1}'.format(dataset_train, type(dataset_train)))
batch_size_train = 20000  # a batch size of 100-200 is usually more appropriate
dataset_batch_train = dataset_train.batch(batch_size_train)  # reading one batch loads batch_size_train images in parallel
print('dataset_batch_train = {0},type(dataset_batch_train) = {1}'.format(dataset_batch_train, type(dataset_batch_train)))
# 2.2 Validation set
dataset_val = tf.data.Dataset.from_tensor_slices((X_val, Y_val))
dataset_val = dataset_val.map(preprocess)
dataset_val = dataset_val.shuffle(len(X_val))
batch_size_val = 5000
dataset_batch_val = dataset_val.batch(batch_size_val)

# 3. Network architecture: Dense is a fully connected layer, with ReLU activations
network = keras.Sequential([
    layers.Dense(500, activation=tf.nn.relu),  # 784 --> 500
    layers.Dense(300, activation=tf.nn.relu),  # 500 --> 300
    layers.Dense(100, activation=tf.nn.relu),  # 300 --> 100
    layers.Dense(10)])  # 100 --> 10; the output layer usually gets no activation here, it is applied inside the loss
network.build(input_shape=[None, 784])  # 28*28 = 784; None stands for the (unknown) number of samples
network.summary()  # print a summary of the network

# 4. Optimizer for gradient descent
optimizer = optimizers.SGD(learning_rate=0.001)

# 5. Training: one epoch is one pass of batch-wise gradient updates over the whole dataset.
def train_epoch(epoch_no):
    print('++++++++++++++++++++ Epoch {0} --> Training phase: start ++++++++++++++++++++'.format(epoch_no))
    for batch_step_no, (X_batch, Y_batch) in enumerate(dataset_batch_train):  # one batch per step
        print('X_batch.shape = {0},Y_batch.shape = {1}------------type(X_batch) = {2},type(Y_batch) = {3}'.format(X_batch.shape, Y_batch.shape, type(X_batch), type(Y_batch)))
        # tf.GradientTape ties the "function" (the loss expression) to the "variables"
        # (all network parameters) so gradients can be traced.
        with tf.GradientTape() as tape:
            X_batch = tf.reshape(X_batch, (-1, 784))  # [b, 28, 28] => [b, 784]
            Y_batch_one_hot = tf.one_hot(Y_batch, depth=10)  # one-hot encoding
            print('X_batch.shape = {0},Y_batch_one_hot.shape = {1}------------type(X_batch) = {2},type(Y_batch_one_hot) = {3}'.format(X_batch.shape, Y_batch_one_hot.shape, type(X_batch), type(Y_batch_one_hot)))
            # Step 1. Forward pass: compute the prediction under the current parameters
            out_logits = network(X_batch)  # [b, 784] => [b, 10]
            # Step 2. Loss between prediction and ground truth: MSE or cross-entropy
            MSE_Loss_method01_mse = tf.reduce_sum(tf.square(out_logits - Y_batch_one_hot)) / X_batch.shape[0]  # MSE by hand
            MSE_Loss_method02_mse = tf.reduce_mean(tf.losses.MSE(Y_batch_one_hot, out_logits))
            MSE_Loss_method03_crossentropy = tf.reduce_mean(tf.losses.categorical_crossentropy(Y_batch_one_hot, out_logits, from_logits=True))
            print('epoch_no = {0}, batch_step_no = {1}, X_batch.shape = {2}, Y_batch_one_hot.shape = {3}'.format(epoch_no, batch_step_no + 1, X_batch.shape, Y_batch_one_hot.shape))
            print('\tEpoch {0} --> batch step {1} initial: MSE_Loss_method01_mse = {2}, MSE_Loss_method02_mse = {3}, MSE_Loss_method03_crossentropy = {4}'.format(epoch_no, batch_step_no + 1, MSE_Loss_method01_mse, MSE_Loss_method02_mse, MSE_Loss_method03_crossentropy))
            # use the cross-entropy loss for training
            MSE_Loss = MSE_Loss_method03_crossentropy
        # Step 3. Backward pass: gradients of the loss w.r.t. all trainable variables.
        # With 4 Dense layers the network has 8 parameters: [W1, B1, W2, B2, W3, B3, W4, B4].
        grads = tape.gradient(MSE_Loss, network.trainable_variables)
        grads, _ = tf.clip_by_global_norm(grads, 15)  # clipping against exploding/vanishing gradients
        w_index = 1
        b_index = 1
        print('\tEpoch {0} --> batch step {1} initial parameters:'.format(epoch_no, batch_step_no + 1))
        for grad in grads:
            if grad.ndim == 2:
                print('\t\tparameter w{0}: grad.shape = {1}, grad.ndim = {2}'.format(w_index, grad.shape, grad.ndim))
                w_index = w_index + 1
            else:
                print('\t\tparameter b{0}: grad.shape = {1}, grad.ndim = {2}'.format(b_index, grad.shape, grad.ndim))
                b_index = b_index + 1
        # One gradient-descent update: w' = w - lr * grad; zip pairs each gradient with its variable
        print('\tGradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): start')
        optimizer.apply_gradients(zip(grads, network.trainable_variables))
        print('\tGradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): end')
    print('++++++++++++++++++++ Epoch {0} --> Training phase: end ++++++++++++++++++++'.format(epoch_no))

# 6. Model evaluation (test)
def evaluation(epoch_no):
    print('++++++++++++++++++++ Epoch {0} --> Evaluation phase: start ++++++++++++++++++++'.format(epoch_no))
    total_correct, total_num = 0, 0
    for batch_step_no, (X_batch, Y_batch) in enumerate(dataset_batch_val):
        print('epoch_no = {0}, batch_step_no = {1}, X_batch.shape = {2}, Y_batch.shape = {3}'.format(epoch_no, batch_step_no + 1, X_batch.shape, Y_batch.shape))
        X_batch = tf.reshape(X_batch, [-1, 28 * 28])  # [b, 28, 28] => [b, 28*28]
        # Forward pass with the trained network
        out_logits = network(X_batch)
        print('\tout_logits[:1,:] = {0}'.format(out_logits[:1, :]))
        # softmax() maps the logits into [0, 1] with the per-class probabilities summing to 1
        out_logits_prob = tf.nn.softmax(out_logits, axis=1)  # out_logits_prob: [b, 10] ~ [0, 1]
        print('\tout_logits_prob[:1,:] = {0}'.format(out_logits_prob[:1, :]))
        out_logits_prob_max_index = tf.cast(tf.argmax(out_logits_prob, axis=1), dtype=tf.int32)  # [b, 10] => [b]; argmax returns int64, cast to int32
        print('\tPredictions: out_logits_prob_max_index = {0},\tGround truth: Y_batch = {1}'.format(out_logits_prob_max_index, Y_batch))
        is_correct_boolean = tf.equal(out_logits_prob_max_index, Y_batch.numpy())
        print('\tis_correct_boolean = {0}'.format(is_correct_boolean))
        is_correct_int = tf.cast(is_correct_boolean, dtype=tf.float32)
        print('\tis_correct_int = {0}'.format(is_correct_int))
        is_correct_count = tf.reduce_sum(is_correct_int)
        print('\tis_correct_count = {0}'.format(is_correct_count))
        total_correct += int(is_correct_count)
        total_num += X_batch.shape[0]
    print('total_correct = {0}---total_num = {1}'.format(total_correct, total_num))
    acc = total_correct / total_num
    print('Accuracy after epoch {0}: acc = {1}'.format(epoch_no, acc))
    print('++++++++++++++++++++ Epoch {0} --> Evaluation phase: end ++++++++++++++++++++'.format(epoch_no))

# 7. Iterate gradient descent over the whole dataset several times
def train():
    epoch_count = 3  # number of epochs over the full dataset
    for epoch_no in range(1, epoch_count + 1):
        print('\n\nEpoch {0} over the full dataset: start **********'.format(epoch_no))
        train_epoch(epoch_no)
        evaluation(epoch_no)
        print('Epoch {0} over the full dataset: end **********'.format(epoch_no))

if __name__ == '__main__':
    train()
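Because the last Dense layer has no activation, the softmax is applied inside the loss via from_logits=True rather than in the network. A minimal standalone sketch (my addition; values are illustrative) of why the two formulations agree numerically, with from_logits=True also being the more numerically stable choice:

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])
labels = tf.constant([[1.0, 0.0, 0.0]])

# from_logits=True: the loss applies a numerically stable softmax internally
loss_a = tf.losses.categorical_crossentropy(labels, logits, from_logits=True)

# equivalent but less stable: apply softmax explicitly, then pass probabilities
probs = tf.nn.softmax(logits)
loss_b = tf.losses.categorical_crossentropy(labels, probs)

print(float(loss_a[0]), float(loss_b[0]))  # both ~0.417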

Output:

X_train.shape = (60000, 28, 28),Y_train.shape = (60000,)------------type(X_train) = <class 'numpy.ndarray'>,type(Y_train) = <class 'numpy.ndarray'>
dataset_train = <ShuffleDataset shapes: ((28, 28), ()), types: (tf.float32, tf.int32)>,type(dataset_train) = <class 'tensorflow.python.data.ops.dataset_ops.ShuffleDataset'>
dataset_batch_train = <BatchDataset shapes: ((None, 28, 28), (None,)), types: (tf.float32, tf.int32)>,type(dataset_batch_train) = <class 'tensorflow.python.data.ops.dataset_ops.BatchDataset'>
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 500)               392500    
_________________________________________________________________
dense_1 (Dense)              (None, 300)               150300    
_________________________________________________________________
dense_2 (Dense)              (None, 100)               30100     
_________________________________________________________________
dense_3 (Dense)              (None, 10)                1010      
=================================================================
Total params: 573,910
Trainable params: 573,910
Non-trainable params: 0
_________________________________________________________________


Epoch 1 over the full dataset: start **********
++++++++++++++++++++ Epoch 1 --> Training phase: start ++++++++++++++++++++
X_batch.shape = (20000, 28, 28),Y_batch.shape = (20000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
X_batch.shape = (20000, 784),Y_batch_one_hot.shape = (20000, 10)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch_one_hot) = <class 'tensorflow.python.framework.ops.EagerTensor'>
epoch_no = 1, batch_step_no = 1, X_batch.shape = (20000, 784), Y_batch_one_hot.shape = (20000, 10)
	Epoch 1 --> batch step 1 initial: MSE_Loss_method01_mse = 2.0608606338500977, MSE_Loss_method02_mse = 0.20608605444431305, MSE_Loss_method03_crossentropy = 2.3258094787597656
	Epoch 1 --> batch step 1 initial parameters:
		parameter w1: grad.shape = (784, 500), grad.ndim = 2
		parameter b1: grad.shape = (500,), grad.ndim = 1
		parameter w2: grad.shape = (500, 300), grad.ndim = 2
		parameter b2: grad.shape = (300,), grad.ndim = 1
		parameter w3: grad.shape = (300, 100), grad.ndim = 2
		parameter b3: grad.shape = (100,), grad.ndim = 1
		parameter w4: grad.shape = (100, 10), grad.ndim = 2
		parameter b4: grad.shape = (10,), grad.ndim = 1
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): start
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): end
X_batch.shape = (20000, 28, 28),Y_batch.shape = (20000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
X_batch.shape = (20000, 784),Y_batch_one_hot.shape = (20000, 10)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch_one_hot) = <class 'tensorflow.python.framework.ops.EagerTensor'>
epoch_no = 1, batch_step_no = 2, X_batch.shape = (20000, 784), Y_batch_one_hot.shape = (20000, 10)
	Epoch 1 --> batch step 2 initial: MSE_Loss_method01_mse = 2.051452398300171, MSE_Loss_method02_mse = 0.2051452100276947, MSE_Loss_method03_crossentropy = 2.3246347904205322
	Epoch 1 --> batch step 2 initial parameters: (same eight w/b gradient shapes as above)
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): start
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): end
X_batch.shape = (20000, 28, 28),Y_batch.shape = (20000,)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>
X_batch.shape = (20000, 784),Y_batch_one_hot.shape = (20000, 10)------------type(X_batch) = <class 'tensorflow.python.framework.ops.EagerTensor'>,type(Y_batch_one_hot) = <class 'tensorflow.python.framework.ops.EagerTensor'>
epoch_no = 1, batch_step_no = 3, X_batch.shape = (20000, 784), Y_batch_one_hot.shape = (20000, 10)
	Epoch 1 --> batch step 3 initial: MSE_Loss_method01_mse = 2.060541868209839, MSE_Loss_method02_mse = 0.2060541808605194, MSE_Loss_method03_crossentropy = 2.3286986351013184
	Epoch 1 --> batch step 3 initial parameters: (same eight w/b gradient shapes as above)
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): start
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): end
++++++++++++++++++++ Epoch 1 --> Training phase: end ++++++++++++++++++++
++++++++++++++++++++ Epoch 1 --> Evaluation phase: start ++++++++++++++++++++
epoch_no = 1, batch_step_no = 1, X_batch.shape = (5000, 28, 28), Y_batch.shape = (5000,)
	out_logits[:1,:] = [[-0.09286472 -0.40838078  0.59313995  0.567914   -0.2801518  -0.4418489  -0.10716559  0.2037589  -0.54599494 -0.10862864]]
	out_logits_prob[:1,:] = [[0.08978924 0.06549338 0.1783004  0.17385885 0.07445374 0.06333771 0.08851431 0.12079425 0.05707322 0.0883849 ]]
	Predictions: out_logits_prob_max_index = [2 3 2 ... 2 3 2],	Ground truth: Y_batch = [0 5 8 ... 0 4 3]
	is_correct_boolean = [False False False ... False False False]
	is_correct_int = [0. 0. 0. ... 0. 0. 0.]
	is_correct_count = 531.0
epoch_no = 1, batch_step_no = 2, X_batch.shape = (5000, 28, 28), Y_batch.shape = (5000,)
	out_logits[:1,:] = [[-0.22466329  0.02471503  0.21530814  0.14504369 -0.41789523 -0.10817001 -0.04880778  0.10835642 -0.26833645  0.01063258]]
	out_logits_prob[:1,:] = [[0.08304936 0.10657121 0.12894765 0.1201982  0.06845682 0.0933101  0.09901691 0.11586837 0.07950038 0.10508094]]
	Predictions: out_logits_prob_max_index = [2 3 3 ... 2 3 3],	Ground truth: Y_batch = [1 5 9 ... 0 8 6]
	is_correct_boolean = [False False False ... False False False]
	is_correct_int = [0. 0. 0. ... 0. 0. 0.]
	is_correct_count = 463.0
total_correct = 994---total_num = 10000
Accuracy after epoch 1: acc = 0.0994
++++++++++++++++++++ Epoch 1 --> Evaluation phase: end ++++++++++++++++++++
Epoch 1 over the full dataset: end **********


Epoch 2 over the full dataset: start **********
++++++++++++++++++++ Epoch 2 --> Training phase: start ++++++++++++++++++++
epoch_no = 2, batch_step_no = 1, X_batch.shape = (20000, 784), Y_batch_one_hot.shape = (20000, 10)
	Epoch 2 --> batch step 1 initial: MSE_Loss_method01_mse = 2.0464680194854736, MSE_Loss_method02_mse = 0.20464679598808289, MSE_Loss_method03_crossentropy = 2.3242886066436768
	Epoch 2 --> batch step 1 initial parameters: (same eight w/b gradient shapes as above)
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): start
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): end
epoch_no = 2, batch_step_no = 2, X_batch.shape = (20000, 784), Y_batch_one_hot.shape = (20000, 10)
	Epoch 2 --> batch step 2 initial: MSE_Loss_method01_mse = 2.042309284210205, MSE_Loss_method02_mse = 0.2042309194803238, MSE_Loss_method03_crossentropy = 2.3242390155792236
	Epoch 2 --> batch step 2 initial parameters: (same eight w/b gradient shapes as above)
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): start
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): end
epoch_no = 2, batch_step_no = 3, X_batch.shape = (20000, 784), Y_batch_one_hot.shape = (20000, 10)
	Epoch 2 --> batch step 3 initial: MSE_Loss_method01_mse = 2.03645658493042, MSE_Loss_method02_mse = 0.20364566147327423, MSE_Loss_method03_crossentropy = 2.324409246444702
	Epoch 2 --> batch step 3 initial parameters: (same eight w/b gradient shapes as above)
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): start
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): end
++++++++++++++++++++ Epoch 2 --> Training phase: end ++++++++++++++++++++
++++++++++++++++++++ Epoch 2 --> Evaluation phase: start ++++++++++++++++++++
epoch_no = 2, batch_step_no = 1, X_batch.shape = (5000, 28, 28), Y_batch.shape = (5000,)
	out_logits[:1,:] = [[-0.43637595 -0.28027508  0.5622314   0.27695957 -0.21745302 -0.35976154 -0.17552382 -0.34576192 -0.5501306  -0.2705214 ]]
	out_logits_prob[:1,:] = [[0.07291631 0.08523509 0.19793124 0.14880666 0.09076151 0.07872233 0.09464797 0.07983216 0.06507612 0.08607052]]
	Predictions: out_logits_prob_max_index = [2 2 2 ... 3 3 2],	Ground truth: Y_batch = [4 7 9 ... 9 5 6]
	is_correct_boolean = [False False False ... False False False]
	is_correct_int = [0. 0. 0. ... 0. 0. 0.]
	is_correct_count = 494.0
epoch_no = 2, batch_step_no = 2, X_batch.shape = (5000, 28, 28), Y_batch.shape = (5000,)
	out_logits[:1,:] = [[-0.13980536 -0.4590331   0.38090762  0.49927706 -0.32300603 -0.32680526  0.15332891 -0.20123209 -0.07309138 -0.06368762]]
	out_logits_prob[:1,:] = [[0.08775125 0.06376972 0.14770532 0.16626595 0.07306179 0.07278474 0.11764133 0.0825232  0.09380519 0.09469148]]
	Predictions: out_logits_prob_max_index = [3 2 3 ... 2 3 2],	Ground truth: Y_batch = [5 9 8 ... 7 7 9]
	is_correct_boolean = [False False False ... False False False]
	is_correct_int = [0. 0. 0. ... 0. 0. 0.]
	is_correct_count = 514.0
total_correct = 1008---total_num = 10000
Accuracy after epoch 2: acc = 0.1008
++++++++++++++++++++ Epoch 2 --> Evaluation phase: end ++++++++++++++++++++
Epoch 2 over the full dataset: end **********


Epoch 3 over the full dataset: start **********
++++++++++++++++++++ Epoch 3 --> Training phase: start ++++++++++++++++++++
epoch_no = 3, batch_step_no = 1, X_batch.shape = (20000, 784), Y_batch_one_hot.shape = (20000, 10)
	Epoch 3 --> batch step 1 initial: MSE_Loss_method01_mse = 2.0333616733551025, MSE_Loss_method02_mse = 0.2033362090587616, MSE_Loss_method03_crossentropy = 2.3235182762145996
	Epoch 3 --> batch step 1 initial parameters: (same eight w/b gradient shapes as above)
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): start
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): end
epoch_no = 3, batch_step_no = 2, X_batch.shape = (20000, 784), Y_batch_one_hot.shape = (20000, 10)
	Epoch 3 --> batch step 2 initial: MSE_Loss_method01_mse = 2.023038387298584, MSE_Loss_method02_mse = 0.20230382680892944, MSE_Loss_method03_crossentropy = 2.3231124877929688
	Epoch 3 --> batch step 2 initial parameters: (same eight w/b gradient shapes as above)
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): start
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): end
epoch_no = 3, batch_step_no = 3, X_batch.shape = (20000, 784), Y_batch_one_hot.shape = (20000, 10)
	Epoch 3 --> batch step 3 initial: MSE_Loss_method01_mse = 2.022404432296753, MSE_Loss_method02_mse = 0.2022404819726944, MSE_Loss_method03_crossentropy = 2.3201558589935303
	Epoch 3 --> batch step 3 initial parameters: (same eight w/b gradient shapes as above)
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): start
	Gradient-descent step --> optimizer.apply_gradients(zip(grads, network.trainable_variables)): end
++++++++++++++++++++ Epoch 3 --> Training phase: end ++++++++++++++++++++
++++++++++++++++++++ Epoch 3 --> Evaluation phase: start ++++++++++++++++++++
epoch_no = 3, batch_step_no = 1, X_batch.shape = (5000, 28, 28), Y_batch.shape = (5000,)
	out_logits[:1,:] = [[-0.17201006 -0.35150957  0.3403244   0.34693193 -0.12656237 -0.50844735  0.2102331  -0.23760363 -0.5095875  -0.3919059 ]]
	out_logits_prob[:1,:] = [[0.09204622 0.07692196 0.15364201 0.15466058 0.09632604 0.06574961 0.13490005 0.08620234 0.06567468 0.07387652]]
	Predictions: out_logits_prob_max_index = [3 6 3 ... 3 2 3],	Ground truth: Y_batch = [0 3 0 ... 7 3 9]
	is_correct_boolean = [False False False ... False False False]
	is_correct_int = [0. 0. 0. ... 0. 0. 0.]
	is_correct_count = 526.0
epoch_no = 3, batch_step_no = 2, X_batch.shape = (5000, 28, 28), Y_batch.shape = (5000,)
	out_logits[:1,:] = [[-0.05259133 -0.4864607   0.31176758  0.16162373 -0.55530626 -0.37947384  0.01532018  0.11542548 -0.50286686 -0.02993747]]
	out_logits_prob[:1,:] = [[0.10453555 0.06773871 0.15048842 0.12950794 0.06323212 0.07538775 0.11188133 0.12366101 0.06663644 0.10693072]]
	Predictions: out_logits_prob_max_index = [2 3 2 ... 2 2 3],	Ground truth: Y_batch = [1 7 2 ... 5 7 8]
	is_correct_boolean = [False False  True ... False False False]
	is_correct_int = [0. 0. 1. ... 0. 0. 0.]
	is_correct_count = 496.0
total_correct = 1022---total_num = 10000
Accuracy after epoch 3: acc = 0.1022
++++++++++++++++++++ Epoch 3 --> Evaluation phase: end ++++++++++++++++++++
Epoch 3 over the full dataset: end **********

Process finished with exit code 0

3. The integrated Compile & Fit workflow (handwritten-digit recognition)

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # must be set before "import tensorflow as tf" to take effect
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, optimizers, datasets

# 1. Load the dataset
(X_train, Y_train), (X_val, Y_val) = datasets.mnist.load_data()
print('X_train.shape = {0},Y_train.shape = {1}------------type(X_train) = {2},type(Y_train) = {3}'.format(X_train.shape, Y_train.shape, type(X_train), type(Y_train)))

# 2. Data processing
# Preprocessing function: convert numpy data to tensors; flatten x and one-hot encode y
def preprocess(x, y):
    x = tf.cast(x, dtype=tf.float32) / 255.
    x = tf.reshape(x, [28 * 28])
    y = tf.cast(y, dtype=tf.int32)
    y = tf.one_hot(y, depth=10)
    return x, y

# 2.1 Training set
dataset_train = tf.data.Dataset.from_tensor_slices((X_train, Y_train))  # converts the numpy data to tensors automatically
dataset_train = dataset_train.map(preprocess)  # map() applies the conversion to every element
dataset_train = dataset_train.shuffle(len(X_train))  # shuffle so the original image order cannot bias the network
print('dataset_train = {0},type(dataset_train) = {1}'.format(dataset_train, type(dataset_train)))
batch_size_train = 20000  # a batch size of 100-200 is usually more appropriate
dataset_batch_train = dataset_train.batch(batch_size_train)  # reading one batch loads batch_size_train images in parallel
print('dataset_batch_train = {0},type(dataset_batch_train) = {1}'.format(dataset_batch_train, type(dataset_batch_train)))
# 2.2 Validation set
dataset_val = tf.data.Dataset.from_tensor_slices((X_val, Y_val))
dataset_val = dataset_val.map(preprocess)
dataset_val = dataset_val.shuffle(len(X_val))
batch_size_val = 5000
dataset_batch_val = dataset_val.batch(batch_size_val)

# 3. Network architecture: Dense is a fully connected layer, with ReLU activations
network = keras.Sequential([
    layers.Dense(500, activation=tf.nn.relu),  # 784 --> 500
    layers.Dense(300, activation=tf.nn.relu),  # 500 --> 300
    layers.Dense(100, activation=tf.nn.relu),  # 300 --> 100
    layers.Dense(10)])  # 100 --> 10; the output layer gets no activation, it is applied inside the loss
network.build(input_shape=[None, 784])  # 28*28 = 784; None stands for the (unknown) number of samples
network.summary()  # print a summary of the network

# 4. Configure optimizer, loss and metrics
network.compile(optimizer=optimizers.Adam(learning_rate=0.01),
                loss=tf.losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])

# 5. Feed the data and train the model parameters
print('\n++++++++++++++++++++ Training phase: start ++++++++++++++++++++')
network.fit(dataset_batch_train, epochs=5, validation_data=dataset_batch_val, validation_freq=2)  # validation_freq: run validation every N epochs
print('++++++++++++++++++++ Training phase: end ++++++++++++++++++++')

# 6. Model evaluation (test)
print('\n++++++++++++++++++++ Evaluation phase: start ++++++++++++++++++++')
network.evaluate(dataset_batch_val)
print('++++++++++++++++++++ Evaluation phase: end ++++++++++++++++++++')

# 7. Serving / inference
sample = next(iter(dataset_batch_val))  # take one batch from dataset_batch_val as a stand-in
x = sample[0]
y = sample[1]  # one-hot
pred = network.predict(x)  # [b, 10]
y = tf.argmax(y, axis=1)  # convert back to class numbers
pred = tf.argmax(pred, axis=1)
print('\n++++++++++++++++++++ Inference phase: start ++++++++++++++++++++')
print(pred)
print(y)
print('++++++++++++++++++++ Inference phase: end ++++++++++++++++++++')
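Because compile/fit hides the training loop, shape mismatches only surface when fit() runs. A small sanity check (my addition, assuming the pipeline above has been built) pulls one batch and confirms what fit() will receive; note also that with validation_freq=2 the val_loss/val_accuracy columns appear only on every second epoch in the log below.

x_sample, y_sample = next(iter(dataset_batch_train))
print(x_sample.shape, y_sample.shape)  # expected: (20000, 784) (20000, 10)
print(x_sample.dtype, y_sample.dtype)  # expected: <dtype: 'float32'> <dtype: 'float32'>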

Output:

X_train.shape = (60000, 28, 28),Y_train.shape = (60000,)------------type(X_train) = <class 'numpy.ndarray'>,type(Y_train) = <class 'numpy.ndarray'>
dataset_train = <ShuffleDataset shapes: ((784,), (10,)), types: (tf.float32, tf.float32)>,type(dataset_train) = <class 'tensorflow.python.data.ops.dataset_ops.ShuffleDataset'>
dataset_batch_train = <BatchDataset shapes: ((None, 784), (None, 10)), types: (tf.float32, tf.float32)>,type(dataset_batch_train) = <class 'tensorflow.python.data.ops.dataset_ops.BatchDataset'>
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 500)               392500    
_________________________________________________________________
dense_1 (Dense)              (None, 300)               150300    
_________________________________________________________________
dense_2 (Dense)              (None, 100)               30100     
_________________________________________________________________
dense_3 (Dense)              (None, 10)                1010      
=================================================================
Total params: 573,910
Trainable params: 573,910
Non-trainable params: 0
_________________________________________________________________

++++++++++++++++++++ Training phase: start ++++++++++++++++++++
Epoch 1/5
3/3 [==============================] - 3s 110ms/step - loss: 2.4614 - accuracy: 0.2259
Epoch 2/5
3/3 [==============================] - 2s 482ms/step - loss: 3.0340 - accuracy: 0.2840 - val_loss: 1.2846 - val_accuracy: 0.5411
Epoch 3/5
3/3 [==============================] - 2s 111ms/step - loss: 1.1396 - accuracy: 0.6152
Epoch 4/5
3/3 [==============================] - 2s 249ms/step - loss: 0.6815 - accuracy: 0.7853 - val_loss: 0.5006 - val_accuracy: 0.8409
Epoch 5/5
3/3 [==============================] - 2s 99ms/step - loss: 0.4985 - accuracy: 0.8491
++++++++++++++++++++ Training phase: end ++++++++++++++++++++

++++++++++++++++++++ Evaluation phase: start ++++++++++++++++++++
2/2 [==============================] - 0s 20ms/step - loss: 0.3861 - accuracy: 0.8883
++++++++++++++++++++ Evaluation phase: end ++++++++++++++++++++

++++++++++++++++++++ Inference phase: start ++++++++++++++++++++
tf.Tensor([6 7 2 ... 8 4 4], shape=(5000,), dtype=int64)
tf.Tensor([6 7 2 ... 8 4 4], shape=(5000,), dtype=int64)
++++++++++++++++++++ Inference phase: end ++++++++++++++++++++

Process finished with exit code 0

4. A custom network model with the integrated Compile & Fit workflow (CIFAR-10 dataset)

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # must be set before "import tensorflow as tf" to take effect
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, optimizers, datasets

# 1. Load the dataset
(X_train, Y_train), (X_val, Y_val) = datasets.cifar10.load_data()
print('X_train.shape = {0},Y_train.shape = {1}------------type(X_train) = {2},type(Y_train) = {3}'.format(X_train.shape, Y_train.shape, type(X_train), type(Y_train)))
print('X_val.shape = {0},Y_val.shape = {1}------------type(X_val) = {2},type(Y_val) = {3}'.format(X_val.shape, Y_val.shape, type(X_val), type(Y_val)))
Y_train = tf.squeeze(Y_train)  # [50k, 1] => [50k]
Y_val = tf.squeeze(Y_val)
print('X_train.shape = {0},Y_train.shape = {1}------------type(X_train) = {2},type(Y_train) = {3}'.format(X_train.shape, Y_train.shape, type(X_train), type(Y_train)))
print('X_val.shape = {0},Y_val.shape = {1}------------type(X_val) = {2},type(Y_val) = {3}'.format(X_val.shape, Y_val.shape, type(X_val), type(Y_val)))

# 2. Data processing
# Preprocessing function: convert numpy data to tensors; flatten x and one-hot encode y
def preprocess(x, y):
    x = 2 * tf.cast(x, dtype=tf.float32) / 255. - 1  # scale x into [-1, 1]
    x = tf.reshape(x, [32 * 32 * 3])
    y = tf.cast(y, dtype=tf.int32)
    y = tf.one_hot(y, depth=10)
    return x, y

# 2.1 Training set
dataset_train = tf.data.Dataset.from_tensor_slices((X_train, Y_train))  # converts the numpy data to tensors automatically
dataset_train = dataset_train.map(preprocess)  # map() applies the conversion to every element
dataset_train = dataset_train.shuffle(len(X_train))  # shuffle so the original image order cannot bias the network
print('dataset_train = {0},type(dataset_train) = {1}'.format(dataset_train, type(dataset_train)))
batch_size_train = 20000  # a batch size of 100-200 is usually more appropriate
dataset_batch_train = dataset_train.batch(batch_size_train)  # reading one batch loads batch_size_train images in parallel
print('dataset_batch_train = {0},type(dataset_batch_train) = {1}'.format(dataset_batch_train, type(dataset_batch_train)))
# 2.2 Validation set
dataset_val = tf.data.Dataset.from_tensor_slices((X_val, Y_val))
dataset_val = dataset_val.map(preprocess)
dataset_val = dataset_val.shuffle(len(X_val))
batch_size_val = 5000
dataset_batch_val = dataset_val.batch(batch_size_val)

# 3. Custom Layer
class MyDenseLayer(layers.Layer):
    def __init__(self, inp_dim, outp_dim):
        super(MyDenseLayer, self).__init__()
        self.kernel = self.add_weight('w', [inp_dim, outp_dim])  # layer.add_variable is deprecated; use layer.add_weight instead
        # self.bias = self.add_weight('b', [outp_dim])  # a custom layer may drop the bias term

    # call() implements the forward computation
    def call(self, inputs, training=None):
        # out = inputs @ self.kernel + self.bias
        out = inputs @ self.kernel
        return out

# 4. Custom Model
class MyDIYModel(keras.Model):
    def __init__(self):
        super(MyDIYModel, self).__init__()
        self.fc1 = MyDenseLayer(32 * 32 * 3, 500)
        self.fc2 = MyDenseLayer(500, 300)
        self.fc3 = MyDenseLayer(300, 100)
        self.fc4 = MyDenseLayer(100, 10)

    def call(self, inputs, training=None):
        x = self.fc1(inputs)
        x = tf.nn.relu(x)
        x = self.fc2(x)
        x = tf.nn.relu(x)
        x = self.fc3(x)
        x = tf.nn.relu(x)
        x = self.fc4(x)
        return x

# 5. Instantiate a network from the custom MyDIYModel
network = MyDIYModel()
network.build(input_shape=[None, 32 * 32 * 3])  # None stands for the (unknown) number of samples
network.summary()  # print a summary of the network

# 6. Configure optimizer, loss and metrics
network.compile(optimizer=optimizers.Adam(learning_rate=0.01),
                loss=tf.losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])

# 7. Feed the data and train the model parameters
print('\n++++++++++++++++++++ Training phase: start ++++++++++++++++++++')
network.fit(dataset_batch_train, epochs=5, validation_data=dataset_batch_val, validation_freq=2)  # validation_freq: run validation every N epochs
print('++++++++++++++++++++ Training phase: end ++++++++++++++++++++')

# 8. Model evaluation (test)
print('\n++++++++++++++++++++ Evaluation phase: start ++++++++++++++++++++')
network.evaluate(dataset_batch_val)
print('++++++++++++++++++++ Evaluation phase: end ++++++++++++++++++++')

# 9. Serving / inference
sample = next(iter(dataset_batch_val))  # take one batch from dataset_batch_val as a stand-in
x = sample[0]
y = sample[1]  # one-hot
pred = network.predict(x)  # [b, 10]
y = tf.argmax(y, axis=1)  # convert back to class numbers
pred = tf.argmax(pred, axis=1)
print('\n++++++++++++++++++++ Inference phase: start ++++++++++++++++++++')
print(pred)
print(y)
print('++++++++++++++++++++ Inference phase: end ++++++++++++++++++++')
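The custom layer above drops the bias term and requires the input dimension in __init__. A common variant (sketched here as an illustration, not taken from the original) restores the bias and defers weight creation to build(), which Keras calls with the actual input shape on first use:

class MyDenseLayerWithBias(layers.Layer):
    def __init__(self, outp_dim):
        super(MyDenseLayerWithBias, self).__init__()
        self.outp_dim = outp_dim

    def build(self, input_shape):
        # build() runs on the first call, once the input dimension is known
        self.kernel = self.add_weight('w', [input_shape[-1], self.outp_dim])
        self.bias = self.add_weight('b', [self.outp_dim], initializer='zeros')

    def call(self, inputs, training=None):
        return inputs @ self.kernel + self.bias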

Output:

X_train.shape = (50000, 32, 32, 3),Y_train.shape = (50000, 1)------------type(X_train) = <class 'numpy.ndarray'>,type(Y_train) = <class 'numpy.ndarray'>
X_val.shape = (10000, 32, 32, 3),Y_val.shape = (10000, 1)------------type(X_val) = <class 'numpy.ndarray'>,type(Y_val) = <class 'numpy.ndarray'>
X_train.shape = (50000, 32, 32, 3),Y_train.shape = (50000,)------------type(X_train) = <class 'numpy.ndarray'>,type(Y_train) = <class 'tensorflow.python.framework.ops.EagerTensor'>
X_val.shape = (10000, 32, 32, 3),Y_val.shape = (10000,)------------type(X_val) = <class 'numpy.ndarray'>,type(Y_val) = <class 'tensorflow.python.framework.ops.EagerTensor'>
dataset_train = <ShuffleDataset shapes: ((3072,), (10,)), types: (tf.float32, tf.float32)>,type(dataset_train) = <class 'tensorflow.python.data.ops.dataset_ops.ShuffleDataset'>
dataset_batch_train = <BatchDataset shapes: ((None, 3072), (None, 10)), types: (tf.float32, tf.float32)>,type(dataset_batch_train) = <class 'tensorflow.python.data.ops.dataset_ops.BatchDataset'>
Model: "my_diy_model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
my_dense_layer (MyDenseLayer multiple                  1536000   
_________________________________________________________________
my_dense_layer_1 (MyDenseLay multiple                  150000    
_________________________________________________________________
my_dense_layer_2 (MyDenseLay multiple                  30000     
_________________________________________________________________
my_dense_layer_3 (MyDenseLay multiple                  1000      
=================================================================
Total params: 1,717,000
Trainable params: 1,717,000
Non-trainable params: 0
_________________________________________________________________

++++++++++++++++++++ Training phase: start ++++++++++++++++++++
Epoch 1/5
3/3 [==============================] - 3s 306ms/step - loss: 10.8439 - accuracy: 0.1140
Epoch 2/5
3/3 [==============================] - 3s 735ms/step - loss: 13.1332 - accuracy: 0.1376 - val_loss: 7.9842 - val_accuracy: 0.1083
Epoch 3/5
3/3 [==============================] - 2s 249ms/step - loss: 6.6515 - accuracy: 0.1192
Epoch 4/5
3/3 [==============================] - 3s 519ms/step - loss: 2.5907 - accuracy: 0.1478 - val_loss: 2.3578 - val_accuracy: 0.1530
Epoch 5/5
3/3 [==============================] - 2s 242ms/step - loss: 2.3293 - accuracy: 0.1529
++++++++++++++++++++ Training phase: end ++++++++++++++++++++

++++++++++++++++++++ Evaluation phase: start ++++++++++++++++++++
2/2 [==============================] - 1s 76ms/step - loss: 2.2856 - accuracy: 0.1317
++++++++++++++++++++ Evaluation phase: end ++++++++++++++++++++

++++++++++++++++++++ Inference phase: start ++++++++++++++++++++
tf.Tensor([0 0 0 ... 3 0 1], shape=(5000,), dtype=int64)
tf.Tensor([3 8 3 ... 2 4 6], shape=(5000,), dtype=int64)
++++++++++++++++++++ Inference phase: end ++++++++++++++++++++

Process finished with exit code 0

That concludes this article on building DNN models with TensorFlow 2 using custom functions, keras.Sequential, Compile & Fit, custom Layers, and custom Models; hopefully the examples above are helpful.


