deep_learning_week2: Logistic Regression

2024-04-25 12:18



Tags: machine learning, deep learning


The code has been uploaded to GitHub:
https://github.com/PerfectDemoT/my_deeplearning_homework

This is the first assignment in Andrew Ng's Deep Learning course.

  • deep_learning_week2: Logistic Regression
      • This is the first assignment in Andrew Ng's Deep Learning course
        • Import the packages
        • Load the data, then display an image to check the result
        • Now we can start writing the functions
        • The sigmoid function
        • The function that initializes w and b
        • Forward and backward propagation
        • The function that updates w and b
        • The prediction function
      • Now we can finally train the parameters for real (so far the functions were only tested; from here on we use the actual image data)
        • Finally, something fun: use your own image

Implementing logistic regression


1. Import the packages
import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
import pylab
from scipy import ndimage
from lr_utils import load_dataset  # a small helper .py file shipped with the assignment that loads the data
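For reference, lr_utils is not a standard library; it is a tiny helper that comes with the assignment. A minimal sketch of what load_dataset typically does, assuming the standard course file and key names (datasets/train_catvnoncat.h5 and so on -- these names are an assumption, not shown in this post):

import numpy as np
import h5py

def load_dataset():
    # Assumed file and key names from the standard assignment folder
    with h5py.File('datasets/train_catvnoncat.h5', "r") as train_dataset:
        train_set_x_orig = np.array(train_dataset["train_set_x"][:])   # image data, shape (209, 64, 64, 3)
        train_set_y_orig = np.array(train_dataset["train_set_y"][:])   # labels, shape (209,)
    with h5py.File('datasets/test_catvnoncat.h5', "r") as test_dataset:
        test_set_x_orig = np.array(test_dataset["test_set_x"][:])
        test_set_y_orig = np.array(test_dataset["test_set_y"][:])
        classes = np.array(test_dataset["list_classes"][:])            # [b'non-cat', b'cat']
    # Reshape the labels into row vectors of shape (1, m)
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes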
2. Load the data, then display an image to check the result

The data-loading code is as follows:

# Loading the data (cat/non-cat)
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
# 209 training examples (x and y) and 50 test examples (x and y)

Here is one of the images:
[image: a sample picture from the training set]

# This is the code that displays the image above
index = 19
plt.imshow(train_set_x_orig[index])
pylab.show()
print ("y = " + str(train_set_y[:, index]) + ", it's a '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") +  "' picture.")

The assignment then has you check the dimensions of the loaded data:

### START CODE HERE ### (≈ 3 lines of code)
m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]
### END CODE HERE ###
print ("Number of training examples: m_train = " + str(m_train))
print ("Number of testing examples: m_test = " + str(m_test))
print ("Height/Width of each image: num_px = " + str(num_px))
print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print ("train_set_x shape: " + str(train_set_x_orig.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x shape: " + str(test_set_x_orig.shape))
print ("test_set_y shape: " + str(test_set_y.shape))

The output looks like this:
[image: printed example counts and dataset shapes]


3. Now we can start writing the functions

First, flatten each 64*64*3 image array into a vector:

### START CODE HERE ### (≈ 2 lines of code)
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T
### The images are now processed: each one has been flattened into a vector

This turns the original 4-D array (number of examples, image height, image width, RGB channels) into a 2-D array: reshape to (number of examples, everything else), then transpose so that each column is one flattened image of length num_px * num_px * 3.
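To see what this reshape-and-transpose trick does, here is a quick sketch on a toy array (the shapes are made up purely for illustration):

import numpy as np

toy = np.arange(2 * 2 * 2 * 3).reshape(2, 2, 2, 3)   # 2 "images" of size 2 x 2 x 3
flat = toy.reshape(toy.shape[0], -1).T               # same trick as above
print(toy.shape)    # (2, 2, 2, 3)
print(flat.shape)   # (12, 2) -- one column per example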


Now normalize the data:

# Normalize: map the RGB values from 0-255 down to 0-1 by dividing by 255
train_set_x = train_set_x_flatten/255.
test_set_x = test_set_x_flatten/255.

Now we can write each of the functions:

The sigmoid function
def sigmoid(z):
    """
    Compute the sigmoid of z

    Arguments:
    z -- A scalar or numpy array of any size.

    Return:
    s -- sigmoid(z)
    """
    ### START CODE HERE ### (≈ 1 line of code)
    s = 1. / (1. + np.exp(-z))
    ### END CODE HERE ###
    return s
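A quick sanity check: sigmoid(0) is exactly 0.5 and sigmoid(2) is about 0.88.

print("sigmoid([0, 2]) = " + str(sigmoid(np.array([0, 2]))))   # roughly [0.5  0.88079708]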
The function that initializes w and b (to zeros)
def initialize_with_zeros(dim):
    """
    This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.

    Argument:
    dim -- size of the w vector we want (or number of parameters in this case)

    Returns:
    w -- initialized vector of shape (dim, 1)
    b -- initialized scalar (corresponds to the bias)
    """
    ### START CODE HERE ### (≈ 1 line of code)
    w = np.zeros((dim, 1))
    b = 0
    ### END CODE HERE ###
    assert (w.shape == (dim, 1))
    assert (isinstance(b, float) or isinstance(b, int))
    return w, b
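Calling it with dim = 2 should return a 2 x 1 zero vector and b = 0:

w, b = initialize_with_zeros(2)
print("w = " + str(w))   # [[0.] [0.]]
print("b = " + str(b))   # 0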

Now the forward and backward propagation
def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation explained above

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b

    Tips:
    - Write your code step by step for the propagation. np.log(), np.dot()
    """
    m = X.shape[1]

    # FORWARD PROPAGATION (FROM X TO COST)
    ### START CODE HERE ### (≈ 2 lines of code)
    A = sigmoid(np.dot(w.T, X) + b)                                  # compute activation
    cost = np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / (-m)    # compute cost
    ### END CODE HERE ###

    # BACKWARD PROPAGATION (TO FIND GRAD)
    ### START CODE HERE ### (≈ 2 lines of code)
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m
    ### END CODE HERE ###

    assert (dw.shape == w.shape)
    assert (db.dtype == float)
    cost = np.squeeze(cost)
    assert (cost.shape == ())

    grads = {"dw": dw,
             "db": db}

    return grads, cost
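A small self-check with made-up values (these toy numbers are mine, not the assignment's grading cell); the gradients and cost should come out close to the values in the comments:

w_t = np.array([[1.], [2.]])
b_t = 2.
X_t = np.array([[1., 2.], [3., 4.]])
Y_t = np.array([[1, 0]])
grads_t, cost_t = propagate(w_t, b_t, X_t, Y_t)
print("dw = " + str(grads_t["dw"]))   # roughly [[1.00] [2.00]]
print("db = " + str(grads_t["db"]))   # roughly 0.50
print("cost = " + str(cost_t))        # roughly 6.00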

The function that updates w and b
def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
    """
    This function optimizes w and b by running a gradient descent algorithm

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat), of shape (1, number of examples)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- True to print the loss every 100 steps

    Returns:
    params -- dictionary containing the weights w and bias b
    grads -- dictionary containing the gradients of the weights and bias with respect to the cost function
    costs -- list of all the costs computed during the optimization, this will be used to plot the learning curve.

    Tips:
    You basically need to write down two steps and iterate through them:
        1) Calculate the cost and the gradient for the current parameters. Use propagate().
        2) Update the parameters using gradient descent rule for w and b.
    """
    costs = []

    for i in range(num_iterations):

        # Cost and gradient calculation (≈ 1-4 lines of code)
        ### START CODE HERE ###
        grads, cost = propagate(w, b, X, Y)
        ### END CODE HERE ###

        # Retrieve derivatives from grads
        dw = grads["dw"]
        db = grads["db"]

        # update rule (≈ 2 lines of code)
        ### START CODE HERE ###
        w = w - learning_rate * dw
        b = b - learning_rate * db
        ### END CODE HERE ###

        # Record the costs
        if i % 100 == 0:
            costs.append(cost)

        # Print one cost value every 100 iterations
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))

    params = {"w": w,
              "b": b}

    grads = {"dw": dw,
             "db": db}

    return params, grads, costs
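Continuing with the same toy values from the propagate check above, a short run of gradient descent should return updated parameters and the recorded costs (values are illustrative only):

params_t, grads_t, costs_t = optimize(w_t, b_t, X_t, Y_t, num_iterations=100, learning_rate=0.009, print_cost=False)
print("w = " + str(params_t["w"]))
print("b = " + str(params_t["b"]))
print("costs recorded = " + str(costs_t))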

And this is the prediction function
def predict(w, b, X):
    '''
    Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b)

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)

    Returns:
    Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X
    '''
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)

    # Compute vector "A" predicting the probabilities of a cat being present in the picture
    ### START CODE HERE ### (≈ 1 line of code)
    A = sigmoid(np.dot(w.T, X) + b)   # note: the sigmoid and the bias b are needed here, not just w.T X
    ### END CODE HERE ###

    for i in range(A.shape[1]):
        # Convert probabilities A[0,i] to actual predictions p[0,i]
        ### START CODE HERE ### (≈ 4 lines of code)
        if A[0, i] > 0.5:
            Y_prediction[0, i] = 1
        else:
            Y_prediction[0, i] = 0
        ### END CODE HERE ###

    assert (Y_prediction.shape == (1, m))

    return Y_prediction
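predict should return a (1, m) array of 0/1 labels; with the toy parameters from above, both examples land above the 0.5 threshold:

print("predictions = " + str(predict(w_t, b_t, X_t)))   # [[1. 1.]] for the toy values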

4. Now we can finally train the parameters for real (so far the functions were only tested; from here on we train on the actual image data)

The code is as follows:

# Now combine all of the functions above into one model
# GRADED FUNCTION: model
print("=============== Finally, time to work on the real images ===============")
def model(X_train, Y_train, X_test, Y_test, num_iterations=10000, learning_rate=0.01, print_cost=False):
    """
    Builds the logistic regression model by calling the function you've implemented previously

    Arguments:
    X_train -- training set represented by a numpy array of shape (num_px * num_px * 3, m_train)
    Y_train -- training labels represented by a numpy array (vector) of shape (1, m_train)
    X_test -- test set represented by a numpy array of shape (num_px * num_px * 3, m_test)
    Y_test -- test labels represented by a numpy array (vector) of shape (1, m_test)
    num_iterations -- hyperparameter representing the number of iterations to optimize the parameters
    learning_rate -- hyperparameter representing the learning rate used in the update rule of optimize()
    print_cost -- Set to true to print the cost every 100 iterations

    Returns:
    d -- dictionary containing information about the model.
    """
    ### START CODE HERE ###

    # initialize parameters with zeros (≈ 1 line of code)
    w, b = initialize_with_zeros(X_train.shape[0])

    # Gradient descent (≈ 1 line of code)
    # Pass print_cost through; hard-coding print_cost=False here would silence the per-iteration output
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost=print_cost)

    # Retrieve parameters w and b from dictionary "parameters"
    w = parameters["w"]
    b = parameters["b"]

    # Predict test/train set examples (≈ 2 lines of code)
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)

    ### END CODE HERE ###

    # Print train/test errors
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))

    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train": Y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": learning_rate,
         "num_iterations": num_iterations}

    return d

Here is an example call that trains the model:

d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 10000, learning_rate = 0.001, print_cost = True)

Once training is done, let's look at the accuracy:

# Same metric that model() already prints; outside the function, read the predictions back from d
print("train accuracy: {} %".format(100 - np.mean(np.abs(d["Y_prediction_train"] - train_set_y)) * 100))
print("test accuracy: {} %".format(100 - np.mean(np.abs(d["Y_prediction_test"] - test_set_y)) * 100))

The result looks like this:
[image: printed train/test accuracy]


Now let's look at the cost curve (the code looks like this):

costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()

The plot looks like this (learning rate 0.001, 10000 iterations):
[image: cost vs. iterations (per hundreds) learning curve]


Next, let's look at how to choose the learning rate (code first):

learning_rates = [0.01, 0.001, 0.0001]  # an array with three candidate values; this only demonstrates the idea -- there are more principled ways to pick the rate (covered in the machine learning course)
models = {}
for i in learning_rates:
    print("learning rate is: " + str(i))
    models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations=1500, learning_rate=i, print_cost=False)
    print('\n' + "-------------------------------------------------------" + '\n')

for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]["costs"]), label=str(models[str(i)]["learning_rate"]))

plt.ylabel('cost')
plt.xlabel('iterations')

legend = plt.legend(loc='upper center', shadow=True)
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()
# From the plot, 0.001 is the best of the three learning rates (well, if you ignore the blue oscillations at the start, 0.01 might also look fine, but notice that the training error is already very small while the test error stays large -- an overfitting situation -- so 0.01 does not actually reduce the test error)

Here are the plots:
[image: cost curves for the three learning rates]
And the accuracy for each learning rate:
[image: train/test accuracy for each learning rate]


Finally, something fun: use your own image

Now play around with your own image; a prediction of 1 means the model thinks it is a cat, 0 means it does not.

## START CODE HERE ## (PUT YOUR IMAGE NAME)
my_image = "my_image2.jpg"   # change this to the name of your image file
## END CODE HERE ##
# We preprocess the image to fit your algorithm.
fname = "images/" + my_image
image = np.array(ndimage.imread(fname, flatten=False))
my_image = scipy.misc.imresize(image, size=(num_px,num_px)).reshape((1, num_px*num_px*3)).T
my_predicted_image = predict(d["w"], d["b"], my_image)
plt.imshow(image)
print("y = " + str(np.squeeze(my_predicted_image)) + ", your algorithm predicts a \"" + classes[int(np.squeeze(my_predicted_image)),].decode("utf-8") +  "\" picture.")
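One caveat: scipy.ndimage.imread and scipy.misc.imresize have been removed from recent SciPy releases, so the snippet above only runs on old SciPy versions. A rough equivalent using PIL (already imported above) might look like the sketch below; note it also divides by 255 so the new image is scaled the same way as the training data.

# Alternative preprocessing with PIL for newer SciPy versions
image_pil = Image.open(fname)
image_resized = np.array(image_pil.resize((num_px, num_px)))          # (num_px, num_px, 3)
my_image = image_resized.reshape((1, num_px * num_px * 3)).T / 255.   # scale like the training data
my_predicted_image = predict(d["w"], d["b"], my_image)
plt.imshow(image_resized)
print("y = " + str(np.squeeze(my_predicted_image)))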

The image looks like this:
[image: my own test picture]
And the prediction is:
[image: the printed prediction output]

I also tried a picture of the Eiffel Tower and it was classified as non-cat, which felt pretty good... (OK, to be honest the accuracy is still fairly low.) The model is also missing regularization to prevent overfitting (and I am fairly sure it is overfitting here, since training accuracy is very high while test accuracy is much lower). Adding regularization to prevent overfitting will come next time.


