【Machine Learning】Suitable Learning Rate in Machine Learning

2024-03-17 10:36
文章标签 suitable learning machine rate

本文主要是介绍【Machine Learning】Suitable Learning Rate in Machine Learning,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

一、The cases of different learning rates:

        In the gradient descent algorithm model:

w = w - \alpha \frac{ \partial J(w,b) }{ \partial w }

        \alpha is the learning rate of the demand, how to determine the learning rate, and what impact does it have if it is too large or too small? We will analyze it through the following graph:

        We can use the same method as before to understand this equation, so that b in J (w, b) is 0, and then we can create a two-dimensional coordinate graph:

        So let's first observe the case of a smaller learning rate (starting from F):

        In this case, there is a high probability that the minimum point can be found, which means that it can eventually converge.

        Then there are situations with high learning rates:

        We can find that when the learning rate is high but within a certain limit, convergence can also be achieved. The reason for this can be started from the formula. Whenever a point drops to a point with a smaller slope, its learning rate remains unchanged, but the slope decreases, and it will eventually continue to decline until convergence. However, will this situation continue? We can take a look at the following situation:

        The difference between this and the above is that when descending, it may just skip the optimal point, which may result in the convergence value not being optimal.

        Finally, there is the case of divergence:

        So the situation is roughly like these:

        In the picture, loss is an indicator that measures the difference between the predicted results of the model and the actual labels, and epoch is a complete training process in the gradient descent algorithm, which includes multiple iterations of parameter updates.

二、How to choose the Suitable Learning Rate:

        In algorithm design, we should adjust the learning rate in real time and determine the size of the adjustment by observing the fitted model. After each iteration, use the estimated model parameters to view the value of the error function. If the error rate decreases compared to the previous iteration, the learning rate can be increased. If the error rate increases compared to the previous iteration, the value of the previous iteration should be reset and the learning rate reduced to 50% of the previous iteration. Therefore, this is a method of adaptive learning rate adjustment. There are simple and direct methods for dynamically changing learning rates in deep learning frameworks such as Caffe and TensorFlow.

        The commonly used learning rates are 0.00001, 0.0001, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10

这篇关于【Machine Learning】Suitable Learning Rate in Machine Learning的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/818702

相关文章

简单的Q-learning|小明的一维世界(3)

简单的Q-learning|小明的一维世界(1) 简单的Q-learning|小明的一维世界(2) 一维的加速度世界 这个世界,小明只能控制自己的加速度,并且只能对加速度进行如下三种操作:增加1、减少1、或者不变。所以行动空间为: { u 1 = − 1 , u 2 = 0 , u 3 = 1 } \{u_1=-1, u_2=0, u_3=1\} {u1​=−1,u2​=0,u3​=1}

简单的Q-learning|小明的一维世界(2)

上篇介绍了小明的一维世界模型 、Q-learning的状态空间、行动空间、奖励函数、Q-table、Q table更新公式、以及从Q值导出策略的公式等。最后给出最简单的一维位置世界的Q-learning例子,从给出其状态空间、行动空间、以及稠密与稀疏两种奖励函数的设置方式。下面将继续深入,GO! 一维的速度世界 这个世界,小明只能控制自己的速度,并且只能对速度进行如下三种操作:增加1、减

SQLException: No Suitable Driver Found - 完美解决方法详解

🚨 SQLException: No Suitable Driver Found - 完美解决方法详解 🚨 **🚨 SQLException: No Suitable Driver Found - 完美解决方法详解 🚨****摘要 📝****引言 🎯****正文 📚****1. 问题概述 ❗****2. JDBC 驱动程序的工作原理 🔧****3. 错误的根本原因 🕵️**

ZOJ 3324 Machine(线段树区间合并)

这道题网上很多代码是错误的,由于后台数据水,他们可以AC。 比如这组数据 10 3 p 0 9 r 0 5 r 6 9 输出应该是 0 1 1 所以有的人直接记录该区间是否被覆盖过的方法是错误的 正确方法应该是记录这段区间的最小高度(就是最接近初始位置的高度),和最小高度对应的最长左区间和右区间 开一个sum记录这段区间最小高度的块数,min_v 记录该区间最小高度 cover

SpringBoot启动报错Failed to determine a suitable driver class

两种解决办法 1.在Application类上加 ` @EnableAutoConfiguration(exclude={DataSourceAutoConfiguration.class}) package com.example.demo3;import org.springframework.boot.SpringApplication;import org.springframewo

Learning Memory-guided Normality for Anomaly Detection——学习记忆引导的常态异常检测

又是一篇在自编码器框架中研究使用记忆模块的论文,可以看做19年的iccv的论文的衍生,在我的博客中对19年iccv这篇论文也做了简单介绍。韩国人写的,应该是吧,这名字听起来就像。 摘要abstract 我们解决异常检测的问题,即检测视频序列中的异常事件。基于卷积神经网络的异常检测方法通常利用代理任务(如重建输入视频帧)来学习描述正常情况的模型,而在训练时看不到异常样本,并在测试时使用重建误

Learning Temporal Regularity in Video Sequences——视频序列的时间规则性学习

Learning Temporal Regularity in Video Sequences CVPR2016 无监督视频异常事件检测早期工作 摘要 由于对“有意义”的定义不明确以及场景混乱,因此在较长的视频序列中感知有意义的活动是一个具有挑战性的问题。我们通过在非常有限的监督下使用多种来源学习常规运动模式的生成模型(称为规律性)来解决此问题。体来说,我们提出了两种基于自动编码器的方法,以

COD论文笔记 Adaptive Guidance Learning for Camouflaged Object Detection

论文的主要动机、现有方法的不足、拟解决的问题、主要贡献和创新点如下: 动机: 论文的核心动机是解决伪装目标检测(COD)中的挑战性任务。伪装目标检测旨在识别和分割那些在视觉上与周围环境高度相似的目标,这对于计算机视觉来说是非常困难的任务。尽管深度学习方法在该领域取得了一定进展,但现有方法仍面临有效分离目标和背景的难题,尤其是在伪装目标与背景特征高度相似的情况下。 现有方法的不足之处: 过于

One-Shot Imitation Learning

发表时间:NIPS2017 论文链接:https://readpaper.com/pdf-annotate/note?pdfId=4557560538297540609&noteId=2424799047081637376 作者单位:Berkeley AI Research Lab, Work done while at OpenAI Yan Duan†§ , Marcin Andrychow

Introduction to Deep Learning with PyTorch

1、Introduction to PyTorch, a Deep Learning Library 1.1、Importing PyTorch and related packages import torch# supports:## image data with torchvision## audio data with torchaudio## text data with t