[Paper Notes] Stochastic gradient descent with differentially private updates


A few questions noted

  • The sample size required for a target utility level increases with the privacy constraint.
  • Optimization methods for large data sets must also be scalable.
  • SGD algorithms satisfy asymptotic guarantees


Introduction

  • Summary of the main work:
    \quad In this paper we derive differentially private versions of single-point SGD and mini-batch SGD, and evaluate them on real and synthetic data sets.

  • Reasons for preferring SGD:
    \quad Stochastic gradient descent (SGD) algorithms are simple and satisfy the same asymptotic guarantees as more computationally intensive learning methods.

  • A consequence of relying on asymptotic guarantees:
    \quad to obtain reasonable performance on finite data sets practitioners must take care in setting parameters such as the learning rate (step size) for the updates.

  • How this sensitivity is addressed:
    \quad Grouping updates into “minibatches” alleviates some of this sensitivity and improves the performance of SGD. This improves the robustness of the updates at a moderate computational expense, but also introduces the batch size as a free parameter.


Preliminaries

  • Optimization objective:
    \quad Solve a regularized convex optimization problem: $w^* = \mathop{\textbf{argmin}}\limits_{w \in \mathbb{R}^d} \frac{\lambda}{2} \Vert w \Vert^2 + \frac{1}{n} \sum_{i=1}^n \ell(w, x_i, y_i)$
    \quad where $w$ is the normal vector to the hyperplane separator, and $\ell$ is a convex loss function.
    \quad Choosing $\ell$ to be the logistic loss, $\ell(w,x,y) = \log(1 + e^{-y w^T x})$, gives logistic regression.
    \quad Choosing $\ell$ to be the hinge loss, $\ell(w,x,y) = \max(0, 1 - y w^T x)$, gives the SVM.
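The two loss choices and the regularized objective can be sketched as follows (a minimal illustration of the formulas above; the function names are my own, not code from the paper):

```python
import numpy as np

def logistic_loss(w, x, y):
    """l(w, x, y) = log(1 + exp(-y * w^T x))  ->  logistic regression."""
    return np.log1p(np.exp(-y * (w @ x)))

def hinge_loss(w, x, y):
    """l(w, x, y) = max(0, 1 - y * w^T x)  ->  SVM."""
    return max(0.0, 1.0 - y * (w @ x))

def regularized_objective(w, X, Y, lam, loss=logistic_loss):
    """J(w) = (lam/2) * ||w||^2 + (1/n) * sum_i l(w, x_i, y_i)."""
    n = len(Y)
    total = sum(loss(w, x, y) for x, y in zip(X, Y))
    return 0.5 * lam * (w @ w) + total / n
```

Both losses are convex in $w$, which is what the optimization framework above requires.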

  • Optimization algorithm:
    \quad SGD with mini-batch updates: $w_{t+1} = w_t - \eta_t \Big( \lambda w_t + \frac{1}{b} \sum_{(x_i,y_i) \in B_t} \nabla \ell(w_t, x_i, y_i) \Big)$
    \quad where $\eta_t$ is a learning rate, and the update at each step $t$ is based on a small subset $B_t$ of examples of size $b$.
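The mini-batch update above can be sketched as one step of plain (non-private) SGD on the regularized logistic objective (my own naming, not reference code from the paper):

```python
import numpy as np

def logistic_grad(w, x, y):
    """Gradient of log(1 + exp(-y * w^T x)) with respect to w."""
    return -y * x / (1.0 + np.exp(y * (w @ x)))

def minibatch_sgd_step(w, batch, lam, eta):
    """One update: w_{t+1} = w_t - eta * (lam * w_t + (1/b) * sum_{(x,y) in B_t} grad l)."""
    b = len(batch)
    g = sum(logistic_grad(w, x, y) for x, y in batch) / b
    return w - eta * (lam * w + g)
```

With `b = 1` this reduces to single-point SGD; larger `b` averages the per-example gradients, which is what makes the update more robust.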



SGD with Differential Privacy

  • Differentially private mini-batch SGD:
    \quad A differentially private version of the mini-batch update: $w_{t+1} = w_t - \eta_t \Big( \lambda w_t + \frac{1}{b} \sum_{(x_i,y_i) \in B_t} \nabla \ell(w_t, x_i, y_i) + \frac{1}{b} Z_t \Big)$
    \quad where $Z_t$ is a random noise vector in $\mathbb{R}^d$ drawn independently from the density $\rho(z) \propto e^{-(\alpha/2) \|z\|}$.
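One standard way to sample from $\rho(z) \propto e^{-(\alpha/2)\|z\|}$ in $\mathbb{R}^d$ is to draw a uniformly random direction and scale it by a norm drawn from a Gamma distribution with shape $d$ and scale $2/\alpha$ (since $p(r) \propto r^{d-1} e^{-(\alpha/2) r}$). A sketch of this sampler and the private update, under my own naming (the paper does not prescribe this implementation):

```python
import numpy as np

def sample_dp_noise(d, alpha, rng):
    """Draw z in R^d with density proportional to exp(-(alpha/2) * ||z||):
    uniform direction on the sphere, norm ~ Gamma(shape=d, scale=2/alpha)."""
    direction = rng.standard_normal(d)
    direction /= np.linalg.norm(direction)
    return rng.gamma(shape=d, scale=2.0 / alpha) * direction

def dp_minibatch_step(w, batch, lam, eta, alpha, rng, grad_fn):
    """Private update: w - eta * (lam*w + (1/b) * sum grad l + (1/b) * Z_t)."""
    b = len(batch)
    g = sum(grad_fn(w, x, y) for x, y in batch) / b
    z = sample_dp_noise(len(w), alpha, rng)
    return w - eta * (lam * w + g + z / b)
```

Note the noise enters scaled by $1/b$: a larger batch shrinks the noise added per update, which is exactly the batch-size effect observed in the experiments below.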

  • Conditions under which the mini-batch updates above are $\alpha$-differentially private:
    \quad $\mathcal{Theorem}$: If the initialization point $w_0$ is chosen independently of the sensitive data, the batches $B_t$ are disjoint, and $\|\nabla \ell(w,x,y)\| \leq 1$ for all $w$ and all $(x_i, y_i)$, then SGD with mini-batch updates is $\alpha$-differentially private.
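The theorem requires $\|\nabla \ell(w,x,y)\| \leq 1$. For the logistic loss this holds automatically whenever inputs are scaled so that $\|x\| \leq 1$; a generic way to enforce the bound for other losses is to rescale (clip) each per-example gradient. This is a common practice, not a step stated in the paper:

```python
import numpy as np

def clip_gradient(g, max_norm=1.0):
    """Rescale g so that ||g|| <= max_norm, enforcing the theorem's gradient-norm condition."""
    norm = np.linalg.norm(g)
    return g * (max_norm / norm) if norm > max_norm else g
```

Bounding the gradient norm bounds the sensitivity of each update, which is what lets the fixed noise density $\rho(z) \propto e^{-(\alpha/2)\|z\|}$ guarantee $\alpha$-differential privacy.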



Experiments

  • Experimental observations:
    \quad With a batch size of 1, DP-SGD has noticeably higher variance than ordinary SGD; increasing the batch size reduces the variance considerably.

  • Takeaway from the above:
    \quad In terms of objective value, guaranteeing differential privacy can come for “free” using SGD with moderate batch size.

  • In fact, the effect of batch size is non-monotonic: performance first improves, then degrades.
    \quad Increasing the batch size improves the performance of private SGD, but only up to a point: much larger batch sizes actually degrade performance.


A few additional takeaways

  • The data dimension $d$ and the privacy parameter affect how much data an experiment needs:
    \quad Differentially private learning algorithms often have a sample complexity that scales linearly with the data dimension $d$ and inversely with the privacy risk $\alpha$. Thus a moderate reduction in $\alpha$ or increase in $d$ may require correspondingly more data; for example, halving $\alpha$ roughly doubles the number of samples needed to reach the same utility.


Ref

S. Song, K. Chaudhuri, and A. Sarwate. Stochastic gradient descent with differentially private updates. In IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2013.
