【计算机视觉】Gaussian Splatting源码解读补充(三)

2024-03-23 18:36

本文主要是介绍【计算机视觉】Gaussian Splatting源码解读补充(三),希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

  • 第一部分
  • 第二部分

本文是对学习笔记之——3D Gaussian Splatting源码解读的补充,并订正了一些错误。

目录

  • 五、反向传播
    • 1. `rasterizer_impl.cu`: `CudaRasterizer::Rasterizer::backward`
    • 2. `backward.cu`: `renderCUDA`
    • 3. `backward.cu`: `preprocess`

五、反向传播

原文第6节:

During the backward pass, we must therefore recover the full sequence of blended points per-pixel in the forward pass. One solution would be to store arbitrarily long lists of blended points per-pixel in global memory [Kopanas et al. 2021]. To avoid the implied dynamic memory management overhead, we instead choose to traverse the per-tile lists again; we can reuse the sorted array of Gaussians and tile ranges from the forward pass. To facilitate gradient computation, we now traverse them back-to-front.

The traversal starts from the last point that affected any pixel in the tile, and loading of points into shared memory again happens collaboratively. Additionally, each pixel will only start (expensive) overlap testing and processing of points if their depth is lower than or equal to the depth of the last point that contributed to its color during the forward pass. Computation of the gradients described in Sec. 4 requires the accumulated opacity values at each step during the original blending process. Rather than trasversing an explicit list of progressively shrinking opacities in the backward pass, we can recover these intermediate opacities by storing only the total accumulated opacity at the end of the forward pass. Specifically, each point stores the final accumulated opacity 𝛼 in the forward process; we divide this by each point’s 𝛼 in our back-to-front traversal to obtain the required coefficients for gradient computation.

反向传播是根据loss对每个像素颜色的导求出loss对Gaussian的2维中心坐标、3维中心坐标、椭圆二次型矩阵、不透明度、相机看到的Gaussian颜色、球谐系数、三维协方差矩阵、缩放和旋转的导数。

代码中出现的类似于dL_dx的变量的意思是loss对参数x的导数,即 ∂ L ∂ x \frac{\partial L}{\partial x} xL

1. rasterizer_impl.cu: CudaRasterizer::Rasterizer::backward

// Produce necessary gradients for optimization, corresponding
// to forward render pass
void CudaRasterizer::Rasterizer::backward(const int P, // Gaussian个数int D, // 球谐度数int M, // 三通道球谐系数个数int R,// R对应CudaRasterizer::Rasterizer::forward函数中的num_rendered// 即排序数组的个数(等于每个Gaussian覆盖的tile的个数之和)const float* background, // 背景颜色const int width, int height,const float* means3D,const float* shs,const float* colors_precomp,const float* scales,const float scale_modifier,const float* rotations,const float* cov3D_precomp,const float* viewmatrix,const float* projmatrix,const float* campos,const float tan_fovx, float tan_fovy,const int* radii,char* geom_buffer,char* binning_buffer,char* img_buffer,// 上面的参数不解释const float* dL_dpix, // loss对每个像素颜色的导数float* dL_dmean2D, // loss对Gaussian二维中心坐标的导数float* dL_dconic, // loss对椭圆二次型矩阵的导数float* dL_dopacity, // loss对不透明度的导数float* dL_dcolor, // loss对Gaussian颜色的导数(颜色是从相机中心看向Gaussian的颜色)float* dL_dmean3D, // loss对Gaussian三维中心坐标的导数float* dL_dcov3D, // loss对Gaussian三维协方差矩阵的导数float* dL_dsh, // loss对Gaussian的球谐系数的导数float* dL_dscale, // loss对Gaussian的缩放参数的导数float* dL_drot, // loss对Gaussian旋转四元数的导数bool debug)
{GeometryState geomState = GeometryState::fromChunk(geom_buffer, P);BinningState binningState = BinningState::fromChunk(binning_buffer, R);ImageState imgState = ImageState::fromChunk(img_buffer, width * height);// 上面这些缓冲区都是在前向传播的时候存下来的,现在拿出来用if (radii == nullptr){radii = geomState.internal_radii;}const float focal_y = height / (2.0f * tan_fovy);const float focal_x = width / (2.0f * tan_fovx);const dim3 tile_grid((width + BLOCK_X - 1) / BLOCK_X, (height + BLOCK_Y - 1) / BLOCK_Y, 1);const dim3 block(BLOCK_X, BLOCK_Y, 1);// Compute loss gradients w.r.t. 2D mean position, conic matrix,// opacity and RGB of Gaussians from per-pixel loss gradients.// If we were given precomputed colors and not SHs, use them.const float* color_ptr = (colors_precomp != nullptr) ? colors_precomp : geomState.rgb;CHECK_CUDA(BACKWARD::render(tile_grid,block,imgState.ranges,binningState.point_list,width, height,background,geomState.means2D,geomState.conic_opacity,color_ptr,imgState.accum_alpha,imgState.n_contrib,dL_dpix,(float3*)dL_dmean2D,(float4*)dL_dconic,dL_dopacity,dL_dcolor), debug)// Take care of the rest of preprocessing. Was the precomputed covariance// given to us or a scales/rot pair? If precomputed, pass that. If not,// use the one we computed ourselves.const float* cov3D_ptr = (cov3D_precomp != nullptr) ? cov3D_precomp : geomState.cov3D;// 因为是反向传播,所以preprocess放在后面了(❁´◡`❁)CHECK_CUDA(BACKWARD::preprocess(P, D, M,(float3*)means3D,radii,shs,geomState.clamped,(glm::vec3*)scales,(glm::vec4*)rotations,scale_modifier,cov3D_ptr,viewmatrix,projmatrix,focal_x, focal_y,tan_fovx, tan_fovy,(glm::vec3*)campos,(float3*)dL_dmean2D,dL_dconic,(glm::vec3*)dL_dmean3D,dL_dcolor,dL_dcov3D,dL_dsh,(glm::vec3*)dL_dscale,(glm::vec4*)dL_drot), debug)
}

2. backward.cu: renderCUDA

这个函数中每个线程负责一个像素,计算loss对Gaussian的二维中心坐标、椭圆二次型矩阵、不透明度和Gaussian颜色的导。

这里求导利用的公式是颜色的递推公式,与前向传播有所不同。设 b i \boldsymbol{b}_i bi是第 i i i个即其后的Gaussians渲染出来的颜色(三个通道), g i \boldsymbol{g}_i gi是第 i i i个Gaussian的颜色,则有递推公式: b i = α i g i + ( 1 − α i ) b i + 1 \boldsymbol{b}_i=\alpha_i \boldsymbol{g}_i+(1-\alpha_i)\boldsymbol{b}_{i+1} bi=αigi+(1αi)bi+1 T i = ( 1 − α 1 ) ( 1 − α 2 ) ⋯ ( 1 − α i − 1 ) T_i=(1-\alpha_1)(1-\alpha_2)\cdots(1-\alpha_{i-1}) Ti=(1α1)(1α2)(1αi1)为第 i i i个Gaussian对像素点的透光率, C \boldsymbol{C} C是像素点的颜色,则有 ∂ L ∂ b i = ∂ L ∂ C ⋅ ∂ b 1 ∂ b 2 ⋅ ∂ b 2 ∂ b 3 ⋯ ∂ b i − 1 ∂ b i = ∂ L ∂ C ⋅ ( 1 − α 1 ) I ⋅ ( 1 − α 2 ) I ⋯ ( 1 − α i − 1 ) I = T i ∂ L ∂ C \begin{aligned} \cfrac{\partial L}{\partial\boldsymbol{b}_i}&=\cfrac{\partial L}{\partial\boldsymbol{C}}\cdot\cfrac{\partial \boldsymbol{b}_1}{\partial\boldsymbol{b}_2}\cdot\cfrac{\partial \boldsymbol{b}_2}{\partial\boldsymbol{b}_3}\cdots\cfrac{\partial \boldsymbol{b}_{i-1}}{\partial\boldsymbol{b}_i}\\ &=\cfrac{\partial L}{\partial\boldsymbol{C}}\cdot(1-\alpha_1)I\cdot(1-\alpha_2)I\cdots(1-\alpha_{i-1})I\\ &=T_i\cfrac{\partial L}{\partial\boldsymbol{C}}\\ \end{aligned} biL=CLb2b1b3b2bibi1=CL(1α1)I(1α2)I(1αi1)I=TiCL ∂ L ∂ α i = T i ∂ L ∂ C ( g i − b i + 1 ) ∂ L ∂ g i = T i α i ∂ L ∂ C \cfrac{\partial L}{\partial \alpha_i}=T_i\cfrac{\partial L}{\partial\boldsymbol{C}}(\boldsymbol{g}_i-\boldsymbol{b}_{i+1})\\ \cfrac{\partial L}{\partial \boldsymbol{g}_i}=T_i\alpha_i\cfrac{\partial L}{\partial\boldsymbol{C}} αiL=TiCL(gibi+1)giL=TiαiCL α = min ⁡ ( 0.99 , σ G ) \alpha=\min(0.99,\sigma G) α=min(0.99,σG),其中 σ \sigma σ是“opacity”, G G G是正态分布给出的“exponential falloff”,则 ∂ L ∂ σ = G ∂ L ∂ α ∂ L ∂ G = σ ∂ L ∂ α \frac{\partial L}{\partial \sigma}=G\frac{\partial L}{\partial \alpha}\\\ \\ \frac{\partial L}{\partial G}=\sigma\frac{\partial L}{\partial \alpha} σL=GαL GL=σαL(注:代码中似乎不考虑 α > 0.99 \alpha>0.99 α>0.99的情况;也许是因为想让 σ \sigma σ过大时也有梯度从而变小?)
G = exp ⁡ { − 1 2 d T A d } G=\exp\{-\frac{1}{2}\boldsymbol{d}^T A\boldsymbol{d}\} G=exp{21dTAd} d = p − μ \boldsymbol{d}=\boldsymbol{p}-\boldsymbol\mu d=pμ,其中 p \boldsymbol p p为像素坐标, μ \boldsymbol\mu μ为Gausian中心的2D坐标, d \boldsymbol d d为它们之间的位移向量, A A A为椭圆二次型的矩阵,因此 ∂ L ∂ d = − 1 2 G ∂ L ∂ G [ d T ( A T + A ) ] ∂ L ∂ μ = − ∂ L ∂ d ∂ L ∂ A 00 = − 1 2 G ∂ L ∂ G d x 2 ∂ L ∂ A 11 = − 1 2 G ∂ L ∂ G d y 2 ∂ L ∂ A 01 = − 1 2 G ∂ L ∂ G d x d y \frac{\partial L}{\partial \boldsymbol d}=-\cfrac{1}{2}G\frac{\partial L}{\partial G}[\boldsymbol d^T(A^T+A)]\\ \cfrac{\partial L}{\partial \boldsymbol \mu}=-\cfrac{\partial L}{\partial \boldsymbol d}\\ \frac{\partial L}{\partial \boldsymbol A_{00}}=-\cfrac{1}{2}G\frac{\partial L}{\partial G}d_x^2\\ \frac{\partial L}{\partial \boldsymbol A_{11}}=-\cfrac{1}{2}G\frac{\partial L}{\partial G}d_y^2\\ \frac{\partial L}{\partial \boldsymbol A_{01}}=-\cfrac{1}{2}G\frac{\partial L}{\partial G}d_x d_y dL=21GGL[dT(AT+A)]μL=dLA00L=21GGLdx2A11L=21GGLdy2A01L=21GGLdxdy注意计算 ∂ L ∂ μ \frac{\partial L}{\partial \boldsymbol \mu} μL时要乘以像素坐标对像平面坐标的导。

公式中出现的变量和代码中变量的关系如下表:

公式变量代码变量
b i + 1 \boldsymbol{b}_{i+1} bi+1accum_rec
g i \boldsymbol{g}_i gic
T i T_i TiT
∂ L ∂ C \cfrac{\partial L}{\partial\boldsymbol{C}} CLdL_dpixel
σ i \sigma_i σicon_o.w
d \boldsymbol{d} dd
μ \boldsymbol\mu μxy
p \boldsymbol p ppixf
A 00 , A 01 , A 11 A_{00},A_{01},A_{11} A00,A01,A11con_o.x,con_o.y,con_o.z
// Backward version of the rendering procedure.
template <uint32_t C>
__global__ void __launch_bounds__(BLOCK_X * BLOCK_Y)
renderCUDA(const uint2* __restrict__ ranges,const uint32_t* __restrict__ point_list,int W, int H,const float* __restrict__ bg_color,const float2* __restrict__ points_xy_image,const float4* __restrict__ conic_opacity,const float* __restrict__ colors,const float* __restrict__ final_Ts,const uint32_t* __restrict__ n_contrib,const float* __restrict__ dL_dpixels,float3* __restrict__ dL_dmean2D,float4* __restrict__ dL_dconic2D,float* __restrict__ dL_dopacity,float* __restrict__ dL_dcolors)
{// We rasterize again. Compute necessary block info.auto block = cg::this_thread_block();const uint32_t horizontal_blocks = (W + BLOCK_X - 1) / BLOCK_X;const uint2 pix_min = { block.group_index().x * BLOCK_X, block.group_index().y * BLOCK_Y };const uint2 pix_max = { min(pix_min.x + BLOCK_X, W), min(pix_min.y + BLOCK_Y , H) };const uint2 pix = { pix_min.x + block.thread_index().x, pix_min.y + block.thread_index().y };const uint32_t pix_id = W * pix.y + pix.x;const float2 pixf = { (float)pix.x, (float)pix.y };// 上面这些就是当前线程负责的像素点的信息const bool inside = pix.x < W&& pix.y < H;const uint2 range = ranges[block.group_index().y * horizontal_blocks + block.group_index().x];const int rounds = ((range.y - range.x + BLOCK_SIZE - 1) / BLOCK_SIZE);// 轮数,和前向传播一样bool done = !inside;int toDo = range.y - range.x;__shared__ int collected_id[BLOCK_SIZE];__shared__ float2 collected_xy[BLOCK_SIZE];__shared__ float4 collected_conic_opacity[BLOCK_SIZE];__shared__ float collected_colors[C * BLOCK_SIZE];// In the forward, we stored the final value for T, the// product of all (1 - alpha) factors. const float T_final = inside ? final_Ts[pix_id] : 0;float T = T_final; // 最后的透光度// We start from the back. The ID of the last contributing// Gaussian is known from each pixel from the forward.uint32_t contributor = toDo;const int last_contributor = inside ? n_contrib[pix_id] : 0;float accum_rec[C] = { 0 };float dL_dpixel[C]; // loss对该像素RGB通道的导数if (inside)for (int i = 0; i < C; i++)dL_dpixel[i] = dL_dpixels[i * H * W + pix_id];float last_alpha = 0;float last_color[C] = { 0 };// Gradient of pixel coordinate w.r.t. normalized // screen-space viewport corrdinates (-1 to 1)// 像素坐标对像平面坐标的导(参见ndc2Pix函数)const float ddelx_dx = 0.5 * W;const float ddely_dy = 0.5 * H;// Traverse all Gaussiansfor (int i = 0; i < rounds; i++, toDo -= BLOCK_SIZE){// Load auxiliary data into shared memory, start in the BACK// and load them in revers order.block.sync();const int progress = i * BLOCK_SIZE + block.thread_rank();if (range.x + progress < range.y){const int coll_id = point_list[range.y - progress - 1];collected_id[block.thread_rank()] = coll_id;collected_xy[block.thread_rank()] = points_xy_image[coll_id];collected_conic_opacity[block.thread_rank()] = conic_opacity[coll_id];for (int i = 0; i < C; i++)collected_colors[i * BLOCK_SIZE + block.thread_rank()] = colors[coll_id * C + i];}block.sync();// Iterate over Gaussiansfor (int j = 0; !done && j < min(BLOCK_SIZE, toDo); j++){// Keep track of current Gaussian ID. Skip, if this one// is behind the last contributor for this pixel.contributor--;if (contributor >= last_contributor)continue;// 之所以不直接把contributor设置成last_contributor,// 是因为排在我的last_contributor后面的Gaussian可能// 对于其他像素来说是可见的// Compute blending values, as before.const float2 xy = collected_xy[j];const float2 d = { xy.x - pixf.x, xy.y - pixf.y };const float4 con_o = collected_conic_opacity[j];const float power = -0.5f * (con_o.x * d.x * d.x + con_o.z * d.y * d.y) - con_o.y * d.x * d.y;if (power > 0.0f)continue;const float G = exp(power);const float alpha = min(0.99f, con_o.w * G); // 先把alpha算出来if (alpha < 1.0f / 255.0f)continue;T = T / (1.f - alpha); // 一点一点把T除掉(T原本是累乘)const float dchannel_dcolor = alpha * T;// Propagate gradients to per-Gaussian colors and keep// gradients w.r.t. alpha (blending factor for a Gaussian/pixel// pair).float dL_dalpha = 0.0f;const int global_id = collected_id[j];for (int ch = 0; ch < C; ch++){const float c = collected_colors[ch * BLOCK_SIZE + j];// Gaussian的color// Update last color (to be used in the next iteration)accum_rec[ch] = last_alpha * last_color[ch] + (1.f - last_alpha) * accum_rec[ch];// 更新我以及后面所有Gaussian的颜色贡献last_color[ch] = c; // 我后面的Gaussian的颜色const float dL_dchannel = dL_dpixel[ch];dL_dalpha += (c - accum_rec[ch]) * dL_dchannel;// Update the gradients w.r.t. color of the Gaussian. // Atomic, since this pixel is just one of potentially// many that were affected by this Gaussian.atomicAdd(&(dL_dcolors[global_id * C + ch]), dchannel_dcolor * dL_dchannel);}dL_dalpha *= T;// Update last alpha (to be used in the next iteration)last_alpha = alpha;// Account for fact that alpha also influences how much of// the background color is added if nothing left to blendfloat bg_dot_dpixel = 0;for (int i = 0; i < C; i++)bg_dot_dpixel += bg_color[i] * dL_dpixel[i];dL_dalpha += (-T_final / (1.f - alpha)) * bg_dot_dpixel;// Helpful reusable temporary variablesconst float dL_dG = con_o.w * dL_dalpha;const float gdx = G * d.x;const float gdy = G * d.y;const float dG_ddelx = -gdx * con_o.x - gdy * con_o.y;const float dG_ddely = -gdy * con_o.z - gdx * con_o.y;// 下面累积这些参数的梯度// Update gradients w.r.t. 2D mean position of the GaussianatomicAdd(&dL_dmean2D[global_id].x, dL_dG * dG_ddelx * ddelx_dx);atomicAdd(&dL_dmean2D[global_id].y, dL_dG * dG_ddely * ddely_dy);// Update gradients w.r.t. 2D covariance (2x2 matrix, symmetric)atomicAdd(&dL_dconic2D[global_id].x, -0.5f * gdx * d.x * dL_dG);atomicAdd(&dL_dconic2D[global_id].y, -0.5f * gdx * d.y * dL_dG);atomicAdd(&dL_dconic2D[global_id].w, -0.5f * gdy * d.y * dL_dG);// Update gradients w.r.t. opacity of the GaussianatomicAdd(&(dL_dopacity[global_id]), G * dL_dalpha);}}
}

3. backward.cu: preprocess

这个函数主要干了四件事:

  1. 从Gaussian中心在像平面上的二维坐标的梯度计算其三维坐标的梯度;
  2. 从Gaussian颜色的梯度计算球谐系数的梯度(利用computeColorFromSH函数);
  3. 从Gaussian在像平面上投影的2D椭圆二次型矩阵的梯度计算其3D协方差矩阵梯度;
  4. 从协方差矩阵梯度计算缩放和旋转参数的梯度。

该函数及其调用的函数的代码我就不贴了,都是一些纷繁复杂的数学运算。


至此,我们大体上读完了Gaussian Splatting的代码。

完结撒花!!🎆​🎆​

这篇关于【计算机视觉】Gaussian Splatting源码解读补充(三)的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/839182

相关文章

java之Objects.nonNull用法代码解读

《java之Objects.nonNull用法代码解读》:本文主要介绍java之Objects.nonNull用法代码,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐... 目录Java之Objects.nonwww.chinasem.cnNull用法代码Objects.nonN

Python实现无痛修改第三方库源码的方法详解

《Python实现无痛修改第三方库源码的方法详解》很多时候,我们下载的第三方库是不会有需求不满足的情况,但也有极少的情况,第三方库没有兼顾到需求,本文将介绍几个修改源码的操作,大家可以根据需求进行选择... 目录需求不符合模拟示例 1. 修改源文件2. 继承修改3. 猴子补丁4. 追踪局部变量需求不符合很

SpringCloud负载均衡spring-cloud-starter-loadbalancer解读

《SpringCloud负载均衡spring-cloud-starter-loadbalancer解读》:本文主要介绍SpringCloud负载均衡spring-cloud-starter-loa... 目录简述主要特点使用负载均衡算法1. 轮询负载均衡策略(Round Robin)2. 随机负载均衡策略(

解读spring.factories文件配置详情

《解读spring.factories文件配置详情》:本文主要介绍解读spring.factories文件配置详情,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教... 目录使用场景作用内部原理机制SPI机制Spring Factories 实现原理用法及配置spring.f

Spring MVC使用视图解析的问题解读

《SpringMVC使用视图解析的问题解读》:本文主要介绍SpringMVC使用视图解析的问题解读,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教... 目录Spring MVC使用视图解析1. 会使用视图解析的情况2. 不会使用视图解析的情况总结Spring MVC使用视图

Linux中的进程间通信之匿名管道解读

《Linux中的进程间通信之匿名管道解读》:本文主要介绍Linux中的进程间通信之匿名管道解读,具有很好的参考价值,希望对大家有所帮助,如有错误或未考虑完全的地方,望不吝赐教... 目录一、基本概念二、管道1、温故知新2、实现方式3、匿名管道(一)管道中的四种情况(二)管道的特性总结一、基本概念我们知道多

Spring 中 BeanFactoryPostProcessor 的作用和示例源码分析

《Spring中BeanFactoryPostProcessor的作用和示例源码分析》Spring的BeanFactoryPostProcessor是容器初始化的扩展接口,允许在Bean实例化前... 目录一、概览1. 核心定位2. 核心功能详解3. 关键特性二、Spring 内置的 BeanFactory

Linux系统之authconfig命令的使用解读

《Linux系统之authconfig命令的使用解读》authconfig是一个用于配置Linux系统身份验证和账户管理设置的命令行工具,主要用于RedHat系列的Linux发行版,它提供了一系列选项... 目录linux authconfig命令的使用基本语法常用选项示例总结Linux authconfi

解读docker运行时-itd参数是什么意思

《解读docker运行时-itd参数是什么意思》在Docker中,-itd参数组合用于在后台运行一个交互式容器,同时保持标准输入和分配伪终端,这种方式适合需要在后台运行容器并保持交互能力的场景... 目录docker运行时-itd参数是什么意思1. -i(或 --interactive)2. -t(或 --

解读为什么@Autowired在属性上被警告,在setter方法上不被警告问题

《解读为什么@Autowired在属性上被警告,在setter方法上不被警告问题》在Spring开发中,@Autowired注解常用于实现依赖注入,它可以应用于类的属性、构造器或setter方法上,然... 目录1. 为什么 @Autowired 在属性上被警告?1.1 隐式依赖注入1.2 IDE 的警告: