Demystifying the Linux Kernel Socket File Systems (Sockfs)

2024-06-03 16:08

本文主要是介绍Demystifying the Linux Kernel Socket File Systems (Sockfs),希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

All Linux  networking works with System Calls creating network sockets (using the Socket System Call). The Socket System Call returns an integer (socket descriptor).

“Writing” or “reading” to/from that socket descriptor (as though it were a file) using generic System Calls  write / read respectively creates TCP network traffic rather than file-system writes/reads.

Note: The file-system descriptor would have been created by the “Open” system call IF … the descriptor were a “regular” file-system descriptor, intended for “regular” / file-system writes and reads (via System Calls write/read respectively) to files etc.

Further Note: This implies that the network socket descriptor created by the “socket” System Call will be used by systems programmer to write/read , using the same System Calls write/read used for “regular” file system writes/reads (System Calls that would, under normal and other circumstances, write/read data to/from memory).

Further further Note:  A System Call  “write” (to the descriptor that was created by the socket System Call)  must translate “magically” into a TCP transaction that “writes” the data across the network (ostensibly to the client on the other end), with the data “written” encapsulated within the payload section of a TCP packet.

This process of adapting  and hijacking the kernel file-system infrastructure to incorporate network operations /socket operations is called SOCKFS (Socket File System).

So how does  the linux kernel accomplish this process, where a file-system write is “faked” into a network-system “write”, if indeed it can be called that ?

Well…as is usually the case, the linux kernel’s methods begins at System / Kernel Initialization, when a special socket file-system (statically defined sock_fs_type)  for networks is “registered” by register_file_system. This happens in sock_init. File systems are registered so that disk partitions can be mounted for that file system.

The kernel registered file system type sock_fs_type  so that it could create a fake mount point  using kern_mount (for the file system sock_fs_type).  This mount point is necessary if the kernel is to later create a “fake file”   *struct file  using  existing/generic mechanisms and infrastructure  made available for the Virtual File System (VFS). These mechanisms  and infrastructure would include a mount point being available.

         Note:  No “actual” mount point exists, not in the sense an inode etc etc.

                       We will blog on file systems later.

Then when the socket System Call is initiated (to create the socket descriptor),  the kernel executes sock_create to create a new descriptor (aka the socket descriptor). The kernel also  executes sock_map_fd, which creates a   “fake file” , and  assigns the “fake file” to the socket descriptor. The “fake” files ops ( file->f_op) are then initialized to be socket_file_ops  (statically defined at compile time in source/net/socket.c).

The kernel assigns/maps the socket descriptor created earlier to the new “fake”  file using fd_install.

This socket descriptor is returned by the Socket System Call (as required by the MAN page of the Socket System Call) to the user program.

I only call it “fake” file because a System Call write executed against that socket descriptor will use the VFS infrastructure created, but  the data will not be written into a disk-file anywhere. It will, instead, be translated into a network operation because of the f_op‘s assigned to the “fake” file (socket_file_ops).

The kernel is now set up to create network traffic when System Calls write/read  are executed to/from to the “fake” file descriptor (the socket descriptor)  which was returned to the user when System Call socket was executed.

In point of fact, a System Call write to the “fake” files socket descriptor will then translate into a call to  __sock_sendmsg within the kernel, instead of a write into the “regular” file system. Because that is how socket_file_ops is statically defined before assignment to the “fake” file.

And then we are into networking space. And the promised Lan of milk, honey,  TCP traffic,  SOCKFS and File Systems.

No one said understanding the kernel was easy. But extremely gratification awaits those that work on it. And also creates enormous opportunities for innovation.  I  explain Linux Kernel concepts and more in my classes ( Advanced Linux Kernel Programming @UCSC-Extension, and also in other classes that I teach independently).

As always, Feedback, Questions  and Comments are appreciated and will be responded to. I will like to listen to gripes, especially  if you also paypal me some.  Thanks

这篇关于Demystifying the Linux Kernel Socket File Systems (Sockfs)的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/1027461

相关文章

linux-基础知识3

打包和压缩 zip 安装zip软件包 yum -y install zip unzip 压缩打包命令: zip -q -r -d -u 压缩包文件名 目录和文件名列表 -q:不显示命令执行过程-r:递归处理,打包各级子目录和文件-u:把文件增加/替换到压缩包中-d:从压缩包中删除指定的文件 解压:unzip 压缩包名 打包文件 把压缩包从服务器下载到本地 把压缩包上传到服务器(zip

Linux 网络编程 --- 应用层

一、自定义协议和序列化反序列化 代码: 序列化反序列化实现网络版本计算器 二、HTTP协议 1、谈两个简单的预备知识 https://www.baidu.com/ --- 域名 --- 域名解析 --- IP地址 http的端口号为80端口,https的端口号为443 url为统一资源定位符。CSDNhttps://mp.csdn.net/mp_blog/creation/editor

【Python编程】Linux创建虚拟环境并配置与notebook相连接

1.创建 使用 venv 创建虚拟环境。例如,在当前目录下创建一个名为 myenv 的虚拟环境: python3 -m venv myenv 2.激活 激活虚拟环境使其成为当前终端会话的活动环境。运行: source myenv/bin/activate 3.与notebook连接 在虚拟环境中,使用 pip 安装 Jupyter 和 ipykernel: pip instal

Linux_kernel驱动开发11

一、改回nfs方式挂载根文件系统         在产品将要上线之前,需要制作不同类型格式的根文件系统         在产品研发阶段,我们还是需要使用nfs的方式挂载根文件系统         优点:可以直接在上位机中修改文件系统内容,延长EMMC的寿命         【1】重启上位机nfs服务         sudo service nfs-kernel-server resta

【Linux 从基础到进阶】Ansible自动化运维工具使用

Ansible自动化运维工具使用 Ansible 是一款开源的自动化运维工具,采用无代理架构(agentless),基于 SSH 连接进行管理,具有简单易用、灵活强大、可扩展性高等特点。它广泛用于服务器管理、应用部署、配置管理等任务。本文将介绍 Ansible 的安装、基本使用方法及一些实际运维场景中的应用,旨在帮助运维人员快速上手并熟练运用 Ansible。 1. Ansible的核心概念

Linux服务器Java启动脚本

Linux服务器Java启动脚本 1、初版2、优化版本3、常用脚本仓库 本文章介绍了如何在Linux服务器上执行Java并启动jar包, 通常我们会使用nohup直接启动,但是还是需要手动停止然后再次启动, 那如何更优雅的在服务器上启动jar包呢,让我们一起探讨一下吧。 1、初版 第一个版本是常用的做法,直接使用nohup后台启动jar包, 并将日志输出到当前文件夹n

[Linux]:进程(下)

✨✨ 欢迎大家来到贝蒂大讲堂✨✨ 🎈🎈养成好习惯,先赞后看哦~🎈🎈 所属专栏:Linux学习 贝蒂的主页:Betty’s blog 1. 进程终止 1.1 进程退出的场景 进程退出只有以下三种情况: 代码运行完毕,结果正确。代码运行完毕,结果不正确。代码异常终止(进程崩溃)。 1.2 进程退出码 在编程中,我们通常认为main函数是代码的入口,但实际上它只是用户级

【Linux】应用层http协议

一、HTTP协议 1.1 简要介绍一下HTTP        我们在网络的应用层中可以自己定义协议,但是,已经有大佬定义了一些现成的,非常好用的应用层协议,供我们直接使用,HTTP(超文本传输协议)就是其中之一。        在互联网世界中,HTTP(超文本传输协议)是一个至关重要的协议,他定义了客户端(如浏览器)与服务器之间如何进行通信,以交换或者传输超文本(比如HTML文档)。

如何编写Linux PCIe设备驱动器 之二

如何编写Linux PCIe设备驱动器 之二 功能(capability)集功能(capability)APIs通过pci_bus_read_config完成功能存取功能APIs参数pos常量值PCI功能结构 PCI功能IDMSI功能电源功率管理功能 功能(capability)集 功能(capability)APIs int pcie_capability_read_wo