This article describes the "unable to read osd superblock" error reported when starting an OSD on Bcache, and how to fix it.
Environment
Item | Details |
---|---|
Architecture | LoongArch |
Processor | Loongson-3C5000 |
Kernel version | 4.19 |
OS version | lns8 |
Ceph version | Nautilus 14.2.22 |
Ceph cluster | Single-node minimal cluster: one Monitor, two OSDs, one Manager |
PAGESIZE | 16384 |
[root@ceph01 ~]# getconf PAGESIZE
16384
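Problem description
Bcache is used to accelerate the block devices: in the environment above, a Bcache device was created and an OSD was created on top of it. However, systemctl restart ceph-osd@0.service fails, and /var/log/ceph/ceph-osd.0.log contains the following: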
2023-10-13 05:26:42.705 fff37c0030 -1 bluestore(/var/lib/ceph/osd/ceph-0) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x246e0328, expected 0x6d5d9709, device location [0x2000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#
2023-10-13 05:26:42.705 fff37c0030 -1 bluestore(/var/lib/ceph/osd/ceph-0) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x246e0328, expected 0x6d5d9709, device location [0x2000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#
2023-10-13 05:26:42.705 fff37c0030 -1 bluestore(/var/lib/ceph/osd/ceph-0) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x246e0328, expected 0x6d5d9709, device location [0x2000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#
2023-10-13 05:26:42.705 fff37c0030 -1 bluestore(/var/lib/ceph/osd/ceph-0) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0x246e0328, expected 0x6d5d9709, device location [0x2000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#
2023-10-13 05:26:42.705 fff37c0030 -1 osd.0 0 OSD::init() : unable to read osd superblock
2023-10-13 05:26:42.705 fff37c0030 1 bluestore(/var/lib/ceph/osd/ceph-0) umount
2023-10-13 05:26:42.705 fff37c0030 4 rocksdb: [db/db_impl.cc:390] Shutdown: canceling all background work
2023-10-13 05:26:42.705 fff37c0030 4 rocksdb: [db/db_impl.cc:563] Shutdown complete
2023-10-13 05:26:42.709 fff37c0030 1 bluefs umount
2023-10-13 05:26:42.709 fff37c0030 1 bdev(0xaac6157500 /var/lib/ceph/osd/ceph-0/block.wal) close
2023-10-13 05:26:42.989 fff37c0030 1 bdev(0xaac6157880 /var/lib/ceph/osd/ceph-0/block.db) close
2023-10-13 05:26:43.273 fff37c0030 1 bdev(0xaac6157c00 /var/lib/ceph/osd/ceph-0/block) close
2023-10-13 05:26:43.509 fff37c0030 1 freelist shutdown
2023-10-13 05:26:43.509 fff37c0030 1 bdev(0xaac6156000 /var/lib/ceph/osd/ceph-0/block) close
2023-10-13 05:26:43.709 fff37c0030 -1 ** ERROR: osd init failed: (22) Invalid argument
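The key line is OSD::init() : unable to read osd superblock: during initialization the OSD cannot read its superblock. The preceding _verify_csum lines show that the crc32c checksum of the osd_superblock object, stored at device location [0x2000~1000], does not match the expected value.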
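To relate the log to on-disk addresses, the device location field can be decoded by hand. The following is a minimal sketch, assuming the field uses the offset~length form with both numbers in hexadecimal, as the log above suggests; the helper name parse_extent is hypothetical and is not part of Ceph.

```python
import re


def parse_extent(text):
    """Parse a string such as '[0x2000~1000]' into (offset, length) in bytes.

    Assumption: both fields are hexadecimal; only the offset carries '0x'.
    """
    m = re.fullmatch(r"\[?0x([0-9a-fA-F]+)~([0-9a-fA-F]+)\]?", text.strip())
    if m is None:
        raise ValueError(f"not an offset~length extent: {text!r}")
    return int(m.group(1), 16), int(m.group(2), 16)


# Value copied from the _verify_csum log lines above.
offset, length = parse_extent("[0x2000~1000]")
print(f"start = {offset // 1024}K, end = {(offset + length) // 1024}K")
# prints: start = 8K, end = 12K
```

Under that assumption the failing read covers the 4 KiB range from 8K to 12K on the device.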
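Solution
There are two ways to fix this:
- Change the kernel PAGESIZE to 4K. The Kunpeng BoostKit distributed storage enablement suite documentation describes how to change the kernel page size to 4K.
- https://www.hikunpeng.com/document/detail/zh/kunpengsdss/appAccelFeatures/globalCache/kunpengglobalcache_05_0040.html
- (Recommended) Patch Ceph so that the device label is written with a direct write. On the LoongArch platform with a 16K page size, the OSD writes the superblock to the address range 8K-12K with a direct write, and writes the device label to 0-4K with a buffered write. For buffered writes the operating system flushes data to disk in page-aligned units. The superblock and the device label happen to fall in the same page, so flushing that page overwrites the superblock and the correct data can no longer be read back. Changing the device label write to a direct write fixes the problem.
- https://gitee.com/src-openeuler/ceph/blob/master/0007-bluestore-use-direct-write-for-bdevlabel.patch
- The problem is also fixed in the latest Ceph source: https://github.com/ceph/ceph/blob/main/src/os/bluestore/BlueStore.cc#L6480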
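The page-sharing argument in the recommended fix can be checked with a little arithmetic. The sketch below is only an illustration: the byte ranges come from the explanation above (device label at 0-4K, superblock at 8K-12K), and the helper names pages_touched and shares_page are hypothetical, not Ceph or kernel APIs.

```python
def pages_touched(offset, length, page_size):
    """Return the set of page indices covered by [offset, offset + length)."""
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return set(range(first, last + 1))


def shares_page(a, b, page_size):
    """True if byte ranges a and b (offset, length) fall in a common page."""
    return bool(pages_touched(*a, page_size) & pages_touched(*b, page_size))


LABEL = (0, 4096)             # device label, buffered write, 0-4K
SUPERBLOCK = (0x2000, 0x1000) # OSD superblock, direct write, 8K-12K

for page_size in (4096, 16384):
    print(page_size, shares_page(LABEL, SUPERBLOCK, page_size))
# prints:
# 4096 False
# 16384 True
```

With 4K pages the two ranges sit in different pages, so flushing the label page leaves the superblock alone. With 16K pages both ranges land in page 0, which is consistent with both fixes: shrinking the page size, or writing the label with a direct write so that no cached page is flushed over the superblock.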
References
- https://gitee.com/src-openeuler/ceph/issues/I54Q01
- https://gitee.com/src-openeuler/ceph/pulls/121/