Is Go's map really that random? A study of map iteration

2024-01-26 04:52


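When I needed to pick a random element from a map, my first idea was to range over the map and return the first element it yields. That approach did not pass the tests.

So is map iteration perhaps not as random as it seems?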
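
The question can be checked empirically. The following is a minimal sketch, not the original test: firstKey and firstKeyCounts are hypothetical helper names, and the map size and trial count are arbitrary. It tallies how often each key is the first one a range loop yields. With the bucket-based implementation discussed in this article the tallies are usually visibly uneven, though the exact numbers depend on the Go version and the run.

```go
package main

import "fmt"

// firstKey returns whichever key a range loop yields first.
// This is the "pick a random element by iterating" idea from the text.
func firstKey(m map[int]struct{}) (int, bool) {
	for k := range m {
		return k, true
	}
	return 0, false // empty map: nothing to pick
}

// firstKeyCounts calls firstKey `trials` times and tallies the results.
func firstKeyCounts(m map[int]struct{}, trials int) map[int]int {
	counts := make(map[int]int, len(m))
	for i := 0; i < trials; i++ {
		if k, ok := firstKey(m); ok {
			counts[k]++
		}
	}
	return counts
}

func main() {
	m := make(map[int]struct{})
	for i := 0; i < 10; i++ {
		m[i] = struct{}{}
	}
	counts := firstKeyCounts(m, 100000)
	// A uniform pick would give each key roughly 10000 hits.
	for k := 0; k < 10; k++ {
		fmt.Printf("key %d came first %d times\n", k, counts[k])
	}
}
```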

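The code below is taken from go1.18.

hiter is the structure that drives map iteration. It records the element currently being visited, the starting position and the other state needed to complete a full traversal.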

// A hash iteration structure.
// If you modify hiter, also change cmd/compile/internal/reflectdata/reflect.go
// and reflect/value.go to match the layout of this structure.
type hiter struct {
	// address of the next key to return
	key         unsafe.Pointer // Must be in first position.  Write nil to indicate iteration end (see cmd/compile/internal/walk/range.go).
	// address of the next value to return
	elem        unsafe.Pointer // Must be in second position (see cmd/compile/internal/walk/range.go).
	// map type
	t           *maptype
	// map header
	h           *hmap
	// bucket array the iterator pointed at when it was initialized
	buckets     unsafe.Pointer // bucket ptr at hash_iter initialization time
	// the bmap currently being visited
	bptr        *bmap          // current bucket
	overflow    *[]*bmap       // keeps overflow buckets of hmap.buckets alive
	oldoverflow *[]*bmap       // keeps overflow buckets of hmap.oldbuckets alive
	// bucket where iteration started
	startBucket uintptr        // bucket iteration started at
	// slot offset inside a bucket
	offset      uint8          // intra-bucket offset to start from during iteration (should be big enough to hold bucketCnt-1)
	// whether iteration has wrapped around to the beginning
	wrapped     bool           // already wrapped around from end of bucket array to beginning
	B           uint8
	// slot currently being visited
	i           uint8
	// bucket currently being visited
	bucket      uintptr
	// used for the extra check while the map is growing
	checkBucket uintptr
}

mapiterinit is the function that starts an iteration. Its main job is to decide where the traversal begins.

// mapiterinit initializes the hiter struct used for ranging over maps.
// The hiter struct pointed to by 'it' is allocated on the stack
// by the compilers order pass or on the heap by reflect_mapiterinit.
// Both need to have zeroed hiter since the struct contains pointers.
func mapiterinit(t *maptype, h *hmap, it *hiter) {
	it.t = t
	// nil or empty map: there is nothing to iterate
	if h == nil || h.count == 0 {
		return
	}

	if unsafe.Sizeof(hiter{})/goarch.PtrSize != 12 {
		throw("hash_iter size incorrect") // see cmd/compile/internal/reflectdata/reflect.go
	}
	it.h = h

	// grab snapshot of bucket state
	// (the iterator records the map's bucket information at this moment)
	it.B = h.B
	it.buckets = h.buckets
	if t.bucket.ptrdata == 0 {
		// Allocate the current slice and remember pointers to both current and old.
		// This preserves all relevant overflow buckets alive even if
		// the table grows and/or overflow buckets are added to the table
		// while we are iterating.
		h.createOverflow()
		it.overflow = h.extra.overflow
		it.oldoverflow = h.extra.oldoverflow
	}

	// decide where to start
	// The start bucket is the low B bits of the random number.
	// The slot offset is the next bits above those, masked with bucketCnt-1.
	// Only the 2^B regular buckets can be chosen; overflow buckets are not counted.
	r := uintptr(fastrand())
	if h.B > 31-bucketCntBits {
		r += uintptr(fastrand()) << 31
	}
	it.startBucket = r & bucketMask(h.B)
	it.offset = uint8(r >> h.B & (bucketCnt - 1))

	// iterator state
	// (the current bucket starts out as the start bucket)
	it.bucket = it.startBucket

	// Remember we have an iterator.
	// Can run concurrently with another mapiterinit().
	// (marks that an iterator may be using both the buckets and the old buckets)
	if old := h.flags; old&(iterator|oldIterator) != iterator|oldIterator {
		atomic.Or8(&h.flags, iterator|oldIterator)
	}

	mapiternext(it)
}
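
The two lines that derive startBucket and offset can be tried in isolation. The sketch below is a user-space imitation, not runtime code: startPosition is a hypothetical name, uint64 stands in for uintptr, and math/rand stands in for the runtime's fastrand.

```go
package main

import (
	"fmt"
	"math/rand"
)

const (
	bucketCntBits = 3                  // the runtime uses 8 slots per bucket
	bucketCnt     = 1 << bucketCntBits // 8
)

// startPosition mirrors how mapiterinit splits a random number r for a table
// with 2^B regular buckets: the low B bits select the bucket, the next
// bucketCntBits bits select the slot offset inside a bucket.
func startPosition(r uint64, B uint8) (startBucket uint64, offset uint8) {
	mask := uint64(1)<<B - 1 // same value as bucketMask(B)
	startBucket = r & mask
	offset = uint8(r >> B & (bucketCnt - 1))
	return startBucket, offset
}

func main() {
	const B = 2 // 4 regular buckets
	for i := 0; i < 5; i++ {
		r := rand.Uint64()
		b, off := startPosition(r, B)
		fmt.Printf("r=%#x -> startBucket=%d offset=%d\n", r, b, off)
	}
}
```

Whatever r is, startBucket always falls in [0, 2^B) and offset in [0, 8), so an overflow bucket can never be the starting point.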

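From the code above we can see that the first element returned is not chosen uniformly at random. The random start position is picked only among the 2^B regular buckets, plus a slot offset inside the bucket. Overflow buckets are never candidates for the starting point, so not every element has the same chance of coming first.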

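The traversal itself raises the following questions:

  • How does iteration work while the map is growing?
  • How is it guaranteed that no element is missed?
  • How is visiting the same element twice prevented?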
func mapiternext(it *hiter) {
	h := it.h
	// If the write flag is set, report a concurrent iteration and write.
	if h.flags&hashWriting != 0 {
		throw("concurrent map iteration and map write")
	}
	t := it.t
	bucket := it.bucket
	b := it.bptr
	i := it.i
	checkBucket := it.checkBucket

next:
	if b == nil {
		// Back at the start bucket after wrapping around: iteration is finished.
		if bucket == it.startBucket && it.wrapped {
			// end of iteration
			it.key = nil
			it.elem = nil
			return
		}
		// A grow is in progress and the old bucket has not been evacuated: use the old bucket.
		if h.growing() && it.B == h.B {
			// Iterator was started in the middle of a grow, and the grow isn't done yet.
			// If the bucket we're looking at hasn't been filled in yet (i.e. the old
			// bucket hasn't been evacuated) then we need to iterate through the old
			// bucket and only return the ones that will be migrated to this bucket.
			oldbucket := bucket & it.h.oldbucketmask()
			b = (*bmap)(add(h.oldbuckets, oldbucket*uintptr(t.bucketsize)))
			// checkBucket is the current bucket while the map is growing and
			// the old bucket has not been evacuated; otherwise it is noCheck.
			if !evacuated(b) {
				checkBucket = bucket
			} else {
				b = (*bmap)(add(it.buckets, bucket*uintptr(t.bucketsize)))
				checkBucket = noCheck
			}
		} else {
			// The map is not growing (or the iterator predates the current grow):
			// use the bucket array recorded in the iterator.
			b = (*bmap)(add(it.buckets, bucket*uintptr(t.bucketsize)))
			checkBucket = noCheck
		}
		// advance to the next bucket
		bucket++
		// past the last bucket: wrap around and continue from bucket 0
		if bucket == bucketShift(it.B) {
			bucket = 0
			it.wrapped = true
		}
		i = 0
	}
	// walk the slots of the bucket
	for ; i < bucketCnt; i++ {
		// start from slot `offset`
		offi := (i + it.offset) & (bucketCnt - 1)
		// skip empty slots
		if isEmpty(b.tophash[offi]) || b.tophash[offi] == evacuatedEmpty {
			// TODO: emptyRest is hard to use here, as we start iterating
			// in the middle of a bucket. It's feasible, just tricky.
			continue
		}
		// locate the key and the value
		k := add(unsafe.Pointer(b), dataOffset+uintptr(offi)*uintptr(t.keysize))
		if t.indirectkey() {
			k = *((*unsafe.Pointer)(k))
		}
		e := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+uintptr(offi)*uintptr(t.elemsize))
		// During a grow, filter out old-bucket entries that do not belong to the current new bucket.
		if checkBucket != noCheck && !h.sameSizeGrow() {
			// Special case: iterator was started during a grow to a larger size
			// and the grow is not done yet. We're working on a bucket whose
			// oldbucket has not been evacuated yet. Or at least, it wasn't
			// evacuated when we started the bucket. So we're iterating
			// through the oldbucket, skipping any keys that will go
			// to the other new bucket (each oldbucket expands to two
			// buckets during a grow).
			// The key is reflexive (k == k), so its hash is repeatable.
			if t.reflexivekey() || t.key.equal(k, k) {
				// If the item in the oldbucket is not destined for
				// the current new bucket in the iteration, skip it.
				hash := t.hasher(k, uintptr(h.hash0))
				if hash&bucketMask(it.B) != checkBucket {
					continue
				}
			} else {
				// For keys with k != k (NaN), decide from tophash whether the
				// entry belongs to this new bucket; skip it if it does not.
				// Hash isn't repeatable if k != k (NaNs).  We need a
				// repeatable and randomish choice of which direction
				// to send NaNs during evacuation. We'll use the low
				// bit of tophash to decide which way NaNs go.
				// NOTE: this case is why we need two evacuate tophash
				// values, evacuatedX and evacuatedY, that differ in
				// their low bit.
				if checkBucket>>(it.B-1) != uintptr(b.tophash[offi]&1) {
					continue
				}
			}
		}
		// The entry has not been evacuated, or the key hashes differently each time (k != k):
		// hand the key and value to the iterator directly.
		if (b.tophash[offi] != evacuatedX && b.tophash[offi] != evacuatedY) ||
			!(t.reflexivekey() || t.key.equal(k, k)) {
			// This is the golden data, we can return it.
			// OR
			// key!=key, so the entry can't be deleted or updated, so we can just return it.
			// That's lucky for us because when key!=key we can't look it up successfully.
			it.key = k
			if t.indirectelem() {
				e = *((*unsafe.Pointer)(e))
			}
			it.elem = e
		} else {
			// The data has already been moved: look the key up in the current table.
			// The hash table has grown since the iterator was started.
			// The golden data for this key is now somewhere else.
			// Check the current hash table for the data.
			// This code handles the case where the key
			// has been deleted, updated, or deleted and reinserted.
			// NOTE: we need to regrab the key as it has potentially been
			// updated to an equal() but not identical key (e.g. +0.0 vs -0.0).
			rk, re := mapaccessK(t, h, k)
			if rk == nil {
				continue // key has been deleted
			}
			it.key = rk
			it.elem = re
		}
		// record the iterator's progress
		it.bucket = bucket
		if it.bptr != b { // avoid unnecessary write barrier; see issue 14921
			it.bptr = b
		}
		it.i = i + 1
		it.checkBucket = checkBucket
		return
	}
	// continue with the overflow bucket
	b = b.overflow(t)
	i = 0
	goto next
}
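
To make the visiting order easier to see, here is a simplified simulation of the bucket and slot arithmetic in mapiternext. It is only a sketch under strong assumptions: no overflow buckets, no grow in progress, and every slot treated as occupied. The names pos and visitOrder are hypothetical.

```go
package main

import "fmt"

const bucketCnt = 8 // slots per bucket, as in the runtime

// pos identifies one slot in one regular bucket.
type pos struct{ bucket, slot int }

// visitOrder lists the slots in the order the loop in mapiternext reaches
// them for a table of 2^B regular buckets.
func visitOrder(B uint8, startBucket int, offset uint8) []pos {
	n := 1 << B
	order := make([]pos, 0, n*bucketCnt)
	bucket, wrapped := startBucket, false
	for {
		if bucket == startBucket && wrapped {
			return order // back at the start after wrapping: done
		}
		cur := bucket
		bucket++
		if bucket == n { // past the last bucket: continue from bucket 0
			bucket, wrapped = 0, true
		}
		for i := 0; i < bucketCnt; i++ {
			offi := (i + int(offset)) & (bucketCnt - 1) // rotate slots by offset
			order = append(order, pos{cur, offi})
		}
	}
}

func main() {
	order := visitOrder(2, 2, 5)
	for i := 0; i < len(order); i += bucketCnt {
		fmt.Printf("bucket %d, slots", order[i].bucket)
		for _, p := range order[i : i+bucketCnt] {
			fmt.Printf(" %d", p.slot)
		}
		fmt.Println()
	}
}
```

With B=2, startBucket=2 and offset=5 the buckets are visited as 2, 3, 0, 1, and inside each bucket the slots as 5, 6, 7, 0, 1, 2, 3, 4. Only the starting point is random; the order after that is fixed.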

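From the analysis above we can conclude:

  • When iterating during a grow:

    • If the bucket being visited has already been evacuated, the new bucket is used.

    • If it has not been evacuated yet, the old bucket is used.

      Note that entries of the old bucket which do not belong to the current new bucket must be filtered out. During a grow each old bucket is split into two new buckets, and the new bucket being visited is only one of them.

  • bucket increases from the start bucket and wraps around to 0 after the last one, so every regular bucket is visited. Each bucket's overflow chain is also followed to its end, until there is no further overflow bucket.

  • The wrapped flag together with startBucket marks the end of the traversal, and during a grow the entries not destined for the current new bucket are filtered out. Together these ensure that no element is visited twice.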
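
The filtering step during a grow can be illustrated with plain integers. This is a sketch with made-up hash values; destinedFor is a hypothetical helper, and bucketMask only imitates the runtime function of the same name. When the table doubles from 2^2 to 2^3 buckets, old bucket 1 is split into new buckets 1 and 5, and the test hash&bucketMask(B) == checkBucket keeps only the entries that belong to the new bucket being visited.

```go
package main

import "fmt"

// bucketMask returns 2^B - 1, the mask that extracts a bucket index from a hash.
func bucketMask(B uint8) uint64 { return uint64(1)<<B - 1 }

// destinedFor keeps the hashes that will land in new bucket checkBucket once
// the table has 2^newB buckets. It mirrors the check
// `hash&bucketMask(it.B) != checkBucket { continue }` in mapiternext.
func destinedFor(hashes []uint64, newB uint8, checkBucket uint64) []uint64 {
	var kept []uint64
	for _, h := range hashes {
		if h&bucketMask(newB) == checkBucket {
			kept = append(kept, h)
		}
	}
	return kept
}

func main() {
	const oldB, newB = 2, 3
	hashes := []uint64{1, 5, 9, 13, 21} // hypothetical hashes, all in old bucket 1
	for _, h := range hashes {
		fmt.Printf("hash %d: old bucket %d\n", h, h&bucketMask(oldB))
	}
	fmt.Println("new bucket 1:", destinedFor(hashes, newB, 1)) // [1 9]
	fmt.Println("new bucket 5:", destinedFor(hashes, newB, 5)) // [5 13 21]
}
```

Each hash ends up in exactly one of the two result lists, which is why reading the same old bucket once for new bucket 1 and once for new bucket 5 neither skips nor repeats an entry.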

Ref

  1. https://zhuanlan.zhihu.com/p/597348765
  2. https://www.cnblogs.com/cnblogs-wangzhipeng/p/13292524.html
  3. https://qcrao.com/post/dive-into-go-map/
