三色标记（Tri-color marking）

本文主要是介绍三色标记（Tri-color marking），希望对大家解决编程问题提供一定的参考价值，需要的开发者们随着小编来一起学习吧！

维基百科部分

原文

https://en.wikipedia.org/wiki/Tracing_garbage_collection#TRI-COLOR

Because of these performance problems, most modern tracing garbage collectors implement some variant of the tri-color marking abstraction, but simple collectors (such as the mark-and-sweep collector) often do not make this abstraction explicit. Tri-color marking works as described below.

Three sets are created – white, black and gray:

The white set, or condemned set, is the set of objects that are candidates for having their memory recycled.
The black set is the set of objects that can be shown to have no outgoing references to objects in the white set, and to be reachable from the roots. Objects in the black set are not candidates for collection.
The gray set contains all objects reachable from the roots but yet to be scanned for references to “white” objects. Since they are known to be reachable from the roots, they cannot be garbage-collected and will end up in the black set after being scanned.

In many algorithms, initially the black set starts as empty, the gray set is the set of objects which are directly referenced from roots and the white set includes all other objects. Every object in memory is at all times in exactly one of the three sets. The algorithm proceeds as following:

Pick an object from the gray set and move it to the black set.
Move each white object it references to the gray set. This ensures that neither this object nor any object it references can be garbage-collected.
Repeat the last two steps until the gray set is empty.

When the gray set is empty, the scan is complete; the black objects are reachable from the roots, while the white objects are not and can be garbage-collected.

Since all objects not immediately reachable from the roots are added to the white set, and objects can only move from white to gray and from gray to black, the algorithm preserves an important invariant – no black objects reference white objects. This ensures that the white objects can be freed once the gray set is empty. This is called the tri-color invariant. Some variations on the algorithm do not preserve this invariant but use a modified form for which all the important properties hold.

The tri-color method has an important advantage – it can be performed “on-the-fly”, without halting the system for significant periods of time. This is accomplished by marking objects as they are allocated and during mutation, maintaining the various sets. By monitoring the size of the sets, the system can perform garbage collection periodically, rather than as needed. Also, the need to touch the entire working set on each cycle is avoided.

An example of tri-color marking on a heap with 8 objects. White, grey, and black objects are represented by light-grey, yellow, and blue, respectively.

渣渣翻译

由于一些性能问题，大多数现代跟踪垃圾收集器实现了某些变种的三色标记，但是简单收集器(如标记-清除收集器)通常不会显式地实现这种抽象。三色标记工作如下所述。

三组被创建 ——白色，黑色和灰色：

白色集合，或废弃集合，是候选对象的集合，它们的内存将被回收。
黑色集合是没有引用到白色集合的对象集合，并且可以从根开始访问到。黑色集合中的对象不是集合的候选对象。
灰色集合全部是根结点可到达的对象，但是存在尚未扫描到的“白色”对象的引用。由于已知从根可以到达它们，所以它们不能被垃圾收集，在被扫描后会在黑色集合中结束。

在许多算法中，最初黑色集合是空的，灰色集合是直接从根引用的对象的集合，而白色集合包括所有其他对象。内存中的每个对象在任何时候都恰好处于这三个集合中的一个。算法进行如下：

从灰色集合中选择一个对象，并将其移动到黑色集合中。
将它引用的每个白色对象移动到灰色集合中。这可以确保该对象和它引用的任何对象都不会被垃圾收集。
重复上面两个步骤，直到灰色集合为空。

当灰度集合为空时，扫描完成；黑色的对象是可以从根中到达的，而白色的对象则不能并且可以被垃圾收集。

由于所有不能立即从根到达的对象都被添加到了白色集合中，并且对象只能从白色集合移动到灰色集合，从灰色集合移动到黑色集合，这样算法保留了一个重要的不变量——没有黑色对象引用白色对象。这确保了在灰色集合为空时可以释放白色对象。这叫做三色不变量。算法的一些变化不保持这个不变，而是使用一种修改的形式，确保所有重要的性质保持不变。

三色法有一个重要的优点——它可以“即时”执行，而不需要在很长一段时间内停止系统。这是通过在分配对象时以及在变化期间标记对象、维护各种集合来实现的。通过监视集合的大小，系统可以定期执行垃圾收集，而不是根据需要执行。此外，也避免了在每个周期中接触整个工作集的需要。

An example of tri-color marking on a heap with 8 objects. White, grey, and black objects are represented by light-grey, yellow, and blue, respectively.

这有8个对象的堆上进行三色标记的示例。白色、灰色和黑色物体分别用浅灰色、黄色和蓝色表示。

Wilson论文部分

Tricolor Marking

三色标记

The abstraction of tricolor marking is helpful in understanding incremental garbage collection. Garbage collection algorithms can be described as a process of traversing the graph of reachable objects and coloring them. The objects subject to garbage collection are conceptually colored white, and by the end of collection, those that will be retained must be colored black. When there are no reachable nodes left to blacken, the traversal of live data structures is finished.

三色标记的抽象有助于理解增量垃圾收集。垃圾收集算法可以描述为一个遍历可达对象图并对其着色的过程。在概念上，接受垃圾收集的对象被涂成白色，在收集结束时，那些将被保留的对象必须涂成黑色。当没有剩余的可到达节点要变黑时，对活动数据结构的遍历就完成了。

In a simple mark-sweep collector,this coloring is directly implemented by setting mark bits, objects whose bit is set are black. In a copy collector, this is the process of moving objects from fromspace to tospace, unreached objects in fromspace are considered white, and objects moved to tospace are considered black. The abstraction of coloring is orthogonal to the distinction between marking and copying collectors, and is important for understanding the basic differences between incremental collectors.

在一个简单的标记-清除收集器中，这种着色是通过设置标记位直接实现的，这些对象的位被设置为黑色。在复制收集器中，这是将对象从fromspace移动到tospace的过程，fromspace中未到达的对象被认为是白色的，而移动到tospace的对象被认为是黑色的。着色的抽象与标记收集器和复制收集器之间的区别无关，并且对于理解增量收集器之间的基本区别非常重要。

In incremental collectors, the intermediate states of the coloring traversal are important, because of ongoing mutator activity, the mutator can’t be allowed to change things “behind the collector’s back” in such a way that the collector will fail to find all reachable objects.

在增量收集器中，着色遍历的中间状态非常重要，因为正在进行的mutator活动，不能允许mutator“在收集器背后”更改内容，这样收集器就无法找到所有可访问的对象。

To understand and prevent such interactions between the mutator and the collector, it is useful to introduce a third color, gray, to signify that an object has been reached by the traversal, but that its descendants may not have been. That is, as the traversal proceeds outward from the roots, objects are initially colored gray. When they are scanned and pointers to their offspring are traversed, they are blackened and the offspring are colored gray.

为了理解和防止mutator和collector之间的这种交互，引入第三种颜色gray是很有用的，它表示遍历已经到达了对象，但是它的后代可能还没有到达。也就是说，当遍历从根开始向外进行时，对象最初是灰色的。当扫描它们并遍历它们后代的指针时，它们被涂黑，后代被涂成灰色。

In a copying collector, the gray objects are the objects in the unscanned area of tospace, if a Cheney breadth-first traversal is used, that’s the objects between the scan and free pointers. In a mark-sweep collector, the gray objects correspond to the stack or queue of objects used to control the marking traversal, and the black objects are the ones that have been removed from the queue. In both cases, objects that have not been reached yet are white.

在复制收集器中，灰色对象是tospace未扫描区域中的对象，如果使用Cheney宽度优先遍历，则是扫描和空闲指针之间的对象。在标记-清除收集器中，灰色对象对应于用于控制标记遍历的对象堆栈或对象队列，而黑色对象是已从队列中删除的对象。在这两种情况下，尚未到达的对象都是白色的。

Intuitively, the traversal proceeds in a wavefront of gray objects, which separates the white (unreached) objects from the black objects that have been passed by the wave, that is, there are no pointers directly from black objects to white ones. This abstracts away from the particulars of the traversal algorithm, it may be depth-first, breadth-first, or just about any kind of exhaustive traversal. It is only important that a well-defined gray fringe be identifiable, and that the mutator preserve the invariant that no black object hold a pointer directly to a white object.

直观地说，遍历过程是在灰色对象的波前进行的，它将白色(未触及的)对象与通过波传递的黑色对象分开，也就是说，没有从黑色对象直接指向白色对象的指针。这抽象了遍历算法的细节，它可以是深度优先、广度优先，或者几乎是任何类型的穷举遍历。重要的是，定义良好的灰色条纹是可识别的，mutator保持不变，没有黑色对象持有直接指向白色对象的指针。

The importance of this invariant is that the collector must be able to assume that it is “finished with” black objects, and can continue to traverse gray objects and move the wavefront forward. If the mutator creates a pointer from a black object to a white one, it must somehow notify the collector that its assumption has been violated. This ensures that the collector’s book-keeping is brought up to date.

这个不变量的重要性在于，收集器必须能够假设它已经“用完了”黑色物体，并且能够继续遍历灰色物体并向前移动波前。如果mutator创建了一个从黑色对象指向白色对象的指针，它必须以某种方式通知收集器它的假设被违背了。这确保了收集者的簿记是最新的。

Figure 7 demonstrates this need for coordination. Suppose the object A has been completely scanned (and therefore blackened); its descendants have been reached and grayed. Now suppose that the mutator swaps the pointer from A to C with the pointer from B to D. The only pointer to D is now in a field of A, which the collector has already scanned. If the traversal continues without any coordination, B will be blackened, C will be reached again (from B), and D will never be reached at all, and hence will be erroneously deemed garbage and reclaimed.

图7演示了这种协调的需要。假设对象A已经被完全扫描(因此变黑);它的后代已经到达并且变灰了。现在假设mutator将指针从A交换到C，同时将指针从B交换到D。唯一指向D的指针现在在A的字段中，收集器已经扫描了该字段。如果遍历在没有任何协调的情况下继续进行，那么B将被黑，C将再次(从B)到达，而D将永远不会到达，因此将被错误地认为是垃圾并被回收。

在这里插入图片描述