How CoreCLR GC understand objects (part 2)
In addition to size described in an earlier post, it is also important for the GC to know where the pointers are in an object. This is because in the mark phase, we need to traverse the object graph. Here is a function for the mark phase in gc.cpp
line 19415-19452.
Where is the code that perform the traversal?
//this method assumes that *po is in the [low. high[ range
void
gc_heap::mark_object_simple (uint8_t** po THREAD_NUMBER_DCL)
{
uint8_t* o = *po;
/* some code that does not change o or po */
if (gc_mark1 (o))
{
/* some code that does not change o or po */
{
go_through_object_cl (method_table(o), o, s, poo,
{
uint8_t* oo = *poo;
/* some code to recursively mark oo */
}
);
}
}
}
Imagine this is run during the mark phase, if we discover an unmarked object, then we traverse into its pointers and recursively mark the objects it points to. There are some complexity with respect to managing the stack, as the objects graph gets deep, we might ran out of stack space. In this post, we will ignore that and focus into how GC discovered the pointed objects. The magic ingredient here is obviously go_through_object_cl
in gc.cpp
line 18602-18627.
// 1 thing to note about this macro:
// 1) you can use *parm safely but in general you don't want to use parm
// because for the collectible types it's not an address on the managed heap.
#ifndef COLLECTIBLE_CLASS
#define go_through_object_cl(mt,o,size,parm,exp) \
{ \
if (header(o)->ContainsPointers()) \
{ \
go_through_object_nostart(mt,o,size,parm,exp); \
} \
}
#else //COLLECTIBLE_CLASS
#define go_through_object_cl(mt,o,size,parm,exp) \
{ \
if (header(o)->Collectible()) \
{ \
uint8_t* class_obj = get_class_object (o); \
uint8_t** parm = &class_obj; \
do {exp} while (false); \
} \
if (header(o)->ContainsPointers()) \
{ \
go_through_object_nostart(mt,o,size,parm,exp); \
} \
}
#endif //COLLECTIBLE_CLASS
We have two cases, either we have COLLECTIBLE_CLASS
defined or not. In the COLLECTIBLE_CLASS
case, we simply have an extra object (i.e. the class object) to mark that is implicitly ‘pointed’ by the object, otherwise, we simply delegate to go_through_object_nostart
in gc.cpp line 18600.
#define go_through_object_nostart(mt,o,size,parm,exp) {go_through_object(mt,o,size,parm,o,ignore_start,(o + size),exp); }
Again, it delegates to go_through_object
. It has an interesting option that can take an extra start
parameter, but in this case we are not using it. Here we reach the most interesting macro that does the work.
#define go_through_object(mt,o,size,parm,start,start_useful,limit,exp) \
{ \
CGCDesc* map = CGCDesc::GetCGCDescFromMT((MethodTable*)(mt)); \
CGCDescSeries* cur = map->GetHighestSeries(); \
ptrdiff_t cnt = (ptrdiff_t) map->GetNumSeries(); \
\
if (cnt >= 0) \
{ \
CGCDescSeries* last = map->GetLowestSeries(); \
uint8_t** parm = 0; \
do \
{ \
assert (parm <= (uint8_t**)((o) + cur->GetSeriesOffset())); \
parm = (uint8_t**)((o) + cur->GetSeriesOffset()); \
uint8_t** ppstop = \
(uint8_t**)((uint8_t*)parm + cur->GetSeriesSize() + (size));\
if (!start_useful || (uint8_t*)ppstop > (start)) \
{ \
if (start_useful && (uint8_t*)parm < (start)) parm = (uint8_t**)(start);\
while (parm < ppstop) \
{ \
{exp} \
parm++; \
} \
} \
cur--; \
\
} while (cur >= last); \
} \
else \
{ \
/* Handle the repeating case - array of valuetypes */ \
uint8_t** parm = (uint8_t**)((o) + cur->startoffset); \
if (start_useful && start > (uint8_t*)parm) \
{ \
ptrdiff_t cs = mt->RawGetComponentSize(); \
parm = (uint8_t**)((uint8_t*)parm + (((start) - (uint8_t*)parm)/cs)*cs); \
} \
while ((uint8_t*)parm < ((o)+(size)-plug_skew)) \
{ \
for (ptrdiff_t __i = 0; __i > cnt; __i--) \
{ \
HALF_SIZE_T skip = cur->val_serie[__i].skip; \
HALF_SIZE_T nptrs = cur->val_serie[__i].nptrs; \
uint8_t** ppstop = parm + nptrs; \
if (!start_useful || (uint8_t*)ppstop > (start)) \
{ \
if (start_useful && (uint8_t*)parm < (start)) parm = (uint8_t**)(start); \
do \
{ \
{exp} \
parm++; \
} while (parm < ppstop); \
} \
parm = (uint8_t**)((uint8_t*)ppstop + skip); \
} \
} \
} \
}
So there we go, it is the code for performing the traversal.
How does the traversal code actually work?
We have two cases. Let’s focus on the first case first. Conceptually, the MethodTable has a CGCDesc
. It contains a list of CGCDescSeries
. Each series contains an offset and a size, which is used to compute parm
and ppstop
. The while loop will be used to run exp
while parm
is marching towards ppstop
one pointer at a time. When we first read the code, let’s assume start_useful
is false. This basically means the first condition is always true and the second condition is always false. Here is an interesting gem in the code.
parm = (uint8_t**)((o) + cur->GetSeriesOffset()); \
uint8_t** ppstop = \
(uint8_t**)((uint8_t*)parm + cur->GetSeriesSize() + (size));\
It is easy to understand that we want parm to point into the object by series offset. The next line is weird, the ending position is the object plus the series size plus the object size. The last plus is unconventional, why?
The last plus is actually used to merge the reference array case with the object case. In case of a reference array, we already know the size of the array, so we can simply store 0 as the series size for any array, and it will always march to the end of the object. It works because every location inside reference array after the offset is a pointer. For normal objects, all we need to do is to subtract the conventional length of the length by the size of the object, that only need to be done once when the CCGSeries
is constructed and its worth the effort to avoid a comparison here for every array during GC.
It is a very cool optimization, in my humble opinion.
That begs the question for the second case. In case of a value type array in which the value type contains some references. We need to do something different. First of all, we use negative GetNumSeries
value to signal the fact that the CGDesc
describes a value type array and therefore an alternative algorithm is needed. Conceptually, a CCGDesc
for value type array has a single offset
and a collection of val_serie
indexed from 0
to cnt + 1
. (Note that cnt
is negative). A val_serie
contains a skip
and a nptrs
. The idea is that we move the pointer by offset
. Then it should point to a contiguous region of val_serie[0].nptrs
pointers, then we skip for val_serie[0].skip
pointers, and then we have a contiguous region of val_serie[-1].nptrs
pointers, then we skip for val_serie[-1].skip
pointers, and so on until we exhaust the val_serie
up to cnt, and then the whole process repeats until the end of the object. This description allows us to describe an object with a periodic pattern of pointers.
How is the data stored?
Last but not least, let’s look into the physical representation of these data. First of all, GetCGCDesc::CGCDescFromMT
is just a cast in gcdesc.h
line 166 to line 175. That is, the CGCDesc
is actually part of the method table.
static PTR_CGCDesc GetCGCDescFromMT (MethodTable * pMT)
{
// If it doesn't contain pointers, there isn't a GCDesc
PTR_MethodTable mt(pMT);
_ASSERTE(mt->ContainsPointersOrCollectible());
return PTR_CGCDesc(mt);
}
We can see that in the CGCDesc::GetNumSeries()
implementation in gcdesc.h line 176 to line 179. We are accessing the region before this
. That is, the CCGDesc
data is actually stored before the method table pointer.
size_t GetNumSeries ()
{
return *(PTR_size_t(PTR_CGCDesc(this))-1);
}
The various GetSize()
methods in the same file probably describe the layout the best.
// Size of the entire slot map.
size_t GetSize ()
{
ptrdiff_t numSeries = (ptrdiff_t) GetNumSeries();
if (numSeries < 0)
{
return ComputeSizeRepeating(-numSeries);
}
else
{
return ComputeSize(numSeries);
}
}
static size_t ComputeSize (size_t NumSeries)
{
_ASSERTE (ptrdiff_t(NumSeries) > 0);
return sizeof(size_t) + NumSeries*sizeof(CGCDescSeries);
}
// For value type array
static size_t ComputeSizeRepeating (size_t NumSeries)
{
_ASSERTE (ptrdiff_t(NumSeries) > 0);
return sizeof(size_t) + sizeof(CGCDescSeries) +
(NumSeries-1)*sizeof(val_serie_item);
}
In case of objects or reference array, we have a CGCDescSeries
array growing backwards, and that’s why the size is a pointer size times the number of CGCDescSeries
times the size of the it. In the case of value type array, we have a single CCGSeries
, but then the val_serie
array has NumSeries
elements. Note that in the computation we used NumSeries - 1
because the size of CCGSeries
already included one element from that array.
The also explains why the val_serie
array is indexed using negative indexes. It is because it is growing backwards.
This concludes how the GC finds pointers in an object using the description.