Managed pointers in .NET

Disclaimer: this article consists of fragments of my book, adapted and heavily re-edited so that it works as a standalone post.

Most of the time a regular .NET developer uses object references, and that is simple enough because this is how the managed world is constructed: objects refer to each other via object references. An object reference is, in fact, a type-safe pointer (address) that always points to an object's MethodTable reference field (it is often said that it points to the beginning of an object). Thus, using them can be quite efficient. Given an object reference, we simply have the address of the whole object. For example, the GC can quickly access the object's header via a constant offset. Addresses of fields are also easily computed thanks to the information stored in the MethodTable.

There is, however, another pointer type in the CLR: a managed pointer. It can be defined as a more general kind of reference, which may point to locations other than just the beginning of an object. ECMA-335 says that a managed pointer can point to:

  • a local variable – whether it holds a reference to a heap-allocated object or is simply a stack-allocated value,
  • a parameter – as above,
  • a field of a compound type – that is, a field of another type (whether it is a value type or a reference type),
  • an element of an array.

Despite this flexibility, managed pointers are still strongly typed. There is a managed pointer type that points to System.Int32 values regardless of their location, denoted as System.Int32& in CIL. Likewise, there is a SomeNamespace.SomeClass& type for our custom SomeNamespace.SomeClass. Strong typing makes them safer than raw, unmanaged pointers, which may be cast back and forth to practically anything. This is also why C# does not expose the pointer arithmetic known from raw pointers on managed pointers: it rarely makes sense to “add” or “subtract” addresses that point to various places inside objects or to local variables. (The runtime itself does permit arithmetic on managed pointers; from C# it is reachable only through helpers such as the System.Runtime.CompilerServices.Unsafe class.)

However, this flexibility does not come without a cost. It shows up as limits on where managed pointers can be used. As ECMA-335 says, managed pointer types are only allowed for:

  • local variables
  • parameter signatures

It is directly said that:

“they cannot be used for field signatures, as the element type of an array and boxing a value of managed pointer type is disallowed. Using a managed pointer type for the return type of methods is not verifiable”

Due to those limitations, managed pointers are not directly exposed in the C# language. However, they have long been present in the well-known form of ref parameters. Passing a parameter by reference is nothing other than using a managed pointer underneath. Thus, managed pointers are also often referred to as byref types (or simply byrefs).

Note. Since C# 7.0, the use of managed pointers has been widened in the form of ref locals and ref returns (collectively referred to as ref variables). Thus, the last sentence of the ECMA citation above, about using a managed pointer type as the return type, has been relaxed.

Ref parameters

The well-known, long-standing example of a ref variable is the ref parameter: instead of passing an argument by value (whether it is a struct or a reference), we may pass just a reference to it. This is especially useful for value types (structs) because we avoid copying them (which is what passing them by value does):
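The original listing is not reproduced here, so what follows is only a minimal sketch of the idea. The names BigStruct, IncrementByValue and IncrementByRef are hypothetical and chosen purely for illustration.

```csharp
using System;

// Hypothetical struct, used only for illustration (32 bytes of payload).
public struct BigStruct
{
    public long A, B, C, D;
}

public static class RefParameterDemo
{
    // By value: the callee receives its own copy, so the caller's data is untouched.
    public static void IncrementByValue(BigStruct data) => data.A++;

    // By reference: the callee receives a managed pointer (BigStruct& in CIL).
    public static void IncrementByRef(ref BigStruct data) => data.A++;

    public static long Run()
    {
        var local = new BigStruct { A = 1 };
        IncrementByValue(local);     // modifies a copy only
        IncrementByRef(ref local);   // modifies 'local' itself
        return local.A;
    }

    public static void Main() => Console.WriteLine(Run());
}
```

Only the by-reference call changes the caller's variable, and no copy of the struct is made for it.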

This is a great optimization trick: not only do we avoid heap allocation (by using a struct instead of a class), we also eliminate the overhead of copying data, regardless of the struct size.

Ref locals

You can think of a ref local as a local variable that stores a managed pointer. It is thus a convenient way of creating helper variables that can later be used for direct access to a given field, array element or another local variable. Please note that both the left and the right side of the assignment must be marked with the ref keyword to denote that we operate on managed pointers:
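As a minimal sketch (not the original listing), a ref local pointing at an int field could look as follows. SomeClass and Field are placeholder names.

```csharp
using System;

// Placeholder type, used only for illustration.
public class SomeClass
{
    public int Field;
}

public static class RefLocalDemo
{
    public static int Run()
    {
        var instance = new SomeClass { Field = 1 };

        // 'ref' appears on both sides: fieldRef is a managed pointer (int&)
        // to the Field slot inside the heap-allocated instance.
        ref int fieldRef = ref instance.Field;

        fieldRef = 42;            // writes straight into instance.Field
        return instance.Field;
    }

    public static void Main() => Console.WriteLine(Run());
}
```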

The trivial example above is purely illustrative: we are gaining direct access to an int field, so the performance gain will be negligible. More commonly, you may want to use a ref local to obtain a direct pointer to some heavyweight instance, to make sure no copying happens, and then pass it by reference somewhere or use it locally. Ref locals are also commonly used to store the result of a ref-returning method.

A ref local may point to a reference that is itself null. At first glance this may sound strange, but it makes perfect sense. You can think of a ref local as a variable storing the address of a reference, which does not mean that the reference itself points to anything.
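A short sketch of this situation, with hypothetical names: the ref local points at the variable text, while text itself holds null.

```csharp
using System;

public static class NullRefLocalDemo
{
    public static string Run()
    {
        string text = null;

        // textRef stores the address of the variable 'text',
        // even though 'text' does not refer to any object yet.
        ref string textRef = ref text;
        bool wasNull = textRef == null;

        // Assigning through the ref local changes 'text' itself.
        textRef = "assigned through ref";

        return wasNull ? text : "unexpected";
    }

    public static void Main() => Console.WriteLine(Run());
}
```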

Ref returns

A ref return allows a method to return a managed pointer to a variable instead of a value. Obviously, some limitations must be imposed on such returns. As MSDN says:

“The return value must have a lifetime that extends beyond the execution of the method. In other words, it cannot be a local variable in the method that returns it. It can be an instance or static field of a class, or it can be an argument passed to the method”

Attempting to return a local variable by reference generates a compiler error:

However, it is perfectly fine to ref return an element of a method parameter because, from the method's perspective, this argument lives longer than the method itself:
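Both cases can be sketched as below. This is an illustration with hypothetical names, not the original listing; the rejected variant is left commented out so that the block compiles.

```csharp
using System;

public static class RefReturnDemo
{
    // Rejected by the compiler if uncommented: 'local' does not outlive the method.
    // public static ref int ReturnLocal()
    // {
    //     int local = 0;
    //     return ref local;
    // }

    // Allowed: the array passed in outlives the call, and so do its elements.
    public static ref int FirstElement(int[] data) => ref data[0];

    public static int Run()
    {
        int[] numbers = { 1, 2, 3 };
        ref int first = ref FirstElement(numbers);  // store the ref return in a ref local
        first = 10;                                 // writes into numbers[0]
        return numbers[0];
    }

    public static void Main() => Console.WriteLine(Run());
}
```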

This concludes a very short summary of how ref variables are used in C#. If you want to dig into the performance impact of using them, read the great blog post by Adam Sitnik.

Note. Passing by reference is so important for optimizing the common code base of various libraries that it keeps gaining attention from the creators of .NET and the C# language. Starting with C# 7.1 and 7.2, it is possible to pass by read-only reference (by using the in keyword instead of ref) and to use read-only refs, to state explicitly that a reference is used only for accessing data, without the possibility of modifying it. I will look at these possibilities in one of the next blog posts.
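A minimal sketch of those read-only variants, assuming C# 7.2 or later. Point3D, Sum and Origin are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical immutable struct, used only for illustration.
public readonly struct Point3D
{
    public readonly double X, Y, Z;
    public Point3D(double x, double y, double z) { X = x; Y = y; Z = z; }
}

public static class ReadOnlyRefDemo
{
    private static Point3D origin = new Point3D(0, 0, 0);

    // 'in' passes a read-only managed pointer; assigning to p would not compile.
    public static double Sum(in Point3D p) => p.X + p.Y + p.Z;

    // A read-only ref return: callers can read through it but cannot write.
    public static ref readonly Point3D Origin => ref origin;

    public static double Run()
    {
        var p = new Point3D(1, 2, 3);
        ref readonly Point3D o = ref Origin;
        return Sum(in p) + Sum(in o);
    }

    public static void Main() => Console.WriteLine(Run());
}
```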

Ref types internals

Some interesting questions may arise regarding byref types. For example, how does passing around all those managed pointers cooperate with the GC? What code is generated underneath by the JIT compiler? Let’s dig deeper into the main use cases that managed pointer usage can be grouped into. Understanding them will reveal the reasons behind the limitations mentioned above and help us understand those limitations better.

Managed pointer into “stack-allocated” object

A managed pointer can point to a method’s local variable or parameter. From an implementation point of view, a local variable or parameter may be stack-allocated or enregistered into a CPU register (if the JIT compiler decides so). How does a managed pointer work in such a case?

Simply put, it is perfectly fine for a managed pointer to point to a stack address! This is one of the reasons why a managed pointer may not be an object’s field (and may not be boxed). If it ended up on the Managed Heap in this way, it could outlive the method in whose frame the pointed-to stack address is located. That would be very dangerous (the stack address would contain undefined data, most probably another method’s stack frame). So, by limiting the use of managed pointers to local variables and parameters, their lifetime is limited to the most restrictive lifetime of any target they can point to: data on the stack.

What about enregistered local variables and parameters? Remember that enregistering a target is just an optimization detail; it has to provide at least the same lifetime characteristics as a stack-allocated target. A lot depends on the JIT compiler here. If a target was enregistered, all the better! Such a register may simply be used as a managed pointer. In other words, using a CPU register instead of a stack address does not change much from the JIT compiler's perspective.

But how are managed pointers (or, more precisely, the objects they point to) reported to the GC? They must be, because otherwise the GC might not detect that the target object is reachable if the managed pointer happens to be its only root at that moment.

Let’s analyze a very simple pass-by-reference scenario. To remove the effects of inlining and make things clearer, the Test method is marked with the NoInlining option, which prevents it from being inlined:
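The original listing is not reproduced here. The following is a minimal reconstruction of such a scenario based on the surrounding description (a someClass local passed by reference to a non-inlined Test method). The Field member and the Run helper are assumptions added for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;

// Placeholder type; the Field member is an assumption.
public class SomeClass
{
    public int Field;
}

public static class Program
{
    public static int Run()
    {
        SomeClass someClass = new SomeClass();
        someClass.Field = 5;
        // The address of the local variable (SomeClass& in CIL) is passed to Test.
        return Test(ref someClass);
    }

    // NoInlining keeps Test as a separate method in the JITted code.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Test(ref SomeClass data)
    {
        // Dereferences the managed pointer to get the object reference, then reads the field.
        return data.Field;
    }

    public static void Main() => Console.WriteLine(Run());
}
```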

What interests us at the moment is how such code is represented at both the CIL and the assembly level, after JITting. The corresponding CIL code reveals the use of a strongly typed SomeClass& managed pointer. In the Main method, the ldloca instruction loads the address of the local variable at a given index (index 0 corresponds to our someClass variable) onto the evaluation stack, and that address is then passed to the Test method. The Test method then uses the ldind.ref instruction to dereference the address and push the resulting object reference onto the evaluation stack:

But while CIL code may be interesting, usually only JITted code reveals the true nature of what happens underneath. Looking at the assembly code of both methods, we indeed see that Test method receives an address pointing to the stack where reference to newly created SomeClass instance is stored:

From a pure assembly code point of view, similar code would be generated if, for example, we used a pointer to a pointer in C++. But how does the GC know, while the Test method is executing, that the RCX register contains an object address? The answer is interesting: the Test method has an empty GCInfo. In other words, the Test method is so simple that the GC will not interrupt it. Thus, it does not need to report anything! As simple as that.

If the Test method were more complex, it could be JITted into a fully or partially interruptible method (these are explained in detail in Chapter 8 of my book). For example, in the latter case, we could see various safepoints, some of them listing CPU registers (or stack addresses) as live slots:

Those live slots would be listed as so-called interior pointers because managed pointers, in general, may point inside objects (this will be explained soon). Thus, managed pointers are always reported as interior roots, even though in our case they in fact point to the beginning of the object. Interpreting such pointers is the GC's job, as explained later.

Very similar code would be generated when using a struct instead of a class. What is more interesting, even though it is theoretically known in such a case that the Test method operates only on stack-allocated data (a local variable of the SomeStruct value type), the corresponding GCInfo will still list live slots because a managed pointer is used. It is up to the GC to simply ignore them.

Managed pointer into a heap-allocated object

While managed pointers that point to the stack may seem interesting, those that point to objects on the Managed Heap are even more so. In contrast to an object reference, a managed pointer can point to the inside of an object: a field of a type or an element of an array, as the ECMA standard cited earlier says.

 

[Figure: Managed pointer visualization]

That is why they are in fact “interior pointers”, as they are called in the literature. When you think about it a little, an interesting question arises: how can interior pointers that point inside managed objects be reported to the GC?

Let’s slightly modify the code above so that only a field of a heap-allocated SomeClass instance is passed by reference:
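Again, the original listing is not reproduced here. A minimal reconstruction of such a modification could look as follows; the Field member, the Run helper and the body of Test are assumptions made for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;

// Placeholder type; the Field member is an assumption.
public class SomeClass
{
    public int Field;
}

public static class FieldByRefDemo
{
    public static int Run()
    {
        var someClass = new SomeClass { Field = 10 };
        // Only the address of the field (int& in CIL) is passed, not the object reference.
        return Test(ref someClass.Field);
    }

    // Test sees just a managed pointer to an int; nothing in its signature
    // says that the int lives inside a heap-allocated object.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Test(ref int data)
    {
        data += 1;
        return data;
    }

    public static void Main() => Console.WriteLine(Run());
}
```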

The Main method looks straightforward. It instantiates a SomeClass object, passes a reference to one of its fields to the Test method, and prints the result.

But our modified Test method now expects a System.Int32& managed pointer. During execution, the Test method operates only on a managed pointer to an int. But it is not just a regular pointer to an int: it points to a field of a heap-allocated object! How does the GC know that it must not collect the object to which the managed pointer belongs? Nothing at all is said about where the int& pointer comes from!

First of all, please note that our contrived Test method will be JITted into a method that is atomic from the GC's point of view, that is, one the GC will simply not interrupt at all:

So again, the question of proper root reporting does not arise at all for such a simple method.

But let’s suppose the Test method is complex enough to produce interruptible code. Below is an example of what the corresponding JITted code could look like. The RSI register, which holds the address of the integer field passed as an argument in the RCX register, is reported as an interior pointer:

If a GC happens and the Test method is suspended while RSI contains such an interior pointer, the GC must interpret it to find the corresponding object. This is, in general, not trivial. One could think of a simple algorithm that starts from the pointer’s address and then tries to find the beginning of the object by scanning memory backwards, byte by byte (a single-byte step would be needed because nothing is guaranteed about how interior pointers are aligned with respect to the object’s beginning). This is obviously not efficient and has many drawbacks:

  • An interior pointer may point to a distant field of a big object (or a distant element of a very large array), so such a naïve scan may have to cover a lot of memory.
  • Detecting the beginning of an object is not trivial. One could check whether the next 8 bytes (or 4 in the 32-bit case) form a valid MT address, but this only increases the complexity of the algorithm. One could imagine “marker” bytes placed at the beginning of each object, but this adds unnecessary memory overhead just to support the theoretically rare use of interior pointers (and it would be really hard to define marker bytes unique enough to identify the beginning of an object unambiguously).
  • All managed pointers are reported as interior pointers, so they may point to the stack, in which case looking for a containing object makes no sense in the first place (the pointer may point, for example, inside a stack-allocated struct).

I hope you get the point that such an algorithm is impractical. Some more intelligent support is required to resolve interior pointers efficiently.

The fact is that, during the GC relocation phase (when objects are being compacted), interior pointers are translated into the corresponding objects thanks to the bricks and plug tree mechanism. Crucial to the whole GC process, this mechanism is also very useful in the context of interior pointers. Given an address, the proper brick table entry is calculated and the corresponding plug tree is traversed to find the plug (a contiguous region of live objects) within which that address lives.

[Figure: bricks01]

A brick table consists of brick entries, each representing a 4 kB region of the managed heap. Each brick entry contains the offset of the root plug info within its region.

[Figure: bricks02]

The root plug info, together with the plug info of each of the other plugs, forms a binary tree: an easily searchable representation of all plugs within a region.

Then, the plug is scanned object by object to find the one that contains the address in question (plug scanning is possible because a plug starts with an object, and the following objects are easily found because object sizes are known).
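The lookup described above can be illustrated with a deliberately simplified toy model. This is not CLR code: addresses are plain numbers, the plug tree traversal is skipped, and object sizes are handed in as a list instead of being read from each object's MethodTable. All names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class InteriorPointerToyModel
{
    // Each brick covers a 4 kB region of the (simulated) heap.
    public const long BrickSize = 4096;

    // Index of the brick table entry that covers the given address.
    public static long BrickIndex(long heapStart, long address)
        => (address - heapStart) / BrickSize;

    // Walks a plug object by object. 'objectSizes' holds the sizes of the
    // consecutive objects that start at 'plugStart'. Returns the start address
    // of the object containing 'interior', or -1 if it lies outside the plug.
    public static long FindContainingObject(long plugStart, IReadOnlyList<long> objectSizes, long interior)
    {
        long current = plugStart;
        foreach (long size in objectSizes)
        {
            if (interior >= current && interior < current + size)
                return current;
            current += size;   // the next object starts right after this one
        }
        return -1;
    }

    public static void Main()
    {
        // Three objects of 24, 32 and 48 bytes in a plug that starts at address 4096.
        long[] sizes = { 24, 32, 48 };
        Console.WriteLine(BrickIndex(0, 8200));
        Console.WriteLine(FindContainingObject(4096, sizes, 4096 + 30));
    }
}
```

Even in this toy form, the cost is visible: the walk is linear in the number of objects that precede the target within the plug.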

Obviously, such an algorithm has its costs too. Plug tree traversal and plug scanning take some time, so resolving an interior pointer is not trivial. This is the second important reason why managed pointers are not allowed to live on the heap (especially as object fields): creating complex graphs of objects referenced by interior pointers would make traversing such graphs quite costly. Providing that flexibility is simply not worth the significant overhead it introduces.

Note. What is more, during the Mark phase plug trees do not yet exist, so interpreting interior pointers in order to mark the corresponding objects is even more costly. The whole corresponding brick must be scanned to find the object that corresponds to a given interior pointer.

Please also note that, with this implementation, resolving an interior pointer through the plug tree (to find the object within which it lives) is possible only during a GC, after the Plan phase. Only then are plugs and gaps constructed, together with the corresponding plug tree. The cost of resolving interior pointers is therefore paid only during a GC, when such resolution is necessary. During normal program execution, interior pointers are just pointers: one can simply read and write the memory they point to.

Summary

Interior pointer interpretation allows some seemingly magical things to happen that look dangerous at first glance. For example, we are able to return a managed pointer to a locally created class instance or array:
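A minimal sketch of such a method, with hypothetical names (this is not the original listing). The GC.Collect call is there only to stress that the array stays alive through the interior pointer.

```csharp
using System;

public static class InteriorRootDemo
{
    // Returns a managed pointer to one element of a locally created array.
    // The array reference itself is lost once the method returns.
    public static ref int CreateElement()
    {
        int[] array = new int[100];
        array[42] = 7;
        return ref array[42];
    }

    public static int Run()
    {
        ref int element = ref CreateElement();

        // The returned interior pointer is now the only root of the array,
        // so the array must survive this collection.
        GC.Collect();

        element++;          // still safe to read and write through the ref
        return element;
    }

    public static void Main() => Console.WriteLine(Run());
}
```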

This may seem counterintuitive: how can a method return a reference to a single element of an integer array when the array object itself seems to become unreachable? In fact, it does not become unreachable, because after the method ends, the returned interior pointer becomes the only root of the array.

The array itself is then still alive because of the interior pointer; however, we have lost the array object reference. Due to the limitation mentioned previously (bricks and plug tree availability), such a pointer cannot be at runtime “converted back” to the proper reference of the object it points to.

As a closing remark, please remember that ref variables (ref parameters, ref locals, and ref returns) are thin wrappers around managed pointers. They should not be treated as pointers, though. They are variables! Read the great “Ref returns are not pointers” article by Vladimir Sadov if you feel you need further clarification.

In upcoming articles I will discuss the performance implications of ref variables, ref-returning collections, and readonly refs.

PS. For fun, here is a generic interior pointer generator 😉

References:

https://mustoverride.com/managed-refs-CLR/
https://mustoverride.com/refs-not-ptrs/
https://adamsitnik.com/ref-returns-and-ref-locals/

4 comments

  1. “This is also why managed pointers do not offer pointer arithmetic known from raw pointers”, this is incorrect, managed pointers support pointer arithmetic, although this is not directly exposed in C#. CoreFX and CoreCLR make use of managed pointer arithmetic with the help of the Unsafe class and C# 7.3 uses that to implement https://github.com/dotnet/csharplang/blob/master/proposals/csharp-7.3/indexing-movable-fixed-fields.md

    Another thing I missed in the book is how interior pointers are marked. You kept spoiling us and then ended on a cliffhanger (bricks and plugs make it possible to track them, but we don’t have them during mark). Relocation actually seems like the least interesting part to me, because once the bricks and plugs mechanism is available, interior pointers are no different from any other pointer… just add the relocation offset to them and be done. Interior pointer marking, however, is still a mystery 🙂

    1. How is interior pointer marking still a mystery? It is explained both in this article and in the book. Given an interior pointer, you find the corresponding object thanks to plugs and bricks and you… mark this object. What more is there to explain?

      1. Plugs and bricks are only available in the Plan phase, which happens after the Mark phase, so the plugs and bricks mechanism cannot be used during the earlier Mark phase. Or am I missing something essential here?

        1. OK, I see your point clearly now. Unfortunately, I did indeed misword the part about marking, and it may be misleading. I’ve just added a note here and will certainly include it in the book’s errata.
