debuggers02

Have you ever wondered what happens when you create and use breakpoints in .NET? Here’s a little picture that answers that question (if you don’t like the font, there is a different version at the bottom).

The main actors here are as follows:

  • .NET Application – our regular .NET application that we want to debug. Methods, as provided by the compiler in Intermediate Language (IL) form, are Just-in-Time compiled to native code when called. So, imagine that “e8 50 ff ff” represents example binary code for a line we want to debug (no matter what it does for now).
  • Debugger Runtime Control Thread (hereinafter referred to as the Debugger RC Thread) – a special thread inside every .NET process that exists for debugging purposes and serves as a bridge between the CLR and an external debugger. It consists of a so-called “debugger loop”, listening for events coming from the Debug Port (supported by the OS). Please note that in the case of native debugging, such a special thread is typically injected into the debuggee process. We don’t need to do that here, as the .NET runtime provides it. Moreover, this thread understands the CLR data structures, so it is able to cooperate with the JIT and so on.
  • External Debugger – the external process that we work with. Imagine it as the tooling part of Visual Studio or whichever IDE you use. It uses a set of COM objects that can communicate with the Debugger Runtime Control Thread via an inter-process communication (IPC) mechanism.

poh01

In the upcoming .NET 5 a very interesting change is being added to the GC – a dedicated Pinned Object Heap, a brand new type of managed heap segment (so far we have had the Small and Large Object Heaps). Pinning has its own costs, because it introduces fragmentation (and in general complicates object compaction a lot). We are used to following some good practices about it, like “pin only for…:

  • a very short time”, so the GC will not bother – to reduce the probability that a GC happens while many objects are pinned. That’s the scenario for the fixed keyword, which is in fact only a very lightweight way of flagging a particular local variable as a pinned reference. As long as a GC does not happen, there is no additional overhead.
  • a very long time”, so the GC will promote those objects to generation 2 – as gen2 GCs should not be so common, the impact will also be minimized. That’s the scenario for a GCHandle of type Pinned, which has slightly more overhead because we need to allocate and free the handle.
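
To make the two practices concrete, here is a minimal sketch of both styles of pinning. The class and method names (PinningDemo, SumWithFixed, SumWithGCHandle) are made up for illustration, and the fixed variant assumes the project is compiled with unsafe code enabled (AllowUnsafeBlocks).

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinningDemo
{
    // Short-lived pin: 'fixed' only flags the local pointer as pinned
    // for the duration of the block. Requires unsafe code to be enabled.
    public static unsafe int SumWithFixed(byte[] buffer)
    {
        int sum = 0;
        fixed (byte* p = buffer)
        {
            for (int i = 0; i < buffer.Length; i++)
                sum += p[i];
        }
        return sum;
    }

    // Long-lived pin: a pinned GCHandle has to be allocated and explicitly freed.
    public static int SumWithGCHandle(byte[] buffer)
    {
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr address = handle.AddrOfPinnedObject();
            int sum = 0;
            for (int i = 0; i < buffer.Length; i++)
                sum += Marshal.ReadByte(address, i);
            return sum;
        }
        finally
        {
            // Forgetting this keeps the array pinned for as long as the handle lives.
            handle.Free();
        }
    }

    public static void Main()
    {
        var data = new byte[] { 1, 2, 3, 4 };
        Console.WriteLine(SumWithFixed(data));    // 10
        Console.WriteLine(SumWithGCHandle(data)); // 10
    }
}
```

In real code the handle in the second variant would typically live much longer than a single method call; it is freed immediately here only to keep the sketch self-contained.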

However, even when applied, those rules will still produce some fragmentation, depending on how much you pin, for how long, the resulting layout of the pinned objects in memory and many other, intermittent conditions.

So, in the end, it would be perfect to get pinned objects out of the SOH/LOH altogether and move them to a different place. This separate place would simply be ignored, by design, when the GC considers heap compaction, so we would get pinning behaviour out of the box.
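
As a small sketch of how this looks from the user’s side: .NET 5 exposes GC.AllocateArray<T> with a pinned flag, which is not mentioned in the excerpt above but is, as far as I know, the way to request such an allocation. The PinnedHeapDemo and AllocatePinnedBuffer names are hypothetical.

```csharp
using System;

public static class PinnedHeapDemo
{
    // pinned: true asks the runtime to place the array on the Pinned Object Heap,
    // so no 'fixed' block or pinned GCHandle is needed to keep it from moving.
    public static byte[] AllocatePinnedBuffer(int length)
    {
        return GC.AllocateArray<byte>(length, pinned: true);
    }

    public static void Main()
    {
        byte[] buffer = AllocatePinnedBuffer(4096);
        Console.WriteLine(buffer.Length); // 4096
    }
}
```

The element type is byte on purpose: this kind of allocation is meant for buffers of plain data handed to native code, not for arrays of object references.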

cilvalid

Everyone knows that C# is a strongly typed language and incorrect type usage is simply not possible there. So, the following program will just not compile:
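
The listing is not reproduced in this excerpt, so here is a minimal sketch of what such a program might look like. The ConsumeString name is taken from the sidenote further down; the rest is an assumption. The offending call is commented out so that the snippet itself still builds.

```csharp
using System;

public static class Program
{
    public static int ConsumeString(string s)
    {
        Console.WriteLine(s.Length);
        return s.Length;
    }

    public static void Main()
    {
        ConsumeString("Test");

        // Uncommenting the next line makes the build fail with
        // error CS1503 (cannot convert from 'object' to 'string'):
        // ConsumeString(new object());
    }
}
```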

That’s good, it means we can trust Roslyn (the C# compiler) not to generate code that violates type safety. But what if we rewrite the same code in Common Intermediate Language, bypassing C# and its compiler completely?

First of all, it will be assembled by the ILASM tool without any errors because it is syntactically correct CIL. And ILASM is not a compiler, so it will not do any type checks on its own. So we end up with an assembly file with smelly CIL inside. If not using ILASM, we could also simply modify the CIL with the help of a tool like dnSpy.

Ok, let’s say that is fine. But what will happen when we try to execute such code? Will the .NET runtime somehow verify the CIL of those methods? The Just-In-Time compiler will surely notice the type mismatch and do something to prevent executing it, right?

What will happen is that the program will just execute without any errors and will print 4 (the length of “Test”) followed by… 0 on a new line. The truth is that neither the JIT nor any other part of the .NET runtime checks type safety here.

Why is the result 0? Because when the JIT emits the native code of a particular method, it uses the type layout information of the data/types being used. And it happens that the string.Length property is just an inlined method call that accesses the very first int field of an object (because the string length is stored there):

As we pass a newly created object instance, which always has one pointer-sized field initialized to zero (this is a requirement of the current GC), the result is 0.

And yes, if we pass a reference to an object with some int field, its value will be returned (again, instead of throwing any type-safety related runtime exception). The following code (when converted to CIL) will execute with no errors and print 44!
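
The original listings are not reproduced in this excerpt. One way to try the same thing without ILASM is a sketch based on Reflection.Emit: the dynamic method below takes an object and calls string.get_Length on it without any cast, which is exactly the kind of unverifiable IL discussed above. TypeConfusionDemo, Holder and BuildLengthOfAnything are hypothetical names, and the results for the non-string arguments depend on the runtime’s object layout, so treat them as what this post reports rather than as a guarantee.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical type with a single int field, standing in for
// "an object with some int field" from the text.
public sealed class Holder
{
    public int Value = 44;
}

public static class TypeConfusionDemo
{
    // Emits IL equivalent to "return arg0.Length" where arg0 is declared as object.
    // There is no castclass, so nothing checks that arg0 really is a string.
    public static Func<object, int> BuildLengthOfAnything()
    {
        MethodInfo getLength = typeof(string).GetProperty("Length")?.GetGetMethod()
            ?? throw new InvalidOperationException("string.Length getter not found");

        var method = new DynamicMethod("LengthOfAnything", typeof(int), new[] { typeof(object) });
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Call, getLength);
        il.Emit(OpCodes.Ret);

        return (Func<object, int>)method.CreateDelegate(typeof(Func<object, int>));
    }

    public static void Main()
    {
        Func<object, int> lengthOf = BuildLengthOfAnything();

        Console.WriteLine(lengthOf("Test"));        // 4, a perfectly valid call

        // Type-unsafe calls: no exception is expected, the first int-sized
        // slot of the object is simply read as if it were the string length.
        Console.WriteLine(lengthOf(new object()));  // the post reports 0 here
        Console.WriteLine(lengthOf(new Holder()));  // the post reports the field value (44)
    }
}
```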

This all may be quite surprising, so what does the ECMA-335 standard say about it? Section “II.3 Validation and verification” mentions all the CIL verification rules and algorithms and states:

“Aside from these rules, this standard leaves as unspecified:

  • The time at which (if ever) such an algorithm should be performed.
  • What a conforming implementation should do in the event of a verification failure.”

And:

“Ordinarily, a conforming implementation of the CLI can allow unverifiable code (valid code that does not pass verification) to be executed, although this can be subject to administrative trust controls that are not part of this standard.”

While the .NET runtime does indeed do some validation, it does not verify the IL. The difference? If we run the following code:

It will end up with System.InvalidProgramException: Common Language Runtime detected an invalid program. being thrown. To summarize: invalid CIL may trigger InvalidProgramException in some cases, but in others the program will simply be allowed to execute (with many unexpected results). And all of this happens only during JIT compilation, at runtime.
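
The listing is missing from this excerpt as well. As a hedged stand-in, here is one kind of invalid (not merely unverifiable) IL, again built with Reflection.Emit: a method declared to return int whose body is a lone ret, so there is nothing on the evaluation stack to return. InvalidIlDemo and ThrowsInvalidProgram are hypothetical names, and this is not necessarily the case the original post used.

```csharp
using System;
using System.Reflection.Emit;

public static class InvalidIlDemo
{
    // Returns true when the runtime rejects the method with InvalidProgramException.
    public static bool ThrowsInvalidProgram()
    {
        // Declared to return int, but the body returns with an empty evaluation stack.
        var method = new DynamicMethod("ReturnsNothing", typeof(int), Type.EmptyTypes);
        method.GetILGenerator().Emit(OpCodes.Ret);

        try
        {
            // The failure surfaces when the method gets JIT compiled,
            // so both creating and invoking the delegate stay inside the try block.
            var broken = (Func<int>)method.CreateDelegate(typeof(Func<int>));
            broken();
            return false;
        }
        catch (InvalidProgramException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ThrowsInvalidProgram());
    }
}
```

The contrast with the type confusion case is the point: here the JIT cannot produce code at all, so it has to complain, whereas a wrong type on the stack still lets it generate something runnable.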

So, what can we do to protect ourselves before deploying and running such code in production? We need to verify our IL on our own. There is the PEVerify tool for exactly that purpose, shipped with the .NET Framework SDK. You can find it in a folder similar to c:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\x64\.

When run against our example, it will indeed detect the incorrect method, with a proper explanation:

The only problem with PEVerify is… it does not support .NET Core.

What about .NET Core then? There is ILVerify, a cross-platform, open source counterpart developed as part of the CoreRT runtime (although it supports analyzing both .NET Framework and .NET Core assemblies). Currently, to get it working we need to compile the whole of CoreRT (see the “How to run ILVerify?” issue #6198), or we can use the unofficial Microsoft.DotNet.ILVerification package to write our own command line tool (inspired by the original Program.cs).

So, nothing is officially supported and shipped with the runtime itself yet. And BTW, there is an ongoing effort to make Roslyn IL verification fully working as well.

Sidenote

The previous example was a little simplified because ConsumeString(string) called the get_Length method on a sealed string type, so it was aggressively inlined. If we experiment with a regular virtual method on a non-sealed type, things become less predictable because the call now uses the virtual stub dispatch mechanism. In the following example (again, if rewritten to CIL), how Consume behaves depends on what we have passed as an argument and where the VSD pointers lead (most likely triggering an access violation).

Conclusions

  • if you do write CIL to have more power in your hands (like using Reflection.Emit, manipulating CIL for code weaving or any other magic like the whole Unsafe class), please be aware of the difference between validation and verification. And verify your assembly on your own, as the JIT compiler will not do it!
  • if you want to trust your app FULLY, run IL verification before executing it. It could probably even be added to your CI pipeline as an additional check – you may trust your own code but not someone else’s code (or code modified by the tools you use). And yes, this is currently not straightforward in the .NET Core case.

TL;DR – would post-mortem finalization, made possible by phantom references, be useful in .NET? What is your opinion, especially based on your experience with finalization in your own use cases? Please share your insights in the comments!

Both the JVM and the CLR have the concept of finalizers, which is a way of implicit (non-deterministic) cleanup – at some point after an object is recognized as no longer reachable (and thus may be garbage collected) we may take an action specified by the finalizer – a special, dedicated method (i.e. Finalize in C#, finalize in Java). This is mostly used for cleaning up/releasing non-managed resources held by the object to be reclaimed (like OS-limited, and thus valuable, file or socket handles).
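
For readers who have not written one, here is a minimal sketch of a finalizer on the C# side. NativeBuffer is a hypothetical type wrapping a block of unmanaged memory; the Dispose method is my addition so that there is also a deterministic path to compare against.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class NativeBuffer : IDisposable
{
    private IntPtr _memory;

    public NativeBuffer(int size)
    {
        _memory = Marshal.AllocHGlobal(size);
    }

    public bool IsReleased => _memory == IntPtr.Zero;

    // Explicit, deterministic cleanup; the finalizer is no longer needed afterwards.
    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Finalizer: the C# syntax that compiles down to an override of Object.Finalize.
    // It runs at some unspecified time after the object becomes unreachable.
    ~NativeBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }
    }
}

public static class FinalizerDemo
{
    public static void Main()
    {
        var buffer = new NativeBuffer(1024);
        buffer.Dispose();
        Console.WriteLine(buffer.IsReleased); // True

        // Never disposed: the memory is only released if and when the finalizer runs.
        new NativeBuffer(1024);
    }
}
```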

However, this form of finalization has its caveats (elaborated in detail below). That’s why in Java 9 the finalize() method (and thus finalization in general) has been deprecated, which is nicely explained in the documentation:

“Deprecated. The finalization mechanism is inherently problematic. Finalization can lead to performance issues, deadlocks, and hangs. Errors in finalizers can lead to resource leaks; there is no way to cancel finalization if it is no longer necessary; and no order is specified among calls to finalize methods of different objects. Furthermore, there are no guarantees regarding the timing of finalization. The finalize method might be called on a finalizable object only after an indefinite delay, if at all.”

Tune screenshot

I would like to present to you a new tool I’ve recently started working on. I’ve called it The Ultimate .NET Experiment (Tune), as its purpose is to help learn .NET internals and performance tuning by experimenting with C# code. As it is currently at a very early 0.2 version, it can be treated as a Proof of Concept with many, many features still missing. But it is already usable enough to have some fun with.

The main way of working with this tool is as follows:

  • write a sample, valid C# script which contains at least one class with a public method taking a single string parameter. It will be executed by hitting the Run button. This script can contain as many additional methods and classes as you wish. Just remember that the first public method from the first public class will be executed (with the single parameter taken from the input box below the script). You may also choose whether you want to build in Debug or Release mode (note: currently only 64-bit (x64) compilation is supported).
  • after clicking the Run button, the script will be compiled and executed. Additionally, it will be decompiled both to IL (Intermediate Language) and to assembly code in the corresponding tabs.
  • the whole time Tune is running (including during script execution) a graph with GC data is being drawn. It shows information about generation sizes and GC occurrences (as vertical lines with a number below showing which generation was triggered).
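
A script satisfying the rules above could look like the following sketch. The Sample and Run names, the string return type and the allocation loop are all assumptions made for illustration; the only requirements taken from the description are the public class and the public method with a single string parameter.

```csharp
using System;

public class Sample
{
    // The first public method of the first public class; 'input' comes
    // from the input box below the script.
    public string Run(string input)
    {
        int count = int.Parse(input);
        long total = 0;
        for (int i = 0; i < count; i++)
        {
            // Allocate on purpose, so there is some GC activity to watch on the graph.
            var chunk = new byte[1024];
            total += chunk.Length;
        }
        return total.ToString();
    }
}
```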

Book Cover

 

I don’t know if you have a driving license. Even if you don’t, you will surely understand this sublime analogy of mine. You can drive a car and know the traffic regulations, and you can drive that way successfully for your whole life. However, if you aim to be a professional driver, an old hand, you spend whole days in a car, and sooner or later you get your hands dirty with grease working on the engine. This is how I see the .NET developer’s life. The vast majority may drive “very well”, knowing the syntax, design patterns and tricks of the trade. They are professionals. Yet there is a small group of geeks, nerds and old hands who are looking for something more. They want to understand CLR internals, know how everything works, how to dig into memory and how to use raw tools such as WinDbg. I consider this, plainly speaking, a mind-absorbing occupation that can draw anybody’s attention, even if only for a while. The popularity of the devWorkshops delivered in Poland by me and Sebastian Solnica confirms this presumption.