.NET JIT compiler is not type safe

Everyone knows that C# is a strongly typed language and incorrect type usage is simply not possible there. So, the following program will just not compile:
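The listing is not reproduced here, so what follows is only a minimal sketch of such a program. ConsumeString(string) is the method name used later in this post; the TypeSafetyDemo class and the Run method are made up for illustration. The offending call is left commented out so the rest still builds.

```csharp
public static class TypeSafetyDemo
{
    // Accepts only strings; the C# compiler enforces this at every call site.
    public static int ConsumeString(string text) => text.Length;

    public static int Run()
    {
        int length = ConsumeString("Test"); // fine: the argument is a string

        // Uncommenting the next line makes the build fail with
        // error CS1503: cannot convert from 'object' to 'string'.
        // ConsumeString(new object());

        return length;
    }
}
```

TypeSafetyDemo.Run() returns 4, the length of "Test". The commented-out call never gets as far as running.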

That’s good: it means we can trust Roslyn (the C# compiler) not to generate code that violates type safety. But what if we write the same code directly in the Common Intermediate Language, bypassing C# and its compiler completely?

First of all, the ILASM tool will assemble it without any errors because it is syntactically correct CIL. ILASM is an assembler, not a compiler, so it does no type checking of its own. We end up with an assembly file with smelly CIL inside. Instead of using ILASM, we could also simply modify the CIL of an existing assembly with a tool like dnSpy.

OK, let’s say that is fine. But what happens when we try to execute such code? Will the .NET runtime somehow verify the CIL of those methods? Surely the Just-In-Time compiler will notice the type mismatch and do something to prevent it from executing, right?

What actually happens is that the program executes without any errors and prints 4 (the length of “Test”) followed by… 0 on a new line. The truth is that neither the JIT nor any other part of the .NET runtime checks type safety.
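The experiment above is done in hand-written CIL. A rough way to try something similar from C#, assuming Reflection.Emit's DynamicMethod is an acceptable stand-in for ILASM (as far as I know its IL is not verified either), might look like the sketch below. The names UnverifiedIlDemo and BuildLengthOf are hypothetical.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class UnverifiedIlDemo
{
    // Emits the IL of "int LengthOf(object o)" that calls string.get_Length
    // directly on the argument, with no castclass in between.
    public static Func<object, int> BuildLengthOf()
    {
        MethodInfo getLength = typeof(string).GetProperty("Length").GetGetMethod();

        var method = new DynamicMethod("LengthOf", typeof(int), new[] { typeof(object) });
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);             // push the object reference
        il.Emit(OpCodes.Callvirt, getLength); // treat it as a string without checking
        il.Emit(OpCodes.Ret);

        return (Func<object, int>)method.CreateDelegate(typeof(Func<object, int>));
    }
}
```

Calling UnverifiedIlDemo.BuildLengthOf()("Test") returns 4. Passing new object() instead is undefined behavior; going by the observation above, it would be expected to return 0 rather than throw.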

Why is the result 0? Because when the JIT emits native code for a particular method, it relies on the layout information of the types being used. And it happens that the string.Length property is just an inlined method call that accesses the very first int field of the object (because the string length is stored there):

As we pass a newly created object instance, which always has one pointer-sized field initialized to zero (this is a requirement of the current GC), the result is 0.

And yes, if we pass a reference to an object with some int field, its value will be returned (again, instead of throwing any type-safety related runtime exception). The following code (when converted to CIL) will execute with no errors and print 44!
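The same reinterpretation can be sketched in plain C# with Unsafe.As, which relabels a reference without any type check. This is a swapped-in technique, not the CIL from this post, and Holder and ReinterpretDemo are hypothetical names. It assumes the runtime places the single int field where a string keeps its length, as described above.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical type for illustration: a class whose only instance field is an int.
public sealed class Holder
{
    public int Value = 44;
}

public static class ReinterpretDemo
{
    public static int LengthOfHolder()
    {
        object holder = new Holder();

        // Unsafe.As performs no type check: the reference is simply relabeled as a string.
        string notReallyAString = Unsafe.As<string>(holder);

        // Reads the int stored where a string keeps its length, which here is Holder.Value.
        return notReallyAString.Length;
    }
}
```

This is undefined behavior, so treat it as an experiment only and never touch the fake string's characters.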

This may all be quite surprising, so what does the ECMA-335 standard say about it? Section “II.3 Validation and verification” refers to all the CIL verification rules and algorithms and states:

“Aside from these rules, this standard leaves as unspecified:

  • The time at which (if ever) such an algorithm should be performed.
  • What a conforming implementation should do in the event of a verification failure.”

And:

“Ordinarily, a conforming implementation of the CLI can allow unverifiable code (valid code that does not pass verification) to be executed, although this can be subject to administrative trust controls that are not part of this standard.”

While the .NET runtime does indeed perform some validation, it does not verify the IL. The difference? If we run the following code:

It will end with System.InvalidProgramException: Common Language Runtime detected an invalid program. being thrown. To summarize: invalid CIL triggers InvalidProgramException in some cases, but in others the runtime simply lets the program execute (with all kinds of unexpected results). And all of this happens only during JIT compilation, at runtime.
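To see the validation side from C#, here is a small sketch that emits deliberately invalid IL through a DynamicMethod. It is my own hypothetical example, not the code used in this post: it pops from an empty evaluation stack, which I would expect the JIT to reject. InvalidIlDemo and RuntimeRejectsStackUnderflow are made-up names.

```csharp
using System;
using System.Reflection.Emit;

public static class InvalidIlDemo
{
    // Returns true if the runtime rejects the method with InvalidProgramException.
    public static bool RuntimeRejectsStackUnderflow()
    {
        var method = new DynamicMethod("Broken", typeof(void), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Pop); // pops from an empty evaluation stack: invalid IL
        il.Emit(OpCodes.Ret);

        try
        {
            // The exception may surface when the delegate is created or when it is invoked.
            var broken = (Action)method.CreateDelegate(typeof(Action));
            broken();
            return false;
        }
        catch (InvalidProgramException)
        {
            return true;
        }
    }
}
```

The contrast with the type-confused examples is the point: a structurally broken method body is caught during JIT compilation, while a well-formed body that merely lies about types is not.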

So, what can we do to protect ourselves before deploying and running such code in production? We need to verify our IL on our own. The PEVerify tool, shipped with the .NET Framework SDK, exists for exactly that purpose. You can find it in a folder similar to c:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\x64\.

When run against our example, it does indeed detect the incorrect method and gives a proper explanation:

The only problem with PEVerify is… it does not support .NET Core.

What about .NET Core then? There is ILVerify, a cross-platform, open-source counterpart developed as part of the CoreRT runtime (although it supports analyzing both .NET Framework and .NET Core assemblies). Currently, to get it working we need to compile the whole of CoreRT (see the “How to run ILVerify?” issue #6198), or we can use the unofficial Microsoft.DotNet.ILVerification package to write our own command-line tool (inspired by the original Program.cs).

So there is nothing officially supported and shipped with the runtime itself yet. And BTW, there is an ongoing effort to make Roslyn IL verification fully working as well.

Sidenote

The previous example was a little simplified because ConsumeString(string) called the virtual get_Length method on the sealed string type, so it was aggressively inlined. If we experiment with a regular virtual method on a non-sealed type, things become less predictable because the call now goes through the virtual stub dispatch (VSD) mechanism. In the following example (again, if rewritten to CIL), how Consume behaves depends on what we pass as an argument and where the VSD pointers lead (most likely triggering an access violation).

Conclusions

  • if you do write CIL to have more power in your hands (using Reflection.Emit, manipulating CIL for code weaving, or any other magic like the whole Unsafe class), please be aware of the difference between validation and verification. And verify your assembly on your own, as the JIT compiler will not do it!
  • if you want to trust your app FULLY, run IL verification before executing it. It could even be added to your CI pipeline as an additional check: you may trust your own code but not someone else’s (or the code modified by the tools you use). And yes, this is currently not straightforward in the .NET Core case.


7 comments

  1. Pingback: dotnetomaniak.pl
  2. Interesting post! Do you know if any of the IL emitting helper libraries, like Sigil or Gremit, would catch this error at development-time?

  3. So, what can we do to protect ourselves, before deploying and running it on production?

    Simple, don’t develop in IL. It’s not meant for human consumption.

  4. Why, instead of producing bytecode, Microsoft does not adopt the concept of C or Cobol – Write once, compile anywhere – producing an executable according to the operating system it is intended for. The import of the Java language concept – Write once, run anywhere – I think it doesn’t make sense if there is a language standard. That is, if the standard is respected, the compilation generates native code for that OS, thus ensuring that the program runs.
    In this way, the problem would cease to exist. We would have strong code verification and validation.
    It’s just my opinion, however I really liked your article. I had no idea this was possible.
    Sorry my english.
