Have you ever wondered what happens when you create and use breakpoints in .NET? Here’s a little picture that answers that question (if you don’t like the font, there is a different version at the bottom).
The main actors here are as follows:
- .NET Application – our regular .NET application that we want to debug. Methods, as provided by the compiler in Intermediate Language (IL) form, are Just-in-Time compiled to native code when called. So, imagine our “e8 50 ff ff” represents an example of the binary code of a line we want to debug (no matter what it does now).
- Debugger Runtime Control Thread (hereinafter referred to as Debugger RC Thread) – a special thread inside every .NET process that exists for debugging purposes and serves as a bridge between the CLR and an external debugger. It consists of a so-called “debugger loop”, listening for events coming from the Debug Port (supported by the OS). Please note that in the case of native debugging, such a special thread is typically injected into the debuggee process. We don’t need to do that here, as the .NET runtime provides it. Moreover, this thread understands the CLR data structures, so it is able to cooperate with the JIT and so on.
- External Debugger – the external process that we cooperate with. Imagine it as the tooling part of Visual Studio or another IDE you use. It uses a set of COM objects that are able to communicate with the Debugger Runtime Control Thread via an Inter-process communication (IPC) mechanism.
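To make the “e8 50 ff ff” example a bit more tangible, here is a toy sketch of how a software breakpoint is commonly set on x86/x64: the first byte of the instruction is overwritten with `int 3` (`0xCC`), and the original byte is remembered so it can be put back later. This is only a simulation on a byte buffer written in Python, not the CLR's actual code; `PatchTable` and its methods are hypothetical names made up for illustration.

```python
INT3 = 0xCC  # x86 one-byte breakpoint instruction (int 3)


class PatchTable:
    """Toy model of breakpoint patches: remembers the original byte per offset."""

    def __init__(self, code: bytearray):
        self.code = code   # stands in for JIT-compiled native code
        self.saved = {}    # offset -> original byte

    def set_breakpoint(self, offset: int) -> None:
        if offset in self.saved:
            return  # already patched, keep the first saved byte
        self.saved[offset] = self.code[offset]
        self.code[offset] = INT3

    def clear_breakpoint(self, offset: int) -> None:
        original = self.saved.pop(offset, None)
        if original is not None:
            self.code[offset] = original


# The example bytes from the text (hypothetical content of one source line).
code = bytearray.fromhex("e850ffff")
table = PatchTable(code)

table.set_breakpoint(0)
patched = code.hex()    # first byte replaced by cc

table.clear_breakpoint(0)
restored = code.hex()   # original bytes are back

print(patched, restored)
```

In a real process, hitting the patched byte raises a breakpoint exception; the debugger side is then notified and decides what to do next. The sketch only models the patch-and-restore bookkeeping.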
Testing shows the presence of errors in a product, but “cannot prove that there are no defects” – you probably know that quote. I remember so many hours spent debugging those little, mean bugs hiding deep in the code’s edge cases. But what’s worse, I remember even more hours spent trying to understand and reproduce an error that happens only in a production environment. Here are the 5 most common issues I’ve met in recent years:
- app hangs due to deadlocks (in the app or external library)
- memory issues like memory leaks, long GC pauses or high CPU usage due to the GC
- swallowed exceptions that prevent some logic from running, with no logs available
- threading issues like thread-pool starvation
- intermittent errors due to resource shortages, like running out of sockets or file handles
BTW, what’s your top 5?
As part of my consultancy job, I have the pleasure of helping various customers with problems that could be described collectively as GC-related (or memory-related in general). One day Tamir Dresher from Clarizen (BTW, an author of Rx.NET in Action) contacted me with this extremely interesting message (emphasis mine):
We are experiencing a phenomenon of GC duration of 15 minutes in our backend servers. (…) Do you think we can have a session with you and perhaps you’ll have ideas on how to find the root cause?
15 minutes! That’s an eternity! When we see something like this, one thought comes to mind – something really serious must be happening there! As most such problems can nowadays be diagnosed remotely, after signing NDAs we could go straight into attacking the problem. Clarizen provided a very well-prepared and concise summary of their architecture and current findings.