A few months ago I wrote an article about Zero GC in .NET Core 2.0. This proof of concept was based on a preview version of .NET Core 2.0, in which the possibility to plug in a custom garbage collector had been added. Such a “standalone GC”, as it was named, required a custom CoreCLR build because it was not enabled by default. Quite a lot of other tweaks were necessary to make this work – including the required headers from CoreCLR code was especially cumbersome.
However, the upcoming .NET Core 2.1 contains many improvements in that field, so I’ve decided to write a follow-up post. I’ve also answered one of the questions that has been bothering me for a long time (well, at least started answering…) – what would real usage of Zero GC look like in the context of an ASP.NET Core application?
.NET Core 2.1 changes
Here is a short summary of the most important changes. I’ve updated the CoreCLR.Zero repository to reflect them.
- First of all, as previously mentioned, standalone GC is now pluggable by default, so no custom CoreCLR is required. We can plug in our custom GC just by setting a single environment variable:

    set COMPlus_GCName=f:\GithubProjects\CoreCLR.ZeroGC\x64\Release\ZeroGC.dll

- As standalone GC matured, documentation in CoreCLR appeared.
- A great improvement is that the code of a library implementing standalone GC and the code of CoreCLR have been largely decoupled. Now it is enough to include only a few files directly from CoreCLR code to have things compiled:

    #include "debugmacros.h"
    #include "gcenv.base.h"
    #include "gcinterface.h"

  Previously I had to create my own headers with some of the declarations copy-pasted from CoreCLR, which was obviously cumbersome and not maintainable.
- The loading path has been refactored slightly. InitializeGarbageCollector inside CoreCLR calls GCHeapUtilities::LoadAndInitialize() with the following code inside:

    LPWSTR standaloneGcLocation = nullptr;
    CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_GCName, &standaloneGcLocation);
    if (!standaloneGcLocation)
    {
        return InitializeDefaultGC();
    }
    else
    {
        return LoadAndInitializeGC(standaloneGcLocation);
    }

  Inside LoadAndInitializeGC there is brand new functionality: verification that the GC/EE interface versions match. It checks whether the version used by the standalone GC library (returned by the GC_VersionInfo function) matches the runtime version – the major version must match and the minor version must be equal or higher. Additionally, the GC initialization function has been renamed to GC_Initialize.
- The core logic of my poor man’s allocator remained the same, so please refer to the original article for details.
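The version rule described in the list above can be sketched as follows. This is only a minimal illustration, not the CoreCLR code: `InterfaceVersion`, `IsCompatible` and `CheckExamples` are hypothetical names, and the sketch assumes that the minor version reported by the GC library must be equal to or higher than the one the runtime expects.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the version data a standalone GC reports;
// the real structure and field names in CoreCLR may differ.
struct InterfaceVersion {
    std::uint32_t majorVersion;
    std::uint32_t minorVersion;
};

// Returns true when a GC library reporting `gc` may be loaded by a runtime
// expecting `runtime`: major versions must match exactly, and the GC's minor
// version must be equal or higher (assumed reading of the rule).
inline bool IsCompatible(const InterfaceVersion& runtime, const InterfaceVersion& gc) {
    if (gc.majorVersion != runtime.majorVersion) {
        return false;
    }
    return gc.minorVersion >= runtime.minorVersion;
}

// A few illustrative cases for a runtime expecting version 2.1.
inline void CheckExamples() {
    const InterfaceVersion runtime{2, 1};
    assert(IsCompatible(runtime, InterfaceVersion{2, 1}));   // exact match
    assert(IsCompatible(runtime, InterfaceVersion{2, 2}));   // newer minor
    assert(!IsCompatible(runtime, InterfaceVersion{2, 0}));  // older minor
    assert(!IsCompatible(runtime, InterfaceVersion{1, 5}));  // different major
}
```

A check of this kind lets the runtime refuse a mismatched library before any GC function other than the version query is called.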
ASP.NET Core 2.1 integration
As this CoreCLR feature has matured, I’ve decided to use the standard .NET CLI instead of CoreRun.exe. This allowed me to easily test the question that has been bothering me for a long time – how will even the simplest ASP.NET Core application consume memory without garbage collection? .NET Core 2.1 is still in preview, so I’ve just used the Latest Daily Build of the .NET CLI to create a WebApi project:
    > f:\dotnetcli\dotnet new webapi -o CoreCLR.WebApi
I’ve modified the Controller a little to do something more dynamic than just returning two string literals:
    [HttpGet]
    public IEnumerable<string> Get()
    {
        return new string[] { DateTime.Now.ToLongTimeString(), "value2" };
    }
Additionally, I’ve disabled Server GC, which is enabled by default. Obviously setting GC mode does not make sense as there is no GC at all, right? However, Server GC crashes the runtime because the JIT_WriteBarrier_SVR64 write barrier is being used, which requires a valid card table address – and there are no card tables either 🙂
Then we simply compile and run, remembering about the environment variable:
    > f:\dotnetcli\dotnet build -c Release
    > set COMPlus_GCName=f:\GithubProjects\CoreCLR.ZeroGC\x64\Release\ZeroGC.dll
    > f:\dotnetcli\dotnet run -c Release
Everything should be running fine so… congratulations! We’ve just run an ASP.NET Core application on .NET Core with a standalone GC plugged in that does nothing but allocate.
Benchmarks
I’ve created the same WebApi via the regular .NET Core 2.0 CLI for reference. Then, using SuperBenchmarker, I’ve started a simple load test: 10 concurrent users making 100 000 requests in total, with a 10 ms delay between requests.
.NET Core 2.1 with Zero GC:
.NET Core 2.0:
As we can see, the classic GC from .NET Core was able to process slightly more requests (357.8 requests/second) compared to the version with Zero GC plugged in. It does not surprise me at all because my version uses the most primitive allocation, based on calloc. I’m quite surprised that Zero GC is doing so well after all. However, this is not so interesting because I assume that replacing calloc with a simple bump-pointer allocation would improve performance noticeably.
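To illustrate what a bump-pointer allocation could look like, here is a minimal sketch. It is not the allocator from CoreCLR.ZeroGC: `BumpArena` and its methods are hypothetical names, and the sketch assumes a single thread and one fixed-size arena that is reserved (and zeroed) up front and never frees individual objects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical fixed-size arena: one calloc up front, then every allocation
// is just a pointer bump. Nothing is ever freed until the arena is destroyed.
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity)
        : start_(static_cast<std::uint8_t*>(std::calloc(capacity, 1))),
          next_(start_),
          end_(start_ ? start_ + capacity : nullptr) {}

    ~BumpArena() { std::free(start_); }

    BumpArena(const BumpArena&) = delete;
    BumpArena& operator=(const BumpArena&) = delete;

    // Returns zeroed memory, or nullptr when the arena is exhausted.
    void* Allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        assert((align & (align - 1)) == 0);  // rounding below needs a power of two

        const std::size_t remaining = static_cast<std::size_t>(end_ - next_);
        if (size > remaining) {
            return nullptr;
        }
        // Round the request up so the next object stays aligned.
        const std::size_t rounded = (size + align - 1) & ~(align - 1);
        if (rounded > remaining) {
            return nullptr;
        }
        void* result = next_;
        next_ += rounded;  // the "bump"
        return result;
    }

    // Number of bytes handed out so far, including alignment padding.
    std::size_t Used() const { return static_cast<std::size_t>(next_ - start_); }

private:
    std::uint8_t* start_;
    std::uint8_t* next_;
    std::uint8_t* end_;
};
```

The point of the design is that the common path is an addition and a comparison instead of a calloc call per object. A usable version would additionally have to deal with multiple threads and with growing beyond a single arena.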
What is interesting is the memory usage over time. As you can see in the chart below, after a minute of such a test, the process using Zero GC takes around 1 GB of memory. This is… quite a lot. Not sure yet how to interpret this. The version with the regular GC ended with a stable 120 MB size. Both started from a fresh run.
This would mean that each REST WebApi request triggers around 55 kB of allocations. Any comments will be appreciated here…
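As a rough sanity check of the 55 kB figure, the estimate can be read as memory growth divided by the number of requests served. The calculation below assumes growth of about 1 GB over about 60 s and ignores the baseline memory of the process; the request rate it implies is derived here, not measured.

```latex
\[
N \approx \frac{\Delta M}{m}
  = \frac{10^{9}\ \text{B}}{55 \times 10^{3}\ \text{B/request}}
  \approx 18\,000\ \text{requests},
\qquad
\frac{N}{t} \approx \frac{18\,000\ \text{requests}}{60\ \text{s}}
  \approx 300\ \text{requests/s}
\]
```

A rate of roughly 300 requests/second is in the same range as the 357.8 requests/second measured for the regular GC, so the per-request estimate is at least consistent with Zero GC being only slightly slower.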
Update 30.01.2018: After debugging allocations during a single ASP.NET request, it turns out that most of them come from RouterMiddleware. This is no surprise, as currently this application does almost nothing but routing… I’ve uploaded a sample log of such a single request, which seems to be minimal (others allocate some buffers from time to time). It consumes around 7 kB of memory.
Could you do the test in .NET Core 2.1 without ZeroGC so we can see the difference between 2.0 and 2.1
I will, probably in a few days, write a post about those memory allocations and I’m going to add such comparison there.
“the process using Zero GC takes around 1 GB of memory. This is… quite a lot. Not sure yet how to interpret this.”
Who would’ve guessed that when you disable the thing that doesn’t allow a program to use a lot of memory, the program uses a lot of memory.
I am not surprised that the memory has grown, just surprised at how quickly. If you know why, I’d like to listen.
I assume it’s safe to say the entire instance of the 2.1 CLR is using zero garbage collector? Could there be other 2.1 processes adding to the overhead?
Nope, there was only this single process. I’ve double checked that.
kestrel isn’t adding anything on its end?
In fact, it is, as can be seen in the attached log mentioned at the end of the article.
Nice. I have written a post on Asp .Net Core 2.1 features which you can find here: https://neelbhatt.com/2018/02/06/asp-net-core-2-1-features/
Do you know if it is possible to override certain methods of the default GC? For example use all the defaults, but simply return when GC is executed? (Which would in effect be a ZeroGC that has all of the default characteristics except performing actual GC)
AFAIK currently it is not possible to override only certain methods of the default GC. It would be a nice exercise to extract all this juicy GC code as standalone GC though. I bet GC team have something like that prepared as a prototype for testing:)
I don’t know what is on the graph or how the information about memory is retrieved, but don’t you think that every calloc per request in your GC makes for linear growth, whereas in a normal GC, where “free” is implemented, memory for new requests is simply reallocated from old memory that was marked as “free”, and information about these allocations is not passed to the utility you use to draw the graph?
> don’t you think, that every calloc per request in your GC makes linear grow
Absolutely yes. This is why memory is growing and it is an expected behavior as nothing is freeing that memory. I was only quite surprised by the magnitude of this growth – I thought less memory is allocated per single request.
I guess this could be useful as a debugging garbage collector. See where you are allocating too many objects, or you are just using too much string concatenation instead of using a string builder.
How about removing the noise of return new string[] { DateTime.Now.ToLongTimeString(), "value2" }; by statically generating a string[], to only look at the stack footprint?