Imagine we want to create a method that generates an array containing the squares of its indexes, and the only parameter we want to pass to this method is the size of the array. A straightforward approach would be this:
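A minimal sketch of such a method, assuming the hypothetical names “GetSquares”, “length” and “array” (the breakpoint line numbers mentioned further down may not line up with this sketch, and returning the array is only there to make the result easy to check):

```csharp
using System;

public static class SquaresDemo
{
    // Hypothetical name: builds an array where element i holds i * i.
    public static int[] GetSquares(int length)
    {
        int[] array = new int[length]; // the array object is allocated on the heap

        // First loop: fill the array.
        for (int i = 0; i < length; i++)
        {
            array[i] = i * i;
        }

        // Second loop: only here to check the result.
        for (int i = 0; i < length; i++)
        {
            Console.WriteLine(array[i]);
        }

        return array;
    }

    public static void Main()
    {
        GetSquares(10);
    }
}
```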
We could have displayed the values of the array in the first “for” loop; the second one is just here to make sure that the result is the expected one (we’ll remove it when we start inspecting the memory).
As you know, “int[]” is an array type, so under the hood, an object is created on the “heap” along with a pointer on the stack that references it. Then, each access to that array in the “for” loop simply reads or updates this “heap” object through the “stack” pointer (it will be clearer at the end of the post… hopefully…).
To be sure that I’m not just saying random stuff to make this post sound really cool, we’ll use “Visual Studio” to check this. Start by removing the second “loop” so that “Console.WriteLine” doesn’t pollute the memory. Now, add a breakpoint at line 3 and another one at line 14 and start your application (in debug mode).
Before being able to analyze anything, we need to display some useful “Visual Studio” windows. The first one is the one used to see what’s in the memory: “Debug > Windows > Memory > Memory 1”. The second one (that should normally appear by default) is the diagnostic one: “Debug > Windows > Show Diagnostic Tools”.
Now, we are going to use the “diagnostic tools” to take snapshots of the memory in order to compare it before and after the execution of our code. To do so, click on the “Take a snapshot” button in the memory usage tab of the “diagnostic tools”:
Now, let’s press “F5” to hit the second breakpoint and take another snapshot.
What do we see here? It seems that there is one more object on the “heap” between the moment the first breakpoint was hit and the moment the second breakpoint was hit (olala, what a coincidence ^^). Click on “+1” to open another window where the new object will be found. Click on the “Count diff.” column to sort by the number of new objects. If you only get “0” everywhere, make sure that you unchecked the “Just my code” checkbox, accessible through the filter icon next to the “Compare to:” dropdown. If you did everything correctly, you should get something like this:
Mmmmh, it seems that there is one more “Int32[]” object on the heap, interesting 🙂 This object is of course ours. If you want to be sure, just double click on that line to see all the objects of that type on the “heap” at the current moment and you’ll see that there is an “int[]”. Moreover, if you hover it, you’ll even be able to see its content:
So now, it’s proven, our method really created an object on the “heap” to contain our values. What? You’re still not convinced? You think that it could be a coincidence that an array with the exact values we’re expecting exists in memory and we got lucky to find it (man, you’re difficult to convince ^^). Okay, okay… Let me really convince you. Still in debug mode, open the “immediate window” and type “&array”:
The “&” operator used like that returns the memory address of the variable it is applied to. So our “array” variable is located at the address “0x00000027b861d790” (in the stack). Indeed, the “&” operator gave us the address of the pointer in the stack that references the “heap”. Moreover, the “immediate window” also tells us that this pointer contains the value “170630713464” which in hexadecimal equals “0x27BA621078” and… Well… Let’s have a new look at the object we found in the “heap”:
The first column is actually the address of the object in the “heap” and… Oh my god ! It’s the same as the one contained in the pointer that we found on the stack at the address of the “array” variable. So, it is our array 😀
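If you’d rather not do the decimal to hexadecimal conversion by hand, here is a small sketch that does it (the “HexDemo” and “ToHex” names are just illustrative):

```csharp
using System;

public static class HexDemo
{
    // Formats a decimal value as an uppercase hexadecimal string with a "0x" prefix.
    public static string ToHex(long value) => "0x" + value.ToString("X");

    public static void Main()
    {
        // Value read from the pointer in the immediate window.
        Console.WriteLine(ToHex(170630713464)); // prints 0x27BA621078
    }
}
```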
I don’t want to use the heap!
However, allocating memory in the “heap” is a relatively costly operation and it could be that this method is a critical part of your application that needs to be fast. If this is the case and if you know that the size of the array will always be pretty small, it might be a good idea to allocate the array on the stack instead of the heap. This way, no allocation would be done on the “heap”, improving the performance of your method. Why did I say pretty small? Simply because you will allocate the array on the stack and if it is too big, the stack will overflow and we don’t want that.
This is why you can only do that in an unsafe context, and you have different options to create such a context. You can either create an unsafe block in a method:
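As a small sketch (the class and method names are made up for the illustration), an unsafe block inside a regular method could look like this:

```csharp
using System;

public static class UnsafeBlockDemo
{
    public static int ReadThroughPointer()
    {
        int value = 42;

        // Pointer operations are only allowed inside the unsafe block.
        unsafe
        {
            int* p = &value; // address of a local variable on the stack
            return *p;       // read the value back through the pointer
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadThroughPointer()); // prints 42
    }
}
```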
But you can also define it at method level by using the “unsafe” keyword in the method declaration:
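The same illustrative example, this time with the “unsafe” keyword on the method itself (again, the names are hypothetical):

```csharp
using System;

public static class UnsafeMethodDemo
{
    // The whole method body is an unsafe context, no inner block needed.
    public static unsafe int ReadThroughPointer()
    {
        int value = 42;
        int* p = &value; // address of a local variable on the stack
        return *p;       // read the value back through the pointer
    }

    public static void Main()
    {
        Console.WriteLine(ReadThroughPointer()); // prints 42
    }
}
```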
Moreover, you have to tell the compiler that you want to use “unsafe” code in your program by passing the “/unsafe” flag during compilation. You can also do this by adding an “AllowUnsafeBlocks” element set to “true” in the “PropertyGroup” of your build configuration (in the “csproj” file of your project) or by checking the “Allow unsafe code” checkbox in the build section of your project properties (which simply adds that “XML” element to the “csproj” file).
Now, we can implement a new version of the method above using the stack instead of the “heap”. To do so, we’ll use the “stackalloc” keyword. Even though this operator has existed for a long time, C# 7.3 extended it with array initializers; the plain “stackalloc int[length]” form works in earlier versions too. If you want to make sure that you’re using this version of C#, set the “LangVersion” element to “7.3” in the “PropertyGroup” of your build configuration (again in the “csproj” file of your project).
First, we’ll start by declaring the method as “unsafe”. We could have used an “unsafe” block in a normal method but as the whole method body will be unsafe, it’s easier to mark the method itself as unsafe:
Then we create our array on the stack via the “stackalloc” keyword. This one is used exactly like the “new” operator:
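A minimal sketch of that declaration, assuming the variable is called “array” and the size parameter “length” (the surrounding class and method are only there to make it compile, and “length” is assumed to be at least 1):

```csharp
using System;

public static class StackAllocDemo
{
    public static unsafe int WriteAndReadFirst(int length)
    {
        // Same shape as "new int[length]", but the memory block lives on the stack
        // and what we get back is a pointer to its first element.
        int* array = stackalloc int[length];

        array[0] = 7;    // pointers can be indexed like arrays
        return array[0];
    }

    public static void Main()
    {
        Console.WriteLine(WriteAndReadFirst(10)); // prints 7
    }
}
```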
If you ever did some “C/C++”, you probably started screaming and crying at the same time when you noticed the “*” symbol… “aaaah ! noooo, not pointers, nooooo !“. Indeed, most of the time, if you use unsafe code, it’s because you are going to play with pointers. As a reminder, a pointer is simply a container for a memory address. Here, “array” simply contains the address of the array on the stack.
Now, let’s fill up our array with the correct values and display them:
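Here is a sketch of what the whole method could look like, assuming the hypothetical name “PrintSquares” and the variable names “array”, “p” and “length”:

```csharp
using System;

public static class StackSquaresDemo
{
    // Same result as the heap version, but the array lives on the stack.
    public static unsafe void PrintSquares(int length)
    {
        int* array = stackalloc int[length];

        // "p" walks over the elements while "array" keeps pointing at the first one.
        int* p = array;
        for (int i = 0; i < length; i++, p++)
        {
            *p = i * i; // write at the address currently held by "p"
        }

        // Display loop, only here to check the result.
        for (int i = 0; i < length; i++)
        {
            Console.WriteLine(array[i]);
        }
    }

    public static void Main()
    {
        PrintSquares(10);
    }
}
```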
As we need to access the different elements of the array, we are going to use another pointer “p” (more on that in a moment). Then we use a “for” loop to execute a block of code the number of times defined by the “length” variable. Note that besides “i”, we also increment the value of “p”, then, in the body of the loop, we update the value located at the address contained in “p” with the result of “i * i”.
When you use a pointer to an array, applying the “++” operator to the pointer itself (without the “*” one) simply moves it to the address of the next element of the array. This is why we used the “p” pointer instead of the “array” one. If we had incremented “array” directly, at the end of our loop it would have pointed to the element after the last one of the array instead of pointing to the beginning of the array. It’s not a big deal but we could need this array later in the method, so it’s better to preserve its beginning address somewhere.
Finally, when you use a pointer, you can simply update the value it points to by using the “*” operator. If we call this method, we get exactly the same result as the previous one… At least, in appearance because under the hood, it will be totally different. Indeed, as we said, this time, the array has been allocated on the stack instead of the “heap” but as you’re really not the credulous type, I’m going to prove it to you.
As with the first method, start by removing the “loop” used to display the values, add a breakpoint at the line declaring your array and one at the line containing the closing bracket of your method, then run your code in debug mode. Finally, take a first memory snapshot when hitting the first breakpoint, then another one when you hit the second breakpoint.
This time, we can see that the number of objects on the “heap” is actually the same at both breakpoints, meaning no object has been created on the “heap”. You still don’t believe me? Move your first breakpoint to the next line and relaunch the debug mode. When this breakpoint is hit, you should be able to hover the “array” pointer to see its content.
Let’s copy this value and paste it in the memory window of “Visual Studio”.
This is actually the content of the stack in memory. As we can see, we only have a bunch of “0”, so it could be anything. Press “F5” and see what happened to this window.
The red parts of the memory are the ones that got modified between the two breakpoints, but what could it be? Let’s convert the first few from hexadecimal to decimal… Just for fun…
- 01 => 1
- 04 => 4
- 09 => 9
- 10 => 16
- 19 => 25
- 24 => 36
Wait a second… I know this sequence. Ah yeah, it’s exactly the same as the one previously displayed by the “for” loops used to display the content of the arrays. So does that mean that… Yep… Indeed, the memory above is actually showing us our array on the stack. Isn’t it awesome?
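If you don’t trust my mental arithmetic either, here is a small sketch that converts those bytes for you (the “HexBytesDemo” and “ToDecimal” names are just illustrative):

```csharp
using System;
using System.Linq;

public static class HexBytesDemo
{
    // Parses each hexadecimal string (base 16) into its decimal value.
    public static int[] ToDecimal(string[] hexBytes) =>
        hexBytes.Select(h => Convert.ToInt32(h, 16)).ToArray();

    public static void Main()
    {
        // The first few modified bytes read from the memory window.
        string[] bytes = { "01", "04", "09", "10", "19", "24" };
        Console.WriteLine(string.Join(", ", ToDecimal(bytes))); // prints 1, 4, 9, 16, 25, 36
    }
}
```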
What about performance?
I keep saying that allocating objects on the “heap” has a cost compared to doing it on the stack, but is it really true? Well, let’s test it.
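A rough sketch of such a test, using “Stopwatch”. The method names, the “Measure” helper and the array size of 10 are assumptions made for the illustration, and a crude loop like this is only an indication, not a rigorous benchmark:

```csharp
using System;
using System.Diagnostics;

public static class Benchmark
{
    private const int Iterations = 5_000_000;
    private const int Length = 10; // assumption: a small array size

    // Heap version: allocates a managed array on every call.
    public static void HeapSquares(int length)
    {
        int[] array = new int[length];
        for (int i = 0; i < length; i++)
        {
            array[i] = i * i;
        }
    }

    // Stack version: the block is released when the method returns.
    public static unsafe void StackSquares(int length)
    {
        int* array = stackalloc int[length];
        int* p = array;
        for (int i = 0; i < length; i++, p++)
        {
            *p = i * i;
        }
    }

    // Runs the action the given number of times and returns the elapsed time.
    public static TimeSpan Measure(Action action, int iterations)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        // Call both methods once before measuring, to limit "warm up" effects.
        StackSquares(Length);
        HeapSquares(Length);

        Console.WriteLine($"unsafe: {Measure(() => StackSquares(Length), Iterations)}");
        Console.WriteLine($"safe:   {Measure(() => HeapSquares(Length), Iterations)}");
    }
}
```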
When it comes to benchmarking, I’m not a pro, but this code is pretty straightforward. Basically, it will execute our two methods 5 000 000 times. I even start by testing the unsafe one to be sure that we don’t get some “warm up” perturbation for the “safe” one. Note that I also execute both methods once beforehand to avoid this kind of issue (I’m not sure it helps but it does not hurt ^^). Here is the result:
As you can see, using the “unsafe” method is “1.6” seconds faster than using the other one. Not bad…
This is pretty awesome but you’re probably thinking “Ok, but I don’t care… Why would I use a language like C# if I want to bother using pointers and other complex stuff like that? I want programming to be easy!“. And in a sense, you’re right.
However, this keyword is not there just for fun. It could happen that you develop a tool (say a highly performant web server that you would call… Mestrel…) for which you need some crucial parts to be as performant as possible. In this specific case, “stackalloc” can become very useful. As we said, allocating new objects on the “heap” and keeping track of these objects is an “expensive” operation in C#, so using the stack instead of the “heap” for small objects can make a big difference in the performance of your code.