Functions and Stack

About functions, their stack allocations and stack management by the system

What is stack?

Stack is part of computer memory in the RAM that is used for memory allocation and management. The advantage of using this type of allocation would be that the memory management is handled for us by the system automatically as opposed to using heap allocations.

We must note that the stack allocated is of limited size and we can easily overwhelm the stack if we don’t take precautions to use the stack efficiently. Specially in an embedded environment, a memory limited system, it is  of utmost importance that we write stack friendly code. If you don’t want to be the reckless coder human-1 in the pic below, let’s dive into the details of what call stack is and how the system manages it for us.

stackof.png

Actual details of the stack depends on the compiler, programming language , operating system and available instruction set. In high level languages, details of the usage of the stack are abstracted from the programmer who has access only to functions and not the memory. However, programmer needs to understand the ways to using stack efficiently.


Call stack in a system

In the previous post on program and memory we had seen that the variables local to function are allocated on the stack memory. System essentially sets aside a scratch space for every function or thread under execution on the stack memory referred to as the call frame or stack frame.The stack space used for this is referred to as the call stack or program stack. The “call stack” essentially holds stack frames of the all the active functions of the system : functions for which return has not been hit.

When a function is called, system performs below sequence of operations :

  1. Take back up of any registers required (ex: EAX, ECX, EDX)
  2. Arguments evaluated to values (parameters passed)
  3. Control flow jumps to body of the function
  4. Code execution begins
  5. Once “return” is encountered, control returns back to the function call

During this process, each stack frame in the call stack is used to hold:

  1. Parameter passed to the function
  2. The return address
  3. Local variables of a function under execution
  4. Evaluation stack in case of complex processing in the function
  5. Pointer to current context
  6. Exception handlers – in cases where try and catch may be used for some exception handling.
st1.png

Fig (1) Stack frame

And above set n Fig(1)  forms a “stack frame” associated with a “function call”. Please note that the items listed need not be in the same order.

System Stack Management

As the data structure we had looked at in older blog , the call stack also supports push and pop operations. It can grow downwards from higher address towards lower address or vice versa depending on the system implementation. The system uses two pointers for stack management :

a. Extended stack pointer (ESP) and

b. Extended Base Stack Pointer (EBP) or the frame pointer

Additionally, EIP is the Extended Instruction pointer, that points to next instruction to be executed in the program memory. This value is copied on to stack – to keep track of the return address.

ESP is like a mutable register shared between all function invocations. ESP always points to the top of the stack – so ESP essentially keeps changing with added function calls.

EBP is maintained by each stack frame. It is set equal to current ESP before allocating the stack frame for any new function call. As the name implies, base pointer is used like a fixed point from which data within stack frame can be accessed using an offset.Once the function execution is complete, the stack pointer that has progressed is set back to EBP in the current stack frame and the memory that was allocated for the function execution is considered available (note that memory is not cleaned per se, so it may contain garbage values from previous allocations). I assume an example would clarify the statements here.


Example stack operation

Let’s consider an example program below. With a gcc compiler on a Linux system, we would see the stack usage depicted in Fig (2) below.

int main() {

int localVar1;

int a=10,b=20;

localVar1 = computeSomething(a,b);

printf(“Result is %d”, localVar1);

return;

}

int computeSomething(int c, int d)

{

int localVar2; // local variable in the function

localVar2 = c;

int result = (c+d)-localVar2;

//or perform some other operation on arguments c ,d  and return

return result;

}

Stack allocation associated with the code block above is depicted in the left part, while corresponding stack de-allocation is depicted on the right. Blocks colored light grey are part of caller’s stack frame while the blocks colored in yellow are part of callee’s stack frame.

PART I : ALLOCATION

(also called stack management : procedure prolog)

[1] Assume the green line at the bottom is the current stack pointer location when main() starts execution. In the example below main() is the caller function and computeSomething() is the callee.

[2] When main() invokes computeSomething(), new stack frame is to be allocated for the function execution. The parameters passed to the function are ‘pushed’ on to stack from the last to first argument, to begin with. It is also responsibility of ‘caller’ to take a back up of the registers that may be edited by callee execution- namely EAX, EDX and ECX.

[3] Also, before starting execution of the function, the next instruction to be executed (EIP) at that point is saved as return address on the stack.

[4] EBP in computeSomething()’s stack frame is set to be main()’s EBP. Then, the execution scope shifts to computeSomething(). Stack frame contents like the local variables, arguments passed to function are accessed using an address offset from the EBP in the corresponding frame. i.e., EBP is fixed while ESP keeps moving to point at the top of the stack, remember?

[5]Before starting execution of computeSomething(), a back-up of registers that callee is responsible for, is taken and stored in it’s stack frame- namely EBX, ESI and EDI.

[6] Once in the callee scope, local variables are allocated, there may or may not be temporary storage used – say some complicated expressions are to be evaluated.

[7] If the return value of the callee is greater than 4 bytes, an extra variable pointer is passed to the function from the caller.

Ex: computeSomething(&extra,a,b) (instead of computeSomething(a,b).

st3.png

Fig (2) Stack allocation and de-allocation with function calls

PART II : DE-ALLOCATION

(also called stack management : procedure epilog)

[1] Once computeSomething()’s execution is complete and return is hit, the stack pointer- ESP is set to EBP of the frame- memory is not freed as such- so will contain garbage.  The registers (EBX, ESI, EDI) are reverted back.

[2] The execution comes back to caller’s context () – return address points to the next instruction to be executed after the function. Registers EAX,EDX and ECX are reverted to what they were before function call.

[3]If return is hit in the main() as well, then ESP is reset to EBP in the stack frame and the ESP returns to where it started – the GREEN line where stack pointer was when we started execution of main().

Please note that stack cannot grow infinitely. Stack is limited memory.

And with that, we are done analyzing how stack can grow or shrink with function calls and execution!


Stack overflow

Now that we have looked at the operation of call stack, it would be useful to look at how we could overwhelm it through our code (depicted in Fig(3) below). If stack pointer crosses the stack size limit marked with a red line- we hit the “stack overflow” issue. Example scenarios how this can happen are listed  below:

  • Trying to allocate data structure more than the allocated stack size.
  • Recursive function with no exit clause – stack frames are allocated for every instance of recursion and eventually exceeds stack size.
  • Infinite loop with some memory allocation within. In this case the local variable allocations add to stack frame and eventually exceed stack limit (not shown below).
st4.png

Fig(3) Stack overflow scenarios


Comparing with Heap allocations

In multi-thread environment, each thread is given a stack while all threads share heap space. Stack allocation and de-allocation are considered inexpensive as compared to allocations on the heap. This is because, with stack allocation and de-allocation is achieved by just moving a pointer , by virtue of LIFO operation of the stack data structure and requirements of memory management between function calls.  Whereas, with heap the allocation and de-allocation do not follow a given pattern and require book keeping and much more than just moving pointer to adjust.


Above is true for stack based programming that utilizes system stack for operations. So, if you are thinking , can there be stack-less implementations ? check this : StacklessPython

In any case the key to efficient programming is to essentially understand how system operates internally, which is what we tried today. Happy coding!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: