x86-64 Calling Conventions

Photo by Patrick on Unsplash
A calling convention is the set of rules that two functions follow when one calls the other: for example, how arguments and the return value are passed between them. Calling conventions are part of the application binary interface (ABI).

Registers

x86-64 has 16 general-purpose registers, listed in the table below. In hand-written assembly they can be used for almost any purpose, but compiled C code reserves some of them for specific roles. When a C function (the caller) calls another C function (the callee), the caller places the arguments in certain registers such as %rdi and %rsi, and the callee reads them from those registers. The caller and the callee must therefore agree on which registers carry arguments, and in what order. These rules are the calling convention.

Under the System V AMD64 ABI, which GCC follows on Linux, the first six integer or pointer arguments are passed in %rdi, %rsi, %rdx, %rcx, %r8, and %r9, in that order, and the callee's return value is passed back to the caller in %rax.
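To see the register order in action, here is a minimal sketch, assuming GCC or Clang on x86-64 Linux. The helper names first_arg, fourth_arg, sixth_arg, and check_argument_registers are hypothetical and exist only for this illustration. Each helper is written in assembly and simply copies one argument register into %rax; on other platforms a plain C fallback is used so the file still builds.

```c
#include <assert.h>

/* Hypothetical helpers: each one returns a single argument unchanged. */
long first_arg(long a, long b, long c, long d, long e, long f);
long fourth_arg(long a, long b, long c, long d, long e, long f);
long sixth_arg(long a, long b, long c, long d, long e, long f);

#if defined(__x86_64__) && defined(__linux__)
/* Assumption: System V AMD64 ABI, so the six arguments arrive in
   %rdi, %rsi, %rdx, %rcx, %r8, %r9 and the result is returned in %rax. */
__asm__(
    ".pushsection .text\n"
    ".globl first_arg\n"
    ".type first_arg, @function\n"
    "first_arg:\n"
    "    movq %rdi, %rax\n"   /* 1st argument lives in %rdi */
    "    ret\n"
    ".globl fourth_arg\n"
    ".type fourth_arg, @function\n"
    "fourth_arg:\n"
    "    movq %rcx, %rax\n"   /* 4th argument lives in %rcx */
    "    ret\n"
    ".globl sixth_arg\n"
    ".type sixth_arg, @function\n"
    "sixth_arg:\n"
    "    movq %r9, %rax\n"    /* 6th argument lives in %r9 */
    "    ret\n"
    ".popsection\n"
);
#else
/* Portable fallback so the sketch still builds on other targets. */
long first_arg(long a, long b, long c, long d, long e, long f) {
    (void)b; (void)c; (void)d; (void)e; (void)f;
    return a;
}
long fourth_arg(long a, long b, long c, long d, long e, long f) {
    (void)a; (void)b; (void)c; (void)e; (void)f;
    return d;
}
long sixth_arg(long a, long b, long c, long d, long e, long f) {
    (void)a; (void)b; (void)c; (void)d; (void)e;
    return f;
}
#endif

/* The C compiler loads the registers; the helpers read them back. */
void check_argument_registers(void) {
    assert(first_arg(10, 20, 30, 40, 50, 60) == 10);
    assert(fourth_arg(10, 20, 30, 40, 50, 60) == 40);
    assert(sixth_arg(10, 20, 30, 40, 50, 60) == 60);
}
```

Calling check_argument_registers() from a main function should complete silently. If the registers were assigned in a different order, the assertions would fail.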

Callee-saved registers belong to the caller. If the callee wants to use one of them, it must first save its value on the stack, and it must restore that value from the stack before returning to the caller.

Caller-saved registers belong to the callee, which may overwrite them freely once it has been called. If the caller still needs a value held in one of these registers after the call, it must save that value on the stack before calling the callee and restore it after the callee returns.
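The callee-saved rule can be sketched as follows, again assuming GCC or Clang on x86-64 Linux. The function name triple_using_rbx and the checker check_triple are hypothetical. The assembly version borrows %rbx as scratch space, so it saves and restores the caller's value around its own use; a plain C fallback is provided for other targets.

```c
#include <assert.h>

/* Hypothetical helper: returns 3 * x, using %rbx as scratch. */
long triple_using_rbx(long x);

#if defined(__x86_64__) && defined(__linux__)
__asm__(
    ".pushsection .text\n"
    ".globl triple_using_rbx\n"
    ".type triple_using_rbx, @function\n"
    "triple_using_rbx:\n"
    "    pushq %rbx\n"                 /* callee-saved: preserve the caller's value */
    "    movq  %rdi, %rbx\n"           /* %rbx is now free to hold x */
    "    leaq  (%rbx,%rbx,2), %rax\n"  /* %rax = x + 2*x */
    "    popq  %rbx\n"                 /* restore the caller's value */
    "    ret\n"
    ".popsection\n"
);
#else
/* Portable fallback so the sketch still builds on other targets. */
long triple_using_rbx(long x) { return 3 * x; }
#endif

void check_triple(void) {
    assert(triple_using_rbx(7) == 21);
    assert(triple_using_rbx(-4) == -12);
}
```

Had the function used a caller-saved register such as %rcx instead of %rbx, the pushq/popq pair would not be needed, because the caller may not rely on that register surviving the call.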

64-bit  32-bit  16-bit  8-bit   Special purpose  Caller-saved  Callee-saved
rax     eax     ax      ah, al  Return value     Yes
rbx     ebx     bx      bh, bl                                 Yes
rcx     ecx     cx      ch, cl  4th argument     Yes
rdx     edx     dx      dh, dl  3rd argument     Yes
rsi     esi     si      sil     2nd argument     Yes
rdi     edi     di      dil     1st argument     Yes
rbp     ebp     bp      bpl     Frame pointer                  Yes
rsp     esp     sp      spl     Stack pointer                  Yes
r8      r8d     r8w     r8b     5th argument     Yes
r9      r9d     r9w     r9b     6th argument     Yes
r10     r10d    r10w    r10b                     Yes
r11     r11d    r11w    r11b                     Yes
r12     r12d    r12w    r12b                                   Yes
r13     r13d    r13w    r13b                                   Yes
r14     r14d    r14w    r14b                                   Yes
r15     r15d    r15w    r15b                                   Yes
x86-64 Registers.

Stack Frame

If the callee takes more than six parameters, the first six are passed in registers and the remaining ones are passed on the stack. The stack is also where caller-saved and callee-saved registers are saved, so it is a central part of the calling convention.

Assume that the callee function has 8 parameters, 3 local variables, and a return value, as follows.

long callee(long a, long b, long c, long d, long e, long f, long g, long h) {
    long x;
    long y;
    long z;
    return 10;
}

void caller() {
    ...
    long x = callee(1, 2, 3, 4, 5, 6, 7, 8);
    ...
}

The caller's assembly code might look like the following. Because the stack grows from high addresses toward low addresses, the caller first subtracts 16 from %rsp to allocate two 8-byte slots, stores the 7th and 8th arguments in them, and then places the first six arguments in registers. It then calls the callee. The call instruction subtracts 8 from %rsp and pushes the address of the next instruction (the return address) onto the stack.

When the callee returns, the caller must release the two slots it allocated, so it adds 16 to %rsp. It then reads the callee's return value from %rax and stores it in a local variable.

caller:
	...
	subq   $16, %rsp      # Make stack space for the 7th and 8th parameters
	movq   $8, 8(%rsp)
	movq   $7, (%rsp)
	movq   $6, %r9
	movq   $5, %r8
	movq   $4, %rcx
	movq   $3, %rdx
	movq   $2, %rsi
	movq   $1, %rdi
	call   callee          # Call callee and push the return address onto the stack
	addq   $16, %rsp       # Clean up the stack
	movq   %rax, -8(%rbp)  # Save the return value to a local variable
	...

The callee's assembly code might look like the following. The callee first pushes the caller's %rbp onto the stack and sets up its own %rbp. It then subtracts 24 from %rsp to allocate three 8-byte slots for its three local variables.

Before returning, the callee places the return value in %rax. The leave instruction then copies %rbp to %rsp and pops the saved %rbp value from the stack. Finally, the ret instruction pops the return address from the stack and jumps to it.

callee:
	pushq   %rbp           # Save previous %rbp to the stack
	movq    %rsp, %rbp     # Move %rsp to %rbp
	subq    $24, %rsp      # Allocate space for the local variables
	...
	movq    $10, %rax      # Move return value to %rax
	leave                  # Copy %rbp to %rsp, restore previous %rbp from the stack
	ret                    # Return by popping the return address from the stack
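Once the callee has set up its own %rbp, the saved %rbp sits at 0(%rbp), the return address at 8(%rbp), and the 7th and 8th arguments at 16(%rbp) and 24(%rbp). The following is a minimal sketch of that layout, assuming GCC or Clang on x86-64 Linux. The function name seventh_minus_eighth and the checker check_stack_arguments are hypothetical; a plain C fallback covers other targets.

```c
#include <assert.h>

/* Hypothetical helper: returns g - h, the two stack-passed arguments. */
long seventh_minus_eighth(long a, long b, long c, long d,
                          long e, long f, long g, long h);

#if defined(__x86_64__) && defined(__linux__)
__asm__(
    ".pushsection .text\n"
    ".globl seventh_minus_eighth\n"
    ".type seventh_minus_eighth, @function\n"
    "seventh_minus_eighth:\n"
    "    pushq %rbp\n"              /* 0(%rbp) will hold the caller's %rbp */
    "    movq  %rsp, %rbp\n"        /* 8(%rbp) holds the return address */
    "    movq  16(%rbp), %rax\n"    /* 7th argument, first slot above it */
    "    subq  24(%rbp), %rax\n"    /* 8th argument, one slot higher */
    "    popq  %rbp\n"              /* %rsp never moved, so no leave needed */
    "    ret\n"
    ".popsection\n"
);
#else
/* Portable fallback so the sketch still builds on other targets. */
long seventh_minus_eighth(long a, long b, long c, long d,
                          long e, long f, long g, long h) {
    (void)a; (void)b; (void)c; (void)d; (void)e; (void)f;
    return g - h;
}
#endif

void check_stack_arguments(void) {
    assert(seventh_minus_eighth(1, 2, 3, 4, 5, 6, 70, 8) == 62);
}
```

Subtracting rather than adding makes the check sensitive to order: if the two stack slots were swapped, the result would change sign.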

When the caller calls the callee, the stack looks like the figure below. Walk through the sequence of the caller calling the callee again with the figure in hand, and you will have a better understanding of how the stack changes.

x86-64 Calling Conventions.

The red zone is the 128 bytes immediately below %rsp, that is, at addresses lower than %rsp. A function can use the red zone to store temporary data. In particular, a leaf function (one that does not call any other function) can use this area directly without adjusting %rsp to allocate space.
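A leaf function using the red zone can be sketched as follows, assuming GCC or Clang on x86-64 Linux in ordinary user-space code. The function name double_via_red_zone and the checker check_red_zone are hypothetical. The assembly version stores its argument at -8(%rsp) without touching %rsp; a plain C fallback covers other targets.

```c
#include <assert.h>

/* Hypothetical leaf helper: returns 2 * x via a red-zone temporary. */
long double_via_red_zone(long x);

#if defined(__x86_64__) && defined(__linux__)
__asm__(
    ".pushsection .text\n"
    ".globl double_via_red_zone\n"
    ".type double_via_red_zone, @function\n"
    "double_via_red_zone:\n"
    "    movq %rdi, -8(%rsp)\n"   /* spill x just below %rsp, no subq needed */
    "    movq -8(%rsp), %rax\n"   /* reload it from the red zone */
    "    addq %rax, %rax\n"       /* %rax = 2 * x */
    "    ret\n"
    ".popsection\n"
);
#else
/* Portable fallback so the sketch still builds on other targets. */
long double_via_red_zone(long x) { return 2 * x; }
#endif

void check_red_zone(void) {
    assert(double_via_red_zone(21) == 42);
}
```

This only works because the function makes no calls: a call instruction would push a return address at -8(%rsp) and overwrite the temporary.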

Example

Now let’s use the following C code as an example to see how GCC handles the process of calling a function. First, save the following C code as test.c.

#include <stdio.h>

long callee(long a, long b, long c, long d, long e, long f, long g, long h) {
    long sum = a + b + c + d + e + f + g + h;
    long value = sum * 10;
    return value;
}

void caller() {
    long value = callee(1, 2, 3, 4, 5, 6, 7, 8);
    printf("value is %ld\n", value);
}

We can run gcc -S test.c to generate test.s, shown below. The generated code does not match the sequence described above exactly; in particular, %rsp is handled differently. The callee is a leaf function, so GCC keeps its local variables and its copies of the register arguments in the red zone and never adjusts %rsp, which is why it ends with popq %rbp rather than leave. The caller reserves all the stack space it needs with a single subq $32, %rsp in its prologue, covering both its local variable and the two stack arguments, so it needs no separate subq/addq pair around the call. This saves a few instructions.

	.file	"test.c"
	.text
	.globl	callee
	.type	callee, @function
callee:
.LFB0:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	movq	%rdi, -24(%rbp)
	movq	%rsi, -32(%rbp)
	movq	%rdx, -40(%rbp)
	movq	%rcx, -48(%rbp)
	movq	%r8, -56(%rbp)
	movq	%r9, -64(%rbp)
	movq	-32(%rbp), %rax
	movq	-24(%rbp), %rdx
	addq	%rax, %rdx
	movq	-40(%rbp), %rax
	addq	%rax, %rdx
	movq	-48(%rbp), %rax
	addq	%rax, %rdx
	movq	-56(%rbp), %rax
	addq	%rax, %rdx
	movq	-64(%rbp), %rax
	addq	%rax, %rdx
	movq	16(%rbp), %rax
	addq	%rax, %rdx
	movq	24(%rbp), %rax
	addq	%rdx, %rax
	movq	%rax, -8(%rbp)
	movq	-8(%rbp), %rdx
	movq	%rdx, %rax
	salq	$2, %rax
	addq	%rdx, %rax
	addq	%rax, %rax
	movq	%rax, -16(%rbp)
	movq	-16(%rbp), %rax
	popq	%rbp
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE0:
	.size	callee, .-callee
	.section	.rodata
.LC0:
	.string	"value is %ld\n"
	.text
	.globl	caller
	.type	caller, @function
caller:
.LFB1:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	subq	$32, %rsp
	movq	$8, 8(%rsp)
	movq	$7, (%rsp)
	movl	$6, %r9d
	movl	$5, %r8d
	movl	$4, %ecx
	movl	$3, %edx
	movl	$2, %esi
	movl	$1, %edi
	call	callee
	movq	%rax, -8(%rbp)
	movq	-8(%rbp), %rax
	movq	%rax, %rsi
	movl	$.LC0, %edi
	movl	$0, %eax
	call	printf
	leave
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE1:
	.size	caller, .-caller
	.ident	"GCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-44)"
	.section	.note.GNU-stack,"",@progbits
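One detail in the callee listing is worth spelling out: there is no multiply instruction for sum * 10. The salq $2 / addq %rdx, %rax / addq %rax, %rax sequence computes 4x, then 5x, then 10x. Here is a minimal C sketch of the same arithmetic; the names times10_shift_add and check_times10 are hypothetical, and the work is done on unsigned long so that the shift is well defined for negative inputs.

```c
#include <assert.h>

/* Hypothetical helper mirroring the shift-and-add sequence in callee. */
long times10_shift_add(long x) {
    unsigned long v = (unsigned long)x;
    unsigned long t = v << 2;  /* salq $2, %rax   : t = 4x  */
    t += v;                    /* addq %rdx, %rax : t = 5x  */
    t += t;                    /* addq %rax, %rax : t = 10x */
    return (long)t;
}

void check_times10(void) {
    /* 1 + 2 + ... + 8 = 36, the sum computed by callee in test.c. */
    assert(times10_shift_add(36) == 360);
}
```

With the arguments used in test.c, the sum is 36, so the value passed to printf is 360.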

Conclusion

If you need to use assembly to call C functions, or C to call assembly code, you absolutely must understand calling conventions. In addition, if you are writing C/C++ code, understanding calling conventions is also very helpful.
