Calling conventions are the rules that two functions must follow when one calls the other, for example how parameters and the return value are passed between them. Calling conventions are part of the application binary interface (ABI).
Registers
x86-64 has 16 general purpose registers, as shown in the table below. General purpose registers can be used for any purpose when writing assembly. However, compiled C code assigns special purposes to some of them. When a C function named caller calls another C function named callee, the caller stores the parameters in certain registers such as %rdi and %rsi, and the callee reads the parameters from these registers. Therefore, the caller and callee must agree on which registers are used to pass parameters, and in what order. These rules are the calling conventions. This article follows the System V AMD64 ABI, which Linux and most other Unix-like systems use.
The first six integer or pointer parameters of the callee are passed by the caller through %rdi, %rsi, %rdx, %rcx, %r8, and %r9, in that order, and the return value of the callee is passed back to the caller through %rax.
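The register order above can be written down as a small lookup. The following is a minimal sketch in C, assuming every parameter is an integer or a pointer (floating-point and aggregate parameters follow other rules that this article does not cover). The names `ARG_REGISTERS` and `arg_location` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Registers that carry the first six integer or pointer parameters,
 * in order, as described in the text above. */
const char *const ARG_REGISTERS[] = {
    "%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"
};

/* Hypothetical helper: returns where the n-th parameter (1-based) is
 * passed, assuming all parameters are integers or pointers. */
const char *arg_location(int n)
{
    size_t count = sizeof ARG_REGISTERS / sizeof ARG_REGISTERS[0];

    assert(n >= 1);                 /* parameter positions start at 1 */
    if ((size_t)n <= count)
        return ARG_REGISTERS[n - 1];
    return "stack";                 /* 7th parameter onward */
}
```

For example, `arg_location(3)` yields `"%rdx"` and `arg_location(7)` yields `"stack"`.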
Callee-saved registers belong to the caller. In other words, if the callee wants to use these registers, it must first save their values on the stack, and before it returns to the caller it must restore those values from the stack.
Caller-saved registers belong to the callee. In other words, the callee can use these registers freely once it has been called. Therefore, if the caller still needs their values, it must save them on the stack before calling the callee and restore them after the callee returns.
64-bit | 32-bit | 16-bit | 8-bit | Special Purpose | Caller-saved | Callee-saved |
---|---|---|---|---|---|---|
rax | eax | ax | ah, al | Return value | ✓ | |
rbx | ebx | bx | bh, bl | | | ✓ |
rcx | ecx | cx | ch, cl | 4th argument | ✓ | |
rdx | edx | dx | dh, dl | 3rd argument | ✓ | |
rsi | esi | si | sil | 2nd argument | ✓ | |
rdi | edi | di | dil | 1st argument | ✓ | |
rbp | ebp | bp | bpl | Frame pointer | | ✓ |
rsp | esp | sp | spl | Stack pointer | | ✓ |
r8 | r8d | r8w | r8b | 5th argument | ✓ | |
r9 | r9d | r9w | r9b | 6th argument | ✓ | |
r10 | r10d | r10w | r10b | | ✓ | |
r11 | r11d | r11w | r11b | | ✓ | |
r12 | r12d | r12w | r12b | | | ✓ |
r13 | r13d | r13w | r13b | | | ✓ |
r14 | r14d | r14w | r14b | | | ✓ |
r15 | r15d | r15w | r15b | | | ✓ |
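The caller-saved and callee-saved split can also be expressed as a lookup. The sketch below assumes the System V AMD64 ABI, under which %rbx, %rbp, %rsp, and %r12 through %r15 keep their values across a call, and every other general purpose register may be overwritten by the callee. The names `CALLEE_SAVED` and `is_callee_saved` are hypothetical, and the function expects one of the sixteen 64-bit register names without the `%` prefix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Registers that a callee must preserve (assumption: System V AMD64). */
static const char *const CALLEE_SAVED[] = {
    "rbx", "rbp", "rsp", "r12", "r13", "r14", "r15"
};

/* Hypothetical helper: true if the callee must restore `reg` before it
 * returns, false if the caller is responsible for saving it. */
bool is_callee_saved(const char *reg)
{
    size_t count = sizeof CALLEE_SAVED / sizeof CALLEE_SAVED[0];

    assert(reg != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(reg, CALLEE_SAVED[i]) == 0)
            return true;
    }
    return false;   /* rax, rcx, rdx, rsi, rdi, r8 to r11 */
}
```

A caller that keeps a value in `r10` across a call therefore has to save it first, while a value in `r12` survives the call.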
Stack Frame
If the callee has more than six parameters, the first six are passed through registers and the remaining ones are passed on the stack. In addition, the stack is also used to save caller-saved and callee-saved registers. Therefore, the stack is a very important part of the calling conventions.
Assume that the callee function has 8 parameters, 3 local variables, and a return value, as follows.
    long callee(long a, long b, long c, long d,
                long e, long f, long g, long h) {
        long x;
        long y;
        long z;
        return 10;
    }

    void caller() {
        ...
        long x = callee(1, 2, 3, 4, 5, 6, 7, 8);
        ...
    }
The assembly code of the caller may be as follows. Because the stack grows from high addresses to low addresses, the caller first subtracts 16 from %rsp to allocate two 8-byte slots, stores the 7th and 8th parameters in them, and then places the first 6 parameters into registers. Then the callee is called. At this point, the call instruction pushes the address of the next instruction onto the stack, which subtracts 8 from %rsp.
When the callee ends and returns to the caller, the caller must release the two slots it allocated on the stack, so it adds 16 to %rsp. It then reads the return value of the callee from %rax and stores it in a local variable.
    caller:
        ...
        subq $16, %rsp        # Make stack space for the 7th and 8th parameters
        movq $8, 8(%rsp)
        movq $7, (%rsp)
        movq $6, %r9
        movq $5, %r8
        movq $4, %rcx
        movq $3, %rdx
        movq $2, %rsi
        movq $1, %rdi
        call callee           # Call callee and push the return address onto the stack
        addq $16, %rsp        # Clean up the stack
        movq %rax, -8(%rbp)   # Save the return value to a local variable
        ...
The assembly code of the callee may be as follows. The callee first pushes the caller's %rbp onto the stack and sets its own %rbp. Then it subtracts 24 from %rsp to allocate three 8-byte slots for its three local variables.
Before the callee ends, it places the value to be returned to the caller into %rax. Then the leave instruction copies %rbp to %rsp and pops the previous %rbp value from the stack. Finally, the ret instruction pops the return address from the stack and jumps to it.
    callee:
        pushq %rbp        # Save previous %rbp to the stack
        movq %rsp, %rbp   # Move %rsp to %rbp
        subq $24, %rsp    # Allocate space for the local variables
        ...
        movq $10, %rax    # Move return value to %rax
        leave             # Copy %rbp to %rsp, restore previous %rbp from the stack
        ret               # Return by popping the return address from the stack
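When the caller calls the callee, the stack will be as shown below. From higher to lower addresses, relative to the callee's %rbp, it holds the 8th parameter at 24(%rbp), the 7th parameter at 16(%rbp), the return address at 8(%rbp), the caller's saved %rbp at (%rbp), and then the callee's three local variables down to %rsp. Please go through the process of the caller calling the callee above again with this layout in mind. You will have a better understanding of the changes in the stack.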
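To make the changes to %rsp and %rbp easier to follow, the call sequence described above can be replayed on a toy model of the stack. This is only a sketch: the `machine` struct, the helper functions, the 512-byte stack size, and the placeholder return address `0x1234` are all assumptions made for illustration and do not correspond to any real API.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_BYTES = 512 };          /* size of the toy stack (assumption) */

struct machine {
    uint64_t slots[STACK_BYTES / 8]; /* stack memory, one entry per 8 bytes */
    uint64_t rsp;                    /* models %rsp as a byte offset */
    uint64_t rbp;                    /* models %rbp as a byte offset */
};

static void store(struct machine *m, uint64_t addr, uint64_t value)
{
    assert(addr % 8 == 0 && addr < STACK_BYTES);
    m->slots[addr / 8] = value;
}

static uint64_t load(const struct machine *m, uint64_t addr)
{
    assert(addr % 8 == 0 && addr < STACK_BYTES);
    return m->slots[addr / 8];
}

static void push(struct machine *m, uint64_t value)
{
    m->rsp -= 8;                     /* the stack grows toward low addresses */
    store(m, m->rsp, value);
}

static uint64_t pop(struct machine *m)
{
    uint64_t value = load(m, m->rsp);
    m->rsp += 8;
    return value;
}

/* Replays the caller and callee code shown above and returns the final
 * %rsp. The values the callee would find at 16(%rbp) and 24(%rbp) are
 * written to *seventh and *eighth. */
uint64_t simulate_call(uint64_t initial_rsp, uint64_t *seventh, uint64_t *eighth)
{
    struct machine m = { .rsp = initial_rsp, .rbp = initial_rsp };
    const uint64_t return_address = 0x1234;   /* placeholder value */

    /* caller, before the call */
    m.rsp -= 16;                     /* subq $16, %rsp */
    store(&m, m.rsp + 8, 8);         /* movq $8, 8(%rsp) */
    store(&m, m.rsp, 7);             /* movq $7, (%rsp) */
    push(&m, return_address);        /* call callee */

    /* callee */
    push(&m, m.rbp);                 /* pushq %rbp */
    m.rbp = m.rsp;                   /* movq %rsp, %rbp */
    m.rsp -= 24;                     /* subq $24, %rsp */
    *seventh = load(&m, m.rbp + 16); /* 7th parameter */
    *eighth = load(&m, m.rbp + 24);  /* 8th parameter */
    m.rsp = m.rbp;                   /* leave: copy %rbp to %rsp ... */
    m.rbp = pop(&m);                 /* ... and pop the saved %rbp */
    uint64_t target = pop(&m);       /* ret: pop the return address */
    assert(target == return_address);

    /* caller, after the call */
    m.rsp += 16;                     /* addq $16, %rsp */
    assert(m.rbp == initial_rsp);    /* the caller's %rbp was restored */
    return m.rsp;
}
```

Calling `simulate_call` with an initial %rsp of 256 ends with %rsp back at 256, which is the point of the convention: after the call, the caller finds its stack exactly as it left it.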
The red zone is the 128-byte area just below %rsp, that is, at addresses lower than %rsp. Functions can use the red zone to store temporary data. In particular, a leaf function, which is a function that does not call other functions, can use this area directly without adjusting %rsp to allocate space.
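The red zone rule can be summarized as a simple check. The sketch below is a simplified model, not how any particular compiler decides: it assumes a function may skip adjusting %rsp when it is a leaf function and its temporary data fits in the 128 bytes below %rsp. The names `RED_ZONE_BYTES` and `can_skip_rsp_adjustment` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum { RED_ZONE_BYTES = 128 };   /* size of the red zone below %rsp */

/* Hypothetical helper: true if a function with `frame_bytes` of
 * temporary data can keep all of it in the red zone. */
bool can_skip_rsp_adjustment(bool is_leaf, size_t frame_bytes)
{
    /* A non-leaf function cannot rely on the red zone, because the
     * call instruction pushes a return address into that area. */
    return is_leaf && frame_bytes <= RED_ZONE_BYTES;
}
```

For instance, a leaf function that needs 64 bytes passes this check, while the same function with 136 bytes, or a function that calls another one, does not.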
Example
Now let’s use the following C code as an example to see how GCC handles the process of calling a function. First, save the following C code as test.c.
    #include <stdio.h>

    long callee(long a, long b, long c, long d,
                long e, long f, long g, long h) {
        long sum = a + b + c + d + e + f + g + h;
        long value = sum * 10;
        return value;
    }

    void caller() {
        long value = callee(1, 2, 3, 4, 5, 6, 7, 8);
        printf("value is %ld\n", value);
    }
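We can use gcc -S test.c to generate test.s, as follows. The generated code is not exactly the same as the process described in this article, especially in how %rsp is handled. In the caller, GCC allocates 32 bytes once in the prologue, enough for the local variable and the two stack parameters, so no separate subq and addq surround the call, and the leave instruction releases the space. The callee is a leaf function, so it keeps its parameters and local variables in the red zone below %rsp without adjusting %rsp at all, and it restores %rbp with popq instead of leave. This reduces the amount of assembly code and improves performance.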
        .file   "test.c"
        .text
        .globl  callee
        .type   callee, @function
    callee:
    .LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movq    %rdi, -24(%rbp)
        movq    %rsi, -32(%rbp)
        movq    %rdx, -40(%rbp)
        movq    %rcx, -48(%rbp)
        movq    %r8, -56(%rbp)
        movq    %r9, -64(%rbp)
        movq    -32(%rbp), %rax
        movq    -24(%rbp), %rdx
        addq    %rax, %rdx
        movq    -40(%rbp), %rax
        addq    %rax, %rdx
        movq    -48(%rbp), %rax
        addq    %rax, %rdx
        movq    -56(%rbp), %rax
        addq    %rax, %rdx
        movq    -64(%rbp), %rax
        addq    %rax, %rdx
        movq    16(%rbp), %rax
        addq    %rax, %rdx
        movq    24(%rbp), %rax
        addq    %rdx, %rax
        movq    %rax, -8(%rbp)
        movq    -8(%rbp), %rdx
        movq    %rdx, %rax
        salq    $2, %rax
        addq    %rdx, %rax
        addq    %rax, %rax
        movq    %rax, -16(%rbp)
        movq    -16(%rbp), %rax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
    .LFE0:
        .size   callee, .-callee
        .section        .rodata
    .LC0:
        .string "value is %ld\n"
        .text
        .globl  caller
        .type   caller, @function
    caller:
    .LFB1:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        subq    $32, %rsp
        movq    $8, 8(%rsp)
        movq    $7, (%rsp)
        movl    $6, %r9d
        movl    $5, %r8d
        movl    $4, %ecx
        movl    $3, %edx
        movl    $2, %esi
        movl    $1, %edi
        call    callee
        movq    %rax, -8(%rbp)
        movq    -8(%rbp), %rax
        movq    %rax, %rsi
        movl    $.LC0, %edi
        movl    $0, %eax
        call    printf
        leave
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
    .LFE1:
        .size   caller, .-caller
        .ident  "GCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-44)"
        .section        .note.GNU-stack,"",@progbits
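One detail in the listing is worth spelling out: the callee contains no multiply instruction for `sum * 10`. Instead, the salq and the two addq instructions near the end of callee compute the same result with a shift and two additions. The sketch below mirrors that sequence in C. The function name `times_ten` is hypothetical, and unsigned arithmetic is used here only so that the shift is well defined in C for every input.

```c
/* Mirrors the instruction sequence GCC emitted for `sum * 10`:
 *   salq $2, %rax     ->  x * 4
 *   addq %rdx, %rax   ->  x * 4 + x  =  x * 5
 *   addq %rax, %rax   ->  x * 5 * 2  =  x * 10
 */
unsigned long times_ten(unsigned long x)
{
    unsigned long r = x << 2;   /* salq $2, %rax */
    r += x;                     /* addq %rdx, %rax */
    r += r;                     /* addq %rax, %rax */
    return r;
}
```

With the arguments 1 through 8 used in test.c, the sum is 36, and `times_ten(36)` gives 360.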
Conclusion
If you need to use assembly to call C functions, or C to call assembly code, you absolutely must understand calling conventions. In addition, if you are writing C/C++ code, understanding calling conventions is also very helpful.