ELF

Photo by NEOM on Unsplash
ELF is the object file format used by Linux. Whether you want to learn about linkers or disassemble binaries, you need to understand ELF files. This article introduces the ELF file format.

ELF File

ELF (Executable and Linking Format) is an object file format. There are three main object file types:

  • Relocatable files: They contain code and data. A linker links a relocatable file with other object files to create an executable or shared object file.
  • Executable files: They contain a program that is ready to be executed.
  • Shared object files: They contain code and data, and can be used in two ways:
    • The linker links a shared object file with other relocatable files and shared object files to create another object file.
    • The dynamic linker combines it with an executable file and other shared objects to create a process.

As described above, a relocatable file or a shared object file is processed by the linker, while an executable file is loaded and executed. An ELF file can therefore be viewed in two ways, the linking view and the execution view, as shown below.

Assemblers and linkers view an ELF file from the perspective of linkable sections on the left. That is, they treat an ELF file as a relocatable or shared object file. For them, an ELF file is composed of several sections, and they access all sections through the section header table.

System loaders view an ELF file from the perspective of executable segments on the right. That is, they treat an ELF file as an executable or shared object file. For them, an ELF file is composed of several segments, and they access all segments through the program header table.

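Relocatable files have a section header table, executable files have a program header table, and shared object files have both. In practice, executables produced by common toolchains also keep their section header table; the hello executable examined later in this article has 30 section headers. In addition, a segment usually consists of several sections.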

Two views of an ELF file, from Linkers & Loaders.

Data Representation

ELF supports a variety of processors, so it comes in 32-bit and 64-bit variants. Both define the same fields; the differences are the widths of some fields and, in a few structures, the order of the fields.

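| Purpose | 32-bit name | 32-bit size (bytes) | 32-bit alignment | 64-bit name | 64-bit size (bytes) | 64-bit alignment |
| --- | --- | --- | --- | --- | --- | --- |
| Unsigned program address | Elf32_Addr | 4 | 4 | Elf64_Addr | 8 | 8 |
| Unsigned file offset | Elf32_Off | 4 | 4 | Elf64_Off | 8 | 8 |
| Unsigned medium integer | Elf32_Half | 2 | 2 | Elf64_Half | 2 | 2 |
| Unsigned integer | Elf32_Word | 4 | 4 | Elf64_Word | 4 | 4 |
| Signed integer | Elf32_Sword | 4 | 4 | Elf64_Sword | 4 | 4 |
| Unsigned long integer | - | - | - | Elf64_Xword | 8 | 8 |
| Signed long integer | - | - | - | Elf64_Sxword | 8 | 8 |
| Unsigned small integer | unsigned char | 1 | 1 | unsigned char | 1 | 1 |
32-Bit and 64-Bit Data Types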
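
As a quick sanity check of the sizes in the table, the sketch below asserts them at compile time. It assumes a Linux system whose <elf.h> provides these typedefs; the helper elf_addr_size is a made-up name for illustration, not part of any library.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Compile-time checks: the build fails if a size differs from the table. */
_Static_assert(sizeof(Elf32_Addr) == 4, "Elf32_Addr should be 4 bytes");
_Static_assert(sizeof(Elf64_Addr) == 8, "Elf64_Addr should be 8 bytes");
_Static_assert(sizeof(Elf32_Off) == 4, "Elf32_Off should be 4 bytes");
_Static_assert(sizeof(Elf64_Off) == 8, "Elf64_Off should be 8 bytes");
_Static_assert(sizeof(Elf32_Half) == 2, "Elf32_Half should be 2 bytes");
_Static_assert(sizeof(Elf64_Half) == 2, "Elf64_Half should be 2 bytes");
_Static_assert(sizeof(Elf32_Word) == 4, "Elf32_Word should be 4 bytes");
_Static_assert(sizeof(Elf64_Word) == 4, "Elf64_Word should be 4 bytes");
_Static_assert(sizeof(Elf64_Xword) == 8, "Elf64_Xword should be 8 bytes");
_Static_assert(sizeof(Elf64_Sxword) == 8, "Elf64_Sxword should be 8 bytes");

/* Hypothetical helper: size in bytes of a program address for a given
   ELF class (ELFCLASS32 or ELFCLASS64); 0 for any other class. */
size_t elf_addr_size(int elf_class)
{
    switch (elf_class) {
    case ELFCLASS32: return sizeof(Elf32_Addr);
    case ELFCLASS64: return sizeof(Elf64_Addr);
    default:         return 0;
    }
}
```

Calling elf_addr_size(ELFCLASS64) yields 8, matching the 64-bit column of the table.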

ELF Header

We can find the definition of ELF header in /usr/include/elf.h.

#define EI_NIDENT (16)

typedef struct
{
  unsigned char	e_ident[EI_NIDENT];	/* Magic number and other info */
  Elf32_Half	e_type;			/* Object file type */
  Elf32_Half	e_machine;		/* Architecture */
  Elf32_Word	e_version;		/* Object file version */
  Elf32_Addr	e_entry;		/* Entry point virtual address */
  Elf32_Off	e_phoff;		/* Program header table file offset */
  Elf32_Off	e_shoff;		/* Section header table file offset */
  Elf32_Word	e_flags;		/* Processor-specific flags */
  Elf32_Half	e_ehsize;		/* ELF header size in bytes */
  Elf32_Half	e_phentsize;		/* Program header table entry size */
  Elf32_Half	e_phnum;		/* Program header table entry count */
  Elf32_Half	e_shentsize;		/* Section header table entry size */
  Elf32_Half	e_shnum;		/* Section header table entry count */
  Elf32_Half	e_shstrndx;		/* Section header string table index */
} Elf32_Ehdr;

typedef struct
{
  unsigned char	e_ident[EI_NIDENT];	/* Magic number and other info */
  Elf64_Half	e_type;			/* Object file type */
  Elf64_Half	e_machine;		/* Architecture */
  Elf64_Word	e_version;		/* Object file version */
  Elf64_Addr	e_entry;		/* Entry point virtual address */
  Elf64_Off	e_phoff;		/* Program header table file offset */
  Elf64_Off	e_shoff;		/* Section header table file offset */
  Elf64_Word	e_flags;		/* Processor-specific flags */
  Elf64_Half	e_ehsize;		/* ELF header size in bytes */
  Elf64_Half	e_phentsize;		/* Program header table entry size */
  Elf64_Half	e_phnum;		/* Program header table entry count */
  Elf64_Half	e_shentsize;		/* Section header table entry size */
  Elf64_Half	e_shnum;		/* Section header table entry count */
  Elf64_Half	e_shstrndx;		/* Section header string table index */
} Elf64_Ehdr;

We will briefly introduce each field. For details, please refer to ELF Header.

| Field | Description |
| --- | --- |
| e_ident | See e_ident. |
| e_type | Object file type. ET_REL (1): relocatable file; ET_EXEC (2): executable file; ET_DYN (3): shared object file; ET_CORE (4): core file. |
| e_machine | Required architecture, for example EM_386 (3). |
| e_version | Object file version. EV_CURRENT is 1. |
| e_entry | The virtual address of the entry point. If the file has no entry point, it is 0. |
| e_phoff | The program header table's file offset in bytes. If the file has no program header table, it is 0. |
| e_shoff | The section header table's file offset in bytes. If the file has no section header table, it is 0. |
| e_flags | Processor-specific flags associated with the file. |
| e_ehsize | The ELF header's size in bytes. |
| e_phentsize | The size in bytes of one entry in the program header table; all entries are the same size. |
| e_phnum | The number of entries in the program header table. If the file has no program header table, it is 0. The total size of the program header table is e_phentsize * e_phnum bytes. |
| e_shentsize | The size in bytes of one entry in the section header table; all entries are the same size. |
| e_shnum | The number of entries in the section header table. If the file has no section header table, it is 0. The total size of the section header table is e_shentsize * e_shnum bytes. |
| e_shstrndx | The section header table index of the entry associated with the section name string table. |
ELF Header.

e_ident

e_ident has a total of 16 bytes. For details, please refer to ELF Identification.

| Field | Size (bytes) | Description |
| --- | --- | --- |
| EI_MAG | 4 | File identification. Magic number: 7f 45 4c 46, that is, 0x7f followed by the characters E, L, F. |
| EI_CLASS | 1 | File's class. ELFCLASS32 (1): 32-bit objects; ELFCLASS64 (2): 64-bit objects. |
| EI_DATA | 1 | Data encoding. ELFDATA2LSB (1): 2's complement, little endian; ELFDATA2MSB (2): 2's complement, big endian. |
| EI_VERSION | 1 | ELF header version number. |
| EI_OSABI | 1 | OS- or ABI-specific ELF extensions used by this file. |
| EI_ABIVERSION | 1 | Identifies the version of the ABI to which the object is targeted. |
| EI_PAD | 7 | Reserved bytes, set to 0. |
ELF Identification.
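
The byte positions follow from the sizes in the table: the magic number occupies bytes 0 to 3, the class is byte 4, and the data encoding is byte 5. The following is a minimal sketch of checking these bytes by hand; the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 32 or 64 for a valid identification block, or 0 if the buffer
   is too short, the magic number is wrong, or the class is unknown. */
int elf_class_bits(const unsigned char *ident, size_t len)
{
    static const unsigned char magic[4] = {0x7f, 'E', 'L', 'F'};
    size_t i;

    assert(ident != NULL);
    if (len < 16)
        return 0;
    for (i = 0; i < 4; i++) {
        if (ident[i] != magic[i])
            return 0;
    }
    if (ident[4] == 1)      /* ELFCLASS32 */
        return 32;
    if (ident[4] == 2)      /* ELFCLASS64 */
        return 64;
    return 0;
}

/* Returns 1 for little endian, 0 for big endian, -1 if unknown. */
int elf_is_little_endian(const unsigned char *ident, size_t len)
{
    assert(ident != NULL);
    if (len < 16)
        return -1;
    if (ident[5] == 1)      /* ELFDATA2LSB */
        return 1;
    if (ident[5] == 2)      /* ELFDATA2MSB */
        return 0;
    return -1;
}
```

For an identification block that starts with 7f 45 4c 46 02 01, these helpers report a 64-bit, little-endian file.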

Example

Now let’s look at a real example. We will use the following hello.c as an example.

// hello.c
#include <stdio.h>

int inited_var = 6;
int uninited_var;

int sum(int x, int y) {
    return x * y;
}

int main() {
    int s = sum(10, 20);
    printf("The sum 10 and 20 is %d.\n", s);
    return 0;
}

We first compile hello.c into a relocatable file, and use readelf -h to read the ELF header of hello.o, as follows.

$ gcc -c hello.c -o hello.o
$ readelf -h hello.o
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          904 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         13
  Section header string table index: 12

We then compile hello.c into an executable file, and use readelf -h to read hello’s ELF header, as follows.

$ gcc hello.c -o hello
$ readelf -h hello
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x400440
  Start of program headers:          64 (bytes into file)
  Start of section headers:          6528 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         9
  Size of section headers:           64 (bytes)
  Number of section headers:         30
  Section header string table index: 29
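
The Type line in both outputs comes from e_type, a 2-byte field that follows the 16 bytes of e_ident in both the 32-bit and 64-bit headers. The sketch below reads it from raw header bytes, honoring the byte order given by EI_DATA. The function names are hypothetical, and an EI_DATA value other than 1 or 2 is treated as little endian purely as a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit field in the byte order named by EI_DATA
   (2 = big endian; anything else is treated as little endian here). */
static uint16_t read_half(const unsigned char *p, unsigned char ei_data)
{
    if (ei_data == 2)
        return (uint16_t)((p[0] << 8) | p[1]);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns e_type, or -1 if the buffer is too short or does not start
   with the ELF magic number. */
int elf_read_type(const unsigned char *hdr, size_t len)
{
    assert(hdr != NULL);
    if (len < 18)
        return -1;
    if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return -1;
    return read_half(hdr + 16, hdr[5]);   /* e_type is at offset 16 */
}

/* Short names in the style readelf prints for the Type line. */
const char *elf_type_name(int e_type)
{
    switch (e_type) {
    case 1:  return "REL";
    case 2:  return "EXEC";
    case 3:  return "DYN";
    case 4:  return "CORE";
    default: return "unknown";
    }
}
```

Given the first 18 bytes of hello.o, elf_read_type would return 1 (REL); for hello as shown above it would return 2 (EXEC).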

Sections

Section Header

The section header table is an array of Elf32_Shdr or Elf64_Shdr entries. The e_shoff field of the ELF header gives the offset of the section header table from the beginning of the file, e_shnum tells us how many entries the table has, and e_shentsize gives the size of each entry. An entry of the section header table is defined as follows.

/* Section header.  */

typedef struct
{
  Elf32_Word	sh_name;		/* Section name (string tbl index) */
  Elf32_Word	sh_type;		/* Section type */
  Elf32_Word	sh_flags;		/* Section flags */
  Elf32_Addr	sh_addr;		/* Section virtual addr at execution */
  Elf32_Off	sh_offset;		/* Section file offset */
  Elf32_Word	sh_size;		/* Section size in bytes */
  Elf32_Word	sh_link;		/* Link to another section */
  Elf32_Word	sh_info;		/* Additional section information */
  Elf32_Word	sh_addralign;		/* Section alignment */
  Elf32_Word	sh_entsize;		/* Entry size if section holds table */
} Elf32_Shdr;

typedef struct
{
  Elf64_Word	sh_name;		/* Section name (string tbl index) */
  Elf64_Word	sh_type;		/* Section type */
  Elf64_Xword	sh_flags;		/* Section flags */
  Elf64_Addr	sh_addr;		/* Section virtual addr at execution */
  Elf64_Off	sh_offset;		/* Section file offset */
  Elf64_Xword	sh_size;		/* Section size in bytes */
  Elf64_Word	sh_link;		/* Link to another section */
  Elf64_Word	sh_info;		/* Additional section information */
  Elf64_Xword	sh_addralign;		/* Section alignment */
  Elf64_Xword	sh_entsize;		/* Entry size if section holds table */
} Elf64_Shdr;

We briefly introduce each field.

| Field | Description |
| --- | --- |
| sh_name | The name of the section. The value is an index into the section header string table section. |
| sh_type | The type of the section's contents and semantics; see sh_type. |
| sh_flags | Flag bits; see sh_flags. |
| sh_addr | If the section is loadable, the address at which it should reside in memory. |
| sh_offset | The offset of the section from the beginning of the file. |
| sh_size | The size of the section in bytes. |
| sh_link | A section header table index link, whose interpretation depends on the section type. |
| sh_info | Extra information, whose interpretation depends on the section type. |
| sh_addralign | The address alignment constraint of the section; some sections have one. |
| sh_entsize | Some sections hold a table of fixed-size entries. For such a section, this gives the size of each entry. |
An entry in Section Header Table.
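
Because the entries are stored back to back, the file offset of any section header follows from e_shoff and e_shentsize. A minimal sketch, with a made-up function name:

```c
#include <assert.h>
#include <stdint.h>

/* File offset of entry `index` in the section header table. */
uint64_t shdr_offset(uint64_t e_shoff, uint16_t e_shentsize, uint16_t index)
{
    /* A zero entry size would make all entries overlap. */
    assert(e_shentsize != 0);
    return e_shoff + (uint64_t)e_shentsize * index;
}
```

Using the values readelf -h printed for hello.o (e_shoff 904, e_shentsize 64, e_shstrndx 12), the header of the section name string table would sit at 904 + 12 * 64 = 1672 bytes into the file.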

sh_type

Some of the sh_type values are listed below. Please refer to Sections for details.

| Name | Value | Description |
| --- | --- | --- |
| SHT_NULL | 0 | Marks the section header as inactive. |
| SHT_PROGBITS | 1 | Holds the program contents, including code, data, and debugger information. |
| SHT_SYMTAB | 2 | Holds a symbol table that includes all symbols and is intended for the regular linker. |
| SHT_STRTAB | 3 | Holds a string table. |
| SHT_RELA | 4 | Holds relocation entries with explicit addends. |
| SHT_HASH | 5 | Holds a symbol hash table. |
| SHT_DYNAMIC | 6 | Holds information for dynamic linking. |
| SHT_NOTE | 7 | Holds information marking the file in some way. |
| SHT_NOBITS | 8 | Similar to SHT_PROGBITS, but no space is allocated in the file. It is used for .bss data allocated at program load time. |
| SHT_REL | 9 | Holds relocation entries without explicit addends. |
| SHT_SHLIB | 10 | Reserved. |
| SHT_DYNSYM | 11 | Holds a symbol table that includes only the symbols needed for dynamic linking. |
The values of sh_type.
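
Since the values in the table run from 0 to 11 without gaps, a lookup array is enough to turn sh_type into a readable name. This is only an illustrative sketch: the function name is made up, and every value outside the table is reported as "OTHER".

```c
#include <stdint.h>

/* Map an sh_type value from the table above to a short name. */
const char *sh_type_name(uint32_t sh_type)
{
    static const char *const names[] = {
        "NULL", "PROGBITS", "SYMTAB", "STRTAB", "RELA", "HASH",
        "DYNAMIC", "NOTE", "NOBITS", "REL", "SHLIB", "DYNSYM"
    };

    if (sh_type < sizeof names / sizeof names[0])
        return names[sh_type];
    return "OTHER";   /* OS-, processor-, or application-specific types */
}
```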

sh_flags

Some of the sh_flags values are listed below. Please refer to Sections for details.

| Name | Value | Description |
| --- | --- | --- |
| SHF_WRITE | 0x1 | The section contains data and is writable when loaded. |
| SHF_ALLOC | 0x2 | The section occupies memory when the program is loaded. |
| SHF_EXECINSTR | 0x4 | The section contains executable machine code. |
The value of sh_flags.

Special Sections

Although we can define any sections we like, some sections are commonly used. They are listed below, and understanding them will help us understand ELF files better. Their names all begin with a dot (.). For more special sections, please refer to Special Sections.

| Name | Type | Description |
| --- | --- | --- |
| .bss | SHT_NOBITS | Uninitialized data; takes no file space. |
| .comment | SHT_PROGBITS | Version control information. |
| .data, .data1 | SHT_PROGBITS | Initialized data. |
| .debug | SHT_PROGBITS | Information for symbolic debugging. |
| .dynamic | SHT_DYNAMIC | Dynamic linking information. |
| .dynstr | SHT_STRTAB | Strings for the dynamic linking symbol table. |
| .dynsym | SHT_DYNSYM | Dynamic linking symbol table. |
| .got | SHT_PROGBITS | Global offset table. |
| .hash | SHT_HASH | Symbol hash table. |
| .line | SHT_PROGBITS | Line number information for symbolic debugging. |
| .note | SHT_NOTE | Note section. |
| .plt | SHT_PROGBITS | Procedure linkage table. |
| .relname | SHT_REL | Relocation information. .rel.text is a relocation section for .text, and .rel.data is for .data. |
| .relaname | SHT_RELA | Relocation information. .rela.text is a relocation section for .text, and .rela.data is for .data. |
| .rodata, .rodata1 | SHT_PROGBITS | Read-only data for a non-writable segment. |
| .shstrtab | SHT_STRTAB | Section names. |
| .strtab | SHT_STRTAB | Strings for the symbol table. |
| .symtab | SHT_SYMTAB | Symbol table. |
| .tbss | SHT_NOBITS | Uninitialized thread-local data; takes no file space. |
| .tdata | SHT_PROGBITS | Initialized thread-local data. |
| .text | SHT_PROGBITS | Executable instructions. |
Special Sections.

Example

We can use readelf -S to read section headers. The following are the section headers of hello.o.

$ readelf -S hello.o
There are 13 section headers, starting at offset 0x388:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000000000  00000040
       0000000000000048  0000000000000000  AX       0     0     1
  [ 2] .rela.text        RELA             0000000000000000  000002a8
       0000000000000048  0000000000000018   I      10     1     8
  [ 3] .data             PROGBITS         0000000000000000  00000088
       0000000000000004  0000000000000000  WA       0     0     4
  [ 4] .bss              NOBITS           0000000000000000  0000008c
       0000000000000000  0000000000000000  WA       0     0     1
  [ 5] .rodata           PROGBITS         0000000000000000  0000008c
       000000000000001a  0000000000000000   A       0     0     1
  [ 6] .comment          PROGBITS         0000000000000000  000000a6
       000000000000002e  0000000000000001  MS       0     0     1
  [ 7] .note.GNU-stack   PROGBITS         0000000000000000  000000d4
       0000000000000000  0000000000000000           0     0     1
  [ 8] .eh_frame         PROGBITS         0000000000000000  000000d8
       0000000000000058  0000000000000000   A       0     0     8
  [ 9] .rela.eh_frame    RELA             0000000000000000  000002f0
       0000000000000030  0000000000000018   I      10     8     8
  [10] .symtab           SYMTAB           0000000000000000  00000130
       0000000000000150  0000000000000018          11     9     8
  [11] .strtab           STRTAB           0000000000000000  00000280
       0000000000000026  0000000000000000           0     0     1
  [12] .shstrtab         STRTAB           0000000000000000  00000320
       0000000000000061  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)

The following are the section headers for hello.

$ readelf -S hello
There are 30 section headers, starting at offset 0x1980:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         0000000000400238  00000238
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note.ABI-tag     NOTE             0000000000400254  00000254
       0000000000000020  0000000000000000   A       0     0     4
  [ 3] .note.gnu.build-i NOTE             0000000000400274  00000274
       0000000000000024  0000000000000000   A       0     0     4
  [ 4] .gnu.hash         GNU_HASH         0000000000400298  00000298
       000000000000001c  0000000000000000   A       5     0     8
  [ 5] .dynsym           DYNSYM           00000000004002b8  000002b8
       0000000000000060  0000000000000018   A       6     1     8
  [ 6] .dynstr           STRTAB           0000000000400318  00000318
       000000000000003f  0000000000000000   A       0     0     1
  [ 7] .gnu.version      VERSYM           0000000000400358  00000358
       0000000000000008  0000000000000002   A       5     0     2
  [ 8] .gnu.version_r    VERNEED          0000000000400360  00000360
       0000000000000020  0000000000000000   A       6     1     8
  [ 9] .rela.dyn         RELA             0000000000400380  00000380
       0000000000000018  0000000000000018   A       5     0     8
  [10] .rela.plt         RELA             0000000000400398  00000398
       0000000000000048  0000000000000018  AI       5    23     8
  [11] .init             PROGBITS         00000000004003e0  000003e0
       000000000000001a  0000000000000000  AX       0     0     4
  [12] .plt              PROGBITS         0000000000400400  00000400
       0000000000000040  0000000000000010  AX       0     0     16
  [13] .text             PROGBITS         0000000000400440  00000440
       00000000000001b2  0000000000000000  AX       0     0     16
  [14] .fini             PROGBITS         00000000004005f4  000005f4
       0000000000000009  0000000000000000  AX       0     0     4
  [15] .rodata           PROGBITS         0000000000400600  00000600
       000000000000002a  0000000000000000   A       0     0     8
  [16] .eh_frame_hdr     PROGBITS         000000000040062c  0000062c
       000000000000003c  0000000000000000   A       0     0     4
  [17] .eh_frame         PROGBITS         0000000000400668  00000668
       0000000000000114  0000000000000000   A       0     0     8
  [18] .init_array       INIT_ARRAY       0000000000600e10  00000e10
       0000000000000008  0000000000000008  WA       0     0     8
  [19] .fini_array       FINI_ARRAY       0000000000600e18  00000e18
       0000000000000008  0000000000000008  WA       0     0     8
  [20] .jcr              PROGBITS         0000000000600e20  00000e20
       0000000000000008  0000000000000000  WA       0     0     8
  [21] .dynamic          DYNAMIC          0000000000600e28  00000e28
       00000000000001d0  0000000000000010  WA       6     0     8
  [22] .got              PROGBITS         0000000000600ff8  00000ff8
       0000000000000008  0000000000000008  WA       0     0     8
  [23] .got.plt          PROGBITS         0000000000601000  00001000
       0000000000000030  0000000000000008  WA       0     0     8
  [24] .data             PROGBITS         0000000000601030  00001030
       0000000000000008  0000000000000000  WA       0     0     4
  [25] .bss              NOBITS           0000000000601038  00001038
       0000000000000008  0000000000000000  WA       0     0     4
  [26] .comment          PROGBITS         0000000000000000  00001038
       000000000000002d  0000000000000001  MS       0     0     1
  [27] .symtab           SYMTAB           0000000000000000  00001068
       0000000000000630  0000000000000018          28    46     8
  [28] .strtab           STRTAB           0000000000000000  00001698
       00000000000001dd  0000000000000000           0     0     1
  [29] .shstrtab         STRTAB           0000000000000000  00001875
       0000000000000108  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)

String Table

String Table

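A string table contains a sequence of null-terminated strings. Object files use these strings to represent symbol and section names. The first byte of a string table must be \0.

A string is referenced by the byte offset at which it starts in the table. For example, the sh_name field of a section header holds such an index into the section header string table. If the value of sh_name is 11, the name is the string that starts at byte 11 of the table and runs up to the next \0, which is able in the table shown below.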

String table.
String table indexes.
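
Looking up a name is therefore just pointer arithmetic plus a bounds check. The sketch below assumes a sample table laid out as \0 name. \0 Variable \0 able \0 \0 xx \0, chosen so that index 11 yields able; both the sample bytes and the function name strtab_get are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed sample table: index 1 is "name.", 7 is "Variable", 11 and 16
   are both "able" (index 11 points into the middle of "Variable"). */
static const char sample_strtab[] = "\0name.\0Variable\0able\0\0xx";

/* Returns the string starting at `index`, or NULL if the index is out of
   range or no terminating \0 exists before the end of the table. */
const char *strtab_get(const char *tab, size_t size, size_t index)
{
    assert(tab != NULL);
    if (index >= size)
        return NULL;
    if (memchr(tab + index, '\0', size - index) == NULL)
        return NULL;
    return tab + index;
}
```

With this sample, strtab_get(sample_strtab, sizeof sample_strtab, 11) returns "able", which shows that a string table index may point into the middle of a longer string.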

Example

We can use readelf -p name to dump a string table, where name is the name of the section. Any section whose type is SHT_STRTAB is a string table. The following are all the string tables in hello.o.

$ readelf -p .strtab hello.o
String dump of section '.strtab':
  [     1]  hello.c
  [     9]  uninited_var
  [    16]  sum
  [    1a]  main
  [    1f]  printf

$ readelf -p .shstrtab hello.o
String dump of section '.shstrtab':
  [     1]  .symtab
  [     9]  .strtab
  [    11]  .shstrtab
  [    1b]  .rela.text
  [    26]  .data
  [    2c]  .bss
  [    31]  .rodata
  [    39]  .comment
  [    42]  .note.GNU-stack
  [    52]  .rela.eh_frame

The following are all string tables for hello.

$ readelf -p .dynstr hello
String dump of section '.dynstr':
  [     1]  libc.so.6
  [     b]  printf
  [    12]  __libc_start_main
  [    24]  __gmon_start__
  [    33]  GLIBC_2.2.5

$ readelf -p .strtab hello
String dump of section '.strtab':
  [     1]  crtstuff.c
  [     c]  __JCR_LIST__
  [    19]  deregister_tm_clones
  [    2e]  __do_global_dtors_aux
  [    44]  completed.6355
  [    53]  __do_global_dtors_aux_fini_array_entry
  [    7a]  frame_dummy
  [    86]  __frame_dummy_init_array_entry
  [    a5]  hello.c
  [    ad]  __FRAME_END__
  [    bb]  __JCR_END__
  [    c7]  __init_array_end
  [    d8]  _DYNAMIC
  [    e1]  __init_array_start
  [    f4]  __GNU_EH_FRAME_HDR
  [   107]  _GLOBAL_OFFSET_TABLE_
  [   11d]  __libc_csu_fini
  [   12d]  _edata
  [   134]  printf@@GLIBC_2.2.5
  [   148]  __libc_start_main@@GLIBC_2.2.5
  [   167]  __data_start
  [   174]  __gmon_start__
  [   183]  __dso_handle
  [   190]  sum
  [   194]  _IO_stdin_used
  [   1a3]  __libc_csu_init
  [   1b3]  uninited_var
  [   1c0]  __bss_start
  [   1cc]  main
  [   1d1]  __TMC_END__

$ readelf -p .shstrtab hello
String dump of section '.shstrtab':
  [     1]  .symtab
  [     9]  .strtab
  [    11]  .shstrtab
  [    1b]  .interp
  [    23]  .note.ABI-tag
  [    31]  .note.gnu.build-id
  [    44]  .gnu.hash
  [    4e]  .dynsym
  [    56]  .dynstr
  [    5e]  .gnu.version
  [    6b]  .gnu.version_r
  [    7a]  .rela.dyn
  [    84]  .rela.plt
  [    8e]  .init
  [    94]  .text
  [    9a]  .fini
  [    a0]  .rodata
  [    a8]  .eh_frame_hdr
  [    b6]  .eh_frame
  [    c0]  .init_array
  [    cc]  .fini_array
  [    d8]  .jcr
  [    dd]  .dynamic
  [    e6]  .got
  [    eb]  .got.plt
  [    f4]  .data
  [    fa]  .bss
  [    ff]  .comment

Symbol Table

Symbol Table

Symbol table contains symbolic definitions and references for locating and relocating a program. The entry of a symbol table is defined as follows.

typedef struct
{
  Elf32_Word	st_name;		/* Symbol name (string tbl index) */
  Elf32_Addr	st_value;		/* Symbol value */
  Elf32_Word	st_size;		/* Symbol size */
  unsigned char	st_info;		/* Symbol type and binding */
  unsigned char	st_other;		/* Symbol visibility */
  Elf32_Section	st_shndx;		/* Section index */
} Elf32_Sym;

typedef struct
{
  Elf64_Word	st_name;		/* Symbol name (string tbl index) */
  unsigned char	st_info;		/* Symbol type and binding */
  unsigned char st_other;		/* Symbol visibility */
  Elf64_Section	st_shndx;		/* Section index */
  Elf64_Addr	st_value;		/* Symbol value */
  Elf64_Xword	st_size;		/* Symbol size */
} Elf64_Sym;

typedef struct
{
  Elf32_Half si_boundto;		/* Direct bindings, symbol bound to */
  Elf32_Half si_flags;			/* Per symbol flags */
} Elf32_Syminfo;

typedef struct
{
  Elf64_Half si_boundto;		/* Direct bindings, symbol bound to */
  Elf64_Half si_flags;			/* Per symbol flags */
} Elf64_Syminfo;

We briefly introduce each field.

| Field | Description |
| --- | --- |
| st_name | A symbol's name, given as an index into the symbol string table. |
| st_value | A symbol's value. |
| st_size | A symbol's size. |
| st_info | A symbol's type and binding attributes. See Symbol Binding and Symbol Types. |
| st_other | A symbol's visibility. |
| st_shndx | The relevant section header table index. |
Symbol Table Entry.

Symbol Binding

The high four bits of st_info in the Symbol table represent symbol binding. Some of its values are listed below. For other values, please refer to the Symbol Table.

| Name | Value | Description |
| --- | --- | --- |
| STB_LOCAL | 0 | Local symbols are not visible outside the object file. |
| STB_GLOBAL | 1 | Global symbols are visible to all object files being combined. |
| STB_WEAK | 2 | Weak symbols resemble global symbols, but their definitions have lower precedence. |
Symbol Binding.

Symbol Types

The lower four bits of st_info in the Symbol table represent the symbol type. Some of its values are listed below. For other values, please refer to the Symbol Table.

| Name | Value | Description |
| --- | --- | --- |
| STT_NOTYPE | 0 | Not specified. |
| STT_OBJECT | 1 | A data object, such as a variable, an array, and so on. |
| STT_FUNC | 2 | A function or other executable code. |
| STT_SECTION | 3 | A section for relocation. |
| STT_FILE | 4 | The name of the source file. |
| STT_COMMON | 5 | Uninitialized common block. |
Symbol Types.
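
Since the binding lives in the high four bits of st_info and the type in the low four bits, both can be extracted with a shift and a mask. The helpers below are a small sketch with made-up names; elf.h provides macros for the same purpose (for example ELF64_ST_BIND and ELF64_ST_TYPE).

```c
#include <assert.h>

/* Binding: the high four bits of st_info. */
unsigned st_bind(unsigned char st_info)
{
    return st_info >> 4;
}

/* Type: the low four bits of st_info. */
unsigned st_type(unsigned char st_info)
{
    return st_info & 0xf;
}

/* Pack a binding and a type back into one st_info byte. */
unsigned char st_make_info(unsigned bind, unsigned type)
{
    assert(bind <= 0xf && type <= 0xf);
    return (unsigned char)((bind << 4) | type);
}
```

For a global function, the binding is STB_GLOBAL (1) and the type is STT_FUNC (2), so st_info is (1 << 4) | 2 = 0x12.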

Example

We can use readelf -s to read the symbol table. The following are the symbol tables for hello.o.

$ readelf -s hello.o
Symbol table '.symtab' contains 14 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS hello.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1 
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4 
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5 
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    7 
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    8 
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    6 
     9: 0000000000000000     4 OBJECT  GLOBAL DEFAULT    3 inited_var
    10: 0000000000000004     4 OBJECT  GLOBAL DEFAULT  COM uninited_var
    11: 0000000000000000    19 FUNC    GLOBAL DEFAULT    1 sum
    12: 0000000000000013    53 FUNC    GLOBAL DEFAULT    1 main
    13: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND printf

The following are the symbol tables for hello.

$ readelf -s hello
Symbol table '.dynsym' contains 4 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@GLIBC_2.2.5 (2)
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.2.5 (2)
     3: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__

Symbol table '.symtab' contains 66 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000400238     0 SECTION LOCAL  DEFAULT    1 
     2: 0000000000400254     0 SECTION LOCAL  DEFAULT    2 
     3: 0000000000400274     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000400298     0 SECTION LOCAL  DEFAULT    4 
     5: 00000000004002b8     0 SECTION LOCAL  DEFAULT    5 
     6: 0000000000400318     0 SECTION LOCAL  DEFAULT    6 
     7: 0000000000400358     0 SECTION LOCAL  DEFAULT    7 
     8: 0000000000400360     0 SECTION LOCAL  DEFAULT    8 
     9: 0000000000400380     0 SECTION LOCAL  DEFAULT    9 
    10: 0000000000400398     0 SECTION LOCAL  DEFAULT   10 
    11: 00000000004003e0     0 SECTION LOCAL  DEFAULT   11 
    12: 0000000000400400     0 SECTION LOCAL  DEFAULT   12 
    13: 0000000000400440     0 SECTION LOCAL  DEFAULT   13 
    14: 00000000004005f4     0 SECTION LOCAL  DEFAULT   14 
    15: 0000000000400600     0 SECTION LOCAL  DEFAULT   15 
    16: 000000000040062c     0 SECTION LOCAL  DEFAULT   16 
    17: 0000000000400668     0 SECTION LOCAL  DEFAULT   17 
    18: 0000000000600e10     0 SECTION LOCAL  DEFAULT   18 
    19: 0000000000600e18     0 SECTION LOCAL  DEFAULT   19 
    20: 0000000000600e20     0 SECTION LOCAL  DEFAULT   20 
    21: 0000000000600e28     0 SECTION LOCAL  DEFAULT   21 
    22: 0000000000600ff8     0 SECTION LOCAL  DEFAULT   22 
    23: 0000000000601000     0 SECTION LOCAL  DEFAULT   23 
    24: 0000000000601030     0 SECTION LOCAL  DEFAULT   24 
    25: 0000000000601038     0 SECTION LOCAL  DEFAULT   25 
    26: 0000000000000000     0 SECTION LOCAL  DEFAULT   26 
    27: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
    28: 0000000000600e20     0 OBJECT  LOCAL  DEFAULT   20 __JCR_LIST__
    29: 0000000000400470     0 FUNC    LOCAL  DEFAULT   13 deregister_tm_clones
    30: 00000000004004a0     0 FUNC    LOCAL  DEFAULT   13 register_tm_clones
    31: 00000000004004e0     0 FUNC    LOCAL  DEFAULT   13 __do_global_dtors_aux
    32: 0000000000601038     1 OBJECT  LOCAL  DEFAULT   25 completed.6355
    33: 0000000000600e18     0 OBJECT  LOCAL  DEFAULT   19 __do_global_dtors_aux_fin
    34: 0000000000400500     0 FUNC    LOCAL  DEFAULT   13 frame_dummy
    35: 0000000000600e10     0 OBJECT  LOCAL  DEFAULT   18 __frame_dummy_init_array_
    36: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS hello.c
    37: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
    38: 0000000000400778     0 OBJECT  LOCAL  DEFAULT   17 __FRAME_END__
    39: 0000000000600e20     0 OBJECT  LOCAL  DEFAULT   20 __JCR_END__
    40: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS 
    41: 0000000000600e18     0 NOTYPE  LOCAL  DEFAULT   18 __init_array_end
    42: 0000000000600e28     0 OBJECT  LOCAL  DEFAULT   21 _DYNAMIC
    43: 0000000000600e10     0 NOTYPE  LOCAL  DEFAULT   18 __init_array_start
    44: 000000000040062c     0 NOTYPE  LOCAL  DEFAULT   16 __GNU_EH_FRAME_HDR
    45: 0000000000601000     0 OBJECT  LOCAL  DEFAULT   23 _GLOBAL_OFFSET_TABLE_
    46: 00000000004005f0     2 FUNC    GLOBAL DEFAULT   13 __libc_csu_fini
    47: 0000000000601034     4 OBJECT  GLOBAL DEFAULT   24 inited_var
    48: 0000000000601030     0 NOTYPE  WEAK   DEFAULT   24 data_start
    49: 0000000000601038     0 NOTYPE  GLOBAL DEFAULT   24 _edata
    50: 00000000004005f4     0 FUNC    GLOBAL DEFAULT   14 _fini
    51: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@@GLIBC_2.2.5
    52: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@@GLIBC_
    53: 0000000000601030     0 NOTYPE  GLOBAL DEFAULT   24 __data_start
    54: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
    55: 0000000000400608     0 OBJECT  GLOBAL HIDDEN    15 __dso_handle
    56: 000000000040052d    19 FUNC    GLOBAL DEFAULT   13 sum
    57: 0000000000400600     4 OBJECT  GLOBAL DEFAULT   15 _IO_stdin_used
    58: 0000000000400580   101 FUNC    GLOBAL DEFAULT   13 __libc_csu_init
    59: 0000000000601040     0 NOTYPE  GLOBAL DEFAULT   25 _end
    60: 0000000000400440     0 FUNC    GLOBAL DEFAULT   13 _start
    61: 000000000060103c     4 OBJECT  GLOBAL DEFAULT   25 uninited_var
    62: 0000000000601038     0 NOTYPE  GLOBAL DEFAULT   25 __bss_start
    63: 0000000000400540    53 FUNC    GLOBAL DEFAULT   13 main
    64: 0000000000601038     0 OBJECT  GLOBAL HIDDEN    24 __TMC_END__
    65: 00000000004003e0     0 FUNC    GLOBAL DEFAULT   11 _init

Program Header

Executable and shared object files have a program header table. Each program header describes a segment or other information that the system needs in order to prepare the program for execution. A segment contains one or more sections.

The e_phnum field of the ELF header tells us how many entries there are in the program header table, and the e_phentsize field gives the size of each entry.

The following is the definition of program header.

typedef struct
{
  Elf32_Word	p_type;			/* Segment type */
  Elf32_Off	p_offset;		/* Segment file offset */
  Elf32_Addr	p_vaddr;		/* Segment virtual address */
  Elf32_Addr	p_paddr;		/* Segment physical address */
  Elf32_Word	p_filesz;		/* Segment size in file */
  Elf32_Word	p_memsz;		/* Segment size in memory */
  Elf32_Word	p_flags;		/* Segment flags */
  Elf32_Word	p_align;		/* Segment alignment */
} Elf32_Phdr;

typedef struct
{
  Elf64_Word	p_type;			/* Segment type */
  Elf64_Word	p_flags;		/* Segment flags */
  Elf64_Off	p_offset;		/* Segment file offset */
  Elf64_Addr	p_vaddr;		/* Segment virtual address */
  Elf64_Addr	p_paddr;		/* Segment physical address */
  Elf64_Xword	p_filesz;		/* Segment size in file */
  Elf64_Xword	p_memsz;		/* Segment size in memory */
  Elf64_Xword	p_align;		/* Segment alignment */
} Elf64_Phdr;

We briefly introduce each field.

Fields     Descriptions
p_type     The segment type.
p_flags    The segment's flags.
p_offset   The offset from the beginning of the file at which the segment resides.
p_vaddr    The virtual address at which the segment resides in the memory image.
p_paddr    The physical address at which the segment resides in the memory image. System V ignores physical addressing.
p_filesz   The number of bytes in the file image of the segment.
p_memsz    The number of bytes in the memory image of the segment.
p_align    The value to which the segment is aligned in memory and in the file.
Program Header.

Segment Types

Some segment types are listed below. For other values, please refer to the Program Header.

Names        Values   Descriptions
PT_NULL      0        An unused entry. Its other fields are undefined.
PT_LOAD      1        A loadable segment.
PT_DYNAMIC   2        Dynamic linking information.
PT_PHDR      6        The location and size of the program header table, both in the file and in memory.
Segment Types.

Segment Contents

A segment contains one or more sections, but this fact is transparent to the program header. The figure below is a typical text segment example. It contains read-only instructions and data.

An example of a text segment.

The figure below is a typical data segment example. It contains writable data and instructions.

An example of a data segment.

Example

We can use readelf -l to read the program header table. The following is the program header table of hello. hello.o has no program headers because it is a relocatable file, not an executable or shared object file.

$ readelf -l hello
Elf file type is EXEC (Executable file)
Entry point 0x400440
There are 9 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001f8 0x00000000000001f8  R E    8
  INTERP         0x0000000000000238 0x0000000000400238 0x0000000000400238
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x000000000000077c 0x000000000000077c  R E    200000
  LOAD           0x0000000000000e10 0x0000000000600e10 0x0000000000600e10
                 0x0000000000000228 0x0000000000000230  RW     200000
  DYNAMIC        0x0000000000000e28 0x0000000000600e28 0x0000000000600e28
                 0x00000000000001d0 0x00000000000001d0  RW     8
  NOTE           0x0000000000000254 0x0000000000400254 0x0000000000400254
                 0x0000000000000044 0x0000000000000044  R      4
  GNU_EH_FRAME   0x000000000000062c 0x000000000040062c 0x000000000040062c
                 0x000000000000003c 0x000000000000003c  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     10
  GNU_RELRO      0x0000000000000e10 0x0000000000600e10 0x0000000000600e10
                 0x00000000000001f0 0x00000000000001f0  R      1

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame 
   03     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss 
   04     .dynamic 
   05     .note.ABI-tag .note.gnu.build-id 
   06     .eh_frame_hdr 
   07     
   08     .init_array .fini_array .jcr .dynamic .got 

Conclusion

This article may be a bit boring to read, because it is just about ELF files. However, understanding ELF files is a good way to learn linkers.
