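This page focuses mainly on the assembler and the linker.

Introduction

GNU Binutils
gcc、glibc和binutils模块之间的关系
- GCC: the compiler.
- Binutils: the assembler, the linker, and other tools for working with binary object files.
- Glibc: provides the C runtime library.
中国科学技术大学 - 陈香兰老師的主頁
How statically linked programs run on Linux
- A statically linked program starts up in a comparatively simple way because no dynamic linking is involved. In the common case, though, the C runtime library a program uses is dynamically linked.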

Development

Where possible, start from a similar existing architecture and modify its code.

# git clone git://sourceware.org/git/binutils-gdb.git
$ wget http://ftp.gnu.org/gnu/binutils/binutils-2.21.1.tar.bz2; tar xvf binutils-2.21.1.tar.bz2
# Build gas only
$ mkdir build; cd build
$ ../binutils-2.21.1/configure --enable-as --prefix=$INSTALL
$ make; make install

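Main directories:

bfd: provides an abstraction layer that the upper layers use to manipulate object files. For each platform and object file format (for example, ELF), bfd is responsible for the final binary layout. See Binary File Descriptor library.
opcodes: provides instruction-related information.
gas: main in as.c is the entry function. See GNU Assembler.
- A frag (gas/frags.h) represents one code fragment of the object file that is eventually generated. Every symbol (gas/struc-symbol.h) and every fix (gas/write.h) belongs to some frag. A symbol can be an expression (gas/expr.h). Once gas has processed the symbols and fixes of the frags, it can write them to the object file.
  - fr_address: start address of the frag.
  - fr_fix: size of the fixed part of the frag.
  - fr_var: size of the variable part of the frag; extra instructions may be inserted into the frag later.
  - fr_next: the next code fragment.
  - tc_frag_data: extra target-specific information attached to the frag.
  - fr_literal: where the actual data of the frag is stored.
- A fix (gas/write.h) is a relocation entry. Some kinds of relocation entries can be resolved at assembly time; those that cannot are left to the linker. Resolving part of them at assembly time reduces the work of the linker, and of the dynamic linker in particular, which speeds up dynamic linking. gas/write.c provides fix_new and fix_new_exp, which a backend such as gas/config/tc-xtensa.c calls to create relocation entries.
ld: ldmain.c. See GNU linker.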

Taking xtensa as an example (xtensa has to handle VLIW; see also Add support for Andes NDS32):

include
bfd
- xtensa-isa.c: defines the interface called by bfd/elf32-xtensa.c and gas/config/tc-xtensa.c.
- elf32-xtensa.c: generates 32-bit ELF object files.
  - elf_howto_table: describes the reloc types of the target.
  - elf_xtensa_reloc_type_lookup: maps a bfd reloc type to the target's reloc type.
  - elf_xtensa_reloc_name_lookup: searches elf_howto_table and returns the information for a target reloc type.
  - elf_xtensa_do_reloc: the linker calls elf_xtensa_relocate_section, which eventually calls elf_xtensa_do_reloc to read the relocation entries of the object file and patch the affected instructions.
- cpu-xtensa.c
opcodes
- xtensa-dis.c: disassembly.
gas
- config/xtensa-istack.h
  - TInsn
  - vliw_insn
- config/tc-xtensa.h
  - xtensa_frag_type: target-defined frag data, stored as a field of struct frag and usually reached through fragP->tc_frag_data.
  - fragP->fr_type
    - _relax_state (gas/as.h)
  - fragP->fr_subtype
    - xtensa_relax_statesE (gas/config/tc-xtensa.h)
- config/tc-xtensa.c: the main assembler file; it assembles xtensa assembly instructions into machine code.
  - md_assemble is the entry point; its input is a single assembly instruction.
  - md_convert_frag: called from gas/write.c. Adjusts a frag according to its type or subtype, mainly for alignment (4.2 General relaxing).
    - Take RELAX_ADD_NOP_IF_A0_B_RETW as an example.
      - xg_assemble_vliw_tokens
      - xtensa_fix_a0_b_retw_frags
      - relax_frag_add_nop
      - assemble_nop
        tinsn_to_insnbuf (&tinsn, insnbuf); // tinsn -> insnbuf
        // insnbuf -> chars
        xtensa_insnbuf_to_chars (xtensa_default_isa, insnbuf, (unsigned char *) buf, 0);
  - xg_assemble_vliw_tokens assembles the instructions of the current VLIW bundle and writes them to the frag they belong to.
ld
- scripttempl/elfxtensa.sc: the target's default linker script.
- emultempl/xtensaelf.em
- elf32xtensa.sh: holds the parameters that configure xtensaelf.em.

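Development workflow:

Ideally a script generates part of the code automatically from the instruction set specification.
bfd/gas/opcodes/ld
- Assemble and disassemble single target instructions and check that the encoding is correct.
```
$ as t.s -o t.o
$ objdump -D t.o
```
- When handling VLIW, the assembler has to reorder the instructions inside a bundle to fit the VLIW layout. One approach is to devise a set of rules for placing the instructions one after another so that the result satisfies the VLIW layout as far as possible.
- Handle relocation entries.
```
$ as t.s -o t.o
# Check that the relocation entries are correct.
$ objdump -r t.o
$ objdump -D t.o
# Check that the linker patches the relocation entries correctly.
$ ld t.o
$ objdump -D a.out
```
- Compile helloworld.c at -O0. This needs a libc for the target. Once as and ld have taken shape and the compiler can emit basic assembly, try compiling libc.
- Verify with a simulator.
Use as_assert to catch errors early. Setting a GDB breakpoint on as_assert also helps trace back to where things went wrong.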

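Usage

Big/Little Endian
On macOS, use otool instead.

# -d disassembles only the sections that contain instructions, -D disassembles all sections, -s dumps the contents of all sections in hexadecimal.
$ objdump -d hello
# List the assembly of one particular function only.
$ objdump -d hello | grep -A20 main.:
# Compile with -g to have the source code shown as well.
$ objdump -S hello
# -h shows the section headers, -p shows the program headers
$ objdump -h hello
# -t shows the symbol table, -T shows the dynamic symbol table
$ objdump -t hello
# Show the ELF header in detail
$ readelf -h hello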

$ readelf -h hello | less
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x8309
  Start of program headers:          52 (bytes into file)
  Start of section headers:          4516 (bytes into file)
  Flags:                             0x5000002, has entry point, Version5 EABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         9
  Size of section headers:           40 (bytes)
  Number of section headers:         30
  Section header string table index: 27

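ELF format

ELF specifications:
- Tool Interface Standard (TIS) Executable and Linking Format (ELF) Specification
- System V Application Binary Interface
Libraries for manipulating ELF files:
- libelf by Example.
Most of the tables mentioned here, such as the symbol table and the string table, are themselves sections; the section header table is the exception, since it describes the sections rather than being one.

The central idea of the ELF format is to describe the file through headers. There are three main ones (ELF文件):

ELF header: describes basic information such as the architecture and the operating system, and gives the positions of the section header table and the program header table in the file.
section header table: holds the description (section header) of every section. Used at link time.
program header table: holds the description (program header) of every segment. Used at load time. A segment consists of one or more sections that share the same access permissions once loaded into memory. Some sections matter only to the linker, are not needed at run time and need not be loaded, so they belong to no segment.

An object file needs further processing by the linker, so it always has a section header table. An executable has to be loaded and run, so it always has a program header table. A shared library is loaded and run and is also dynamically linked at load time, so it has both a section header table and a program header table.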

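A few special sections:

.shstrtab: holds the name strings of the other sections.
.symtab: the symbol table.
.strtab: holds the name strings of all symbols.

Section headers

The section header table holds the section headers (section descriptors), each of which describes one section.
- Sections

Elf32_Shdr
- sh_name: section name. In practice an index into .shstrtab, which stores the section name strings.
- sh_type: section type. Says whether the section is a symbol table, a string table, code, data and so on.
- sh_flags: section flags. Say whether the section is writable, occupies memory at run time, or is executable.
- sh_link: usually refers to the associated symbol table. For example, for .rel.text, sh_link is the section header table index of the symbol table that holds the symbols it refers to.
- sh_info: usually used by relocation sections. For example, the section that .rel.text relocates is .text, so sh_info is the section header table index of .text.

Symbol table

Roughly speaking, these are the symbols of the assembly source.
- Symbol Table

Elf32_Sym
- st_name: symbol name. An index into the string table .strtab.
- st_value: symbol value. Either a constant or an address.
- st_info: carries two pieces of information, binding and type. The binding says whether the symbol is local or global; the type says whether it is an object (an ordinary variable), a function, a section and so on.
- st_shndx: an index into the section header table. Identifies the section in which the symbol is defined.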

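String table

Section names and symbol names are both indexes into a string table.
- String Table

Elf32_Shdr
- sh_name: section name. An index into the section name string table .shstrtab.
Elf32_Sym
- st_name: symbol name. An index into the string table .strtab.

Relocation

Relocation
- r_offset: offset, relative to the section, of the location to relocate.
- r_info: holds the symbol table index and the relocation type.
- r_addend: constant used when computing the patched value.
- A relocation section refers to a symbol table section and to the section to be patched. This relationship is stored in the sh_link and sh_info fields, respectively, of the relocation section's header.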

Debug information

Debugger

To support DWARF, add the following macros to tc-nios2.h.

/* We want .cfi_* pseudo-ops for generating unwind info.  */
#define TARGET_USE_CFIPOP 1
#define DWARF2_DEFAULT_RETURN_COLUMN 31
#define DWARF2_CIE_DATA_ALIGNMENT (-4)
#define tc_regname_to_dw2regnum nios2_regname_to_dw2regnum
extern int nios2_regname_to_dw2regnum (char *regname);
#define tc_cfi_frame_initial_instructions  nios2_frame_initial_instructions
extern void nios2_frame_initial_instructions (void);

They are used in dw2gencfi.c.

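Assembler

The assembler reads assembly code and outputs an object file and a listing file. It handles assembly instructions, macros and assembler directives.

Principles

The assembly, directive and macro features can be implemented in that order, working from pass two through pass one back to pass zero.

Assemblers And Loaders
- 1 Basic Principles
  - 1.2 The Two-Pass Assembler
    - Assembly is divided into the following three passes. "Two-pass" refers specifically to passes one and two; pass zero is introduced solely to handle macros.
      1. Pass zero: defines and expands macros. Produces a temporary file for the next pass.
      2. Pass one: maintains the location counter (LC), advancing it by the length of each instruction. Collects labels, assigns them addresses and enters them into the symbol table. Produces a temporary file for the next pass.
        
        This pass has to deal with multiply-defined labels and invalid labels.
        
        The symbol table stores the symbol name, its address and its attributes.
      3. Pass two: outputs the object file and the listing file.
  - 1.3 The One-Pass Assembler
    - To support relocation, the assembler normally records relocation entries in the object file and the linker relocates according to them. A one-pass assembler, however, usually generates the object code and loads it straight into memory for execution, so in general it can only produce an absolute object file.
    - Take the following assembly as an example:
      BEQ AB
      JMP AB
      AB:
      - On reaching BEQ, AB is not found in the symbol table. An entry for AB is inserted, recording that AB is used by BEQ.
      - On reaching JMP, another record is added for AB in the symbol table, recording that AB is used by JMP.
      - On reaching AB, BEQ and JMP are patched using the records made earlier in the symbol table.
  - 1.4 Absolute and Relocatable Object Files
  - 1.6 Forcing Upper
    - Related to VLIW.
  - 1.7 Absolute and Relocatable Address Expressions
    - An address obtained by applying arithmetic to a symbol.
  - 1.8 Local Labels
    - Labels that are only jumped to or referenced nearby can be written as plain numbers. A reference appends a B or F suffix to mean a backward or forward reference.
    - Every local label has an entry in the symbol table that records its location (location counter). On a reference, the symbol table is searched and the local label closest to the current location counter is chosen.
    - The location counter, LC, which points at the current address, can be seen as a special local label. When it is referenced (for example through *), the current value of the location counter is returned directly.
  - 1.9 Multiple Location Counters
    - To make it convenient to write data and procedures anywhere in the assembly source, the USE directive switches between location counters. In effect this splits the code into several sections.
  - 1.10 Literals
    - A constant operand can take a # prefix. If the platform does not support constant operands, a literal pool is used instead: the constant is written to the literal pool and then loaded from it.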
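- 2. The Symbol Table
  - Symbols only need to be inserted and looked up; they never need to be deleted.
- 3. Directives
  - Directives take effect in passes one and two. Directives related to macros and conditional assembly take effect in pass zero. Directives that affect the location counter take effect in pass one.
  - The opcode table can also hold directives, together with the pass in which each directive applies and its callback function.
  - This chapter is a useful reference when implementing directives.
- 4. Macros
  - Macro definition and expansion are handled in pass zero; definition and expansion can each be given a pass of their own. Errors inside a macro definition are not checked.
  - Macro definitions are stored in the MDT (Macro Definition Table).
  - Expanding a macro means writing the corresponding definition from the MDT to the temporary file; any pass-zero directive met along the way is executed.
  - While a macro is expanded, its parameters must be substituted.
  - 4.6 Nested Macros
    - 4.6.1 Nested macro expansion
      - A macro definition invokes another macro. The macro expansion stack records where the previous macro left off; once the current macro has been fully expanded, the stack is consulted to resume expanding the previous one. Nested macro expansion combined with Conditional Assembly is used to implement Recursive Macros.
    - 4.9 Nested Macro Definition
      - A macro definition defines another macro.
  - 4.7 Recursive Macros
    - A macro definition invokes itself.
  - 4.8 Conditional Assembly
    - The SET directive sets the symbol that serves as the recursion termination condition; SET is a pass-zero directive. EQU is a pass-one and pass-two directive. A symbol can be redefined by SET but not by EQU.
    - The AIF (Assembler IF) directive decides, based on a SET symbol, whether to continue expanding the macro.
- 5 The Listing File
  - Similar to the output of objdump. For every line of the assembly file it gives the address, the encoding and other related information. If macros are to be listed, the macro information has to be carried from pass zero to pass two and output there.
```
# line number, address, object code, source code
     line  LC Object Source
```
- 7 Loaders
  - Main tasks: allocating memory for the program, loading, relocation and linking. Because it allocates memory for the program, the loader is tied more closely to the operating system than the assembler is.
  - 7.1 Assemble-Go Loaders
    - Part of a one-pass assembler; writes the result straight to memory. Typically used where memory is limited.
  - 7.2 Absolute Loaders
    - Separate from the one-pass assembler; loads its output into memory and jumps to the start address. The object file produced by the one-pass assembler must additionally contain the load address of the program and the address of the first instruction to execute.
  - 7.3 Linking Loaders
    - Combines the functions of a linker and a loader. Linking can take place at build time or at run time; the latter is called dynamic linking.
    - The assembler must record extern and entry symbols in the object file so that the linker can resolve symbols.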

symbol table:
- symbol
- location counter
- type
  - defined
  - undefined
  - multiply-defined
  - invalid
  - extern: an external symbol used by the current assembly file.
  - entry: a symbol that the current assembly file exposes to other files.
opcode table:
1. mnemonic
2. opcode
3. type and number of operands
4. instruction length
macro definition table

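Problems

Chapter 4, Forward References
```
FOO: .EQU BAR  ; forward reference to BAR
 
  ; snip
 
BAR: .EQU 0    ; BAR is defined here
```
- One-pass assembly can resolve forward references. The UNDEFINED symbol BAR is inserted into the symbol table, and its value records the places where BAR is referenced. A possible drawback is that relocatable object files cannot be produced.
- Two-pass assembly can resolve forward references. The first pass defines all symbols; the second pass then resolves the references.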

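Implementation

lex & yacc come in two sets. One handles particular assembler directives such as macro expansion, file inclusion and conditional assembly. The other handles the assembly code.
yacc can use a few auxiliary non-terminals to switch the state of the parser (for example, macro define or macro invoke).
A stack is needed to manage included files.
While a macro body is processed, its contents have to be pushed onto a stack, because a macro invocation may occur inside it.
- The arguments passed at the macro invocation have to be substituted.
Assembly mostly deals with symbols. Symbols can in turn be identifiers, macros, subroutines, sections and so on.
- By scope, symbols are local, global or extern.
- When designing symbols, consider how they map onto the fields of the final output format (for example, a section maps to STT_SECTION in ELF).
- 5.3 Symbol Names
Assembly instructions.
Generating debug information also has to be considered.

The lex of the preprocessor can treat instructions as plain characters and ignore them; its yacc has grammar rules for macro expansion, file inclusion and conditional assembly.

Terminology

regular label
local label: its scope is limited to the span between two regular labels.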

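Linker

The linker merges the sections of the input object files and writes them to the final executable. Between output sections it inserts padding as the alignment requirements dictate. It likewise inserts padding, as alignment requires, when it merges the input sections.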

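Principles

Linkers and Loaders
- 1. Why do we need linkers and loaders?
  - Symbol resolution and relocation.
- 2. Architectural Issues
  - The linker has to take the addressing modes and instruction formats of the target into account.
- 3. Object files
  - The design of an object file format has to consider the following points:
    - linkable: is the object file an input to the linker?
    - executable: can the object file be loaded and run as the main program?
    - loadable: can the object file be loaded as a library and called by the main program?

Linker scripts

GNU ld 在线手册
- 3.1 Basic Linker Script Concepts
- Basic terms:
  - section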
    - Each object file has, among other things, a list of sections. We sometimes refer to a section in an input file as an input section; similarly, a section in the output file is an output section.
  - section property
    - A section may be marked as loadable, which means that the contents should be loaded into memory when the output file is run. A section with no contents may be allocatable, which means that an area in memory should be set aside, but nothing in particular should be loaded there (in some cases this memory must be zeroed out). A section which is neither loadable nor allocatable typically contains some sort of debugging information.
  - virtual memory address and load memory address (VMA and LMA)
    - Every loadable or allocatable output section has two addresses. The first is the VMA, or virtual memory address. This is the address the section will have when the output file is run. The second is the LMA, or load memory address. This is the address at which the section will be loaded. In most cases the two addresses will be the same. An example of when they might be different is when a data section is loaded into ROM, and then copied into RAM when the program starts up (this technique is often used to initialize global variables in a ROM based system). In this case the ROM address would be the LMA, and the RAM address would be the VMA.
  - symbol
    - Every object file also has a list of symbols, known as the symbol table. A symbol may be defined or undefined. Each symbol has a name, and each defined symbol has an address, among other information. If you compile a C or C++ program into an object file, you will get a defined symbol for every defined function and global or static variable. Every undefined function or global variable which is referenced in the input file will become an undefined symbol.
  - Common sections
    - .text: code.
    - .data
    - .bss: takes no space on disk and occupies memory only once loaded. Reserves space for uninitialized global variables and local static variables (whose default value is zero).
- 3.6 SECTIONS Command
  - Describes how input sections are merged into output sections.
- 3.7 MEMORY Command
  - Describes the memory blocks from which the linker may allocate.
- 3.8 PHDRS Command
  - Describes how segments are loaded into memory. A section can be assigned to a segment to specify how it is loaded.
    - You use the `:phdr' output section attribute to place a section in a particular segment.
- 3.10.5 The Location Counter
  - The address indicator (the special symbol .).

Show the contents of the default linker script of ld.
```
$ ld -verbose
```

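Below is an overview of the default linker script. It draws on the following documents:

AVR32795: Using the GNU Linker Scripts on AVR UC3 Devices
ld script 脚本浅析
解释一个ld.script文件

对.lds连接脚本文件的分析

ENTRY(_start) /* Sets the entry function. */

SECTIONS /* The SECTIONS command tells the linker how to merge the sections of the object files. */
{
  /* PROVIDE defines a symbol that is referenced by an object file but defined by none of them. */
  /* SEGMENT_START is a builtin function that returns the base address of the named segment. If the command line does not set a base address for that segment, the default value is returned. */
  /* The special symbol . stands for the current value of the location counter. */
  PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000));
  . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
  
  /* Merge the .interp sections of all input object files into a single .interp in the output file. */
  .interp         : { *(.interp) }

  /* .init holds the initialization that must run before main is entered; its counterpart is .fini. */
  /* With --gc-sections on the command line, ld drops the input sections it considers unused. */
  /* KEEP means the section is retained. */
  /* [=FILLEXP] fills the gaps in the section with FILLEXP. */
  .init           :
  {
    KEEP (*(.init))
  } =0x90909090

  /* In C++, global constructors must be called before main, and global destructors after main returns. */
  /* crtbegin.o and crtend.o, together with .ctors and .dtors, make this possible. */
  .ctors          :
  {
    KEEP (*crtbegin.o(.ctors))
    KEEP (*crtbegin?.o(.ctors))
    /* Leave out the .ctors of crtend while the other .ctors are sorted. */
    /* The .ctors of crtend holds the end-of-ctors marker and must come last. */
    KEEP (*(EXCLUDE_FILE (*crtend.o *crtend?.o ) .ctors))
    KEEP (*(SORT(.ctors.*)))
    KEEP (*(.ctors))
  }
  /* .bss reserves space for uninitialized global variables and local static variables. */
  .bss            :
  {
   *(.dynbss)
   *(.bss .bss.* .gnu.linkonce.b.*)
   *(COMMON)
   /* ALIGN(align): align to 8 bytes, unless the location counter is still 0. */
   . = ALIGN(. != 0 ? 64 / 8 : 1);
  }
  /* ALIGN(align) */
  . = ALIGN(64 / 8);
  _end = .; PROVIDE (end = .);
}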

OUTPUT_FORMAT (3.4.3 Commands Dealing with Object File Formats)
OUTPUT_ARCH (3.4.5 Other Linker Script Commands)
ENTRY (3.4.1 Setting the Entry Point)
PROVIDE (3.5.3 PROVIDE)
- In some cases, it is desirable for a linker script to define a symbol only if it is referenced and is not defined by any object included in the link.
KEEP (3.6.4.4 Input Section and Garbage Collection)
- link-time garbage collection (2.1 Command Line Options)
- Link time dead code and data elimination using GNU toolchain
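- By placing every function in its own .text.* section (-ffunction-sections), the linker can discard at link time (link-time garbage collection) the sections of functions that are never referenced.
- Some sections, however, serve a special purpose and must be linked into the final executable even when nothing references them. The KEEP command tells the linker not to discard such sections.

VMA and LMA

A section has both a virtual memory address (VMA) and a load memory address (LMA). The VMA is the address of the section while it executes; the LMA is the address at which the section is loaded. In most cases the two have the same value. On an embedded system, however, a section may be loaded into ROM (flash) and copied to RAM at run time before it is used.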

Address spaces, Object file formats

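A linker script example for an embedded system.

/* MEMORY describes the start address and length of the memory blocks of the target; access permissions can be given as well. */
/* By default ld may use any memory address. MEMORY tells ld which blocks to use. */
MEMORY
{
   /* Define two memory blocks, rom and ram, with their start addresses and lengths. */
   rom : ORIGIN = 0x0,    LENGTH = 32K
   ram : ORIGIN = 0x8000, LENGTH = 32K
}

SECTIONS
{
   .loader :
   {
      *(.loader)
   } > rom /* Place the .loader section in the rom block. */

   .text ram_start :
   {
      text_start = . ;
      *(.text)
      text_end = . ;
   } AT > rom /* The load address (LMA) of this section is allocated from rom. */
}

'>' (3.6.8.6 Output Section Region): assigns the output section to a memory block defined by the MEMORY command.
'AT>' (3.6.8.2 Output Section LMA): sets the address of the output section at load time (LMA). The run-time address (VMA) may be different.
':' (3.6.8.7 Output Section Phdr): assigns the output section to a segment defined by the PHDRS command.
'=fillexp' (3.6.8.8 Output Section Fill): sets the value used to fill the gaps that alignment requirements leave between the input sections of the output section.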
(3.7 MEMORY Command)

Alignment and fill

Filling particular regions of an ELF file with a default value.

FILLing unused Memory with the GNU Linker

FILLING UNUSED MEMORY AREA USING GNU LINKER SCRIPT

'=fillexp' (3.6.8.8 Output Section Fill) applies to the whole section: every region of the section that has no contents of its own is filled with 'fillexp'.
- Any otherwise unspecified regions of memory within the output section (for example, gaps left due to the required alignment of input sections) will be filled with the value, repeated as necessary.
'FILL(expression)' (3.6.5 Output Section Data) applies inside a section: every region after the 'FILL(expression)' command that has no contents of its own is filled with 'expression'. 'FILL(expression)' takes precedence over '=fillexp'.
- The FILL command is similar to the `=fillexp' output section attribute, but it only affects the part of the section following the FILL command, rather than the entire section. If both are used, the FILL command takes precedence.

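To fill the remaining space of a memory (RAM or ROM), define a dedicated fill section and place it right after the last meaningful section in that memory.

  /* .fini_array is the last section in the m_text memory block. */
  .fini_array :
  {
    PROVIDE_HIDDEN (__fini_array_start = .);
    KEEP (*(SORT(.fini_array.*)))
    KEEP (*(.fini_array*))
    PROVIDE_HIDDEN (__fini_array_end = .);

    /* The symbol ___ROM_AT used to be defined here for external use. Because the fill section .fill now exists, the definition moves to .fill. */
    /*___ROM_AT = .; */
  } > m_text
  
  /* .fill is an artificial fill section. */
  .fill :
  {
    /* Fill with 0xDEADBEEF from FILL onwards. */
    FILL(0xDEADBEEF);
    /* Move the location counter and write the last byte of .fill, so that .fill has contents. */
    . = ORIGIN(m_text) + LENGTH(m_text) - 1;
    BYTE(0xAA)
    /* Define ___ROM_AT, which was originally defined in .fini_array. */
    ___ROM_AT = .;
  } > m_text

To fill with a particular value starting at a particular address:

  .fill <start address>:
  {
    FILL(<fill pattern>);
    . += <end address> - <start address> - 1;
    BYTE(<fill pattern>)
  } > ram

  /* Fill with 0xCC from address 0x3b0 up to and including 0x3ff. */
  .fill 0x3b0:
  {
    FILL(0xCC);
    . += 0x400 - 0x3b0 - 1;
    BYTE(0xCC)
  } > ram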

Position-independent code

Position Independent Code (PIC).

Ian Lance Taylor
- Shared Libraries
- More Shared Libraries -- specifically, linker implementation; ELF Symbols

Eli Bendersky

Sorry state of dynamic libraries on Linux
- How to Write Shared Libraries
- Position-independent code

PLT and GOT - the key to code sharing and dynamic libraries
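- Procedure Linkage Table (PLT): used for lazy binding. The address of a target function (procedure) is fixed up, that is, its GOT entry is filled in, when the function is first called rather than when the module is loaded.
- Global Offset Table (GOT): used to access global variables of other modules indirectly, or to call functions of other modules.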
prelink
Copy Relocations
Gcc和Open64中的-fPIC选项
PLT实例讲解

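Linking and loading

Introduction to Libraries and Linking

# List the options that LD_DEBUG supports
$ LD_DEBUG=help ./a.out
# Show what the runtime linker does
$ LD_DEBUG=all ./a.out

External links

Miscellaneous

# Move the image above the 4G virtual address. -fPIE is a compiler option, -pie is a linker option.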
$ gcc -Wl,-Ttext-segment=0x100000000 -fPIE -pie hello.c -o hello
$ ./hello
$ cat /proc/`pidof hello`/maps

3.13 Options for Linking
3.17.15 Intel 386 and AMD x86-64 Options
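- Note the -mcmodel option. By default the libraries that get linked in are compiled with -mcmodel=small and therefore have to reside in the lowest 2G of the virtual address space.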
3.18 Options for Code Generation Conventions

Is it possible to make the whole image above 4G on a 64 bit machine?
- How to move the whole image above 4G virtual space?

Before PIE mode was added, a program's executable could not be placed at a random address in memory; only PIC dynamic libraries could be relocated to random addresses. http://stackoverflow.com/questions/2463150/fpie-position-independent-executable-option-gcc-ld

Position Independent Executables
- GCC中的pie和fpie选项

Map file

       -M
       --print-map
           Print a link map to the standard output.  A link map provides information about the link, including the following:

           ·  Where object files are mapped into memory.

           ·  How common symbols are allocated.

           ·  All archive members included in the link, with a mention of the symbol which caused the archive member to be brought in.

           ·  The values assigned to symbols.

               Note - symbols whose values are computed by an expression which involves a reference to a previous value of the same
               symbol may not have correct result displayed in the link map.  This is because the linker discards intermediate results
               and only retains the final value of an expression.  Under such circumstances the linker will display the final value
               enclosed by square brackets.  Thus for example a linker script containing:

                          foo = 1
                          foo = foo * 4
                          foo = foo + 8

               will produce the following output in the link map if the -M option is used:

                          0x00000001                foo = 0x1
                          [0x0000000c]                foo = (foo * 0x4)
                          [0x0000000c]                foo = (foo + 0x8)

               See Expressions for more information about expressions in linker scripts.

; section                vma       size                     lma
  .ovly0          0x00100100      0x1f4 load address 0x00108000

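Converting to a raw binary

Make Executable Binary File From Elf Using GNU objcopy

Strip the ELF headers and keep only the binary contents of particular sections.

# Project.out is the input, Project.bin is the output.
$ arm-none-eabi-objcopy.exe -O binary Project.out Project.bin

Overlays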

3.6.9 Overlay Description
- An overlay description provides an easy way to describe sections which are to be loaded as part of a single memory image but are to be run at the same memory address. At run time, some sort of overlay manager will copy the overlaid sections in and out of the runtime memory address as required, perhaps by simply manipulating addressing bits. This approach can be useful, for example, when a certain region of memory is faster than another.
  - sections which are to be loaded as part of a single memory image
    - LMA
  - but are to be run at the same memory address.
    - VMA
- The sections are all defined with the same starting address. The load addresses of the sections are arranged such that they are consecutive in memory starting at the load address used for the OVERLAY as a whole.
  - The linker lays out the load addresses (LMA) of the sections and passes this information to the program through specific symbols.
- If the NOCROSSREFS keyword is used, and there are any references among the sections, the linker will report an error. Since the sections all run at the same address, it normally does not make sense for one section to refer directly to another.
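  - Because these sections are loaded to the same memory address (VMA) at run time, they should not refer to one another's symbols. NOCROSSREFS makes the linker report an error when they do.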
- For each section within the OVERLAY, the linker automatically provides two symbols. The symbol load_start_secname is defined as the starting load address of the section. The symbol load_stop_secname is defined as the final load address of the section.
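  - For each of these sections the linker provides the two symbols load_start_secname and load_stop_secname, which mark the start and the end of the section's LMA.
- OVERLAY is syntactic sugar: everything it does can also be written with the more basic commands, that is, ordinary output sections with explicit VMA and AT (LMA) addresses.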
- OVERLAY命令
- Overlays Not Yet Extinct

14.4 Overlay Sample Program

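Testing shows that the sample program works (gdb/testsuite/gdb.base ¹⁾), but the linker script has to be edited by hand to fix the addresses at which the sections are placed.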

   /* Overlay sections: */
   /* section       vma :          lma                    */
      .ovly0  0x1001000 : AT (0x108000) { foo.o(.text)  }
      .ovly1  0x1001000 : AT (0x109000) { bar.o(.text)  }
      .ovly2  0x1002000 : AT (0x10a000) { baz.o(.text)  }
      .ovly3  0x1002000 : AT (0x10b000) { grbx.o(.text) }
      .data00 0x2001000 : AT (0x10c000) { foo.o(.data)  }
      .data01 0x2001000 : AT (0x10d000) { bar.o(.data)  }
      .data02 0x2002000 : AT (0x10e000) { baz.o(.data)  }
      .data03 0x2002000 : AT (0x10f000) { grbx.o(.data) }
  .text           :
  {
    /* snip  */
  }
  
  /* snip */
  
  .data           :
  {
    _ovly_table = .;
        LONG(ABSOLUTE(ADDR(.ovly0)));  /* VMA    */
        LONG(SIZEOF(.ovly0));          /* SIZE   */
        LONG(LOADADDR(.ovly0));        /* LMA    */
        LONG(0);                       /* MAPPED */
        /* snip */
        LONG(ABSOLUTE(ADDR(.data01)));
        LONG(SIZEOF(.data01));
        LONG(LOADADDR(.data01));
        LONG(0);
        /* snip */
    _novlys = .;
        LONG((_novlys - _ovly_table) / 16);
    /* snip */
  }

The linker script builds a table (_ovly_table) for the .text and .data sections of the functions, holding the LMA and VMA of each section.
- 3.10.9 Builtin Functions

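ROM (LMA) stores several pieces of code, and at run time one of them is loaded into RAM (VMA) and executed. This suits systems where RAM is scarce. In the example above, the VMA addresses show that the functions foo and bar overlap, and that baz and grbx overlap. The programmer has to write the overlay manager and move the code from ROM to RAM by hand.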

int main ()
{
  int a, b, c, d, e;
 
  OverlayLoad (0); // load the .text section of foo
  OverlayLoad (4); // load the .data section of foo
  a = foo (1);     // a = 'f' + 'o' + 'o'
  OverlayLoad (1); // load the .text section of bar
  OverlayLoad (5); // load the .data section of bar
  b = bar (1);     // b = 'b' + 'a' + 'r'
  OverlayLoad (2); // load the .text section of baz
  OverlayLoad (6); // load the .data section of baz
  c = baz (1);     // c = 'b' + 'a' + 'z'
  OverlayLoad (3); // load the .text section of grbx
  OverlayLoad (7); // load the .data section of grbx
  d = grbx (1);    // d = 'g' + 'r' + 'b' + 'x'
  e = a + b + c + d;
  return (e != ('f' + 'o' +'o'
		+ 'b' + 'a' + 'r'
		+ 'b' + 'a' + 'z'
		+ 'g' + 'r' + 'b' + 'x'));
 
}

OverlayLoad reads the LMA and VMA of the requested overlay from _ovly_table and copies it from ROM to RAM.

bool
OverlayLoad (unsigned long ovlyno)
{
  unsigned long i;
 
  if (ovlyno < 0 || ovlyno >= _novlys)
    exit (-1);	/* fail, bad ovly number */
 
  if (_ovly_table[ovlyno][MAPPED])
    return TRUE;	/* this overlay already mapped -- nothing to do! */
 
  for (i = 0; i < _novlys; i++)
    if (i == ovlyno)
      _ovly_table[i][MAPPED] = 1;	/* this one now mapped */
    else if (_ovly_table[i][VMA] == _ovly_table[ovlyno][VMA])
      _ovly_table[i][MAPPED] = 0;	/* this one now un-mapped */
 
  // A wrapper around memcpy. The addresses may need converting before memcpy is called.
  ovly_copy (_ovly_table[ovlyno][VMA],   /* dst */
	     _ovly_table[ovlyno][LMA],   /* src */
	     _ovly_table[ovlyno][SIZE]);
 
  FlushCache ();         // If the target has a cache, its contents have to be flushed.
  _ovly_debug_event ();  // A place for GDB to set a breakpoint, so that GDB is notified when _ovly_table changes.
  return TRUE;
}

overlay sample code

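Calculating memory usage

text, data and bss: Code and Data Size Explained
Application Flash / RAM size
- This needs a linker script that clearly separates ROM from RAM and assigns every section to one or the other. ROM holds .text and .data; RAM holds .data and .bss. The calculation covers static size only; it does not include the stack and heap used while the program runs.
  - .text: ROM. Code and constants.
  - .data: ROM and RAM. Initialized variables. The initial values live in ROM and the variables in RAM; at system startup the initial values are copied from ROM to the variables in RAM. The section therefore takes up space in both ROM and RAM.
  - .bss: RAM. Uninitialized variables.
- size
```
# Berkeley format is the default. System V format lists more detail, which is not needed here.
$ size --format=Berkeley ranlib size
text    data    bss     dec     hex     filename
294880  81920   11592   388392  5ed28   ranlib
294880  81920   11888   388688  5ee50   size
```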

Link order

Linker order - GCC
- Library order in static linking
- 關於 ld 的連結順序
  - If library A depends on library B, A should come before B on the link command line.
  - The order only matters for static libraries.

2.1 Command Line Options
- --start-group archives --end-group
  - The specified archives are searched repeatedly until no new undefined references are created. Normally, an archive is searched only once in the order that it is specified on the command line. If a symbol in that archive is needed to resolve an undefined symbol referred to by an object in an archive that appears later on the command line, the linker would not be able to resolve that reference. By grouping the archives, they will all be searched repeatedly until all possible references are resolved.
- --whole-archive
  - For each archive mentioned on the command line after the --whole-archive option, include every object file in the archive in the link, rather than searching the archive for the required object files.

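Embedding binary files

ld 小把戲

Use 'ld' to embed a binary file (music, images and so on) in another binary.

Embedding binary into elf with objcopy may cause alignment issues?

$ cat hello.txt
hello world!
$ cat main.c
#include <stdio.h>
 
/* Symbols that ld generates for the embedded file. */
extern char _binary_hello_txt_start[];
extern char _binary_hello_txt_end[];
 
int main() {
  /* The embedded data is not NUL-terminated, so print it by length. */
  fwrite(_binary_hello_txt_start, 1,
         _binary_hello_txt_end - _binary_hello_txt_start, stdout);
 
  return 0;
}
$ ld -r -b binary -o hello.o hello.txt
$ gcc -o main.exe main.c hello.o
$ ./main.exe
hello world!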

binary.cc

The joy of INCBIN
- 7.63 .incbin "file"[,skip[,count]]
- C/C++ with GCC: Statically add resource files to executable/library

Weak symbols and weak references

Weak symbol and weak reference
- Weak symbol
- The "weak" attribute of gcc
- Understand Weak Symbols by Examples
- gcc-weakref-example
- 6.31.1 Common Function Attributes
  - The weak attribute causes the declaration to be emitted as a weak symbol rather than a global. This is primarily useful in defining library functions that can be overridden in user code, though it can also be used with non-function declarations. Weak symbols are supported for ELF targets, and also for a.out targets when using the GNU assembler and linker.
    - Weak symbols are mainly used for functions that a library provides. The user can supply a definition of their own, which is then used in place of the library's version.
      // def.c
      #include <stdio.h>
      __attribute__((weak)) void f() { printf("I am in def.c\n"); }

      // main.c
      #include <stdio.h>
      // override function f in def.c
      void f() { printf("I am in main.c\n"); }
      int main() {
        if (f) {
          f();
        } else {
          printf("f() is not found\n");
        }
        return 0;
      }
  - The weakref attribute marks a declaration as a weak reference.
    - A weak reference can only be attached to a static declaration. It makes that declaration an alias for the target symbol. In the sample below the target is defined in the same translation unit (that is, the same file).
      #include <stdio.h>

      // 1. static int x() __attribute__ ((weakref ("y")));
      // 2. static int x() __attribute__ ((weak, weakref, alias ("y")));
      // A weakref without an argument must be accompanied by an alias that names the target function.
      static int x() __attribute__ ((weakref));
      static int x() __attribute__ ((alias ("y")));

      static int y() { printf("I am in y()\n"); }

      int main() {
        x(); // I am in y()
      }
      - Without arguments, it should be accompanied by an alias attribute naming the target symbol.
        static int x() __attribute__ ((weakref)); static int x() __attribute__ ((alias ("y")))
      - Optionally, the target may be given as an argument to weakref itself.
        static int x() __attribute__ ((weakref ("y")));
      - In either case, weakref implicitly marks the declaration as weak. Without a target, given as an argument to weakref or to alias, weakref is equivalent to weak.
        static int x() __attribute__ ((weak, weakref, alias ("y")));
ELF中的.bss section和COMMON section
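- COMMON mainly exists so that the linker can merge definitions. Uninitialized global variables are therefore placed in the COMMON section for the time being; once the linker has merged them, they are moved to the proper section as the situation requires, or they may remain in COMMON.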
- Why uninitialized global variable is weak symbol?
- .bss vs COMMON: what goes where?
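- COMMON blocks exist only in object files, where they temporarily hold weak symbols (several weak definitions of the same symbol are allowed, so the final size is not known when the object file is generated). After reading all object files, the linker can determine the size of each such symbol and allocates space for it in the BSS of the executable.
- An uninitialized global variable placed in COMMON is treated as one kind of weak symbol.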

Language-related sections

What is a 'linkonce' section
- 7.3 Vague Linkage
- .gnu.linkonce.? and .gcc_except_table
- Linkonce vs comdat
- Used to eliminate the redundant sections that implementing certain language features produces. GCC uses linkonce sections; Windows uses comdat sections.
.note.GNU-stack
- Executable stack
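- Records whether the object needs an executable stack. One feature that can need it is 6.4 Nested Functions (Nested function), because the trampolines GCC generates for nested functions are placed on the stack.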

Linker-related environment variables

ld.so(8)
- LD_LIBRARY_PATH
  - Why LD_LIBRARY_PATH is bad
- LD_PRELOAD
- LD_DEBUG

Error messages

How to list linker allocated code objects w/ gcc?
- ld: section .data_bank1 loaded at [0000000000002000,0000000000003fff] overlaps section .text loaded at [0000000000000630,00000000000020df]
On Windows, paths written in a linker script can cause problems
- wildcard doesn't work correctly in ld script
- wildcard doesn't work correctly in GNU ld script
- obj\test\*.o(.s_foo_v0)
  - Fails under Cygwin, fails on Windows.
- obj/test/*.o(.s_foo_v0)
  - Works under Cygwin, fails on Windows.
- obj\\test\\*.o(.s_foo_v0)
  - Works on Windows.
How to go from linker error to line of code in the sources?
- Seems to work for the .text section only.

addr2line not returning anything useful (maybe just for me?)

How to use addr2line command in linux

// gcc -g hello.c
// addr2line -e a.out 0x400536 #offset in the `main` function
#include <stdio.h>
int main() {
    printf("hello\n");
    return 0;
}

External links

¹⁾ http://www.gnu.org/software/gdb/download/
