ARM aarch64 stack operation example
Background:
AArch64 Register | Special | Role in the procedure call standard |
---|---|---|
x0..x7 | | Parameter/result registers |
x8 | | Indirect result location register |
x9..x15 | | Temporary registers |
x16 | IP0 | The first intra-procedure-call scratch register (can be used by call veneers and PLT code); at other times may be used as a temporary register. |
x17 | IP1 | The second intra-procedure-call temporary register (can be used by call veneers and PLT code); at other times may be used as a temporary register. |
x18 | | The Platform Register, if needed; otherwise a temporary register. |
x19..x28 | | Callee-saved registers |
x29 | FP | The Frame Pointer. (Where the last local data is) |
x30 | LR | The Link Register |
SP | | The Stack Pointer. (Where local data is) |
A few commonly used…
MOV X1,X0 // copy the value of register X0 into register X1
ADD X0,X1,X2 // X0 = X1 + X2
SUB X0,X1,X2 // X0 = X1 - X2
AND X0,X0,#0xF // X0 = X0 AND 0xF
ORR X0,X0,#9 // X0 = X0 OR 9
EOR X0,X0,#0xF // X0 = X0 XOR 0xF
LDR X5,[X6,#0x08] // ld: load; load the value at memory address X6 + 0x08 into X5
LDP x29, x30, [sp, #0x10] // ldp: load pair; load the value at memory address SP + 0x10 into x29 and the next value (at SP + 0x18, since each X register is 8 bytes) into x30
STR X0, [SP, #0x8] // st: store; str writes a value to memory: store the value of X0 at memory address SP + 0x8
STUR w0, [x29, #-0x8] // store to memory with an unscaled offset (negative here): store w0 at memory address x29 - 0x8
STP x29, x30, [sp, #0x10] // store pair: store the values of x29 and x30 at SP + 0x10 and SP + 0x10 + 0x8 respectively
CBZ Wn, label // Compare and Branch on Zero: jump to label if Wn is 0
CBNZ Wn, label // Compare and Branch on Non-Zero: jump to label if Wn is not 0
B label // B: Branch, an unconditional jump to label
BL label // Branch with Link: jump to label and save the address of the next instruction (the return address) in LR (X30)
BLR Xn // Branch with Link to Register: jump to the address held in Xn and save the return address in LR (X30)
RET // return from the callee: jump to the return address held in LR (X30)
MOV only writes registers (from another register or an immediate) and never accesses memory. The LD and ST families (LDR/LDP/STR/STUR/STP) transfer values between registers and memory.
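The base-plus-offset addressing used by LDR/STR/LDP/STP above can be sketched in C. This is only a toy model under stated assumptions: the `mem` array, `MEM_SIZE`, and the helper names `str64`, `ldr64`, `stp64`, `ldp64` are hypothetical and stand in for real memory and real instructions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy memory; addresses are simply indexes into this array. */
enum { MEM_SIZE = 256 };
static uint8_t mem[MEM_SIZE];

/* STR Xt, [Xn, #off]: write one 64-bit register to address Xn + off. */
static void str64(uint64_t xt, uint64_t xn, int64_t off) {
    uint64_t addr = xn + (uint64_t)off;
    assert(addr + 8 <= MEM_SIZE);
    memcpy(&mem[addr], &xt, 8);
}

/* LDR Xt, [Xn, #off]: read one 64-bit value from address Xn + off. */
static uint64_t ldr64(uint64_t xn, int64_t off) {
    uint64_t addr = xn + (uint64_t)off;
    uint64_t value;
    assert(addr + 8 <= MEM_SIZE);
    memcpy(&value, &mem[addr], 8);
    return value;
}

/* STP Xt1, Xt2, [Xn, #off]: the second register lands 8 bytes after the first. */
static void stp64(uint64_t xt1, uint64_t xt2, uint64_t xn, int64_t off) {
    str64(xt1, xn, off);
    str64(xt2, xn, off + 8);
}

/* LDP Xt1, Xt2, [Xn, #off]: mirror image of stp64. */
static void ldp64(uint64_t *xt1, uint64_t *xt2, uint64_t xn, int64_t off) {
    *xt1 = ldr64(xn, off);
    *xt2 = ldr64(xn, off + 8);
}
```

For example, `stp64(fp, lr, sp, 0x10)` followed by `ldp64(&fp, &lr, sp, 0x10)` reads back the same two values from sp + 0x10 and sp + 0x18.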
Main text:
Compile the following code with an ARM cross compiler:
//main.c
int var = 10;
int func(int a, int b)
{
    int c = 0;
    c = a + b;
    return c;
}
int main()
{
    int i = 2;
    int j = 3;
    var = func(i, j);
    return 0;
}
Compiler from: https://releases.linaro.org/components/toolchain/binaries/latest-7/aarch64-linux-gnu/
compiler command:
gcc-linaro-7.5.0-2019.12-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc main.c -o main
objdump:
gcc-linaro-7.5.0-2019.12-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-objdump -D main
Then locate the assembly for main and func:
000000000040051c <func>:
40051c: d10083ff sub sp, sp, #0x20
400520: b9000fe0 str w0, [sp, #12]
400524: b9000be1 str w1, [sp, #8]
400528: b9001fff str wzr, [sp, #28]
40052c: b9400fe1 ldr w1, [sp, #12]
400530: b9400be0 ldr w0, [sp, #8]
400534: 0b000020 add w0, w1, w0
400538: b9001fe0 str w0, [sp, #28]
40053c: b9401fe0 ldr w0, [sp, #28]
400540: 910083ff add sp, sp, #0x20
400544: d65f03c0 ret
0000000000400548 <main>:
400548: a9be7bfd stp x29, x30, [sp, #-32]!
// Store pair (from reg to mem)
40054c: 910003fd mov x29, sp
400550: 52800040 mov w0, #0x2 // #2
400554: b9001fa0 str w0, [x29, #28]
400558: 52800060 mov w0, #0x3 // #3
40055c: b9001ba0 str w0, [x29, #24]
400560: b9401ba1 ldr w1, [x29, #24]
400564: b9401fa0 ldr w0, [x29, #28]
400568: 97ffffed bl 40051c <func>
40056c: 2a0003e1 mov w1, w0
400570: b0000080 adrp x0, 411000 <__libc_start_main@GLIBC_2.17>
400574: 9100a000 add x0, x0, #0x28
400578: b9000001 str w1, [x0]
40057c: 52800000 mov w0, #0x0 // #0
400580: a8c27bfd ldp x29, x30, [sp], #32 // load pair register (from mem to reg)
400584: d65f03c0 ret
... (omitted)
0000000000411028 <var>:
411028: 0000000a .word 0x0000000a
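On arm64, the saved LR (x30) and FP (x29) sit at the top of the stack frame, which in this listing is the lowest address of the frame, where SP points
1. stp x29, x30, [sp, #-32]!
// store X29 and X30 at address SP-32, then update SP
stp x29, x30, [sp, #-32]!
Stores the values of registers x29 and x30 at memory addresses SP-32 and SP-32+8 respectively
and sets the SP register to SP-32 (pre-index with writeback)
So this step fixes where main's FP and LR are saved and updates the CPU's SP register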
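The pre-index behaviour of this prologue instruction can be modelled in a few lines of C. This is a sketch under assumptions: `struct cpu`, `stp_pre_index`, and the byte array `stack` are hypothetical names, and SP is treated as an index into that array rather than a real address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the registers touched by the prologue. */
struct cpu {
    uint64_t sp, x29, x30;
};

/* Models: stp x29, x30, [sp, #imm]!   (pre-index with writeback)
 * `stack` is a toy byte array standing in for memory. */
static void stp_pre_index(struct cpu *c, uint8_t *stack, int64_t imm) {
    c->sp += (uint64_t)imm;                 /* SP = SP + imm (imm is -32 here) */
    assert(c->sp % 16 == 0);                /* SP stays 16-byte aligned */
    memcpy(&stack[c->sp], &c->x29, 8);      /* [SP]     = x29 (FP) */
    memcpy(&stack[c->sp + 8], &c->x30, 8);  /* [SP + 8] = x30 (LR) */
}
```

Starting from `sp = 64`, a call with `imm = -32` leaves `sp = 32`, with x29 stored at offset 32 and x30 at offset 40.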
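Why "-32"?
FP: 8 bytes
LR: 8 bytes
2 int variables: 4 bytes x 2 = 8 bytes
SP must stay 16-byte aligned, so it always moves in multiples of 16 bytes
8 + 8 + 8 = 24, which has to be rounded up to a multiple of 16, hence 32
Suppose main needs 5 int variables
Then the total is 8 (LR) + 8 (FP) + 20 (4 x 5 ints) = 36
36 is not a multiple of 16, so 48 is needed
Example assembly: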
0000000000400548 <main>:
400548: a9bd7bfd stp x29, x30, [sp, #-48]!
40054c: 910003fd mov x29, sp
400550: 52800040 mov w0, #0x2 // #2
400554: b9002fa0 str w0, [x29, #44]
400558: 52800060 mov w0, #0x3 // #3
40055c: b9002ba0 str w0, [x29, #40]
400560: 52800080 mov w0, #0x4 // #4
400564: b90027a0 str w0, [x29, #36]
400568: 528000a0 mov w0, #0x5 // #5
40056c: b90023a0 str w0, [x29, #32]
--
400588: b0000080 adrp x0, 411000 <__libc_start_main@GLIBC_2.17>
40058c: 9100a000 add x0, x0, #0x28
400590: b9000001 str w1, [x0]
400594: 52800000 mov w0, #0x0 // #0
400598: a8c37bfd ldp x29, x30, [sp], #48
40059c: d65f03c0 ret
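The alignment arithmetic for the two frame sizes above (32 and 48) can be reproduced with a small helper. This is a minimal sketch: `round_up` and `frame_size` are hypothetical names, and the formula only mirrors the reasoning in this section. A real compiler may reserve additional space, so it should not be read as GCC's allocation rule.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical helper: bytes needed for saved FP + LR plus the locals,
 * rounded up so that SP stays 16-byte aligned. */
static size_t frame_size(size_t locals_bytes) {
    const size_t fp_and_lr = 8 + 8;
    return round_up(fp_and_lr + locals_bytes, 16);
}
```

With two ints (8 bytes of locals) this gives 32, and with five ints (20 bytes) it gives 48, matching the `#-32` and `#-48` immediates in the two listings.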
2. mov x29, sp
40054c: 910003fd mov x29, sp
This step copies the new SP (the original SP-32) into CPU register X29 (the Frame Pointer)
3. update variable
400550: 52800040 mov w0, #0x2 // #2
400554: b9001fa0 str w0, [x29, #28]
400558: 52800060 mov w0, #0x3 // #3
40055c: b9001ba0 str w0, [x29, #24]
400560: b9401ba1 ldr w1, [x29, #24]
400564: b9401fa0 ldr w0, [x29, #28]
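This stores the local variables: first put 2 in register w0, then store w0 at memory address FP+28 (i)
Likewise, put 3 in register w0, then store w0 at memory address FP+24 (j)
Finally, because func (the callee) is about to be called, its arguments are loaded into CPU registers w0 (i) and w1 (j)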
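One way to picture the offsets used so far is to write main's 32-byte frame as a C struct, lowest address first. The struct and its field names are purely illustrative (hypothetical): the compiler does not define such a type, and the `unused` field only represents the bytes that the listing never touches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C view of main's frame as seen from x29 (FP). */
struct main_frame {
    uint64_t saved_fp;   /* [x29, #0]  written by stp */
    uint64_t saved_lr;   /* [x29, #8]  written by stp */
    uint8_t  unused[8];  /* [x29, #16] not accessed in the listing */
    int32_t  j;          /* [x29, #24] = 3 */
    int32_t  i;          /* [x29, #28] = 2 */
};

/* Compile-time checks that the model matches the offsets in the listing. */
static_assert(offsetof(struct main_frame, j) == 24, "j is at FP+24");
static_assert(offsetof(struct main_frame, i) == 28, "i is at FP+28");
static_assert(sizeof(struct main_frame) == 32, "frame is 32 bytes");
```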
4. branch to callee
400568: 97ffffed bl 40051c <func>
40056c: 2a0003e1 mov w1, w0
bl (branch with link) stores the address of the next instruction (0x40056c, whose encoding is 2a0003e1) in X30 (LR) and then jumps to func
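To see how the encoding 97ffffed turns into a jump to 40051c, the BL immediate can be decoded by hand. The sketch below assumes the standard A64 BL layout (opcode 100101 in the top six bits, a signed 26-bit word offset below it); the function names `bl_target` and `bl_link` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the branch target of an A64 BL instruction located at address pc. */
static uint64_t bl_target(uint32_t insn, uint64_t pc) {
    assert((insn >> 26) == 0x25);       /* top six bits 100101 = BL */
    int64_t imm26 = insn & 0x03ffffff;  /* 26-bit offset, counted in instructions */
    if (imm26 & 0x02000000)             /* sign bit of the 26-bit field */
        imm26 -= 0x04000000;
    return pc + (uint64_t)(imm26 * 4);  /* each instruction is 4 bytes */
}

/* Return address that BL writes into LR (x30): the next instruction. */
static uint64_t bl_link(uint64_t pc) {
    return pc + 4;
}
```

For the instruction at 0x400568, the offset field is -19 instructions (-76 bytes), so the target is 0x40051c and LR becomes 0x40056c.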
5. func (callee start)
40051c: d10083ff sub sp, sp, #0x20
6. func execute
400520: b9000fe0 str w0, [sp, #12]
400524: b9000be1 str w1, [sp, #8]
400528: b9001fff str wzr, [sp, #28]
40052c: b9400fe1 ldr w1, [sp, #12]
400530: b9400be0 ldr w0, [sp, #8]
400534: 0b000020 add w0, w1, w0
400538: b9001fe0 str w0, [sp, #28]
40053c: b9401fe0 ldr w0, [sp, #28]
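Store w0 (int 2) at memory address sp+12 and w1 (int 3) at memory address sp+8, then store 0 at memory address sp+28 (str wzr, [sp, #28])
Reload a and b from memory into w1 and w0, add them, store the sum in w0 and at memory address sp+28, then load the value at sp+28 back into w0
Why?
int c = 0;
c = a+b;
Because c is initialized to 0 before it is assigned, and the code was compiled without an optimization flag, the compiler emits an extra store of wzr (the zero register) for that initialization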
7. func return
400540: 910083ff add sp, sp, #0x20
400544: d65f03c0 ret
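Add 0x20 (32) to SP, restoring the value it had on entry, which equals main's FP. ret then sets PC (the address being executed) to the value in LR (40056c: 2a0003e1 mov w1, w0, the instruction right after main's bl to func)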
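Steps 5 to 7 can be replayed instruction by instruction in C. This is a toy model, not what the hardware does: `func_model`, `str_w`, `ldr_w`, and `STACK_SIZE` are hypothetical, and `*sp` is an index into a byte array instead of a real stack address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_SIZE = 128 };

/* Toy 32-bit store/load at stack[sp + off], standing in for str/ldr on W registers. */
static void str_w(uint8_t *stack, uint64_t sp, unsigned off, int32_t w) {
    memcpy(&stack[sp + off], &w, 4);
}
static int32_t ldr_w(const uint8_t *stack, uint64_t sp, unsigned off) {
    int32_t w;
    memcpy(&w, &stack[sp + off], 4);
    return w;
}

/* Hypothetical replay of func; the result comes back in w0. */
static int32_t func_model(uint8_t *stack, uint64_t *sp, int32_t w0, int32_t w1) {
    assert(*sp >= 0x20 && *sp <= STACK_SIZE);
    *sp -= 0x20;                  /* sub sp, sp, #0x20 */
    str_w(stack, *sp, 12, w0);    /* str w0, [sp, #12]   a */
    str_w(stack, *sp, 8, w1);     /* str w1, [sp, #8]    b */
    str_w(stack, *sp, 28, 0);     /* str wzr, [sp, #28]  c = 0 */
    w1 = ldr_w(stack, *sp, 12);   /* ldr w1, [sp, #12] */
    w0 = ldr_w(stack, *sp, 8);    /* ldr w0, [sp, #8] */
    w0 = w1 + w0;                 /* add w0, w1, w0 */
    str_w(stack, *sp, 28, w0);    /* str w0, [sp, #28]   c = a + b */
    w0 = ldr_w(stack, *sp, 28);   /* ldr w0, [sp, #28]   return value */
    *sp += 0x20;                  /* add sp, sp, #0x20 */
    return w0;
}
```

Calling `func_model(stack, &sp, 2, 3)` yields 5 and leaves `sp` exactly where it started, which is the balance between the `sub` and `add` on SP that the listing relies on.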
8. main continue
40056c: 2a0003e1 mov w1, w0
400570: b0000080 adrp x0, 411000 <__libc_start_main@GLIBC_2.17>
400574: 9100a000 add x0, x0, #0x28
400578: b9000001 str w1, [x0]
40057c: 52800000 mov w0, #0x0 // #0
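Copy w0 (the c that func just computed) into w1
adrp sets x0 to 0x411000 (the page that holds the global variables)
x0 = x0 + 0x28, so x0 now holds 0x411028 (the address of var)
Store w1 into the memory that x0 points to (0x411028, the global int var)
Then clear w0 to 0 (main's return value)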
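The address computation for var can be checked by decoding the adrp encoding b0000080. The sketch assumes the standard A64 ADRP layout (immlo in bits 30..29, immhi in bits 23..5, result relative to the 4 KiB page of the instruction); `adrp_target` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the page address produced by an A64 ADRP located at address pc. */
static uint64_t adrp_target(uint32_t insn, uint64_t pc) {
    assert((insn & 0x9f000000u) == 0x90000000u);  /* bit 31 = 1, bits 28..24 = 10000 */
    int64_t immlo = (insn >> 29) & 0x3;
    int64_t immhi = (insn >> 5) & 0x7ffff;
    int64_t imm = (immhi << 2) | immlo;           /* signed 21-bit page count */
    if (imm & 0x100000)                           /* sign bit of the 21-bit value */
        imm -= 0x200000;
    return (pc & ~(uint64_t)0xfff) + (uint64_t)(imm * 4096);
}
```

For the instruction at 0x400570 the page count is 0x11, so adrp yields 0x400000 + 0x11000 = 0x411000, and the following `add x0, x0, #0x28` reaches 0x411028.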
9. main end
400580: a8c27bfd ldp x29, x30, [sp], #32
400584: d65f03c0 ret
Load the pair stored at the memory address in SP into X29 and X30, then set SP to SP+32 (post-index). ret then returns to main's caller through LR
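The post-index form used in this epilogue differs from the prologue only in when SP changes: the load uses the old SP and the adjustment happens afterwards. A minimal C model, with the hypothetical names `struct cpu` and `ldp_post_index` and a byte array in place of real memory:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the registers touched by the epilogue. */
struct cpu {
    uint64_t sp, x29, x30;
};

/* Models: ldp x29, x30, [sp], #imm   (post-index)
 * `stack` is a toy byte array; sp is an index into it. */
static void ldp_post_index(struct cpu *c, const uint8_t *stack, int64_t imm) {
    assert(c->sp % 16 == 0);
    memcpy(&c->x29, &stack[c->sp], 8);      /* x29 = [SP]     (caller's FP) */
    memcpy(&c->x30, &stack[c->sp + 8], 8);  /* x30 = [SP + 8] (return address) */
    c->sp += (uint64_t)imm;                 /* SP = SP + imm, applied after the loads */
}
```

With `sp = 32` and `imm = 32`, the saved pair is read from offsets 32 and 40 and SP ends at 64, undoing the 32-byte allocation made at the start of main.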