ARM Aarch64 Stack Operation Example
@ Aeric | Monday, Oct 26, 2020 | 5 minute read | Update at Monday, Oct 26, 2020



Background:

AArch64 Register Special Role in the procedure call standard
x0..x7 Parameter/result registers
x8 Indirect result location register
x9..x15 Temporary registers
x16 IP0 The first intra-procedure-call scratch register (can be used by call veneers and PLT code); at other times may be used as a temporary register.
x17 IP1 The second intra-procedure-call temporary register (can be used by call veneers and PLT code); at other times may be used as a temporary register.
x18 The Platform Register, if needed; otherwise a temporary register.
x19..x28 Callee-saved registers
x29 FP The Frame Pointer. (Base of the current frame; in the code below it points at the saved FP/LR pair)
x30 LR The Link Register
SP The Stack Pointer. (Where local data is)

A few commonly used instructions:

MOV    X1,X0         // copy the value of X0 into X1
ADD    X0,X1,X2     // X0 = X1 + X2
SUB    X0,X1,X2     // X0 = X1 - X2

AND    X0,X0,#0xF    // X0 = X0 AND 0xF
ORR    X0,X0,#9      // X0 = X0 OR 9
EOR    X0,X0,#0xF    // X0 = X0 XOR 0xF

LDR    X5,[X6,#0x08]        // ld: load; load the value at memory address X6+0x08 into X5
LDP  x29, x30, [sp, #0x10]    // ldp: load pair; load the value at SP+0x10 into x29 and the next one (at SP+0x18) into x30

STR X0, [SP, #0x8]         // st: store; str writes a value to memory: store X0 at memory address SP+0x8
STUR   w0, [x29, #-0x8]   // store with an unscaled offset (here negative): store w0 at x29-0x8
STP  x29, x30, [sp, #0x10]    // store pair: store x29 at SP+0x10 and x30 at SP+0x10+0x8

CBZ Wn, label  // compare and branch: jump to label if Wn is zero
CBNZ Wn, label // compare and branch: jump to label if Wn is non-zero

B   // B: branch, an unconditional jump
BL  // branch with link: jump and save the address of the next instruction (the return address) in LR (X30)
BLR Xn // branch with link to register: like BL, but jump to the address held in Xn
RET   // return from the callee: jump to the return address held in LR (X30)

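Note that MOV only copies a value into a register (from another register or from an immediate), while the LD and ST families (LDR/LDP/STR/STUR/STP) move values between registers and memory.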

Addressing: https://developer.arm.com/architectures/learn-the-architecture/aarch64-instruction-set-architecture/loads-and-stores-addressing


Walkthrough:

Compile the following code with an ARM cross compiler:

//main.c
int var = 10;
int func(int a,int b)
{
    int c = 0;
    c = a + b;
    return c;
}
int main()
{
    int i = 2;
    int j = 3;
    var = func(i, j);
    return 0;
}

Compiler from: https://releases.linaro.org/components/toolchain/binaries/latest-7/aarch64-linux-gnu/

compiler command:

gcc-linaro-7.5.0-2019.12-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gcc main.c -o main

objdump:

gcc-linaro-7.5.0-2019.12-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-objdump -D main

Then locate the assembly for main and func:

000000000040051c <func>:
  40051c:	d10083ff 	sub	sp, sp, #0x20
  400520:	b9000fe0 	str	w0, [sp, #12]
  400524:	b9000be1 	str	w1, [sp, #8]
  400528:	b9001fff 	str	wzr, [sp, #28]
  40052c:	b9400fe1 	ldr	w1, [sp, #12]
  400530:	b9400be0 	ldr	w0, [sp, #8]
  400534:	0b000020 	add	w0, w1, w0
  400538:	b9001fe0 	str	w0, [sp, #28]
  40053c:	b9401fe0 	ldr	w0, [sp, #28]
  400540:	910083ff 	add	sp, sp, #0x20
  400544:	d65f03c0 	ret

0000000000400548 <main>:
  400548:	a9be7bfd 	stp	x29, x30, [sp, #-32]!
  // Store pair (from reg to mem)
  40054c:	910003fd 	mov	x29, sp
  400550:	52800040 	mov	w0, #0x2                   	// #2
  400554:	b9001fa0 	str	w0, [x29, #28]
  400558:	52800060 	mov	w0, #0x3                   	// #3
  40055c:	b9001ba0 	str	w0, [x29, #24]
  400560:	b9401ba1 	ldr	w1, [x29, #24]
  400564:	b9401fa0 	ldr	w0, [x29, #28]
  400568:	97ffffed 	bl	40051c <func>
  40056c:	2a0003e1 	mov	w1, w0
  400570:	b0000080 	adrp	x0, 411000 <__libc_start_main@GLIBC_2.17>
  400574:	9100a000 	add	x0, x0, #0x28
  400578:	b9000001 	str	w1, [x0]
  40057c:	52800000 	mov	w0, #0x0                   	// #0
  400580:	a8c27bfd 	ldp	x29, x30, [sp], #32 // load pair register (from mem to reg)
  400584:	d65f03c0 	ret
  
  ...(略)
  
  0000000000411028 <var>:
  411028:	0000000a 	.word	0x0000000a



On arm64, the saved LR (x30) and FP (x29) sit at the top of the stack frame (its lowest address, where SP points), as main shows here.


1. stp x29, x30, [sp, #-32]!

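// store x29 and x30 at sp-32, then set sp = sp-32
stp	x29, x30, [sp, #-32]!

This stores the values of registers x29 and x30 at the memory address SP-32 (x29 at SP-32, x30 at SP-24).

Because of the trailing "!" (pre-index writeback), the SP register is also updated: SP = SP-32.

So this step fixes where main's saved FP and LR live and updates the CPU's SP register.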



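Why "-32"?

FP: 8 bytes
LR: 8 bytes
Two int variables: 4 bytes x 2 = 8 bytes
SP must stay 16-byte aligned, so it can only move in multiples of 16 bytes.

8 + 8 + 8 = 24, which has to be rounded up to a multiple of 16, hence 32.

Suppose main needs 5 int variables instead.
The total is then 8 (LR) + 8 (FP) + 20 (4 x 5 ints) = 36.
36 is not a multiple of 16, so 48 bytes are needed.

Example assembly: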
0000000000400548 <main>:
  400548:	a9bd7bfd 	stp	x29, x30, [sp, #-48]!
  40054c:	910003fd 	mov	x29, sp
  400550:	52800040 	mov	w0, #0x2                   	// #2
  400554:	b9002fa0 	str	w0, [x29, #44]
  400558:	52800060 	mov	w0, #0x3                   	// #3
  40055c:	b9002ba0 	str	w0, [x29, #40]
  400560:	52800080 	mov	w0, #0x4                   	// #4
  400564:	b90027a0 	str	w0, [x29, #36]
  400568:	528000a0 	mov	w0, #0x5                   	// #5
  40056c:	b90023a0 	str	w0, [x29, #32]
--
  400588:	b0000080 	adrp	x0, 411000 <__libc_start_main@GLIBC_2.17>
  40058c:	9100a000 	add	x0, x0, #0x28
  400590:	b9000001 	str	w1, [x0]
  400594:	52800000 	mov	w0, #0x0                   	// #0
  400598:	a8c37bfd 	ldp	x29, x30, [sp], #48
  40059c:	d65f03c0 	ret


2. mov x29, sp

40054c:	910003fd 	mov	x29, sp

This step copies the updated SP (the original SP minus 32) into the CPU register X29 (the Frame Pointer).



3. update variable

  400550:	52800040 	mov	w0, #0x2                   	// #2
  400554:	b9001fa0 	str	w0, [x29, #28]
  400558:	52800060 	mov	w0, #0x3                   	// #3
  40055c:	b9001ba0 	str	w0, [x29, #24]
  400560:	b9401ba1 	ldr	w1, [x29, #24]
  400564:	b9401fa0 	ldr	w0, [x29, #28]

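This stores the local variables: first put 2 into w0, then store the value of w0 at memory address FP+28.

Likewise, put 3 into w0, then store the value of w0 at memory address FP+24.


Finally, because we are about to enter func (the callee), the arguments are loaded back from the stack into w0 (i) and w1 (j), the registers the callee reads them from.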


4. branch to callee

  400568:	97ffffed 	bl	40051c <func>
  40056c:	2a0003e1 	mov	w1, w0

bl (branch with link) stores the address of the next instruction (40056c, whose encoding is 2a0003e1) in X30 (LR) and then jumps to func.


5. func (callee start)

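  40051c:	d10083ff 	sub	sp, sp, #0x20

func is a leaf function (it calls nothing else), so it does not save FP/LR; it only moves SP down by 0x20 (32 bytes) to make room for its locals.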


6. func execute

  400520:	b9000fe0 	str	w0, [sp, #12]
  400524:	b9000be1 	str	w1, [sp, #8]
  400528:	b9001fff 	str	wzr, [sp, #28]
  40052c:	b9400fe1 	ldr	w1, [sp, #12]
  400530:	b9400be0 	ldr	w0, [sp, #8]
  400534:	0b000020 	add	w0, w1, w0
  400538:	b9001fe0 	str	w0, [sp, #28]
  40053c:	b9401fe0 	ldr	w0, [sp, #28]

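Store the value of w0 (int 2) at memory address sp+12.
Store the value of w1 (int 3) at memory address sp+8.
Store 0 at memory address sp+28 (str wzr, [sp, #28]).

Then a and b are read back from memory into w1 and w0 and added; the sum is left in w0 and stored at sp+28. Finally the value at sp+28 is loaded back into w0 as the return value.


Why the extra store of wzr?
int c = 0;
c = a+b;
The C source initializes c to 0 before assigning to it, and since the build command uses no optimization flag (GCC defaults to -O0), the compiler emits that initialization literally as a store of the zero register (wzr).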

7. func return

  400540:	910083ff 	add	sp, sp, #0x20
  400544:	d65f03c0 	ret

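Add 0x20 (32) to SP, which moves it back to where main's FP points. ret then sets the PC (the address being executed) to the value in LR (40056c: 2a0003e1 mov w1, w0, the instruction right after the bl in main).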


8. main continue

  40056c:	2a0003e1 	mov	w1, w0
  400570:	b0000080 	adrp	x0, 411000 <__libc_start_main@GLIBC_2.17>
  400574:	9100a000 	add	x0, x0, #0x28
  400578:	b9000001 	str	w1, [x0]
  40057c:	52800000 	mov	w0, #0x0                   	// #0

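Copy w0 (the c that func just computed) into w1.

adrp loads 0x411000 into x0 (the 4 KB page that holds the global variable).

x0 = x0 + 0x28, so x0 now holds 0x411028 (the address of var).

Store w1 to the memory x0 points to (0x411028, the global int var).

Finally clear w0 to 0 (the return value of main).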


9. main end

  400580:	a8c27bfd 	ldp	x29, x30, [sp], #32
  400584:	d65f03c0 	ret

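Load the pair of values at the memory address in SP into x29 and x30, then set SP = SP+32 (post-index). This restores the caller's FP and LR and releases main's frame before ret.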
