Linux Debugging (2): Getting Familiar with AT&T Assembly Language

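I did not expect "Linux Debugging: Using Disassembly to Understand the C++ Function Call Stack" to be so well received. Some readers commented that they are not familiar with assembly, so this post lists the basic syntax. It should be enough for everyday debugging.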

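1 AT&T vs. Intel Assembly Syntax

Most of us learned Intel-syntax 8086 assembly as undergraduates, and that is the format Microsoft's tools use. GCC uses the AT&T format, also called GAS format (GAS is the GNU Assembler).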

1. Register names differ

AT&T	Intel	Notes
%eax	eax	Intel register names have no percent sign

2. Operand order differs

AT&T	Intel	Notes
movl %eax, %ebx	mov ebx, eax	Intel puts the destination operand first and the source operand second; AT&T is the reverse

3. Constants/immediates are written differently

AT&T	Intel	Notes
movl $_value,%ebx	mov ebx,_value	Intel immediates have no $ prefix
movl $0xd00d,%ebx	mov ebx,0xd00d	The same rule applies to hexadecimal immediates

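4. Operand size marking

AT&T	Intel	Notes
movw %ax,%bx	mov bx,ax	In Intel syntax the operand size is not part of the mnemonic. In AT&T syntax each instruction takes a one-letter suffix that gives the operand size. For example, mov has three forms in 32-bit code: movb moves a byte, movw moves a word (16 bits), and movl moves a long (32 bits). If the suffix is omitted, the assembler infers the size from the register operand. In "mov %ax, %bx" the destination bx is a word, so the assembler treats the instruction as "movw %ax, %bx".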
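
To make the effect of the size suffix concrete, here is a small Python model (not assembler code) of what a mov with each suffix does to a 32-bit register. The helper name mov_imm and the sample values are made up for illustration; the point is only that movb and movw overwrite the low 8 or 16 bits and leave the rest of the register alone.

```python
SUFFIX_BITS = {"b": 8, "w": 16, "l": 32}

def mov_imm(reg_value, imm, suffix):
    """Return the 32-bit register value after mov<suffix> writes imm into its low part."""
    bits = SUFFIX_BITS[suffix]
    mask = (1 << bits) - 1
    # Keep the bits above the operand size, replace the bits inside it.
    return (reg_value & ~mask & 0xFFFFFFFF) | (imm & mask)

eax = 0x11223344
print(hex(mov_imm(eax, 0xAABBCCDD, "b")))  # 0x112233dd, like movb $0xdd, %al
print(hex(mov_imm(eax, 0xAABBCCDD, "w")))  # 0x1122ccdd, like movw $0xccdd, %ax
print(hex(mov_imm(eax, 0xAABBCCDD, "l")))  # 0xaabbccdd, like movl $0xaabbccdd, %eax
```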

5. Addressing modes

AT&T	Intel	Notes
imm32(basepointer, indexpointer, indexscale)	[basepointer + indexpointer*indexscale + imm32]	Both forms compute the same effective address: imm32 + basepointer + indexpointer*indexscale
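
The address formula can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name effective_address and the sample addresses are assumptions chosen for illustration, not values taken from a real program.

```python
def effective_address(disp=0, base=0, index=0, scale=1):
    """Model of disp(base, index, scale): disp + base + index*scale."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only allows a scale of 1, 2, 4 or 8")
    # In 32-bit mode the result wraps around at 32 bits.
    return (disp + base + index * scale) & 0xFFFFFFFF

# 4(%ebp) with %ebp = 0x1000
print(hex(effective_address(disp=4, base=0x1000)))               # 0x1004
# data_items(,%edi,4) with data_items at 0x8049000 and %edi = 3
print(hex(effective_address(disp=0x8049000, index=3, scale=4)))  # 0x804900c
```

Any of the parts may be omitted in the assembly operand, which is why the defaults are 0 for the displacement, base and index, and 1 for the scale.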

Some addressing examples:

AT&T	Intel	Notes
movl 4(%ebp), %eax	mov eax, [ebp + 4]	Base pointer addressing mode. Convenient for struct members: if the base address of a struct is in ebp and a member sits 4 bytes into the struct, this instruction reads that member
movl data_items(,%edi,4), %eax	mov eax, [data_items + edi*4]	Indexed addressing mode, used to access array elements
movl addr, %eax	mov eax, [addr]	Direct addressing mode: loads the 32-bit value stored at address addr
movl (%eax), %ebx	mov ebx, [eax]	Indirect addressing mode: treats the value of eax as an address and moves the 32-bit value stored there into ebx
movl $12, %eax	mov eax, 12	Immediate mode
movl %eax, %ebx	mov ebx, eax	Register addressing mode
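
The modes can be compared side by side in a toy Python model. Everything here is hypothetical: the 64-byte memory, the address 16 chosen for data_items, and the register contents exist only so that each mode has something to read.

```python
import struct

# Hypothetical flat memory with four 32-bit little-endian values at address 16.
memory = bytearray(64)
DATA_ITEMS = 16
struct.pack_into("<4I", memory, DATA_ITEMS, 3, 67, 34, 222)

def load32(addr):
    """Read a 32-bit little-endian value, like a movl from memory."""
    return struct.unpack_from("<I", memory, addr)[0]

regs = {"eax": DATA_ITEMS, "ebp": DATA_ITEMS, "edi": 3}

immediate = 12                                  # movl $12, %eax
register = regs["edi"]                          # movl %edi, %eax
direct = load32(DATA_ITEMS)                     # movl data_items, %eax
indirect = load32(regs["eax"])                  # movl (%eax), %ebx
base = load32(regs["ebp"] + 4)                  # movl 4(%ebp), %eax
indexed = load32(DATA_ITEMS + regs["edi"] * 4)  # movl data_items(,%edi,4), %eax

print(immediate, register, direct, indirect, base, indexed)  # 12 3 3 3 67 222
```

Only the last four lines touch memory; the immediate and register modes take their operand from the instruction itself or from a register.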

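6. Indirect jumps and calls are written differently

In AT&T syntax, the operand of an absolute indirect jump or call (a jmp/call whose target comes from a register or from memory) must be prefixed with '*'. Intel syntax needs no prefix.

AT&T	Intel	Notes
jmp *%eax	jmp eax	Use the value in eax as the jump target
jmp *(%eax)	jmp [eax]	Use the value in eax as an address and read the jump target from memory at that address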

2 Finding the Maximum of an Array

Finding the largest number in an array is a good way to practice AT&T syntax further.

#PURPOSE: This program finds the maximum number of a
#         set of data items.
#
#VARIABLES: The registers have the following uses:
#
# %edi - Holds the index of the data item being examined
# %ebx - Largest data item found
# %eax - Current data item
#
# The following memory locations are used:
#
# data_items - contains the item data. A 0 is used
#              to terminate the data
#
.section .data                          # global data
data_items:                             # These are the data items
    .long 3,67,34,222,45,75,54,34,44,33,22,11,66,0
.section .text
.globl _start
_start:
    movl $0, %edi                       # move 0 into the index register
    movl data_items(,%edi,4), %eax      # load the first data item (4 bytes)
    movl %eax, %ebx                     # since this is the first item, %eax is
                                        # the biggest
start_loop:                             # start loop
    cmpl $0, %eax                       # check to see if we've hit the end
    je loop_exit
    incl %edi                           # advance the index to the next item
    movl data_items(,%edi,4), %eax      # load next value
    cmpl %ebx, %eax                     # compare values
    jle start_loop                      # jump to loop beginning if the new
                                        # one isn't bigger
    movl %eax, %ebx                     # move the value as the largest
    jmp start_loop                      # jump to loop beginning
loop_exit:
    # %ebx is the status code for the _exit system call
    # and it already has the maximum number
    movl $1, %eax                       # 1 is the _exit() syscall
    int $0x80
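
For readers who find the control flow easier to follow in a high-level language, here is a line-by-line Python rendering of the same loop. It is a reading aid rather than a translation of the machine code: the variable names simply reuse the register names, and the function name find_max is made up.

```python
data_items = [3, 67, 34, 222, 45, 75, 54, 34, 44, 33, 22, 11, 66, 0]

def find_max(items):
    edi = 0                 # movl $0, %edi
    eax = items[edi]        # movl data_items(,%edi,4), %eax
    ebx = eax               # movl %eax, %ebx
    while eax != 0:         # cmpl $0, %eax / je loop_exit
        edi += 1            # incl %edi
        eax = items[edi]    # movl data_items(,%edi,4), %eax
        if eax > ebx:       # cmpl %ebx, %eax / jle start_loop
            ebx = eax       # movl %eax, %ebx
    return ebx              # the assembly passes %ebx as the exit status

print(find_max(data_items))  # 222
```

The assembly version has no print; it reports the result as the process exit status, which the shell exposes as $? after the program has run.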

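Names that begin with a dot in an assembly program are not instruction mnemonics and are not translated into machine instructions. They give the assembler special instructions and are called assembler directives or pseudo-operations ("pseudo" because they are not real instructions). The .section directive divides the program into sections. When the operating system loads the program, each section is loaded at a different address, and the operating system sets different read, write and execute permissions on the pages of each section. The .data section holds the program's data; it is readable and writable, and corresponds to the global variables of a C++ program.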

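The .text section holds the code; it is read-only and executable. All the instructions that follow the .section .text line belong to the .text section.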

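The .long directive declares a sequence of numbers, each 32 bits wide. .quad is similar with 64 bits per number, .byte uses 8 bits, and .word uses 16 bits. .ascii declares a string: .ascii "Hello world" declares 11 bytes whose values are the ASCII codes of the characters, with no terminating NUL byte (.asciz adds one).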
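
The sizes can be double-checked from Python with the standard struct module. The mapping from directive names to struct format codes below is my own shorthand for illustration, not something the assembler defines.

```python
import struct

# Sizes of the GAS data directives on x86, expressed as struct format codes.
DIRECTIVE_FORMATS = {".byte": "<B", ".word": "<H", ".long": "<I", ".quad": "<Q"}

for name, fmt in DIRECTIVE_FORMATS.items():
    print(name, struct.calcsize(fmt) * 8, "bits")

# .long 3,67,34 lays out three 4-byte little-endian values back to back.
longs = struct.pack("<3I", 3, 67, 34)
print(len(longs), longs.hex())  # 12 030000004300000022000000

# .ascii "Hello world" emits 11 bytes and no terminating NUL.
text = "Hello world".encode("ascii")
print(len(text), list(text[:5]))  # 11 [72, 101, 108, 108, 111]
```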

References:

1. The simplest assembly program

2. The second assembly program

3. https://blog.chinaunix.net/uid-27717694-id-3942757.html

Finally, a quick review of the lea instruction:

movl 4(%ebp), %eax # loads the value stored at address %ebp+4 into %eax

leal 4(%ebp), %eax # loads the address %ebp+4 itself into %eax

The leal above can be replaced by a mov followed by an add (unlike leal, addl also modifies the flags):

movl %ebp, %eax

addl $4, %eax
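
The difference between loading a value and loading an address can be modelled in a few lines of Python. The memory contents, the value of ebp, and the helper names movl_load and leal are all invented for this sketch.

```python
import struct

memory = bytearray(32)
struct.pack_into("<I", memory, 12, 0xCAFE)  # hypothetical value stored at address 12

ebp = 8

def movl_load(disp, base):
    """movl disp(base), %eax: read the 32-bit value stored at disp+base."""
    return struct.unpack_from("<I", memory, disp + base)[0]

def leal(disp, base):
    """leal disp(base), %eax: compute disp+base without touching memory."""
    return (disp + base) & 0xFFFFFFFF

print(hex(movl_load(4, ebp)))  # 0xcafe
print(leal(4, ebp))            # 12
```

Both functions compute the same address; only movl_load goes on to read memory at that address.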

Last updated: 2017-04-03 12:54:47
