3.3. Effective Addresses

An effective address is any operand to an instruction which references memory. Effective addresses, in NASM, have a very simple syntax: they consist of an expression evaluating to the desired address, enclosed in square brackets. For example:

wordvar dw 123
        mov ax,[wordvar]
        mov ax,[wordvar+1]
        mov ax,[es:wordvar+bx]

Anything not conforming to this simple system is not a valid memory reference in NASM; for example, es:wordvar[bx] is not accepted.

More complicated effective addresses, such as those involving more than one register, work in exactly the same way:

        mov eax,[ebx*2+ecx+offset]
        mov ax,[bp+di+8]
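
As a rough illustration of where multi-register forms are useful, the sketch below indexes an array of dwords and reads a field from an array of 8-byte records. The labels table and records, and the record layout, are hypothetical and exist only for this example.

```nasm
bits 32

section .data
table:   dd 10, 20, 30, 40          ; hypothetical array of dwords
records: dd 1, 100, 2, 200          ; hypothetical 8-byte records: {id, value}

section .text
        mov ebx, table              ; ebx = address of the array
        mov ecx, 1                  ; ecx = element index
        mov eax, [ebx+ecx*4]        ; base + index*4: loads table[1] (20)
        mov eax, [records+ecx*8+4]  ; index*8 + offset: loads records[1].value (200)
```

The scale factor matches the element size, so the index register can hold a plain element number rather than a byte offset.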

NASM is capable of doing algebra on these effective addresses, so that things which don’t necessarily look legal are perfectly all right:

        mov eax,[ebx*5]         ; assembles as [ebx*4+ebx]
        mov eax,[label1*2-label2] ; ie [label1+(label1-label2)]
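
A common use of this algebra is cheap multiplication with lea. The following is a minimal sketch, assuming 32-bit code; it relies only on the fact that a scale of 3, 5 or 9 can be rewritten as a base register plus the same register scaled by 2, 4 or 8.

```nasm
bits 32
        lea eax, [ebx*3]        ; same address as [ebx*2+ebx]; eax = ebx*3
        lea eax, [ebx*5]        ; same address as [ebx*4+ebx]; eax = ebx*5
        lea eax, [ebx*9]        ; same address as [ebx*8+ebx]; eax = ebx*9
        lea eax, [ebx*2+ebx]    ; the first form written out explicitly
        ; lea eax, [ebx*6]      ; no base + index*{1,2,4,8} combination gives 6,
        ;                       ; so this form has no encoding
```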

Some forms of effective address have more than one assembled form; in most such cases NASM will generate the smallest form it can. For example, there are distinct assembled forms for the 32-bit effective addresses [eax*2+0] and [eax+eax], and NASM will generally generate the latter on the grounds that the former requires four bytes to store a zero offset.

NASM has a hinting mechanism which will cause [eax+ebx] and [ebx+eax] to generate different encodings; this is occasionally useful because [esi+ebp] and [ebp+esi] have different default segment registers.
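
The sketch below shows how operand order might be used to pick the default segment. It assumes the first-named register is taken as the base register, which is the usual reading of the hint but is not stated in the text above. On x86, a base of EBP defaults to SS and other bases default to DS. An explicit segment override removes any dependence on the hint.

```nasm
bits 32
        mov eax, [esi+ebp]      ; esi named first: assumed base esi, default segment DS
        mov eax, [ebp+esi]      ; ebp named first: assumed base ebp, default segment SS
        mov eax, [ss:esi+ebp]   ; explicit override: SS regardless of which is the base
```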

However, you can force NASM to generate an effective address in a particular form by the use of the keywords BYTE, WORD, DWORD and NOSPLIT. If you need [eax+3] to be assembled using a double-word offset field instead of the one byte NASM will normally generate, you can code [dword eax+3]. Similarly, you can force NASM to use a byte offset for a small value which it hasn’t seen on the first pass (see Section 3.8 for an example of such a code fragment) by using [byte eax+offset]. As special cases, [byte eax] will code [eax+0] with a byte offset of zero, and [dword eax] will code it with a double-word offset of zero. The normal form, [eax], will be coded with no offset field.
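
The forms described above can be compared side by side. The following sketch assumes 32-bit code; the byte sequences in the comments are the standard x86 encodings for each form.

```nasm
bits 32
        mov eax, [eax]          ; 8B 00                no offset field
        mov eax, [byte eax]     ; 8B 40 00             byte offset of zero
        mov eax, [dword eax]    ; 8B 80 00 00 00 00    double-word offset of zero
        mov eax, [eax+3]        ; 8B 40 03             byte offset (the default choice)
        mov eax, [dword eax+3]  ; 8B 80 03 00 00 00    double-word offset forced
```

Forcing a fixed-width offset can be useful when an instruction must keep the same length, for example when the offset is patched later.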

The form described in the previous paragraph is also useful if you are trying to access data in a 32-bit segment from within 16-bit code. In particular, if you need to access data at a known offset that is too large to fit in a 16-bit value and you do not specify that it is a dword offset, NASM will discard the high word of the offset.
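
A minimal sketch of this situation follows. The offset 0x00012345 is an arbitrary illustrative value, and the code assumes that DS describes a segment whose limit covers it.

```nasm
bits 16
        mov ax, [dword 0x00012345]  ; dword offset requested: all 32 bits are kept

        ; Alternative: carry the offset in a 32-bit register instead.
        mov ebx, 0x00012345
        mov ax, [ebx]
```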

Similarly, NASM will split [eax*2] into [eax+eax] because that allows the offset field to be absent and space to be saved; in fact, it will also split [eax*2+offset] into [eax+eax+offset]. You can combat this behaviour by the use of the NOSPLIT keyword: [nosplit eax*2] will force [eax*2+0] to be generated literally.
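
The effect of NOSPLIT shows up in the generated bytes. The sketch below assumes 32-bit code; the encodings in the first two comments are the standard x86 encodings of the two forms, and the last two lines follow the splitting behaviour described above.

```nasm
bits 32
        mov eax, [eax*2]            ; 8B 04 00              split into [eax+eax]
        mov eax, [nosplit eax*2]    ; 8B 04 45 00 00 00 00  [eax*2+0] kept literally
        mov eax, [eax*2+8]          ; split into [eax+eax+8]
        mov eax, [nosplit eax*2+8]  ; kept as [eax*2+8]
```

The unsplit form is longer because an index register with no base register always requires a four-byte offset field.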

3.3.1. 64-bit Displacements

In BITS 64 mode, displacements, for the most part, remain 32 bits and are sign extended prior to use. The exception is one restricted form of the mov instruction: between an AL, AX, EAX, or RAX register and a 64-bit absolute address (no registers are allowed in the effective address, and the address cannot be RIP-relative). In NASM syntax, use of the 64-bit absolute form requires QWORD. Examples in NASM syntax:

        mov eax, [1]    ; 32 bit, with sign extension
        mov al, [rax-1] ; 32 bit, with sign extension
        mov al, [qword 0x1122334455667788] ; 64-bit absolute
        mov al, [0x1122334455667788] ; truncated to 32-bit (warning)
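
Because the 64-bit absolute form is limited to the accumulator registers, other registers need a different approach. One possible workaround, sketched below, is to load the address into a register first. The address value is the same illustrative constant used above.

```nasm
bits 64
        mov al, [qword 0x1122334455667788]  ; allowed: AL with a 64-bit absolute address

        mov rbx, 0x1122334455667788         ; load the full 64-bit address as an immediate
        mov cl, [rbx]                       ; any register can now be the destination
```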

3.3.2. RIP Relative Addressing

In 64-bit mode, a new form of effective addressing is available to make it easier to write position-independent code. Any memory reference may be made RIP relative (RIP is the instruction pointer register, which contains the address of the location immediately following the current instruction).

In NASM syntax, there are two ways to specify RIP-relative addressing:

        mov dword [rip+10], 1

stores the value 1 ten bytes after the end of the instruction. 10 can also be a symbolic constant, and will be treated the same way. On the other hand,

        mov dword [symb wrt rip], 1

stores the value 1 into the address of symbol symb. This is distinctly different from the behavior of:

        mov dword [symb+rip], 1

which takes the address of the end of the instruction, adds the address of symb to it, then stores the value 1 there. If symb is a variable, this will not store the value 1 into the symb variable!
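
As a small sketch of the wrt rip form in context, the fragment below writes, reads and takes the address of a variable. The label counter is hypothetical and exists only for this example.

```nasm
bits 64

section .data
counter: dd 0                           ; hypothetical variable

section .text
        mov dword [counter wrt rip], 1  ; store 1 into counter
        mov eax, [counter wrt rip]      ; read the value back
        lea rsi, [counter wrt rip]      ; rsi = address of counter, computed relative to RIP
```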

Yasm also supports the following syntax for RIP-relative addressing. The REL keyword makes it produce RIP-relative addresses, while the ABS keyword makes it produce non-RIP-relative addresses:

        mov [rel sym], rax  ; RIP-relative
        mov [abs sym], rax  ; not RIP-relative

The behavior of mov [sym], rax depends on a mode set by the DEFAULT directive (see Section 5.2). The default mode at Yasm start-up is always ABS. In REL mode, use of registers, an FS or GS segment override, or an explicit ABS override will result in a non-RIP-relative effective address, as the following examples show:

default rel
        mov [sym], rbx      ; RIP-relative
        mov [abs sym], rbx  ; not RIP-relative (explicit override)
        mov [rbx+1], rbx    ; not RIP-relative (register use)
        mov [fs:sym], rbx   ; not RIP-relative (fs or gs use)
        mov [ds:sym], rbx   ; RIP-relative (segment, but not fs or gs)
        mov [rel sym], rbx  ; RIP-relative (redundant override)

default abs
        mov [sym], rbx      ; not RIP-relative
        mov [abs sym], rbx  ; not RIP-relative
        mov [rbx+1], rbx    ; not RIP-relative
        mov [fs:sym], rbx   ; not RIP-relative
        mov [ds:sym], rbx   ; not RIP-relative
        mov [rel sym], rbx  ; RIP-relative (explicit override)
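
For position-independent code it is often convenient to set the mode once at the top of the file. The following is a minimal sketch; the names value and set_value are hypothetical, and the choice of rdi as the input register is an assumption made for illustration.

```nasm
bits 64
default rel                     ; plain [sym] references become RIP-relative

section .data
value:  dq 0                    ; hypothetical variable

section .text
global set_value                ; hypothetical routine name
set_value:
        mov [value], rdi        ; RIP-relative because of "default rel"
        mov rax, [value]        ; RIP-relative as well
        ret
```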
