A limitation of NASM is that it is a two-pass assembler; unlike TASM and others, it will always do exactly two assembly passes. Therefore it is unable to cope with source files that are complex enough to require three or more passes.
The first pass is used to determine the size of all the assembled code and data, so that the second pass, when generating all the code, knows all the symbol addresses the code refers to. So one thing NASM can’t handle is code whose size depends on the value of a symbol declared after the code in question. For example,
        times (label-$) db 0
label:  db      'Where am I?'
The argument to TIMES in this case could equally legally evaluate to anything at all; NASM will reject this example because it cannot tell the size of the TIMES line when it first sees it. It will just as firmly reject the slightly paradoxical code
        times (label-$+1) db 0
label:  db      'NOW where am I?'
in which any value for the TIMES argument is by definition wrong!
NASM rejects these examples by means of a concept called a critical expression, which is defined to be an expression whose value is required to be computable in the first pass, and which must therefore depend only on symbols defined before it. The argument to the TIMES prefix is a critical expression; for the same reason, the arguments to the RESB family of pseudo-instructions are also critical expressions.
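By contrast, a TIMES argument that refers only to symbols already defined satisfies the rule above. The following is a minimal sketch, not taken from the original text: the label name `record` and the padded size of 64 bytes are assumptions chosen purely for illustration.

```nasm
; Hypothetical example: pad a record out to 64 bytes.
; `record` is defined BEFORE the TIMES line, so the count
; 64-($-record) can be computed the first time the line is seen.
record: db      'Where am I?'           ; 11 bytes of data
        times 64-($-record) db 0        ; 64 - 11 = 53 zero bytes of padding
```

Because both `$` and `record` are known when the TIMES line is reached, the size of the line is fixed in the first pass and every later symbol address can be calculated.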
Critical expressions can crop up in other contexts as well: consider the following code.
        mov     ax, symbol1
symbol1 equ     symbol2
symbol2:
On the first pass, NASM cannot determine the value of symbol1, because symbol1 is defined to be equal to symbol2, which NASM hasn’t seen yet. On the second pass, therefore, when it encounters the line mov ax,symbol1, it is unable to generate the code for it because it still doesn’t know the value of symbol1. On the next line, it would see the EQU again and be able to determine the value of symbol1, but by then it would be too late.
NASM avoids this problem by defining the right-hand side of an EQU statement to be a critical expression, so the definition of symbol1 would be rejected in the first pass.
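One way to satisfy the EQU rule is to reorder the definitions so that each right-hand side refers only to symbols that have already been seen. This is a sketch of that reordering, reusing the symbol names from the example above; it is one possible arrangement, not the only one.

```nasm
; Reordered sketch: symbol2 is defined first, so the right-hand
; side of the EQU depends only on an already-defined symbol.
symbol2:
symbol1 equ     symbol2         ; value of symbol1 is known in the first pass
        mov     ax, symbol1     ; forward reference no longer involved
```

Note that only the EQU line needs its operand defined beforehand; the `mov` itself could still appear earlier in the file, since its size does not depend on the value of symbol1.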
There is a related issue involving forward references: consider this code fragment.
        mov     eax, [ebx+offset]
offset  equ     10
NASM, on pass one, must calculate the size of the instruction mov eax,[ebx+offset] without knowing the value of offset. It has no way of knowing that offset is small enough to fit into a one-byte offset field and that it could therefore get away with generating a shorter form of the effective-address encoding; for all it knows, in pass one, offset could be a symbol in the code segment, and it might need the full four-byte form. So it is forced to compute the size of the instruction to accommodate a four-byte address part. In pass two, having made this decision, it is now forced to honour it and keep the instruction large, so the code generated in this case is not as small as it could have been. This problem can be solved by defining offset before using it, or by forcing byte size in the effective address by coding [byte ebx+offset].
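The two remedies for the oversized effective address can be sketched side by side. In this illustration the `bits 32` directive and the second symbol name `later_offset` are assumptions added for the example; only `offset` and the instruction form come from the text above.

```nasm
        bits 32                         ; assumption: 32-bit code, as the EBX addressing suggests

; Remedy 1: define the constant before it is used, so its value
; is already known when the instruction size is calculated.
offset  equ     10
        mov     eax, [ebx+offset]

; Remedy 2: keep the forward reference, but state explicitly that
; the displacement is a single byte. `later_offset` is hypothetical.
        mov     eax, [byte ebx+later_offset]
later_offset equ 10
```

The second form places the responsibility on the programmer: the BYTE keyword is a promise that the value will fit in one byte, so it is only appropriate when the eventual value of the symbol is known to be small.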