4.8. Standard Macros

Yasm defines a set of standard macros in the NASM preprocessor which are already defined when it starts to process any source file. If you really need a program to be assembled with no pre-defined macros, you can use the %clear directive to empty the preprocessor of everything.

Most user-level NASM syntax directives are implemented as macros which invoke primitive directives; these are described in Chapter 5. The rest of the standard macro set is described here.

4.8.1. __YASM_MAJOR__, etc: Yasm Version

The single-line macros __YASM_MAJOR__, __YASM_MINOR__, and __YASM_SUBMINOR__ expand to the major, minor, and subminor parts of the version number of Yasm being used. In addition, __YASM_VER__ expands to a string representation of the Yasm version and __YASM_VERSION_ID__ expands to a 32-bit BCD-encoded representation of the Yasm version, with the major version in the most significant 8 bits, followed by the 8-bit minor version and 8-bit subminor version, and 0 in the least significant 8 bits. For example, under Yasm 0.5.1, __YASM_MAJOR__ would be defined as 0, __YASM_MINOR__ as 5, __YASM_SUBMINOR__ as 1, __YASM_VER__ as "0.5.1", and __YASM_VERSION_ID__ as 00050100h.

In addition, the single-line macro __YASM_BUILD__ expands to the Yasm build number, typically the Subversion changeset number. It should be seen as less significant than the subminor version, and is generally only useful in discriminating between Yasm nightly snapshots or pre-release (e.g. release candidate) Yasm versions.
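
As a minimal sketch of how these macros might be used, the following guard refuses to assemble under a Yasm older than 0.5.1. The threshold value 00050100h follows the encoding described above; the wording of the error message is illustrative only, and the check is skipped entirely when __YASM_MAJOR__ is not defined (that is, when some other assembler is processing the file).

```nasm
%ifdef __YASM_MAJOR__                   ; only defined when Yasm is the assembler
  %if __YASM_VERSION_ID__ < 00050100h   ; major 0, minor 5, subminor 1
    %error this source file expects Yasm 0.5.1 or later
  %endif
%endif
```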

4.8.2. __FILE__ and __LINE__: File Name and Line Number

Like the C preprocessor, the NASM preprocessor allows the user to find out the file name and line number containing the current instruction. The macro __FILE__ expands to a string constant giving the name of the current input file (which may change through the course of assembly if %include directives are used), and __LINE__ expands to a numeric constant giving the current line number in the input file.

These macros could be used, for example, to communicate debugging information to a macro, since invoking __LINE__ inside a macro definition (either single-line or multi-line) will return the line number of the macro call rather than that of the definition. So to determine where in a piece of code a crash is occurring, for example, one could write a routine stillhere, which is passed a line number in EAX and outputs something like line 155: still here. You could then write a macro

%macro notdeadyet 0
        push    eax
        mov     eax, __LINE__
        call    stillhere
        pop     eax
%endmacro

and then pepper your code with calls to notdeadyet until you find the crash point.
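
The same idea can be applied to data rather than code. The sketch below is not part of the standard macro set: srcpos and checkpoint1 are hypothetical names chosen for illustration. It stores the line number and the file name of each invocation as a small record that a debugger or a dump routine could read back.

```nasm
%macro srcpos 0
        dd      __LINE__        ; line number of the srcpos invocation
        db      __FILE__, 0     ; NUL-terminated name of the current input file
%endmacro

        section .data
checkpoint1:                    ; hypothetical label naming this record
        srcpos
```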

4.8.3. __YASM_OBJFMT__ and __OUTPUT_FORMAT__: Output Object Format Keyword

__YASM_OBJFMT__, and its NASM-compatible alias __OUTPUT_FORMAT__, expand to the object format keyword specified on the command line with -f keyword (see Section 1.3.1.2). For example, if yasm is invoked with -f elf, __YASM_OBJFMT__ expands to elf.

These expansions match the option given on the command line exactly, even when the object formats are equivalent. For example, -f elf and -f elf32 are equivalent specifiers for the 32-bit ELF format, and -f elf -m amd64 and -f elf64 are equivalent specifiers for the 64-bit ELF format, but __YASM_OBJFMT__ expands to elf and elf32 respectively in the first pair of cases, and to elf and elf64 respectively in the second pair.
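
Because the expansion mirrors the command line literally, a source file that wants to detect "any ELF output" has to test each spelling. The following is a sketch using the preprocessor's %ifidn family; TARGET_IS_ELF is a hypothetical name introduced here. Note that it cannot tell 32-bit from 64-bit output when the keyword is plain elf, since -f elf -m amd64 also expands to elf.

```nasm
%ifidn __YASM_OBJFMT__, elf
  %define TARGET_IS_ELF 1
%elifidn __YASM_OBJFMT__, elf32
  %define TARGET_IS_ELF 1
%elifidn __YASM_OBJFMT__, elf64
  %define TARGET_IS_ELF 1
%else
  %define TARGET_IS_ELF 0       ; some other object format keyword was given
%endif
```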

4.8.4. STRUC and ENDSTRUC: Declaring Structure Data Types

The NASM preprocessor is sufficiently powerful that data structures can be implemented as a set of macros. The macros STRUC and ENDSTRUC are used to define a structure data type.

STRUC takes one parameter, which is the name of the data type. This name is defined as a symbol with the value zero; the same name with the suffix _size appended is defined as an EQU giving the size of the structure. Once STRUC has been issued, you are defining the structure, and should define fields using the RESB family of pseudo-instructions, and then invoke ENDSTRUC to finish the definition.

For example, to define a structure called mytype containing a longword, a word, a byte and a string of bytes, you might code

        struc   mytype
mt_long:        resd 1
mt_word:        resw 1
mt_byte:        resb 1
mt_str:         resb 32
        endstruc

The above code defines six symbols: mt_long as 0 (the offset from the beginning of a mytype structure to the longword field), mt_word as 4, mt_byte as 6, mt_str as 7, mytype_size as 39, and mytype itself as zero.

The reason why the structure type name is defined at zero is a side effect of allowing structures to work with the local label mechanism: if your structure members tend to have the same names in more than one structure, you can define the above structure like this:

        struc   mytype
.long:  resd 1
.word:  resw 1
.byte:  resb 1
.str:   resb 32
        endstruc

This defines the offsets to the structure fields as mytype.long, mytype.word, mytype.byte and mytype.str.

Since NASM syntax has no intrinsic structure support, it does not support any form of period notation to refer to the elements of a structure once you have one (except the above local-label notation), so code such as mov ax,[mystruc.mt_word] is not valid. mt_word is a constant just like any other constant, so the correct syntax is mov ax,[mystruc+mt_word] or mov ax,[mystruc+mytype.word].
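
Since the field names are plain constants, they combine with a register base just as well as with a label. The following sketch assumes 32-bit code and the first (mt_-prefixed) definition of mytype given earlier in this section; scratch and fill_scratch are hypothetical names used only for illustration.

```nasm
        bits 32

        section .bss
scratch:        resb mytype_size        ; room for one uninitialised mytype

        section .text
fill_scratch:
        mov     ebx, scratch                    ; EBX = base of the structure
        mov     dword [ebx + mt_long], 123456   ; field at offset 0
        mov     word  [ebx + mt_word], 1024     ; field at offset 4
        mov     byte  [ebx + mt_byte], 'x'      ; field at offset 6
        mov     ax, [scratch + mt_word]         ; label + constant form
        ret
```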

4.8.5. ISTRUC, AT and IEND: Declaring Instances of Structures

Having defined a structure type, the next thing you typically want to do is to declare instances of that structure in your data segment. The NASM preprocessor provides an easy way to do this in the ISTRUC mechanism. To declare a structure of type mytype in a program, you code something like this:

mystruc:        istruc  mytype
        at mt_long, dd 123456
        at mt_word, dw 1024
        at mt_byte, db 'x'
        at mt_str,  db 'hello, world', 13, 10, 0
                iend

The function of the AT macro is to make use of the TIMES prefix to advance the assembly position to the correct point for the specified structure field, and then to declare the specified data. Therefore the structure fields must be declared in the same order as they were specified in the structure definition.

If the data to go in a structure field requires more than one source line to specify, the remaining source lines can easily come after the AT line. For example:

        at mt_str, db 123,134,145,156,167,178,189
        db 190,100,0

Depending on personal taste, you can also omit the code part of the AT line completely, and start the structure field on the next line:

        at mt_str
        db 'hello, world'
        db 13,10,0
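
Because AT advances the assembly position with TIMES, it should also be possible to leave fields out, provided the remaining ones still appear in definition order. The sketch below declares a hypothetical second instance, mystruc2, that sets only two fields. This section does not state what the padding bytes contain; zero fill is assumed here, as in NASM's own implementation of these macros.

```nasm
mystruc2:       istruc  mytype
        at mt_long, dd 42               ; offset 0
        at mt_str,  db 'partial', 0     ; mt_word and mt_byte skipped (assumed zero)
                iend                    ; pads the instance out to mytype_size
```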

4.8.6. ALIGN and ALIGNB: Data Alignment

The ALIGN and ALIGNB macros provide a convenient way to align code or data on a word, longword, paragraph or other boundary. The syntax of the ALIGN and ALIGNB macros is

        align 4                 ; align on 4-byte boundary
        align 16                ; align on 16-byte boundary
        align 16,nop            ; equivalent to previous line
        align 8,db 0            ; pad with 0s rather than NOPs
        align 4,resb 1          ; align to 4 in the BSS
        alignb 4                ; equivalent to previous line

Both macros require their first argument to be a power of two. They compute the number of additional bytes required to bring the length of the current section up to a multiple of that power of two, and then either output NOP fill or apply the TIMES prefix to their second argument to perform the alignment.

If the second argument is not specified, the default for ALIGN is NOP, and the default for ALIGNB is RESB 1. ALIGN treats a NOP argument specially by generating maximal NOP fill instructions (not necessarily NOP opcodes) for the current BITS setting, whereas ALIGNB takes its second argument literally. Otherwise, the two macros are equivalent when a second argument is specified. Normally, you can just use ALIGN in code and data sections and ALIGNB in BSS sections, and never need the second argument except for special purposes.

ALIGN and ALIGNB, being simple macros, perform no error checking: they cannot warn you if their first argument fails to be a power of two, or if their second argument generates more than one byte of code. In each of these cases they will silently do the wrong thing.
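
To make the padding arithmetic concrete, here is a small worked sketch. It assumes each section shown starts empty, so that the first item sits at section offset 0; the labels are hypothetical.

```nasm
        section .data
flag:   db      1               ; offset 0, section length is now 1
        align   4, db 0         ; 3 padding bytes bring the length to 4
value:  dd      12345678h       ; starts at offset 4

        section .bss
tmp:    resb    5               ; offsets 0 to 4, section length is now 5
        alignb  8               ; reserves 3 bytes to reach offset 8
table:  resq    4               ; starts at offset 8
```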

ALIGNB (or ALIGN with a second argument of RESB 1) can be used within structure definitions:

        struc   mytype2
mt_byte:        resb 1
                alignb 2
mt_word:        resw 1
                alignb 4
mt_long:        resd 1
mt_str:         resb 32
        endstruc

This will ensure that the structure members are sensibly aligned relative to the base of the structure: mt_byte is at offset 0, mt_word at 2, mt_long at 4 and mt_str at 8, giving a mytype2_size of 40.

A final caveat: ALIGNB works relative to the beginning of the section, not the beginning of the address space in the final executable. Aligning to a 16-byte boundary when the section you’re in is only guaranteed to be aligned to a 4-byte boundary, for example, is a waste of effort. Again, Yasm does not check that the section’s alignment characteristics are sensible for the use of ALIGNB. ALIGN is more intelligent and does adjust the section alignment to be the maximum specified alignment.
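
One way to make ALIGNB meaningful is to request matching alignment for the section itself. The sketch below assumes an object format whose SECTION directive accepts an align= attribute (ELF is one such format; consult the documentation for the format in use). The labels are hypothetical.

```nasm
        section .bss align=16   ; ask for the section itself to be 16-byte aligned
hdr:    resb    3               ; offsets 0 to 2
        alignb  16              ; reserves 13 bytes to reach offset 16
vec:    resb    16              ; offset 16 within a 16-byte-aligned section
```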