21.2. Execution Modes and Extensions

The x86 architecture has been extended in many ways throughout its history, remaining mostly backwards compatible while adding execution modes and large extensions to the instruction set. A modern x86 processor can operate in one of four major modes: 16-bit real mode, 16-bit protected mode, 32-bit protected mode, and 64-bit long mode. The primary difference between real and protected mode is in the handling of segments: in real mode a segment value directly addresses memory in 16-byte units (the segment value is multiplied by 16 to form the segment base), whereas in protected mode the segment value is instead an index into a descriptor table that contains the base address and size of the segment. 32-bit protected mode additionally allows paging and virtual memory, and uses a 32-bit rather than a 16-bit offset.
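As a minimal sketch of the real-mode case, the fragment below shows how a segment and offset combine into a physical address. The segment value 0x1234 and the offset 0x0010 are arbitrary values chosen for illustration only.

```nasm
; Real-mode addressing sketch: physical address = segment * 16 + offset.
; 0x1234 and 0x0010 are arbitrary illustrative values, not from the manual.
        bits 16

        mov     ax, 0x1234      ; DS cannot be loaded from an immediate, so go via AX
        mov     ds, ax          ; DS = 0x1234, so the segment base is 0x12340
        mov     al, [0x0010]    ; reads the byte at 0x12340 + 0x0010 = 0x12350
```

In protected mode the same MOV to DS would instead load a selector, and the base would come from the descriptor table entry that the selector indexes.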

The 16-bit and 32-bit operating modes both allow the use of 16-bit and 32-bit registers. Instruction prefixes select an operation size and an address size of either 16 or 32 bits: the active operating mode sets the default size, and the other size is selected with a prefix. These operation and address sizes also affect the size of immediate operands: for example, an instruction with a 32-bit operation size and an immediate operand will have a 32-bit value in the encoded instruction, except where a shorter encoding such as a sign-extended 8-bit value is used.
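The effect of the default size can be sketched by assembling the same instructions under BITS 16 and BITS 32. The byte sequences in the comments are the standard x86 encodings one would expect, assuming the assembler picks the shortest form; they are worth confirming against a listing file.

```nasm
        bits 16                         ; default operation and address size: 16 bits
        mov     ax, 0x1234              ; B8 34 12             (no prefix needed)
        mov     eax, 0x12345678         ; 66 B8 78 56 34 12    (operand-size prefix 66h)
        mov     al, [ebx]               ; 67 8A 03             (address-size prefix 67h)

        bits 32                         ; default operation and address size: 32 bits
        mov     eax, 0x12345678         ; B8 78 56 34 12       (no prefix needed)
        mov     ax, 0x1234              ; 66 B8 34 12          (operand-size prefix 66h)
        add     ebx, 0x12345678         ; 81 C3 78 56 34 12    (full 32-bit immediate)
        add     ebx, 1                  ; 83 C3 01             (sign-extended 8-bit immediate)
```

Note that the prefix byte is the same in both modes; its meaning is simply "the non-default size" for the mode in effect.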

64-bit long mode is more of a break from the legacy modes than the 16-bit and 32-bit modes are from each other. Long mode obsoletes several instructions. It is also the only mode in which 64-bit registers are available; 64-bit registers cannot be accessed from either 16-bit or 32-bit mode. Also, unlike the other modes, most encoded values in long mode are limited to 32 bits in size. A small subset of the MOV instructions allows 64-bit encoded values, but values larger than 32 bits in other instructions must come from a register. Partly due to this limitation, but also due to the wide use of relocatable shared libraries, long mode also adds a new addressing mode: RIP-relative.
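A short sketch of these long mode properties follows. The labels `value` and `example` and the constants are hypothetical, chosen only for illustration; the `rel` keyword is used here to request RIP-relative addressing.

```nasm
        bits 64

        section .data
value:  dq      42                      ; hypothetical data item

        section .text
example:
        mov     rax, 0x123456789ABCDEF0 ; MOV is the exception: full 64-bit immediate
        mov     rcx, 0x123456789ABCDEF0 ; for other instructions, a constant wider than
        add     rbx, rcx                ;   32 bits must first be placed in a register
        add     rbx, 0x1000             ; fits in a sign-extended 32-bit immediate
        mov     rdx, [rel value]        ; RIP-relative load of 'value'
        lea     rsi, [rel value]        ; RIP-relative address of 'value'
        ret
```

Because the RIP-relative forms encode only a 32-bit displacement from the instruction pointer, they stay within the 32-bit limit on encoded values while remaining position independent.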

21.2.1. CPU Options

The NASM parser allows setting which subsets of instructions and operands Yasm accepts via the CPU directive (see Section 5.8). As the x86 architecture has a very large number of extensions, both specific feature flags such as SSE3 and CPU names such as P4 can be specified. Each feature flag has both a normal and a "no"-prefixed version to turn that single feature on or off, while a CPU name turns on only the features listed for it, turning off all other features. Table 21.3 lists the feature flags, and Table 21.4 lists the CPU names Yasm supports. Having both feature flags and CPU names allows for combinations such as CPU P3 nofpu. Both feature flags and CPU names are case insensitive.
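As a hedged illustration of combining a CPU name with feature flags, the sketch below uses the CPU P3 nofpu combination mentioned above. Which instruction belongs to which feature follows the usual instruction set groupings; the instructions expected to be unavailable are left commented out so the fragment still assembles, and the second combination (P3 plus SSE2) is a hypothetical example.

```nasm
        cpu     p3 nofpu                ; Pentium III feature set, minus FPU instructions
        bits 32

        addps   xmm0, xmm1              ; SSE: enabled by P3
        paddd   mm0, mm1                ; MMX: enabled by P3
        ; fld   dword [ebx]             ; FPU: turned off by nofpu, so commented out
        ; addpd xmm0, xmm1              ; SSE2: not part of P3, so commented out

        cpu     p3 sse2                 ; hypothetical combination: P3 plus the SSE2 flag
        addpd   xmm0, xmm1              ; SSE2 is now enabled
```

Since a CPU name resets the feature set to exactly the features listed for it, any additional flags are written after the name in the same directive.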

Table 21.3. x86 CPU Feature Flags

| Name | Description |
|---|---|
| FPU | Floating Point Unit (FPU) instructions |
| MMX | MMX SIMD instructions |
| SSE | Streaming SIMD Extensions (SSE) instructions |
| SSE2 | Streaming SIMD Extensions 2 instructions |
| SSE3 | Streaming SIMD Extensions 3 instructions |
| SSSE3 | Supplemental Streaming SIMD Extensions 3 instructions |
| SSE4.1 | Streaming SIMD Extensions 4, Penryn subset (47 instructions) |
| SSE4.2 | Streaming SIMD Extensions 4, Nehalem subset (7 instructions) |
| SSE4 | All Streaming SIMD Extensions 4 instructions (both SSE4.1 and SSE4.2) |
| SSE4a | Streaming SIMD Extensions 4a (AMD) |
| SSE5 | Streaming SIMD Extensions 5 |
| XSAVE | XSAVE instructions |
| AVX | Advanced Vector Extensions instructions |
| FMA | Fused Multiply-Add instructions |
| AES | Advanced Encryption Standard instructions |
| CLMUL, PCLMULQDQ | PCLMULQDQ instruction |
| 3DNow | 3DNow! instructions |
| Cyrix | Cyrix-specific instructions |
| AMD | AMD-specific instructions (older than K6) |
| SMM | System Management Mode instructions |
| Prot, Protected | Protected mode only instructions |
| Undoc, Undocumented | Undocumented instructions |
| Obs, Obsolete | Obsolete instructions |
| Priv, Privileged | Privileged instructions |
| SVM | Secure Virtual Machine instructions |
| PadLock | VIA PadLock instructions |
| EM64T | Intel EM64T or better instructions (not necessarily 64-bit only) |


Table 21.4. x86 CPU Names

| Name | Feature Flags | Description |
|---|---|---|
| 8086 | Priv | Intel 8086 |
| 186, 80186, i186 | Priv | Intel 80186 |
| 286, 80286, i286 | Priv | Intel 80286 |
| 386, 80386, i386 | SMM, Prot, Priv | Intel 80386 |
| 486, 80486, i486 | FPU, SMM, Prot, Priv | Intel 80486 |
| 586, i586, Pentium, P5 | FPU, SMM, Prot, Priv | Intel Pentium |
| 686, i686, P6, PPro, PentiumPro | FPU, SMM, Prot, Priv | Intel Pentium Pro |
| P2, Pentium2, Pentium-2, PentiumII, Pentium-II | MMX, FPU, SMM, Prot, Priv | Intel Pentium II |
| P3, Pentium3, Pentium-3, PentiumIII, Pentium-III, Katmai | SSE, MMX, FPU, SMM, Prot, Priv | Intel Pentium III |
| P4, Pentium4, Pentium-4, PentiumIV, Pentium-IV, Williamette | SSE2, SSE, MMX, FPU, SMM, Prot, Priv | Intel Pentium 4 |
| IA64, IA-64, Itanium | SSE2, SSE, MMX, FPU, SMM, Prot, Priv | Intel Itanium (x86) |
| K6 | 3DNow, MMX, FPU, SMM, Prot, Priv | AMD K6 |
| Athlon, K7 | SSE, 3DNow, MMX, FPU, SMM, Prot, Priv | AMD Athlon |
| Hammer, Clawhammer, Opteron, Athlon64, Athlon-64 | SSE2, SSE, 3DNow, MMX, FPU, SMM, Prot, Priv | AMD Athlon64 and Opteron |
| Prescott | SSE3, SSE2, SSE, MMX, FPU, SMM, Prot, Priv | Intel codename Prescott |
| Conroe, Core2 | SSSE3, SSE3, SSE2, SSE, MMX, FPU, SMM, Prot, Priv | Intel codename Conroe |
| Penryn | SSE4.1, SSSE3, SSE3, SSE2, SSE, MMX, FPU, SMM, Prot, Priv | Intel codename Penryn |
| Nehalem, Corei7 | XSAVE, SSE4.2, SSE4.1, SSSE3, SSE3, SSE2, SSE, MMX, FPU, SMM, Prot, Priv | Intel codename Nehalem |
| Westmere | CLMUL, AES, XSAVE, SSE4.2, SSE4.1, SSSE3, SSE3, SSE2, SSE, MMX, FPU, SMM, Prot, Priv | Intel codename Westmere |
| Sandybridge | AVX, CLMUL, AES, XSAVE, SSE4.2, SSE4.1, SSSE3, SSE3, SSE2, SSE, MMX, FPU, SMM, Prot, Priv | Intel codename Sandy Bridge |
| Venice | SSE3, SSE2, SSE, 3DNow, MMX, FPU, SMM, Prot, Priv | AMD codename Venice |
| K10, Phenom, Family10h | SSE4a, SSE3, SSE2, SSE, 3DNow, MMX, FPU, SMM, Prot, Priv | AMD codename K10 |
| Bulldozer | SSE5, SSE4a, SSE3, SSE2, SSE, 3DNow, MMX, FPU, SMM, Prot, Priv | AMD codename Bulldozer |


To have access to 64-bit instructions, a 64-bit capable CPU must be selected and 64-bit assembly mode must be set (in NASM syntax) by either using BITS 64 (see Section 5.1) or targeting a 64-bit object format such as elf64.

The default CPU setting is for the latest processor and all feature flags to be enabled; i.e., all x86 instructions for any processor are accepted, including all instruction set extensions and 64-bit instructions.