@wbenny
Created November 23, 2018 19:46
diff --git a/2018may.txt b/2018nov.txt
index f007f17..4270c98 100644
--- a/2018may.txt
+++ b/2018nov.txt
@@ -8,8 +8,8 @@ Developer's Manual: Basic Architecture, Order Number 253665; Instruction Set Ref
Number 325383; System Programming Guide, Order Number 325384; Model-Specific Registers, Order
Number 335592. Refer to all four volumes when evaluating your design needs.
-Order Number: 325462-067US
-May 2018
+Order Number: 325462-068US
+November 2018
Intel technologies features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Learn
more at intel.com, or from the OEM or retailer.
@@ -43,8 +43,8 @@ Guide, Part 3, Order Number 326019; System Programming Guide, Part 4, Order Numb
Model-Specific Registers, Order Number 335592. Refer to all ten volumes when evaluating your design
needs.
-Order Number: 253665-067US
-May 2018
+Order Number: 253665-068US
+November 2018
Intel technologies features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Learn
more at intel.com, or from the OEM or retailer.
@@ -927,49 +927,49 @@ x87 FPU Instruction Operands . . . . . . . . . . . . . . . . . . . . . . . . . .
8.3.3
Data Transfer Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-16
8.3.4
-Load Constant Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-18
+Load Constant Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-17
8.3.5
-Basic Arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-18
+Basic Arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-17
8.3.6
-Comparison and Classification Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-19
+Comparison and Classification Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-18
8.3.6.1
Branching on the x87 FPU Condition Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-20
8.3.7
-Trigonometric Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-21
+Trigonometric Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-20
8.3.8
Approximation of Pi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-21
8.3.9
-Logarithmic, Exponential, and Scale. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-22
+Logarithmic, Exponential, and Scale. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-21
8.3.10
Transcendental Instruction Accuracy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-22
8.3.11
-x87 FPU Control Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-24
+x87 FPU Control Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-23
8.3.12
Waiting vs. Non-waiting Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-24
8.3.13
-Unsupported x87 FPU Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-25
+Unsupported x87 FPU Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-24
8.4
-X87 FPU FLOATING-POINT EXCEPTION HANDLING . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-25
+X87 FPU FLOATING-POINT EXCEPTION HANDLING . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-24
8.4.1
Arithmetic vs. Non-arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-25
8.5
X87 FPU FLOATING-POINT EXCEPTION CONDITIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-26
8.5.1
-Invalid Operation Exception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-27
+Invalid Operation Exception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-26
8.5.1.1
-Stack Overflow or Underflow Exception (#IS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-27
+Stack Overflow or Underflow Exception (#IS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-26
8.5.1.2
Invalid Arithmetic Operand Exception (#IA) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-27
8.5.2
Denormal Operand Exception (#D). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-28
8.5.3
-Divide-By-Zero Exception (#Z) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-29
+Divide-By-Zero Exception (#Z) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-28
8.5.4
Numeric Overflow Exception (#O) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-29
8.5.5
-Numeric Underflow Exception (#U) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-30
+Numeric Underflow Exception (#U) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-29
8.5.6
-Inexact-Result (Precision) Exception (#P) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-31
+Inexact-Result (Precision) Exception (#P) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-30
8.6
X87 FPU EXCEPTION SYNCHRONIZATION. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-31
8.7
@@ -977,7 +977,7 @@ HANDLING X87 FPU EXCEPTIONS IN SOFTWARE . . . . . . . . . . . . . . . . . . . .
8.7.1
Native Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-32
8.7.2
-MS-DOS* Compatibility Sub-mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-33
+MS-DOS* Compatibility Sub-mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-32
Vol. 1 ix
CONTENTS
@@ -1992,7 +1992,6 @@ Figure D-2.
Figure D-3.
Figure D-4.
Figure D-5.
-Figure D-6.
xviii Vol. 1
Example x87 FPU Dot Product Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-3
@@ -2048,18 +2047,20 @@ Bound Paging Structure and Address Translation in 64-Bit Mode . . . . . . . . .
Bound Paging Structure and Address Translation Outside 64-Bit Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-9
Memory-Mapped I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-2
I/O Permission Bit Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-4
-Recommended Circuit for MS-DOS Compatibility x87 FPU Exception Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-4
+Recommended Circuit for MS-DOS Compatibility x87 FPU
+Exception Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-4
Behavior of Signals During x87 FPU Exception Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-5
Timing of Receipt of External Interrupt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-6
Arithmetic Example Using Infinity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-9
General Program Flow for DNA Exception Handler. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-17
-Program Flow for a Numeric Exception Dispatch Routine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-18
CONTENTS
PAGE
+Figure D-6.
Figure E-1.
+Program Flow for a Numeric Exception Dispatch Routine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . D-18
Control Flow for Handling Unmasked Floating-Point Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .E-4
Vol. 1 xix
@@ -2098,8 +2099,8 @@ Table 7-4.
Table 8-1.
Table 8-2.
Table 8-3.
-Table 8-5.
Table 8-4.
+Table 8-5.
Table 8-6.
Table 8-7.
Table 8-8.
@@ -2157,14 +2158,14 @@ Conditional Jump Instructions. . . . . . . . . . . . . . . . . . . . . . . . . .
Condition Code Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-5
Precision Control Field (PC) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-8
Unsupported Double Extended-Precision Floating-Point Encodings and Pseudo-Denormals . . . . . . . . . . . . . . . . . . . . . 8-14
-Floating-Point Conditional Move Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-17
-Data Transfer Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-17
-Setting of x87 FPU Condition Code Flags for Floating-Point Number Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-20
-Setting of EFLAGS Status Flags for Floating-Point Number Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-20
-TEST Instruction Constants for Conditional Branching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-21
-Arithmetic and Non-arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-26
-Invalid Arithmetic Operations and the Masked Responses to Them . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-28
-Divide-By-Zero Conditions and the Masked Responses to Them . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-29
+Data Transfer Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-16
+Floating-Point Conditional Move Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-16
+Setting of x87 FPU Condition Code Flags for Floating-Point Number Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-19
+Setting of EFLAGS Status Flags for Floating-Point Number Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-19
+TEST Instruction Constants for Conditional Branching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-20
+Arithmetic and Non-arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-25
+Invalid Arithmetic Operations and the Masked Responses to Them . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-27
+Divide-By-Zero Conditions and the Masked Responses to Them . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-28
Data Range Limits for Saturation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-5
MMX Instruction Set Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-6
Effect of Prefixes on MMX Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-11
@@ -6418,7 +6419,8 @@ or TPR) has been added. See Chapter 2, “Intel® 64 and IA-32 Architectures,”
Debug registers — Debug registers expand to 64 bits. See Chapter 17, “Debug, Branch Profile, TSC, and
-Quality of Service,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.
+Intel® Resource Director Technology (Intel® RDT) Features,” in the Intel® 64 and IA-32 Architectures
+Software Developer’s Manual, Volume 3B.
Vol. 1 3-5
@@ -19467,10 +19469,6 @@ value in a selected x87 FPU data register onto the top of the register stack.
The FILD (load integer) instruction converts an integer operand in memory into double extended-precision floatingpoint format and pushes the value onto the top of the register stack. The FBLD (load packed decimal) instruction
performs the same load operation for a packed BCD operand in memory.
-8-16 Vol. 1
-
- PROGRAMMING WITH THE X87 FPU
-
Table 8-4. Data Transfer Instructions
Floating Point
@@ -19567,6 +19565,11 @@ ZF=0
Not equal
+8-16 Vol. 1
+
+ PROGRAMMING WITH THE X87 FPU
+
+Table 8-5. Floating-Point Conditional Move Instructions (Contd.)
Instruction Mnemonic
Status Flag States
@@ -19602,10 +19605,6 @@ help eliminate branching overhead for IF operations and the possibility of branc
Software can check if the FCMOVcc instructions are supported by checking the processor’s feature information with
the CPUID instruction.
-Vol. 1 8-17
-
- PROGRAMMING WITH THE X87 FPU
-
8.3.4
Load Constant Instructions
@@ -19665,11 +19664,6 @@ FDIVR/FDIVRP
FIDIVR
FABS
FCHS
-FSQRT
-FPREM
-FPREM1
-FRNDINT
-FXTRACT
Add floating point
Add integer to floating point
@@ -19685,6 +19679,17 @@ Reverse divide
Reverse divide integer by floating point
Absolute value
Change sign
+
+Vol. 1 8-17
+
+ PROGRAMMING WITH THE X87 FPU
+
+FSQRT
+FPREM
+FPREM1
+FRNDINT
+FXTRACT
+
Square root
Partial remainder
IEEE partial remainder
@@ -19703,11 +19708,6 @@ See Section 8.1.2, “x87 FPU Data Registers,” for a description of how operan
stack.
Operands in memory can be in single-precision floating-point, double-precision floating-point, word-integer, or
doubleword-integer format. They are converted to double extended-precision floating-point format automatically.
-
-8-18 Vol. 1
-
- PROGRAMMING WITH THE X87 FPU
-
Reverse versions of the subtract (FSUBR) and divide (FDIVR) instructions enable efficient coding. For example, the
following options are available with the FSUB and FSUBR instructions for operating on values in a specified x87 FPU
data register ST(i) and the ST(0) register:
@@ -19743,6 +19743,11 @@ FUCOM/FUCOMP/FUCOMPPUnordered compare floating point and set
x87 FPU condition code flags.
FICOM/FICOMPCompare integer and set x87 FPU
condition code flags.
+
+8-18 Vol. 1
+
+ PROGRAMMING WITH THE X87 FPU
+
FCOMI/FCOMIPCompare floating point and set EFLAGS
status flags.
FUCOMI/FUCOMIPUnordered compare floating point and
@@ -19759,11 +19764,6 @@ cannot have less than, equal, or greater than relationships with other floating-
The FCOM, FCOMP, and FCOMPP instructions compare the value in register ST(0) with a floating-point source
operand and set the condition code flags (C0, C2, and C3) in the x87 FPU status word according to the results (see
Table 8-6).
-
-Vol. 1 8-19
-
- PROGRAMMING WITH THE X87 FPU
-
If an unordered condition is detected (one or both of the values are NaNs or in an undefined format), a floatingpoint invalid-operation exception is generated.
The pop versions of the instruction pop the x87 FPU register stack once or twice after the comparison operation is
complete.
@@ -19866,6 +19866,10 @@ Unordered
1
+Vol. 1 8-19
+
+ PROGRAMMING WITH THE X87 FPU
+
Software can check if the FCOMI and FCOMIP instructions are supported by checking the processor’s feature information with the CPUID instruction.
The FUCOMI and FUCOMIP instructions operate the same as the FCOMI and FCOMIP instructions, except that they
do not generate a floating-point invalid-operation exception if the unordered condition is the result of one or both
@@ -19883,10 +19887,6 @@ Branching on the x87 FPU Condition Codes
The processor does not offer any control-flow instructions that branch on the setting of the condition code flags
(C0, C2, and C3) in the x87 FPU status word. To branch on the state of these flags, the x87 FPU status word must
-8-20 Vol. 1
-
- PROGRAMMING WITH THE X87 FPU
-
first be moved to the AX register in the integer unit. The FSTSW AX (store status word) instruction can be used for
this purpose. When these flags are in the AX register, the TEST instruction can be used to control conditional
branching as follows:
@@ -19896,14 +19896,14 @@ flags indicate an unordered result; otherwise, the ZF flag will be set. The JNZ
transfer control (if necessary) to a procedure for handling unordered operands.
Table 8-8. TEST Instruction Constants for Conditional Branching
-Order
-
Constant
Branch
ST(0) > Source Operand
+Order
+
4500H
JZ
@@ -19961,6 +19961,10 @@ FPATAN
Arctangent
+8-20 Vol. 1
+
+ PROGRAMMING WITH THE X87 FPU
+
These instructions operate on the top one or two registers of the x87 FPU register stack and they return their
results to the stack. The source operands for the FSIN, FCOS, FSINCOS, and FPTAN instructions must be given in
radians; the source operand for the FPATAN instruction is given in rectangular coordinate units.
@@ -19977,11 +19981,6 @@ Approximation of Pi
When the argument (source operand) of a trigonometric function is within the domain of the function, the argument is automatically reduced by the appropriate multiple of 2π through the same reduction mechanism used by
the FPREM and FPREM1 instructions. The internal value of π (3.1415926…) that the x87 FPU uses for argument
-
-Vol. 1 8-21
-
- PROGRAMMING WITH THE X87 FPU
-
reduction and other computations, denoted as Pi in the expression below. The numerical value of Pi can be written
as:
Pi = 0.f ∗ 22
@@ -20027,6 +20026,10 @@ are close to 0.
The F2XM1 instruction computes (2x − 1). This instruction only operates on source values in the range −1.0 to +1.0.
The FSCALE instruction multiplies the source operand by a power of 2.
+Vol. 1 8-21
+
+ PROGRAMMING WITH THE X87 FPU
+
8.3.10
Transcendental Instruction Accuracy
@@ -20048,10 +20051,6 @@ where k is an integer such that:
f ( x ) < 2.
-8-22 Vol. 1
-
- PROGRAMMING WITH THE X87 FPU
-
With the Pentium processor and later IA-32 processors, the worst case error on transcendental functions is less
than 1 ulp when rounding to the nearest (even) and less than 1.5 ulps when rounding in other modes. The functions are guaranteed to be monotonic, with respect to the input operands, throughout the domain supported by the
instruction.
@@ -20091,6 +20090,11 @@ precision result of FSIN will not have errors larger than 0.72 ulp for |x| < 2.8
larger than 0.5 ulp, and single precision results will be correctly rounded in the vast majority of cases.
Likewise, the double-extended precision result of FCOS will not have errors larger than 0.82 ulp for |x| < 1.31 (so
|x| < 3π/8 will ensure good accuracy, as 3π/8 < 1.31). On the same interval, double precision results from FCOS
+
+8-22 Vol. 1
+
+ PROGRAMMING WITH THE X87 FPU
+
will have errors at most slightly larger than 0.5 ulp, and single precision results will be correctly rounded in the vast
majority of cases.
FSINCOS behaves similarly to FSIN and FCOS, combined as a pair.
@@ -20105,10 +20109,6 @@ The instructions FYL2X and FYL2XP1 are two operand instructions and are guarante
y equals 1. When y is not equal to 1, the maximum ulp error is always within 1.35 ulps in round to nearest mode.
(For the two operand functions, monotonicity was proved by holding one of the operands constant.)
-Vol. 1 8-23
-
- PROGRAMMING WITH THE X87 FPU
-
8.3.11
x87 FPU Control Instructions
@@ -20186,6 +20186,11 @@ The WAIT/FWAIT instructions are synchronization instructions. (They are actually
opcode.) These instructions check the x87 FPU status word for pending unmasked x87 FPU exceptions. If any
pending unmasked x87 FPU exceptions are found, they are handled before the processor resumes execution of the
instructions (integer, floating-point, or system instruction) in the instruction stream. The WAIT/FWAIT instructions
+
+Vol. 1 8-23
+
+ PROGRAMMING WITH THE X87 FPU
+
are provided to allow synchronization of instruction execution between the x87 FPU and the processor’s integer
unit. See Section 8.6, “x87 FPU Exception Synchronization,” for more information on the use of the WAIT/FWAIT
instructions.
@@ -20203,15 +20208,12 @@ primary operation; whereas, the non-waiting version (with the “FN” prefix) i
Non-waiting instructions allow software to save the current x87 FPU state without first handling pending exceptions
or to reset or reinitialize the x87 FPU without regard for pending exceptions.
-8-24 Vol. 1
-
- PROGRAMMING WITH THE X87 FPU
-
NOTES
When operating a Pentium or Intel486 processor in MS-DOS compatibility mode, it is possible
(under unusual circumstances) for a non-waiting instruction to be interrupted prior to being
-executed to handle a pending x87 FPU exception. The circumstances where this can happen and
-the resulting action of the processor are described in Section D.2.1.3, “No-Wait x87 FPU Instructions Can Get x87 FPU Interrupt in Window.”
+executed to handle a pending x87 FPU exception. The circumstances where this can happen and the
+resulting action of the processor are described in Section D.2.1.3, “No-Wait x87 FPU Instructions
+Can Get x87 FPU Interrupt in Window.”
When operating a P6 family, Pentium 4, or Intel Xeon processor in MS-DOS compatibility mode,
non-waiting instructions can not be interrupted in this way (see Section D.2.2, “MS-DOS* Compatibility Sub-mode in the P6 Family and Pentium® 4 Processors”).
@@ -20251,6 +20253,11 @@ Each of the six exception classes has a corresponding flag bit in the x87 FPU st
FPU control word (see Section 8.1.3, “x87 FPU Status Register,” and Section 8.1.5, “x87 FPU Control Word,” respectively). In addition, the exception summary (ES) flag in the status word indicates when one or more unmasked
exceptions has been detected. The stack fault (SF) flag (also in the status word) distinguishes between the two
types of invalid-operation exceptions.
+
+8-24 Vol. 1
+
+ PROGRAMMING WITH THE X87 FPU
+
The mask bits can be set with FLDCW, FRSTOR, or FXRSTOR; they can be read with either FSTCW/FNSTCW,
FSAVE/FNSAVE, or FXSAVE. The flag bits can be read with the FSTSW/FNSTSW, FSAVE/FNSAVE, or FXSAVE
instruction.
@@ -20268,11 +20275,7 @@ Arithmetic vs. Non-arithmetic Instructions
When dealing with floating-point exceptions, it is useful to distinguish between arithmetic instructions and nonarithmetic instructions. Non-arithmetic instructions have no operands or do not make substantial changes to
their operands. Arithmetic instructions do make significant changes to their operands; in particular, they make
-changes that could result in floating-point exceptions being signaled. Table 8-9 lists the non-arithmetic and arithVol. 1 8-25
-
- PROGRAMMING WITH THE X87 FPU
-
-metic instructions. It should be noted that some non-arithmetic instructions can signal a floating-point stack (fault)
+changes that could result in floating-point exceptions being signaled. Table 8-9 lists the non-arithmetic and arithmetic instructions. It should be noted that some non-arithmetic instructions can signal a floating-point stack (fault)
exception, but this exception is not the result of an operation on an operand.
Table 8-9. Arithmetic and Non-arithmetic Instructions
@@ -20371,6 +20374,15 @@ FSIN
FXCH
FSINCOS
+
+Vol. 1 8-25
+
+ PROGRAMMING WITH THE X87 FPU
+
+Table 8-9. Arithmetic and Non-arithmetic Instructions (Contd.)
+Non-arithmetic Instructions
+
+Arithmetic Instructions
FSQRT
FST/FSTP (single and double)
FSUB/FSUBP/FSUBR/FSUBRP
@@ -20387,13 +20399,7 @@ NOTE:
X87 FPU FLOATING-POINT EXCEPTION CONDITIONS
The following sections describe the various conditions that cause a floating-point exception to be generated by the
-x87 FPU and the masked response of the x87 FPU when these conditions are detected. Intel® 64 and IA-32 Archi-
-
-8-26 Vol. 1
-
- PROGRAMMING WITH THE X87 FPU
-
-tectures Software Developer’s Manual, Volumes 2A & 2B, list the floating-point exceptions that can be signaled for
+x87 FPU and the masked response of the x87 FPU when these conditions are detected. Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 2A & 2B, list the floating-point exceptions that can be signaled for
each floating-point instruction.
See Section 4.9.2, “Floating-Point Exception Priority,” for a description of the rules for exception precedence when
more than one floating-point exception condition is detected for an instruction.
@@ -20436,6 +20442,10 @@ value (tag value of 10).
Stack underflow — An instruction references an empty x87 FPU register as a source operand, including
attempting to write the contents of an empty register to memory. An empty register has a tag value of 11.
+8-26 Vol. 1
+
+ PROGRAMMING WITH THE X87 FPU
+
NOTES
The term stack overflow originates from the situation where the program has loaded (pushed) eight
values from memory onto the x87 FPU register stack and the next value pushed on the stack
@@ -20458,15 +20468,10 @@ Invalid Arithmetic Operand Exception (#IA)
The x87 FPU is able to detect a variety of invalid arithmetic operations that can be coded in a program. These operations are listed in Table 8-10. (This list includes the invalid operations defined in IEEE Standard 754.)
When the x87 FPU detects an invalid arithmetic operand, it sets the IE flag (bit 0) in the x87 FPU status word to 1.
-If the invalid-operation exception is masked, the x87 FPU then returns an indefinite value or QNaN to the destinaVol. 1 8-27
-
- PROGRAMMING WITH THE X87 FPU
-
-tion operand and/or sets the floating-point condition codes as shown in Table 8-10. If the invalid-operation exception is not masked, a software exception handler is invoked (see Section 8.7, “Handling x87 FPU Exceptions in
+If the invalid-operation exception is masked, the x87 FPU then returns an indefinite value or QNaN to the destination operand and/or sets the floating-point condition codes as shown in Table 8-10. If the invalid-operation exception is not masked, a software exception handler is invoked (see Section 8.7, “Handling x87 FPU Exceptions in
Software”) and the top-of-stack pointer (TOP) and source operands remain unchanged.
-Table 8-10. Invalid Arithmetic Operations and the
-Masked Responses to Them
+Table 8-10. Invalid Arithmetic Operations and the Masked Responses to Them
Condition
Masked Response
@@ -20530,6 +20535,11 @@ format.
Store packed BCD integer indefinite value in the destination
operand.
+Vol. 1 8-27
+
+ PROGRAMMING WITH THE X87 FPU
+
+Table 8-10. Invalid Arithmetic Operations and the Masked Responses to Them (Contd.)
FIST/FISTP: Converted value exceeds representable integer range
of the destination operand, or source value is an SNaN, QNaN, ±∞,
or in an unsupported format.
@@ -20562,10 +20572,6 @@ If an attempt is made to load a denormal single-precision or double-precision fl
FPU register. (If the denormal value being loaded is a double extended-precision floating-point value, the
denormal-operand exception is not reported.)
-8-28 Vol. 1
-
- PROGRAMMING WITH THE X87 FPU
-
The flag (DE) for this exception is bit 1 of the x87 FPU status word, and the mask bit (DM) is bit 1 of the x87 FPU
control word.
When a denormal-operand exception occurs and the exception is masked, the x87 FPU sets the DE flag, then
@@ -20605,12 +20611,17 @@ FYL2X instruction.
Returns an ∞ signed with the opposite sign of the non-zero operand to the destination
operand.
-FXTRACT instruction.
+8-28 Vol. 1
-ST(1) is set to –∞; ST(0) is set to 0 with the same sign as the source operand.
+ PROGRAMMING WITH THE X87 FPU
+
+Table 8-11. Divide-By-Zero Conditions and the Masked Responses to Them
+FXTRACT instruction.
8.5.4
+ST(1) is set to –∞; ST(0) is set to 0 with the same sign as the source operand.
+
Numeric Overflow Exception (#O)
The x87 FPU reports a floating-point numeric overflow exception (#O) whenever the rounded result of an arithmetic instruction exceeds the largest allowable finite value that will fit into the floating-point format of the destination operand. (See Section 4.9.1.4, “Numeric Overflow Exception (#O),” for additional information about the
@@ -20633,10 +20644,6 @@ masked, depends on whether the instruction is supposed to store the result in me
Destination is a memory location — The OE flag is set and a software exception handler is invoked (see
Section 8.7, “Handling x87 FPU Exceptions in Software”). The top-of-stack pointer (TOP) and source and
destination operands remain unchanged. Because the data in the stack is in double extended-precision format,
-Vol. 1 8-29
-
- PROGRAMMING WITH THE X87 FPU
-
the exception handler has the option either of re-executing the store instruction after proper adjustment of the
operand or of rounding the significand on the stack to the destination's precision as the standard requires. The
exception handler should ultimately store a value into the destination location in memory if the program is to
@@ -20668,6 +20675,11 @@ Like numeric overflow, numeric underflow can occur on arithmetic operations wher
FPU data register. It can also occur on store floating-point operations (with the FST and FSTP instructions), where
a within-range value in a data register is stored in memory in the smaller single-precision or double-precision
floating-point formats. A numeric underflow exception cannot occur when storing values in an integer or BCD
+
+Vol. 1 8-29
+
+ PROGRAMMING WITH THE X87 FPU
+
integer format, because a value with magnitude less than 1 is always rounded to an integral value of 0 or 1,
depending on the rounding mode in effect.
The flag (UE) for the numeric-underflow exception is bit 4 of the x87 FPU status word, and the mask bit (UM) is bit
@@ -20698,11 +20710,6 @@ cleared if the result was rounded toward 0. After the result is stored, the UE f
handler is invoked. The scaling bias value 24,576 is the same as is used for the overflow exception and has the
same effect, which is to translate the result as nearly as possible to the middle of the double extended-precision
floating-point exponent range.
-
-8-30 Vol. 1
-
- PROGRAMMING WITH THE X87 FPU
-
When using the FSCALE instruction, massive underflow can occur, where the magnitude of the result is too
small to be represented, even with a bias-adjusted exponent. Here, if underflow occurs again after the result
has been biased, a properly signed 0 is stored in the destination operand.
@@ -20734,6 +20741,10 @@ are set and the result is stored as described for the overflow or underflow exce
“Numeric Overflow Exception (#O),” or Section 8.5.5, “Numeric Underflow Exception (#U)”). If the inexact
result exception is unmasked, the x87 FPU also invokes a software exception handler.
+8-30 Vol. 1
+
+ PROGRAMMING WITH THE X87 FPU
+
If an inexact result occurs in conjunction with unmasked overflow or underflow and the destination operand is
@@ -20762,14 +20773,10 @@ instruction or a WAIT/FWAIT instruction in the instruction stream, the processor
status word for pending floating-point exceptions. If floating-point exceptions are pending, the x87 FPU makes an
implicit call (traps) to the floating-point software exception handler. The exception handler can then execute
recovery procedures for selected or all floating-point exceptions.
-
-Vol. 1 8-31
-
- PROGRAMMING WITH THE X87 FPU
-
Synchronization problems occur in the time between the moment when the exception is signaled and when it is
-actually handled. Because of concurrent execution, integer or system instructions can be executed during this time.
-It is thus possible for the source or destination operands for a floating-point instruction that faulted to be overwritten in memory, making it impossible for the exception handler to analyze or recover from the exception.
+actually handled. Because of concurrent execution, integer or system instructions can be executed during this
+time. It is thus possible for the source or destination operands for a floating-point instruction that faulted to be
+overwritten in memory, making it impossible for the exception handler to analyze or recover from the exception.
To solve this problem, an exception synchronizing instruction (either a floating-point instruction or a WAIT/FWAIT
instruction) can be placed immediately after any floating-point instruction that might present a situation where
state information pertaining to a floating-point exception might be lost or corrupted. Floating-point instructions
@@ -20784,8 +20791,7 @@ FSQRT
;Subsequent floating-point instruction
In this example, the INC instruction modifies the source operand of the floating-point instruction, FILD. If an
-exception is signaled during the execution of the FILD instruction, the INC instruction would be allowed to overwrite
-the value stored in the COUNT memory location before the floating-point exception handler is called. With the
+exception is signaled during the execution of the FILD instruction, the INC instruction would be allowed to overwrite the value stored in the COUNT memory location before the floating-point exception handler is called. With the
COUNT variable modified, the floating-point exception handler would not be able to recover from the error.
Rearranging the instructions, as follows, so that the FSQRT instruction follows the FILD instruction, synchronizes
floating-point exception handling and eliminates the possibility of the COUNT variable being overwritten before the
@@ -20802,6 +20808,11 @@ INC COUNT
The FSQRT instruction does not require any synchronization, because the results of this instruction are stored in
the x87 FPU data registers and will remain there, undisturbed, until the next floating-point or WAIT/FWAIT instruction is executed. To absolutely insure that any exceptions emanating from the FSQRT instruction are handled (for
example, prior to a procedure call), a WAIT instruction can be placed directly after the FSQRT instruction.
+
+Vol. 1 8-31
+
+ PROGRAMMING WITH THE X87 FPU
+
Note that some floating-point instructions (non-waiting instructions) do not check for pending unmasked exceptions (see Section 8.3.11, “x87 FPU Control Instructions”). They include the FNINIT, FNSTENV, FNSAVE, FNSTSW,
FNSTCW, and FNCLEX instructions. When an FNINIT, FNSTENV, FNSAVE, or FNCLEX instruction is executed, all
pending exceptions are essentially lost (either the x87 FPU status register is cleared or all exceptions are masked).
@@ -20831,10 +20842,6 @@ exception vector 16), immediately before execution of any of the following instr
The next floating-point instruction, unless it is one of the non-waiting instructions (FNINIT, FNCLEX, FNSTSW,
FNSTCW, FNSTENV, and FNSAVE).
-8-32 Vol. 1
-
- PROGRAMMING WITH THE X87 FPU
-
@@ -20867,6 +20874,11 @@ processor freezes just before executing the next WAIT instruction, waiting float
instruction. Whether the FERR# pin was asserted at the preceding floating-point instruction or is just now being
asserted, the freezing of the processor assures that the x87 FPU exception handler will be invoked before the
new floating-point (or MMX) instruction gets executed.
+
+8-32 Vol. 1
+
+ PROGRAMMING WITH THE X87 FPU
+
4. The FERR# pin is connected through external hardware to IRQ13 of a cascaded, programmable interrupt
controller (PIC). When the FERR# pin is asserted, the PIC is programmed to generate an interrupt 75H.
5. The PIC asserts the INTR pin on the processor to signal the interrupt 75H.
@@ -20890,22 +20902,21 @@ Section 4.9.3, “Typical Actions of a Floating-Point Exception Handler,” show
floating-point exception handler. The state of the x87 FPU can be saved with the FSTENV/FNSTENV or
FSAVE/FNSAVE instructions (see Section 8.1.10, “Saving the x87 FPU’s State with FSTENV/FNSTENV and
FSAVE/FNSAVE”).
-
-Vol. 1 8-33
-
- PROGRAMMING WITH THE X87 FPU
-
If the faulting floating-point instruction is followed by one or more non-floating-point instructions, it may not be
useful to re-execute the faulting instruction. See Section 8.6, “x87 FPU Exception Synchronization,” for more information on synchronizing floating-point exceptions.
In cases where the handler needs to restart program execution with the faulting instruction, the IRET instruction
cannot be used directly. The reason for this is that because the exception is not generated until the next floatingpoint or WAIT/FWAIT instruction following the faulting floating-point instruction, the return instruction pointer on
the stack may not point to the faulting instruction. To restart program execution at the faulting instruction, the
-exception handler must obtain a pointer to the instruction from the saved x87 FPU state information, load it into the
-return instruction pointer location on the stack, and then execute the IRET instruction.
+exception handler must obtain a pointer to the instruction from the saved x87 FPU state information, load it into
+the return instruction pointer location on the stack, and then execute the IRET instruction.
See Section D.3.4, “x87 FPU Exception Handling Examples,” for general examples of floating-point exception
handlers and for specific examples of how to write a floating-point exception handler when using the MS-DOS
compatibility mode.
+Vol. 1 8-33
+
+ PROGRAMMING WITH THE X87 FPU
+
8-34 Vol. 1
CHAPTER 9
@@ -29152,7 +29163,7 @@ Opmask state.
As noted in Section 13.2, CPUID.(EAX=0DH,ECX=5):EBX enumerates the offset (in bytes, from the base of the
XSAVE area) of the section of the extended region of the XSAVE area used for opmask state (when the standard
format of the extended region is used). CPUID.(EAX=0DH,ECX=5):EAX enumerates the size (in bytes)
-required for opmask state. The opmask section is used for the 8 64-bit bound registers k0–k7, with
+required for opmask state. The opmask section is used for the 8 64-bit opmask registers k0–k7, with
bytes 8i+7:8i being used for ki.
@@ -29362,7 +29373,7 @@ is 0. An execution of XRSTOR or XRSTORS outside 64-bit mode does not update ZMM8
Section 13.13.)
PKRU state. PKRU state is in its initial configuration if the value of the PKRU is 0.
-HDC state. HDC state is in its initial configuration if the value of the IA32_PM_CTL1 MSR is 0.
+HDC state. HDC state is in its initial configuration if the value of the IA32_PM_CTL1 MSR is 1.
13.7
@@ -29707,9 +29718,9 @@ operation determined by instruction prefixes. See Section 13.13 for details rega
accesses.
Execution of XSAVEC performs the init optimization to reduce the amount of data written to memory. If
XINUSE[i] = 0, state component i is not saved to the XSAVE area (even if RFBM[i] = 1). However, if RFBM[1] = 1
-and MXCSR does not have the value 1F80H, XSAVEC writes saves all of state component 1 (SSE — including the
-XMM registers) even if XINUSE[1] = 0. Unlike the XSAVE instruction, RFBM[2] does not determine whether
-XSAVEC saves MXCSR and MXCSR_MASK.
+and MXCSR does not have the value 1F80H, XSAVEC saves all of state component 1 (SSE — including the XMM
+registers) even if XINUSE[1] = 0. Unlike the XSAVE instruction, RFBM[2] does not determine whether XSAVEC
+saves MXCSR and MXCSR_MASK.
13.11
@@ -29768,8 +29779,8 @@ operation determined by instruction prefixes; in particular, see Section 13.5.6
state by XSAVES. See Section 13.13 for details regarding faults caused by memory accesses.
Execution of XSAVES performs the init optimization to reduce the amount of data written to memory. If
XINUSE[i] = 0, state component i is not saved to the XSAVE area (even if RFBM[i] = 1). However, if RFBM[1] = 1
-and MXCSR does not have the value 1F80H, XSAVES writes saves all of state component 1 (SSE — including the
-XMM registers) even if XINUSE[1] = 0.
+and MXCSR does not have the value 1F80H, XSAVES saves all of state component 1 (SSE — including the XMM
+registers) even if XINUSE[1] = 0.
Like XSAVEOPT, XSAVES may perform the modified optimization. Each execution of XRSTOR and XRSTORS establishes XRSTOR_INFO as a 4-tuple w,x,y,z (see Section 13.8.3 and Section 13.12). Execution of XSAVES uses the
modified optimization only if the following all hold:
@@ -35465,11 +35476,11 @@ Using dword indices specified in vm32y, gather single-precision FP values from m
conditioned on mask specified by ymm2. Conditionally gathered elements are merged into
ymm1.
-VGATHERQPS ymm1, vm64y, ymm2
+VGATHERQPS xmm1, vm64y, xmm2
Using qword indices specified in vm64y, gather single-precision FP values from memory
-conditioned on mask specified by ymm2. Conditionally gathered elements are merged into
-ymm1.
+conditioned on mask specified by xmm2. Conditionally gathered elements are merged into
+xmm1.
VGATHERDQ xmm1, vm32x, xmm2
@@ -45098,8 +45109,8 @@ Basic Architecture, Order Number 253665; Instruction Set Reference A-Z, Order Nu
System Programming Guide, Order Number 325384; Model-Specific Registers, Order Number
335592. Refer to all four volumes when evaluating your design needs.
-Order Number: 325383-067US
-May 2018
+Order Number: 325383-068US
+November 2018
Intel technologies features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Learn
more at intel.com, or from the OEM or retailer.
@@ -45338,271 +45349,270 @@ Operand Encoding Column in the Instruction Summary Table . . . . . . . . . . . .
3.1.1.5
64/32-bit Mode Column in the Instruction Summary Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-8
3.1.1.6
-CPUID Support Column in the Instruction Summary Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10
+CPUID Support Column in the Instruction Summary Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-9
3.1.1.7
-Description Column in the Instruction Summary Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10
+Description Column in the Instruction Summary Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-9
3.1.1.8
-Description Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10
+Description Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-9
iv Vol. 2A
CONTENTS
PAGE
3.1.1.9
-Operation Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-10
+Operation Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9
3.1.1.10
-Intel® C/C++ Compiler Intrinsics Equivalents Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-13
+Intel® C/C++ Compiler Intrinsics Equivalents Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-12
3.1.1.11
-Flags Affected Section. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-15
+Flags Affected Section. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-14
3.1.1.12
-FPU Flags Affected Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-15
+FPU Flags Affected Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-14
3.1.1.13
-Protected Mode Exceptions Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-15
+Protected Mode Exceptions Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-14
3.1.1.14
-Real-Address Mode Exceptions Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-16
+Real-Address Mode Exceptions Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-15
3.1.1.15
-Virtual-8086 Mode Exceptions Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-16
+Virtual-8086 Mode Exceptions Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-15
3.1.1.16
-Floating-Point Exceptions Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-17
+Floating-Point Exceptions Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-16
3.1.1.17
-SIMD Floating-Point Exceptions Section. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-17
+SIMD Floating-Point Exceptions Section. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-16
3.1.1.18
-Compatibility Mode Exceptions Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-17
+Compatibility Mode Exceptions Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-16
3.1.1.19
-64-Bit Mode Exceptions Section. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-17
+64-Bit Mode Exceptions Section. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-16
3.2
-INSTRUCTIONS (A-L) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-18
-AAA—ASCII Adjust After Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-19
-AAD—ASCII Adjust AX Before Division . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-21
-AAM—ASCII Adjust AX After Multiply . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-23
-AAS—ASCII Adjust AL After Subtraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-25
-ADC—Add with Carry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-27
-ADCX — Unsigned Integer Addition of Two Operands with Carry Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-30
-ADD—Add. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-32
-ADDPD—Add Packed Double-Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-34
-ADDPS—Add Packed Single-Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-37
-ADDSD—Add Scalar Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-40
-ADDSS—Add Scalar Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-42
-ADDSUBPD—Packed Double-FP Add/Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-44
-ADDSUBPS—Packed Single-FP Add/Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-46
-ADOX — Unsigned Integer Addition of Two Operands with Overflow Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-49
-AESDEC—Perform One Round of an AES Decryption Flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-51
-AESDECLAST—Perform Last Round of an AES Decryption Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-53
-AESENC—Perform One Round of an AES Encryption Flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-55
-AESENCLAST—Perform Last Round of an AES Encryption Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-57
-AESIMC—Perform the AES InvMixColumn Transformation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-59
-AESKEYGENASSIST—AES Round Key Generation Assist. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-60
-AND—Logical AND . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-62
-ANDN — Logical AND NOT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-64
-ANDPD—Bitwise Logical AND of Packed Double Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-65
-ANDPS—Bitwise Logical AND of Packed Single Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-68
-ANDNPD—Bitwise Logical AND NOT of Packed Double Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-71
-ANDNPS—Bitwise Logical AND NOT of Packed Single Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-74
-ARPL—Adjust RPL Field of Segment Selector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-77
+INSTRUCTIONS (A-L) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-17
+AAA—ASCII Adjust After Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-18
+AAD—ASCII Adjust AX Before Division . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-20
+AAM—ASCII Adjust AX After Multiply . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-22
+AAS—ASCII Adjust AL After Subtraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-24
+ADC—Add with Carry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-26
+ADCX — Unsigned Integer Addition of Two Operands with Carry Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-29
+ADD—Add. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-31
+ADDPD—Add Packed Double-Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-33
+ADDPS—Add Packed Single-Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-36
+ADDSD—Add Scalar Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-39
+ADDSS—Add Scalar Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-41
+ADDSUBPD—Packed Double-FP Add/Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-43
+ADDSUBPS—Packed Single-FP Add/Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-45
+ADOX — Unsigned Integer Addition of Two Operands with Overflow Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-48
+AESDEC—Perform One Round of an AES Decryption Flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-50
+AESDECLAST—Perform Last Round of an AES Decryption Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-52
+AESENC—Perform One Round of an AES Encryption Flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-54
+AESENCLAST—Perform Last Round of an AES Encryption Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-56
+AESIMC—Perform the AES InvMixColumn Transformation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-58
+AESKEYGENASSIST—AES Round Key Generation Assist. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-59
+AND—Logical AND . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-61
+ANDN — Logical AND NOT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-63
+ANDPD—Bitwise Logical AND of Packed Double Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-64
+ANDPS—Bitwise Logical AND of Packed Single Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-67
+ANDNPD—Bitwise Logical AND NOT of Packed Double Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-70
+ANDNPS—Bitwise Logical AND NOT of Packed Single Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-73
+ARPL—Adjust RPL Field of Segment Selector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-76
+BEXTR — Bit Field Extract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-78
BLENDPD — Blend Packed Double Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-79
-BEXTR — Bit Field Extract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-81
-BLENDPS — Blend Packed Single Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-82
-BLENDVPD — Variable Blend Packed Double Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-84
-BLENDVPS — Variable Blend Packed Single Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-86
-BLSI — Extract Lowest Set Isolated Bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-89
-BLSMSK — Get Mask Up to Lowest Set Bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-90
-BLSR — Reset Lowest Set Bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-91
-BNDCL—Check Lower Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-92
-BNDCU/BNDCN—Check Upper Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-94
-BNDLDX—Load Extended Bounds Using Address Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-96
-BNDMK—Make Bounds. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-99
-BNDMOV—Move Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-101
-BNDSTX—Store Extended Bounds Using Address Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-104
-BOUND—Check Array Index Against Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-107
-BSF—Bit Scan Forward . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-109
-BSR—Bit Scan Reverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-111
-BSWAP—Byte Swap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-113
-BT—Bit Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-114
-BTC—Bit Test and Complement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-116
+BLENDPS — Blend Packed Single Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-81
+BLENDVPD — Variable Blend Packed Double Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-83
+BLENDVPS — Variable Blend Packed Single Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-85
+BLSI — Extract Lowest Set Isolated Bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-88
+BLSMSK — Get Mask Up to Lowest Set Bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-89
+BLSR — Reset Lowest Set Bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-90
+BNDCL—Check Lower Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-91
+BNDCU/BNDCN—Check Upper Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-93
+BNDLDX—Load Extended Bounds Using Address Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-95
+BNDMK—Make Bounds. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-98
+BNDMOV—Move Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-100
+BNDSTX—Store Extended Bounds Using Address Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-103
+BOUND—Check Array Index Against Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-106
+BSF—Bit Scan Forward . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-108
+BSR—Bit Scan Reverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-110
+BSWAP—Byte Swap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-112
+BT—Bit Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-113
+BTC—Bit Test and Complement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-115
Vol. 2A v
CONTENTS
PAGE
-BTR—Bit Test and Reset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-118
-BTS—Bit Test and Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-120
-BZHI — Zero High Bits Starting with Specified Bit Position . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-122
-CALL—Call Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-123
-CBW/CWDE/CDQE—Convert Byte to Word/Convert Word to Doubleword/Convert Doubleword to Quadword . . . . . . .3-136
-CLAC—Clear AC Flag in EFLAGS Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-137
-CLC—Clear Carry Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-138
-CLD—Clear Direction Flag. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-139
-CLFLUSH—Flush Cache Line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-140
-CLFLUSHOPT—Flush Cache Line Optimized . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-142
-CLI — Clear Interrupt Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-144
-CLTS—Clear Task-Switched Flag in CR0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-146
-CLWB—Cache Line Write Back . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-147
-CMC—Complement Carry Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-149
-CMOVcc—Conditional Move. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-150
-CMP—Compare Two Operands. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-154
-CMPPD—Compare Packed Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-156
-CMPPS—Compare Packed Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-163
-CMPS/CMPSB/CMPSW/CMPSD/CMPSQ—Compare String Operands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-170
-CMPSD—Compare Scalar Double-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-174
-CMPSS—Compare Scalar Single-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-178
-CMPXCHG—Compare and Exchange . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-182
-CMPXCHG8B/CMPXCHG16B—Compare and Exchange Bytes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-184
-COMISD—Compare Scalar Ordered Double-Precision Floating-Point Values and Set EFLAGS . . . . . . . . . . . . . . . . . . . . . . .3-187
-COMISS—Compare Scalar Ordered Single-Precision Floating-Point Values and Set EFLAGS . . . . . . . . . . . . . . . . . . . . . . . .3-189
-CPUID—CPU Identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-191
-CRC32 — Accumulate CRC32 Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-228
-CVTDQ2PD—Convert Packed Doubleword Integers to Packed Double-Precision Floating-Point Values . . . . . . . . . . . . .3-231
-CVTDQ2PS—Convert Packed Doubleword Integers to Packed Single-Precision Floating-Point Values . . . . . . . . . . . . . .3-235
-CVTPD2DQ—Convert Packed Double-Precision Floating-Point Values to Packed Doubleword Integers . . . . . . . . . . . . .3-238
-CVTPD2PI—Convert Packed Double-Precision FP Values to Packed Dword Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-242
+BTR—Bit Test and Reset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-117
+BTS—Bit Test and Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-119
+BZHI — Zero High Bits Starting with Specified Bit Position . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-121
+CALL—Call Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-122
+CBW/CWDE/CDQE—Convert Byte to Word/Convert Word to Doubleword/Convert Doubleword to Quadword . . . . . . .3-135
+CLAC—Clear AC Flag in EFLAGS Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-136
+CLC—Clear Carry Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-137
+CLD—Clear Direction Flag. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-138
+CLFLUSH—Flush Cache Line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-139
+CLFLUSHOPT—Flush Cache Line Optimized . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-141
+CLI — Clear Interrupt Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-143
+CLTS—Clear Task-Switched Flag in CR0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-145
+CLWB—Cache Line Write Back . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-146
+CMC—Complement Carry Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-148
+CMOVcc—Conditional Move. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-149
+CMP—Compare Two Operands. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-153
+CMPPD—Compare Packed Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-155
+CMPPS—Compare Packed Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-162
+CMPS/CMPSB/CMPSW/CMPSD/CMPSQ—Compare String Operands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-169
+CMPSD—Compare Scalar Double-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-173
+CMPSS—Compare Scalar Single-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-177
+CMPXCHG—Compare and Exchange . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-181
+CMPXCHG8B/CMPXCHG16B—Compare and Exchange Bytes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-183
+COMISD—Compare Scalar Ordered Double-Precision Floating-Point Values and Set EFLAGS . . . . . . . . . . . . . . . . . . . . . . .3-186
+COMISS—Compare Scalar Ordered Single-Precision Floating-Point Values and Set EFLAGS . . . . . . . . . . . . . . . . . . . . . . . .3-188
+CPUID—CPU Identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-190
+CRC32 — Accumulate CRC32 Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-229
+CVTDQ2PD—Convert Packed Doubleword Integers to Packed Double-Precision Floating-Point Values . . . . . . . . . . . . .3-232
+CVTDQ2PS—Convert Packed Doubleword Integers to Packed Single-Precision Floating-Point Values . . . . . . . . . . . . . .3-236
+CVTPD2DQ—Convert Packed Double-Precision Floating-Point Values to Packed Doubleword Integers . . . . . . . . . . . . .3-239
+CVTPD2PI—Convert Packed Double-Precision FP Values to Packed Dword Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-243
CVTPD2PS—Convert Packed Double-Precision Floating-Point Values to Packed Single-Precision Floating-Point
-Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-243
-CVTPI2PD—Convert Packed Dword Integers to Packed Double-Precision FP Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-247
-CVTPI2PS—Convert Packed Dword Integers to Packed Single-Precision FP Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-248
-CVTPS2DQ—Convert Packed Single-Precision Floating-Point Values to Packed Signed Doubleword Integer
-Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-249
+Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-244
+CVTPI2PD—Convert Packed Dword Integers to Packed Double-Precision FP Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-248
+CVTPI2PS—Convert Packed Dword Integers to Packed Single-Precision FP Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-249
+CVTPS2DQ—Convert Packed Single-Precision Floating-Point Values to Packed Signed Doubleword Integer Values .3-250
CVTPS2PD—Convert Packed Single-Precision Floating-Point Values to Packed Double-Precision Floating-Point
-Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-252
-CVTPS2PI—Convert Packed Single-Precision FP Values to Packed Dword Integers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-255
-CVTSD2SI—Convert Scalar Double-Precision Floating-Point Value to Doubleword Integer . . . . . . . . . . . . . . . . . . . . . . . . .3-256
-CVTSD2SS—Convert Scalar Double-Precision Floating-Point Value to Scalar Single-Precision Floating-Point Value. .3-258
-CVTSI2SD—Convert Doubleword Integer to Scalar Double-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . .3-260
-CVTSI2SS—Convert Doubleword Integer to Scalar Single-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . .3-262
-CVTSS2SD—Convert Scalar Single-Precision Floating-Point Value to Scalar Double-Precision Floating-Point Value. .3-264
-CVTSS2SI—Convert Scalar Single-Precision Floating-Point Value to Doubleword Integer . . . . . . . . . . . . . . . . . . . . . . . . . .3-266
+Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-253
+CVTPS2PI—Convert Packed Single-Precision FP Values to Packed Dword Integers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-256
+CVTSD2SI—Convert Scalar Double-Precision Floating-Point Value to Doubleword Integer . . . . . . . . . . . . . . . . . . . . . . . . .3-257
+CVTSD2SS—Convert Scalar Double-Precision Floating-Point Value to Scalar Single-Precision Floating-Point Value. .3-259
+CVTSI2SD—Convert Doubleword Integer to Scalar Double-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . .3-261
+CVTSI2SS—Convert Doubleword Integer to Scalar Single-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . .3-263
+CVTSS2SD—Convert Scalar Single-Precision Floating-Point Value to Scalar Double-Precision Floating-Point Value. .3-265
+CVTSS2SI—Convert Scalar Single-Precision Floating-Point Value to Doubleword Integer . . . . . . . . . . . . . . . . . . . . . . . . . .3-267
CVTTPD2DQ—Convert with Truncation Packed Double-Precision Floating-Point Values to Packed Doubleword
-Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-268
-CVTTPD2PI—Convert with Truncation Packed Double-Precision FP Values to Packed Dword Integers . . . . . . . . . . . . .3-272
+Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-269
+CVTTPD2PI—Convert with Truncation Packed Double-Precision FP Values to Packed Dword Integers . . . . . . . . . . . . .3-273
CVTTPS2DQ—Convert with Truncation Packed Single-Precision Floating-Point Values to Packed Signed Doubleword
-Integer Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-273
-CVTTPS2PI—Convert with Truncation Packed Single-Precision FP Values to Packed Dword Integers . . . . . . . . . . . . . .3-276
-CVTTSD2SI—Convert with Truncation Scalar Double-Precision Floating-Point Value to Signed Integer . . . . . . . . . . . . .3-277
-CVTTSS2SI—Convert with Truncation Scalar Single-Precision Floating-Point Value to Integer . . . . . . . . . . . . . . . . . . . . .3-279
-CWD/CDQ/CQO—Convert Word to Doubleword/Convert Doubleword to Quadword. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-281
-DAA—Decimal Adjust AL after Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-282
-DAS—Decimal Adjust AL after Subtraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-284
-DEC—Decrement by 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-286
-DIV—Unsigned Divide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-288
+Integer Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-274
+CVTTPS2PI—Convert with Truncation Packed Single-Precision FP Values to Packed Dword Integers . . . . . . . . . . . . . .3-277
+CVTTSD2SI—Convert with Truncation Scalar Double-Precision Floating-Point Value to Signed Integer . . . . . . . . . . . . .3-278
+CVTTSS2SI—Convert with Truncation Scalar Single-Precision Floating-Point Value to Integer . . . . . . . . . . . . . . . . . . . . .3-280
+CWD/CDQ/CQO—Convert Word to Doubleword/Convert Doubleword to Quadword. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-282
+DAA—Decimal Adjust AL after Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-283
+DAS—Decimal Adjust AL after Subtraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-285
+DEC—Decrement by 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-287
+DIV—Unsigned Divide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-289
+DIVPD—Divide Packed Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-292
vi Vol. 2A
CONTENTS
PAGE
-DIVPD—Divide Packed Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-291
-DIVPS—Divide Packed Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-294
-DIVSD—Divide Scalar Double-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-297
-DIVSS—Divide Scalar Single-Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-299
-DPPD — Dot Product of Packed Double Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-301
-DPPS — Dot Product of Packed Single Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-303
-EMMS—Empty MMX Technology State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-306
-ENTER—Make Stack Frame for Procedure Parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-307
-EXTRACTPS—Extract Packed Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-310
-F2XM1—Compute 2x–1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-312
-FABS—Absolute Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-314
-FADD/FADDP/FIADD—Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-315
-FBLD—Load Binary Coded Decimal. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-318
-FBSTP—Store BCD Integer and Pop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-320
-FCHS—Change Sign . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-322
-FCLEX/FNCLEX—Clear Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-324
-FCMOVcc—Floating-Point Conditional Move . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-326
-FCOM/FCOMP/FCOMPP—Compare Floating Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-328
-FCOMI/FCOMIP/ FUCOMI/FUCOMIP—Compare Floating Point Values and Set EFLAGS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-331
-FCOS— Cosine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-334
-FDECSTP—Decrement Stack-Top Pointer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-336
-FDIV/FDIVP/FIDIV—Divide. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-337
-FDIVR/FDIVRP/FIDIVR—Reverse Divide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-340
-FFREE—Free Floating-Point Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-343
-FICOM/FICOMP—Compare Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-344
-FILD—Load Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-346
-FINCSTP—Increment Stack-Top Pointer. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-348
-FINIT/FNINIT—Initialize Floating-Point Unit. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-349
-FIST/FISTP—Store Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-351
-FISTTP—Store Integer with Truncation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-354
-FLD—Load Floating Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-356
-FLD1/FLDL2T/FLDL2E/FLDPI/FLDLG2/FLDLN2/FLDZ—Load Constant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-358
-FLDCW—Load x87 FPU Control Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-360
-FLDENV—Load x87 FPU Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-362
-FMUL/FMULP/FIMUL—Multiply . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-364
-FNOP—No Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-367
-FPATAN—Partial Arctangent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-368
-FPREM—Partial Remainder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-370
-FPREM1—Partial Remainder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-372
-FPTAN—Partial Tangent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-374
-FRNDINT—Round to Integer. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-376
-FRSTOR—Restore x87 FPU State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-377
-FSAVE/FNSAVE—Store x87 FPU State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-379
-FSCALE—Scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-382
-FSIN—Sine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-384
-FSINCOS—Sine and Cosine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-386
-FSQRT—Square Root . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-388
-FST/FSTP—Store Floating Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-390
-FSTCW/FNSTCW—Store x87 FPU Control Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-392
-FSTENV/FNSTENV—Store x87 FPU Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-394
-FSTSW/FNSTSW—Store x87 FPU Status Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-396
-FSUB/FSUBP/FISUB—Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-398
-FSUBR/FSUBRP/FISUBR—Reverse Subtract. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-401
-FTST—TEST. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-404
-FUCOM/FUCOMP/FUCOMPP—Unordered Compare Floating Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-406
-FXAM—Examine Floating-Point. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-409
-FXCH—Exchange Register Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-411
-FXRSTOR—Restore x87 FPU, MMX, XMM, and MXCSR State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-413
-FXSAVE—Save x87 FPU, MMX Technology, and SSE State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-416
+DIVPS—Divide Packed Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-295
+DIVSD—Divide Scalar Double-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-298
+DIVSS—Divide Scalar Single-Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-300
+DPPD — Dot Product of Packed Double Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-302
+DPPS — Dot Product of Packed Single Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-304
+EMMS—Empty MMX Technology State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-307
+ENTER—Make Stack Frame for Procedure Parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-308
+EXTRACTPS—Extract Packed Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-311
+F2XM1—Compute 2x–1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-313
+FABS—Absolute Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-315
+FADD/FADDP/FIADD—Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-316
+FBLD—Load Binary Coded Decimal. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-319
+FBSTP—Store BCD Integer and Pop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-321
+FCHS—Change Sign . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-323
+FCLEX/FNCLEX—Clear Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-325
+FCMOVcc—Floating-Point Conditional Move . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-327
+FCOM/FCOMP/FCOMPP—Compare Floating Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-329
+FCOMI/FCOMIP/ FUCOMI/FUCOMIP—Compare Floating Point Values and Set EFLAGS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-332
+FCOS— Cosine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-335
+FDECSTP—Decrement Stack-Top Pointer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-337
+FDIV/FDIVP/FIDIV—Divide. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-338
+FDIVR/FDIVRP/FIDIVR—Reverse Divide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-341
+FFREE—Free Floating-Point Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-344
+FICOM/FICOMP—Compare Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-345
+FILD—Load Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-347
+FINCSTP—Increment Stack-Top Pointer. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-349
+FINIT/FNINIT—Initialize Floating-Point Unit. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-350
+FIST/FISTP—Store Integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-352
+FISTTP—Store Integer with Truncation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-355
+FLD—Load Floating Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-357
+FLD1/FLDL2T/FLDL2E/FLDPI/FLDLG2/FLDLN2/FLDZ—Load Constant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-359
+FLDCW—Load x87 FPU Control Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-361
+FLDENV—Load x87 FPU Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-363
+FMUL/FMULP/FIMUL—Multiply . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-365
+FNOP—No Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-368
+FPATAN—Partial Arctangent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-369
+FPREM—Partial Remainder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-371
+FPREM1—Partial Remainder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-373
+FPTAN—Partial Tangent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-375
+FRNDINT—Round to Integer. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-377
+FRSTOR—Restore x87 FPU State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-378
+FSAVE/FNSAVE—Store x87 FPU State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-380
+FSCALE—Scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-383
+FSIN—Sine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-385
+FSINCOS—Sine and Cosine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-387
+FSQRT—Square Root . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-389
+FST/FSTP—Store Floating Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-391
+FSTCW/FNSTCW—Store x87 FPU Control Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-393
+FSTENV/FNSTENV—Store x87 FPU Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-395
+FSTSW/FNSTSW—Store x87 FPU Status Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-397
+FSUB/FSUBP/FISUB—Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-399
+FSUBR/FSUBRP/FISUBR—Reverse Subtract. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-402
+FTST—TEST. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-405
+FUCOM/FUCOMP/FUCOMPP—Unordered Compare Floating Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-407
+FXAM—Examine Floating-Point. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-410
+FXCH—Exchange Register Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-412
+FXRSTOR—Restore x87 FPU, MMX, XMM, and MXCSR State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-414
+FXSAVE—Save x87 FPU, MMX Technology, and SSE State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-417
+FXTRACT—Extract Exponent and Significand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-425
Vol. 2A vii
CONTENTS
PAGE
-FXTRACT—Extract Exponent and Significand. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-424
-FYL2X—Compute y * log2x. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-426
-FYL2XP1—Compute y * log2(x +1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-428
-HADDPD—Packed Double-FP Horizontal Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-430
-HADDPS—Packed Single-FP Horizontal Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-433
-HLT—Halt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-436
-HSUBPD—Packed Double-FP Horizontal Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-437
-HSUBPS—Packed Single-FP Horizontal Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-440
-IDIV—Signed Divide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-443
-IMUL—Signed Multiply. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-446
-IN—Input from Port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-450
-INC—Increment by 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-452
-INS/INSB/INSW/INSD—Input from Port to String . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-454
-INSERTPS—Insert Scalar Single-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-457
-INT n/INTO/INT3/INT1—Call to Interrupt Procedure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-460
-INVD—Invalidate Internal Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-473
-INVLPG—Invalidate TLB Entries. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-475
-INVPCID—Invalidate Process-Context Identifier. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-477
-IRET/IRETD—Interrupt Return . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-480
-Jcc—Jump if Condition Is Met. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-487
-JMP—Jump. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-492
-KADDW/KADDB/KADDQ/KADDD—ADD Two Masks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-500
-KANDW/KANDB/KANDQ/KANDD—Bitwise Logical AND Masks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-501
-KANDNW/KANDNB/KANDNQ/KANDND—Bitwise Logical AND NOT Masks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-502
-KMOVW/KMOVB/KMOVQ/KMOVD—Move from and to Mask Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-503
-KNOTW/KNOTB/KNOTQ/KNOTD—NOT Mask Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-505
-KORW/KORB/KORQ/KORD—Bitwise Logical OR Masks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-506
-KORTESTW/KORTESTB/KORTESTQ/KORTESTD—OR Masks And Set Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-507
-KSHIFTLW/KSHIFTLB/KSHIFTLQ/KSHIFTLD—Shift Left Mask Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-509
-KSHIFTRW/KSHIFTRB/KSHIFTRQ/KSHIFTRD—Shift Right Mask Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-511
-KTESTW/KTESTB/KTESTQ/KTESTD—Packed Bit Test Masks and Set Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-513
-KUNPCKBW/KUNPCKWD/KUNPCKDQ—Unpack for Mask Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-515
-KXNORW/KXNORB/KXNORQ/KXNORD—Bitwise Logical XNOR Masks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-516
-KXORW/KXORB/KXORQ/KXORD—Bitwise Logical XOR Masks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-517
-LAHF—Load Status Flags into AH Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-518
-LAR—Load Access Rights Byte . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-519
-LDDQU—Load Unaligned Integer 128 Bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-522
-LDMXCSR—Load MXCSR Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-524
-LDS/LES/LFS/LGS/LSS—Load Far Pointer. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-525
-LEA—Load Effective Address . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-529
-LEAVE—High Level Procedure Exit. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-531
-LFENCE—Load Fence. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-533
-LGDT/LIDT—Load Global/Interrupt Descriptor Table Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-534
-LLDT—Load Local Descriptor Table Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-537
-LMSW—Load Machine Status Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-539
-LOCK—Assert LOCK# Signal Prefix. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-541
-LODS/LODSB/LODSW/LODSD/LODSQ—Load String. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-543
-LOOP/LOOPcc—Loop According to ECX Counter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-546
-LSL—Load Segment Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-548
-LTR—Load Task Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-551
-LZCNT— Count the Number of Leading Zero Bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-553
+FYL2X—Compute y * log2x. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-427
+FYL2XP1—Compute y * log2(x +1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-429
+HADDPD—Packed Double-FP Horizontal Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-431
+HADDPS—Packed Single-FP Horizontal Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-434
+HLT—Halt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-437
+HSUBPD—Packed Double-FP Horizontal Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-438
+HSUBPS—Packed Single-FP Horizontal Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-441
+IDIV—Signed Divide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-444
+IMUL—Signed Multiply. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-447
+IN—Input from Port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-451
+INC—Increment by 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-453
+INS/INSB/INSW/INSD—Input from Port to String . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-455
+INSERTPS—Insert Scalar Single-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-458
+INT n/INTO/INT3/INT1—Call to Interrupt Procedure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-461
+INVD—Invalidate Internal Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-474
+INVLPG—Invalidate TLB Entries. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-476
+INVPCID—Invalidate Process-Context Identifier. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-478
+IRET/IRETD—Interrupt Return . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-481
+Jcc—Jump if Condition Is Met. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-488
+JMP—Jump. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-493
+KADDW/KADDB/KADDQ/KADDD—ADD Two Masks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-501
+KANDW/KANDB/KANDQ/KANDD—Bitwise Logical AND Masks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-502
+KANDNW/KANDNB/KANDNQ/KANDND—Bitwise Logical AND NOT Masks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-503
+KMOVW/KMOVB/KMOVQ/KMOVD—Move from and to Mask Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-504
+KNOTW/KNOTB/KNOTQ/KNOTD—NOT Mask Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-506
+KORW/KORB/KORQ/KORD—Bitwise Logical OR Masks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-507
+KORTESTW/KORTESTB/KORTESTQ/KORTESTD—OR Masks And Set Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-508
+KSHIFTLW/KSHIFTLB/KSHIFTLQ/KSHIFTLD—Shift Left Mask Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-510
+KSHIFTRW/KSHIFTRB/KSHIFTRQ/KSHIFTRD—Shift Right Mask Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-512
+KTESTW/KTESTB/KTESTQ/KTESTD—Packed Bit Test Masks and Set Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-514
+KUNPCKBW/KUNPCKWD/KUNPCKDQ—Unpack for Mask Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-516
+KXNORW/KXNORB/KXNORQ/KXNORD—Bitwise Logical XNOR Masks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-517
+KXORW/KXORB/KXORQ/KXORD—Bitwise Logical XOR Masks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-518
+LAHF—Load Status Flags into AH Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-519
+LAR—Load Access Rights Byte . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-520
+LDDQU—Load Unaligned Integer 128 Bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-523
+LDMXCSR—Load MXCSR Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-525
+LDS/LES/LFS/LGS/LSS—Load Far Pointer. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-526
+LEA—Load Effective Address . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-530
+LEAVE—High Level Procedure Exit. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-532
+LFENCE—Load Fence. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-534
+LGDT/LIDT—Load Global/Interrupt Descriptor Table Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-535
+LLDT—Load Local Descriptor Table Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-538
+LMSW—Load Machine Status Word . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-540
+LOCK—Assert LOCK# Signal Prefix. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-542
+LODS/LODSB/LODSW/LODSD/LODSQ—Load String. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-544
+LOOP/LOOPcc—Loop According to ECX Counter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-547
+LSL—Load Segment Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-549
+LTR—Load Task Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-552
+LZCNT— Count the Number of Leading Zero Bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-554
CHAPTER 4
INSTRUCTION SET REFERENCE, M-U
4.1
@@ -45615,19 +45625,19 @@ Source Data Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Aggregation Operation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-2
4.1.4
Polarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-3
+4.1.5
+Output Selection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-4
viii Vol. 2A
CONTENTS
PAGE
-4.1.5
4.1.6
4.1.7
4.1.8
4.2
4.3
-Output Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4
Valid/Invalid Override of Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4
Summary of Im8 Control byte . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
Diagram Comparison and Aggregation Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
@@ -45686,12 +45696,12 @@ MUL—Unsigned Multiply . . . . . . . . . . . . . . . . . . . . . . . . . . . .
MULPD—Multiply Packed Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-146
MULPS—Multiply Packed Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-149
MULSD—Multiply Scalar Double-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-152
+MULSS—Multiply Scalar Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-154
Vol. 2A ix
CONTENTS
PAGE
-MULSS—Multiply Scalar Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-154
MULX — Unsigned Multiply Without Affecting Flags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-156
MWAIT—Monitor Wait . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-158
NEG—Two's Complement Negation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-161
@@ -45750,12 +45760,12 @@ PMOVSX—Packed Move with Sign Extend . . . . . . . . . . . . . . . . . . . . .
PMOVZX—Packed Move with Zero Extend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-349
PMULDQ—Multiply Packed Doubleword Integers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-358
PMULHRSW — Packed Multiply High with Round and Scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-361
+PMULHUW—Multiply Packed Unsigned Integers and Store High Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-365
x Vol. 2A
CONTENTS
PAGE
-PMULHUW—Multiply Packed Unsigned Integers and Store High Result. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-365
PMULHW—Multiply Packed Signed Integers and Store High Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-369
PMULLD/PMULLQ—Multiply Packed Integers and Store Low Result. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-373
PMULLW—Multiply Packed Signed Integers and Store Low Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-377
@@ -45799,77 +45809,77 @@ RDMSR—Read from Model Specific Register. . . . . . . . . . . . . . . . . . . .
RDPID—Read Processor ID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-530
RDPKRU—Read Protection Key Rights for User Pages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-531
RDPMC—Read Performance-Monitoring Counters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-533
-RDRAND—Read Random Number. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-537
-RDSEED—Read Random SEED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-539
-RDTSC—Read Time-Stamp Counter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-541
-RDTSCP—Read Time-Stamp Counter and Processor ID. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-543
-REP/REPE/REPZ/REPNE/REPNZ—Repeat String Operation Prefix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-545
-RET—Return from Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-549
-RORX — Rotate Right Logical Without Affecting Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-559
-ROUNDPD — Round Packed Double Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-560
-ROUNDPS — Round Packed Single Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-563
-ROUNDSD — Round Scalar Double Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-566
-ROUNDSS — Round Scalar Single Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-568
-RSM—Resume from System Management Mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-570
-RSQRTPS—Compute Reciprocals of Square Roots of Packed Single-Precision Floating-Point Values . . . . . . . . . . . . . . . 4-572
-RSQRTSS—Compute Reciprocal of Square Root of Scalar Single-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . 4-574
-SAHF—Store AH into Flags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-576
+RDRAND—Read Random Number. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-535
+RDSEED—Read Random SEED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-537
+RDTSC—Read Time-Stamp Counter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-539
+RDTSCP—Read Time-Stamp Counter and Processor ID. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-541
+REP/REPE/REPZ/REPNE/REPNZ—Repeat String Operation Prefix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-543
+RET—Return from Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-547
+RORX — Rotate Right Logical Without Affecting Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-557
+ROUNDPD — Round Packed Double Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-558
+ROUNDPS — Round Packed Single Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-561
+ROUNDSD — Round Scalar Double Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-564
+ROUNDSS — Round Scalar Single Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-566
+RSM—Resume from System Management Mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-568
+RSQRTPS—Compute Reciprocals of Square Roots of Packed Single-Precision Floating-Point Values . . . . . . . . . . . . . . . 4-570
+RSQRTSS—Compute Reciprocal of Square Root of Scalar Single-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . 4-572
+SAHF—Store AH into Flags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-574
+SAL/SAR/SHL/SHR—Shift . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-576
Vol. 2A xi
CONTENTS
PAGE
-SAL/SAR/SHL/SHR—Shift . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-578
-SARX/SHLX/SHRX — Shift Without Affecting Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-583
-SBB—Integer Subtraction with Borrow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-585
-SCAS/SCASB/SCASW/SCASD—Scan String . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-588
-SETcc—Set Byte on Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-592
-SFENCE—Store Fence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-595
-SGDT—Store Global Descriptor Table Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-596
-SHA1RNDS4—Perform Four Rounds of SHA1 Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-598
-SHA1NEXTE—Calculate SHA1 State Variable E after Four Rounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-600
-SHA1MSG1—Perform an Intermediate Calculation for the Next Four SHA1 Message Dwords . . . . . . . . . . . . . . . . . . . . .4-601
-SHA1MSG2—Perform a Final Calculation for the Next Four SHA1 Message Dwords. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-602
-SHA256RNDS2—Perform Two Rounds of SHA256 Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-603
-SHA256MSG1—Perform an Intermediate Calculation for the Next Four SHA256 Message Dwords . . . . . . . . . . . . . . . .4-605
-SHA256MSG2—Perform a Final Calculation for the Next Four SHA256 Message Dwords . . . . . . . . . . . . . . . . . . . . . . . . .4-606
-SHLD—Double Precision Shift Left . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-607
-SHRD—Double Precision Shift Right. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-610
-SHUFPD—Packed Interleave Shuffle of Pairs of Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . .4-613
-SHUFPS—Packed Interleave Shuffle of Quadruplets of Single-Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . .4-618
-SIDT—Store Interrupt Descriptor Table Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-622
-SLDT—Store Local Descriptor Table Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-624
-SMSW—Store Machine Status Word. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-626
-SQRTPD—Square Root of Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-628
-SQRTPS—Square Root of Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-631
-SQRTSD—Compute Square Root of Scalar Double-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-634
-SQRTSS—Compute Square Root of Scalar Single-Precision Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-636
-STAC—Set AC Flag in EFLAGS Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-638
-STC—Set Carry Flag. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-639
-STD—Set Direction Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-640
-STI—Set Interrupt Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-641
-STMXCSR—Store MXCSR Register State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-643
-STOS/STOSB/STOSW/STOSD/STOSQ—Store String . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-644
-STR—Store Task Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-648
-SUB—Subtract. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-650
-SUBPD—Subtract Packed Double-Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-652
-SUBPS—Subtract Packed Single-Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-655
-SUBSD—Subtract Scalar Double-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-658
-SUBSS—Subtract Scalar Single-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-660
-SWAPGS—Swap GS Base Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-662
-SYSCALL—Fast System Call . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-664
-SYSENTER—Fast System Call. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-666
-SYSEXIT—Fast Return from Fast System Call. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-669
-SYSRET—Return From Fast System Call. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-672
-TEST—Logical Compare . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-675
-TZCNT — Count the Number of Trailing Zero Bits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-677
-UCOMISD—Unordered Compare Scalar Double-Precision Floating-Point Values and Set EFLAGS . . . . . . . . . . . . . . . . . . .4-679
-UCOMISS—Unordered Compare Scalar Single-Precision Floating-Point Values and Set EFLAGS . . . . . . . . . . . . . . . . . . . .4-681
-UD—Undefined Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-683
-UNPCKHPD—Unpack and Interleave High Packed Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . .4-684
-UNPCKHPS—Unpack and Interleave High Packed Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . .4-688
-UNPCKLPD—Unpack and Interleave Low Packed Double-Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . .4-692
-UNPCKLPS—Unpack and Interleave Low Packed Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . .4-696
+SARX/SHLX/SHRX — Shift Without Affecting Flags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-581
+SBB—Integer Subtraction with Borrow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-583
+SCAS/SCASB/SCASW/SCASD—Scan String . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-586
+SETcc—Set Byte on Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-590
+SFENCE—Store Fence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-593
+SGDT—Store Global Descriptor Table Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-594
+SHA1RNDS4—Perform Four Rounds of SHA1 Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-596
+SHA1NEXTE—Calculate SHA1 State Variable E after Four Rounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-598
+SHA1MSG1—Perform an Intermediate Calculation for the Next Four SHA1 Message Dwords . . . . . . . . . . . . . . . . . . . . .4-599
+SHA1MSG2—Perform a Final Calculation for the Next Four SHA1 Message Dwords. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-600
+SHA256RNDS2—Perform Two Rounds of SHA256 Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-601
+SHA256MSG1—Perform an Intermediate Calculation for the Next Four SHA256 Message Dwords . . . . . . . . . . . . . . . .4-603
+SHA256MSG2—Perform a Final Calculation for the Next Four SHA256 Message Dwords . . . . . . . . . . . . . . . . . . . . . . . . .4-604
+SHLD—Double Precision Shift Left . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-605
+SHRD—Double Precision Shift Right. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-608
+SHUFPD—Packed Interleave Shuffle of Pairs of Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . .4-611
+SHUFPS—Packed Interleave Shuffle of Quadruplets of Single-Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . .4-616
+SIDT—Store Interrupt Descriptor Table Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-620
+SLDT—Store Local Descriptor Table Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-622
+SMSW—Store Machine Status Word. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-624
+SQRTPD—Square Root of Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-626
+SQRTPS—Square Root of Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-629
+SQRTSD—Compute Square Root of Scalar Double-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-632
+SQRTSS—Compute Square Root of Scalar Single-Precision Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-634
+STAC—Set AC Flag in EFLAGS Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-636
+STC—Set Carry Flag. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-637
+STD—Set Direction Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-638
+STI—Set Interrupt Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-639
+STMXCSR—Store MXCSR Register State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-641
+STOS/STOSB/STOSW/STOSD/STOSQ—Store String . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-642
+STR—Store Task Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-646
+SUB—Subtract. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-648
+SUBPD—Subtract Packed Double-Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-650
+SUBPS—Subtract Packed Single-Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-653
+SUBSD—Subtract Scalar Double-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-656
+SUBSS—Subtract Scalar Single-Precision Floating-Point Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-658
+SWAPGS—Swap GS Base Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-660
+SYSCALL—Fast System Call . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-662
+SYSENTER—Fast System Call. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-664
+SYSEXIT—Fast Return from Fast System Call. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-667
+SYSRET—Return From Fast System Call. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-670
+TEST—Logical Compare . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-673
+TZCNT — Count the Number of Trailing Zero Bits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-675
+UCOMISD—Unordered Compare Scalar Double-Precision Floating-Point Values and Set EFLAGS . . . . . . . . . . . . . . . . . . .4-677
+UCOMISS—Unordered Compare Scalar Single-Precision Floating-Point Values and Set EFLAGS . . . . . . . . . . . . . . . . . . . .4-679
+UD—Undefined Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-681
+UNPCKHPD—Unpack and Interleave High Packed Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . .4-682
+UNPCKHPS—Unpack and Interleave High Packed Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . .4-686
+UNPCKLPD—Unpack and Interleave Low Packed Double-Precision Floating-Point Values. . . . . . . . . . . . . . . . . . . . . . . . . .4-690
+UNPCKLPS—Unpack and Interleave Low Packed Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . .4-694
CHAPTER 5
INSTRUCTION SET REFERENCE, V-Z
5.1
@@ -45879,12 +45889,12 @@ INSTRUCTIONS (V-Z) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
VALIGND/VALIGNQ—Align Doubleword/Quadword Vectors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5
VBLENDMPD/VBLENDMPS—Blend Float64/Float32 Vectors Using an OpMask Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9
VBROADCAST—Load with Broadcast Floating-Point Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12
+VCOMPRESSPD—Store Sparse Packed Double-Precision Floating-Point Values into Dense Memory . . . . . . . . . . . . . . . . . 5-20
xii Vol. 2A
CONTENTS
PAGE
-VCOMPRESSPD—Store Sparse Packed Double-Precision Floating-Point Values into Dense Memory. . . . . . . . . . . . . . . . . 5-20
VCOMPRESSPS—Store Sparse Packed Single-Precision Floating-Point Values into Dense Memory . . . . . . . . . . . . . . . . . . 5-22
VCVTPD2QQ—Convert Packed Double-Precision Floating-Point Values to Packed Quadword Integers . . . . . . . . . . . . . . 5-24
VCVTPD2UDQ—Convert Packed Double-Precision Floating-Point Values to Packed Unsigned Doubleword Integers . 5-27
@@ -45908,8 +45918,8 @@ VCVTTPD2UQQ—Convert with Truncation Packed Double-Precision Floating-Point Val
Quadword Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-61
VCVTTPS2UDQ—Convert with Truncation Packed Single-Precision Floating-Point Values to Packed Unsigned
Doubleword Integer Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-63
-VCVTTPS2QQ—Convert with Truncation Packed Single Precision Floating-Point Values to Packed Signed Quadword
-Integer Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-65
+VCVTTPS2QQ—Convert with Truncation Packed Single Precision Floating-Point Values to Packed Signed
+Quadword Integer Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-65
VCVTTPS2UQQ—Convert with Truncation Packed Single Precision Floating-Point Values to Packed Unsigned
Quadword Integer Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-67
VCVTTSD2USI—Convert with Truncation Scalar Double-Precision Floating-Point Value to Unsigned Integer . . . . . . . . 5-69
@@ -45943,18 +45953,18 @@ Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
VFMADDSUB132PD/VFMADDSUB213PD/VFMADDSUB231PD—Fused Multiply-Alternating Add/Subtract of Packed
Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-140
VFMADDSUB132PS/VFMADDSUB213PS/VFMADDSUB231PS—Fused Multiply-Alternating Add/Subtract of Packed
+Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-150
Vol. 2A xiii
CONTENTS
PAGE
-Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-150
VFMSUBADD132PD/VFMSUBADD213PD/VFMSUBADD231PD—Fused Multiply-Alternating Subtract/Add of Packed
Double-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-159
VFMSUBADD132PS/VFMSUBADD213PS/VFMSUBADD231PS—Fused Multiply-Alternating Subtract/Add of Packed
Single-Precision Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-169
-VFMSUB132PD/VFMSUB213PD/VFMSUB231PD—Fused Multiply-Subtract of Packed Double-Precision
-Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-179
+VFMSUB132PD/VFMSUB213PD/VFMSUB231PD—Fused Multiply-Subtract of Packed Double-Precision Floating-Point
+Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-179
VFMSUB132PS/VFMSUB213PS/VFMSUB231PS—Fused Multiply-Subtract of Packed Single-Precision Floating-Point
Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-186
VFMSUB132SD/VFMSUB213SD/VFMSUB231SD—Fused Multiply-Subtract of Scalar Double-Precision Floating-Point
@@ -46007,12 +46017,12 @@ VPCMPB/VPCMPUB—Compare Packed Byte Values Into Mask . . . . . . . . . . . . .
VPCMPD/VPCMPUD—Compare Packed Integer Values into Mask. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-318
VPCMPQ/VPCMPUQ—Compare Packed Integer Values into Mask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-321
VPCMPW/VPCMPUW—Compare Packed Word Values Into Mask. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-324
+VPCOMPRESSD—Store Sparse Packed Doubleword Integer Values into Dense Memory/Register . . . . . . . . . . . . . . . . . .5-327
xiv Vol. 2A
CONTENTS
PAGE
-VPCOMPRESSD—Store Sparse Packed Doubleword Integer Values into Dense Memory/Register . . . . . . . . . . . . . . . . . . 5-327
VPCOMPRESSQ—Store Sparse Packed Quadword Integer Values into Dense Memory/Register . . . . . . . . . . . . . . . . . . . 5-329
VPCONFLICTD/Q—Detect Conflicts Within a Vector of Packed Dword/Qword Values into Dense Memory/ Register. 5-331
VPERM2F128 — Permute Floating-Point Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-334
@@ -46051,8 +46061,8 @@ VPMOVWB/VPMOVSWB/VPMOVUSWB—Down Convert Word to Byte . . . . . . . . . . . . .
VPMULTISHIFTQB – Select Packed Unaligned Bytes from Quadword Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-435
VPROLD/VPROLVD/VPROLQ/VPROLVQ—Bit Rotate Left . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-437
VPRORD/VPRORVD/VPRORQ/VPRORVQ—Bit Rotate Right. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-442
-VPSCATTERDD/VPSCATTERDQ/VPSCATTERQD/VPSCATTERQQ—Scatter Packed Dword, Packed Qword with
-Signed Dword, Signed Qword Indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-447
+VPSCATTERDD/VPSCATTERDQ/VPSCATTERQD/VPSCATTERQQ—Scatter Packed Dword, Packed Qword with Signed
+Dword, Signed Qword Indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-447
VPSLLVW/VPSLLVD/VPSLLVQ—Variable Bit Shift Left Logical . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-451
VPSRAVW/VPSRAVD/VPSRAVQ—Variable Bit Shift Right Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-456
VPSRLVW/VPSRLVD/VPSRLVQ—Variable Bit Shift Right Logical . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-461
@@ -46071,12 +46081,12 @@ VREDUCEPD—Perform Reduction Transformation on Packed Float64 Values . . . . .
VREDUCESD—Perform a Reduction Transformation on a Scalar Float64 Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-502
VREDUCEPS—Perform Reduction Transformation on Packed Float32 Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-504
VREDUCESS—Perform a Reduction Transformation on a Scalar Float32 Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-506
+VRNDSCALEPD—Round Packed Float64 Values To Include A Given Number Of Fraction Bits . . . . . . . . . . . . . . . . . . . . . . 5-508
Vol. 2A xv
CONTENTS
PAGE
-VRNDSCALEPD—Round Packed Float64 Values To Include A Given Number Of Fraction Bits . . . . . . . . . . . . . . . . . . . . . .5-508
VRNDSCALESD—Round Scalar Float64 Value To Include A Given Number Of Fraction Bits . . . . . . . . . . . . . . . . . . . . . . . .5-512
VRNDSCALEPS—Round Packed Float32 Values To Include A Given Number Of Fraction Bits . . . . . . . . . . . . . . . . . . . . . .5-514
VRNDSCALESS—Round Scalar Float32 Value To Include A Given Number Of Fraction Bits. . . . . . . . . . . . . . . . . . . . . . . . .5-517
@@ -46088,8 +46098,8 @@ VSCALEFPD—Scale Packed Float64 Values With Float64 Values. . . . . . . . . . .
VSCALEFSD—Scale Scalar Float64 Values With Float64 Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-530
VSCALEFPS—Scale Packed Float32 Values With Float32 Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-532
VSCALEFSS—Scale Scalar Float32 Value With Float32 Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-535
-VSCATTERDPS/VSCATTERDPD/VSCATTERQPS/VSCATTERQPD—Scatter Packed Single, Packed Double with
-Signed Dword and Qword Indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-537
+VSCATTERDPS/VSCATTERDPD/VSCATTERQPS/VSCATTERQPD—Scatter Packed Single, Packed Double with Signed
+Dword and Qword Indices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-537
VSHUFF32x4/VSHUFF64x2/VSHUFI32x4/VSHUFI64x2—Shuffle Packed Values at 128-bit Granularity . . . . . . . . . . .5-541
VTESTPD/VTESTPS—Packed Bit Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-546
VZEROALL—Zero All YMM Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-549
@@ -46129,13 +46139,13 @@ Detecting and Enabling SMX. . . . . . . . . . . . . . . . . . . . . . . . . . .
6.2.2
SMX Instruction Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-2
6.2.2.1
-GETSEC[CAPABILITIES]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-2
+GETSEC[CAPABILITIES]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-3
6.2.2.2
GETSEC[ENTERACCS] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-3
6.2.2.3
GETSEC[EXITAC] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-3
6.2.2.4
-GETSEC[SENTER] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-3
+GETSEC[SENTER] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-4
6.2.2.5
GETSEC[SEXIT] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-4
6.2.2.6
@@ -46145,45 +46155,54 @@ GETSEC[SMCTRL] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.2.2.8
GETSEC[WAKEUP] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-4
6.2.3
-Measured Environment and SMX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-4
+Measured Environment and SMX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-5
6.3
GETSEC LEAF FUNCTIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-5
+GETSEC[CAPABILITIES] - Report the SMX Capabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-7
xvi Vol. 2A
CONTENTS
PAGE
-GETSEC[CAPABILITIES] - Report the SMX Capabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-7
-GETSEC[ENTERACCS] - Execute Authenticated Chipset Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-10
-GETSEC[EXITAC]—Exit Authenticated Code Execution Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-18
-GETSEC[SENTER]—Enter a Measured Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-21
-GETSEC[SEXIT]—Exit Measured Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-30
-GETSEC[PARAMETERS]—Report the SMX Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-33
-GETSEC[SMCTRL]—SMX Mode Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-37
-GETSEC[WAKEUP]—Wake up sleeping processors in measured environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-40
+GETSEC[ENTERACCS] - Execute Authenticated Chipset Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
+GETSEC[EXITAC]—Exit Authenticated Code Execution Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
+GETSEC[SENTER]—Enter a Measured Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
+GETSEC[SEXIT]—Exit Measured Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
+GETSEC[PARAMETERS]—Report the SMX Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
+GETSEC[SMCTRL]—SMX Mode Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
+GETSEC[WAKEUP]—Wake up sleeping processors in measured environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
+
+6-10
+6-18
+6-21
+6-30
+6-33
+6-37
+6-40
+
CHAPTER 7
INSTRUCTION SET REFERENCE UNIQUE TO INTEL® XEON PHI™ PROCESSORS
PREFETCHWT1—Prefetch Vector Data Into Caches with Intent to Write and T1 Hint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-2
V4FMADDPS/V4FNMADDPS — Packed Single-Precision Floating-Point Fused Multiply-Add (4-iterations) . . . . . . . . . . . . .6-4
V4FMADDSS/V4FNMADDSS —Scalar Single-Precision Floating-Point Fused Multiply-Add (4-iterations) . . . . . . . . . . . . . . .6-6
-VEXP2PD—Approximation to the Exponential 2^x of Packed Double-Precision Floating-Point Values with Less Than
-2^-23 Relative Error. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-8
-VEXP2PS—Approximation to the Exponential 2^x of Packed Single-Precision Floating-Point Values with Less Than
-2^-23 Relative Error. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-10
+VEXP2PD—Approximation to the Exponential 2^x of Packed Double-Precision Floating-Point Values with Less
+Than 2^-23 Relative Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-8
+VEXP2PS—Approximation to the Exponential 2^x of Packed Single-Precision Floating-Point Values with Less
+Than 2^-23 Relative Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-10
VGATHERPF0DPS/VGATHERPF0QPS/VGATHERPF0DPD/VGATHERPF0QPD—Sparse Prefetch Packed SP/DP Data
Values with Signed Dword, Signed Qword Indices Using T0 Hint. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-12
VGATHERPF1DPS/VGATHERPF1QPS/VGATHERPF1DPD/VGATHERPF1QPD—Sparse Prefetch Packed SP/DP Data
Values with Signed Dword, Signed Qword Indices Using T1 Hint. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-14
VP4DPWSSDS — Dot Product of Signed Words with Dword Accumulation and Saturation (4-iterations) . . . . . . . . . . . . 6-16
VP4DPWSSD — Dot Product of Signed Words with Dword Accumulation (4-iterations) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-18
-VRCP28PD—Approximation to the Reciprocal of Packed Double-Precision Floating-Point Values with Less Than
-2^-28 Relative Error. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-20
-VRCP28SD—Approximation to the Reciprocal of Scalar Double-Precision Floating-Point Value with Less Than
-2^-28 Relative Error. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-22
-VRCP28PS—Approximation to the Reciprocal of Packed Single-Precision Floating-Point Values with Less Than
-2^-28 Relative Error. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-24
-VRCP28SS—Approximation to the Reciprocal of Scalar Single-Precision Floating-Point Value with Less Than
-2^-28 Relative Error. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-26
+VRCP28PD—Approximation to the Reciprocal of Packed Double-Precision Floating-Point Values with Less
+Than 2^-28 Relative Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-20
+VRCP28SD—Approximation to the Reciprocal of Scalar Double-Precision Floating-Point Value with Less
+Than 2^-28 Relative Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-22
+VRCP28PS—Approximation to the Reciprocal of Packed Single-Precision Floating-Point Values with Less
+Than 2^-28 Relative Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-24
+VRCP28SS—Approximation to the Reciprocal of Scalar Single-Precision Floating-Point Value with Less
+Than 2^-28 Relative Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-26
VRSQRT28PD—Approximation to the Reciprocal Square Root of Packed Double-Precision Floating-Point Values
with Less Than 2^-28 Relative Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-28
VRSQRT28SD—Approximation to the Reciprocal Square Root of Scalar Double-Precision Floating-Point Value
@@ -46222,12 +46241,13 @@ A.2.5
Superscripts Utilized in Opcode Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A-6
A.3
ONE, TWO, AND THREE-BYTE OPCODE MAPS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-6
+A.4
+OPCODE EXTENSIONS FOR ONE-BYTE AND TWO-BYTE OPCODES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-17
Vol. 2A xvii
CONTENTS
PAGE
-A.4
A.4.1
A.4.2
A.5
@@ -46242,7 +46262,6 @@ A.5.2.6
A.5.2.7
A.5.2.8
-OPCODE EXTENSIONS FOR ONE-BYTE AND TWO-BYTE OPCODES. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-17
Opcode Look-up Examples Using Opcode Extensions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-17
Opcode Extension Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-17
ESCAPE OPCODE INSTRUCTIONS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-20
@@ -46339,6 +46358,7 @@ C.1
SIMPLE INTRINSICS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-2
C.2
COMPOSITE INTRINSICS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-14
+
xviii Vol. 2A
CONTENTS
@@ -46416,30 +46436,30 @@ Instruction Encoding Format with VEX Prefix . . . . . . . . . . . . . . . . . .
VEX bit fields. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-15
AVX-512 Instruction Format and the EVEX Prefix. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-36
Bit Field Layout of the EVEX Prefix. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-36
-Bit Offset for BIT[RAX, 21] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12
-Memory Bit Indexing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-13
-ADDSUBPD—Packed Double-FP Add/Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-45
-ADDSUBPS—Packed Single-FP Add/Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-47
-Memory Layout of BNDMOV to/from Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-101
-Version Information Returned by CPUID in EAX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-207
-Feature Information Returned in the ECX Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-209
-Feature Information Returned in the EDX Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-211
-Determination of Support for the Processor Brand String. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-220
-Algorithm for Extracting Processor Frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-221
-CVTDQ2PD (VEX.256 encoded version) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-232
-VCVTPD2DQ (VEX.256 encoded version) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-239
-VCVTPD2PS (VEX.256 encoded version). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-244
-CVTPS2PD (VEX.256 encoded version) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-253
-VCVTTPD2DQ (VEX.256 encoded version) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-269
-HADDPD—Packed Double-FP Horizontal Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-430
-VHADDPD operation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-431
-HADDPS—Packed Single-FP Horizontal Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-434
-VHADDPS operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-434
-HSUBPD—Packed Double-FP Horizontal Subtract. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-437
-VHSUBPD operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-438
-HSUBPS—Packed Single-FP Horizontal Subtract. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-441
-VHSUBPS operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-441
-INVPCID Descriptor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-477
+Bit Offset for BIT[RAX, 21] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-11
+Memory Bit Indexing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12
+ADDSUBPD—Packed Double-FP Add/Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-44
+ADDSUBPS—Packed Single-FP Add/Subtract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-46
+Memory Layout of BNDMOV to/from Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-100
+Version Information Returned by CPUID in EAX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-208
+Feature Information Returned in the ECX Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-210
+Feature Information Returned in the EDX Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-212
+Determination of Support for the Processor Brand String. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-221
+Algorithm for Extracting Processor Frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-222
+CVTDQ2PD (VEX.256 encoded version) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-233
+VCVTPD2DQ (VEX.256 encoded version) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-240
+VCVTPD2PS (VEX.256 encoded version). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-245
+CVTPS2PD (VEX.256 encoded version) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-254
+VCVTTPD2DQ (VEX.256 encoded version) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-270
+HADDPD—Packed Double-FP Horizontal Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-431
+VHADDPD operation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-432
+HADDPS—Packed Single-FP Horizontal Add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-435
+VHADDPS operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-435
+HSUBPD—Packed Double-FP Horizontal Subtract. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-438
+VHSUBPD operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-439
+HSUBPS—Packed Single-FP Horizontal Subtract. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-442
+VHSUBPS operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-442
+INVPCID Descriptor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-478
Operation of PCMPSTRx and PCMPESTRx. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
VMOVDDUP Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-60
MOVSHDUP Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-114
@@ -46512,11 +46532,11 @@ xx Vol. 2A
256-bit VPUNPCKHDQ Instruction Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-489
PUNPCKLBW Instruction Operation Using 64-bit Operands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-499
256-bit VPUNPCKLDQ Instruction Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-499
-Bit Control Fields of Immediate Byte for ROUNDxx Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-561
-256-bit VSHUFPD Operation of Four Pairs of DP FP Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-614
-256-bit VSHUFPS Operation of Selection from Input Quadruplet and Pair-wise Interleaved Result . . . . . . . . . . . . . 4-619
-VUNPCKHPS Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-689
-VUNPCKLPS Operation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-697
+Bit Control Fields of Immediate Byte for ROUNDxx Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-559
+256-bit VSHUFPD Operation of Four Pairs of DP FP Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-612
+256-bit VSHUFPS Operation of Selection from Input Quadruplet and Pair-wise Interleaved Result . . . . . . . . . . . . . 4-617
+VUNPCKHPS Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-687
+VUNPCKLPS Operation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-695
VBROADCASTSS Operation (VEX.256 encoded version). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-14
VBROADCASTSS Operation (VEX.128-bit version) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-14
VBROADCASTSD Operation (VEX.256-bit version) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-14
@@ -46744,57 +46764,57 @@ Type E12NP Class Exception Conditions . . . . . . . . . . . . . . . . . . . . .
TYPE K20 Exception Definition (VEX-Encoded OpMask Instructions w/o Memory Arg) . . . . . . . . . . . . . . . . . . . . . . . . . . 2-67
TYPE K21 Exception Definition (VEX-Encoded OpMask Instructions Addressing Memory) . . . . . . . . . . . . . . . . . . . . . . . 2-68
Register Codes Associated With +rb, +rw, +rd, +ro. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
-Range of Bit Positions Specified by Bit Offset Operands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-13
-Standard and Non-standard Data Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-15
-Intel 64 and IA-32 General Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-16
-x87 FPU Floating-Point Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-17
-SIMD Floating-Point Exceptions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-17
-Decision Table for CLI Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-144
-Comparison Predicate for CMPPD and CMPPS Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-157
-Pseudo-Op and CMPPD Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-158
-Pseudo-Op and VCMPPD Implementation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-159
-Pseudo-Op and CMPPS Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-164
-Pseudo-Op and VCMPPS Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-165
-Pseudo-Op and CMPSD Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-175
-Pseudo-Op and VCMPSD Implementation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-175
-Pseudo-Op and CMPSS Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-179
-Pseudo-Op and VCMPSS Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-179
-Information Returned by CPUID Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-192
-Processor Type Field. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-207
-Feature Information Returned in the ECX Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-209
-More on Feature Information Returned in the EDX Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-212
-Encoding of CPUID Leaf 2 Descriptors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-214
-Processor Brand String Returned with Pentium 4 Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-221
-Mapping of Brand Indices; and Intel 64 and IA-32 Processor Brand Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-222
-DIV Action. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-288
-Results Obtained from F2XM1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-312
-Results Obtained from FABS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-314
-FADD/FADDP/FIADD Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-316
-FBSTP Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-320
-FCHS Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-322
-FCOM/FCOMP/FCOMPP Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-328
-FCOMI/FCOMIP/ FUCOMI/FUCOMIP Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-331
-FCOS Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-334
-FDIV/FDIVP/FIDIV Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-338
-FDIVR/FDIVRP/FIDIVR Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-341
-FICOM/FICOMP Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-344
-FIST/FISTP Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-351
-FISTTP Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-354
-FMUL/FMULP/FIMUL Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-365
-FPATAN Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-368
-FPREM Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-370
-FPREM1 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-372
-FPTAN Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-374
-FSCALE Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-382
-FSIN Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-384
-FSINCOS Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-386
-FSQRT Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-388
-FSUB/FSUBP/FISUB Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-399
-FSUBR/FSUBRP/FISUBR Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-402
-FTST Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-404
-FUCOM/FUCOMP/FUCOMPP Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-406
-FXAM Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-409
-Non-64-bit-Mode Layout of FXSAVE and FXRSTOR Memory Region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-416
+Range of Bit Positions Specified by Bit Offset Operands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12
+Standard and Non-standard Data Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-14
+Intel 64 and IA-32 General Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-15
+x87 FPU Floating-Point Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-16
+SIMD Floating-Point Exceptions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-16
+Decision Table for CLI Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-143
+Comparison Predicate for CMPPD and CMPPS Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-156
+Pseudo-Op and CMPPD Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-157
+Pseudo-Op and VCMPPD Implementation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-158
+Pseudo-Op and CMPPS Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-163
+Pseudo-Op and VCMPPS Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-164
+Pseudo-Op and CMPSD Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-174
+Pseudo-Op and VCMPSD Implementation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-174
+Pseudo-Op and CMPSS Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-178
+Pseudo-Op and VCMPSS Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-178
+Information Returned by CPUID Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-191
+Processor Type Field. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-208
+Feature Information Returned in the ECX Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-210
+More on Feature Information Returned in the EDX Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-213
+Encoding of CPUID Leaf 2 Descriptors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-215
+Processor Brand String Returned with Pentium 4 Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-222
+Mapping of Brand Indices; and Intel 64 and IA-32 Processor Brand Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-223
+DIV Action. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-289
+Results Obtained from F2XM1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-313
+Results Obtained from FABS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-315
+FADD/FADDP/FIADD Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-317
+FBSTP Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-321
+FCHS Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-323
+FCOM/FCOMP/FCOMPP Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-329
+FCOMI/FCOMIP/ FUCOMI/FUCOMIP Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-332
+FCOS Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-335
+FDIV/FDIVP/FIDIV Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-339
+FDIVR/FDIVRP/FIDIVR Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-342
+FICOM/FICOMP Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-345
+FIST/FISTP Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-352
+FISTTP Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-355
+FMUL/FMULP/FIMUL Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-366
+FPATAN Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-369
+FPREM Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-371
+FPREM1 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-373
+FPTAN Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-375
+FSCALE Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-383
+FSIN Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-385
+FSINCOS Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-387
+FSQRT Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-389
+FSUB/FSUBP/FISUB Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-400
+FSUBR/FSUBRP/FISUBR Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-403
+FTST Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-405
+FUCOM/FUCOMP/FUCOMPP Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-407
+FXAM Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-410
+Non-64-bit-Mode Layout of FXSAVE and FXRSTOR Memory Region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-417
CONTENTS
PAGE
@@ -46829,7 +46849,6 @@ Table 4-15.
Table 4-16.
Table 4-17.
Table 4-18.
-Table 4-19.
Table 5-1.
Table 5-2.
Table 5-3.
@@ -46858,19 +46877,20 @@ Table 6-2.
Table 6-3.
Table 6-4.
Table 6-5.
+Table 6-6.
-Field Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-417
-Recreating FSAVE Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-419
-Layout of the 64-bit-mode FXSAVE64 Map (requires REX.W = 1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-420
-Layout of the 64-bit-mode FXSAVE Map (REX.W = 0). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-421
-FYL2X Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-426
-FYL2XP1 Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-428
-IDIV Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-443
-Decision Table. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-461
-Segment and Gate Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-520
-Non-64-bit Mode LEA Operation with Address and Operand Size Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-529
-64-bit Mode LEA Operation with Address and Operand Size Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-529
-Segment and Gate Descriptor Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-549
+Field Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-418
+Recreating FSAVE Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-420
+Layout of the 64-bit-mode FXSAVE64 Map (requires REX.W = 1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-421
+Layout of the 64-bit-mode FXSAVE Map (REX.W = 0). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-422
+FYL2X Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-427
+FYL2XP1 Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-429
+IDIV Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-444
+Decision Table. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-462
+Segment and Gate Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-521
+Non-64-bit Mode LEA Operation with Address and Operand Size Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-530
+64-bit Mode LEA Operation with Address and Operand Size Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-530
+Segment and Gate Descriptor Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-550
Source Data Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-2
Aggregation Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-2
Aggregation Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-3
@@ -46886,10 +46906,9 @@ Recommended Multi-Byte Sequence of NOP Instruction . . . . . . . . . . . . . . .
PCLMULQDQ Quadword Selection of Immediate Byte . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-241
Pseudo-Op and PCLMULQDQ Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-241
Effect of POPF/POPFD on the EFLAGS Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-394
-Valid General and Special Purpose Performance Counter Index Range for RDPMC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-533
-Repeat Prefixes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-546
-Rounding Modes and Encoding of Rounding Control (RC) Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-561
-Decision Table for STI Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-641
+Repeat Prefixes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-544
+Rounding Modes and Encoding of Rounding Control (RC) Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-559
+Decision Table for STI Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-639
Low 8 columns of the 16x16 Map of VPTERNLOG Boolean Logic Operations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-2
Low 8 columns of the 16x16 Map of VPTERNLOG Boolean Logic Operations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-3
Immediate Byte Encoding for 16-bit Floating-Point Conversion Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-37
@@ -46918,12 +46937,12 @@ GETSEC Leaf Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Getsec Capability Result Encoding (EBX = 0) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-7
Register State Initialization after GETSEC[ENTERACCS]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-12
IA32_MISC_ENABLE MSR Initialization by ENTERACCS and SENTER . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-13
+Register State Initialization after GETSEC[SENTER] and GETSEC[WAKEUP] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-24
Vol. 2A xxiii
CONTENTS
PAGE
-Table 6-6.
Table 6-7.
Table 6-8.
Table 6-9.
@@ -46982,9 +47001,9 @@ Table B-17.
Table B-18.
Table B-19.
Table B-20.
+Table B-21.
xxiv Vol. 2A
-Register State Initialization after GETSEC[SENTER] and GETSEC[WAKEUP]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-24
SMX Reporting Parameters Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-33
TXT Feature Extensions Flags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-34
External Memory Types Using Parameter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-35
@@ -47043,11 +47062,11 @@ Pentium Processor Family Instruction Formats and Encodings, 64-Bit Mode . . . .
Encoding of Granularity of Data Field (gg) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-39
MMX Instruction Formats and Encodings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-39
Formats and Encodings of XSAVE/XRSTOR/XGETBV/XSETBV Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-42
+Formats and Encodings of P6 Family Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-42
CONTENTS
PAGE
-Table B-21.
Table B-22.
Table B-23.
Table B-25.
@@ -47071,7 +47090,6 @@ Table B-41.
Table C-1.
Table C-2.
-Formats and Encodings of P6 Family Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-42
Formats and Encodings of SSE Floating-Point Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-43
Formats and Encodings of SSE Integer Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-48
Encoding of Granularity of Data Field (gg) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-49
@@ -50281,43 +50299,43 @@ Opcode
Instruction mnemonic
-VEX.NDD.128.66.0F 73 /7 ib
+VEX.128.66.0F 73 /7 ib
VPSLLDQ xmm1, xmm2, imm8
-VEX.NDD.128.66.0F 73 /3 ib
+VEX.128.66.0F 73 /3 ib
VPSRLDQ xmm1, xmm2, imm8
-VEX.NDD.128.66.0F 71 /2 ib
+VEX.128.66.0F 71 /2 ib
VPSRLW xmm1, xmm2, imm8
-VEX.NDD.128.66.0F 72 /2 ib
+VEX.128.66.0F 72 /2 ib
VPSRLD xmm1, xmm2, imm8
-VEX.NDD.128.66.0F 73 /2 ib
+VEX.128.66.0F 73 /2 ib
VPSRLQ xmm1, xmm2, imm8
-VEX.NDD.128.66.0F 71 /4 ib
+VEX.128.66.0F 71 /4 ib
VPSRAW xmm1, xmm2, imm8
-VEX.NDD.128.66.0F 72 /4 ib
+VEX.128.66.0F 72 /4 ib
VPSRAD xmm1, xmm2, imm8
-VEX.NDD.128.66.0F 71 /6 ib
+VEX.128.66.0F 71 /6 ib
VPSLLW xmm1, xmm2, imm8
-VEX.NDD.128.66.0F 72 /6 ib
+VEX.128.66.0F 72 /6 ib
VPSLLD xmm1, xmm2, imm8
-VEX.NDD.128.66.0F 73 /6 ib
+VEX.128.66.0F 73 /6 ib
VPSLLQ xmm1, xmm2, imm8
@@ -52635,19 +52653,13 @@ ANDN, BLSI, BLSMSK, BLSR, BZHI, MULX, PDEP, PEXT, RORX, SARX, SHLX, SHRX
(*) - Additional exception restrictions are present - see the Instruction description for details.
-X
-
-X
-
-X
-
-64-bit
-
-X
+Stack, SS(0)
Protected and
Compatibility
+64-bit
+
Invalid Opcode, #UD
Virtual-8086
@@ -52662,6 +52674,8 @@ X
X
+Cause of Exception
+
X
X
@@ -52669,30 +52683,58 @@ X
X
X
+
X
-If a memory address referencing the SS segment is in a non-canonical form.
-For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.
-If the DS, ES, FS, or GS register is used to access memory and it contains a null
-segment selector.
+X
X
+
+X
+
+If VEX.L = 1.
+
+X
+
+X
+
+X
+
+X
+
+If preceded by a LOCK prefix (F0H).
+
+X
+
X
If any REX, F2, F3, or 66 prefixes precede a VEX prefix.
-For an illegal address in the SS segment.
X
+
+X
+
+If a VEX prefix is present.
+
+X
+
+For an illegal address in the SS segment.
+X
+
General Protection,
#GP(0)
-If BMI1/BMI2 CPUID feature flag is ‘0’.
-If a VEX prefix is present.
+X
+
+If a memory address referencing the SS segment is in a non-canonical form.
+For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.
+If the DS, ES, FS, or GS register is used to access memory and it contains a null
+segment selector.
X
-Stack, SS(0)
+X
-Cause of Exception
+If BMI1/BMI2 CPUID feature flag is ‘0’.
X
@@ -52730,12 +52772,12 @@ The majority of the Intel AVX-512 family of instructions (operating on 512/256/1
are encoded using a new prefix (called EVEX). Opmask instructions (operating on opmask register operands) are
encoded using the VEX prefix. The EVEX prefix has some parts resembling the instruction encoding scheme using
the VEX prefix, and many other capabilities not available with the VEX prefix.
-The significant feature differences between EVEX and VEX are summarized below.
-
Vol. 2A 2-35
INSTRUCTION FORMAT
+The significant feature differences between EVEX and VEX are summarized below.
+
EVEX is a 4-Byte prefix (the first byte must be 62H); VEX is either a 2-Byte (C5H is the first byte) or 3-Byte
@@ -52959,7 +53001,7 @@ Combine with EVEX.B and ModR/M.rm, when SIB/VSIB absent.
EVEX.vvvv
-NDS register specifier
+VVVV register specifier
P[14 : 11]
@@ -52967,7 +53009,7 @@ Same as VEX.vvvv.
EVEXV’
-High-16 NDS/VIDX register specifier
+High-16 VVVV/VIDX register specifier
P[19]
@@ -53106,7 +53148,7 @@ GPR, Vector
Destination or Source
-NDS/NDD
+VVVV
EVEX.V’
@@ -53184,7 +53226,7 @@ GPR, Vector
Destination or Source
-NDS/NDD
+VVVV
EVEX.vvv
@@ -53266,7 +53308,7 @@ k0-k7
Source
-NDS
+VVVV
VEX.vvvv
@@ -57300,6 +57342,12 @@ encoding for a different instruction.
+NFx — Indicates the use of F2/F3 prefixes (beyond those already part of the instructions opcode) are not
+allowed with the instruction. Such use will either cause an invalid-opcode exception (#UD) or result in the
+encoding for a different instruction.
+
+•
+
REX.W — Indicates the use of a REX prefix that affects operand size or instruction semantics. The ordering of
the REX prefix and other optional/mandatory instruction prefixes is discussed in Chapter 2. Note that REX
prefixes that promote legacy instructions to 64-bit behavior are not listed explicitly in the opcode column.
@@ -57339,12 +57387,6 @@ to form a single opcode byte.
Table 3-1. Register Codes Associated With +rb, +rw, +rd, +ro
-0
-
-RAX
-
-None
-
Reg Field
REX.B
@@ -57395,6 +57437,10 @@ None
0
+RAX
+
+None
+
CL
None
@@ -57613,17 +57659,15 @@ None
5
-SIL
-
-Yes
+3-2 Vol. 2A
-6
+0
-SI
+ INSTRUCTION SET REFERENCE, A-L
-None
+Table 3-1. Register Codes Associated With +rb, +rw, +rd, +ro (Contd.)
-6
+Reg Field
ESI
@@ -57661,33 +57705,40 @@ None
7
-Registers R8 - R15 (see below): Available in 64-Bit Mode Only
-
-3-2 Vol. 2A
+REX.B
- INSTRUCTION SET REFERENCE, A-L
+6
-Table 3-1. Register Codes Associated With +rb, +rw, +rd, +ro (Contd.)
+Register
-Reg Field
+None
REX.B
+SI
+
Register
-quadword register
-(64-Bit Mode only)
+6
-Reg Field
+REX.B
+
+Yes
REX.B
+Reg Field
+
+quadword register
+(64-Bit Mode only)
+
+SIL
+
Register
-dword register
Reg Field
-REX.B
+dword register
Register
@@ -57695,12 +57746,9 @@ word register
Reg Field
-REX.B
-
-Register
-
byte register
+Registers R8 - R15 (see below): Available in 64-Bit Mode Only
R8L
Yes
@@ -57899,7 +57947,7 @@ Opcode Column in the Instruction Summary Table (Instructions with VEX prefix)
In the Instruction Summary Table, the Opcode column presents each instruction encoded using the VEX prefix in
following form (including the modR/M byte if applicable, the immediate byte if applicable):
-VEX.[NDS].[128,256].[66,F2,F3].0F/0F3A/0F38.[W0,W1] opcode [/r] [/ib,/is4]
+VEX.[128,256].[66,F2,F3].0F/0F3A/0F38.[W0,W1] opcode [/r] [/ib,/is4]
@@ -57907,25 +57955,6 @@ VEX — Indicates the presence of the VEX prefix is required. The VEX prefix can
only applies to those instructions that do not require the following fields to be encoded: VEX.mmmmm, VEX.W,
VEX.X, VEX.B. Refer to Section 2.3 for more detail on the VEX prefix.
The encoding of various sub-fields of the VEX prefix is described using the following notations:
-— NDS, NDD, DDS: Specifies that VEX.vvvv field is valid for the encoding of a register operand:
-
-•
-
-VEX.NDS: VEX.vvvv encodes the first source register in an instruction syntax where the content of
-source registers will be preserved.
-
-•
-•
-
-VEX.NDD: VEX.vvvv encodes the destination register that cannot be encoded by ModR/M:reg field.
-
-•
-
-VEX.DDS: VEX.vvvv encodes the second source register in a three-operand instruction syntax where
-the content of first source register will be overwritten by the result.
-If none of NDS, NDD, and DDS is present, VEX.vvvv must be 1111b (i.e. VEX.vvvv does not encode an
-operand). The VEX.vvvv field can be encoded using either the 2-byte or 3-byte form of the VEX prefix.
-
— 128,256: VEX.L field can be 0 (denoted by VEX.128 or VEX.LZ) or 1 (denoted by VEX.256). The VEX.L field
can be encoded using either the 2-byte or 3-byte form of the VEX prefix. The presence of the notation
VEX.256 or VEX.128 in the opcode column should be interpreted as follows:
@@ -57947,13 +57976,7 @@ encoded with VEX.L= 1 by causing an #UD exception (e.g. VMOVLPS).
-If VEX.LIG is present in the opcode column: The VEX.L value is ignored. This generally applies to VEXencoded scalar SIMD floating-point instructions. Scalar SIMD floating-point instruction can be distin-
-
-Vol. 2A 3-3
-
- INSTRUCTION SET REFERENCE, A-L
-
-guished from the mnemonic of the instruction. Generally, the last two letters of the instruction
+If VEX.LIG is present in the opcode column: The VEX.L value is ignored. This generally applies to VEXencoded scalar SIMD floating-point instructions. Scalar SIMD floating-point instruction can be distinguished from the mnemonic of the instruction. Generally, the last two letters of the instruction
mnemonic would be either “SS”, “SD”, or “SI” for SIMD floating-point conversion instructions.
@@ -57963,6 +57986,11 @@ VEX.L is not zero.
— 66,F2,F3: The presence or absence of these values map to the VEX.pp field encodings. If absent, this
corresponds to VEX.pp=00B. If present, the corresponding VEX.pp value affects the “opcode” byte in the
+
+Vol. 2A 3-3
+
+ INSTRUCTION SET REFERENCE, A-L
+
same way as if a SIMD prefix (66H, F2H or F3H) does to the ensuing opcode byte. Thus a non-zero encoding
of VEX.pp may be considered as an implied 66H/F2H/F3H prefix. The VEX.pp field may be encoded using
either the 2-byte or 3-byte form of the VEX prefix.
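As a rough illustration of the notation above, the following C sketch assembles the 2-byte VEX form for one instruction. It assumes the commonly documented 2-byte layout (inverted R, inverted vvvv, L, pp) and the pp values 01B/10B/11B for 66H/F3H/F2H; the names `vex2_payload` and `encode_vaddpd_xmm` are hypothetical helpers introduced here, not anything defined by the manual.

```c
#include <assert.h>
#include <stdint.h>

/* VEX.pp values for the implied SIMD prefix (absent, 66H, F3H, F2H). */
enum vex_pp { VEX_PP_NONE = 0, VEX_PP_66 = 1, VEX_PP_F3 = 2, VEX_PP_F2 = 3 };

/* Second byte of the 2-byte VEX prefix (the first byte is C5H).
   Layout: bit 7 = inverted R, bits 6:3 = inverted vvvv, bit 2 = L,
   bits 1:0 = pp. reg_vvvv is a register number 0..15; pass 0 when vvvv
   encodes no operand, which yields the required stored value 1111b. */
uint8_t vex2_payload(unsigned r, unsigned reg_vvvv, unsigned l, enum vex_pp pp)
{
    assert(r <= 1 && reg_vvvv <= 15 && l <= 1);
    return (uint8_t)(((~r & 1u) << 7) | ((~reg_vvvv & 0xFu) << 3)
                     | (l << 2) | (unsigned)pp);
}

/* Hypothetical helper: encode VADDPD xmm_dst, xmm_src1, xmm_src2 for
   registers 0..7 only (VEX.128.66.0F.WIG 58 /r, register-register form). */
void encode_vaddpd_xmm(uint8_t out[4], unsigned dst, unsigned src1, unsigned src2)
{
    assert(dst <= 7 && src1 <= 7 && src2 <= 7);
    out[0] = 0xC5;                                 /* 2-byte VEX escape */
    out[1] = vex2_payload(0, src1, 0, VEX_PP_66);  /* L = 0 (128), pp = 66 */
    out[2] = 0x58;                                 /* opcode byte */
    out[3] = (uint8_t)(0xC0u | (dst << 3) | src2); /* ModR/M, mod = 11B */
}
```

With dst = 1, src1 = 2 and src2 = 3, this sketch produces the bytes C5 E9 58 CB. The 2-byte form is only usable here because the instruction is in the 0F map, is WIG, and needs no VEX.X or VEX.B bits.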
@@ -58002,26 +58030,16 @@ encoding scheme of VEX.R, VEX.X, VEX.B fields must follow the rules defined in S
/is4 — An 8-bit immediate byte is present containing a source register specifier in either imm8[7:4] (for 64-bit
mode) or imm8[6:4] (for 32-bit mode), and instruction-specific payload in imm8[3:0].
-EVEX.[NDS/NDD/DDS].[128,256,512,LIG].[66,F2,F3].0F/0F3A/0F38.[W0,W1,WIG] opcode [/r] [ib]
+EVEX.[128,256,512,LIG].[66,F2,F3].0F/0F3A/0F38.[W0,W1,WIG] opcode [/r] [ib]
EVEX — The EVEX prefix is encoded using the four-byte form (the first byte is 62H). Refer to Section 2.6.1 for
more detail on the EVEX prefix.
The encoding of various sub-fields of the EVEX prefix is described using the following notations:
-— NDS, NDD, DDS: implies that EVEX.vvvv (and EVEX.v’) field is valid for the encoding of an operand. It may
-specify either the source register (NDS) or the destination register (NDD). DDS expresses a syntax where
-vvvv encodes the second source register in a three-operand instruction syntax where the content of first
-source register will be overwritten by the result. If both NDS and NDD absent (i.e. EVEX.vvvv does not
-encode an operand), EVEX.vvvv must be 1111b (and EVEX.v’ must be 1b).
— 128, 256, 512, LIG: This corresponds to the vector length; three values are allowed by EVEX: 512-bit,
256-bit and 128-bit. Alternatively, vector length is ignored (LIG) for certain instructions; this typically
applies to scalar instructions operating on one data element of a vector register.
-
-3-4 Vol. 2A
-
- INSTRUCTION SET REFERENCE, A-L
-
— 66,F2,F3: The presence of these values maps to the EVEX.pp field encodings. The corresponding EVEX.pp
value affects the “opcode” byte in the same way that a SIMD prefix (66H, F2H or F3H) affects the ensuing
opcode byte. Thus a non-zero encoding of EVEX.pp may be considered as an implied 66H/F2H/F3H prefix.
@@ -58033,6 +58051,11 @@ encoding of EVEX.mmm may be considered as an implied escape byte sequence of eit
0F38H.
— W0: EVEX.W=0.
— W1: EVEX.W=1.
+
+3-4 Vol. 2A
+
+ INSTRUCTION SET REFERENCE, A-L
+
— WIG: EVEX.W bit ignored
@@ -58042,6 +58065,15 @@ opcode — Instruction opcode.
In general, the encoding of EVEX.R and R’, EVEX.X and X’, and EVEX.B and B’ fields are not shown explicitly in
the opcode column.
+NOTE
+Previously, the terms NDS, NDD and DDS were used in instructions with an EVEX (or VEX) prefix.
+These terms indicated that the vvvv field was valid for encoding, and specified register usage.
+These terms are no longer necessary and are redundant with the instruction operand encoding
+tables provided with each instruction. The instruction operand encoding tables give explicit details
+on all operands, indicating where every operand is stored and if they are read or written. If vvvv is
+not listed as an operand in the instruction operand encoding table, then EVEX (or VEX) vvvv must
+be 0b1111.
+
3.1.1.3
Instruction Column in the Opcode Summary Table
@@ -58109,10 +58141,6 @@ imm64 — An immediate quadword value used for instructions whose operand-size a
The value allows the use of a number between +9,223,372,036,854,775,807 and –
9,223,372,036,854,775,808 inclusive.
-Vol. 2A 3-5
-
- INSTRUCTION SET REFERENCE, A-L
-
r/m8 — A byte operand that is either the contents of a byte general-purpose register (AL, CL, DL, BL, AH, CH,
@@ -58123,6 +58151,11 @@ DH, BH, BPL, SPL, DIL and SIL) or a byte from memory. Byte registers R8L - R15L
r/m16 — A word general-purpose register or memory operand used for instructions whose operand-size
attribute is 16 bits. The word general-purpose registers are: AX, CX, DX, BX, SP, BP, SI, DI. The contents of
+
+Vol. 2A 3-5
+
+ INSTRUCTION SET REFERENCE, A-L
+
memory are found at the address provided by the effective address computation. Word registers R8W - R15W
are available using REX.R in 64-bit mode.
@@ -58192,12 +58225,18 @@ x87 FPU floating-point instructions.
m16int, m32int, m64int — A word, doubleword, and quadword integer (respectively) operand in memory.
These symbols designate integers that are used as operands for x87 FPU integer instructions.
+•
ST or ST(0) — The top element of the FPU register stack.
+•
+
+mm/m64 — An MMX register or a 64-bit memory operand. The 64-bit MMX registers are: MM0 through MM7.
+The contents of memory are found at the address provided by the effective address computation.
+
m8 — A byte operand in memory, usually expressed as a variable or array name, but pointed to by the
DS:(E)SI or ES:(E)DI registers. In 64-bit mode, it is pointed to by the RSI or RDI registers.
@@ -58213,21 +58252,13 @@ indicates its size, which is determined by the address-size attribute of the ins
ST(i) — The ith element from the top of the FPU register stack (i ← 0 through 7).
mm — An MMX register. The 64-bit MMX registers are: MM0 through MM7.
-
-3-6 Vol. 2A
-
- INSTRUCTION SET REFERENCE, A-L
-
-•
-
mm/m32 — The low order 32 bits of an MMX register or a 32-bit memory operand. The 64-bit MMX registers
are: MM0 through MM7. The contents of memory are found at the address provided by the effective address
computation.
-•
+3-6 Vol. 2A
-mm/m64 — An MMX register or a 64-bit memory operand. The 64-bit MMX registers are: MM0 through MM7.
-The contents of memory are found at the address provided by the effective address computation.
+ INSTRUCTION SET REFERENCE, A-L
@@ -58301,6 +58332,20 @@ memory addresses are specified using a common base register, a constant scale fa
register with individual elements of 64-bit index value in an XMM register (vm64x), a YMM register (vm64y) or
a ZMM register (vm64z).
+•
+
+zmm/m512/m32bcst — An operand that can be a ZMM register, a 512-bit memory location or a 512-bit
+vector loaded from a 32-bit memory location.
+
+•
+
+zmm/m512/m64bcst — An operand that can be a ZMM register, a 512-bit memory location or a 512-bit
+vector loaded from a 64-bit memory location.
+
+•
+
+<ZMM0> — Indicates use of the ZMM0 register as an implicit argument.
+
ymm/m256 — A YMM register or 256-bit memory operand.
<YMM0>— Indicates use of the YMM0 register as an implicit argument.
bnd — A 128-bit bounds register. BND0 through BND3.
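The zmm/m512/m32bcst operand form listed above can be pictured with a small scalar model. This is only a sketch of the broadcast semantics ("a 512-bit vector loaded from a 32-bit memory location"), assuming the single element is replicated into all sixteen 32-bit lanes; the `zmm_u32` type and `load_m32bcst` function are hypothetical names used for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical software model of a ZMM register viewed as 16 dwords. */
typedef struct { uint32_t lane[16]; } zmm_u32;

/* Model of an m32bcst memory operand: read one 32-bit element and
   replicate it across every lane of the 512-bit vector. */
zmm_u32 load_m32bcst(const void *mem)
{
    uint32_t elem;
    zmm_u32 v;
    int i;

    memcpy(&elem, mem, sizeof elem);  /* unaligned-safe 32-bit load */
    for (i = 0; i < 16; i++)
        v.lane[i] = elem;
    return v;
}
```

The m64bcst form would follow the same pattern with eight 64-bit lanes.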
@@ -58321,18 +58366,6 @@ Vol. 2A 3-7
-zmm/m512/m32bcst — An operand that can be a ZMM register, a 512-bit memory location or a 512-bit
-vector loaded from a 32-bit memory location.
-
-•
-
-zmm/m512/m64bcst — An operand that can be a ZMM register, a 512-bit memory location or a 512-bit
-vector loaded from a 64-bit memory location.
-
-•
-•
-
-<ZMM0> — Indicates use of the ZMM0 register as an implicit argument.
{er} — Indicates support for embedded rounding control, which is only applicable to the register-register form
of the instruction. This also implies support for SAE (Suppress All Exceptions).
@@ -58410,14 +58443,15 @@ N.I. — Indicates the opcode is treated as a new instruction in 64-bit mode.
N.S. — Indicates an instruction syntax that requires an address override prefix in 64-bit mode and is not
supported. Using an address override prefix in 64-bit mode may result in model-specific execution behavior.
-3-8 Vol. 2A
-
- INSTRUCTION SET REFERENCE, A-L
-
The Compatibility/Legacy Mode support is to the right of the ‘slash’ and has the following notation:
• V — Supported.
• I — Not supported.
• N.E. — Indicates an Intel 64 instruction mnemonics/syntax that is not encodable; the opcode sequence is not
+
+3-8 Vol. 2A
+
+ INSTRUCTION SET REFERENCE, A-L
+
applicable as an individual instruction in compatibility mode or IA-32 mode. The opcode may represent a valid
sequence of legacy IA-32 instructions.
@@ -58493,20 +58527,17 @@ segment-relative offset. For example, [SRC] indicates that the content of the so
A ← B indicates that the value of B is assigned to A.
+•
+
+The expression “« COUNT” and “» COUNT” indicates that the destination operand should be shifted left or right
+by the number of bits indicated by the count operand.
+
Compound statements are enclosed in keywords, such as: IF, THEN, ELSE and FI for an if statement; DO and
OD for a do statement; or CASE... OF for a case statement.
The symbols =, ≠, >, <, ≥, and ≤ are relational operators used to compare two values: meaning equal, not
equal, greater than, less than, greater or equal, less or equal, respectively. A relational expression such as A = B is TRUE if the value of
A is equal to B; otherwise it is FALSE.
-Vol. 2A 3-9
-
- INSTRUCTION SET REFERENCE, A-L
-
-•
-
-The expression “« COUNT” and “» COUNT” indicates that the destination operand should be shifted left or right
-by the number of bits indicated by the count operand.
The following identifiers are used in the algorithmic descriptions:
@@ -58514,6 +58545,10 @@ The following identifiers are used in the algorithmic descriptions:
OperandSize and AddressSize — The OperandSize identifier represents the operand-size attribute of the
instruction, which is 16, 32 or 64-bits. The AddressSize identifier represents the address-size attribute, which
+Vol. 2A 3-9
+
+ INSTRUCTION SET REFERENCE, A-L
+
is 16, 32 or 64-bits. For example, the following pseudo-code indicates that the operand-size attribute depends
on the form of the MOV instruction used.
IF Instruction = MOVW
@@ -58585,15 +58620,11 @@ SaturateSignedWordToSignedByte — Converts a signed 16-bit value to a signed 8-
16-bit value is less than –128, it is represented by the saturated value -128 (80H); if it is greater than 127, it
is represented by the saturated value 127 (7FH).
-3-10 Vol. 2A
-
- INSTRUCTION SET REFERENCE, A-L
-
SaturateSignedDwordToSignedWord — Converts a signed 32-bit value to a signed 16-bit value. If the
-signed 32-bit value is less than –32768, it is represented by the saturated value –32768 (8000H); if it is
-greater than 32767, it is represented by the saturated value 32767 (7FFFH).
+signed 32-bit value is less than –32768, it is represented by the saturated value –32768 (8000H); if it is greater
+than 32767, it is represented by the saturated value 32767 (7FFFH).
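The two signed saturation rules above can be sketched in C as follows. This is a minimal illustration of the clamping behaviour described in the text, not the manual's own definition; the function names are hypothetical C spellings of the identifiers.

```c
#include <stdint.h>

/* SaturateSignedWordToSignedByte: clamp a signed 16-bit value to
   the range -128 (80H) .. 127 (7FH). */
int8_t saturate_signed_word_to_signed_byte(int16_t v)
{
    if (v < -128) return INT8_MIN;
    if (v > 127)  return INT8_MAX;
    return (int8_t)v;
}

/* SaturateSignedDwordToSignedWord: clamp a signed 32-bit value to
   the range -32768 (8000H) .. 32767 (7FFFH). */
int16_t saturate_signed_dword_to_signed_word(int32_t v)
{
    if (v < -32768) return INT16_MIN;
    if (v > 32767)  return INT16_MAX;
    return (int16_t)v;
}
```

Values already inside the target range pass through unchanged; only out-of-range inputs are replaced by the nearest representable bound.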
@@ -58601,6 +58632,10 @@ SaturateSignedWordToUnsignedByte — Converts a signed 16-bit value to an unsign
signed 16-bit value is less than zero, it is represented by the saturated value zero (00H); if it is greater than
255, it is represented by the saturated value 255 (FFH).
+3-10 Vol. 2A
+
+ INSTRUCTION SET REFERENCE, A-L
+
SaturateToSignedByte — Represents the result of an operation as a signed 8-bit value. If the result is less
@@ -60965,19 +61000,19 @@ SSE2
66 0F 58 /r
ADDPD xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 58 /r
+VEX.128.66.0F.WIG 58 /r
VADDPD xmm1,xmm2,
xmm3/m128
-VEX.NDS.256.66.0F.WIG 58 /r
+VEX.256.66.0F.WIG 58 /r
VADDPD ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F.W1 58 /r
+EVEX.128.66.0F.W1 58 /r
VADDPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F.W1 58 /r
+EVEX.256.66.0F.W1 58 /r
VADDPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F.W1 58 /r
+EVEX.512.66.0F.W1 58 /r
VADDPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst{er}
@@ -61221,17 +61256,17 @@ SSE
NP 0F 58 /r
ADDPS xmm1, xmm2/m128
-VEX.NDS.128.0F.WIG 58 /r
+VEX.128.0F.WIG 58 /r
VADDPS xmm1,xmm2, xmm3/m128
-VEX.NDS.256.0F.WIG 58 /r
+VEX.256.0F.WIG 58 /r
VADDPS ymm1, ymm2, ymm3/m256
-EVEX.NDS.128.0F.W0 58 /r
+EVEX.128.0F.W0 58 /r
VADDPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.0F.W0 58 /r
+EVEX.256.0F.W0 58 /r
VADDPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.0F.W0 58 /r
+EVEX.512.0F.W0 58 /r
VADDPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst {er}
@@ -61479,10 +61514,10 @@ SSE2
F2 0F 58 /r
ADDSD xmm1, xmm2/m64
-VEX.NDS.LIG.F2.0F.WIG 58 /r
+VEX.LIG.F2.0F.WIG 58 /r
VADDSD xmm1, xmm2,
xmm3/m64
-EVEX.NDS.LIG.F2.0F.W1 58 /r
+EVEX.LIG.F2.0F.W1 58 /r
VADDSD xmm1 {k1}{z},
xmm2, xmm3/m64{er}
@@ -61648,10 +61683,10 @@ SSE
F3 0F 58 /r
ADDSS xmm1, xmm2/m32
-VEX.NDS.LIG.F3.0F.WIG 58 /r
+VEX.LIG.F3.0F.WIG 58 /r
VADDSS xmm1,xmm2,
xmm3/m32
-EVEX.NDS.LIG.F3.0F.W0 58 /r
+EVEX.LIG.F3.0F.W0 58 /r
VADDSS xmm1{k1}{z}, xmm2,
xmm3/m32{er}
@@ -61838,9 +61873,9 @@ floating-point values from ymm3/mem to
ymm2 and stores result in ymm1.
ADDSUBPD xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG D0 /r
+VEX.128.66.0F.WIG D0 /r
VADDSUBPD xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG D0 /r
+VEX.256.66.0F.WIG D0 /r
VADDSUBPD ymm1, ymm2, ymm3/m256
Instruction Operand Encoding
@@ -61998,9 +62033,9 @@ values from ymm3/mem to ymm2 and stores
result in ymm1.
ADDSUBPS xmm1, xmm2/m128
-VEX.NDS.128.F2.0F.WIG D0 /r
+VEX.128.F2.0F.WIG D0 /r
VADDSUBPS xmm1, xmm2, xmm3/m128
-VEX.NDS.256.F2.0F.WIG D0 /r
+VEX.256.F2.0F.WIG D0 /r
VADDSUBPS ymm1, ymm2, ymm3/m256
Instruction Operand Encoding
@@ -62360,7 +62395,7 @@ using the Equivalent Inverse Cipher, operating
on a 128-bit data (state) from xmm1 with a
128-bit round key from xmm2/m128.
-VEX.NDS.128.66.0F38.WIG DE /r
+VEX.128.66.0F38.WIG DE /r
VAESDEC xmm1, xmm2, xmm3/m128
RVM V/V
@@ -62487,7 +62522,7 @@ operating on a 128-bit data (state) from
xmm1 with a 128-bit round key from
xmm2/m128.
-VEX.NDS.128.66.0F38.WIG DF /r
+VEX.128.66.0F38.WIG DF /r
VAESDECLAST xmm1, xmm2, xmm3/m128
RVM V/V
@@ -62610,7 +62645,7 @@ operating on a 128-bit data (state) from
xmm1 with a 128-bit round key from
xmm2/m128.
-VEX.NDS.128.66.0F38.WIG DC /r
+VEX.128.66.0F38.WIG DC /r
VAESENC xmm1, xmm2, xmm3/m128
RVM V/V
@@ -62735,7 +62770,7 @@ flow, operating on a 128-bit data (state) from
xmm1 with a 128-bit round key from
xmm2/m128.
-VEX.NDS.128.66.0F38.WIG DD /r
+VEX.128.66.0F38.WIG DD /r
VAESENCLAST xmm1, xmm2, xmm3/m128
RVM V/V
@@ -63521,9 +63556,9 @@ RVM
Mode
V/V
-VEX.NDS.LZ.0F38.W0 F2 /r
+VEX.LZ.0F38.W0 F2 /r
ANDN r32a, r32b, r/m32
-VEX.NDS.LZ. 0F38.W1 F2 /r
+VEX.LZ. 0F38.W1 F2 /r
ANDN r64a, r64b, r/m64
RVM
@@ -63579,13 +63614,10 @@ SIMD Floating-Point Exceptions
None
Other Exceptions
-See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29; additionally
-#UD
+See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29.
ANDN — Logical AND NOT
-If VEX.W = 1.
-
Vol. 2A 3-63
INSTRUCTION SET REFERENCE, A-L
@@ -63610,19 +63642,19 @@ SSE2
66 0F 54 /r
ANDPD xmm1, xmm2/m128
-VEX.NDS.128.66.0F 54 /r
+VEX.128.66.0F 54 /r
VANDPD xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.66.0F 54 /r
+VEX.256.66.0F 54 /r
VANDPD ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F.W1 54 /r
+EVEX.128.66.0F.W1 54 /r
VANDPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F.W1 54 /r
+EVEX.256.66.0F.W1 54 /r
VANDPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F.W1 54 /r
+EVEX.512.66.0F.W1 54 /r
VANDPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -63832,19 +63864,19 @@ SSE
NP 0F 54 /r
ANDPS xmm1, xmm2/m128
-VEX.NDS.128.0F 54 /r
+VEX.128.0F 54 /r
VANDPS xmm1,xmm2,
xmm3/m128
-VEX.NDS.256.0F 54 /r
+VEX.256.0F 54 /r
VANDPS ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.0F.W0 54 /r
+EVEX.128.0F.W0 54 /r
VANDPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.0F.W0 54 /r
+EVEX.256.0F.W0 54 /r
VANDPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.0F.W0 54 /r
+EVEX.512.0F.W0 54 /r
VANDPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -64067,19 +64099,19 @@ SSE2
66 0F 55 /r
ANDNPD xmm1, xmm2/m128
-VEX.NDS.128.66.0F 55 /r
+VEX.128.66.0F 55 /r
VANDNPD xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.66.0F 55/r
+VEX.256.66.0F 55/r
VANDNPD ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F.W1 55 /r
+EVEX.128.66.0F.W1 55 /r
VANDNPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F.W1 55 /r
+EVEX.256.66.0F.W1 55 /r
VANDNPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F.W1 55 /r
+EVEX.512.66.0F.W1 55 /r
VANDNPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -64288,19 +64320,19 @@ SSE
NP 0F 55 /r
ANDNPS xmm1, xmm2/m128
-VEX.NDS.128.0F 55 /r
+VEX.128.0F 55 /r
VANDNPS xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.0F 55 /r
+VEX.256.0F 55 /r
VANDNPS ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.0F.W0 55 /r
+EVEX.128.0F.W0 55 /r
VANDNPS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.NDS.256.0F.W0 55 /r
+EVEX.256.0F.W0 55 /r
VANDNPS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.NDS.512.0F.W0 55 /r
+EVEX.512.0F.W0 55 /r
VANDNPS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst
@@ -64642,6 +64674,109 @@ Vol. 2A 3-77
INSTRUCTION SET REFERENCE, A-L
+BEXTR — Bit Field Extract
+Opcode/Instruction
+
+Op/
+En
+RMV
+
+64/32
+-bit
+Mode
+V/V
+
+CPUID
+Feature
+Flag
+BMI1
+
+VEX.LZ.0F38.W0 F7 /r
+BEXTR r32a, r/m32, r32b
+VEX.LZ.0F38.W1 F7 /r
+BEXTR r64a, r/m64, r64b
+
+RMV
+
+V/N.E.
+
+BMI1
+
+Description
+
+Contiguous bitwise extract from r/m32 using r32b as control; store
+result in r32a.
+Contiguous bitwise extract from r/m64 using r64b as control; store
+result in r64a
+
+Instruction Operand Encoding
+Op/En
+
+Operand 1
+
+Operand 2
+
+Operand 3
+
+Operand 4
+
+RMV
+
+ModRM:reg (w)
+
+ModRM:r/m (r)
+
+VEX.vvvv (r)
+
+NA
+
+Description
+Extracts contiguous bits from the first source operand (the second operand) using an index value and length value
+specified in the second source operand (the third operand). Bit 7:0 of the second source operand specifies the
+starting bit position of bit extraction. A START value exceeding the operand size will not extract any bits from the
+second source operand. Bit 15:8 of the second source operand specifies the maximum number of bits (LENGTH)
+beginning at the START position to extract. Only bit positions up to (OperandSize -1) of the first source operand are
+extracted. The extracted bits are written to the destination register, starting from the least significant bit. All higher
+order bits in the destination operand (starting at bit position LENGTH) are zeroed. The destination register is
+cleared if no bits are extracted.
+This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in
+64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt
+to execute this instruction with VEX.L not equal to 0 will cause #UD.
+
+Operation
+START ← SRC2[7:0];
+LEN ← SRC2[15:8];
+TEMP ← ZERO_EXTEND_TO_512 (SRC1 );
+DEST ← ZERO_EXTEND(TEMP[START+LEN -1: START]);
+ZF ← (DEST = 0);
+
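The Operation pseudocode above can be approximated in portable C. The sketch below models only the 32-bit form and takes the control word in the same packed layout as the register operand (START in bits 7:0, LEN in bits 15:8); `bextr32` is a hypothetical name, and the flag updates are reduced to a comment. The compiler intrinsic named in this entry takes start and len as separate arguments instead.

```c
#include <stdint.h>

/* Software model of BEXTR r32a, r/m32, r32b.
   control[7:0]  = START bit position
   control[15:8] = LEN, the maximum number of bits to extract
   The result is zero when START is beyond the operand or LEN is 0;
   the instruction would then set ZF (ZF = (result == 0)). */
uint32_t bextr32(uint32_t src, uint32_t control)
{
    uint32_t start = control & 0xFFu;
    uint32_t len = (control >> 8) & 0xFFu;
    uint32_t shifted;

    if (start >= 32 || len == 0)
        return 0;                        /* no bits extracted */

    shifted = src >> start;              /* upper bits are zero-filled */
    if (len >= 32)
        return shifted;                  /* avoid an out-of-range shift */
    return shifted & ((1u << len) - 1u); /* keep the low LEN bits */
}
```

For example, extracting 8 bits starting at bit 4 of 12345678H yields 67H. The explicit range checks stand in for the zero-extension to a wider temporary used by the pseudocode, since shifting a 32-bit value by 32 or more is undefined in C.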
+Flags Affected
+ZF is updated based on the result. AF, SF, and PF are undefined. All other flags are cleared.
+
+Intel C/C++ Compiler Intrinsic Equivalent
+BEXTR:
+
+unsigned __int32 _bextr_u32(unsigned __int32 src, unsigned __int32 start. unsigned __int32 len);
+
+BEXTR:
+
+unsigned __int64 _bextr_u64(unsigned __int64 src, unsigned __int32 start. unsigned __int32 len);
+
+SIMD Floating-Point Exceptions
+None
+
+Other Exceptions
+See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29; additionally
+#UD
+
+3-78 Vol. 2A
+
+If VEX.W = 1.
+
+BEXTR — Bit Field Extract
+
+ INSTRUCTION SET REFERENCE, A-L
+
BLENDPD — Blend Packed Double Precision Floating-Point Values
Opcode/
Instruction
@@ -64685,9 +64820,9 @@ Values from ymm2 and ymm3/m256 from
mask in imm8 and store the values in ymm1.
BLENDPD xmm1, xmm2/m128, imm8
-VEX.NDS.128.66.0F3A.WIG 0D /r ib
+VEX.128.66.0F3A.WIG 0D /r ib
VBLENDPD xmm1, xmm2, xmm3/m128, imm8
-VEX.NDS.256.66.0F3A.WIG 0D /r ib
+VEX.256.66.0F3A.WIG 0D /r ib
VBLENDPD ymm1, ymm2, ymm3/m256, imm8
Instruction Operand Encoding
@@ -64749,10 +64884,10 @@ IF (IMM8[1] = 0) THEN DEST[127:64] ← SRC1[127:64]
ELSE DEST[127:64] ← SRC2[127:64] FI
DEST[MAXVL-1:128] ← 0
-3-78 Vol. 2A
-
BLENDPD — Blend Packed Double Precision Floating-Point Values
+Vol. 2A 3-79
+
INSTRUCTION SET REFERENCE, A-L
VBLENDPD (VEX.256 encoded version)
@@ -64780,112 +64915,9 @@ None
Other Exceptions
See Exceptions Type 4.
-BLENDPD — Blend Packed Double Precision Floating-Point Values
-
-Vol. 2A 3-79
-
- INSTRUCTION SET REFERENCE, A-L
-
-BEXTR — Bit Field Extract
-Opcode/Instruction
-
-Op/
-En
-RMV
-
-64/32
--bit
-Mode
-V/V
-
-CPUID
-Feature
-Flag
-BMI1
-
-VEX.NDS.LZ.0F38.W0 F7 /r
-BEXTR r32a, r/m32, r32b
-VEX.NDS.LZ.0F38.W1 F7 /r
-BEXTR r64a, r/m64, r64b
-
-RMV
-
-V/N.E.
-
-BMI1
-
-Description
-
-Contiguous bitwise extract from r/m32 using r32b as control; store
-result in r32a.
-Contiguous bitwise extract from r/m64 using r64b as control; store
-result in r64a
-
-Instruction Operand Encoding
-Op/En
-
-Operand 1
-
-Operand 2
-
-Operand 3
-
-Operand 4
-
-RMV
-
-ModRM:reg (w)
-
-ModRM:r/m (r)
-
-VEX.vvvv (r)
-
-NA
-
-Description
-Extracts contiguous bits from the first source operand (the second operand) using an index value and length value
-specified in the second source operand (the third operand). Bit 7:0 of the second source operand specifies the
-starting bit position of bit extraction. A START value exceeding the operand size will not extract any bits from the
-second source operand. Bit 15:8 of the second source operand specifies the maximum number of bits (LENGTH)
-beginning at the START position to extract. Only bit positions up to (OperandSize -1) of the first source operand are
-extracted. The extracted bits are written to the destination register, starting from the least significant bit. All higher
-order bits in the destination operand (starting at bit position LENGTH) are zeroed. The destination register is
-cleared if no bits are extracted.
-This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in
-64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt
-to execute this instruction with VEX.L not equal to 0 will cause #UD.
-
-Operation
-START ← SRC2[7:0];
-LEN ← SRC2[15:8];
-TEMP ← ZERO_EXTEND_TO_512 (SRC1 );
-DEST ← ZERO_EXTEND(TEMP[START+LEN -1: START]);
-ZF ← (DEST = 0);
-
-Flags Affected
-ZF is updated based on the result. AF, SF, and PF are undefined. All other flags are cleared.
-
-Intel C/C++ Compiler Intrinsic Equivalent
-BEXTR:
-
-unsigned __int32 _bextr_u32(unsigned __int32 src, unsigned __int32 start. unsigned __int32 len);
-
-BEXTR:
-
-unsigned __int64 _bextr_u64(unsigned __int64 src, unsigned __int32 start. unsigned __int32 len);
-
-SIMD Floating-Point Exceptions
-None
-
-Other Exceptions
-See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29; additionally
-#UD
-
3-80 Vol. 2A
-If VEX.W = 1.
-
-BEXTR — Bit Field Extract
+BLENDPD — Blend Packed Double Precision Floating-Point Values
INSTRUCTION SET REFERENCE, A-L
@@ -64934,9 +64966,9 @@ mask in imm8 and store the values in ymm1.
BLENDPS xmm1, xmm2/m128, imm8
-VEX.NDS.128.66.0F3A.WIG 0C /r ib
+VEX.128.66.0F3A.WIG 0C /r ib
VBLENDPS xmm1, xmm2, xmm3/m128, imm8
-VEX.NDS.256.66.0F3A.WIG 0C /r ib
+VEX.256.66.0F3A.WIG 0C /r ib
VBLENDPS ymm1, ymm2, ymm3/m256, imm8
Instruction Operand Encoding
@@ -65095,10 +65127,10 @@ ymm1, based on mask bits in the mask
operand, ymm4.
BLENDVPD xmm1, xmm2/m128 , <XMM0>
-VEX.NDS.128.66.0F3A.W0 4B /r /is4
+VEX.128.66.0F3A.W0 4B /r /is4
VBLENDVPD xmm1, xmm2, xmm3/m128, xmm4
-VEX.NDS.256.66.0F3A.W0 4B /r /is4
+VEX.256.66.0F3A.W0 4B /r /is4
VBLENDVPD ymm1, ymm2, ymm3/m256, ymm4
Instruction Operand Encoding
@@ -65266,10 +65298,10 @@ mask register, ymm4.
BLENDVPS xmm1, xmm2/m128, <XMM0>
-VEX.NDS.128.66.0F3A.W0 4A /r /is4
+VEX.128.66.0F3A.W0 4A /r /is4
VBLENDVPS xmm1, xmm2, xmm3/m128, xmm4
-VEX.NDS.256.66.0F3A.W0 4A /r /is4
+VEX.256.66.0F3A.W0 4A /r /is4
VBLENDVPS ymm1, ymm2, ymm3/m256, ymm4
Instruction Operand Encoding
@@ -65431,9 +65463,9 @@ Feature
Flag
BMI1
-VEX.NDD.LZ.0F38.W0 F3 /3
+VEX.LZ.0F38.W0 F3 /3
BLSI r32, r/m32
-VEX.NDD.LZ.0F38.W1 F3 /3
+VEX.LZ.0F38.W1 F3 /3
BLSI r64, r/m64
Description
@@ -65505,13 +65537,10 @@ SIMD Floating-Point Exceptions
None
Other Exceptions
-See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29; additionally
-#UD
+See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29.
3-88 Vol. 2A
-If VEX.W = 1.
-
BLSI — Extract Lowest Set Isolated Bit
INSTRUCTION SET REFERENCE, A-L
@@ -65533,9 +65562,9 @@ Feature
Flag
BMI1
-VEX.NDD.LZ.0F38.W0 F3 /2
+VEX.LZ.0F38.W0 F3 /2
BLSMSK r32, r/m32
-VEX.NDD.LZ.0F38.W1 F3 /2
+VEX.LZ.0F38.W1 F3 /2
BLSMSK r64, r/m64
VM
@@ -65607,10 +65636,7 @@ SIMD Floating-Point Exceptions
None
Other Exceptions
-See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29; additionally
-#UD
-
-If VEX.W = 1.
+See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29.
BLSMSK — Get Mask Up to Lowest Set Bit
@@ -65635,9 +65661,9 @@ Feature
Flag
BMI1
-VEX.NDD.LZ.0F38.W0 F3 /1
+VEX.LZ.0F38.W0 F3 /1
BLSR r32, r/m32
-VEX.NDD.LZ.0F38.W1 F3 /1
+VEX.LZ.0F38.W1 F3 /1
BLSR r64, r/m64
VM
@@ -65709,13 +65735,10 @@ SIMD Floating-Point Exceptions
None
Other Exceptions
-See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29; additionally
-#UD
+See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29.
3-90 Vol. 2A
-If VEX.W = 1.
-
BLSR — Reset Lowest Set Bit
INSTRUCTION SET REFERENCE, A-L
@@ -68453,9 +68476,9 @@ Feature
Flag
BMI2
-VEX.NDS.LZ.0F38.W0 F5 /r
+VEX.LZ.0F38.W0 F5 /r
BZHI r32a, r/m32, r32b
-VEX.NDS.LZ.0F38.W1 F5 /r
+VEX.LZ.0F38.W1 F5 /r
BZHI r64a, r/m64, r64b
RMV
@@ -68530,10 +68553,7 @@ SIMD Floating-Point Exceptions
None
Other Exceptions
-See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29; additionally
-#UD
-
-If VEX.W = 1.
+See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29.
BZHI — Zero High Bits Starting with Specified Bit Position
@@ -68941,6 +68961,11 @@ Vol. 2A 3-125
INSTRUCTION SET REFERENCE, A-L
+Instruction ordering. Instructions following a far call may be fetched from memory before earlier instructions
+complete execution, but they will not execute (even speculatively) until all instructions prior to the far call have
+completed execution (the later instructions may execute before data stored by the earlier instructions have become
+globally visible).
+
Operation
IF near call
THEN IF near relative call
@@ -68990,24 +69015,21 @@ THEN #SS(0); FI;
Push(EIP);
EIP ← tempEIP;
FI;
-IF OperandSize = 16
-THEN
-tempEIP ← DEST AND 0000FFFFH; (* DEST is r/m16 *)
-IF tempEIP is not within code segment limit THEN #GP(0); FI;
-
3-126 Vol. 2A
CALL—Call Procedure
INSTRUCTION SET REFERENCE, A-L
-FI;
-
+IF OperandSize = 16
+THEN
+tempEIP ← DEST AND 0000FFFFH; (* DEST is r/m16 *)
+IF tempEIP is not within code segment limit THEN #GP(0); FI;
IF stack not large enough for a 2-byte return address
THEN #SS(0); FI;
Push(IP);
EIP ← tempEIP;
-
+FI;
FI;rel/abs
FI; near
IF far call and (PE = 0 or (PE = 1 and VM = 1)) (* Real-address or virtual-8086 mode *)
@@ -69051,16 +69073,15 @@ Depending on type and access rights:
GO TO CONFORMING-CODE-SEGMENT;
GO TO NONCONFORMING-CODE-SEGMENT;
GO TO CALL-GATE;
-GO TO TASK-GATE;
-GO TO TASK-STATE-SEGMENT;
-FI;
-
CALL—Call Procedure
Vol. 2A 3-127
INSTRUCTION SET REFERENCE, A-L
+GO TO TASK-GATE;
+GO TO TASK-STATE-SEGMENT;
+FI;
CONFORMING-CODE-SEGMENT:
IF L bit = 1 and D bit = 1 and IA32_EFER.LMA = 1
THEN GP(new code segment selector); FI;
@@ -69110,16 +69131,16 @@ FI;
END;
NONCONFORMING-CODE-SEGMENT:
IF L-Bit = 1 and D-BIT = 1 and IA32_EFER.LMA = 1
-THEN GP(new code segment selector); FI;
-IF (RPL > CPL) or (DPL ≠ CPL)
-THEN #GP(new code segment selector); FI;
-IF segment not present
3-128 Vol. 2A
CALL—Call Procedure
INSTRUCTION SET REFERENCE, A-L
+THEN GP(new code segment selector); FI;
+IF (RPL > CPL) or (DPL ≠ CPL)
+THEN #GP(new code segment selector); FI;
+IF segment not present
THEN #NP(new code segment selector); FI;
IF stack not large enough for return address
THEN #SS(0); FI;
@@ -69169,16 +69190,16 @@ IF call-gate code-segment selector is NULL
THEN #GP(0); FI;
IF call-gate code-segment selector index is outside descriptor table limits
THEN #GP(call-gate code-segment selector); FI;
-Read call-gate code-segment descriptor;
-IF call-gate code-segment descriptor does not indicate a code segment
-or call-gate code-segment descriptor DPL > CPL
-THEN #GP(call-gate code-segment selector); FI;
CALL—Call Procedure
Vol. 2A 3-129
INSTRUCTION SET REFERENCE, A-L
+Read call-gate code-segment descriptor;
+IF call-gate code-segment descriptor does not indicate a code segment
+or call-gate code-segment descriptor DPL > CPL
+THEN #GP(call-gate code-segment selector); FI;
IF IA32_EFER.LMA = 1 AND (call-gate code-segment descriptor is
not a 64-bit code segment or call-gate code-segment descriptor has both L-bit and D-bit set)
THEN #GP(call-gate code-segment selector); FI;
@@ -69228,10 +69249,6 @@ IF new stack does not have room for parameters plus 16 bytes
THEN #SS(NewSS); FI;
IF CallGate(InstructionPointer) not within new code-segment limit
THEN #GP(0); FI;
-SS ← newSS; (* Segment descriptor information also loaded *)
-ESP ← newESP;
-CS:EIP ← CallGate(CS:InstructionPointer);
-(* Segment descriptor information also loaded *)
3-130 Vol. 2A
@@ -69239,6 +69256,10 @@ CALL—Call Procedure
INSTRUCTION SET REFERENCE, A-L
+SS ← newSS; (* Segment descriptor information also loaded *)
+ESP ← newESP;
+CS:EIP ← CallGate(CS:InstructionPointer);
+(* Segment descriptor information also loaded *)
Push(oldSS:oldESP); (* From calling procedure *)
temp ← parameter count from call gate, masked to 5 bits;
Push(parameters from calling procedure’s stack, temp)
@@ -69288,16 +69309,16 @@ If CallGateSize = 16
THEN
IF stack does not have room for 4 bytes
THEN #SS(0); FI;
-IF CallGate(InstructionPointer) not within code segment limit
-THEN #GP(0); FI;
-CS:IP ← CallGate(CS:instruction pointer);
-(* Segment descriptor information also loaded *)
CALL—Call Procedure
Vol. 2A 3-131
INSTRUCTION SET REFERENCE, A-L
+IF CallGate(InstructionPointer) not within code segment limit
+THEN #GP(0); FI;
+CS:IP ← CallGate(CS:instruction pointer);
+(* Segment descriptor information also loaded *)
Push(oldCS:oldIP); (* Return address to calling procedure *)
ELSE (* CallGateSize = 64 *)
IF pushing 16 bytes on the stack touches non-canonical addresses
@@ -70037,7 +70058,7 @@ Mode
Compat/ Description
Leg Mode
-66 0F AE /7
+NFx 66 0F AE /7
CLFLUSHOPT m8
@@ -72422,13 +72443,13 @@ SSE2
66 0F C2 /r ib
CMPPD xmm1, xmm2/m128, imm8
-VEX.NDS.128.66.0F.WIG C2 /r ib
+VEX.128.66.0F.WIG C2 /r ib
VCMPPD xmm1, xmm2, xmm3/m128,
imm8
-VEX.NDS.256.66.0F.WIG C2 /r ib
+VEX.256.66.0F.WIG C2 /r ib
VCMPPD ymm1, ymm2, ymm3/m256,
imm8
-EVEX.NDS.128.66.0F.W1 C2 /r ib
+EVEX.128.66.0F.W1 C2 /r ib
VCMPPD k1 {k2}, xmm2,
xmm3/m128/m64bcst, imm8
@@ -72451,7 +72472,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F.W1 C2 /r ib
+EVEX.256.66.0F.W1 C2 /r ib
VCMPPD k1 {k2}, ymm2,
ymm3/m256/m64bcst, imm8
@@ -72462,7 +72483,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F.W1 C2 /r ib
+EVEX.512.66.0F.W1 C2 /r ib
VCMPPD k1 {k2}, zmm2,
zmm3/m512/m64bcst{sae}, imm8
@@ -73506,13 +73527,13 @@ SSE
NP 0F C2 /r ib
CMPPS xmm1, xmm2/m128,
imm8
-VEX.NDS.128.0F.WIG C2 /r ib
+VEX.128.0F.WIG C2 /r ib
VCMPPS xmm1, xmm2,
xmm3/m128, imm8
-VEX.NDS.256.0F.WIG C2 /r ib
+VEX.256.0F.WIG C2 /r ib
VCMPPS ymm1, ymm2,
ymm3/m256, imm8
-EVEX.NDS.128.0F.W0 C2 /r ib
+EVEX.128.0F.W0 C2 /r ib
VCMPPS k1 {k2}, xmm2,
xmm3/m128/m32bcst, imm8
@@ -73535,7 +73556,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.0F.W0 C2 /r ib
+EVEX.256.0F.W0 C2 /r ib
VCMPPS k1 {k2}, ymm2,
ymm3/m256/m32bcst, imm8
@@ -73546,7 +73567,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.0F.W0 C2 /r ib
+EVEX.512.0F.W0 C2 /r ib
VCMPPS k1 {k2}, zmm2,
zmm3/m512/m32bcst{sae}, imm8
@@ -74435,10 +74456,10 @@ SSE2
F2 0F C2 /r ib
CMPSD xmm1, xmm2/m64, imm8
-VEX.NDS.LIG.F2.0F.WIG C2 /r ib
+VEX.LIG.F2.0F.WIG C2 /r ib
VCMPSD xmm1, xmm2,
xmm3/m64, imm8
-EVEX.NDS.LIG.F2.0F.W1 C2 /r ib
+EVEX.LIG.F2.0F.W1 C2 /r ib
VCMPSD k1 {k2}, xmm2,
xmm3/m64{sae}, imm8
@@ -74870,10 +74891,10 @@ SSE
F3 0F C2 /r ib
CMPSS xmm1, xmm2/m32, imm8
-VEX.NDS.LIG.F3.0F.WIG C2 /r ib
+VEX.LIG.F3.0F.WIG C2 /r ib
VCMPSS xmm1, xmm2, xmm3/m32,
imm8
-EVEX.NDS.LIG.F3.0F.W0 C2 /r ib
+EVEX.LIG.F3.0F.W0 C2 /r ib
VCMPSS k1 {k2}, xmm2,
xmm3/m32{sae}, imm8
@@ -75583,7 +75604,7 @@ written into the destination. (The processor never produces a locked read withou
In 64-bit mode, default operation size is 64 bits. Use of the REX.W prefix promotes operation to 128 bits. Note that
CMPXCHG16B requires that the destination (memory) operand be 16-byte aligned. See the summary chart at the
beginning of this section for encoding data and limits. For information on the CPUID flag that indicates
-CMPXCHG16B, see page 3-208.
+CMPXCHG16B, see page 3-210.
IA-32 Architecture Compatibility
This instruction encoding is not supported on Intel processors earlier than the Pentium processors.
@@ -76064,11 +76085,12 @@ CPUID
Table 3-8 shows information returned, depending on the initial value loaded into the EAX register.
Two types of information are returned: basic and extended function information. If a value entered for CPUID.EAX
is higher than the maximum input value for basic or extended function for that processor then the data for the
-highest basic information leaf is returned. For example, using the Intel Core i7 processor, the following is true:
+highest basic information leaf is returned. For example, using some Intel processors, the following is true:
CPUID.EAX = 05H (* Returns MONITOR/MWAIT leaf. *)
CPUID.EAX = 0AH (* Returns Architectural Performance Monitoring leaf. *)
-CPUID.EAX = 0BH (* Returns Extended Topology Enumeration leaf. *)
+CPUID.EAX = 0BH (* Returns Extended Topology Enumeration leaf. *)2
CPUID.EAX = 0CH (* INVALID: Returns the same information as CPUID.EAX = 0BH. *)
+CPUID.EAX =1FH (* Returns V2 Extended Topology Enumeration leaf. *)2
CPUID.EAX = 80000008H (* Returns linear/physical address size data. *)
CPUID.EAX = 8000000AH (* INVALID: Returns same information as CPUID.EAX = 0BH. *)
If a value entered for CPUID.EAX is less than or equal to the maximum input value and the leaf is not supported on
@@ -76082,8 +76104,9 @@ See also:
“Serializing Instructions” in Chapter 8, “Multiple-Processor Management,” in the Intel® 64 and IA-32 Architectures
Software Developer’s Manual, Volume 3A.
“Caching Translation Information” in Chapter 4, “Paging,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.
-
1. On Intel 64 processors, CPUID clears the high 32 bits of the RAX/RBX/RCX/RDX registers in all modes.
+2. CPUID leaf 1FH is a preferred superset to leaf 0BH. Intel recommends first checking for the existence of CPUID leaf 1FH before using
+leaf 0BH.
3-190 Vol. 2A
CPUID—CPU Identification
@@ -76126,7 +76149,7 @@ EBX
Bits 07 - 00: Brand Index.
Bits 15 - 08: CLFLUSH line size (Value ∗ 8 = cache line size in bytes; used also by CLFLUSHOPT).
Bits 23 - 16: Maximum number of addressable IDs for logical processors in this physical package*.
-Bits 31 - 24: Initial APIC ID.
+Bits 31 - 24: Initial APIC ID**.
ECX
@@ -76139,6 +76162,8 @@ NOTES:
* The nearest power-of-2 integer that is not smaller than EBX[23:16] is the number of unique initial APIC
IDs reserved for addressing different logical processors in a physical package. This field is only valid if
CPUID.1.EDX.HTT[bit 28]= 1.
+** The 8-bit initial APIC ID in EBX[31:24] is replaced by the 32-bit x2APIC ID, available in Leaf 0BH and
+Leaf 1FH.
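
As a rough illustration of the CPUID.01H:EBX layout described above, the following Python sketch splits a raw EBX value into its four fields and applies the power-of-two rounding from the first note. The function name and dictionary keys are hypothetical, chosen only for this example; the caller is assumed to have obtained the EBX value elsewhere.

```python
def decode_leaf1_ebx(ebx):
    """Split a CPUID.01H:EBX value into the fields listed in Table 3-8.

    Hypothetical helper: the name and the returned keys are illustrative only.
    """
    max_ids = (ebx >> 16) & 0xFF
    # Note (*): the number of unique initial APIC IDs reserved per package is
    # the nearest power of two that is not smaller than EBX[23:16].
    # This field is only meaningful when CPUID.1.EDX.HTT[bit 28] = 1.
    reserved_ids = 1 if max_ids <= 1 else 1 << (max_ids - 1).bit_length()
    return {
        "brand_index": ebx & 0xFF,                       # bits 07 - 00
        "clflush_line_bytes": ((ebx >> 8) & 0xFF) * 8,   # bits 15 - 08, value * 8
        "max_addressable_ids": max_ids,                  # bits 23 - 16
        "reserved_apic_ids": reserved_ids,
        "initial_apic_id": (ebx >> 24) & 0xFF,           # bits 31 - 24 (8-bit only)
    }
```

Per the second note, the 8-bit initial APIC ID is superseded by the 32-bit x2APIC ID from leaf 0BH or 1FH, so a caller would normally prefer that value when it is available.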
02H
@@ -76187,7 +76212,7 @@ Deterministic Cache Parameters Leaf
NOTES:
Leaf 04H output depends on the initial value in ECX.*
-See also: “INPUT EAX = 04H: Returns Deterministic Cache Parameters for Each Level” on page 216.
+See also: “INPUT EAX = 04H: Returns Deterministic Cache Parameters for Each Level” on page 218.
EAX
CPUID—CPU Identification
@@ -76405,7 +76430,9 @@ Bit 01: AVX512_VBMI.
Bit 02: UMIP. Supports user-mode instruction prevention if 1.
Bit 03: PKU. Supports protection keys for user-mode pages if 1.
Bit 04: OSPKE. If 1, OS has set CR4.PKE to enable protection keys (and the RDPKRU/WRPKRU instructions).
-Bits 16 - 5: Reserved.
+Bits 13 - 05: Reserved.
+Bit 14: AVX512_VPOPCNTDQ. (Intel® Xeon Phi™ only.)
+Bits 16 - 15: Reserved.
Bits 21 - 17: The value of MAWAU used by the BNDLDX and BNDSTX instructions in 64-bit mode.
Bit 22: RDPID and IA32_TSC_AUX are available if 1.
Bits 29 - 23: Reserved.
@@ -76423,7 +76450,20 @@ Value
Information Provided about the Processor
EDX
-Reserved.
+Bit 01: Reserved.
+Bit 02: AVX512_4VNNIW. (Intel® Xeon Phi™ only.)
+Bit 03: AVX512_4FMAPS. (Intel® Xeon Phi™ only.)
+Bits 25 - 04: Reserved.
+Bit 26: Enumerates support for indirect branch restricted speculation (IBRS) and the indirect branch predictor barrier (IBPB). Processors that set this bit support the IA32_SPEC_CTRL MSR and the
+IA32_PRED_CMD MSR. They allow software to set IA32_SPEC_CTRL[0] (IBRS) and IA32_PRED_CMD[0]
+(IBPB).
+Bit 27: Enumerates support for single thread indirect branch predictors (STIBP). Processors that set this
+bit support the IA32_SPEC_CTRL MSR. They allow software to set IA32_SPEC_CTRL[1] (STIBP).
+Bit 28: Enumerates support for L1D_FLUSH. Processors that set this bit support the IA32_FLUSH_CMD
+MSR. They allow software to set IA32_FLUSH_CMD[0] (L1D_FLUSH).
+Bit 29: Enumerates support for the IA32_ARCH_CAPABILITIES MSR.
+Bit 30: Reserved.
+Bit 31: Enumerates support for Speculative Store Bypass Disable (SSBD). Processors that set this bit support the IA32_SPEC_CTRL MSR. They allow software to set IA32_SPEC_CTRL[2] (SSBD).
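
The EDX bit assignments added above for leaf 07H (sub-leaf 0) can be checked with a small lookup table. This is a minimal sketch: the short feature labels and the function name are assumptions made for the example, not names defined by the manual, and the EDX value is assumed to come from a prior CPUID.(EAX=07H, ECX=0) query.

```python
# Bit positions taken from the list above; the labels are illustrative.
LEAF7_EDX_BITS = {
    2: "AVX512_4VNNIW",
    3: "AVX512_4FMAPS",
    26: "IBRS_IBPB",               # IA32_SPEC_CTRL[0] and IA32_PRED_CMD[0]
    27: "STIBP",                   # IA32_SPEC_CTRL[1]
    28: "L1D_FLUSH",               # IA32_FLUSH_CMD[0]
    29: "IA32_ARCH_CAPABILITIES",
    31: "SSBD",                    # IA32_SPEC_CTRL[2]
}


def leaf7_edx_features(edx):
    """Return the labels of all enumerated bits set in EDX, lowest bit first."""
    return [name for bit, name in sorted(LEAF7_EDX_BITS.items()) if (edx >> bit) & 1]
```

Reserved bits are deliberately absent from the table, so a set reserved bit is ignored rather than reported.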
NOTE:
* If ECX contains an invalid sub-leaf index, EAX/EBX/ECX/EDX return 0. Sub-leaf index n is invalid if n
exceeds the value that sub-leaf 0 returns in EAX.
@@ -76480,10 +76520,24 @@ Bits 14 - 13: Reserved = 0.
Bit 15: AnyThread deprecation.
Bits 31 - 16: Reserved = 0.
+CPUID—CPU Identification
+
+Vol. 2A 3-195
+
+ INSTRUCTION SET REFERENCE, A-L
+
+Table 3-8. Information Returned by CPUID Instruction (Contd.)
+Initial EAX
+Value
+
+Information Provided about the Processor
Extended Topology Enumeration Leaf
+
0BH
NOTES:
+CPUID leaf 1FH is a preferred superset to leaf 0BH. Intel recommends first checking for the existence
+of Leaf 1FH before using leaf 0BH.
Most of Leaf 0BH output depends on the initial value in ECX.
The EDX output of leaf 0BH is always valid and does not vary with input value in ECX.
Output value in ECX[7:0] always equals input value in ECX[7:0].
@@ -76503,17 +76557,6 @@ Bits 15 - 00: Number of logical processors at this level type. The number reflec
by Intel**.
Bits 31- 16: Reserved.
-CPUID—CPU Identification
-
-Vol. 2A 3-195
-
- INSTRUCTION SET REFERENCE, A-L
-
-Table 3-8. Information Returned by CPUID Instruction (Contd.)
-Initial EAX
-Value
-
-Information Provided about the Processor
ECX
Bits 07 - 00: Level number. Same value in ECX input.
@@ -76535,7 +76578,19 @@ and platform hardware configurations.
2: Core.
3-255: Reserved.
+3-196 Vol. 2A
+
+CPUID—CPU Identification
+
+ INSTRUCTION SET REFERENCE, A-L
+
+Table 3-8. Information Returned by CPUID Instruction (Contd.)
+Initial EAX
+Value
+
+Information Provided about the Processor
Processor Extended State Enumeration Main Leaf (EAX = 0DH, ECX = 0)
+
0DH
NOTES:
@@ -76576,8 +76631,6 @@ Bits 31 - 00: Reserved.
Processor Extended State Enumeration Sub-leaf (EAX = 0DH, ECX = 1)
0DH
-3-196 Vol. 2A
-
EAX
Bit 00: XSAVEOPT is available.
@@ -76590,15 +76643,6 @@ EBX
Bits 31 - 00: The size in bytes of the XSAVE area containing all states enabled by XCR0 | IA32_XSS.
-CPUID—CPU Identification
-
- INSTRUCTION SET REFERENCE, A-L
-
-Table 3-8. Information Returned by CPUID Instruction (Contd.)
-Initial EAX
-Value
-
-Information Provided about the Processor
ECX
Bits 31 - 00: Reports the supported bits of the lower 32 bits of the IA32_XSS MSR. IA32_XSS[n] can be
@@ -76616,7 +76660,19 @@ Bits 31 - 00: Reports the supported bits of the upper 32 bits of the IA32_XSS MS
be set to 1 only if EDX[n] is 1.
Bits 31 - 00: Reserved.
+CPUID—CPU Identification
+
+Vol. 2A 3-197
+
+ INSTRUCTION SET REFERENCE, A-L
+
+Table 3-8. Information Returned by CPUID Instruction (Contd.)
+Initial EAX
+Value
+
+Information Provided about the Processor
Processor Extended State Enumeration Sub-leaves (EAX = 0DH, ECX = n, n > 1)
+
0DH
NOTES:
@@ -76682,19 +76738,8 @@ NOTES:
Leaf 0FH output depends on the initial value in ECX.
EAX
-CPUID—CPU Identification
-
Reserved.
-Vol. 2A 3-197
-
- INSTRUCTION SET REFERENCE, A-L
-
-Table 3-8. Information Returned by CPUID Instruction (Contd.)
-Initial EAX
-Value
-
-Information Provided about the Processor
EBX
Bits 31 - 00: Conversion factor from reported IA32_QM_CTR value to occupancy metric (bytes).
@@ -76713,9 +76758,20 @@ Bits 31 - 03: Reserved.
Intel Resource Director Technology (Intel RDT) Allocation Enumeration Sub-leaf (EAX = 10H, ECX = 0)
10H
+3-198 Vol. 2A
+
NOTES:
Leaf 10H output depends on the initial value in ECX.
Sub-leaf index 0 reports valid resource identification (ResID) starting at bit position 1 of EBX.
+CPUID—CPU Identification
+
+ INSTRUCTION SET REFERENCE, A-L
+
+Table 3-8. Information Returned by CPUID Instruction (Contd.)
+Initial EAX
+Value
+
+Information Provided about the Processor
EAX
Reserved.
@@ -76787,20 +76843,8 @@ Bits 31 - 16: Reserved.
Memory Bandwidth Allocation Enumeration Sub-leaf (EAX = 10H, ECX = ResID =3)
10H
-3-198 Vol. 2A
-
NOTES:
Leaf 10H output depends on the initial value in ECX.
-
-CPUID—CPU Identification
-
- INSTRUCTION SET REFERENCE, A-L
-
-Table 3-8. Information Returned by CPUID Instruction (Contd.)
-Initial EAX
-Value
-
-Information Provided about the Processor
EAX
Bits 11 - 00: Reports the maximum MBA throttling value supported for the corresponding ResID using
@@ -76822,6 +76866,17 @@ EDX
Bits 15 - 00: Highest COS number supported for this ResID.
Bits 31 - 16: Reserved.
+CPUID—CPU Identification
+
+Vol. 2A 3-199
+
+ INSTRUCTION SET REFERENCE, A-L
+
+Table 3-8. Information Returned by CPUID Instruction (Contd.)
+Initial EAX
+Value
+
+Information Provided about the Processor
Intel SGX Capability Enumeration Leaf, sub-leaf 0 (EAX = 12H, ECX = 0)
NOTES:
Leaf 12H sub-leaf 0 (ECX = 0) is supported if CPUID.(EAX=07H, ECX=0H):EBX[SGX] = 1.
@@ -76835,7 +76890,7 @@ Bits 04 - 02: Reserved.
Bit 05: If 1, indicates Intel SGX supports ENCLV instruction leaves EINCVIRTCHILD, EDECVIRTCHILD, and
ESETCONTEXT.
Bit 06: If 1, indicates Intel SGX supports ENCLS instruction leaves ETRACKC, ERDINFO, ELDBC, and ELDUC.
-Bits 31 - 02: Reserved.
+Bits 31 - 07: Reserved.
EBX
@@ -76894,9 +76949,9 @@ Type
0000b. This sub-leaf is invalid.
EDX:ECX:EBX:EAX return 0.
-CPUID—CPU Identification
+3-200 Vol. 2A
-Vol. 2A 3-199
+CPUID—CPU Identification
INSTRUCTION SET REFERENCE, A-L
@@ -76963,8 +77018,6 @@ Bits 31 - 00: Reserved.
Intel Processor Trace Enumeration Sub-leaf (EAX = 14H, ECX = 1)
14H
-3-200 Vol. 2A
-
EAX
Bits 02 - 00: Number of configurable Address Ranges for filtering.
@@ -76979,8 +77032,11 @@ Bit 31 - 16: Bitmap of supported Configurable PSB frequency encodings.
ECX
Bits 31 - 00: Reserved.
+
CPUID—CPU Identification
+Vol. 2A 3-201
+
INSTRUCTION SET REFERENCE, A-L
Table 3-8. Information Returned by CPUID Instruction (Contd.)
@@ -77051,6 +77107,8 @@ zero are not supported.
System-On-Chip Vendor Attribute Enumeration Main Leaf (EAX = 17H, ECX = 0)
17H
+3-202 Vol. 2A
+
NOTES:
Leaf 17H main leaf (ECX = 0).
Leaf 17H output depends on the initial value in ECX.
@@ -77075,11 +77133,8 @@ Bits 31 - 00: Project ID. A unique number an SOC vendor assigns to its SOC proje
EDX
Bits 31 - 00: Stepping ID. A unique number within an SOC project that an SOC vendor assigns.
-
CPUID—CPU Identification
-Vol. 2A 3-201
-
INSTRUCTION SET REFERENCE, A-L
Table 3-8. Information Returned by CPUID Instruction (Contd.)
@@ -77136,8 +77191,6 @@ Bits 31 - 00: Reserved = 0.
Deterministic Address Translation Parameters Main Leaf (EAX = 18H, ECX = 0)
18H
-3-202 Vol. 2A
-
NOTES:
Each sub-leaf enumerates a different address translation structure.
If ECX contains an invalid sub-leaf index, EAX/EBX/ECX/EDX return 0. Sub-leaf index n is invalid if n
@@ -77170,6 +77223,8 @@ Bits 31 - 00: S = Number of Sets.
CPUID—CPU Identification
+Vol. 2A 3-203
+
INSTRUCTION SET REFERENCE, A-L
Table 3-8. Information Returned by CPUID Instruction (Contd.)
@@ -77194,6 +77249,8 @@ Bits 31 - 26: Reserved.
Deterministic Address Translation Parameters Sub-leaf (EAX = 18H, ECX ≥ 1)
18H
+3-204 Vol. 2A
+
NOTES:
Each sub-leaf enumerates a different address translation structure.
If ECX contains an invalid sub-leaf index, EAX/EBX/ECX/EDX return 0. Sub-leaf index n is invalid if n
@@ -77238,17 +77295,8 @@ Bits 13 - 09: Reserved.
Bits 25- 14: Maximum number of addressable IDs for logical processors sharing this translation cache**
Bits 31 - 26: Reserved.
-Unimplemented CPUID Leaf Functions
-40000000H
-4FFFFFFFH
-
CPUID—CPU Identification
-Invalid. No existing or future CPU will return processor identification or feature information if the initial
-EAX value is in the range 40000000H to 4FFFFFFFH.
-
-Vol. 2A 3-203
-
INSTRUCTION SET REFERENCE, A-L
Table 3-8. Information Returned by CPUID Instruction (Contd.)
@@ -77256,6 +77304,62 @@ Initial EAX
Value
Information Provided about the Processor
+V2 Extended Topology Enumeration Leaf
+
+1FH
+
+NOTES:
+CPUID leaf 1FH is a preferred superset to leaf 0BH. Intel recommends first checking for the existence
+of Leaf 1FH and using this if available.
+Most of Leaf 1FH output depends on the initial value in ECX.
+The EDX output of leaf 1FH is always valid and does not vary with input value in ECX.
+Output value in ECX[7:0] always equals input value in ECX[7:0].
+Sub-leaf index 0 enumerates SMT level. Each subsequent higher sub-leaf index enumerates a higher-level topological entity in hierarchical order.
+For sub-leaves that return an invalid level-type of 0 in ECX[15:8]; EAX and EBX will return 0.
+If an input value n in ECX returns the invalid level-type of 0 in ECX[15:8], other input values with ECX >
+n also return 0 in ECX[15:8].
+EAX
+
+Bits 04 - 00: Number of bits to shift right on x2APIC ID to get a unique topology ID of the next level type*.
+All logical processors with the same next level ID share current level.
+Bits 31 - 05: Reserved.
+
+EBX
+
+Bits 15 - 00: Number of logical processors at this level type. The number reflects configuration as shipped
+by Intel**.
+Bits 31- 16: Reserved.
+
+ECX
+
+Bits 07 - 00: Level number. Same value in ECX input.
+Bits 15 - 08: Level type***.
+Bits 31 - 16: Reserved.
+
+EDX
+
+Bits 31 - 00: x2APIC ID of the current logical processor.
+NOTES:
+* Software should use this field (EAX[4:0]) to enumerate processor topology of the system.
+** Software must not use EBX[15:0] to enumerate processor topology of the system. This value in this
+field (EBX[15:0]) is only intended for display/diagnostic purposes. The actual number of logical processors
+available to BIOS/OS/Applications may be different from the value of EBX[15:0], depending on software
+and platform hardware configurations.
+*** The value of the “level type” field is not related to level numbers in any way, higher “level type” values do not mean higher levels. Level type field has the following encoding:
+0: Invalid.
+1: SMT.
+2: Core.
+3: Module.
+4: Tile.
+5: Die.
+6-255: Reserved.
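
To make the leaf 1FH register layout and level-type encoding above concrete, here is a short Python sketch that decodes one sub-leaf and derives the next-level topology ID from the x2APIC ID. The function names and dictionary keys are hypothetical; the register values are assumed to come from a CPUID.(EAX=1FH, ECX=n) query performed elsewhere.

```python
# Level-type encoding from note (***); values 6-255 are reserved.
LEVEL_TYPES = {0: "Invalid", 1: "SMT", 2: "Core", 3: "Module", 4: "Tile", 5: "Die"}


def decode_leaf1f_subleaf(eax, ebx, ecx, edx):
    """Decode one sub-leaf of CPUID leaf 1FH (illustrative helper)."""
    level_type = (ecx >> 8) & 0xFF
    return {
        "shift": eax & 0x1F,                 # EAX[4:0]: use this to enumerate topology
        "logical_processors": ebx & 0xFFFF,  # EBX[15:0]: display/diagnostic use only
        "level_number": ecx & 0xFF,          # ECX[7:0]: echoes the input sub-leaf
        "level_type": LEVEL_TYPES.get(level_type, "Reserved"),
        "x2apic_id": edx & 0xFFFFFFFF,       # EDX[31:0]
    }


def next_level_id(x2apic_id, shift):
    """Shift the x2APIC ID right to get the unique ID of the next level type."""
    return x2apic_id >> shift
```

Two logical processors whose `next_level_id` values match share the current level, which is how the shift count in EAX[4:0] is meant to be used.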
+
+Unimplemented CPUID Leaf Functions
+40000000H
+4FFFFFFFH
+
+Invalid. No existing or future CPU will return processor identification or feature information if the initial
+EAX value is in the range 40000000H to 4FFFFFFFH.
Extended Function CPUID Information
80000000H EAX
@@ -77270,12 +77374,22 @@ ECX
Reserved.
-EDX
+CPUID—CPU Identification
-Reserved.
+Vol. 2A 3-205
+
+ INSTRUCTION SET REFERENCE, A-L
+
+Table 3-8. Information Returned by CPUID Instruction (Contd.)
+Initial EAX
+Value
+
+Information Provided about the Processor
+EDX
80000001H EAX
+Reserved.
Extended Processor Signature and Feature Bits.
EBX
@@ -77354,7 +77468,7 @@ ECX
EDX
-3-204 Vol. 2A
+3-206 Vol. 2A
Bits 07 - 00: Cache Line size in bytes.
Bits 11 - 08: Reserved.
@@ -77373,14 +77487,25 @@ Value
Information Provided about the Processor
NOTES:
* L2 associativity field encodings:
-00H - Disabled.
-01H - Direct mapped.
-02H - 2-way.
-04H - 4-way.
-06H - 8-way.
-08H - 16-way.
-0FH - Fully associative.
-
+00H - Disabled
+01H - 1 way (direct mapped)
+02H - 2 ways
+03H - Reserved
+04H - 4 ways
+05H - Reserved
+06H - 8 ways
+07H - See CPUID leaf 04H, sub-leaf 2**
+
+08H - 16 ways
+09H - Reserved
+0AH - 32 ways
+0BH - 48 ways
+0CH - 64 ways
+0DH - 96 ways
+0EH - 128 ways
+0FH - Fully associative
+
+** CPUID leaf 04H provides details of deterministic cache parameters, including the L2 cache in sub-leaf 2
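
The revised L2 associativity encodings above map naturally onto a lookup table. The sketch below is illustrative only: it takes the 4-bit field value directly (extracting it from the register is left to the caller), and the function name is an assumption of this example.

```python
# Encodings as listed in the updated note; 03H, 05H and 09H are reserved.
L2_ASSOCIATIVITY = {
    0x0: "Disabled",
    0x1: "1 way (direct mapped)",
    0x2: "2 ways",
    0x4: "4 ways",
    0x6: "8 ways",
    0x7: "See CPUID leaf 04H, sub-leaf 2",
    0x8: "16 ways",
    0xA: "32 ways",
    0xB: "48 ways",
    0xC: "64 ways",
    0xD: "96 ways",
    0xE: "128 ways",
    0xF: "Fully associative",
}


def l2_associativity(field):
    """Translate the 4-bit L2 associativity field into its description."""
    return L2_ASSOCIATIVITY.get(field & 0xF, "Reserved")
```

A value of 07H means the field cannot express the real associativity, so a caller would fall back to the deterministic cache parameters in leaf 04H.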
80000007H EAX
EBX
ECX
@@ -77430,7 +77555,7 @@ the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.
CPUID—CPU Identification
-Vol. 2A 3-205
+Vol. 2A 3-207
INSTRUCTION SET REFERENCE, A-L
@@ -77516,13 +77641,6 @@ Intel reserved
NOTE
See Chapter 19 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1,
for information on identifying earlier IA-32 processors.
-
-3-206 Vol. 2A
-
-CPUID—CPU Identification
-
- INSTRUCTION SET REFERENCE, A-L
-
The Extended Family ID needs to be examined only when the Family ID is 0FH. Integrate the fields into a display
using the following rule:
IF Family_ID ≠ 0FH
@@ -77531,6 +77649,13 @@ ELSE DisplayFamily = Extended_Family_ID + Family_ID;
(* Right justify and zero-extend 4-bit field. *)
FI;
(* Show DisplayFamily as HEX field. *)
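
The DisplayFamily rule above can be sketched in Python as follows. The function name is hypothetical, and the two inputs are assumed to be the already-extracted Family_ID and Extended_Family_ID fields. The branch for Family_ID other than 0FH falls between the two hunks and is not shown in this diff; the sketch assumes it simply yields Family_ID.

```python
def display_family(family_id, extended_family_id):
    """Combine Family_ID and Extended_Family_ID per the rule above."""
    if family_id != 0x0F:
        # Assumed branch (not visible in this hunk): use Family_ID as-is.
        return family_id
    # Family_ID = 0FH: add the extended field to the 4-bit family field.
    return extended_family_id + family_id
```

For example, `hex(display_family(0x0F, 0x01))` gives the HEX form the manual asks for when showing the result.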
+
+3-208 Vol. 2A
+
+CPUID—CPU Identification
+
+ INSTRUCTION SET REFERENCE, A-L
+
The Extended Model ID needs to be examined only when the Family ID is 06H or 0FH. Integrate the field into a
display using the following rule:
IF (Family_ID = 06H or Family_ID = 0FH)
@@ -77576,7 +77701,7 @@ prior to using the feature. Software should not depend on future offerings retai
CPUID—CPU Identification
-Vol. 2A 3-207
+Vol. 2A 3-209
INSTRUCTION SET REFERENCE, A-L
@@ -77697,7 +77822,7 @@ SSSE3
A value of 1 indicates the presence of the Supplemental Streaming SIMD Extensions 3 (SSSE3). A
value of 0 indicates the instruction extensions are not present in the processor.
-3-208 Vol. 2A
+3-210 Vol. 2A
CPUID—CPU Identification
@@ -77857,7 +77982,7 @@ Always returns 0.
CPUID—CPU Identification
-Vol. 2A 3-209
+Vol. 2A 3-211
INSTRUCTION SET REFERENCE, A-L
@@ -77900,7 +78025,7 @@ OM16523
Figure 3-8. Feature Information Returned in the EDX Register
-3-210 Vol. 2A
+3-212 Vol. 2A
CPUID—CPU Identification
@@ -78067,7 +78192,7 @@ Reserved
CPUID—CPU Identification
-Vol. 2A 3-211
+Vol. 2A 3-213
INSTRUCTION SET REFERENCE, A-L
@@ -78182,7 +78307,7 @@ registers is not defined; that is, specific bytes are not designated to contain
prefetch, or TLB types. The descriptors may appear in any order. Note also a processor may report a general
descriptor type (FFH) and not report any byte descriptor of “cache type” via CPUID leaf 2.
-3-212 Vol. 2A
+3-214 Vol. 2A
CPUID—CPU Identification
@@ -78433,7 +78558,7 @@ Instruction TLB: 4 KByte pages, 32 entries
CPUID—CPU Identification
-Vol. 2A 3-213
+Vol. 2A 3-215
INSTRUCTION SET REFERENCE, A-L
@@ -78685,7 +78810,7 @@ Cache
2nd-level cache: 1 MByte, 8-way set associative, 64 byte line size
-3-214 Vol. 2A
+3-216 Vol. 2A
CPUID—CPU Identification
@@ -78906,7 +79031,7 @@ CPUID leaf 2 does not report cache descriptor information, use CPUID leaf 4 to q
CPUID—CPU Identification
-Vol. 2A 3-215
+Vol. 2A 3-217
INSTRUCTION SET REFERENCE, A-L
@@ -78973,7 +79098,7 @@ MWAIT instruction. The MWAIT instruction optionally provides additional extensio
INPUT EAX = 06H: Returns Thermal and Power Management Features
When CPUID executes with EAX set to 06H, the processor returns information about thermal and power management features. See Table 3-8.
-3-216 Vol. 2A
+3-218 Vol. 2A
CPUID—CPU Identification
@@ -78999,6 +79124,8 @@ described in Chapter 23, “Introduction to Virtual-Machine Extensions,” in th
Software Developer’s Manual, Volume 3C.
INPUT EAX = 0BH: Returns Extended Topology Information
+CPUID leaf 1FH is a preferred superset to leaf 0BH. Intel recommends first checking for the existence of Leaf 1FH
+before using leaf 0BH.
When CPUID executes with EAX set to 0BH, the processor returns information about extended topology enumeration data. Software must detect the presence of CPUID leaf 0BH by verifying (a) the highest leaf index supported
by CPUID is >= 0BH, and (b) CPUID.0BH:EBX[15:0] reports a non-zero value. See Table 3-8.
@@ -79023,16 +79150,15 @@ to a specific resource type if the bit is set. The bit position corresponds to t
When CPUID executes with EAX set to 0FH and ECX = n (n >= 1, and is a valid ResID), the processor returns information software can use to program IA32_PQR_ASSOC, IA32_QM_EVTSEL MSRs before reading QoS data from the
IA32_QM_CTR MSR.
-INPUT EAX = 10H: Returns Intel Resource Director Technology (Intel RDT) Allocation Enumeration Information
-When CPUID executes with EAX set to 10H and ECX = 0, the processor returns information about the bit-vector
-representation of QoS Enforcement resource types that are supported in the processor. Each bit, starting from bit
-
CPUID—CPU Identification
-Vol. 2A 3-217
+Vol. 2A 3-219
INSTRUCTION SET REFERENCE, A-L
+INPUT EAX = 10H: Returns Intel Resource Director Technology (Intel RDT) Allocation Enumeration Information
+When CPUID executes with EAX set to 10H and ECX = 0, the processor returns information about the bit-vector
+representation of QoS Enforcement resource types that are supported in the processor. Each bit, starting from bit
1, corresponds to a specific resource type if the bit is set. The bit position corresponds to the sub-leaf index (or
ResID) that software must use to query QoS enforcement capability available for that type. See Table 3-8.
When CPUID executes with EAX set to 10H and ECX = n (n >= 1, and is a valid ResID), the processor returns information about available classes of service and range of QoS mask MSRs that software can use to configure each
@@ -79066,10 +79192,21 @@ INPUT EAX = 18H: Returns Deterministic Address Translation Parameters Informatio
When CPUID executes with EAX set to 18H, the processor returns information about the Deterministic Address
Translation Parameters. See Table 3-8.
+INPUT EAX = 1FH: Returns V2 Extended Topology Information
+When CPUID executes with EAX set to 1FH, the processor returns information about extended topology enumeration data. Software must detect the presence of CPUID leaf 1FH by verifying (a) the highest leaf index supported
+by CPUID is >= 1FH, and (b) CPUID.1FH:EBX[15:0] reports a non-zero value. See Table 3-8.
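
The detection steps for leaves 1FH and 0BH, together with the recommendation to prefer 1FH, might be combined as in the sketch below. It is a minimal illustration under stated assumptions: `max_basic_leaf` is the highest basic leaf index supported by CPUID, and `ebx_by_leaf` is a hypothetical mapping from leaf number to the EBX value returned for sub-leaf 0 of that leaf, both gathered by the caller.

```python
def pick_topology_leaf(max_basic_leaf, ebx_by_leaf):
    """Return 0x1F or 0x0B, whichever usable topology leaf is preferred, else None."""
    for leaf in (0x1F, 0x0B):  # check 1FH first, then fall back to 0BH
        supported = max_basic_leaf >= leaf                   # condition (a)
        populated = (ebx_by_leaf.get(leaf, 0) & 0xFFFF) != 0  # condition (b): EBX[15:0] non-zero
        if supported and populated:
            return leaf
    return None
```

Returning `None` leaves the choice of a legacy fallback to the caller rather than guessing one here.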
+
METHODS FOR RETURNING BRANDING INFORMATION
Use the following techniques to access branding information:
1. Processor brand string method.
2. Processor brand index; this method uses a software supplied brand string table.
+
+3-220 Vol. 2A
+
+CPUID—CPU Identification
+
+ INSTRUCTION SET REFERENCE, A-L
+
These two methods are discussed in the following sections. For methods that are available in early processors, see
Section: “Identification of Earlier IA-32 Processors” in Chapter 19 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1.
@@ -79079,12 +79216,6 @@ should execute this algorithm on all Intel 64 and IA-32 processors.
This method (introduced with Pentium 4 processors) returns an ASCII brand identification string and the Processor
Base frequency of the processor to the EAX, EBX, ECX, and EDX registers.
-3-218 Vol. 2A
-
-CPUID—CPU Identification
-
- INSTRUCTION SET REFERENCE, A-L
-
Input: EAX=
0x80000000
CPUID
@@ -79125,7 +79256,7 @@ value, CPUID returns 16 ASCII characters using EAX, EBX, ECX, and EDX. The retur
CPUID—CPU Identification
-Vol. 2A 3-219
+Vol. 2A 3-221
INSTRUCTION SET REFERENCE, A-L
@@ -79247,7 +79378,7 @@ OM15195
Figure 3-10. Algorithm for Extracting Processor Frequency
-3-220 Vol. 2A
+3-222 Vol. 2A
CPUID—CPU Identification
@@ -79368,7 +79499,7 @@ Intel486 processor.
CPUID—CPU Identification
-Vol. 2A 3-221
+Vol. 2A 3-223
INSTRUCTION SET REFERENCE, A-L
@@ -79424,7 +79555,7 @@ ECX ← MONITOR/MWAIT Leaf;
EDX ← MONITOR/MWAIT Leaf;
BREAK;
-3-222 Vol. 2A
+3-224 Vol. 2A
CPUID—CPU Identification
@@ -79486,7 +79617,7 @@ EDX ← Reserved = 0;
BREAK;
CPUID—CPU Identification
-Vol. 2A 3-223
+Vol. 2A 3-225
INSTRUCTION SET REFERENCE, A-L
@@ -79538,18 +79669,24 @@ EBX ← Deterministic Address Translation Parameters Enumeration Leaf;
ECX ←Deterministic Address Translation Parameters Enumeration Leaf;
EDX ← Deterministic Address Translation Parameters Enumeration Leaf;
BREAK;
-EAX = 80000000H:
-EAX ← Highest extended function input value understood by CPUID;
-EBX ← Reserved;
-ECX ← Reserved;
-EDX ← Reserved;
+EAX = 1FH:
+EAX ← V2 Extended Topology Enumeration Leaf; (* See Table 3-8. *)
+EBX ← V2 Extended Topology Enumeration Leaf;
+ECX ← V2 Extended Topology Enumeration Leaf;
+EDX ← V2 Extended Topology Enumeration Leaf;
BREAK;
-3-224 Vol. 2A
+3-226 Vol. 2A
CPUID—CPU Identification
INSTRUCTION SET REFERENCE, A-L
+EAX = 80000000H:
+EAX ← Highest extended function input value understood by CPUID;
+EBX ← Reserved;
+ECX ← Reserved;
+EDX ← Reserved;
+BREAK;
EAX = 80000001H:
EAX ← Reserved;
EBX ← Reserved;
@@ -79598,18 +79735,18 @@ EBX ← Reserved = Virtual Address Size Information;
ECX ← Reserved = 0;
EDX ← Reserved = 0;
BREAK;
+CPUID—CPU Identification
+
+Vol. 2A 3-227
+
+ INSTRUCTION SET REFERENCE, A-L
+
EAX >= 40000000H and EAX <= 4FFFFFFFH:
DEFAULT: (* EAX = Value outside of recognized range for CPUID. *)
(* If the highest basic information leaf data depend on ECX input value, ECX is honored.*)
EAX ← Reserved; (* Information returned for highest basic information leaf. *)
EBX ← Reserved; (* Information returned for highest basic information leaf. *)
ECX ← Reserved; (* Information returned for highest basic information leaf. *)
-CPUID—CPU Identification
-
-Vol. 2A 3-225
-
- INSTRUCTION SET REFERENCE, A-L
-
EDX ← Reserved; (* Information returned for highest basic information leaf. *)
BREAK;
ESAC;
@@ -79623,7 +79760,7 @@ Exceptions (All Operating Modes)
If the LOCK prefix is used.
In earlier IA-32 processors that do not support the CPUID instruction, execution of the instruction results in an invalid opcode (#UD) exception being generated.
-3-226 Vol. 2A
+3-228 Vol. 2A
CPUID—CPU Identification
@@ -79752,7 +79889,7 @@ MOD2: Remainder from Polynomial division modulus 2
CRC32 — Accumulate CRC32 Value
-Vol. 2A 3-227
+Vol. 2A 3-229
INSTRUCTION SET REFERENCE, A-L
@@ -79802,7 +79939,7 @@ DEST[31-0] ← BIT_REFLECT (TEMP6[31-0])
Flags Affected
None
-3-228 Vol. 2A
+3-230 Vol. 2A
CRC32 — Accumulate CRC32 Value
@@ -79904,7 +80041,7 @@ If LOCK prefix is used.
CRC32 — Accumulate CRC32 Value
-Vol. 2A 3-229
+Vol. 2A 3-231
INSTRUCTION SET REFERENCE, A-L
@@ -80048,7 +80185,7 @@ operand is an XMM register. The upper Bits (MAXVL-1:128) of the corresponding ZM
unmodified.
VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.
-3-230 Vol. 2A
+3-232 Vol. 2A
CVTDQ2PD—Convert Packed Doubleword Integers to Packed Double-Precision Floating-Point Values
@@ -80098,7 +80235,7 @@ DEST[MAXVL-1:VL] ← 0
CVTDQ2PD—Convert Packed Doubleword Integers to Packed Double-Precision Floating-Point Values
-Vol. 2A 3-231
+Vol. 2A 3-233
INSTRUCTION SET REFERENCE, A-L
@@ -80154,7 +80291,7 @@ VCVTDQ2PD __m128d _mm_mask_cvtepi32_pd( __m128d s, __mmask8 k, __m128i a);
VCVTDQ2PD __m128d _mm_maskz_cvtepi32_pd( __mmask8 k, __m128i a);
CVTDQ2PD __m128d _mm_cvtepi32_pd (__m128i src)
-3-232 Vol. 2A
+3-234 Vol. 2A
CVTDQ2PD—Convert Packed Doubleword Integers to Packed Double-Precision Floating-Point Values
@@ -80169,7 +80306,7 @@ If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.
CVTDQ2PD—Convert Packed Doubleword Integers to Packed Double-Precision Floating-Point Values
-Vol. 2A 3-233
+Vol. 2A 3-235
INSTRUCTION SET REFERENCE, A-L
@@ -80310,7 +80447,7 @@ operand is a XMM register. The upper bits (MAXVL-1:128) of the corresponding reg
operand is an XMM register. The upper Bits (MAXVL-1:128) of the corresponding register destination are unmodified.
VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.
-3-234 Vol. 2A
+3-236 Vol. 2A
CVTDQ2PS—Convert Packed Doubleword Integers to Packed Single-Precision Floating-Point Values
@@ -80369,7 +80506,7 @@ DEST[MAXVL-1:VL] ← 0
CVTDQ2PS—Convert Packed Doubleword Integers to Packed Single-Precision Floating-Point Values
-Vol. 2A 3-235
+Vol. 2A 3-237
INSTRUCTION SET REFERENCE, A-L
@@ -80418,7 +80555,7 @@ VEX-encoded instructions, see Exceptions Type 2;
EVEX-encoded instructions, see Exceptions Type E2.
#UD
-3-236 Vol. 2A
+3-238 Vol. 2A
If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.
@@ -80574,7 +80711,7 @@ VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions wi
CVTPD2DQ—Convert Packed Double-Precision Floating-Point Values to Packed Doubleword Integers
-Vol. 2A 3-237
+Vol. 2A 3-239
INSTRUCTION SET REFERENCE, A-L
@@ -80628,7 +80765,7 @@ FI;
ENDFOR
DEST[MAXVL-1:VL/2] ← 0
-3-238 Vol. 2A
+3-240 Vol. 2A
CVTPD2DQ—Convert Packed Double-Precision Floating-Point Values to Packed Doubleword Integers
@@ -80678,7 +80815,7 @@ DEST[MAXVL-1:128] (unmodified)
CVTPD2DQ—Convert Packed Double-Precision Floating-Point Values to Packed Doubleword Integers
-Vol. 2A 3-239
+Vol. 2A 3-241
INSTRUCTION SET REFERENCE, A-L
@@ -80704,7 +80841,7 @@ See Exceptions Type 2; additionally
EVEX-encoded instructions, see Exceptions Type E2.
#UD
-3-240 Vol. 2A
+3-242 Vol. 2A
If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.
@@ -80788,7 +80925,7 @@ See Table 22-4, “Exception Conditions for Legacy SIMD/MMX Instructions with FP
CVTPD2PI—Convert Packed Double-Precision FP Values to Packed Dword Integers
-Vol. 2A 3-241
+Vol. 2A 3-243
INSTRUCTION SET REFERENCE, A-L
@@ -80938,7 +81075,7 @@ operand is an XMM register. Bits[127:64] of the destination XMM register are zer
(MAXVL-1:128) of the corresponding ZMM register destination are unmodified.
VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructions will #UD.
-3-242 Vol. 2A
+3-244 Vol. 2A
CVTPD2PS—Convert Packed Double-Precision Floating-Point Values to Packed Single-Precision Floating-Point Values
@@ -80996,7 +81133,7 @@ DEST[MAXVL-1:VL/2] ← 0
CVTPD2PS—Convert Packed Double-Precision Floating-Point Values to Packed Single-Precision Floating-Point Values
-Vol. 2A 3-243
+Vol. 2A 3-245
INSTRUCTION SET REFERENCE, A-L
@@ -81040,7 +81177,7 @@ DEST[63:32] ← Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[
DEST[127:64] ← 0
DEST[MAXVL-1:128] (unmodified)
-3-244 Vol. 2A
+3-246 Vol. 2A
CVTPD2PS—Convert Packed Double-Precision Floating-Point Values to Packed Single-Precision Floating-Point Values
@@ -81072,7 +81209,7 @@ If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.
CVTPD2PS—Convert Packed Double-Precision Floating-Point Values to Packed Single-Precision Floating-Point Values
-Vol. 2A 3-245
+Vol. 2A 3-247
INSTRUCTION SET REFERENCE, A-L
@@ -81161,7 +81298,7 @@ Other Exceptions
See Table 22-6, “Exception Conditions for Legacy SIMD/MMX Instructions with XMM and without FP Exception,” in
the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B.
-3-246 Vol. 2A
+3-248 Vol. 2A
CVTPI2PD—Convert Packed Dword Integers to Packed Double-Precision FP Values
@@ -81245,7 +81382,7 @@ Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B.
CVTPI2PS—Convert Packed Dword Integers to Packed Single-Precision FP Values
-Vol. 2A 3-247
+Vol. 2A 3-249
INSTRUCTION SET REFERENCE, A-L
@@ -81397,7 +81534,7 @@ operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZM
unmodified.
VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructions will #UD.
-3-248 Vol. 2A
+3-250 Vol. 2A
CVTPS2DQ—Convert Packed Single-Precision Floating-Point Values to Packed Signed Doubleword Integer Values
@@ -81456,7 +81593,7 @@ DEST[MAXVL-1:VL] ← 0
CVTPS2DQ—Convert Packed Single-Precision Floating-Point Values to Packed Signed Doubleword Integer Values
-Vol. 2A 3-249
+Vol. 2A 3-251
INSTRUCTION SET REFERENCE, A-L
@@ -81504,7 +81641,7 @@ VEX-encoded instructions, see Exceptions Type 2;
EVEX-encoded instructions, see Exceptions Type E2.
#UD
-3-250 Vol. 2A
+3-252 Vol. 2A
If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.
@@ -81653,7 +81790,7 @@ Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructio
CVTPS2PD—Convert Packed Single-Precision Floating-Point Values to Packed Double-Precision Floating-Point Values
-Vol. 2A 3-251
+Vol. 2A 3-253
INSTRUCTION SET REFERENCE, A-L
@@ -81714,7 +81851,7 @@ DEST[i+63:i] ←
Convert_Single_Precision_To_Double_Precision_Floating_Point(SRC[k+31:k])
FI;
ELSE
-3-252 Vol. 2A
+3-254 Vol. 2A
CVTPS2PD—Convert Packed Single-Precision Floating-Point Values to Packed Double-Precision Floating-Point Values
@@ -81771,7 +81908,7 @@ If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.
CVTPS2PD—Convert Packed Single-Precision Floating-Point Values to Packed Double-Precision Floating-Point Values
-Vol. 2A 3-253
+Vol. 2A 3-255
INSTRUCTION SET REFERENCE, A-L
@@ -81849,7 +81986,7 @@ Other Exceptions
See Table 22-5, “Exception Conditions for Legacy SIMD/MMX Instructions with XMM and FP Exception,” in the
Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B.
-3-254 Vol. 2A
+3-256 Vol. 2A
CVTPS2PI—Convert Packed Single-Precision FP Values to Packed Dword Integers
@@ -81993,7 +82130,7 @@ unpredictable behavior across different processor generations.
CVTSD2SI—Convert Scalar Double-Precision Floating-Point Value to Doubleword Integer
-Vol. 2A 3-255
+Vol. 2A 3-257
INSTRUCTION SET REFERENCE, A-L
@@ -82035,7 +82172,7 @@ VEX-encoded instructions, see Exceptions Type 3;
EVEX-encoded instructions, see Exceptions Type E3NF.
#UD
-3-256 Vol. 2A
+3-258 Vol. 2A
If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.
@@ -82064,10 +82201,10 @@ SSE2
F2 0F 5A /r
CVTSD2SS xmm1, xmm2/m64
-VEX.NDS.LIG.F2.0F.WIG 5A /r
+VEX.LIG.F2.0F.WIG 5A /r
VCVTSD2SS xmm1,xmm2,
xmm3/m64
-EVEX.NDS.LIG.F2.0F.W1 5A /r
+EVEX.LIG.F2.0F.W1 5A /r
VCVTSD2SS xmm1 {k1}{z}, xmm2,
xmm3/m64{er}
@@ -82162,7 +82299,7 @@ unpredictable behavior across different processor generations.
CVTSD2SS—Convert Scalar Double-Precision Floating-Point Value to Scalar Single-Precision Floating-Point Value
-Vol. 2A 3-257
+Vol. 2A 3-259
INSTRUCTION SET REFERENCE, A-L
@@ -82211,7 +82348,7 @@ Other Exceptions
VEX-encoded instructions, see Exceptions Type 3.
EVEX-encoded instructions, see Exceptions Type E3.
-3-258 Vol. 2A
+3-260 Vol. 2A
CVTSD2SS—Convert Scalar Double-Precision Floating-Point Value to Scalar Single-Precision Floating-Point Value
@@ -82246,7 +82383,7 @@ V/N.E.
SSE2
-VEX.NDS.LIG.F2.0F.W0 2A /r
+VEX.LIG.F2.0F.W0 2A /r
VCVTSI2SD xmm1, xmm2, r/m32
B
@@ -82255,7 +82392,7 @@ V/V
AVX
-VEX.NDS.LIG.F2.0F.W1 2A /r
+VEX.LIG.F2.0F.W1 2A /r
VCVTSI2SD xmm1, xmm2, r/m64
B
@@ -82264,7 +82401,7 @@ V/N.E.1
AVX
-EVEX.NDS.LIG.F2.0F.W0 2A /r
+EVEX.LIG.F2.0F.W0 2A /r
VCVTSI2SD xmm1, xmm2, r/m32
C
@@ -82273,7 +82410,7 @@ V/V
AVX512F
-EVEX.NDS.LIG.F2.0F.W1 2A /r
+EVEX.LIG.F2.0F.W1 2A /r
VCVTSI2SD xmm1, xmm2, r/m64{er}
C
@@ -82372,7 +82509,7 @@ unpredictable behavior across different processor generations.
CVTSI2SD—Convert Doubleword Integer to Scalar Double-Precision Floating-Point Value
-Vol. 2A 3-259
+Vol. 2A 3-261
INSTRUCTION SET REFERENCE, A-L
@@ -82424,7 +82561,7 @@ Other Exceptions
VEX-encoded instructions, see Exceptions Type 3 if W1, else Type 5.
EVEX-encoded instructions, see Exceptions Type E3NF if W1, else Type E10NF.
-3-260 Vol. 2A
+3-262 Vol. 2A
CVTSI2SD—Convert Doubleword Integer to Scalar Double-Precision Floating-Point Value
@@ -82452,13 +82589,13 @@ F3 0F 2A /r
CVTSI2SS xmm1, r/m32
F3 REX.W 0F 2A /r
CVTSI2SS xmm1, r/m64
-VEX.NDS.LIG.F3.0F.W0 2A /r
+VEX.LIG.F3.0F.W0 2A /r
VCVTSI2SS xmm1, xmm2, r/m32
-VEX.NDS.LIG.F3.0F.W1 2A /r
+VEX.LIG.F3.0F.W1 2A /r
VCVTSI2SS xmm1, xmm2, r/m64
-EVEX.NDS.LIG.F3.0F.W0 2A /r
+EVEX.LIG.F3.0F.W0 2A /r
VCVTSI2SS xmm1, xmm2, r/m32{er}
-EVEX.NDS.LIG.F3.0F.W1 2A /r
+EVEX.LIG.F3.0F.W1 2A /r
VCVTSI2SS xmm1, xmm2, r/m64{er}
A
@@ -82576,7 +82713,7 @@ unpredictable behavior across different processor generations.
CVTSI2SS—Convert Doubleword Integer to Scalar Single-Precision Floating-Point Value
-Vol. 2A 3-261
+Vol. 2A 3-263
INSTRUCTION SET REFERENCE, A-L
@@ -82629,7 +82766,7 @@ Other Exceptions
VEX-encoded instructions, see Exceptions Type 3.
EVEX-encoded instructions, see Exceptions Type E3NF.
-3-262 Vol. 2A
+3-264 Vol. 2A
CVTSI2SS—Convert Doubleword Integer to Scalar Single-Precision Floating-Point Value
@@ -82656,10 +82793,10 @@ SSE2
F3 0F 5A /r
CVTSS2SD xmm1, xmm2/m32
-VEX.NDS.LIG.F3.0F.WIG 5A /r
+VEX.LIG.F3.0F.WIG 5A /r
VCVTSS2SD xmm1, xmm2,
xmm3/m32
-EVEX.NDS.LIG.F3.0F.W0 5A /r
+EVEX.LIG.F3.0F.W0 5A /r
VCVTSS2SD xmm1 {k1}{z}, xmm2,
xmm3/m32{sae}
@@ -82768,7 +82905,7 @@ DEST[MAXVL-1:128] ← 0
CVTSS2SD—Convert Scalar Single-Precision Floating-Point Value to Scalar Double-Precision Floating-Point Value
-Vol. 2A 3-263
+Vol. 2A 3-265
INSTRUCTION SET REFERENCE, A-L
@@ -82795,7 +82932,7 @@ Other Exceptions
VEX-encoded instructions, see Exceptions Type 3.
EVEX-encoded instructions, see Exceptions Type E3.
-3-264 Vol. 2A
+3-266 Vol. 2A
CVTSS2SD—Convert Scalar Single-Precision Floating-Point Value to Scalar Double-Precision Floating-Point Value
@@ -82936,7 +83073,7 @@ unpredictable behavior across different processor generations.
CVTSS2SI—Convert Scalar Single-Precision Floating-Point Value to Doubleword Integer
-Vol. 2A 3-265
+Vol. 2A 3-267
INSTRUCTION SET REFERENCE, A-L
@@ -82979,7 +83116,7 @@ If VEX.vvvv != 1111B.
EVEX-encoded instructions, see Exceptions Type E3NF.
-3-266 Vol. 2A
+3-268 Vol. 2A
CVTSS2SI—Convert Scalar Single-Precision Floating-Point Value to Doubleword Integer
@@ -83138,7 +83275,7 @@ Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructi
CVTTPD2DQ—Convert with Truncation Packed Double-Precision Floating-Point Values to Packed Doubleword Integers
-Vol. 2A 3-267
+Vol. 2A 3-269
INSTRUCTION SET REFERENCE, A-L
@@ -83186,7 +83323,7 @@ FI;
ENDFOR
DEST[MAXVL-1:VL/2] ← 0
-3-268 Vol. 2A
+3-270 Vol. 2A
CVTTPD2DQ—Convert with Truncation Packed Double-Precision Floating-Point Values to Packed Doubleword Integers
@@ -83236,7 +83373,7 @@ DEST[MAXVL-1:128] (unmodified)
CVTTPD2DQ—Convert with Truncation Packed Double-Precision Floating-Point Values to Packed Doubleword Integers
-Vol. 2A 3-269
+Vol. 2A 3-271
INSTRUCTION SET REFERENCE, A-L
@@ -83262,7 +83399,7 @@ VEX-encoded instructions, see Exceptions Type 2;
EVEX-encoded instructions, see Exceptions Type E2.
#UD
-3-270 Vol. 2A
+3-272 Vol. 2A
If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.
@@ -83349,7 +83486,7 @@ See Table 22-4, “Exception Conditions for Legacy SIMD/MMX Instructions with FP
CVTTPD2PI—Convert with Truncation Packed Double-Precision FP Values to Packed Dword Integers
-Vol. 2A 3-271
+Vol. 2A 3-273
INSTRUCTION SET REFERENCE, A-L
@@ -83504,7 +83641,7 @@ operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZM
unmodified.
Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructions will #UD.
-3-272 Vol. 2A
+3-274 Vol. 2A
CVTTPS2DQ—Convert with Truncation Packed Single-Precision Floating-Point Values to Packed Signed Doubleword Integer Values
@@ -83566,7 +83703,7 @@ DEST[255:224] ← Convert_Single_Precision_Floating_Point_To_Integer_Truncate(SRC
CVTTPS2DQ—Convert with Truncation Packed Single-Precision Floating-Point Values to Packed Signed Doubleword Integer Values
-Vol. 2A 3-273
+Vol. 2A 3-275
INSTRUCTION SET REFERENCE, A-L
@@ -83605,7 +83742,7 @@ VEX-encoded instructions, see Exceptions Type 2; additionally
EVEX-encoded instructions, see Exceptions Type E2.
#UD
-3-274 Vol. 2A
+3-276 Vol. 2A
If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.
@@ -83694,7 +83831,7 @@ Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B.
CVTTPS2PI—Convert with Truncation Packed Single-Precision FP Values to Packed Dword Integers
-Vol. 2A 3-275
+Vol. 2A 3-277
INSTRUCTION SET REFERENCE, A-L
@@ -83842,7 +83979,7 @@ indefinite integer value (80000000_00000000H) is returned.
Legacy SSE instructions: In 64-bit mode, Use of the REX.W prefix promotes the instruction to 64-bit operation. See
the summary chart at the beginning of this section for encoding data and limits.
VEX.W1 and EVEX.W1 versions: promotes the instruction to produce 64-bit data in 64-bit mode.
-3-276 Vol. 2A
+3-278 Vol. 2A
CVTTSD2SI—Convert with Truncation Scalar Double-Precision Floating-Point Value to Signed Integer
@@ -83882,7 +84019,7 @@ EVEX-encoded instructions, see Exceptions Type E3NF.
CVTTSD2SI—Convert with Truncation Scalar Double-Precision Floating-Point Value to Signed Integer
-Vol. 2A 3-277
+Vol. 2A 3-279
INSTRUCTION SET REFERENCE, A-L
@@ -84029,7 +84166,7 @@ Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructi
Software should ensure VCVTTSS2SI is encoded with VEX.L=0. Encoding VCVTTSS2SI with VEX.L=1 may
encounter unpredictable behavior across different processor generations.
-3-278 Vol. 2A
+3-280 Vol. 2A
CVTTSS2SI—Convert with Truncation Scalar Single-Precision Floating-Point Value to Integer
@@ -84065,7 +84202,7 @@ EVEX-encoded instructions, see Exceptions Type E3NF.
CVTTSS2SI—Convert with Truncation Scalar Single-Precision Floating-Point Value to Integer
-Vol. 2A 3-279
+Vol. 2A 3-281
INSTRUCTION SET REFERENCE, A-L
@@ -84174,7 +84311,7 @@ None
Exceptions (All Operating Modes)
#UD
-3-280 Vol. 2A
+3-282 Vol. 2A
If the LOCK prefix is used.
@@ -84277,7 +84414,7 @@ After: AL=34H BL=35H EFLAGS(0SZAPC)=X00101
DAA—Decimal Adjust AL after Addition
-Vol. 2A 3-281
+Vol. 2A 3-283
INSTRUCTION SET REFERENCE, A-L
@@ -84308,7 +84445,7 @@ If the LOCK prefix is used.
64-Bit Mode Exceptions
#UD
-3-282 Vol. 2A
+3-284 Vol. 2A
If in 64-bit mode.
@@ -84412,7 +84549,7 @@ The CF and AF flags are set if the adjustment of the value results in a decimal
DAS—Decimal Adjust AL after Subtraction
-Vol. 2A 3-283
+Vol. 2A 3-285
INSTRUCTION SET REFERENCE, A-L
@@ -84439,7 +84576,7 @@ If the LOCK prefix is used.
64-Bit Mode Exceptions
#UD
-3-284 Vol. 2A
+3-286 Vol. 2A
If in 64-bit mode.
@@ -84624,7 +84761,7 @@ If the LOCK prefix is used but the destination is not a memory operand.
DEC—Decrement by 1
-Vol. 2A 3-285
+Vol. 2A 3-287
INSTRUCTION SET REFERENCE, A-L
@@ -84687,7 +84824,7 @@ current privilege level is 3.
If the LOCK prefix is used but the destination is not a memory operand.
-3-286 Vol. 2A
+3-288 Vol. 2A
DEC—Decrement by 1
@@ -84875,7 +85012,7 @@ quadword
DIV—Unsigned Divide
-Vol. 2A 3-287
+Vol. 2A 3-289
INSTRUCTION SET REFERENCE, A-L
@@ -84926,7 +85063,7 @@ FI;
Flags Affected
The CF, OF, SF, ZF, AF, and PF flags are undefined.
-3-288 Vol. 2A
+3-290 Vol. 2A
DIV—Unsigned Divide
@@ -85037,7 +85174,7 @@ If the LOCK prefix is used.
DIV—Unsigned Divide
-Vol. 2A 3-289
+Vol. 2A 3-291
INSTRUCTION SET REFERENCE, A-L
@@ -85061,7 +85198,7 @@ SSE2
66 0F 5E /r
DIVPD xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 5E /r
+VEX.128.66.0F.WIG 5E /r
VDIVPD xmm1, xmm2, xmm3/m128
B
@@ -85070,7 +85207,7 @@ V/V
AVX
-VEX.NDS.256.66.0F.WIG 5E /r
+VEX.256.66.0F.WIG 5E /r
VDIVPD ymm1, ymm2, ymm3/m256
B
@@ -85079,7 +85216,7 @@ V/V
AVX
-EVEX.NDS.128.66.0F.W1 5E /r
+EVEX.128.66.0F.W1 5E /r
VDIVPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
@@ -85090,7 +85227,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F.W1 5E /r
+EVEX.256.66.0F.W1 5E /r
VDIVPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
@@ -85101,7 +85238,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F.W1 5E /r
+EVEX.512.66.0F.W1 5E /r
VDIVPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst{er}
@@ -85199,7 +85336,7 @@ bits (MAXVL-1:128) of the corresponding destination are zeroed.
128-bit Legacy SSE version: The second source operand (the second operand) can be an XMM register or a 128-bit memory location. The destination is the same as the first source operand. The upper bits (MAXVL-1:128) of the
corresponding destination are unmodified.
-3-290 Vol. 2A
+3-292 Vol. 2A
DIVPD—Divide Packed Double-Precision Floating-Point Values
@@ -85252,7 +85389,7 @@ DEST[MAXVL-1:128] (Unmodified)
DIVPD—Divide Packed Double-Precision Floating-Point Values
-Vol. 2A 3-291
+Vol. 2A 3-293
INSTRUCTION SET REFERENCE, A-L
@@ -85277,7 +85414,7 @@ Other Exceptions
VEX-encoded instructions, see Exceptions Type 2.
EVEX-encoded instructions, see Exceptions Type E2.
-3-292 Vol. 2A
+3-294 Vol. 2A
DIVPD—Divide Packed Double-Precision Floating-Point Values
@@ -85303,7 +85440,7 @@ SSE
NP 0F 5E /r
DIVPS xmm1, xmm2/m128
-VEX.NDS.128.0F.WIG 5E /r
+VEX.128.0F.WIG 5E /r
VDIVPS xmm1, xmm2, xmm3/m128
B
@@ -85312,7 +85449,7 @@ V/V
AVX
-VEX.NDS.256.0F.WIG 5E /r
+VEX.256.0F.WIG 5E /r
VDIVPS ymm1, ymm2, ymm3/m256
B
@@ -85321,7 +85458,7 @@ V/V
AVX
-EVEX.NDS.128.0F.W0 5E /r
+EVEX.128.0F.W0 5E /r
VDIVPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
@@ -85332,7 +85469,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.0F.W0 5E /r
+EVEX.256.0F.W0 5E /r
VDIVPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
@@ -85343,7 +85480,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.0F.W0 5E /r
+EVEX.512.0F.W0 5E /r
VDIVPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst{er}
@@ -85443,7 +85580,7 @@ ZMM register destination are unmodified.
DIVPS—Divide Packed Single-Precision Floating-Point Values
-Vol. 2A 3-293
+Vol. 2A 3-295
INSTRUCTION SET REFERENCE, A-L
@@ -85494,7 +85631,7 @@ DEST[95:64] ← SRC1[95:64] / SRC2[95:64]
DEST[127:96] ← SRC1[127:96] / SRC2[127:96]
DEST[MAXVL-1:128] ← 0
-3-294 Vol. 2A
+3-296 Vol. 2A
DIVPS—Divide Packed Single-Precision Floating-Point Values
@@ -85530,7 +85667,7 @@ EVEX-encoded instructions, see Exceptions Type E2.
DIVPS—Divide Packed Single-Precision Floating-Point Values
-Vol. 2A 3-295
+Vol. 2A 3-297
INSTRUCTION SET REFERENCE, A-L
@@ -85554,7 +85691,7 @@ SSE2
F2 0F 5E /r
DIVSD xmm1, xmm2/m64
-VEX.NDS.LIG.F2.0F.WIG 5E /r
+VEX.LIG.F2.0F.WIG 5E /r
VDIVSD xmm1, xmm2, xmm3/m64
B
@@ -85563,7 +85700,7 @@ V/V
AVX
-EVEX.NDS.LIG.F2.0F.W1 5E /r
+EVEX.LIG.F2.0F.W1 5E /r
VDIVSD xmm1 {k1}{z}, xmm2,
xmm3/m64{er}
@@ -85647,7 +85784,7 @@ the destination register are zeroed.
EVEX version: The low quadword element of the destination is updated according to the writemask.
Software should ensure VDIVSD is encoded with VEX.L=0. Encoding VDIVSD with VEX.L=1 may encounter unpredictable behavior across different processor generations.
-3-296 Vol. 2A
+3-298 Vol. 2A
DIVSD—Divide Scalar Double-Precision Floating-Point Value
@@ -85700,7 +85837,7 @@ EVEX-encoded instructions, see Exceptions Type E3.
DIVSD—Divide Scalar Double-Precision Floating-Point Value
-Vol. 2A 3-297
+Vol. 2A 3-299
INSTRUCTION SET REFERENCE, A-L
@@ -85724,7 +85861,7 @@ SSE
F3 0F 5E /r
DIVSS xmm1, xmm2/m32
-VEX.NDS.LIG.F3.0F.WIG 5E /r
+VEX.LIG.F3.0F.WIG 5E /r
VDIVSS xmm1, xmm2, xmm3/m32
B
@@ -85733,7 +85870,7 @@ V/V
AVX
-EVEX.NDS.LIG.F3.0F.W0 5E /r
+EVEX.LIG.F3.0F.W0 5E /r
VDIVSS xmm1 {k1}{z}, xmm2,
xmm3/m32{er}
@@ -85815,7 +85952,7 @@ of the destination register are zeroed.
EVEX version: The low doubleword element of the destination is updated according to the writemask.
Software should ensure VDIVSS is encoded with VEX.L=0. Encoding VDIVSS with VEX.L=1 may encounter unpredictable behavior across different processor generations.
-3-298 Vol. 2A
+3-300 Vol. 2A
DIVSS—Divide Scalar Single-Precision Floating-Point Values
@@ -85868,7 +86005,7 @@ EVEX-encoded instructions, see Exceptions Type E3.
DIVSS—Divide Scalar Single-Precision Floating-Point Values
-Vol. 2A 3-299
+Vol. 2A 3-301
INSTRUCTION SET REFERENCE, A-L
@@ -85908,7 +86045,7 @@ xmm1.
DPPD xmm1, xmm2/m128, imm8
-VEX.NDS.128.66.0F3A.WIG 41 /r ib
+VEX.128.66.0F3A.WIG 41 /r ib
RVMI V/V
@@ -85964,7 +86101,7 @@ zeroed.
If VDPPD is encoded with VEX.L= 1, an attempt to execute the instruction encoded with VEX.L= 1 will cause an
#UD exception.
-3-300 Vol. 2A
+3-302 Vol. 2A
DPPD — Dot Product of Packed Double Precision Floating-Point Values
@@ -86015,7 +86152,7 @@ If VEX.L= 1.
DPPD — Dot Product of Packed Double Precision Floating-Point Values
-Vol. 2A 3-301
+Vol. 2A 3-303
INSTRUCTION SET REFERENCE, A-L
@@ -86066,10 +86203,10 @@ pairs of elements and store to ymm1.
DPPS xmm1, xmm2/m128, imm8
-VEX.NDS.128.66.0F3A.WIG 40 /r ib
+VEX.128.66.0F3A.WIG 40 /r ib
VDPPS xmm1,xmm2, xmm3/m128, imm8
-VEX.NDS.256.66.0F3A.WIG 40 /r ib
+VEX.256.66.0F3A.WIG 40 /r ib
VDPPS ymm1, ymm2, ymm3/m256, imm8
Instruction Operand Encoding
@@ -86121,7 +86258,7 @@ zeroed.
VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM
register or a 256-bit memory location. The destination operand is a YMM register.
-3-302 Vol. 2A
+3-304 Vol. 2A
DPPS — Dot Product of Packed Single Precision Floating-Point Values
@@ -86174,7 +86311,7 @@ None
DPPS — Dot Product of Packed Single Precision Floating-Point Values
-Vol. 2A 3-303
+Vol. 2A 3-305
INSTRUCTION SET REFERENCE, A-L
@@ -86195,7 +86332,7 @@ Unmasked exceptions will leave the destination operands unchanged.
Other Exceptions
See Exceptions Type 2.
-3-304 Vol. 2A
+3-306 Vol. 2A
DPPS — Dot Product of Packed Single Precision Floating-Point Values
@@ -86300,7 +86437,7 @@ Same exceptions as in protected mode.
EMMS—Empty MMX Technology State
-Vol. 2A 3-305
+Vol. 2A 3-307
INSTRUCTION SET REFERENCE, A-L
@@ -86414,7 +86551,7 @@ The value in the RBP/EBP register prior to executing “66H ENTER” must be wit
the current stack pointer (RSP/ESP), such that the value of RBP/EBP after “66H ENTER” remains a valid address
in the stack. This ensures “66H LEAVE” can restore 16-bits of data from the stack.
-3-306 Vol. 2A
+3-308 Vol. 2A
ENTER—Make Stack Frame for Procedure Parameters
@@ -86474,7 +86611,7 @@ ELSE IF OperandSize = 32
ENTER—Make Stack Frame for Procedure Parameters
-Vol. 2A 3-307
+Vol. 2A 3-309
INSTRUCTION SET REFERENCE, A-L
@@ -86555,7 +86692,7 @@ stack segment) would cause a page fault.
If the LOCK prefix is used.
-3-308 Vol. 2A
+3-310 Vol. 2A
ENTER—Make Stack Frame for Procedure Parameters
@@ -86678,7 +86815,7 @@ FI
EXTRACTPS—Extract Packed Floating-Point Values
-Vol. 2A 3-309
+Vol. 2A 3-311
INSTRUCTION SET REFERENCE, A-L
@@ -86708,7 +86845,7 @@ IF VEX.L = 0.
If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.
-3-310 Vol. 2A
+3-312 Vol. 2A
EXTRACTPS—Extract Packed Floating-Point Values
@@ -86819,7 +86956,7 @@ Virtual-8086 Mode Exceptions
Same exceptions as in protected mode.
F2XM1—Compute 2x–1
-Vol. 2A 3-311
+Vol. 2A 3-313
INSTRUCTION SET REFERENCE, A-L
@@ -86829,7 +86966,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-312 Vol. 2A
+3-314 Vol. 2A
F2XM1—Compute 2x–1
@@ -86938,7 +87075,7 @@ Same exceptions as in protected mode.
Same exceptions as in protected mode.
FABS—Absolute Value
-Vol. 2A 3-313
+Vol. 2A 3-315
INSTRUCTION SET REFERENCE, A-L
@@ -87057,7 +87194,7 @@ When the sum of two operands with opposite signs is 0, the result is +0, except
which case the result is −0. When the source operand is an integer 0, it is treated as a +0.
When both operands are infinities of the same sign, the result is ∞ of the expected sign. If both operands are infinities of opposite signs, an invalid-operation exception is generated. See Table 3-18.
-3-314 Vol. 2A
+3-316 Vol. 2A
FADD/FADDP/FIADD—Add
@@ -87250,7 +87387,7 @@ Value cannot be represented exactly in destination format.
FADD/FADDP/FIADD—Add
-Vol. 2A 3-315
+Vol. 2A 3-317
INSTRUCTION SET REFERENCE, A-L
@@ -87356,7 +87493,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-316 Vol. 2A
+3-318 Vol. 2A
FADD/FADDP/FIADD—Add
@@ -87483,7 +87620,7 @@ If the LOCK prefix is used.
FBLD—Load Binary Coded Decimal
-Vol. 2A 3-317
+Vol. 2A 3-319
INSTRUCTION SET REFERENCE, A-L
@@ -87520,7 +87657,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-318 Vol. 2A
+3-320 Vol. 2A
FBLD—Load Binary Coded Decimal
@@ -87626,7 +87763,7 @@ Undefined.
FBSTP—Store BCD Integer and Pop
-Vol. 2A 3-319
+Vol. 2A 3-321
INSTRUCTION SET REFERENCE, A-L
@@ -87746,7 +87883,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-320 Vol. 2A
+3-322 Vol. 2A
FBSTP—Store BCD Integer and Pop
@@ -87853,14 +87990,14 @@ Same exceptions as in protected mode.
FCHS—Change Sign
-Vol. 2A 3-321
+Vol. 2A 3-323
INSTRUCTION SET REFERENCE, A-L
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-322 Vol. 2A
+3-324 Vol. 2A
FCHS—Change Sign
@@ -87948,7 +88085,7 @@ Same exceptions as in protected mode.
FCLEX/FNCLEX—Clear Exceptions
-Vol. 2A 3-323
+Vol. 2A 3-325
INSTRUCTION SET REFERENCE, A-L
@@ -87961,7 +88098,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-324 Vol. 2A
+3-326 Vol. 2A
FCLEX/FNCLEX—Clear Exceptions
@@ -88100,7 +88237,7 @@ None.
FCMOVcc—Floating-Point Conditional Move
-Vol. 2A 3-325
+Vol. 2A 3-327
INSTRUCTION SET REFERENCE, A-L
@@ -88125,7 +88262,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-326 Vol. 2A
+3-328 Vol. 2A
FCMOVcc—Floating-Point Conditional Move
@@ -88299,7 +88436,7 @@ This instruction’s operation is the same in non-64-bit modes and 64-bit mode.
FCOM/FCOMP/FCOMPP—Compare Floating Point Values
-Vol. 2A 3-327
+Vol. 2A 3-329
INSTRUCTION SET REFERENCE, A-L
@@ -88395,7 +88532,7 @@ CR0.EM[bit 2] or CR0.TS[bit 3] = 1.
If the LOCK prefix is used.
-3-328 Vol. 2A
+3-330 Vol. 2A
FCOM/FCOMP/FCOMPP—Compare Floating Point Values
@@ -88461,7 +88598,7 @@ If the LOCK prefix is used.
FCOM/FCOMP/FCOMPP—Compare Floating Point Values
-Vol. 2A 3-329
+Vol. 2A 3-331
INSTRUCTION SET REFERENCE, A-L
@@ -88587,7 +88724,7 @@ IA-32 Architecture Compatibility
The FCOMI/FCOMIP/FUCOMI/FUCOMIP instructions were introduced to the IA-32 Architecture in the P6 family
processors and are not available in earlier IA-32 processors.
-3-330 Vol. 2A
+3-332 Vol. 2A
FCOMI/FCOMIP/ FUCOMI/FUCOMIP—Compare Floating Point Values and Set EFLAGS
@@ -88654,7 +88791,7 @@ have undefined formats. Detection of a QNaN value does not raise an invalid-oper
FCOMI/FCOMIP/ FUCOMI/FUCOMIP—Compare Floating Point Values and Set EFLAGS
-Vol. 2A 3-331
+Vol. 2A 3-333
INSTRUCTION SET REFERENCE, A-L
@@ -88683,7 +88820,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-332 Vol. 2A
+3-334 Vol. 2A
FCOMI/FCOMIP/ FUCOMI/FUCOMIP—Compare Floating Point Values and Set EFLAGS
@@ -88776,7 +88913,7 @@ FI;
FCOS— Cosine
-Vol. 2A 3-333
+Vol. 2A 3-335
INSTRUCTION SET REFERENCE, A-L
@@ -88837,7 +88974,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-334 Vol. 2A
+3-336 Vol. 2A
FCOS— Cosine
@@ -88911,7 +89048,7 @@ Same exceptions as in protected mode.
FDECSTP—Decrement Stack-Top Pointer
-Vol. 2A 3-335
+Vol. 2A 3-337
INSTRUCTION SET REFERENCE, A-L
@@ -89029,7 +89166,7 @@ the appropriate sign is stored in the destination operand.
The following table shows the results obtained when dividing various classes of numbers, assuming that neither
overflow nor underflow occurs.
-3-336 Vol. 2A
+3-338 Vol. 2A
FDIV/FDIVP/FIDIV—Divide
@@ -89234,7 +89371,7 @@ FDIV/FDIVP/FIDIV—Divide
Undefined.
-Vol. 2A 3-337
+Vol. 2A 3-339
INSTRUCTION SET REFERENCE, A-L
@@ -89368,7 +89505,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-338 Vol. 2A
+3-340 Vol. 2A
FDIV/FDIVP/FIDIV—Divide
@@ -89492,7 +89629,7 @@ overflow nor underflow occurs.
FDIVR/FDIVRP/FIDIVR—Reverse Divide
-Vol. 2A 3-339
+Vol. 2A 3-341
INSTRUCTION SET REFERENCE, A-L
@@ -89692,7 +89829,7 @@ Set if result was rounded up; cleared otherwise.
C0, C2, C3
-3-340 Vol. 2A
+3-342 Vol. 2A
Undefined.
@@ -89832,7 +89969,7 @@ If the LOCK prefix is used.
FDIVR/FDIVRP/FIDIVR—Reverse Divide
-Vol. 2A 3-341
+Vol. 2A 3-343
INSTRUCTION SET REFERENCE, A-L
@@ -89898,7 +90035,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-342 Vol. 2A
+3-344 Vol. 2A
FFREE—Free Floating-Point Register
@@ -90050,7 +90187,7 @@ One or both operands are denormal values.
FICOM/FICOMP—Compare Integer
-Vol. 2A 3-343
+Vol. 2A 3-345
INSTRUCTION SET REFERENCE, A-L
@@ -90156,7 +90293,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-344 Vol. 2A
+3-346 Vol. 2A
FICOM/FICOMP—Compare Integer
@@ -90300,7 +90437,7 @@ If the LOCK prefix is used.
FILD—Load Integer
-Vol. 2A 3-345
+Vol. 2A 3-347
INSTRUCTION SET REFERENCE, A-L
@@ -90337,7 +90474,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-346 Vol. 2A
+3-348 Vol. 2A
FILD—Load Integer
@@ -90412,7 +90549,7 @@ Same exceptions as in protected mode.
FINCSTP—Increment Stack-Top Pointer
-Vol. 2A 3-347
+Vol. 2A 3-349
INSTRUCTION SET REFERENCE, A-L
@@ -90489,7 +90626,7 @@ C0, C1, C2, C3 set to 0.
Floating-Point Exceptions
None
-3-348 Vol. 2A
+3-350 Vol. 2A
FINIT/FNINIT—Initialize Floating-Point Unit
@@ -90522,7 +90659,7 @@ Same exceptions as in protected mode.
FINIT/FNINIT—Initialize Floating-Point Unit
-Vol. 2A 3-349
+Vol. 2A 3-351
INSTRUCTION SET REFERENCE, A-L
@@ -90651,7 +90788,7 @@ not masked, an invalid-arithmetic-operand exception (#IA) is generated and no va
operand. If the invalid-operation exception is masked, the integer indefinite value is stored in memory.
This instruction’s operation is the same in non-64-bit modes and 64-bit mode.
-3-350 Vol. 2A
+3-352 Vol. 2A
FIST/FISTP—Store Integer
@@ -90766,7 +90903,7 @@ Same exceptions as in protected mode.
FIST/FISTP—Store Integer
-Vol. 2A 3-351
+Vol. 2A 3-353
INSTRUCTION SET REFERENCE, A-L
@@ -90800,7 +90937,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-352 Vol. 2A
+3-354 Vol. 2A
FIST/FISTP—Store Integer
@@ -90924,7 +91061,7 @@ If the LOCK prefix is used.
FISTTP—Store Integer with Truncation
-Vol. 2A 3-353
+Vol. 2A 3-355
INSTRUCTION SET REFERENCE, A-L
@@ -90995,7 +91132,7 @@ If alignment checking is enabled and an unaligned memory reference is made while
current privilege level is 3.
If the LOCK prefix is used.
-3-354 Vol. 2A
+3-356 Vol. 2A
FISTTP—Store Integer with Truncation
@@ -91102,7 +91239,7 @@ extended-precision floating-point format.
FLD—Load Floating Point Value
-Vol. 2A 3-355
+Vol. 2A 3-357
INSTRUCTION SET REFERENCE, A-L
@@ -91210,7 +91347,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-356 Vol. 2A
+3-358 Vol. 2A
FLD—Load Floating Point Value
@@ -91353,7 +91490,7 @@ Virtual-8086 Mode Exceptions
Same exceptions as in protected mode.
FLD1/FLDL2T/FLDL2E/FLDPI/FLDLG2/FLDLN2/FLDZ—Load Constant
-Vol. 2A 3-357
+Vol. 2A 3-359
INSTRUCTION SET REFERENCE, A-L
@@ -91366,7 +91503,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-358 Vol. 2A
+3-360 Vol. 2A
FLD1/FLDL2T/FLDL2E/FLDPI/FLDLG2/FLDLN2/FLDZ—Load Constant
@@ -91461,7 +91598,7 @@ If the LOCK prefix is used.
FLDCW—Load x87 FPU Control Word
-Vol. 2A 3-359
+Vol. 2A 3-361
INSTRUCTION SET REFERENCE, A-L
@@ -91523,7 +91660,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-360 Vol. 2A
+3-362 Vol. 2A
FLDCW—Load x87 FPU Control Word
@@ -91589,7 +91726,7 @@ None; however, if an unmasked exception is loaded in the status word, it is gene
FLDENV—Load x87 FPU Environment
-Vol. 2A 3-361
+Vol. 2A 3-363
INSTRUCTION SET REFERENCE, A-L
@@ -91696,7 +91833,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-362 Vol. 2A
+3-364 Vol. 2A
FLDENV—Load x87 FPU Environment
@@ -91818,7 +91955,7 @@ overflow nor underflow occurs.
FMUL/FMULP/FIMUL—Multiply
-Vol. 2A 3-363
+Vol. 2A 3-365
INSTRUCTION SET REFERENCE, A-L
@@ -92039,7 +92176,7 @@ Result is too large for destination format.
Value cannot be represented exactly in destination format.
-3-364 Vol. 2A
+3-366 Vol. 2A
FMUL/FMULP/FIMUL—Multiply
@@ -92150,7 +92287,7 @@ If the LOCK prefix is used.
FMUL/FMULP/FIMUL—Multiply
-Vol. 2A 3-365
+Vol. 2A 3-367
INSTRUCTION SET REFERENCE, A-L
@@ -92213,7 +92350,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-366 Vol. 2A
+3-368 Vol. 2A
FNOP—No Operation
@@ -92404,7 +92541,7 @@ The source operands for this instruction are restricted for the 80287 math copro
FPATAN—Partial Arctangent
-Vol. 2A 3-367
+Vol. 2A 3-369
INSTRUCTION SET REFERENCE, A-L
@@ -92468,7 +92605,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-368 Vol. 2A
+3-370 Vol. 2A
FPATAN—Partial Arctangent
@@ -92662,7 +92799,7 @@ An important use of the FPREM instruction is to reduce the arguments of periodic
complete, the instruction stores the three least-significant bits of the quotient in the C3, C1, and C0 flags of the FPU
FPREM—Partial Remainder
-Vol. 2A 3-369
+Vol. 2A 3-371
INSTRUCTION SET REFERENCE, A-L
@@ -92744,7 +92881,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-370 Vol. 2A
+3-372 Vol. 2A
FPREM—Partial Remainder
@@ -92938,7 +93075,7 @@ complete, the instruction stores the three least-significant bits of the quotien
FPREM1—Partial Remainder
-Vol. 2A 3-371
+Vol. 2A 3-373
INSTRUCTION SET REFERENCE, A-L
@@ -93021,7 +93158,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-372 Vol. 2A
+3-374 Vol. 2A
FPREM1—Partial Remainder
@@ -93111,7 +93248,7 @@ This instruction’s operation is the same in non-64-bit modes and 64-bit mode.
FPTAN—Partial Tangent
-Vol. 2A 3-373
+Vol. 2A 3-375
INSTRUCTION SET REFERENCE, A-L
@@ -93186,7 +93323,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-374 Vol. 2A
+3-376 Vol. 2A
FPTAN—Partial Tangent
@@ -93279,7 +93416,7 @@ Same exceptions as in protected mode.
FRNDINT—Round to Integer
-Vol. 2A 3-375
+Vol. 2A 3-377
INSTRUCTION SET REFERENCE, A-L
@@ -93345,7 +93482,7 @@ Floating-Point Exceptions
None; however, this operation might unmask an existing exception that has been detected but not generated,
because it was masked. Here, the exception is generated at the completion of the instruction.
-3-376 Vol. 2A
+3-378 Vol. 2A
FRSTOR—Restore x87 FPU State
@@ -93452,7 +93589,7 @@ If the LOCK prefix is used.
FRSTOR—Restore x87 FPU State
-Vol. 2A 3-377
+Vol. 2A 3-379
INSTRUCTION SET REFERENCE, A-L
@@ -93526,7 +93663,7 @@ circumstances) for an FNSAVE instruction to be interrupted prior to being execut
FNSAVE instruction cannot be interrupted in this way on later Intel processors, except for the Intel Quark™ X1000
processor.
-3-378 Vol. 2A
+3-380 Vol. 2A
FSAVE/FNSAVE—Store x87 FPU State
@@ -93610,7 +93747,7 @@ If the LOCK prefix is used.
FSAVE/FNSAVE—Store x87 FPU State
-Vol. 2A 3-379
+Vol. 2A 3-381
INSTRUCTION SET REFERENCE, A-L
@@ -93668,7 +93805,7 @@ If a page fault occurs.
If alignment checking is enabled and an unaligned memory reference is made while the
current privilege level is 3.
-3-380 Vol. 2A
+3-382 Vol. 2A
FSAVE/FNSAVE—Store x87 FPU State
@@ -93864,7 +94001,7 @@ C0, C2, C3
FSCALE—Scale
Undefined.
-Vol. 2A 3-381
+Vol. 2A 3-383
INSTRUCTION SET REFERENCE, A-L
@@ -93918,7 +94055,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-382 Vol. 2A
+3-384 Vol. 2A
FSCALE—Scale
@@ -94025,7 +94162,7 @@ Undefined.
FSIN—Sine
-Vol. 2A 3-383
+Vol. 2A 3-385
INSTRUCTION SET REFERENCE, A-L
@@ -94071,7 +94208,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-384 Vol. 2A
+3-386 Vol. 2A
FSIN—Sine
@@ -94180,7 +94317,7 @@ This instruction’s operation is the same in non-64-bit modes and 64-bit mode.
FSINCOS—Sine and Cosine
-Vol. 2A 3-385
+Vol. 2A 3-387
INSTRUCTION SET REFERENCE, A-L
@@ -94256,7 +94393,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-386 Vol. 2A
+3-388 Vol. 2A
FSINCOS—Sine and Cosine
@@ -94375,7 +94512,7 @@ Same exceptions as in protected mode.
FSQRT—Square Root
-Vol. 2A 3-387
+Vol. 2A 3-389
INSTRUCTION SET REFERENCE, A-L
@@ -94388,7 +94525,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-388 Vol. 2A
+3-390 Vol. 2A
FSQRT—Square Root
@@ -94515,7 +94652,7 @@ Undefined.
FST/FSTP—Store Floating Point Value
-Vol. 2A 3-389
+Vol. 2A 3-391
INSTRUCTION SET REFERENCE, A-L
@@ -94645,7 +94782,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-390 Vol. 2A
+3-392 Vol. 2A
FST/FSTP—Store Floating Point Value
@@ -94745,7 +94882,7 @@ If the LOCK prefix is used.
FSTCW/FNSTCW—Store x87 FPU Control Word
-Vol. 2A 3-391
+Vol. 2A 3-393
INSTRUCTION SET REFERENCE, A-L
@@ -94824,7 +94961,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-392 Vol. 2A
+3-394 Vol. 2A
FSTCW/FNSTCW—Store x87 FPU Control Word
@@ -94907,7 +95044,7 @@ The C0, C1, C2, and C3 are undefined.
FSTENV/FNSTENV—Store x87 FPU Environment
-Vol. 2A 3-393
+Vol. 2A 3-395
INSTRUCTION SET REFERENCE, A-L
@@ -95018,7 +95155,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-394 Vol. 2A
+3-396 Vol. 2A
FSTENV/FNSTENV—Store x87 FPU Environment
@@ -95121,7 +95258,7 @@ None
FSTSW/FNSTSW—Store x87 FPU Status Word
-Vol. 2A 3-395
+Vol. 2A 3-397
INSTRUCTION SET REFERENCE, A-L
@@ -95225,7 +95362,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-396 Vol. 2A
+3-398 Vol. 2A
FSTSW/FNSTSW—Store x87 FPU Status Word
@@ -95349,7 +95486,7 @@ When one operand is ∞, the result is ∞ of the expected sign. If both operand
FSUB/FSUBP/FISUB—Subtract
-Vol. 2A 3-397
+Vol. 2A 3-399
INSTRUCTION SET REFERENCE, A-L
@@ -95537,7 +95674,7 @@ Result is too large for destination format.
Value cannot be represented exactly in destination format.
-3-398 Vol. 2A
+3-400 Vol. 2A
FSUB/FSUBP/FISUB—Subtract
@@ -95648,7 +95785,7 @@ If the LOCK prefix is used.
FSUB/FSUBP/FISUB—Subtract
-Vol. 2A 3-399
+Vol. 2A 3-401
INSTRUCTION SET REFERENCE, A-L
@@ -95771,7 +95908,7 @@ in which case the result is −0. This instruction also guarantees that +0 − (
source operand is an integer 0, it is treated as a +0.
When one operand is ∞, the result is ∞ of the expected sign. If both operands are ∞ of the same sign, an invalid-operation exception is generated.
-3-400 Vol. 2A
+3-402 Vol. 2A
FSUBR/FSUBRP/FISUBR—Reverse Subtract
@@ -95961,7 +96098,7 @@ Value cannot be represented exactly in destination format.
FSUBR/FSUBRP/FISUBR—Reverse Subtract
-Vol. 2A 3-401
+Vol. 2A 3-403
INSTRUCTION SET REFERENCE, A-L
@@ -96068,7 +96205,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-402 Vol. 2A
+3-404 Vol. 2A
FSUBR/FSUBRP/FISUBR—Reverse Subtract
@@ -96200,7 +96337,7 @@ Same exceptions as in protected mode.
FTST—TEST
-Vol. 2A 3-403
+Vol. 2A 3-405
INSTRUCTION SET REFERENCE, A-L
@@ -96213,7 +96350,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-404 Vol. 2A
+3-406 Vol. 2A
FTST—TEST
@@ -96345,7 +96482,7 @@ This instruction’s operation is the same in non-64-bit modes and 64-bit mode.
FUCOM/FUCOMP/FUCOMPP—Unordered Compare Floating Point Values
-Vol. 2A 3-405
+Vol. 2A 3-407
INSTRUCTION SET REFERENCE, A-L
@@ -96422,7 +96559,7 @@ Same exceptions as in protected mode.
Compatibility Mode Exceptions
Same exceptions as in protected mode.
-3-406 Vol. 2A
+3-408 Vol. 2A
FUCOM/FUCOMP/FUCOMPP—Unordered Compare Floating Point Values
@@ -96433,7 +96570,7 @@ Same exceptions as in protected mode.
FUCOM/FUCOMP/FUCOMPP—Unordered Compare Floating Point Values
-Vol. 2A 3-407
+Vol. 2A 3-409
INSTRUCTION SET REFERENCE, A-L
@@ -96580,7 +96717,7 @@ If the LOCK prefix is used.
Real-Address Mode Exceptions
Same exceptions as in protected mode.
-3-408 Vol. 2A
+3-410 Vol. 2A
FXAM—Examine Floating-Point
@@ -96597,7 +96734,7 @@ Same exceptions as in protected mode.
FXAM—Examine Floating-Point
-Vol. 2A 3-409
+Vol. 2A 3-411
INSTRUCTION SET REFERENCE, A-L
@@ -96691,7 +96828,7 @@ Same exceptions as in protected mode.
Virtual-8086 Mode Exceptions
Same exceptions as in protected mode.
-3-410 Vol. 2A
+3-412 Vol. 2A
FXCH—Exchange Register Contents
@@ -96705,7 +96842,7 @@ Same exceptions as in protected mode.
FXCH—Exchange Register Contents
-Vol. 2A 3-411
+Vol. 2A 3-413
INSTRUCTION SET REFERENCE, A-L
@@ -96803,7 +96940,7 @@ FI;
x87 FPU and SIMD Floating-Point Exceptions
None.
-3-412 Vol. 2A
+3-414 Vol. 2A
FXRSTOR—Restore x87 FPU, MMX, XMM, and MXCSR State
@@ -96884,7 +97021,7 @@ Same exceptions as in protected mode.
FXRSTOR—Restore x87 FPU, MMX, XMM, and MXCSR State
-Vol. 2A 3-413
+Vol. 2A 3-415
INSTRUCTION SET REFERENCE, A-L
@@ -96911,7 +97048,7 @@ If instruction is preceded by a LOCK prefix.
#AC
-3-414 Vol. 2A
+3-416 Vol. 2A
If this exception is disabled a general protection exception (#GP) is signaled if the memory
operand is not aligned on a 16-byte boundary, as described above. If the alignment check
@@ -97138,7 +97275,7 @@ XMM7
Reserved
288
-Vol. 2A 3-415
+Vol. 2A 3-417
INSTRUCTION SET REFERENCE, A-L
@@ -97273,7 +97410,7 @@ See “x87 FPU Instruction and Operand (Data) Pointers” in Chapter 8 of the In
Architectures Software Developer’s Manual, Volume 1, for a description of the x87 FPU instruction
pointer.
-3-416 Vol. 2A
+3-418 Vol. 2A
FXSAVE—Save x87 FPU, MMX Technology, and SSE State
@@ -97363,7 +97500,7 @@ the processor retains the contents of the registers. Because of this behavior, t
FXSAVE—Save x87 FPU, MMX Technology, and SSE State
-Vol. 2A 3-417
+Vol. 2A 3-419
INSTRUCTION SET REFERENCE, A-L
@@ -97589,7 +97726,7 @@ are two different layouts of the FXSAVE map in 64-bit mode, corresponding to FXS
REX.W=1) and FXSAVE (REX.W=0). In the FXSAVE64 map (Table 3-46), the FPU IP and FPU DP pointers are 64-bit
wide. In the FXSAVE map for 64-bit mode (Table 3-47), the FPU IP and FPU DP pointers are 32-bits.
-3-418 Vol. 2A
+3-420 Vol. 2A
FXSAVE—Save x87 FPU, MMX Technology, and SSE State
@@ -97785,7 +97922,7 @@ Available
496
-Vol. 2A 3-419
+Vol. 2A 3-421
INSTRUCTION SET REFERENCE, A-L
@@ -97804,7 +97941,7 @@ FCS
MXCSR_MASK
-3-420 Vol. 2A
+3-422 Vol. 2A
11
@@ -98080,7 +98217,7 @@ Same exceptions as in protected mode.
FXSAVE—Save x87 FPU, MMX Technology, and SSE State
-Vol. 2A 3-421
+Vol. 2A 3-423
INSTRUCTION SET REFERENCE, A-L
@@ -98120,7 +98257,7 @@ Implementation Note
The order in which the processor signals general-protection (#GP) and page-fault (#PF) exceptions when they both
occur on an instruction boundary is given in Table 5-2 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B. This order may vary for FXSAVE across processor implementations.
-3-422 Vol. 2A
+3-424 Vol. 2A
FXSAVE—Save x87 FPU, MMX Technology, and SSE State
@@ -98217,7 +98354,7 @@ Same exceptions as in protected mode.
FXTRACT—Extract Exponent and Significand
-Vol. 2A 3-423
+Vol. 2A 3-425
INSTRUCTION SET REFERENCE, A-L
@@ -98227,7 +98364,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-424 Vol. 2A
+3-426 Vol. 2A
FXTRACT—Extract Exponent and Significand
@@ -98440,7 +98577,7 @@ FYL2X—Compute y * log2x
Undefined.
-Vol. 2A 3-425
+Vol. 2A 3-427
INSTRUCTION SET REFERENCE, A-L
@@ -98498,7 +98635,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-426 Vol. 2A
+3-428 Vol. 2A
FYL2X—Compute y * log2x
@@ -98656,7 +98793,7 @@ PopRegisterStack;
FYL2XP1—Compute y * log2(x +1)
-Vol. 2A 3-427
+Vol. 2A 3-429
INSTRUCTION SET REFERENCE, A-L
@@ -98720,7 +98857,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-3-428 Vol. 2A
+3-430 Vol. 2A
FYL2XP1—Compute y * log2(x +1)
@@ -98769,9 +98906,9 @@ floating-point values from ymm2 and
ymm3/mem.
HADDPD xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 7C /r
+VEX.128.66.0F.WIG 7C /r
VHADDPD xmm1,xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG 7C /r
+VEX.256.66.0F.WIG 7C /r
VHADDPD ymm1, ymm2, ymm3/m256
Instruction Operand Encoding
@@ -98845,7 +98982,7 @@ Figure 3-16. HADDPD—Packed Double-FP Horizontal Add
HADDPD—Packed Double-FP Horizontal Add
-Vol. 2A 3-429
+Vol. 2A 3-431
INSTRUCTION SET REFERENCE, A-L
@@ -98912,7 +99049,7 @@ __m128d _mm_hadd_pd (__m128d a, __m128d b);
Exceptions
When the source operand is a memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.
-3-430 Vol. 2A
+3-432 Vol. 2A
HADDPD—Packed Double-FP Horizontal Add
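The horizontal-add behavior behind the `_mm_hadd_pd` intrinsic named in this hunk can be modeled in a few lines. The following is a minimal sketch, not the manual's pseudocode: the function names `haddpd` and `vhaddpd_256` are hypothetical, registers are modeled as Python tuples with the lowest element first, and the 256-bit form is assumed to operate independently on each 128-bit lane.

```python
# Scalar model of packed double-precision horizontal add.
# Registers are tuples of floats, lowest element first (an assumption
# made for illustration; the hardware works on bit ranges).

def haddpd(first, second):
    """128-bit form: low result = sum of the first operand's pair,
    high result = sum of the second operand's pair."""
    a0, a1 = first
    b0, b1 = second
    return (a0 + a1, b0 + b1)


def vhaddpd_256(first, second):
    """256-bit form: apply the 128-bit operation to each 128-bit lane."""
    low = haddpd(first[0:2], second[0:2])
    high = haddpd(first[2:4], second[2:4])
    return low + high


if __name__ == "__main__":
    print(haddpd((1.0, 2.0), (10.0, 20.0)))
    print(vhaddpd_256((1.0, 2.0, 3.0, 4.0), (10.0, 20.0, 30.0, 40.0)))
```

Note that the encoding change in this diff (dropping `NDS` from the VEX notation) is purely notational; the model above is the same for both revisions.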
@@ -98926,7 +99063,7 @@ See Exceptions Type 2.
HADDPD—Packed Double-FP Horizontal Add
-Vol. 2A 3-431
+Vol. 2A 3-433
INSTRUCTION SET REFERENCE, A-L
@@ -98973,9 +99110,9 @@ floating-point values from ymm2 and
ymm3/mem.
HADDPS xmm1, xmm2/m128
-VEX.NDS.128.F2.0F.WIG 7C /r
+VEX.128.F2.0F.WIG 7C /r
VHADDPS xmm1, xmm2, xmm3/m128
-VEX.NDS.256.F2.0F.WIG 7C /r
+VEX.256.F2.0F.WIG 7C /r
VHADDPS ymm1, ymm2, ymm3/m256
Instruction Operand Encoding
@@ -99020,7 +99157,7 @@ Adds single-precision floating-point values in the third and fourth dword of the
in the fourth dword of the destination operand.
In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).
-3-432 Vol. 2A
+3-434 Vol. 2A
HADDPS—Packed Single-FP Horizontal Add
@@ -99142,7 +99279,7 @@ register or a 256-bit memory location. The destination operand is a YMM register
HADDPS—Packed Single-FP Horizontal Add
-Vol. 2A 3-433
+Vol. 2A 3-435
INSTRUCTION SET REFERENCE, A-L
@@ -99187,7 +99324,7 @@ Overflow, Underflow, Invalid, Precision, Denormal
Other Exceptions
See Exceptions Type 2.
-3-434 Vol. 2A
+3-436 Vol. 2A
HADDPS—Packed Single-FP Horizontal Add
@@ -99281,7 +99418,7 @@ Same exceptions as in protected mode.
HLT—Halt
-Vol. 2A 3-435
+Vol. 2A 3-437
INSTRUCTION SET REFERENCE, A-L
@@ -99311,7 +99448,7 @@ Horizontal subtract packed double-precision
floating-point values from xmm2/m128 to
xmm1.
-VEX.NDS.128.66.0F.WIG 7D /r
+VEX.128.66.0F.WIG 7D /r
VHSUBPD xmm1,xmm2, xmm3/m128
RVM V/V
@@ -99322,7 +99459,7 @@ Horizontal subtract packed double-precision
floating-point values from xmm2 and
xmm3/mem.
-VEX.NDS.256.66.0F.WIG 7D /r
+VEX.256.66.0F.WIG 7D /r
VHSUBPD ymm1, ymm2, ymm3/m256
RVM V/V
@@ -99403,7 +99540,7 @@ OM15995
Figure 3-20. HSUBPD—Packed Double-FP Horizontal Subtract
-3-436 Vol. 2A
+3-438 Vol. 2A
HSUBPD—Packed Double-FP Horizontal Subtract
@@ -99479,14 +99616,14 @@ Numeric Exceptions
Overflow, Underflow, Invalid, Precision, Denormal
HSUBPD—Packed Double-FP Horizontal Subtract
-Vol. 2A 3-437
+Vol. 2A 3-439
INSTRUCTION SET REFERENCE, A-L
Other Exceptions
See Exceptions Type 2.
-3-438 Vol. 2A
+3-440 Vol. 2A
HSUBPD—Packed Double-FP Horizontal Subtract
@@ -99535,9 +99672,9 @@ floating-point values from ymm2 and
ymm3/mem.
HSUBPS xmm1, xmm2/m128
-VEX.NDS.128.F2.0F.WIG 7D /r
+VEX.128.F2.0F.WIG 7D /r
VHSUBPS xmm1, xmm2, xmm3/m128
-VEX.NDS.256.F2.0F.WIG 7D /r
+VEX.256.F2.0F.WIG 7D /r
VHSUBPS ymm1, ymm2, ymm3/m256
Instruction Operand Encoding
@@ -99585,7 +99722,7 @@ See Figure 3-22 for HSUBPS; see Figure 3-23 for VHSUBPS.
HSUBPS—Packed Single-FP Horizontal Subtract
-Vol. 2A 3-439
+Vol. 2A 3-441
INSTRUCTION SET REFERENCE, A-L
@@ -99698,7 +99835,7 @@ zeroed.
VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM
register or a 256-bit memory location. The destination operand is a YMM register.
-3-440 Vol. 2A
+3-442 Vol. 2A
HSUBPS—Packed Single-FP Horizontal Subtract
@@ -99745,7 +99882,7 @@ See Exceptions Type 2.
HSUBPS—Packed Single-FP Horizontal Subtract
-Vol. 2A 3-441
+Vol. 2A 3-443
INSTRUCTION SET REFERENCE, A-L
@@ -99923,7 +100060,7 @@ RDX
−2^63 to 2^63 − 1
-3-442 Vol. 2A
+3-444 Vol. 2A
IDIV—Signed Divide
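The quotient range in the table row above is the signed 64-bit range, and a quotient outside it raises a divide error (#DE). The sketch below illustrates that check for the 64-bit operand size. It is an illustration under stated assumptions: `idiv64` is a hypothetical helper, the dividend stands in for the RDX:RAX pair as a plain Python integer, and #DE is modeled with Python exceptions.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def idiv64(dividend, divisor):
    """Model signed division with a 64-bit quotient.

    Returns (quotient, remainder). The quotient is truncated toward
    zero and the remainder takes the sign of the dividend.
    """
    if divisor == 0:
        raise ZeroDivisionError("#DE: divide by zero")
    # Python's // floors, so divide magnitudes and restore the sign.
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor
    if not INT64_MIN <= quotient <= INT64_MAX:
        raise OverflowError("#DE: quotient out of range")
    return quotient, remainder


if __name__ == "__main__":
    print(idiv64(-7, 2))
    try:
        idiv64(INT64_MIN, -1)
    except OverflowError as exc:
        print(exc)
```

The second call shows the classic edge case: the most negative value divided by −1 would give 2^63, which is one past the top of the range.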
@@ -99984,7 +100121,7 @@ FI;
IDIV—Signed Divide
-Vol. 2A 3-443
+Vol. 2A 3-445
INSTRUCTION SET REFERENCE, A-L
@@ -100094,7 +100231,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-444 Vol. 2A
+3-446 Vol. 2A
IDIV—Signed Divide
@@ -100348,7 +100485,7 @@ general-purpose register).
IMUL—Signed Multiply
-Vol. 2A 3-445
+Vol. 2A 3-447
INSTRUCTION SET REFERENCE, A-L
@@ -100413,7 +100550,7 @@ THEN CF ← 0; OF ← 0;
ELSE CF ← 1; OF ← 1; FI;
FI;
-3-446 Vol. 2A
+3-448 Vol. 2A
IMUL—Signed Multiply
@@ -100506,7 +100643,7 @@ Same exceptions as in protected mode.
IMUL—Signed Multiply
-Vol. 2A 3-447
+Vol. 2A 3-449
INSTRUCTION SET REFERENCE, A-L
@@ -100532,7 +100669,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-448 Vol. 2A
+3-450 Vol. 2A
IMUL—Signed Multiply
@@ -100689,7 +100826,7 @@ None
IN—Input from Port
-Vol. 2A 3-449
+Vol. 2A 3-451
INSTRUCTION SET REFERENCE, A-L
@@ -100734,7 +100871,7 @@ corresponding I/O permission bits in TSS for the I/O port being accessed is 1.
If the LOCK prefix is used.
-3-450 Vol. 2A
+3-452 Vol. 2A
IN—Input from Port
@@ -100917,7 +101054,7 @@ If the LOCK prefix is used but the destination is not a memory operand.
INC—Increment by 1
-Vol. 2A 3-451
+Vol. 2A 3-453
INSTRUCTION SET REFERENCE, A-L
@@ -100980,7 +101117,7 @@ current privilege level is 3.
If the LOCK prefix is used but the destination is not a memory operand.
-3-452 Vol. 2A
+3-454 Vol. 2A
INC—Increment by 1
@@ -101131,7 +101268,7 @@ register is incremented or decremented by 1 for byte operations, by 2 for word o
operations.
INS/INSB/INSW/INSD—Input from Port to String
-Vol. 2A 3-453
+Vol. 2A 3-455
INSTRUCTION SET REFERENCE, A-L
@@ -101185,7 +101322,7 @@ THEN (E)DI ← (E)DI + 2;
ELSE (E)DI ← (E)DI – 2; FI;
ELSE (* Doubleword transfer *)
-3-454 Vol. 2A
+3-456 Vol. 2A
INS/INSB/INSW/INSD—Input from Port to String
@@ -101282,7 +101419,7 @@ If the LOCK prefix is used.
INS/INSB/INSW/INSD—Input from Port to String
-Vol. 2A 3-455
+Vol. 2A 3-457
INSTRUCTION SET REFERENCE, A-L
@@ -101307,7 +101444,7 @@ SSE4_1
66 0F 3A 21 /r ib
INSERTPS xmm1, xmm2/m32, imm8
-VEX.NDS.128.66.0F3A.WIG 21 /r ib
+VEX.128.66.0F3A.WIG 21 /r ib
VINSERTPS xmm1, xmm2,
xmm3/m32, imm8
@@ -101317,7 +101454,7 @@ V/V
AVX
-EVEX.NDS.128.66.0F3A.W0 21 /r ib
+EVEX.128.66.0F3A.W0 21 /r ib
VINSERTPS xmm1, xmm2,
xmm3/m32, imm8
@@ -101411,7 +101548,7 @@ source operand is either an XMM register or a 32-bit memory location. The upper
If VINSERTPS is encoded with VEX.L= 1, an attempt to execute the instruction encoded with VEX.L= 1 will cause
an #UD exception.
-3-456 Vol. 2A
+3-458 Vol. 2A
INSERTPS—Insert Scalar Single-Precision Floating-Point Value
@@ -101470,7 +101607,7 @@ TMP2[127:64] DEST[127:64]
2: TMP2[95:64] ← TMP
INSERTPS—Insert Scalar Single-Precision Floating-Point Value
-Vol. 2A 3-457
+Vol. 2A 3-459
INSTRUCTION SET REFERENCE, A-L
@@ -101504,7 +101641,7 @@ If VEX.L = 0.
EVEX-encoded instruction, see Exceptions Type E9NF.
-3-458 Vol. 2A
+3-460 Vol. 2A
INSERTPS—Insert Scalar Single-Precision Floating-Point Value
@@ -101645,7 +101782,7 @@ than the DPL value in the selected gate descriptor in the IDT. In contrast, the
1. The mnemonic ICEBP has also been used for the instruction with opcode F1.
INT n/INTO/INT3/INT1—Call to Interrupt Procedure
-Vol. 2A 3-459
+Vol. 2A 3-461
INSTRUCTION SET REFERENCE, A-L
@@ -101844,17 +101981,22 @@ a protected mode interrupt to privilege level 0. The interrupt gate's DPL must b
interrupt handler procedure must be 0 to execute the protected mode interrupt to privilege level 0.
The interrupt descriptor table register (IDTR) specifies the base linear address and limit of the IDT. The initial base
address value of the IDTR after the processor is powered up or reset is 0.
+Instruction ordering. Instructions following an INT n may be fetched from memory before earlier instructions
+complete execution, but they will not execute (even speculatively) until all instructions prior to the INT n have
+completed execution (the later instructions may execute before data stored by the earlier instructions have become
+globally visible). This applies also to the INTO, INT3, and INT1 instructions, but not to executions of INTO when
+EFLAGS.OF = 0.
-Operation
-The following operational description applies not only to the INT n, INTO, INT3, or INT1 instructions, but also to
-external interrupts, nonmaskable interrupts (NMIs), and exceptions. Some of these events push onto the stack an
-error code.
-3-460 Vol. 2A
+3-462 Vol. 2A
INT n/INTO/INT3/INT1—Call to Interrupt Procedure
INSTRUCTION SET REFERENCE, A-L
+Operation
+The following operational description applies not only to the INT n, INTO, INT3, or INT1 instructions, but also to
+external interrupts, nonmaskable interrupts (NMIs), and exceptions. Some of these events push onto the stack an
+error code.
The operational description specifies numerous checks whose failure may result in delivery of a nested exception.
In these cases, the original event is not delivered.
The operational description specifies the error code delivered by any nested exception. In some cases, the error
@@ -101904,17 +102046,16 @@ FI;
FI;
REAL-ADDRESS-MODE:
IF ((vector_number « 2) + 3) is not within IDT limit
-THEN #GP; FI;
-IF stack not large enough for a 6-byte return information
-THEN #SS; FI;
-Push (EFLAGS[15:0]);
-
INT n/INTO/INT3/INT1—Call to Interrupt Procedure
-Vol. 2A 3-461
+Vol. 2A 3-463
INSTRUCTION SET REFERENCE, A-L
+THEN #GP; FI;
+IF stack not large enough for a 6-byte return information
+THEN #SS; FI;
+Push (EFLAGS[15:0]);
IF ← 0; (* Clear interrupt flag *)
TF ← 0; (* Clear trap flag *)
AC ← 0; (* Clear AC flag *)
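The REAL-ADDRESS-MODE bounds check in this hunk, `((vector_number « 2) + 3)` against the IDT limit, reflects four-byte table entries: the shift gives the entry's byte offset and `+ 3` gives its last byte. A minimal sketch follows; the function names and the limit values used in the example are hypothetical, chosen only to exercise the comparison.

```python
def real_mode_vector_offset(vector_number):
    """Byte offset of a vector's 4-byte entry in the real-mode table."""
    return vector_number << 2


def vector_within_idt_limit(vector_number, idt_limit):
    """True if the last byte of the entry lies within the IDT limit.

    When this returns False, the pseudocode above signals #GP.
    """
    last_byte = (vector_number << 2) + 3
    return last_byte <= idt_limit


if __name__ == "__main__":
    # 0x3FF is an illustrative limit covering 256 four-byte entries.
    print(hex(real_mode_vector_offset(0x21)))
    print(vector_within_idt_limit(0xFF, 0x3FF))
    print(vector_within_idt_limit(0x40, 0xFF))
```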
@@ -101965,16 +102106,16 @@ IF gate not present
THEN #NP(error_code(vector_number,1,EXT));
(* idt operand to error_code set because vector is used *)
FI;
-GOTO TRAP-OR-INTERRUPT-GATE; (* Trap/interrupt gate *)
-END;
-TASK-GATE: (* PE = 1, task gate *)
-Read TSS selector in task gate (IDT descriptor);
-3-462 Vol. 2A
+3-464 Vol. 2A
INT n/INTO/INT3/INT1—Call to Interrupt Procedure
INSTRUCTION SET REFERENCE, A-L
+GOTO TRAP-OR-INTERRUPT-GATE; (* Trap/interrupt gate *)
+END;
+TASK-GATE: (* PE = 1, task gate *)
+Read TSS selector in task gate (IDT descriptor);
IF local/global bit is set to local or index not within GDT limits
THEN #GP(error_code(TSS selector,0,EXT)); FI;
(* idt operand to error_code is 0 because selector is used *)
@@ -102025,16 +102166,16 @@ GOTO INTERRUPT-FROM-VIRTUAL-8086-MODE; FI;
FI;
ELSE (* PE = 1, interrupt or trap gate, DPL ≥ CPL *)
IF VM = 1
-THEN #GP(error_code(new code-segment selector,0,EXT));
-(* idt operand to error_code is 0 because selector is used *)
-IF new code segment is conforming or new code-segment DPL = CPL
-THEN
INT n/INTO/INT3/INT1—Call to Interrupt Procedure
-Vol. 2A 3-463
+Vol. 2A 3-465
INSTRUCTION SET REFERENCE, A-L
+THEN #GP(error_code(new code-segment selector,0,EXT));
+(* idt operand to error_code is 0 because selector is used *)
+IF new code segment is conforming or new code-segment DPL = CPL
+THEN
GOTO INTRA-PRIVILEGE-LEVEL-INTERRUPT;
ELSE (* PE = 1, interrupt or trap gate, nonconforming code segment, DPL > CPL *)
#GP(error_code(new code-segment selector,0,EXT));
@@ -102086,16 +102227,17 @@ FI;
IF (TSSstackAddress + 7) > current TSS limit
THEN #TS(error_code(current TSS selector,0,EXT)); FI;
(* idt operand to error_code is 0 because selector is used *)
-NewRSP ← 8 bytes loaded from (current TSS base + TSSstackAddress);
-NewSS ← new code-segment DPL; (* NULL selector with RPL = new CPL *)
-FI;
-IF IDT gate is 32-bit
-3-464 Vol. 2A
+3-466 Vol. 2A
INT n/INTO/INT3/INT1—Call to Interrupt Procedure
INSTRUCTION SET REFERENCE, A-L
+NewRSP ← 8 bytes loaded from (current TSS base + TSSstackAddress);
+NewSS ← new code-segment DPL; (* NULL selector with RPL = new CPL *)
+
+FI;
+IF IDT gate is 32-bit
THEN
IF new stack does not have room for 24 bytes (error code pushed)
or 20 bytes (no error code pushed)
@@ -102146,16 +102288,16 @@ Push(far pointer to old stack);
Push(EFLAGS);
Push(far pointer to return instruction);
(* Old CS and EIP, 3 words padded to 4 *)
-Push(ErrorCode); (* If needed, 4 bytes *)
-ELSE
-IF IDT gate 16-bit
-THEN
INT n/INTO/INT3/INT1—Call to Interrupt Procedure
-Vol. 2A 3-465
+Vol. 2A 3-467
INSTRUCTION SET REFERENCE, A-L
+Push(ErrorCode); (* If needed, 4 bytes *)
+ELSE
+IF IDT gate 16-bit
+THEN
Push(far pointer to old stack);
(* Old SS and SP, 2 words *)
Push(EFLAGS[15:0]);
@@ -102206,16 +102348,16 @@ Read new stack-segment descriptor for NewSS in GDT or LDT;
IF new stack-segment DPL ≠ 0 or stack segment does not indicate writable data segment
THEN #TS(error_code(NewSS,0,EXT)); FI;
(* idt operand to error_code is 0 because selector is used *)
-IF new stack segment not present
-THEN #SS(error_code(NewSS,0,EXT)); FI;
-(* idt operand to error_code is 0 because selector is used *)
-IF IDT gate is 32-bit
-3-466 Vol. 2A
+3-468 Vol. 2A
INT n/INTO/INT3/INT1—Call to Interrupt Procedure
INSTRUCTION SET REFERENCE, A-L
+IF new stack segment not present
+THEN #SS(error_code(NewSS,0,EXT)); FI;
+(* idt operand to error_code is 0 because selector is used *)
+IF IDT gate is 32-bit
THEN
IF new stack does not have room for 40 bytes (error code pushed)
or 36 bytes (no error code pushed)
@@ -102266,16 +102408,16 @@ EIP ← Gate(instruction pointer) AND 0000FFFFH;
FI;
(* Start execution of new routine in Protected Mode *)
END;
-INTRA-PRIVILEGE-LEVEL-INTERRUPT:
-(* PE = 1, DPL = CPL or conforming segment *)
-IF IA32_EFER.LMA = 1 (* IA-32e mode *)
-IF IDT-descriptor IST ≠ 0
INT n/INTO/INT3/INT1—Call to Interrupt Procedure
-Vol. 2A 3-467
+Vol. 2A 3-469
INSTRUCTION SET REFERENCE, A-L
+INTRA-PRIVILEGE-LEVEL-INTERRUPT:
+(* PE = 1, DPL = CPL or conforming segment *)
+IF IA32_EFER.LMA = 1 (* IA-32e mode *)
+IF IDT-descriptor IST ≠ 0
THEN
TSSstackAddress ← (IDT-descriptor IST « 3) + 28;
IF (TSSstackAddress + 7) > TSS limit
@@ -102283,7 +102425,6 @@ THEN #TS(error_code(current TSS selector,0,EXT)); FI;
(* idt operand to error_code is 0 because selector is used *)
NewRSP ← 8 bytes loaded from (current TSS base + TSSstackAddress);
ELSE NewRSP ← RSP;
-
FI;
FI;
IF 32-bit gate (* implies IA32_EFER.LMA = 0 *)
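The IST lookup in the pseudocode above, `TSSstackAddress ← (IDT-descriptor IST « 3) + 28` followed by the `+ 7` limit check, can be worked through numerically. This is a sketch under stated assumptions: the function names are hypothetical, the IST field is taken to be a 3-bit value where 0 means "no IST", and the TSS limits in the example are illustrative.

```python
def ist_tss_stack_address(ist):
    """Offset within the TSS of the 8-byte stack pointer for a nonzero IST."""
    if not 1 <= ist <= 7:
        raise ValueError("IST index must be 1..7 (0 means IST is not used)")
    return (ist << 3) + 28


def ist_slot_within_tss(ist, tss_limit):
    """True if all 8 bytes of the slot fit inside the TSS limit.

    When this returns False, the pseudocode above signals #TS.
    """
    return ist_tss_stack_address(ist) + 7 <= tss_limit


if __name__ == "__main__":
    for ist in (1, 7):
        print(ist, ist_tss_stack_address(ist))
    print(ist_slot_within_tss(7, 0x67))  # limit chosen for illustration
    print(ist_slot_within_tss(7, 0x57))  # too small: slot 7 ends at byte 91
```

Each slot is 8 bytes wide, which is why the index is shifted left by 3 and why the limit check adds 7 to reach the slot's last byte.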
@@ -102327,16 +102468,16 @@ ELSE (* IA32_EFER.LMA = 1, 64-bit gate*)
Push(far pointer to old stack);
(* Old SS and SP, each an 8-byte push *)
Push(RFLAGS); (* 8-byte push *)
-Push(far pointer to return instruction);
-(* Old CS and RIP, each an 8-byte push *)
-Push(ErrorCode); (* If needed, 8 bytes *)
-CS:RIP ← GATE(CS:RIP);
-3-468 Vol. 2A
+3-470 Vol. 2A
INT n/INTO/INT3/INT1—Call to Interrupt Procedure
INSTRUCTION SET REFERENCE, A-L
+Push(far pointer to return instruction);
+(* Old CS and RIP, each an 8-byte push *)
+Push(ErrorCode); (* If needed, 8 bytes *)
+CS:RIP ← GATE(CS:RIP);
(* Segment descriptor information also loaded *)
FI;
FI;
@@ -102407,7 +102548,7 @@ If alignment checking is enabled, the gate DPL is 3, and a stack push is unalign
INT n/INTO/INT3/INT1—Call to Interrupt Procedure
-Vol. 2A 3-469
+Vol. 2A 3-471
INSTRUCTION SET REFERENCE, A-L
@@ -102484,7 +102625,7 @@ If alignment checking is enabled, the gate DPL is 3, and a stack push is unalign
Compatibility Mode Exceptions
Same exceptions as in protected mode.
-3-470 Vol. 2A
+3-472 Vol. 2A
INT n/INTO/INT3/INT1—Call to Interrupt Procedure
@@ -102534,7 +102675,7 @@ If alignment checking is enabled, the gate DPL is 3, and a stack push is unalign
INT n/INTO/INT3/INT1—Call to Interrupt Procedure
-Vol. 2A 3-471
+Vol. 2A 3-473
INSTRUCTION SET REFERENCE, A-L
@@ -102623,19 +102764,22 @@ Protected Mode Exceptions
#GP(0)
If the current privilege level is not 0.
+If the processor reserved memory protections are activated.
#UD
+3-474 Vol. 2A
+
If the LOCK prefix is used.
+INVD—Invalidate Internal Caches
+
+ INSTRUCTION SET REFERENCE, A-L
+
Real-Address Mode Exceptions
#UD
-3-472 Vol. 2A
If the LOCK prefix is used.
-INVD—Invalidate Internal Caches
-
- INSTRUCTION SET REFERENCE, A-L
Virtual-8086 Mode Exceptions
#GP(0)
@@ -102650,7 +102794,7 @@ Same exceptions as in protected mode.
INVD—Invalidate Internal Caches
-Vol. 2A 3-473
+Vol. 2A 3-475
INSTRUCTION SET REFERENCE, A-L
@@ -102746,7 +102890,7 @@ If the LOCK prefix is used.
1. If the paging structures map the linear address using a page larger than 4 KBytes and there are multiple TLB entries for that page
(see Section 4.10.2.3, “Details of TLB Use,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A), the
instruction invalidates all of them.
-3-474 Vol. 2A
+3-476 Vol. 2A
INVLPG—Invalidate TLB Entries
@@ -102775,7 +102919,7 @@ If the LOCK prefix is used.
INVLPG—Invalidate TLB Entries
-Vol. 2A 3-475
+Vol. 2A 3-477
INSTRUCTION SET REFERENCE, A-L
@@ -102889,7 +103033,7 @@ Figure 3-24. INVPCID Descriptor
1. If the paging structures map the linear address using a page larger than 4 KBytes and there are multiple TLB entries for that page
(see Section 4.10.2.3, “Details of TLB Use,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A), the
instruction invalidates all of them.
-3-476 Vol. 2A
+3-478 Vol. 2A
INVPCID—Invalidate Process-Context Identifier
@@ -102962,7 +103106,7 @@ If the LOCK prefix is used.
INVPCID—Invalidate Process-Context Identifier
-Vol. 2A 3-477
+Vol. 2A 3-479
INSTRUCTION SET REFERENCE, A-L
@@ -103011,7 +103155,7 @@ If the memory destination operand is in the SS segment and the memory address is
If the LOCK prefix is used.
If CPUID.(EAX=07H, ECX=0H):EBX.INVPCID (bit 10) = 0.
-3-478 Vol. 2A
+3-480 Vol. 2A
INVPCID—Invalidate Process-Context Identifier
@@ -103135,13 +103279,14 @@ IA-32 Architectures Software Developer’s Manual, Volume 3A), execution of the
IRET/IRETD—Interrupt Return
-Vol. 2A 3-479
+Vol. 2A 3-481
INSTRUCTION SET REFERENCE, A-L
This unblocking occurs even if the instruction causes a fault. In such a case, NMIs are unmasked before the exception handler is invoked.
In 64-bit mode, the instruction’s default operation size is 32 bits. Use of the REX.W prefix promotes operation to 64
bits (IRETQ). See the summary chart at the beginning of this section for encoding data and limits.
+Instruction ordering. IRET is a serializing instruction. See Section 8.3 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.
See “Changes to Instruction Behavior in VMX Non-Root Operation” in Chapter 25 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C, for more information about the behavior of this instruction in
VMX non-root operation.
@@ -103186,14 +103331,14 @@ CS ← Pop(); (* 16-bit pop *)
EFLAGS[15:0] ← Pop(); (* IOPL in EFLAGS not modified by pop *)
IF EIP not within CS limit
THEN #GP(0); FI;
-FI;
-ELSE
-3-480 Vol. 2A
+3-482 Vol. 2A
IRET/IRETD—Interrupt Return
INSTRUCTION SET REFERENCE, A-L
+FI;
+ELSE
#GP(0); (* Trap to virtual-8086 monitor: PE
FI;
END;
@@ -103247,14 +103392,14 @@ THEN GOTO RETURN-TO-OUTER-PRIVILEGE-LEVEL;
ELSE GOTO RETURN-TO-SAME-PRIVILEGE-LEVEL; FI;
END;
RETURN-TO-OUTER-PRIVILEGE-LEVEL:
-IF OperandSize = 32
-THEN
IRET/IRETD—Interrupt Return
-Vol. 2A 3-481
+Vol. 2A 3-483
INSTRUCTION SET REFERENCE, A-L
+IF OperandSize = 32
+THEN
ESP ← Pop();
SS ← Pop(); (* 32-bit pop, high-order 16 bits discarded *)
ELSE IF OperandSize = 16
@@ -103306,14 +103451,14 @@ FI;
EFLAGS (CF, PF, AF, ZF, SF, TF, DF, OF, NT) ← tempEFLAGS;
IF OperandSize = 32 or OperandSize = 64
THEN EFLAGS(RF, AC, ID) ← tempEFLAGS; FI;
-IF CPL ≤ IOPL
-THEN EFLAGS(IF) ← tempEFLAGS; FI;
-3-482 Vol. 2A
+3-484 Vol. 2A
IRET/IRETD—Interrupt Return
INSTRUCTION SET REFERENCE, A-L
+IF CPL ≤ IOPL
+THEN EFLAGS(IF) ← tempEFLAGS; FI;
IF CPL = 0
THEN
EFLAGS(IOPL) ← tempEFLAGS;
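The EFLAGS restore steps visible in this hunk can be read as a sequence of mask selections. The following is a minimal sketch, not the manual's full algorithm: the helper name `iret_restore_eflags` and its parameters are hypothetical, and only the four conditions shown in the excerpt are modeled (the remainder of the CPL = 0 branch is elided in this diff and is not reproduced).

```python
# Architectural bit positions of the EFLAGS fields named in the pseudocode.
CF, PF, AF, ZF, SF, TF = 1 << 0, 1 << 2, 1 << 4, 1 << 6, 1 << 7, 1 << 8
IF_, DF, OF = 1 << 9, 1 << 10, 1 << 11
IOPL = 0b11 << 12
NT = 1 << 14
RF, AC, ID = 1 << 16, 1 << 18, 1 << 21


def iret_restore_eflags(eflags, temp_eflags, cpl, operand_size):
    """Hypothetical helper: apply only the steps visible in the excerpt."""
    # Always restored from the popped image.
    mask = CF | PF | AF | ZF | SF | TF | DF | OF | NT
    # Upper flags are restored only for 32- and 64-bit operand sizes.
    if operand_size in (32, 64):
        mask |= RF | AC | ID
    # IF is restored only when CPL <= current IOPL.
    current_iopl = (eflags & IOPL) >> 12
    if cpl <= current_iopl:
        mask |= IF_
    # IOPL is restored only at CPL 0.
    if cpl == 0:
        mask |= IOPL
    return (eflags & ~mask) | (temp_eflags & mask)
```

For example, with a current IOPL of 0, a return executed at CPL 3 leaves IF and IOPL untouched even if the popped image sets them.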
@@ -103362,17 +103507,17 @@ FI;
GOTO RETURN-TO-SAME-PRIVILEGE-LEVEL; FI;
END;
-Flags Affected
-All the flags and fields in the EFLAGS register are potentially modified, depending on the mode of operation of the
-processor. If performing a return from a nested task to a previous task, the EFLAGS register will be modified
-according to the EFLAGS image stored in the previous task’s TSS.
-
IRET/IRETD—Interrupt Return
-Vol. 2A 3-483
+Vol. 2A 3-485
INSTRUCTION SET REFERENCE, A-L
+Flags Affected
+All the flags and fields in the EFLAGS register are potentially modified, depending on the mode of operation of the
+processor. If performing a return from a nested task to a previous task, the EFLAGS register will be modified
+according to the EFLAGS image stored in the previous task’s TSS.
+
Protected Mode Exceptions
#GP(0)
@@ -103404,6 +103549,7 @@ If the top bytes of stack are not within stack limits.
If the return code segment is not present.
+If the return stack segment is not present.
#PF(fault-code)
If a page fault occurs.
@@ -103417,8 +103563,6 @@ enabled.
If the LOCK prefix is used.
-If the return stack segment is not present.
-
Real-Address Mode Exceptions
#GP
@@ -103457,7 +103601,7 @@ If EFLAGS.NT[bit 14] = 1.
Other exceptions same as in Protected Mode.
-3-484 Vol. 2A
+3-486 Vol. 2A
IRET/IRETD—Interrupt Return
@@ -103515,7 +103659,7 @@ If the LOCK prefix is used.
IRET/IRETD—Interrupt Return
-Vol. 2A 3-485
+Vol. 2A 3-487
INSTRUCTION SET REFERENCE, A-L
@@ -103969,7 +104113,7 @@ Valid
Jump near if above or equal (CF=0). Not
supported in 64-bit mode.
-3-486 Vol. 2A
+3-488 Vol. 2A
Jcc—Jump if Condition Is Met
@@ -104353,7 +104497,7 @@ Jump near if not carry (CF=0).
Jcc—Jump if Condition Is Met
-Vol. 2A 3-487
+Vol. 2A 3-489
INSTRUCTION SET REFERENCE, A-L
@@ -104708,7 +104852,7 @@ Valid
Jump near if sign (SF=1). Not supported in 64-bit mode.
-3-488 Vol. 2A
+3-490 Vol. 2A
Jcc—Jump if Condition Is Met
@@ -104821,7 +104965,7 @@ Near is RIP = RIP + 32-bit offset sign extended to 64 bits.
Jcc—Jump if Condition Is Met
-Vol. 2A 3-489
+Vol. 2A 3-491
INSTRUCTION SET REFERENCE, A-L
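The near-jump rule quoted in this hunk header (RIP = RIP + 32-bit offset sign-extended to 64 bits) can be sketched as follows. The function name and parameters are hypothetical; `next_rip` is assumed to be the address of the instruction following the Jcc, which is the value RIP holds when the displacement is added.

```python
MASK64 = (1 << 64) - 1


def jcc_near_target(next_rip, rel32):
    """Hypothetical helper: compute a 64-bit near-jump target from a raw rel32 field."""
    rel32 &= 0xFFFFFFFF
    # Sign-extend the 32-bit displacement.
    disp = rel32 - (1 << 32) if rel32 & 0x80000000 else rel32
    # RIP arithmetic wraps at 64 bits.
    return (next_rip + disp) & MASK64
```

A displacement of `0xFFFFFFFC` is therefore a backward jump of 4 bytes, not a forward jump of almost 4 GiB.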
@@ -104875,7 +105019,7 @@ If the memory address is in a non-canonical form.
If the LOCK prefix is used.
-3-490 Vol. 2A
+3-492 Vol. 2A
Jcc—Jump if Condition Is Met
@@ -105099,7 +105243,7 @@ an offset from the base of the code segment) or a relative offset (a signed disp
JMP—Jump
-Vol. 2A 3-491
+Vol. 2A 3-493
INSTRUCTION SET REFERENCE, A-L
@@ -105158,7 +105302,7 @@ The JMP instruction can also specify the segment selector of the TSS directly, w
task gate. See Chapter 7 in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, for
detailed information on the mechanics of a task switch.
-3-492 Vol. 2A
+3-494 Vol. 2A
JMP—Jump
@@ -105167,10 +105311,14 @@ JMP—Jump
Note that when you execute a task switch with a JMP instruction, the nested task flag (NT) is not set in the EFLAGS
register and the new TSS’s previous task link field is not loaded with the old task’s TSS selector. A return to the
previous task can thus not be carried out by executing the IRET instruction. Switching tasks with the JMP instruction differs in this regard from the CALL instruction which does set the NT flag and save the previous task link information, allowing a return to the calling task with an IRET instruction.
-In 64-Bit Mode — The instruction’s operation size is fixed at 64 bits. If a selector points to a gate, then RIP equals
-the 64-bit displacement taken from gate; else RIP equals the zero-extended offset from the far pointer referenced
-in the instruction.
+In 64-Bit Mode. The instruction’s operation size is fixed at 64 bits. If a selector points to a gate, then RIP equals the
+64-bit displacement taken from gate; else RIP equals the zero-extended offset from the far pointer referenced in
+the instruction.
See the summary chart at the beginning of this section for encoding data and limits.
+Instruction ordering. Instructions following a far jump may be fetched from memory before earlier instructions
+complete execution, but they will not execute (even speculatively) until all instructions prior to the far jump have
+completed execution (the later instructions may execute before data stored by the earlier instructions have
+become globally visible).
Operation
IF near jump
@@ -105211,22 +105359,21 @@ FI;
IF far jump and (PE = 0 or (PE = 1 AND VM = 1)) (* Real-address or virtual-8086 mode *)
THEN
tempEIP ← DEST(Offset); (* DEST is ptr16:32 or [m16:32] *)
-IF tempEIP is beyond code segment limit
-THEN #GP(0); FI;
-CS ← DEST(segment selector); (* DEST is ptr16:32 or [m16:32] *)
-IF OperandSize = 32
JMP—Jump
-Vol. 2A 3-493
+Vol. 2A 3-495
INSTRUCTION SET REFERENCE, A-L
+IF tempEIP is beyond code segment limit
+THEN #GP(0); FI;
+CS ← DEST(segment selector); (* DEST is ptr16:32 or [m16:32] *)
+IF OperandSize = 32
THEN
EIP ← tempEIP; (* DEST is ptr16:32 or [m16:32] *)
ELSE (* OperandSize = 16 *)
EIP ← tempEIP AND 0000FFFFH; (* Clear upper 16 bits *)
-
FI;
FI;
IF far jump and (PE = 1 and VM = 0)
@@ -105273,17 +105420,17 @@ tempEIP outside code segment limit
THEN #GP(0); FI
IF tempEIP is non-canonical
THEN #GP(0); FI;
-CS ← DEST[segment selector]; (* Segment descriptor information also loaded *)
-CS(RPL) ← CPL
-EIP ← tempEIP;
-END;
-3-494 Vol. 2A
+3-496 Vol. 2A
JMP—Jump
INSTRUCTION SET REFERENCE, A-L
+CS ← DEST[segment selector]; (* Segment descriptor information also loaded *)
+CS(RPL) ← CPL
+EIP ← tempEIP;
+END;
NONCONFORMING-CODE-SEGMENT:
IF L-Bit = 1 and D-BIT = 1 and IA32_EFER.LMA = 1
THEN #GP(new code segment selector); FI;
@@ -105333,16 +105480,16 @@ CS(RPL) ← CPL;
EIP ← tempEIP;
END;
TASK-GATE:
-IF task gate DPL < CPL
-or task gate DPL < task gate segment-selector RPL
-THEN #GP(task gate selector); FI;
-IF task gate not present
JMP—Jump
-Vol. 2A 3-495
+Vol. 2A 3-497
INSTRUCTION SET REFERENCE, A-L
+IF task gate DPL < CPL
+or task gate DPL < task gate segment-selector RPL
+THEN #GP(task gate selector); FI;
+IF task gate not present
THEN #NP(gate selector); FI;
Read the TSS segment selector in the task-gate descriptor;
IF TSS segment selector local/global bit is set to local
@@ -105396,18 +105543,18 @@ If the segment descriptor for selector in a call gate does not indicate it is a
If the segment descriptor for the segment selector in a task gate does not indicate an available
TSS.
If the segment selector for a TSS has its local/global bit set for local.
-If a TSS segment descriptor specifies that the TSS is busy or not available.
-#SS(0)
-
-3-496 Vol. 2A
-
-If a memory operand effective address is outside the SS segment limit.
+3-498 Vol. 2A
JMP—Jump
INSTRUCTION SET REFERENCE, A-L
+If a TSS segment descriptor specifies that the TSS is busy or not available.
+#SS(0)
+
+If a memory operand effective address is outside the SS segment limit.
+
#NP (selector)
If the code segment being accessed is not present.
@@ -105492,16 +105639,16 @@ If the upper type field of a 64-bit call gate is not 0x0.
If the segment selector from a 64-bit call gate is beyond the descriptor table limits.
If the code segment descriptor pointed to by the selector in the 64-bit gate doesn't have the
L-bit set and the D-bit clear.
-If the segment descriptor for a segment selector from the 64-bit call gate does not indicate it
-is a code segment.
-If the code segment is non-conforming and CPL ≠ DPL.
JMP—Jump
-Vol. 2A 3-497
+Vol. 2A 3-499
INSTRUCTION SET REFERENCE, A-L
+If the segment descriptor for a segment selector from the 64-bit call gate does not indicate it
+is a code segment.
+If the code segment is non-conforming and CPL ≠ DPL.
If the code segment is conforming and CPL < DPL.
#NP(selector)
@@ -105521,7 +105668,7 @@ If a page fault occurs.
If alignment checking is enabled and an unaligned memory reference is made while the
current privilege level is 3.
-3-498 Vol. 2A
+3-500 Vol. 2A
JMP—Jump
@@ -105625,7 +105772,7 @@ See Exceptions Type K20.
KADDW/KADDB/KADDQ/KADDD—ADD Two Masks
-Vol. 2A 3-499
+Vol. 2A 3-501
INSTRUCTION SET REFERENCE, A-L
@@ -105647,7 +105794,7 @@ Feature
Flag
AVX512F
-VEX.NDS.L1.0F.W0 41 /r
+VEX.L1.0F.W0 41 /r
KANDW k1, k2, k3
VEX.L1.66.0F.W0 41 /r
KANDB k1, k2, k3
@@ -105731,7 +105878,7 @@ None
Other Exceptions
See Exceptions Type K20.
-3-500 Vol. 2A
+3-502 Vol. 2A
KANDW/KANDB/KANDQ/KANDD—Bitwise Logical AND Masks
@@ -105755,7 +105902,7 @@ Feature
Flag
AVX512F
-VEX.NDS.L1.0F.W0 42 /r
+VEX.L1.0F.W0 42 /r
KANDNW k1, k2, k3
VEX.L1.66.0F.W0 42 /r
KANDNB k1, k2, k3
@@ -105841,7 +105988,7 @@ See Exceptions Type K20.
KANDNW/KANDNB/KANDNQ/KANDND—Bitwise Logical AND NOT Masks
-Vol. 2A 3-501
+Vol. 2A 3-503
INSTRUCTION SET REFERENCE, A-L
@@ -106054,7 +106201,7 @@ to a general-purpose register (GPR), the result is zero-extended to the size of
default GPR destination’s size is 32 bits. In 64-bit mode, the default GPR destination’s size is 64 bits. Note that
VEX.W can only be used to modify the size of the GPR operand in 64-bit mode.
-3-502 Vol. 2A
+3-504 Vol. 2A
KMOVW/KMOVB/KMOVQ/KMOVD—Move from and to Mask Registers
@@ -106097,7 +106244,7 @@ Instructions with RM or MR operand encoding See Exceptions Type K21.
KMOVW/KMOVB/KMOVQ/KMOVD—Move from and to Mask Registers
-Vol. 2A 3-503
+Vol. 2A 3-505
INSTRUCTION SET REFERENCE, A-L
@@ -106198,7 +106345,7 @@ None
Other Exceptions
See Exceptions Type K20.
-3-504 Vol. 2A
+3-506 Vol. 2A
KNOTW/KNOTB/KNOTQ/KNOTD—NOT Mask Register
@@ -106222,7 +106369,7 @@ Feature
Flag
AVX512F
-VEX.NDS.L1.0F.W0 45 /r
+VEX.L1.0F.W0 45 /r
KORW k1, k2, k3
VEX.L1.66.0F.W0 45 /r
KORB k1, k2, k3
@@ -106308,7 +106455,7 @@ See Exceptions Type K20.
KORW/KORB/KORQ/KORD—Bitwise Logical OR Masks
-Vol. 2A 3-505
+Vol. 2A 3-507
INSTRUCTION SET REFERENCE, A-L
@@ -106408,7 +106555,7 @@ THEN CF ← 1
ELSE CF ← 0
FI;
-3-506 Vol. 2A
+3-508 Vol. 2A
KORTESTW/KORTESTB/KORTESTQ/KORTESTD—OR Masks And Set Flags
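The CF assignment in the hunk above is the tail of the KORTEST flag logic. A minimal sketch of that logic, under the assumption that masks are modeled as plain Python integers and with a hypothetical function name, could look like this:

```python
def kortest(k1, k2, width=16):
    """Hypothetical model: OR two masks and derive (ZF, CF) from the result."""
    full = (1 << width) - 1
    tmp = (k1 | k2) & full
    zf = int(tmp == 0)      # ZF = 1 when the OR is all zeros
    cf = int(tmp == full)   # CF = 1 when the OR is all ones
    return zf, cf
```

The `width` parameter stands in for the W/B/Q/D variants (16, 8, 64, 32 bits).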
@@ -106448,7 +106595,7 @@ See Exceptions Type K20.
KORTESTW/KORTESTB/KORTESTQ/KORTESTD—OR Masks And Set Flags
-Vol. 2A 3-507
+Vol. 2A 3-509
INSTRUCTION SET REFERENCE, A-L
@@ -106552,7 +106699,7 @@ THEN
DEST[63:0] ← SRC1[63:0] << COUNT;
FI;
-3-508 Vol. 2A
+3-510 Vol. 2A
KSHIFTLW/KSHIFTLB/KSHIFTLQ/KSHIFTLD—Shift Left Mask Registers
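The shift shown above applies only when COUNT is within the mask width; otherwise the destination is left zeroed. A small sketch of that behavior, with a hypothetical function name and masks modeled as Python integers:

```python
def kshiftl(src, count, width=64):
    """Hypothetical model of a mask left shift with an 8-bit immediate count."""
    count &= 0xFF                 # COUNT comes from imm8
    if count >= width:            # out-of-range counts yield an all-zero mask
        return 0
    return (src << count) & ((1 << width) - 1)
```

Unlike general-purpose shifts, the count is not reduced modulo the operand width, so a count of 64 on a 64-bit mask produces zero rather than the unshifted source.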
@@ -106580,7 +106727,7 @@ See Exceptions Type K20.
KSHIFTLW/KSHIFTLB/KSHIFTLQ/KSHIFTLD—Shift Left Mask Registers
-Vol. 2A 3-509
+Vol. 2A 3-511
INSTRUCTION SET REFERENCE, A-L
@@ -106684,7 +106831,7 @@ THEN
DEST[63:0] ← SRC1[63:0] >> COUNT;
FI;
-3-510 Vol. 2A
+3-512 Vol. 2A
KSHIFTRW/KSHIFTRB/KSHIFTRQ/KSHIFTRD—Shift Right Mask Registers
@@ -106712,7 +106859,7 @@ See Exceptions Type K20.
KSHIFTRW/KSHIFTRB/KSHIFTRQ/KSHIFTRD—Shift Right Mask Registers
-Vol. 2A 3-511
+Vol. 2A 3-513
INSTRUCTION SET REFERENCE, A-L
@@ -106818,7 +106965,7 @@ ELSE CF ← 0;
FI;
AF ← OF ← PF ← SF ← 0;
-3-512 Vol. 2A
+3-514 Vol. 2A
KTESTW/KTESTB/KTESTQ/KTESTD—Packed Bit Test Masks and Set Flags
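The flag assignments above close the KTEST operation, which derives ZF from an AND and CF from an AND-NOT of the two masks. The sketch below models that with Python integers; the function name and the dictionary return shape are illustrative assumptions.

```python
def ktest(k1, k2, width=16):
    """Hypothetical model: ZF from (k2 AND k1), CF from (k2 AND NOT k1)."""
    full = (1 << width) - 1
    zf = int(((k2 & k1) & full) == 0)
    cf = int(((k2 & ~k1) & full) == 0)
    # The remaining status flags are cleared.
    return {"ZF": zf, "CF": cf, "AF": 0, "OF": 0, "PF": 0, "SF": 0}
```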
@@ -106859,7 +107006,7 @@ See Exceptions Type K20.
KTESTW/KTESTB/KTESTQ/KTESTD—Packed Bit Test Masks and Set Flags
-Vol. 2A 3-513
+Vol. 2A 3-515
INSTRUCTION SET REFERENCE, A-L
@@ -106869,6 +107016,13 @@ Instruction
Op/En
+CPUID
+Feature
+Flag
+AVX512F
+
+Description
+
RVR
64/32
@@ -106876,20 +107030,13 @@ bit Mode
Support
V/V
-CPUID
-Feature
-Flag
-AVX512F
-
-VEX.NDS.L1.66.0F.W0 4B /r
+VEX.L1.66.0F.W0 4B /r
KUNPCKBW k1, k2, k3
-VEX.NDS.L1.0F.W0 4B /r
+VEX.L1.0F.W0 4B /r
KUNPCKWD k1, k2, k3
-VEX.NDS.L1.0F.W1 4B /r
+VEX.L1.0F.W1 4B /r
KUNPCKDQ k1, k2, k3
-Description
-
RVR
V/V
@@ -106902,11 +107049,12 @@ V/V
AVX512BW
-Unpack and interleave 8 bits masks in k2 and k3 and write
-word result in k1.
-Unpack and interleave 16 bits in k2 and k3 and write doubleword result in k1.
-Unpack and interleave 32 bits masks in k2 and k3 and write
-quadword result in k1.
+Unpack 16-bit masks in k2 and k3 and write doubleword result
+in k1.
+Unpack 32-bit masks in k2 and k3 and write quadword result
+in k1.
+
+Unpack 8-bit masks in k2 and k3 and write word result in k1.
Instruction Operand Encoding
Op/En
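The reworded KUNPCK descriptions above amount to a concatenation: the low half of k2 becomes the high half of the result and the low half of k3 becomes the low half. A minimal sketch, assuming masks are plain integers and using a hypothetical function name:

```python
def kunpck(k2, k3, half_bits):
    """Hypothetical model: concatenate the low halves of k2 (high) and k3 (low)."""
    half = (1 << half_bits) - 1
    return ((k2 & half) << half_bits) | (k3 & half)
```

With `half_bits` of 8, 16, and 32 this corresponds to the KUNPCKBW, KUNPCKWD, and KUNPCKDQ forms respectively.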
@@ -106957,7 +107105,7 @@ None
Other Exceptions
See Exceptions Type K20.
-3-514 Vol. 2A
+3-516 Vol. 2A
KUNPCKBW/KUNPCKWD/KUNPCKDQ—Unpack for Mask Registers
@@ -106969,6 +107117,13 @@ Instruction
Op/En
+CPUID
+Feature
+Flag
+AVX512F
+
+Description
+
RVR
64/32
@@ -106976,12 +107131,7 @@ bit Mode
Support
V/V
-CPUID
-Feature
-Flag
-AVX512F
-
-VEX.NDS.L1.0F.W0 46 /r
+VEX.L1.0F.W0 46 /r
KXNORW k1, k2, k3
VEX.L1.66.0F.W0 46 /r
KXNORB k1, k2, k3
@@ -106990,17 +107140,13 @@ KXNORQ k1, k2, k3
VEX.L1.66.0F.W1 46 /r
KXNORD k1, k2, k3
-Description
-
-Bitwise XNOR 16 bits masks k2 and k3 and place result in k1.
-
RVR
V/V
AVX512DQ
-Bitwise XNOR 8 bits masks k2 and k3 and place result in k1.
+Bitwise XNOR 8-bit masks k2 and k3 and place result in k1.
RVR
@@ -107008,7 +107154,7 @@ V/V
AVX512BW
-Bitwise XNOR 64 bits masks k2 and k3 and place result in k1.
+Bitwise XNOR 64-bit masks k2 and k3 and place result in k1.
RVR
@@ -107016,7 +107162,9 @@ V/V
AVX512BW
-Bitwise XNOR 32 bits masks k2 and k3 and place result in k1.
+Bitwise XNOR 32-bit masks k2 and k3 and place result in k1.
+
+Bitwise XNOR 16-bit masks k2 and k3 and place result in k1.
Instruction Operand Encoding
Op/En
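As a quick illustration of the XNOR descriptions above, the operation is the complement of XOR truncated to the mask width. The function name below is hypothetical and masks are modeled as Python integers.

```python
def kxnor(k2, k3, width=16):
    """Hypothetical model: bitwise XNOR of two masks, limited to `width` bits."""
    return ~(k2 ^ k3) & ((1 << width) - 1)
```

Because any value XNORed with itself is all ones, `kxnor(k, k)` yields a full mask regardless of `k`.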
@@ -107067,7 +107215,7 @@ See Exceptions Type K20.
KXNORW/KXNORB/KXNORQ/KXNORD—Bitwise Logical XNOR Masks
-Vol. 2A 3-515
+Vol. 2A 3-517
INSTRUCTION SET REFERENCE, A-L
@@ -107091,7 +107239,7 @@ bit Mode
Support
V/V
-VEX.NDS.L1.0F.W0 47 /r
+VEX.L1.0F.W0 47 /r
KXORW k1, k2, k3
VEX.L1.66.0F.W0 47 /r
KXORB k1, k2, k3
@@ -107106,7 +107254,7 @@ V/V
AVX512DQ
-Bitwise XOR 8 bits masks k2 and k3 and place result in k1.
+Bitwise XOR 8-bit masks k2 and k3 and place result in k1.
RVR
@@ -107114,7 +107262,7 @@ V/V
AVX512BW
-Bitwise XOR 64 bits masks k2 and k3 and place result in k1.
+Bitwise XOR 64-bit masks k2 and k3 and place result in k1.
RVR
@@ -107122,9 +107270,9 @@ V/V
AVX512BW
-Bitwise XOR 32 bits masks k2 and k3 and place result in k1.
+Bitwise XOR 32-bit masks k2 and k3 and place result in k1.
-Bitwise XOR 16 bits masks k2 and k3 and place result in k1.
+Bitwise XOR 16-bit masks k2 and k3 and place result in k1.
Instruction Operand Encoding
Op/En
@@ -107173,7 +107321,7 @@ None
Other Exceptions
See Exceptions Type K20.
-3-516 Vol. 2A
+3-518 Vol. 2A
KXORW/KXORB/KXORQ/KXORD—Bitwise Logical XOR Masks
@@ -107269,7 +107417,7 @@ If the LOCK prefix is used.
LAHF—Load Status Flags into AH Register
-Vol. 2A 3-517
+Vol. 2A 3-519
INSTRUCTION SET REFERENCE, A-L
@@ -107389,7 +107537,7 @@ accessed
If the segment descriptor cannot be accessed or is an invalid type for the instruction, the ZF flag is cleared and no
access rights are loaded in the destination operand.
-3-518 Vol. 2A
+3-520 Vol. 2A
LAR—Load Access Rights Byte
@@ -107592,7 +107740,7 @@ The ZF flag is set to 1 if the access rights are loaded successfully; otherwise,
LAR—Load Access Rights Byte
-Vol. 2A 3-519
+Vol. 2A 3-521
INSTRUCTION SET REFERENCE, A-L
@@ -107656,7 +107804,7 @@ the current privilege level is 3.
If the LOCK prefix is used.
-3-520 Vol. 2A
+3-522 Vol. 2A
LAR—Load Access Rights Byte
@@ -107780,7 +107928,7 @@ DEST[MAXVL-1:128] (Unmodified)
LDDQU—Load Unaligned Integer 128 Bits
-Vol. 2A 3-521
+Vol. 2A 3-523
INSTRUCTION SET REFERENCE, A-L
@@ -107804,7 +107952,7 @@ Other Exceptions
See Exceptions Type 4;
Note treatment of #AC varies.
-3-522 Vol. 2A
+3-524 Vol. 2A
LDDQU—Load Unaligned Integer 128 Bits
@@ -107909,7 +108057,7 @@ If VEX.vvvv ≠ 1111B.
LDMXCSR—Load MXCSR Register
-Vol. 2A 3-523
+Vol. 2A 3-525
INSTRUCTION SET REFERENCE, A-L
@@ -108130,7 +108278,7 @@ IF SegmentSelector = NULL and ( (RPL = 3) or
(RPL ≠ 3 and RPL ≠ CPL) )
THEN #GP(0);
ELSE IF descriptor is in non-canonical space
-3-524 Vol. 2A
+3-526 Vol. 2A
LDS/LES/LFS/LGS/LSS—Load Far Pointer
@@ -108192,7 +108340,7 @@ THEN #GP(selector); FI;
LDS/LES/LFS/LGS/LSS—Load Far Pointer
-Vol. 2A 3-525
+Vol. 2A 3-527
INSTRUCTION SET REFERENCE, A-L
@@ -108275,7 +108423,7 @@ If a memory operand effective address is outside the SS segment limit.
If source operand is not a memory location.
If the LOCK prefix is used.
-3-526 Vol. 2A
+3-528 Vol. 2A
LDS/LES/LFS/LGS/LSS—Load Far Pointer
@@ -108355,7 +108503,7 @@ If the LOCK prefix is used.
LDS/LES/LFS/LGS/LSS—Load Far Pointer
-Vol. 2A 3-527
+Vol. 2A 3-529
INSTRUCTION SET REFERENCE, A-L
@@ -108522,7 +108670,7 @@ in the requested 64-bit register destination (using REX.W).
64-bit effective address is calculated (default address size) and all 64-bits of the address are
stored in the requested 64-bit register destination (using REX.W).
-3-528 Vol. 2A
+3-530 Vol. 2A
Action Performed
@@ -108586,7 +108734,7 @@ Same exceptions as in protected mode.
Same exceptions as in protected mode.
LEA—Load Effective Address
-Vol. 2A 3-529
+Vol. 2A 3-531
INSTRUCTION SET REFERENCE, A-L
@@ -108692,7 +108840,7 @@ FI;
Flags Affected
None
-3-530 Vol. 2A
+3-532 Vol. 2A
LEAVE—High Level Procedure Exit
@@ -108762,7 +108910,7 @@ If the LOCK prefix is used.
LEAVE—High Level Procedure Exit
-Vol. 2A 3-531
+Vol. 2A 3-533
INSTRUCTION SET REFERENCE, A-L
@@ -108817,8 +108965,8 @@ Description
Performs a serializing operation on all load-from-memory instructions that were issued prior to the LFENCE instruction. Specifically, LFENCE does not execute until all prior instructions have completed locally, and no later instruction begins execution until LFENCE completes. In particular, an instruction that loads from memory and that
precedes an LFENCE receives data from memory prior to completion of the LFENCE. (An LFENCE that follows an
instruction that stores to memory might complete before the data being stored have become globally visible.)
-Instructions following an LFENCE may be fetched from memory before the LFENCE, but they will not execute until
-the LFENCE completes.
+Instructions following an LFENCE may be fetched from memory before the LFENCE, but they will not execute (even
+speculatively) until the LFENCE completes.
Weakly ordered memory types can be used to achieve higher processor performance through such techniques as
out-of-order issue and speculative reads. The degree to which a consumer of data recognizes or knows that the
data is weakly ordered varies among applications and may be unknown to the producer of this data. The LFENCE
@@ -108843,7 +108991,7 @@ Exceptions (All Modes of Operation)
If CPUID.01H:EDX.SSE2[bit 26] = 0.
If the LOCK prefix is used.
-3-532 Vol. 2A
+3-534 Vol. 2A
LFENCE—Load Fence
@@ -108951,7 +109099,7 @@ Developer’s Manual, Volume 2B, for information on storing the contents of the
LGDT/LIDT—Load Global/Interrupt Descriptor Table Register
-Vol. 2A 3-533
+Vol. 2A 3-535
INSTRUCTION SET REFERENCE, A-L
@@ -109014,7 +109162,7 @@ If a memory operand effective address is outside the SS segment limit.
If a page fault occurs.
-3-534 Vol. 2A
+3-536 Vol. 2A
LGDT/LIDT—Load Global/Interrupt Descriptor Table Register
@@ -109063,7 +109211,7 @@ If a page fault occurs.
LGDT/LIDT—Load Global/Interrupt Descriptor Table Register
-Vol. 2A 3-535
+Vol. 2A 3-537
INSTRUCTION SET REFERENCE, A-L
@@ -109147,7 +109295,7 @@ FI;
Flags Affected
None
-3-536 Vol. 2A
+3-538 Vol. 2A
LLDT—Load Local Descriptor Table Register
@@ -109223,7 +109371,7 @@ If the LOCK prefix is used.
LLDT—Load Local Descriptor Table Register
-Vol. 2A 3-537
+Vol. 2A 3-539
INSTRUCTION SET REFERENCE, A-L
@@ -109329,7 +109477,7 @@ If a memory operand effective address is outside the CS, DS, ES, FS, or GS segme
If the LOCK prefix is used.
-3-538 Vol. 2A
+3-540 Vol. 2A
LMSW—Load Machine Status Word
@@ -109367,7 +109515,7 @@ If the LOCK prefix is used.
LMSW—Load Machine Status Word
-Vol. 2A 3-539
+Vol. 2A 3-541
INSTRUCTION SET REFERENCE, A-L
@@ -109461,7 +109609,7 @@ CMPXCHG, CMPXCHG8B, CMPXCHG16B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD,
XCHG.
Other exceptions can be generated by the instruction when the LOCK prefix is applied.
-3-540 Vol. 2A
+3-542 Vol. 2A
LOCK—Assert LOCK# Signal Prefix
@@ -109481,7 +109629,7 @@ Same exceptions as in protected mode.
LOCK—Assert LOCK# Signal Prefix
-Vol. 2A 3-541
+Vol. 2A 3-543
INSTRUCTION SET REFERENCE, A-L
@@ -109648,7 +109796,7 @@ After the byte, word, or doubleword is transferred from the memory location into
register. (If the DF flag is 0, the (E)SI register is incremented; if the DF flag is 1, the ESI register is decremented.)
The (E)SI register is incremented or decremented by 1 for byte operations, by 2 for word operations, or by 4 for
doubleword operations.
-3-542 Vol. 2A
+3-544 Vol. 2A
LODS/LODSB/LODSW/LODSD/LODSQ—Load String
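The pointer update described above depends on the direction flag and the element size. A minimal sketch follows; the function name is hypothetical, and wrapping the result to the address size is an added assumption not stated in the excerpt.

```python
def lods_advance(si, df, size, addr_bits=32):
    """Hypothetical model: step (E)SI by `size` bytes, direction chosen by DF."""
    if size not in (1, 2, 4):
        raise ValueError("size must be 1 (byte), 2 (word), or 4 (doubleword)")
    delta = -size if df else size        # DF = 0 increments, DF = 1 decrements
    return (si + delta) & ((1 << addr_bits) - 1)
```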
@@ -109731,7 +109879,7 @@ If the LOCK prefix is used.
LODS/LODSB/LODSW/LODSD/LODSQ—Load String
-Vol. 2A 3-543
+Vol. 2A 3-545
INSTRUCTION SET REFERENCE, A-L
@@ -109781,7 +109929,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-3-544 Vol. 2A
+3-546 Vol. 2A
LODS/LODSB/LODSW/LODSD/LODSQ—Load String
@@ -109897,7 +110045,7 @@ ELSE BranchCond ← 0;
LOOP/LOOPcc—Loop According to ECX Counter
-Vol. 2A 3-545
+Vol. 2A 3-547
INSTRUCTION SET REFERENCE, A-L
@@ -109966,7 +110114,7 @@ If the offset being jumped to is in a non-canonical form.
If the LOCK prefix is used.
-3-546 Vol. 2A
+3-548 Vol. 2A
LOOP/LOOPcc—Loop According to ECX Counter
@@ -110087,7 +110235,7 @@ value is loaded in the destination operand.
LSL—Load Segment Limit
-Vol. 2A 3-547
+Vol. 2A 3-549
INSTRUCTION SET REFERENCE, A-L
@@ -110291,7 +110439,7 @@ DEST ← temp AND FFFFH;
FI;
FI;
-3-548 Vol. 2A
+3-550 Vol. 2A
LSL—Load Segment Limit
@@ -110362,7 +110510,7 @@ If the LOCK prefix is used.
LSL—Load Segment Limit
-Vol. 2A 3-549
+Vol. 2A 3-551
INSTRUCTION SET REFERENCE, A-L
@@ -110444,7 +110592,7 @@ TaskRegister(SegmentDescriptor) ← TSSSegmentDescriptor;
Flags Affected
None
-3-550 Vol. 2A
+3-552 Vol. 2A
LTR—Load Task Register
@@ -110524,7 +110672,7 @@ If the LOCK prefix is used.
LTR—Load Task Register
-Vol. 2A 3-551
+Vol. 2A 3-553
INSTRUCTION SET REFERENCE, A-L
@@ -110625,7 +110773,7 @@ Flags Affected
ZF flag is set to 1 in case of zero output (most significant bit of the source is set), and to 0 otherwise, CF flag is set
to 1 if input was zero and cleared otherwise. OF, SF, PF and AF flags are undefined.
-3-552 Vol. 2A
+3-554 Vol. 2A
LZCNT— Count the Number of Leading Zero Bits
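The flag rules quoted above (ZF for a zero result, CF for a zero input) can be checked with a short model. This is a sketch with a hypothetical function name; it relies on Python's `int.bit_length` to find the highest set bit.

```python
def lzcnt(src, width=32):
    """Hypothetical model: return (count, CF, ZF) for a leading-zero count."""
    src &= (1 << width) - 1
    count = width - src.bit_length()   # bit_length() is 0 for src == 0
    cf = int(src == 0)                 # CF = 1 when the input is zero
    zf = int(count == 0)               # ZF = 1 when the most significant bit is set
    return count, cf, zf
```

A zero input returns the operand width as the count, which is where LZCNT differs from BSR, whose destination is undefined in that case.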
@@ -110710,11 +110858,11 @@ current privilege level is 3.
LZCNT— Count the Number of Leading Zero Bits
-Vol. 2A 3-553
+Vol. 2A 3-555
INSTRUCTION SET REFERENCE, A-L
-3-554 Vol. 2A
+3-556 Vol. 2A
LZCNT— Count the Number of Leading Zero Bits
@@ -111523,9 +111671,9 @@ SSE2
66 0F 5F /r
MAXPD xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 5F /r
+VEX.128.66.0F.WIG 5F /r
VMAXPD xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG 5F /r
+VEX.256.66.0F.WIG 5F /r
VMAXPD ymm1, ymm2, ymm3/m256
Description
@@ -111542,7 +111690,7 @@ V/V
AVX
-EVEX.NDS.128.66.0F.W1 5F /r
+EVEX.128.66.0F.W1 5F /r
VMAXPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
@@ -111553,7 +111701,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F.W1 5F /r
+EVEX.256.66.0F.W1 5F /r
VMAXPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
@@ -111564,7 +111712,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F.W1 5F /r
+EVEX.512.66.0F.W1 5F /r
VMAXPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst{sae}
@@ -111770,19 +111918,19 @@ SSE
NP 0F 5F /r
MAXPS xmm1, xmm2/m128
-VEX.NDS.128.0F.WIG 5F /r
+VEX.128.0F.WIG 5F /r
VMAXPS xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.0F.WIG 5F /r
+VEX.256.0F.WIG 5F /r
VMAXPS ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.0F.W0 5F /r
+EVEX.128.0F.W0 5F /r
VMAXPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.0F.W0 5F /r
+EVEX.256.0F.W0 5F /r
VMAXPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.0F.W0 5F /r
+EVEX.512.0F.W0 5F /r
VMAXPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst{sae}
@@ -112022,10 +112170,10 @@ SSE2
F2 0F 5F /r
MAXSD xmm1, xmm2/m64
-VEX.NDS.LIG.F2.0F.WIG 5F /r
+VEX.LIG.F2.0F.WIG 5F /r
VMAXSD xmm1, xmm2,
xmm3/m64
-EVEX.NDS.LIG.F2.0F.W1 5F /r
+EVEX.LIG.F2.0F.W1 5F /r
VMAXSD xmm1 {k1}{z}, xmm2,
xmm3/m64{sae}
@@ -112193,10 +112341,10 @@ SSE
F3 0F 5F /r
MAXSS xmm1, xmm2/m32
-VEX.NDS.LIG.F3.0F.WIG 5F /r
+VEX.LIG.F3.0F.WIG 5F /r
VMAXSS xmm1, xmm2,
xmm3/m32
-EVEX.NDS.LIG.F3.0F.W0 5F /r
+EVEX.LIG.F3.0F.W0 5F /r
VMAXSS xmm1 {k1}{z}, xmm2,
xmm3/m32{sae}
@@ -112450,19 +112598,19 @@ SSE2
66 0F 5D /r
MINPD xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 5D /r
+VEX.128.66.0F.WIG 5D /r
VMINPD xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.66.0F.WIG 5D /r
+VEX.256.66.0F.WIG 5D /r
VMINPD ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F.W1 5D /r
+EVEX.128.66.0F.W1 5D /r
VMINPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F.W1 5D /r
+EVEX.256.66.0F.W1 5D /r
VMINPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F.W1 5D /r
+EVEX.512.66.0F.W1 5D /r
VMINPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst{sae}
@@ -112693,19 +112841,19 @@ SSE
NP 0F 5D /r
MINPS xmm1, xmm2/m128
-VEX.NDS.128.0F.WIG 5D /r
+VEX.128.0F.WIG 5D /r
VMINPS xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.0F.WIG 5D /r
+VEX.256.0F.WIG 5D /r
VMINPS ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.0F.W0 5D /r
+EVEX.128.0F.W0 5D /r
VMINPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.0F.W0 5D /r
+EVEX.256.0F.W0 5D /r
VMINPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.0F.W0 5D /r
+EVEX.512.0F.W0 5D /r
VMINPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst{sae}
@@ -112944,9 +113092,9 @@ SSE2
F2 0F 5D /r
MINSD xmm1, xmm2/m64
-VEX.NDS.LIG.F2.0F.WIG 5D /r
+VEX.LIG.F2.0F.WIG 5D /r
VMINSD xmm1, xmm2, xmm3/m64
-EVEX.NDS.LIG.F2.0F.W1 5D /r
+EVEX.LIG.F2.0F.W1 5D /r
VMINSD xmm1 {k1}{z}, xmm2,
xmm3/m64{sae}
@@ -113112,9 +113260,9 @@ SSE
F3 0F 5D /r
MINSS xmm1,xmm2/m32
-VEX.NDS.LIG.F3.0F.WIG 5D /r
+VEX.LIG.F3.0F.WIG 5D /r
VMINSS xmm1,xmm2, xmm3/m32
-EVEX.NDS.LIG.F3.0F.W0 5D /r
+EVEX.LIG.F3.0F.W0 5D /r
VMINSS xmm1 {k1}{z}, xmm2,
xmm3/m32{sae}
@@ -117531,9 +117679,9 @@ SSE
NP 0F 12 /r
MOVHLPS xmm1, xmm2
-VEX.NDS.128.0F.WIG 12 /r
+VEX.128.0F.WIG 12 /r
VMOVHLPS xmm1, xmm2, xmm3
-EVEX.NDS.128.0F.W0 12 /r
+EVEX.128.0F.W0 12 /r
VMOVHLPS xmm1, xmm2, xmm3
RVM
@@ -117654,9 +117802,9 @@ SSE2
66 0F 16 /r
MOVHPD xmm1, m64
-VEX.NDS.128.66.0F.WIG 16 /r
+VEX.128.66.0F.WIG 16 /r
VMOVHPD xmm2, xmm1, m64
-EVEX.NDS.128.66.0F.W1 16 /r
+EVEX.128.66.0F.W1 16 /r
VMOVHPD xmm2, xmm1, m64
66 0F 17 /r
MOVHPD m64, xmm1
@@ -117854,9 +118002,9 @@ SSE
NP 0F 16 /r
MOVHPS xmm1, m64
-VEX.NDS.128.0F.WIG 16 /r
+VEX.128.0F.WIG 16 /r
VMOVHPS xmm2, xmm1, m64
-EVEX.NDS.128.0F.W0 16 /r
+EVEX.128.0F.W0 16 /r
VMOVHPS xmm2, xmm1, m64
NP 0F 17 /r
MOVHPS m64, xmm1
@@ -117995,8 +118143,8 @@ operand (the second operand) are copied to the lower 64-bits of the destination.
128-bit store:
Stores two packed single-precision floating-point values from the high 64-bits of the XMM register source (second
operand) to the 64-bit memory location (first operand).
-Note: VMOVHPS (store) (VEX.NDS.128.0F 17 /r) is legal and has the same behavior as the existing 0F 17 store.
-For VMOVHPS (store) VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instruction will #UD.
+Note: VMOVHPS (store) (VEX.128.0F 17 /r) is legal and has the same behavior as the existing 0F 17 store. For
+VMOVHPS (store) VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instruction will #UD.
If VMOVHPS is encoded with VEX.L or EVEX.L’L= 1, an attempt to execute the instruction encoded with VEX.L or
EVEX.L’L= 1 will cause an #UD exception.
@@ -118056,9 +118204,9 @@ SSE
NP 0F 16 /r
MOVLHPS xmm1, xmm2
-VEX.NDS.128.0F.WIG 16 /r
+VEX.128.0F.WIG 16 /r
VMOVLHPS xmm1, xmm2, xmm3
-EVEX.NDS.128.0F.W0 16 /r
+EVEX.128.0F.W0 16 /r
VMOVLHPS xmm1, xmm2, xmm3
RVM
@@ -118178,9 +118326,9 @@ SSE2
66 0F 12 /r
MOVLPD xmm1, m64
-VEX.NDS.128.66.0F.WIG 12 /r
+VEX.128.66.0F.WIG 12 /r
VMOVLPD xmm2, xmm1, m64
-EVEX.NDS.128.66.0F.W1 12 /r
+EVEX.128.66.0F.W1 12 /r
VMOVLPD xmm2, xmm1, m64
66 0F 13/r
MOVLPD m64, xmm1
@@ -118377,9 +118525,9 @@ SSE
NP 0F 12 /r
MOVLPS xmm1, m64
-VEX.NDS.128.0F.WIG 12 /r
+VEX.128.0F.WIG 12 /r
VMOVLPS xmm2, xmm1, m64
-EVEX.NDS.128.0F.W0 12 /r
+EVEX.128.0F.W0 12 /r
VMOVLPS xmm2, xmm1, m64
0F 13/r
MOVLPS m64, xmm1
@@ -118968,7 +119116,7 @@ Because the WC protocol uses a weakly-ordered memory consistency model, a fencin
a MFENCE instruction should be used in conjunction with MOVNTDQA instructions if multiple processors might use
different memory types for the referenced memory locations or to synchronize reads of a processor with writes by
other agents in the system. A processor’s implementation of the streaming load hint does not override the effective
-memory type, but the implementation of the hint is processor dependent. For example, a processor implementa1. ModRM.MOD = 011B required
+memory type, but the implementation of the hint is processor dependent. For example, a processor implementa1. ModRM.MOD != 011B
4-92 Vol. 2B
MOVNTDQA—Load Double Quadword Non-Temporal Aligned Hint
@@ -119143,7 +119291,7 @@ VL = 128, 256, 512
DEST[VL-1:0] ← SRC[VL-1:0]
DEST[MAXVL-1:VL] ← 0
-1. ModRM.MOD = 011B required
+1. ModRM.MOD != 011B
4-94 Vol. 2B
MOVNTDQ—Store Packed Integers Using Non-Temporal Hint
@@ -119469,7 +119617,7 @@ VL = 128, 256, 512
DEST[VL-1:0] ← SRC[VL-1:0]
DEST[MAXVL-1:VL] ← 0
-1. ModRM.MOD = 011B required
+1. ModRM.MOD != 011B
4-98 Vol. 2B
MOVNTPD—Store Packed Double-Precision Floating-Point Values Using Non-Temporal Hint
@@ -119632,7 +119780,7 @@ VL = 128, 256, 512
DEST[VL-1:0] ← SRC[VL-1:0]
DEST[MAXVL-1:VL] ← 0
-1. ModRM.MOD = 011B required
+1. ModRM.MOD != 011B
4-100 Vol. 2B
MOVNTPS—Store Packed Single-Precision Floating-Point Values Using Non-Temporal Hint
@@ -119934,7 +120082,7 @@ MOVQ instruction when source operand is memory location and destination
operand is XMM register:
DEST[63:0] ← SRC;
DEST[127:64] ← 0000000000000000H;
-VMOVQ (VEX.NDS.128.F3.0F 7E) with XMM register source and destination
+VMOVQ (VEX.128.F3.0F 7E) with XMM register source and destination
DEST[63:0] ← SRC[63:0]
DEST[MAXVL-1:64] ← 0
VMOVQ (VEX.128.66.0F D6) with XMM register source and destination
@@ -120457,19 +120605,19 @@ F2 0F 10 /r
MOVSD xmm1, m64
F2 0F 11 /r
MOVSD xmm1/m64, xmm2
-VEX.NDS.LIG.F2.0F.WIG 10 /r
+VEX.LIG.F2.0F.WIG 10 /r
VMOVSD xmm1, xmm2, xmm3
VEX.LIG.F2.0F.WIG 10 /r
VMOVSD xmm1, m64
-VEX.NDS.LIG.F2.0F.WIG 11 /r
+VEX.LIG.F2.0F.WIG 11 /r
VMOVSD xmm1, xmm2, xmm3
VEX.LIG.F2.0F.WIG 11 /r
VMOVSD m64, xmm1
-EVEX.NDS.LIG.F2.0F.W1 10 /r
+EVEX.LIG.F2.0F.W1 10 /r
VMOVSD xmm1 {k1}{z}, xmm2, xmm3
EVEX.LIG.F2.0F.W1 10 /r
VMOVSD xmm1 {k1}{z}, m64
-EVEX.NDS.LIG.F2.0F.W1 11 /r
+EVEX.LIG.F2.0F.W1 11 /r
VMOVSD xmm1 {k1}{z}, xmm2, xmm3
EVEX.LIG.F2.0F.W1 11 /r
VMOVSD m64 {k1}, xmm1
@@ -120694,7 +120842,7 @@ EVEX encoded versions: The low quadword of the destination is updated according
Note: For VMOVSD (memory store and load forms), VEX.vvvv and EVEX.vvvv are reserved and must be 1111b,
otherwise instruction will #UD.
Operation
-VMOVSD (EVEX.NDS.LIG.F2.0F 10 /r: VMOVSD xmm1, m64 with support for 32 registers)
+VMOVSD (EVEX.LIG.F2.0F 10 /r: VMOVSD xmm1, m64 with support for 32 registers)
IF k1[0] or *no writemask*
THEN
DEST[63:0] ← SRC[63:0]
@@ -120708,7 +120856,7 @@ THEN DEST[63:0] ← 0
FI;
FI;
DEST[MAXVL-1:64] ← 0
-VMOVSD (EVEX.NDS.LIG.F2.0F 11 /r: VMOVSD m64, xmm1 with support for 32 registers)
+VMOVSD (EVEX.LIG.F2.0F 11 /r: VMOVSD m64, xmm1 with support for 32 registers)
IF k1[0] or *no writemask*
THEN
DEST[63:0] ← SRC[63:0]
@@ -120716,7 +120864,7 @@ ELSE
*DEST[63:0] remains unchanged*
; merging-masking
FI;
-VMOVSD (EVEX.NDS.LIG.F2.0F 11 /r: VMOVSD xmm1, xmm2, xmm3)
+VMOVSD (EVEX.LIG.F2.0F 11 /r: VMOVSD xmm1, xmm2, xmm3)
IF k1[0] or *no writemask*
THEN
DEST[63:0] ← SRC2[63:0]
@@ -120741,15 +120889,15 @@ MOVSD—Move or Merge Scalar Double-Precision Floating-Point Value
MOVSD (128-bit Legacy SSE version: MOVSD XMM1, XMM2)
DEST[63:0] SRC[63:0]
DEST[MAXVL-1:64] (Unmodified)
-VMOVSD (VEX.NDS.128.F2.0F 11 /r: VMOVSD xmm1, xmm2, xmm3)
+VMOVSD (VEX.128.F2.0F 11 /r: VMOVSD xmm1, xmm2, xmm3)
DEST[63:0] SRC2[63:0]
DEST[127:64] SRC1[127:64]
DEST[MAXVL-1:128] 0
-VMOVSD (VEX.NDS.128.F2.0F 10 /r: VMOVSD xmm1, xmm2, xmm3)
+VMOVSD (VEX.128.F2.0F 10 /r: VMOVSD xmm1, xmm2, xmm3)
DEST[63:0] SRC2[63:0]
DEST[127:64] SRC1[127:64]
DEST[MAXVL-1:128] 0
-VMOVSD (VEX.NDS.128.F2.0F 10 /r: VMOVSD xmm1, m64)
+VMOVSD (VEX.128.F2.0F 10 /r: VMOVSD xmm1, m64)
DEST[63:0] SRC[63:0]
DEST[MAXVL-1:64] 0
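The Operation entries above differ mainly in what happens to the bits above the low quadword: the legacy form leaves them alone, the VEX register form merges bits 127:64 from the first source and zeroes the rest, and the VEX load form zeroes everything above bit 63. A minimal Python sketch of those three cases, modeling a vector register as a non-negative Python integer. The function names are hypothetical, and masking and EVEX behavior are not modeled.

```python
LOW64 = (1 << 64) - 1             # mask for bits 63:0
BITS_127_64 = LOW64 << 64         # mask for bits 127:64


def movsd_legacy_reg(dest: int, src: int) -> int:
    """MOVSD xmm1, xmm2: replace DEST[63:0], leave DEST[MAXVL-1:64] unmodified."""
    return (dest & ~LOW64) | (src & LOW64)


def vmovsd_reg(src1: int, src2: int) -> int:
    """VMOVSD xmm1, xmm2, xmm3 (VEX): merge, then zero bits above 127."""
    low = src2 & LOW64            # DEST[63:0]   <- SRC2[63:0]
    high = src1 & BITS_127_64     # DEST[127:64] <- SRC1[127:64]
    return high | low             # DEST[MAXVL-1:128] <- 0 (nothing else is kept)


def vmovsd_load(mem64: int) -> int:
    """VMOVSD xmm1, m64 (VEX): load the quadword, zero DEST[MAXVL-1:64]."""
    return mem64 & LOW64


if __name__ == "__main__":
    src1 = (0xAAAA << 64) | 0x1111
    src2 = (0xBBBB << 64) | 0x2222
    print(hex(vmovsd_reg(src1, src2)))
```

The contrast between `movsd_legacy_reg` and `vmovsd_reg` is the point of the sketch: only the legacy form preserves upper bits of the destination.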
MOVSD/VMOVSD (128-bit versions: MOVSD m64, xmm1 or VMOVSD m64, xmm1)
@@ -121335,17 +121483,17 @@ F3 0F 10 /r
MOVSS xmm1, xmm2
F3 0F 10 /r
MOVSS xmm1, m32
-VEX.NDS.LIG.F3.0F.WIG 10 /r
+VEX.LIG.F3.0F.WIG 10 /r
VMOVSS xmm1, xmm2, xmm3
VEX.LIG.F3.0F.WIG 10 /r
VMOVSS xmm1, m32
F3 0F 11 /r
MOVSS xmm2/m32, xmm1
-VEX.NDS.LIG.F3.0F.WIG 11 /r
+VEX.LIG.F3.0F.WIG 11 /r
VMOVSS xmm1, xmm2, xmm3
VEX.LIG.F3.0F.WIG 11 /r
VMOVSS m32, xmm1
-EVEX.NDS.LIG.F3.0F.W0 10 /r
+EVEX.LIG.F3.0F.W0 10 /r
VMOVSS xmm1 {k1}{z}, xmm2, xmm3
A
@@ -121392,7 +121540,7 @@ AVX512F
EVEX.LIG.F3.0F.W0 10 /r
VMOVSS xmm1 {k1}{z}, m32
-EVEX.NDS.LIG.F3.0F.W0 11 /r
+EVEX.LIG.F3.0F.W0 11 /r
VMOVSS xmm1 {k1}{z}, xmm2, xmm3
F
@@ -121566,7 +121714,7 @@ and must be 1111b otherwise instruction will #UD.
Software should ensure VMOVSS is encoded with VEX.L=0. Encoding VMOVSS with VEX.L=1 may encounter
unpredictable behavior across different processor generations.
Operation
-VMOVSS (EVEX.NDS.LIG.F3.0F.W0 11 /r when the source operand is memory and the destination is an XMM register)
+VMOVSS (EVEX.LIG.F3.0F.W0 11 /r when the source operand is memory and the destination is an XMM register)
IF k1[0] or *no writemask*
THEN
DEST[31:0] ← SRC[31:0]
@@ -121580,7 +121728,7 @@ THEN DEST[31:0]  0
FI;
FI;
DEST[MAXVL-1:32] ← 0
-VMOVSS (EVEX.NDS.LIG.F3.0F.W0 10 /r when the source operand is an XMM register and the destination is memory)
+VMOVSS (EVEX.LIG.F3.0F.W0 10 /r when the source operand is an XMM register and the destination is memory)
IF k1[0] or *no writemask*
THEN
DEST[31:0] ← SRC[31:0]
@@ -121595,7 +121743,7 @@ Vol. 2B 4-121
INSTRUCTION SET REFERENCE, M-U
-VMOVSS (EVEX.NDS.LIG.F3.0F.W0 10/11 /r where the source and destination are XMM registers)
+VMOVSS (EVEX.LIG.F3.0F.W0 10/11 /r where the source and destination are XMM registers)
IF k1[0] or *no writemask*
THEN
DEST[31:0] ← SRC2[31:0]
@@ -121613,15 +121761,15 @@ DEST[MAXVL-1:128]  0
MOVSS (Legacy SSE version when the source and destination operands are both XMM registers)
DEST[31:0] SRC[31:0]
DEST[MAXVL-1:32] (Unmodified)
-VMOVSS (VEX.NDS.128.F3.0F 11 /r where the destination is an XMM register)
+VMOVSS (VEX.128.F3.0F 11 /r where the destination is an XMM register)
DEST[31:0] SRC2[31:0]
DEST[127:32] SRC1[127:32]
DEST[MAXVL-1:128] 0
-VMOVSS (VEX.NDS.128.F3.0F 10 /r where the source and destination are XMM registers)
+VMOVSS (VEX.128.F3.0F 10 /r where the source and destination are XMM registers)
DEST[31:0] SRC2[31:0]
DEST[127:32] SRC1[127:32]
DEST[MAXVL-1:128] 0
-VMOVSS (VEX.NDS.128.F3.0F 10 /r when the source operand is memory and the destination is an XMM register)
+VMOVSS (VEX.128.F3.0F 10 /r when the source operand is memory and the destination is an XMM register)
DEST[31:0] SRC[31:0]
DEST[MAXVL-1:32] 0
MOVSS/VMOVSS (when the source operand is an XMM register and the destination is memory)
@@ -122825,10 +122973,10 @@ xmm3/m128 are determined by imm8.
MPSADBW xmm1, xmm2/m128, imm8
-VEX.NDS.128.66.0F3A.WIG 42 /r ib
+VEX.128.66.0F3A.WIG 42 /r ib
VMPSADBW xmm1, xmm2, xmm3/m128, imm8
-VEX.NDS.256.66.0F3A.WIG 42 /r ib
+VEX.256.66.0F3A.WIG 42 /r ib
VMPSADBW ymm1, ymm2, ymm3/m256, imm8
Instruction Operand Encoding
@@ -123530,17 +123678,17 @@ SSE2
66 0F 59 /r
MULPD xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 59 /r
+VEX.128.66.0F.WIG 59 /r
VMULPD xmm1,xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG 59 /r
+VEX.256.66.0F.WIG 59 /r
VMULPD ymm1, ymm2, ymm3/m256
-EVEX.NDS.128.66.0F.W1 59 /r
+EVEX.128.66.0F.W1 59 /r
VMULPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F.W1 59 /r
+EVEX.256.66.0F.W1 59 /r
VMULPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F.W1 59 /r
+EVEX.512.66.0F.W1 59 /r
VMULPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst{er}
@@ -123757,17 +123905,17 @@ SSE
NP 0F 59 /r
MULPS xmm1, xmm2/m128
-VEX.NDS.128.0F.WIG 59 /r
+VEX.128.0F.WIG 59 /r
VMULPS xmm1,xmm2, xmm3/m128
-VEX.NDS.256.0F.WIG 59 /r
+VEX.256.0F.WIG 59 /r
VMULPS ymm1, ymm2, ymm3/m256
-EVEX.NDS.128.0F.W0 59 /r
+EVEX.128.0F.W0 59 /r
VMULPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.0F.W0 59 /r
+EVEX.256.0F.W0 59 /r
VMULPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.0F.W0 59 /r
+EVEX.512.0F.W0 59 /r
VMULPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst {er}
@@ -123994,7 +124142,7 @@ SSE2
F2 0F 59 /r
MULSD xmm1,xmm2/m64
-VEX.NDS.LIG.F2.0F.WIG 59 /r
+VEX.LIG.F2.0F.WIG 59 /r
VMULSD xmm1,xmm2, xmm3/m64
B
@@ -124003,7 +124151,7 @@ V/V
AVX
-EVEX.NDS.LIG.F2.0F.W1 59 /r
+EVEX.LIG.F2.0F.W1 59 /r
VMULSD xmm1 {k1}{z}, xmm2,
xmm3/m64 {er}
@@ -124159,7 +124307,7 @@ SSE
F3 0F 59 /r
MULSS xmm1,xmm2/m32
-VEX.NDS.LIG.F3.0F.WIG 59 /r
+VEX.LIG.F3.0F.WIG 59 /r
VMULSS xmm1,xmm2, xmm3/m32
B
@@ -124168,7 +124316,7 @@ V/V
AVX
-EVEX.NDS.LIG.F3.0F.W0 59 /r
+EVEX.LIG.F3.0F.W0 59 /r
VMULSS xmm1 {k1}{z}, xmm2,
xmm3/m32 {er}
@@ -124323,9 +124471,9 @@ Feature
Flag
BMI2
-VEX.NDD.LZ.F2.0F38.W0 F6 /r
+VEX.LZ.F2.0F38.W0 F6 /r
MULX r32a, r32b, r/m32
-VEX.NDD.LZ.F2.0F38.W1 F6 /r
+VEX.LZ.F2.0F38.W1 F6 /r
MULX r64a, r64b, r/m64
RVM
@@ -124404,10 +124552,7 @@ MULX — Unsigned Multiply Without Affecting Flags
INSTRUCTION SET REFERENCE, M-U
Other Exceptions
-See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29; additionally
-#UD
-
-If VEX.W = 1.
+See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29.
MULX — Unsigned Multiply Without Affecting Flags
@@ -125700,17 +125845,17 @@ SSE2
66 0F 56/r
ORPD xmm1, xmm2/m128
-VEX.NDS.128.66.0F 56 /r
+VEX.128.66.0F 56 /r
VORPD xmm1,xmm2, xmm3/m128
-VEX.NDS.256.66.0F 56 /r
+VEX.256.66.0F 56 /r
VORPD ymm1, ymm2, ymm3/m256
-EVEX.NDS.128.66.0F.W1 56 /r
+EVEX.128.66.0F.W1 56 /r
VORPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F.W1 56 /r
+EVEX.256.66.0F.W1 56 /r
VORPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F.W1 56 /r
+EVEX.512.66.0F.W1 56 /r
VORPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -125921,17 +126066,17 @@ SSE
NP 0F 56 /r
ORPS xmm1, xmm2/m128
-VEX.NDS.128.0F 56 /r
+VEX.128.0F 56 /r
VORPS xmm1,xmm2, xmm3/m128
-VEX.NDS.256.0F 56 /r
+VEX.256.0F 56 /r
VORPS ymm1, ymm2, ymm3/m256
-EVEX.NDS.128.0F.W0 56 /r
+EVEX.128.0F.W0 56 /r
VORPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.0F.W0 56 /r
+EVEX.256.0F.W0 56 /r
VORPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.0F.W0 56 /r
+EVEX.512.0F.W0 56 /r
VORPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -126948,6 +127093,7 @@ AVX512F
in ymm2/m256/m32bcst and store UNSIGNED
result in ymm1 using writemask k1.
+EVEX.512.66.0F38.W0 1E /r
VPABSD zmm1 {k1}{z}, zmm2/m512/m32bcst
C
@@ -127366,7 +127512,7 @@ integers from ymm2 and from ymm3/m256
into 16 packed signed word integers in
ymm1using signed saturation.
-EVEX.NDS.128.66.0F.WIG 63 /r
+EVEX.128.66.0F.WIG 63 /r
VPACKSSWB xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -127381,7 +127527,7 @@ xmm2 and from xmm3/m128 into packed
signed byte integers in xmm1 using signed
saturation under writemask k1.
-EVEX.NDS.256.66.0F.WIG 63 /r
+EVEX.256.66.0F.WIG 63 /r
VPACKSSWB ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -127396,7 +127542,7 @@ ymm2 and from ymm3/m256 into packed
signed byte integers in ymm1 using signed
saturation under writemask k1.
-EVEX.NDS.512.66.0F.WIG 63 /r
+EVEX.512.66.0F.WIG 63 /r
VPACKSSWB zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -127410,7 +127556,7 @@ zmm2 and from zmm3/m512 into packed
signed byte integers in zmm1 using signed
saturation under writemask k1.
-EVEX.NDS.128.66.0F.W0 6B /r
+EVEX.128.66.0F.W0 6B /r
VPACKSSDW xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
@@ -127436,16 +127582,16 @@ PACKSSDW mm1, mm2/m64
66 0F 6B /r
PACKSSDW xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 63 /r
+VEX.128.66.0F.WIG 63 /r
VPACKSSWB xmm1,xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG 6B /r
+VEX.128.66.0F.WIG 6B /r
VPACKSSDW xmm1,xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG 63 /r
+VEX.256.66.0F.WIG 63 /r
VPACKSSWB ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F.WIG 6B /r
+VEX.256.66.0F.WIG 6B /r
VPACKSSDW ymm1, ymm2, ymm3/m256
4-186 Vol. 2B
@@ -127454,7 +127600,7 @@ PACKSSWB/PACKSSDW—Pack with Signed Saturation
INSTRUCTION SET REFERENCE, M-U
-EVEX.NDS.256.66.0F.W0 6B /r
+EVEX.256.66.0F.W0 6B /r
VPACKSSDW ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
@@ -127470,7 +127616,7 @@ from ymm2 and from ymm3/m256/m32bcst
into packed signed word integers in ymm1
using signed saturation under writemask k1.
-EVEX.NDS.512.66.0F.W0 6B /r
+EVEX.512.66.0F.W0 6B /r
VPACKSSDW zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -127577,8 +127723,7 @@ A’
Figure 4-6. Operation of the PACKSSDW Instruction Using 64-bit Operands
PACKSSWB converts packed signed word integers in the first and second source operands into packed signed byte
integers using signed saturation to handle overflow conditions beyond the range of signed byte integers. If the
-signed doubleword value is beyond the range of an unsigned word (i.e. greater than 7FH or less than 80H), the
-saturated signed byte integer value of 7FH or 80H, respectively, is stored in the destination. PACKSSDW converts
+signed word value is beyond the range of a signed byte value (i.e., greater than 7FH or less than 80H), the saturated signed byte integer value of 7FH or 80H, respectively, is stored in the destination. PACKSSDW converts
packed signed doubleword integers in the first and second source operands into packed signed word integers using
signed saturation to handle overflow conditions beyond 7FFFH and 8000H.
EVEX encoded PACKSSWB: The first source operand is a ZMM/YMM/XMM register. The second source operand is a
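The corrected sentence in this hunk describes signed saturation: a signed word outside the signed byte range is clamped to 7FH (127) or 80H (-128), and PACKSSDW does the same for doublewords against 7FFFH and 8000H. A minimal Python sketch of that rule for a single 128-bit operand pair, with elements modeled as Python integers. The function names are hypothetical, and the per-lane layout of the 256-bit and 512-bit forms and writemasking are not modeled.

```python
def saturate_signed(value: int, bits: int) -> int:
    """Clamp value to the range of a signed integer with the given bit width."""
    lowest = -(1 << (bits - 1))        # -128 for 8 bits, -32768 for 16 bits
    highest = (1 << (bits - 1)) - 1    # 127 for 8 bits, 32767 for 16 bits
    return max(lowest, min(highest, value))


def packsswb(src1_words, src2_words):
    """Pack signed words into signed bytes; first source fills the low half."""
    return [saturate_signed(w, 8) for w in list(src1_words) + list(src2_words)]


def packssdw(src1_dwords, src2_dwords):
    """Pack signed doublewords into signed words; first source fills the low half."""
    return [saturate_signed(d, 16) for d in list(src1_dwords) + list(src2_dwords)]


if __name__ == "__main__":
    # 128 and 1000 exceed 7FH and clamp to 127; -129 and -1000 clamp to -128.
    print(packsswb([0, 127, 128, -128, -129, 1000, -1000, 5], [1] * 8))
```

Values already inside the target range pass through unchanged, which is why only the out-of-range cases need the 7FH/80H wording in the text.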
@@ -127944,7 +128089,7 @@ SSE4_1
66 0F 38 2B /r
PACKUSDW xmm1, xmm2/m128
-VEX.NDS.128.66.0F38 2B /r
+VEX.128.66.0F38 2B /r
VPACKUSDW xmm1,xmm2,
xmm3/m128
@@ -127954,7 +128099,7 @@ V/V
AVX
-VEX.NDS.256.66.0F38 2B /r
+VEX.256.66.0F38 2B /r
VPACKUSDW ymm1, ymm2,
ymm3/m256
@@ -127964,7 +128109,7 @@ V/V
AVX2
-EVEX.NDS.128.66.0F38.W0 2B /r
+EVEX.128.66.0F38.W0 2B /r
VPACKUSDW xmm1{k1}{z},
xmm2, xmm3/m128/m32bcst
@@ -127975,7 +128120,7 @@ V/V
AVX512VL
AVX512BW
-EVEX.NDS.256.66.0F38.W0 2B /r
+EVEX.256.66.0F38.W0 2B /r
VPACKUSDW ymm1{k1}{z},
ymm2, ymm3/m256/m32bcst
@@ -127986,7 +128131,7 @@ V/V
AVX512VL
AVX512BW
-EVEX.NDS.512.66.0F38.W0 2B /r
+EVEX.512.66.0F38.W0 2B /r
VPACKUSDW zmm1{k1}{z},
zmm2, zmm3/m512/m32bcst
@@ -128361,7 +128506,7 @@ and 16signed word integers from
ymm3/m256 into 32 unsigned byte integers
in ymm1 using unsigned saturation.
-EVEX.NDS.128.66.0F.WIG 67 /r
+EVEX.128.66.0F.WIG 67 /r
VPACKUSWB xmm1{k1}{z}, xmm2, xmm3/m128
C
@@ -128376,7 +128521,7 @@ and signed word integers from xmm3/m128
into unsigned byte integers in xmm1 using
unsigned saturation under writemask k1.
-EVEX.NDS.256.66.0F.WIG 67 /r
+EVEX.256.66.0F.WIG 67 /r
VPACKUSWB ymm1{k1}{z}, ymm2, ymm3/m256
C
@@ -128391,7 +128536,7 @@ and signed word integers from ymm3/m256
into unsigned byte integers in ymm1 using
unsigned saturation under writemask k1.
-EVEX.NDS.512.66.0F.WIG 67 /r
+EVEX.512.66.0F.WIG 67 /r
VPACKUSWB zmm1{k1}{z}, zmm2, zmm3/m512
C
@@ -128410,10 +128555,10 @@ PACKUSWB mm, mm/m64
66 0F 67 /r
PACKUSWB xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 67 /r
+VEX.128.66.0F.WIG 67 /r
VPACKUSWB xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG 67 /r
+VEX.256.66.0F.WIG 67 /r
VPACKUSWB ymm1, ymm2, ymm3/m256
NOTES:
@@ -128729,38 +128874,38 @@ PADDW xmm1, xmm2/m128
PADDD xmm1, xmm2/m128
66 0F D4 /r
PADDQ xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG FC /r
+VEX.128.66.0F.WIG FC /r
VPADDB xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG FD /r
+VEX.128.66.0F.WIG FD /r
VPADDW xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG FE /r
+VEX.128.66.0F.WIG FE /r
VPADDD xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG D4 /r
+VEX.128.66.0F.WIG D4 /r
VPADDQ xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG FC /r
+VEX.256.66.0F.WIG FC /r
VPADDB ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F.WIG FD /r
+VEX.256.66.0F.WIG FD /r
VPADDW ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F.WIG FE /r
+VEX.256.66.0F.WIG FE /r
VPADDD ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F.WIG D4 /r
+VEX.256.66.0F.WIG D4 /r
VPADDQ ymm1, ymm2, ymm3/m256
-EVEX.NDS.128.66.0F.WIG FC /r
+EVEX.128.66.0F.WIG FC /r
VPADDB xmm1 {k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.128.66.0F.WIG FD /r
+EVEX.128.66.0F.WIG FD /r
VPADDW xmm1 {k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.128.66.0F.W0 FE /r
+EVEX.128.66.0F.W0 FE /r
VPADDD xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.128.66.0F.W1 D4 /r
+EVEX.128.66.0F.W1 D4 /r
VPADDQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F.WIG FC /r
+EVEX.256.66.0F.WIG FC /r
VPADDB ymm1 {k1}{z}, ymm2,
ymm3/m256
-EVEX.NDS.256.66.0F.WIG FD /r
+EVEX.256.66.0F.WIG FD /r
VPADDW ymm1 {k1}{z}, ymm2,
ymm3/m256
4-204 Vol. 2B
@@ -128983,22 +129128,22 @@ Flag
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F.W0 FE /r
+EVEX.256.66.0F.W0 FE /r
VPADDD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.256.66.0F.W1 D4 /r
+EVEX.256.66.0F.W1 D4 /r
VPADDQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F.WIG FC /r
+EVEX.512.66.0F.WIG FC /r
VPADDB zmm1 {k1}{z}, zmm2,
zmm3/m512
-EVEX.NDS.512.66.0F.WIG FD /r
+EVEX.512.66.0F.WIG FD /r
VPADDW zmm1 {k1}{z}, zmm2,
zmm3/m512
-EVEX.NDS.512.66.0F.W0 FE /r
+EVEX.512.66.0F.W0 FE /r
VPADDD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
-EVEX.NDS.512.66.0F.W1 D4 /r
+EVEX.512.66.0F.W1 D4 /r
VPADDQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
NOTES:
@@ -129446,7 +129591,7 @@ Add packed signed word integers from
xmm2/m128 and xmm1 and saturate the
results.
-VEX.NDS.128.66.0F.WIG EC /r
+VEX.128.66.0F.WIG EC /r
VPADDSB xmm1, xmm2, xmm3/m128
B
@@ -129458,7 +129603,7 @@ AVX
Add packed signed byte integers from
xmm3/m128 and xmm2 saturate the results.
-VEX.NDS.128.66.0F.WIG ED /r
+VEX.128.66.0F.WIG ED /r
B
@@ -129490,7 +129635,7 @@ Add packed signed word integers from ymm2,
and ymm3/m256 and store the saturated
results in ymm1.
-EVEX.NDS.128.66.0F.WIG EC /r
+EVEX.128.66.0F.WIG EC /r
VPADDSB xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -129504,7 +129649,7 @@ Add packed signed byte integers from xmm2,
and xmm3/m128 and store the saturated
results in xmm1 under writemask k1.
-EVEX.NDS.256.66.0F.WIG EC /r
+EVEX.256.66.0F.WIG EC /r
VPADDSB ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -129518,7 +129663,7 @@ Add packed signed byte integers from ymm2,
and ymm3/m256 and store the saturated
results in ymm1 under writemask k1.
-EVEX.NDS.512.66.0F.WIG EC /r
+EVEX.512.66.0F.WIG EC /r
VPADDSB zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -129531,7 +129676,7 @@ Add packed signed byte integers from zmm2,
and zmm3/m512 and store the saturated
results in zmm1 under writemask k1.
-EVEX.NDS.128.66.0F.WIG ED /r
+EVEX.128.66.0F.WIG ED /r
VPADDSW xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -129545,7 +129690,7 @@ Add packed signed word integers from xmm2,
and xmm3/m128 and store the saturated
results in xmm1 under writemask k1.
-EVEX.NDS.256.66.0F.WIG ED /r
+EVEX.256.66.0F.WIG ED /r
VPADDSW ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -129559,7 +129704,7 @@ Add packed signed word integers from ymm2,
and ymm3/m256 and store the saturated
results in ymm1 under writemask k1.
-EVEX.NDS.512.66.0F.WIG ED /r
+EVEX.512.66.0F.WIG ED /r
VPADDSW zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -129581,9 +129726,9 @@ PADDSW mm, mm/m64
PADDSW xmm1, xmm2/m128
VPADDSW xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG EC /r
+VEX.256.66.0F.WIG EC /r
VPADDSB ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F.WIG ED /r
+VEX.256.66.0F.WIG ED /r
VPADDSW ymm1, ymm2, ymm3/m256
NOTES:
@@ -129880,7 +130025,7 @@ Add packed unsigned word integers from
xmm3/m128 to xmm2 and saturate the
results.
-VEX.NDS.256.66.0F.WIG DC /r
+VEX.256.66.0F.WIG DC /r
VPADDUSB ymm1, ymm2, ymm3/m256
B
@@ -129893,7 +130038,7 @@ Add packed unsigned byte integers from
ymm2, and ymm3/m256 and store the
saturated results in ymm1.
-VEX.NDS.256.66.0F.WIG DD /r
+VEX.256.66.0F.WIG DD /r
VPADDUSW ymm1, ymm2, ymm3/m256
B
@@ -129906,7 +130051,7 @@ Add packed unsigned word integers from
ymm2, and ymm3/m256 and store the
saturated results in ymm1.
-EVEX.NDS.128.66.0F.WIG DC /r
+EVEX.128.66.0F.WIG DC /r
VPADDUSB xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -129921,7 +130066,7 @@ xmm2, and xmm3/m128 and store the
saturated results in xmm1 under writemask
k1.
-EVEX.NDS.256.66.0F.WIG DC /r
+EVEX.256.66.0F.WIG DC /r
VPADDUSB ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -129936,7 +130081,7 @@ ymm2, and ymm3/m256 and store the
saturated results in ymm1 under writemask
k1.
-EVEX.NDS.512.66.0F.WIG DC /r
+EVEX.512.66.0F.WIG DC /r
VPADDUSB zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -129950,7 +130095,7 @@ zmm2, and zmm3/m512 and store the
saturated results in zmm1 under writemask
k1.
-EVEX.NDS.128.66.0F.WIG DD /r
+EVEX.128.66.0F.WIG DD /r
VPADDUSW xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -129965,7 +130110,7 @@ xmm2, and xmm3/m128 and store the
saturated results in xmm1 under writemask
k1.
-EVEX.NDS.256.66.0F.WIG DD /r
+EVEX.256.66.0F.WIG DD /r
VPADDUSW ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -129987,9 +130132,9 @@ NP 0F DD /r1
PADDUSW mm, mm/m64
66 0F DD /r
PADDUSW xmm1, xmm2/m128
-VEX.NDS.128.660F.WIG DC /r
+VEX.128.660F.WIG DC /r
VPADDUSB xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG DD /r
+VEX.128.66.0F.WIG DD /r
VPADDUSW xmm1, xmm2, xmm3/m128
PADDUSB/PADDUSW—Add Packed Unsigned Integers with Unsigned Saturation
@@ -129998,7 +130143,7 @@ Vol. 2B 4-215
INSTRUCTION SET REFERENCE, M-U
-EVEX.NDS.512.66.0F.WIG DD /r
+EVEX.512.66.0F.WIG DD /r
VPADDUSW zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -130290,7 +130435,7 @@ the right by constant values in imm8 from each
intermediate result, and two 16-byte results are
stored in ymm1.
-EVEX.NDS.128.66.0F3A.WIG 0F /r ib
+EVEX.128.66.0F3A.WIG 0F /r ib
VPALIGNR xmm1 {k1}{z}, xmm2, xmm3/m128,
imm8
@@ -130302,7 +130447,7 @@ AVX512VL Concatenate xmm2 and xmm3/m128 into a 32AVX512BW byte intermediate resu
result shifted to the right by constant value in
imm8 and result is stored in xmm1.
-EVEX.NDS.256.66.0F3A.WIG 0F /r ib
+EVEX.256.66.0F3A.WIG 0F /r ib
VPALIGNR ymm1 {k1}{z}, ymm2, ymm3/m256,
imm8
@@ -130317,7 +130462,7 @@ the right by constant values in imm8 from each
intermediate result, and two 16-byte results are
stored in ymm1.
-EVEX.NDS.512.66.0F3A.WIG 0F /r ib
+EVEX.512.66.0F3A.WIG 0F /r ib
VPALIGNR zmm1 {k1}{z}, zmm2, zmm3/m512,
imm8
@@ -130335,10 +130480,10 @@ stored in zmm1.
PALIGNR mm1, mm2/m64, imm8
66 0F 3A 0F /r ib
PALIGNR xmm1, xmm2/m128, imm8
-VEX.NDS.128.66.0F3A.WIG 0F /r ib
+VEX.128.66.0F3A.WIG 0F /r ib
VPALIGNR xmm1, xmm2, xmm3/m128, imm8
-VEX.NDS.256.66.0F3A.WIG 0F /r ib
+VEX.256.66.0F3A.WIG 0F /r ib
VPALIGNR ymm1, ymm2, ymm3/m256, imm8
NOTES:
@@ -130594,7 +130739,7 @@ AVX2
Bitwise AND of ymm2, and ymm3/m256 and
store result in ymm1.
-EVEX.NDS.128.66.0F.W0 DB /r
+EVEX.128.66.0F.W0 DB /r
VPANDD xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
@@ -130609,7 +130754,7 @@ Bitwise AND of packed doubleword integers in
xmm2 and xmm3/m128/m32bcst and store
result in xmm1 using writemask k1.
-EVEX.NDS.256.66.0F.W0 DB /r
+EVEX.256.66.0F.W0 DB /r
VPANDD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
@@ -130624,7 +130769,7 @@ Bitwise AND of packed doubleword integers in
ymm2 and ymm3/m256/m32bcst and store
result in ymm1 using writemask k1.
-EVEX.NDS.512.66.0F.W0 DB /r
+EVEX.512.66.0F.W0 DB /r
VPANDD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -130638,7 +130783,7 @@ Bitwise AND of packed doubleword integers in
zmm2 and zmm3/m512/m32bcst and store
result in zmm1 using writemask k1.
-EVEX.NDS.128.66.0F.W1 DB /r
+EVEX.128.66.0F.W1 DB /r
VPANDQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
@@ -130653,7 +130798,7 @@ Bitwise AND of packed quadword integers in
xmm2 and xmm3/m128/m64bcst and store
result in xmm1 using writemask k1.
-EVEX.NDS.256.66.0F.W1 DB /r
+EVEX.256.66.0F.W1 DB /r
VPANDQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
@@ -130668,7 +130813,7 @@ Bitwise AND of packed quadword integers in
ymm2 and ymm3/m256/m64bcst and store
result in ymm1 using writemask k1.
-EVEX.NDS.512.66.0F.W1 DB /r
+EVEX.512.66.0F.W1 DB /r
VPANDQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -130685,9 +130830,9 @@ result in zmm1 using writemask k1.
PAND mm, mm/m64
66 0F DB /r
PAND xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG DB /r
+VEX.128.66.0F.WIG DB /r
VPAND xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG DB /r
+VEX.256.66.0F.WIG DB /r
VPAND ymm1, ymm2, ymm3/.m256
NOTES:
@@ -130922,7 +131067,7 @@ AVX2
Bitwise AND NOT of ymm2, and ymm3/m256
and store result in ymm1.
-EVEX.NDS.128.66.0F.W0 DF /r
+EVEX.128.66.0F.W0 DF /r
VPANDND xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
@@ -130937,7 +131082,7 @@ Bitwise AND NOT of packed doubleword
integers in xmm2 and xmm3/m128/m32bcst
and store result in xmm1 using writemask k1.
-EVEX.NDS.256.66.0F.W0 DF /r
+EVEX.256.66.0F.W0 DF /r
VPANDND ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
@@ -130952,7 +131097,7 @@ Bitwise AND NOT of packed doubleword
integers in ymm2 and ymm3/m256/m32bcst
and store result in ymm1 using writemask k1.
-EVEX.NDS.512.66.0F.W0 DF /r
+EVEX.512.66.0F.W0 DF /r
VPANDND zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -130966,7 +131111,7 @@ Bitwise AND NOT of packed doubleword
integers in zmm2 and zmm3/m512/m32bcst
and store result in zmm1 using writemask k1.
-EVEX.NDS.128.66.0F.W1 DF /r
+EVEX.128.66.0F.W1 DF /r
VPANDNQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
@@ -130981,7 +131126,7 @@ Bitwise AND NOT of packed quadword
integers in xmm2 and xmm3/m128/m64bcst
and store result in xmm1 using writemask k1.
-EVEX.NDS.256.66.0F.W1 DF /r
+EVEX.256.66.0F.W1 DF /r
VPANDNQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
@@ -130996,7 +131141,7 @@ Bitwise AND NOT of packed quadword
integers in ymm2 and ymm3/m256/m64bcst
and store result in ymm1 using writemask k1.
-EVEX.NDS.512.66.0F.W1 DF /r
+EVEX.512.66.0F.W1 DF /r
VPANDNQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -131013,9 +131158,9 @@ and store result in zmm1 using writemask k1.
PANDN mm, mm/m64
66 0F DF /r
PANDN xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG DF /r
+VEX.128.66.0F.WIG DF /r
VPANDN xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG DF /r
+VEX.256.66.0F.WIG DF /r
VPANDN ymm1, ymm2, ymm3/m256
NOTES:
@@ -131370,7 +131515,7 @@ AVX2
Average packed unsigned word integers from
ymm2, ymm3/m256 with rounding to ymm1.
-EVEX.NDS.128.66.0F.WIG E0 /r
+EVEX.128.66.0F.WIG E0 /r
VPAVGB xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -131381,7 +131526,7 @@ AVX512VL Average packed unsigned byte integers from
AVX512BW xmm2, and xmm3/m128 with rounding and
store to xmm1 under writemask k1.
-EVEX.NDS.256.66.0F.WIG E0 /r
+EVEX.256.66.0F.WIG E0 /r
VPAVGB ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -131392,7 +131537,7 @@ AVX512VL Average packed unsigned byte integers from
AVX512BW ymm2, and ymm3/m256 with rounding and
store to ymm1 under writemask k1.
-EVEX.NDS.512.66.0F.WIG E0 /r
+EVEX.512.66.0F.WIG E0 /r
VPAVGB zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -131403,7 +131548,7 @@ AVX512BW Average packed unsigned byte integers from
zmm2, and zmm3/m512 with rounding and
store to zmm1 under writemask k1.
-EVEX.NDS.128.66.0F.WIG E3 /r
+EVEX.128.66.0F.WIG E3 /r
VPAVGW xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -131414,7 +131559,7 @@ AVX512VL Average packed unsigned word integers from
AVX512BW xmm2, xmm3/m128 with rounding to xmm1
under writemask k1.
-EVEX.NDS.256.66.0F.WIG E3 /r
+EVEX.256.66.0F.WIG E3 /r
VPAVGW ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -131425,7 +131570,7 @@ AVX512VL Average packed unsigned word integers from
AVX512BW ymm2, ymm3/m256 with rounding to ymm1
under writemask k1.
-EVEX.NDS.512.66.0F.WIG E3 /r
+EVEX.512.66.0F.WIG E3 /r
VPAVGW zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -131443,13 +131588,13 @@ NP 0F E3 /r1
PAVGW mm1, mm2/m64
66 0F E3 /r
PAVGW xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG E0 /r
+VEX.128.66.0F.WIG E0 /r
VPAVGB xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG E3 /r
+VEX.128.66.0F.WIG E3 /r
VPAVGW xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG E0 /r
+VEX.256.66.0F.WIG E0 /r
VPAVGB ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F.WIG E3 /r
+VEX.256.66.0F.WIG E3 /r
VPAVGW ymm1, ymm2, ymm3/m256
NOTES:
@@ -131688,7 +131833,7 @@ xmm2/m128 from mask specified in the high
bit of each byte in XMM0 and store the
values into xmm1.
-VEX.NDS.128.66.0F3A.W0 4C /r /is4
+VEX.128.66.0F3A.W0 4C /r /is4
VPBLENDVB xmm1, xmm2, xmm3/m128, xmm4
RVMR V/V
@@ -131700,7 +131845,7 @@ xmm3/m128 using mask bits in the specified
mask register, xmm4, and store the values
into xmm1.
-VEX.NDS.256.66.0F3A.W0 4C /r /is4
+VEX.256.66.0F3A.W0 4C /r /is4
VPBLENDVB ymm1, ymm2, ymm3/m256, ymm4
RVMR V/V
@@ -131980,7 +132125,7 @@ Select words from xmm1 and xmm2/m128
from mask specified in imm8 and store the
values into xmm1.
-VEX.NDS.128.66.0F3A.WIG 0E /r ib
+VEX.128.66.0F3A.WIG 0E /r ib
VPBLENDW xmm1, xmm2, xmm3/m128, imm8
RVMI V/V
@@ -131991,7 +132136,7 @@ Select words from xmm2 and xmm3/m128
from mask specified in imm8 and store the
values into xmm1.
-VEX.NDS.256.66.0F3A.WIG 0E /r ib
+VEX.256.66.0F3A.WIG 0E /r ib
VPBLENDW ymm1, ymm2, ymm3/m256, imm8
RVMI V/V
@@ -132188,7 +132333,7 @@ xmm1 by one quadword of xmm2/m128,
stores the 128-bit result in xmm1. The immediate is used to determine which quadwords
of xmm1 and xmm2/m128 should be used.
-VEX.NDS.128.66.0F3A.WIG 44 /r ib
+VEX.128.66.0F3A.WIG 44 /r ib
VPCLMULQDQ xmm1, xmm2, xmm3/m128, imm8
RVMI V/V
@@ -132489,7 +132634,7 @@ AVX
Compare packed doublewords in xmm3/m128
and xmm2 for equality.
-VEX.NDS.256.66.0F.WIG 74 /r
+VEX.256.66.0F.WIG 74 /r
VPCMPEQB ymm1, ymm2, ymm3 /m256
B
@@ -132501,7 +132646,7 @@ AVX2
Compare packed bytes in ymm3/m256 and
ymm2 for equality.
-VEX.NDS.256.66.0F.WIG 75 /r
+VEX.256.66.0F.WIG 75 /r
B
@@ -132521,7 +132666,7 @@ AVX2
Compare packed doublewords in ymm3/m256
and ymm2 for equality.
-EVEX.NDS.128.66.0F.W0 76 /r
+EVEX.128.66.0F.W0 76 /r
C
VPCMPEQD k1 {k2}, xmm2, xmm3/m128/m32bcst
@@ -132536,7 +132681,7 @@ set vector mask k1 to reflect the
zero/nonzero status of each element of the
result, under writemask.
-EVEX.NDS.256.66.0F.W0 76 /r
+EVEX.256.66.0F.W0 76 /r
C
VPCMPEQD k1 {k2}, ymm2, ymm3/m256/m32bcst
@@ -132551,7 +132696,7 @@ set vector mask k1 to reflect the
zero/nonzero status of each element of the
result, under writemask.
-EVEX.NDS.512.66.0F.W0 76 /r
+EVEX.512.66.0F.W0 76 /r
C
VPCMPEQD k1 {k2}, zmm2, zmm3/m512/m32bcst
@@ -132564,7 +132709,7 @@ zmm2 and zmm3/m512/m32bcst, and set
destination k1 according to the comparison
results under writemask k2.
-EVEX.NDS.128.66.0F.WIG 74 /r
+EVEX.128.66.0F.WIG 74 /r
VPCMPEQB k1 {k2}, xmm2, xmm3 /m128
V/V
@@ -132585,15 +132730,15 @@ NP 0F 76 /r1
PCMPEQD mm, mm/m64
66 0F 76 /r
PCMPEQD xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 74 /r
+VEX.128.66.0F.WIG 74 /r
VPCMPEQB xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG 75 /r
+VEX.128.66.0F.WIG 75 /r
VPCMPEQW xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG 76 /r
+VEX.128.66.0F.WIG 76 /r
VPCMPEQD xmm1, xmm2, xmm3/m128
VPCMPEQW ymm1, ymm2, ymm3 /m256
-VEX.NDS.256.66.0F.WIG 76 /r
+VEX.256.66.0F.WIG 76 /r
VPCMPEQD ymm1, ymm2, ymm3 /m256
4-244 Vol. 2B
@@ -132604,7 +132749,7 @@ PCMPEQB/PCMPEQW/PCMPEQD— Compare Packed Data for Equal
INSTRUCTION SET REFERENCE, M-U
-EVEX.NDS.256.66.0F.WIG 74 /r
+EVEX.256.66.0F.WIG 74 /r
VPCMPEQB k1 {k2}, ymm2, ymm3 /m256
D
@@ -132616,7 +132761,7 @@ AVX512BW ymm2 for equality and set vector mask k1 to
reflect the zero/nonzero status of each
element of the result, under writemask.
-EVEX.NDS.512.66.0F.WIG 74 /r
+EVEX.512.66.0F.WIG 74 /r
VPCMPEQB k1 {k2}, zmm2, zmm3 /m512
D
@@ -132628,7 +132773,7 @@ zmm2 for equality and set vector mask k1 to
reflect the zero/nonzero status of each
element of the result, under writemask.
-EVEX.NDS.128.66.0F.WIG 75 /r
+EVEX.128.66.0F.WIG 75 /r
VPCMPEQW k1 {k2}, xmm2, xmm3 /m128
D
@@ -132640,7 +132785,7 @@ AVX512BW xmm2 for equality and set vector mask k1 to
reflect the zero/nonzero status of each
element of the result, under writemask.
-EVEX.NDS.256.66.0F.WIG 75 /r
+EVEX.256.66.0F.WIG 75 /r
VPCMPEQW k1 {k2}, ymm2, ymm3 /m256
D
@@ -132652,7 +132797,7 @@ AVX512BW ymm2 for equality and set vector mask k1 to
reflect the zero/nonzero status of each
element of the result, under writemask.
-EVEX.NDS.512.66.0F.WIG 75 /r
+EVEX.512.66.0F.WIG 75 /r
VPCMPEQW k1 {k2}, zmm2, zmm3 /m512
D
@@ -132994,7 +133139,7 @@ SSE4_1
Compare packed qwords in xmm2/m128 and
xmm1 for equality.
-VEX.NDS.128.66.0F38.WIG 29 /r
+VEX.128.66.0F38.WIG 29 /r
VPCMPEQQ xmm1, xmm2, xmm3/m128
B
@@ -133006,7 +133151,7 @@ AVX
Compare packed quadwords in xmm3/m128
and xmm2 for equality.
-VEX.NDS.256.66.0F38.WIG 29 /r
+VEX.256.66.0F38.WIG 29 /r
VPCMPEQQ ymm1, ymm2, ymm3 /m256
B
@@ -133018,7 +133163,7 @@ AVX2
Compare packed quadwords in ymm3/m256
and ymm2 for equality.
-EVEX.NDS.128.66.0F38.W1 29 /r
+EVEX.128.66.0F38.W1 29 /r
VPCMPEQQ k1 {k2}, xmm2, xmm3/m128/m64bcst
C
@@ -133031,7 +133176,7 @@ set vector mask k1 to reflect the zero/nonzero
status of each element of the result, under
writemask.
-EVEX.NDS.256.66.0F38.W1 29 /r
+EVEX.256.66.0F38.W1 29 /r
VPCMPEQQ k1 {k2}, ymm2, ymm3/m256/m64bcst
C
@@ -133044,7 +133189,7 @@ set vector mask k1 to reflect the zero/nonzero
status of each element of the result, under
writemask.
-EVEX.NDS.512.66.0F38.W1 29 /r
+EVEX.512.66.0F38.W1 29 /r
VPCMPEQQ k1 {k2}, zmm2, zmm3/m512/m64bcst
C
@@ -133720,7 +133865,7 @@ AVX2
Compare packed signed doubleword integers in
ymm2 and ymm3/m256 for greater than.
-EVEX.NDS.128.66.0F.W0 66 /r
+EVEX.128.66.0F.W0 66 /r
VPCMPGTD k1 {k2}, xmm2,
xmm3/m128/m32bcst
@@ -133736,7 +133881,7 @@ int32 vector xmm3/m128/m32bcst, and set
vector mask k1 to reflect the zero/nonzero status
of each element of the result, under writemask.
-EVEX.NDS.256.66.0F.W0 66 /r
+EVEX.256.66.0F.W0 66 /r
VPCMPGTD k1 {k2}, ymm2,
ymm3/m256/m32bcst
@@ -133752,7 +133897,7 @@ int32 vector ymm3/m256/m32bcst, and set
vector mask k1 to reflect the zero/nonzero status
of each element of the result, under writemask.
-EVEX.NDS.512.66.0F.W0 66 /r
+EVEX.512.66.0F.W0 66 /r
VPCMPGTD k1 {k2}, zmm2,
zmm3/m512/m32bcst
@@ -133767,7 +133912,7 @@ zmm2 and zmm3/m512/m32bcst, and set
destination k1 according to the comparison results
under writemask. k2.
-EVEX.NDS.128.66.0F.WIG 64 /r
+EVEX.128.66.0F.WIG 64 /r
VPCMPGTB k1 {k2}, xmm2, xmm3/m128
D
@@ -133782,7 +133927,7 @@ and xmm3/m128 for greater than, and set vector
mask k1 to reflect the zero/nonzero status of each
element of the result, under writemask.
-EVEX.NDS.256.66.0F.WIG 64 /r
+EVEX.256.66.0F.WIG 64 /r
VPCMPGTB k1 {k2}, ymm2, ymm3/m256
D
@@ -133808,17 +133953,17 @@ NP 0F 66 /r1
PCMPGTD mm, mm/m64
66 0F 66 /r
PCMPGTD xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 64 /r
+VEX.128.66.0F.WIG 64 /r
VPCMPGTB xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG 65 /r
+VEX.128.66.0F.WIG 65 /r
VPCMPGTW xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG 66 /r
+VEX.128.66.0F.WIG 66 /r
VPCMPGTD xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG 64 /r
+VEX.256.66.0F.WIG 64 /r
VPCMPGTB ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F.WIG 65 /r
+VEX.256.66.0F.WIG 65 /r
VPCMPGTW ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F.WIG 66 /r
+VEX.256.66.0F.WIG 66 /r
VPCMPGTD ymm1, ymm2, ymm3/m256
PCMPGTB/PCMPGTW/PCMPGTD—Compare Packed Signed Integers for Greater Than
@@ -133827,7 +133972,7 @@ Vol. 2B 4-257
INSTRUCTION SET REFERENCE, M-U
-EVEX.NDS.512.66.0F.WIG 64 /r
+EVEX.512.66.0F.WIG 64 /r
VPCMPGTB k1 {k2}, zmm2, zmm3/m512
D
@@ -133841,7 +133986,7 @@ zmm3/m512 for greater than, and set vector
mask k1 to reflect the zero/nonzero status of each
element of the result, under writemask.
-EVEX.NDS.128.66.0F.WIG 65 /r
+EVEX.128.66.0F.WIG 65 /r
VPCMPGTW k1 {k2}, xmm2, xmm3/m128
D
@@ -133856,7 +134001,7 @@ and xmm3/m128 for greater than, and set vector
mask k1 to reflect the zero/nonzero status of each
element of the result, under writemask.
-EVEX.NDS.256.66.0F.WIG 65 /r
+EVEX.256.66.0F.WIG 65 /r
VPCMPGTW k1 {k2}, ymm2, ymm3/m256
D
@@ -133871,7 +134016,7 @@ and ymm3/m256 for greater than, and set vector
mask k1 to reflect the zero/nonzero status of each
element of the result, under writemask.
-EVEX.NDS.512.66.0F.WIG 65 /r
+EVEX.512.66.0F.WIG 65 /r
VPCMPGTW k1 {k2}, zmm2, zmm3/m512
D
@@ -134218,7 +134363,7 @@ SSE4_2
Compare packed signed qwords in xmm2/m128
and xmm1 for greater than.
-VEX.NDS.128.66.0F38.WIG 37 /r
+VEX.128.66.0F38.WIG 37 /r
VPCMPGTQ xmm1, xmm2, xmm3/m128
B
@@ -134230,7 +134375,7 @@ AVX
Compare packed signed qwords in xmm2 and
xmm3/m128 for greater than.
-VEX.NDS.256.66.0F38.WIG 37 /r
+VEX.256.66.0F38.WIG 37 /r
VPCMPGTQ ymm1, ymm2, ymm3/m256
B
@@ -134242,7 +134387,7 @@ AVX2
Compare packed signed qwords in ymm2 and
ymm3/m256 for greater than.
-EVEX.NDS.128.66.0F38.W1 37 /r
+EVEX.128.66.0F38.W1 37 /r
VPCMPGTQ k1 {k2}, xmm2,
xmm3/m128/m64bcst
@@ -134255,7 +134400,7 @@ AVX512F int64 vector xmm3/m128/m64bcst, and set
vector mask k1 to reflect the zero/nonzero status
of each element of the result, under writemask.
-EVEX.NDS.256.66.0F38.W1 37 /r
+EVEX.256.66.0F38.W1 37 /r
VPCMPGTQ k1 {k2}, ymm2,
ymm3/m256/m64bcst
@@ -134268,7 +134413,7 @@ AVX512F int64 vector ymm3/m256/m64bcst, and set
vector mask k1 to reflect the zero/nonzero status
of each element of the result, under writemask.
-EVEX.NDS.512.66.0F38.W1 37 /r
+EVEX.512.66.0F38.W1 37 /r
C
VPCMPGTQ k1 {k2}, zmm2, zmm3/m512/m64bcst
@@ -134774,9 +134919,9 @@ Feature
Flag
BMI2
-VEX.NDS.LZ.F2.0F38.W0 F5 /r
+VEX.LZ.F2.0F38.W0 F5 /r
PDEP r32a, r32b, r/m32
-VEX.NDS.LZ.F2.0F38.W1 F5 /r
+VEX.LZ.F2.0F38.W1 F5 /r
PDEP r64a, r64b, r/m64
RVM
@@ -134928,10 +135073,7 @@ SIMD Floating-Point Exceptions
None
Other Exceptions
-See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29; additionally
-#UD
-
-If VEX.W = 1.
+See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29.
PDEP — Parallel Bits Deposit
@@ -134957,9 +135099,9 @@ Feature
Flag
BMI2
-VEX.NDS.LZ.F3.0F38.W0 F5 /r
+VEX.LZ.F3.0F38.W0 F5 /r
PEXT r32a, r32b, r/m32
-VEX.NDS.LZ.F3.0F38.W1 F5 /r
+VEX.LZ.F3.0F38.W1 F5 /r
PEXT r64a, r64b, r/m64
RVM
@@ -135107,10 +135249,7 @@ SIMD Floating-Point Exceptions
None
Other Exceptions
-See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29; additionally
-#UD
-
-If VEX.W = 1.
+See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29.
PEXT — Parallel Bits Extract
@@ -135752,13 +135891,13 @@ NP 0F 38 02 /r
PHADDD mm1, mm2/m64
66 0F 38 02 /r
PHADDD xmm1, xmm2/m128
-VEX.NDS.128.66.0F38.WIG 01 /r
+VEX.128.66.0F38.WIG 01 /r
VPHADDW xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F38.WIG 02 /r
+VEX.128.66.0F38.WIG 02 /r
VPHADDD xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F38.WIG 01 /r
+VEX.256.66.0F38.WIG 01 /r
VPHADDW ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F38.WIG 02 /r
+VEX.256.66.0F38.WIG 02 /r
VPHADDD ymm1, ymm2, ymm3/m256
NOTES:
1. See note in Section 2.4, “AVX and SSE Instruction Exception Specification” in the Intel® 64 and IA-32 Architectures Software
@@ -136039,9 +136178,9 @@ saturated integers to ymm1.
PHADDSW mm1, mm2/m64
66 0F 38 03 /r
PHADDSW xmm1, xmm2/m128
-VEX.NDS.128.66.0F38.WIG 03 /r
+VEX.128.66.0F38.WIG 03 /r
VPHADDSW xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F38.WIG 03 /r
+VEX.256.66.0F38.WIG 03 /r
VPHADDSW ymm1, ymm2, ymm3/m256
NOTES:
@@ -136395,13 +136534,13 @@ NP 0F 38 06 /r
PHSUBD mm1, mm2/m64
66 0F 38 06 /r
PHSUBD xmm1, xmm2/m128
-VEX.NDS.128.66.0F38.WIG 05 /r
+VEX.128.66.0F38.WIG 05 /r
VPHSUBW xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F38.WIG 06 /r
+VEX.128.66.0F38.WIG 06 /r
VPHSUBD xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F38.WIG 05 /r
+VEX.256.66.0F38.WIG 05 /r
VPHSUBW ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F38.WIG 06 /r
+VEX.256.66.0F38.WIG 06 /r
VPHSUBD ymm1, ymm2, ymm3/m256
NOTES:
@@ -136633,9 +136772,9 @@ pack saturated integers to ymm1.
PHSUBSW mm1, mm2/m64
66 0F 38 07 /r
PHSUBSW xmm1, xmm2/m128
-VEX.NDS.128.66.0F38.WIG 07 /r
+VEX.128.66.0F38.WIG 07 /r
VPHSUBSW xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F38.WIG 07 /r
+VEX.256.66.0F38.WIG 07 /r
VPHSUBSW ymm1, ymm2, ymm3/m256
NOTES:
@@ -136823,7 +136962,7 @@ Insert a qword integer value from r/m64 into
the xmm1 at the destination element
specified by imm8.
-VEX.NDS.128.66.0F3A.W0 20 /r ib
+VEX.128.66.0F3A.W0 20 /r ib
VPINSRB xmm1, xmm2, r32/m8, imm8
B
@@ -136836,7 +136975,7 @@ Merge a byte integer value from r32/m8 and
rest from xmm2 into xmm1 at the byte offset
in imm8.
-VEX.NDS.128.66.0F3A.W0 22 /r ib
+VEX.128.66.0F3A.W0 22 /r ib
VPINSRD xmm1, xmm2, r/m32, imm8
B
@@ -136849,7 +136988,7 @@ Insert a dword integer value from r32/m32
and rest from xmm2 into xmm1 at the dword
offset in imm8.
-VEX.NDS.128.66.0F3A.W1 22 /r ib
+VEX.128.66.0F3A.W1 22 /r ib
VPINSRQ xmm1, xmm2, r/m64, imm8
B
@@ -136862,7 +137001,7 @@ Insert a qword integer value from r64/m64
and rest from xmm2 into xmm1 at the qword
offset in imm8.
-EVEX.NDS.128.66.0F3A.WIG 20 /r ib
+EVEX.128.66.0F3A.WIG 20 /r ib
VPINSRB xmm1, xmm2, r32/m8, imm8
C
@@ -136873,7 +137012,7 @@ AVX512BW Merge a byte integer value from r32/m8 and
rest from xmm2 into xmm1 at the byte offset
in imm8.
-EVEX.NDS.128.66.0F3A.W0 22 /r ib
+EVEX.128.66.0F3A.W0 22 /r ib
VPINSRD xmm1, xmm2, r32/m32, imm8
C
@@ -136886,7 +137025,7 @@ Insert a dword integer value from r32/m32
and rest from xmm2 into xmm1 at the dword
offset in imm8.
-EVEX.NDS.128.66.0F3A.W1 22 /r ib
+EVEX.128.66.0F3A.W1 22 /r ib
VPINSRQ xmm1, xmm2, r64/m64, imm8
C
@@ -137091,9 +137230,9 @@ offset in imm8.
PINSRW mm, r32/m16, imm8
66 0F C4 /r ib
PINSRW xmm, r32/m16, imm8
-VEX.NDS.128.66.0F.W0 C4 /r ib
+VEX.128.66.0F.W0 C4 /r ib
VPINSRW xmm1, xmm2, r32/m16, imm8
-EVEX.NDS.128.66.0F.WIG C4 /r ib
+EVEX.128.66.0F.WIG C4 /r ib
VPINSRW xmm1, xmm2, r32/m16, imm8
NOTES:
@@ -137296,7 +137435,7 @@ Multiply signed and unsigned bytes, add
horizontal pair of signed words, pack
saturated signed-words to ymm1.
-EVEX.NDS.128.66.0F38.WIG 04 /r
+EVEX.128.66.0F38.WIG 04 /r
VPMADDUBSW xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -137308,7 +137447,7 @@ AVX512BW horizontal pair of signed words, pack
saturated signed-words to xmm1 under
writemask k1.
-EVEX.NDS.256.66.0F38.WIG 04 /r
+EVEX.256.66.0F38.WIG 04 /r
VPMADDUBSW ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -137320,7 +137459,7 @@ AVX512BW horizontal pair of signed words, pack
saturated signed-words to ymm1 under
writemask k1.
-EVEX.NDS.512.66.0F38.WIG 04 /r
+EVEX.512.66.0F38.WIG 04 /r
VPMADDUBSW zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -137335,9 +137474,9 @@ writemask k1.
PMADDUBSW mm1, mm2/m64
66 0F 38 04 /r
PMADDUBSW xmm1, xmm2/m128
-VEX.NDS.128.66.0F38.WIG 04 /r
+VEX.128.66.0F38.WIG 04 /r
VPMADDUBSW xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F38.WIG 04 /r
+VEX.256.66.0F38.WIG 04 /r
VPMADDUBSW ymm1, ymm2, ymm3/m256
NOTES:
@@ -137554,7 +137693,7 @@ the packed word integers in ymm3/m256, add
adjacent doubleword results, and store in
ymm1.
-EVEX.NDS.128.66.0F.WIG F5 /r
+EVEX.128.66.0F.WIG F5 /r
VPMADDWD xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -137566,7 +137705,7 @@ AVX512BW the packed word integers in xmm3/m128, add
adjacent doubleword results, and store in
xmm1 under writemask k1.
-EVEX.NDS.256.66.0F.WIG F5 /r
+EVEX.256.66.0F.WIG F5 /r
VPMADDWD ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -137578,7 +137717,7 @@ AVX512BW the packed word integers in ymm3/m256, add
adjacent doubleword results, and store in
ymm1 under writemask k1.
-EVEX.NDS.512.66.0F.WIG F5 /r
+EVEX.512.66.0F.WIG F5 /r
VPMADDWD zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -137594,10 +137733,10 @@ PMADDWD mm, mm/m64
66 0F F5 /r
PMADDWD xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG F5 /r
+VEX.128.66.0F.WIG F5 /r
VPMADDWD xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG F5 /r
+VEX.256.66.0F.WIG F5 /r
VPMADDWD ymm1, ymm2, ymm3/m256
NOTES:
@@ -137840,7 +137979,7 @@ V/V
SSE4_1
-VEX.NDS.128.66.0F38.WIG 3C /r
+VEX.128.66.0F38.WIG 3C /r
VPMAXSB xmm1, xmm2, xmm3/m128
B
@@ -137849,7 +137988,7 @@ V/V
AVX
-VEX.NDS.128.66.0F.WIG EE /r
+VEX.128.66.0F.WIG EE /r
VPMAXSW xmm1, xmm2, xmm3/m128
B
@@ -137858,7 +137997,7 @@ V/V
AVX
-VEX.NDS.128.66.0F38.WIG 3D /r
+VEX.128.66.0F38.WIG 3D /r
VPMAXSD xmm1, xmm2, xmm3/m128
B
@@ -137867,7 +138006,7 @@ V/V
AVX
-VEX.NDS.256.66.0F38.WIG 3C /r
+VEX.256.66.0F38.WIG 3C /r
VPMAXSB ymm1, ymm2, ymm3/m256
B
@@ -137876,7 +138015,7 @@ V/V
AVX2
-VEX.NDS.256.66.0F.WIG EE /r
+VEX.256.66.0F.WIG EE /r
VPMAXSW ymm1, ymm2, ymm3/m256
B
@@ -137885,7 +138024,7 @@ V/V
AVX2
-VEX.NDS.256.66.0F38.WIG 3D /r
+VEX.256.66.0F38.WIG 3D /r
VPMAXSD ymm1, ymm2, ymm3/m256
B
@@ -137894,25 +138033,25 @@ V/V
AVX2
-EVEX.NDS.128.66.0F38.WIG 3C /r
+EVEX.128.66.0F38.WIG 3C /r
VPMAXSB xmm1{k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F38.WIG 3C /r
+EVEX.256.66.0F38.WIG 3C /r
VPMAXSB ymm1{k1}{z}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F38.WIG 3C /r
+EVEX.512.66.0F38.WIG 3C /r
VPMAXSB zmm1{k1}{z}, zmm2,
zmm3/m512
-EVEX.NDS.128.66.0F.WIG EE /r
+EVEX.128.66.0F.WIG EE /r
VPMAXSW xmm1{k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F.WIG EE /r
+EVEX.256.66.0F.WIG EE /r
VPMAXSW ymm1{k1}{z}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F.WIG EE /r
+EVEX.512.66.0F.WIG EE /r
VPMAXSW zmm1{k1}{z}, zmm2,
zmm3/m512
-EVEX.NDS.128.66.0F38.W0 3D /r
+EVEX.128.66.0F38.W0 3D /r
VPMAXSD xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
@@ -138039,19 +138178,19 @@ Flag
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F38.W0 3D /r
+EVEX.256.66.0F38.W0 3D /r
VPMAXSD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 3D /r
+EVEX.512.66.0F38.W0 3D /r
VPMAXSD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
-EVEX.NDS.128.66.0F38.W1 3D /r
+EVEX.128.66.0F38.W1 3D /r
VPMAXSQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F38.W1 3D /r
+EVEX.256.66.0F38.W1 3D /r
VPMAXSQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 3D /r
+EVEX.512.66.0F38.W1 3D /r
VPMAXSQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -138503,7 +138642,7 @@ V/V
SSE4_1
-VEX.NDS.128.66.0F DE /r
+VEX.128.66.0F DE /r
VPMAXUB xmm1, xmm2, xmm3/m128
B
@@ -138512,7 +138651,7 @@ V/V
AVX
-VEX.NDS.128.66.0F38 3E/r
+VEX.128.66.0F38 3E/r
VPMAXUW xmm1, xmm2, xmm3/m128
B
@@ -138521,7 +138660,7 @@ V/V
AVX
-VEX.NDS.256.66.0F DE /r
+VEX.256.66.0F DE /r
VPMAXUB ymm1, ymm2, ymm3/m256
B
@@ -138530,7 +138669,7 @@ V/V
AVX2
-VEX.NDS.256.66.0F38 3E/r
+VEX.256.66.0F38 3E/r
VPMAXUW ymm1, ymm2, ymm3/m256
B
@@ -138539,22 +138678,22 @@ V/V
AVX2
-EVEX.NDS.128.66.0F.WIG DE /r
+EVEX.128.66.0F.WIG DE /r
VPMAXUB xmm1{k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F.WIG DE /r
+EVEX.256.66.0F.WIG DE /r
VPMAXUB ymm1{k1}{z}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F.WIG DE /r
+EVEX.512.66.0F.WIG DE /r
VPMAXUB zmm1{k1}{z}, zmm2,
zmm3/m512
-EVEX.NDS.128.66.0F38.WIG 3E /r
+EVEX.128.66.0F38.WIG 3E /r
VPMAXUW xmm1{k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F38.WIG 3E /r
+EVEX.256.66.0F38.WIG 3E /r
VPMAXUW ymm1{k1}{z}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F38.WIG 3E /r
+EVEX.512.66.0F38.WIG 3E /r
VPMAXUW zmm1{k1}{z}, zmm2,
zmm3/m512
NOTES:
@@ -138903,7 +139042,7 @@ SSE4_1
66 0F 38 3F /r
PMAXUD xmm1, xmm2/m128
-VEX.NDS.128.66.0F38.WIG 3F /r
+VEX.128.66.0F38.WIG 3F /r
VPMAXUD xmm1, xmm2, xmm3/m128
B
@@ -138912,7 +139051,7 @@ V/V
AVX
-VEX.NDS.256.66.0F38.WIG 3F /r
+VEX.256.66.0F38.WIG 3F /r
VPMAXUD ymm1, ymm2, ymm3/m256
B
@@ -138921,22 +139060,22 @@ V/V
AVX2
-EVEX.NDS.128.66.0F38.W0 3F /r
+EVEX.128.66.0F38.W0 3F /r
VPMAXUD xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.66.0F38.W0 3F /r
+EVEX.256.66.0F38.W0 3F /r
VPMAXUD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 3F /r
+EVEX.512.66.0F38.W0 3F /r
VPMAXUD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
-EVEX.NDS.128.66.0F38.W1 3F /r
+EVEX.128.66.0F38.W1 3F /r
VPMAXUQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F38.W1 3F /r
+EVEX.256.66.0F38.W1 3F /r
VPMAXUQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 3F /r
+EVEX.512.66.0F38.W1 3F /r
VPMAXUQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -139242,7 +139381,7 @@ SSE4_1
66 0F EA /r
PMINSW xmm1, xmm2/m128
-VEX.NDS.128.66.0F38 38 /r
+VEX.128.66.0F38 38 /r
VPMINSB xmm1, xmm2, xmm3/m128
A
@@ -139257,7 +139396,7 @@ V/V
AVX
-VEX.NDS.128.66.0F EA /r
+VEX.128.66.0F EA /r
VPMINSW xmm1, xmm2, xmm3/m128
B
@@ -139266,7 +139405,7 @@ V/V
AVX
-VEX.NDS.256.66.0F38 38 /r
+VEX.256.66.0F38 38 /r
VPMINSB ymm1, ymm2, ymm3/m256
B
@@ -139275,7 +139414,7 @@ V/V
AVX2
-VEX.NDS.256.66.0F EA /r
+VEX.256.66.0F EA /r
VPMINSW ymm1, ymm2, ymm3/m256
B
@@ -139284,22 +139423,22 @@ V/V
AVX2
-EVEX.NDS.128.66.0F38.WIG 38 /r
+EVEX.128.66.0F38.WIG 38 /r
VPMINSB xmm1{k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F38.WIG 38 /r
+EVEX.256.66.0F38.WIG 38 /r
VPMINSB ymm1{k1}{z}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F38.WIG 38 /r
+EVEX.512.66.0F38.WIG 38 /r
VPMINSB zmm1{k1}{z}, zmm2,
zmm3/m512
-EVEX.NDS.128.66.0F.WIG EA /r
+EVEX.128.66.0F.WIG EA /r
VPMINSW xmm1{k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F.WIG EA /r
+EVEX.256.66.0F.WIG EA /r
VPMINSW ymm1{k1}{z}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F.WIG EA /r
+EVEX.512.66.0F.WIG EA /r
VPMINSW zmm1{k1}{z}, zmm2,
zmm3/m512
NOTES:
@@ -139650,7 +139789,7 @@ SSE4_1
66 0F 38 39 /r
PMINSD xmm1, xmm2/m128
-VEX.NDS.128.66.0F38.WIG 39 /r
+VEX.128.66.0F38.WIG 39 /r
VPMINSD xmm1, xmm2, xmm3/m128
B
@@ -139659,7 +139798,7 @@ V/V
AVX
-VEX.NDS.256.66.0F38.WIG 39 /r
+VEX.256.66.0F38.WIG 39 /r
VPMINSD ymm1, ymm2, ymm3/m256
B
@@ -139668,22 +139807,22 @@ V/V
AVX2
-EVEX.NDS.128.66.0F38.W0 39 /r
+EVEX.128.66.0F38.W0 39 /r
VPMINSD xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.66.0F38.W0 39 /r
+EVEX.256.66.0F38.W0 39 /r
VPMINSD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 39 /r
+EVEX.512.66.0F38.W0 39 /r
VPMINSD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
-EVEX.NDS.128.66.0F38.W1 39 /r
+EVEX.128.66.0F38.W1 39 /r
VPMINSQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F38.W1 39 /r
+EVEX.256.66.0F38.W1 39 /r
VPMINSQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 39 /r
+EVEX.512.66.0F38.W1 39 /r
VPMINSQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -140001,7 +140140,7 @@ V/V
SSE4_1
-VEX.NDS.128.66.0F DA /r
+VEX.128.66.0F DA /r
VPMINUB xmm1, xmm2, xmm3/m128
B
@@ -140010,7 +140149,7 @@ V/V
AVX
-VEX.NDS.128.66.0F38 3A/r
+VEX.128.66.0F38 3A/r
VPMINUW xmm1, xmm2, xmm3/m128
B
@@ -140019,7 +140158,7 @@ V/V
AVX
-VEX.NDS.256.66.0F DA /r
+VEX.256.66.0F DA /r
VPMINUB ymm1, ymm2, ymm3/m256
B
@@ -140028,7 +140167,7 @@ V/V
AVX2
-VEX.NDS.256.66.0F38 3A/r
+VEX.256.66.0F38 3A/r
VPMINUW ymm1, ymm2, ymm3/m256
B
@@ -140037,22 +140176,22 @@ V/V
AVX2
-EVEX.NDS.128.66.0F DA /r
+EVEX.128.66.0F DA /r
VPMINUB xmm1 {k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F DA /r
+EVEX.256.66.0F DA /r
VPMINUB ymm1 {k1}{z}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F DA /r
+EVEX.512.66.0F DA /r
VPMINUB zmm1 {k1}{z}, zmm2,
zmm3/m512
-EVEX.NDS.128.66.0F38 3A/r
+EVEX.128.66.0F38 3A/r
VPMINUW xmm1{k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F38 3A/r
+EVEX.256.66.0F38 3A/r
VPMINUW ymm1{k1}{z}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F38 3A/r
+EVEX.512.66.0F38 3A/r
VPMINUW zmm1{k1}{z}, zmm2,
zmm3/m512
NOTES:
@@ -140401,28 +140540,28 @@ SSE4_1
66 0F 38 3B /r
PMINUD xmm1, xmm2/m128
-VEX.NDS.128.66.0F38.WIG 3B /r
+VEX.128.66.0F38.WIG 3B /r
VPMINUD xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.66.0F38.WIG 3B /r
+VEX.256.66.0F38.WIG 3B /r
VPMINUD ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F38.W0 3B /r
+EVEX.128.66.0F38.W0 3B /r
VPMINUD xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.66.0F38.W0 3B /r
+EVEX.256.66.0F38.W0 3B /r
VPMINUD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 3B /r
+EVEX.512.66.0F38.W0 3B /r
VPMINUD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
-EVEX.NDS.128.66.0F38.W1 3B /r
+EVEX.128.66.0F38.W1 3B /r
VPMINUQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F38.W1 3B /r
+EVEX.256.66.0F38.W1 3B /r
VPMINUQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 3B /r
+EVEX.512.66.0F38.W1 3B /r
VPMINUQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -142697,13 +142836,13 @@ SSE4_1
66 0F 38 28 /r
PMULDQ xmm1, xmm2/m128
-VEX.NDS.128.66.0F38.WIG 28 /r
+VEX.128.66.0F38.WIG 28 /r
VPMULDQ xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.66.0F38.WIG 28 /r
+VEX.256.66.0F38.WIG 28 /r
VPMULDQ ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F38.W1 28 /r
+EVEX.128.66.0F38.W1 28 /r
VPMULDQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
@@ -142726,7 +142865,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F38.W1 28 /r
+EVEX.256.66.0F38.W1 28 /r
VPMULDQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
@@ -142737,7 +142876,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F38.W1 28 /r
+EVEX.512.66.0F38.W1 28 /r
VPMULDQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -142969,7 +143108,7 @@ Multiply 16-bit signed words, scale and round
signed doublewords, pack high 16 bits to
ymm1.
-EVEX.NDS.128.66.0F38.WIG 0B /r
+EVEX.128.66.0F38.WIG 0B /r
VPMULHRSW xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -142980,7 +143119,7 @@ AVX512VL Multiply 16-bit signed words, scale and round
AVX512BW signed doublewords, pack high 16 bits to
xmm1 under writemask k1.
-EVEX.NDS.256.66.0F38.WIG 0B /r
+EVEX.256.66.0F38.WIG 0B /r
VPMULHRSW ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -142991,7 +143130,7 @@ AVX512VL Multiply 16-bit signed words, scale and round
AVX512BW signed doublewords, pack high 16 bits to
ymm1 under writemask k1.
-EVEX.NDS.512.66.0F38.WIG 0B /r
+EVEX.512.66.0F38.WIG 0B /r
VPMULHRSW zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -143005,9 +143144,9 @@ zmm1 under writemask k1.
PMULHRSW mm1, mm2/m64
66 0F 38 0B /r
PMULHRSW xmm1, xmm2/m128
-VEX.NDS.128.66.0F38.WIG 0B /r
+VEX.128.66.0F38.WIG 0B /r
VPMULHRSW xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F38.WIG 0B /r
+VEX.256.66.0F38.WIG 0B /r
VPMULHRSW ymm1, ymm2, ymm3/m256
NOTES:
@@ -143292,7 +143431,7 @@ Multiply the packed unsigned word integers in
ymm2 and ymm3/m256, and store the high
16 bits of the results in ymm1.
-EVEX.NDS.128.66.0F.WIG E4 /r
+EVEX.128.66.0F.WIG E4 /r
VPMULHUW xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -143304,7 +143443,7 @@ AVX512BW xmm2 and xmm3/m128, and store the high
16 bits of the results in xmm1 under
writemask k1.
-EVEX.NDS.256.66.0F.WIG E4 /r
+EVEX.256.66.0F.WIG E4 /r
VPMULHUW ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -143316,7 +143455,7 @@ AVX512BW ymm2 and ymm3/m256, and store the high
16 bits of the results in ymm1 under
writemask k1.
-EVEX.NDS.512.66.0F.WIG E4 /r
+EVEX.512.66.0F.WIG E4 /r
VPMULHUW zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -143331,9 +143470,9 @@ k1.
PMULHUW mm1, mm2/m64
66 0F E4 /r
PMULHUW xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG E4 /r
+VEX.128.66.0F.WIG E4 /r
VPMULHUW xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG E4 /r
+VEX.256.66.0F.WIG E4 /r
VPMULHUW ymm1, ymm2, ymm3/m256
NOTES:
@@ -143646,7 +143785,7 @@ Multiply the packed signed word integers in
ymm2 and ymm3/m256, and store the high 16
bits of the results in ymm1.
-EVEX.NDS.128.66.0F.WIG E5 /r
+EVEX.128.66.0F.WIG E5 /r
VPMULHW xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -143660,7 +143799,7 @@ Multiply the packed signed word integers in
xmm2 and xmm3/m128, and store the high 16
bits of the results in xmm1 under writemask k1.
-EVEX.NDS.256.66.0F.WIG E5 /r
+EVEX.256.66.0F.WIG E5 /r
VPMULHW ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -143674,7 +143813,7 @@ Multiply the packed signed word integers in
ymm2 and ymm3/m256, and store the high 16
bits of the results in ymm1 under writemask k1.
-EVEX.NDS.512.66.0F.WIG E5 /r
+EVEX.512.66.0F.WIG E5 /r
VPMULHW zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -143690,9 +143829,9 @@ bits of the results in zmm1 under writemask k1.
PMULHW mm, mm/m64
66 0F E5 /r
PMULHW xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG E5 /r
+VEX.128.66.0F.WIG E5 /r
VPMULHW xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG E5 /r
+VEX.256.66.0F.WIG E5 /r
VPMULHW ymm1, ymm2, ymm3/m256
NOTES:
@@ -143935,28 +144074,28 @@ SSE4_1
66 0F 38 40 /r
PMULLD xmm1, xmm2/m128
-VEX.NDS.128.66.0F38.WIG 40 /r
+VEX.128.66.0F38.WIG 40 /r
VPMULLD xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.66.0F38.WIG 40 /r
+VEX.256.66.0F38.WIG 40 /r
VPMULLD ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F38.W0 40 /r
+EVEX.128.66.0F38.W0 40 /r
VPMULLD xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.66.0F38.W0 40 /r
+EVEX.256.66.0F38.W0 40 /r
VPMULLD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 40 /r
+EVEX.512.66.0F38.W0 40 /r
VPMULLD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
-EVEX.NDS.128.66.0F38.W1 40 /r
+EVEX.128.66.0F38.W1 40 /r
VPMULLQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F38.W1 40 /r
+EVEX.256.66.0F38.W1 40 /r
VPMULLQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 40 /r
+EVEX.512.66.0F38.W1 40 /r
VPMULLQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -144295,7 +144434,7 @@ Multiply the packed signed word integers in
ymm2 and ymm3/m256, and store the low 16
bits of the results in ymm1.
-EVEX.NDS.128.66.0F.WIG D5 /r
+EVEX.128.66.0F.WIG D5 /r
VPMULLW xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -144306,7 +144445,7 @@ AVX512VL Multiply the packed signed word integers in
AVX512BW xmm2 and xmm3/m128, and store the low 16
bits of the results in xmm1 under writemask k1.
-EVEX.NDS.256.66.0F.WIG D5 /r
+EVEX.256.66.0F.WIG D5 /r
VPMULLW ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -144317,7 +144456,7 @@ AVX512VL Multiply the packed signed word integers in
AVX512BW ymm2 and ymm3/m256, and store the low 16
bits of the results in ymm1 under writemask k1.
-EVEX.NDS.512.66.0F.WIG D5 /r
+EVEX.512.66.0F.WIG D5 /r
VPMULLW zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -144331,9 +144470,9 @@ bits of the results in zmm1 under writemask k1.
PMULLW mm, mm/m64
66 0F D5 /r
PMULLW xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG D5 /r
+VEX.128.66.0F.WIG D5 /r
VPMULLW xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG D5 /r
+VEX.256.66.0F.WIG D5 /r
VPMULLW ymm1, ymm2, ymm3/m256
NOTES:
@@ -144620,7 +144759,7 @@ ymm2 by packed unsigned doubleword integers
in ymm3/m256, and store the quadword results
in ymm1.
-EVEX.NDS.128.66.0F.W1 F4 /r
+EVEX.128.66.0F.W1 F4 /r
VPMULUDQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
@@ -144636,7 +144775,7 @@ xmm2 by packed unsigned doubleword integers
in xmm3/m128/m64bcst, and store the
quadword results in xmm1 under writemask k1.
-EVEX.NDS.256.66.0F.W1 F4 /r
+EVEX.256.66.0F.W1 F4 /r
VPMULUDQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
@@ -144652,7 +144791,7 @@ ymm2 by packed unsigned doubleword integers
in ymm3/m256/m64bcst, and store the
quadword results in ymm1 under writemask k1.
-EVEX.NDS.512.66.0F.W1 F4 /r
+EVEX.512.66.0F.W1 F4 /r
VPMULUDQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -144671,10 +144810,10 @@ PMULUDQ mm1, mm2/m64
66 0F F4 /r
PMULUDQ xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG F4 /r
+VEX.128.66.0F.WIG F4 /r
VPMULUDQ xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG F4 /r
+VEX.256.66.0F.WIG F4 /r
VPMULUDQ ymm1, ymm2, ymm3/m256
NOTES:
@@ -146863,7 +147002,7 @@ AVX2
Bitwise OR of ymm2/m256 and ymm3.
-EVEX.NDS.128.66.0F.W0 EB /r
+EVEX.128.66.0F.W0 EB /r
VPORD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst
C
@@ -146877,7 +147016,7 @@ Bitwise OR of packed doubleword integers in
xmm2 and xmm3/m128/m32bcst using
writemask k1.
-EVEX.NDS.256.66.0F.W0 EB /r
+EVEX.256.66.0F.W0 EB /r
VPORD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst
C
@@ -146891,7 +147030,7 @@ Bitwise OR of packed doubleword integers in
ymm2 and ymm3/m256/m32bcst using
writemask k1.
-EVEX.NDS.512.66.0F.W0 EB /r
+EVEX.512.66.0F.W0 EB /r
VPORD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst
C
@@ -146904,7 +147043,7 @@ Bitwise OR of packed doubleword integers in
zmm2 and zmm3/m512/m32bcst using
writemask k1.
-EVEX.NDS.128.66.0F.W1 EB /r
+EVEX.128.66.0F.W1 EB /r
VPORQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst
C
@@ -146918,7 +147057,7 @@ Bitwise OR of packed quadword integers in
xmm2 and xmm3/m128/m64bcst using
writemask k1.
-EVEX.NDS.256.66.0F.W1 EB /r
+EVEX.256.66.0F.W1 EB /r
VPORQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst
C
@@ -146932,7 +147071,7 @@ Bitwise OR of packed quadword integers in
ymm2 and ymm3/m256/m64bcst using
writemask k1.
-EVEX.NDS.512.66.0F.W1 EB /r
+EVEX.512.66.0F.W1 EB /r
VPORQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst
C
@@ -146948,9 +147087,9 @@ writemask k1.
POR mm, mm/m64
66 0F EB /r
POR xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG EB /r
+VEX.128.66.0F.WIG EB /r
VPOR xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG EB /r
+VEX.256.66.0F.WIG EB /r
VPOR ymm1, ymm2, ymm3/m256
NOTES:
@@ -147287,7 +147426,7 @@ V/V
CPUID
Feature
Flag
-PRFCHW
+PREFETCHW
Description
@@ -147341,7 +147480,7 @@ Operation
FETCH_WITH_EXCLUSIVE_OWNERSHIP (m8);
Flags Affected
-All flags are affected
+All flags are affected.
C/C++ Compiler Intrinsic Equivalent
void _m_prefetchw( void * );
@@ -147446,7 +147585,7 @@ packed unsigned byte integers from ymm3
differences are summed separately to produce
four unsigned word integer results.
-EVEX.NDS.128.66.0F.WIG F6 /r
+EVEX.128.66.0F.WIG F6 /r
VPSADBW xmm1, xmm2, xmm3/m128
C
@@ -147462,7 +147601,7 @@ packed unsigned byte integers from xmm3
differences are summed separately to produce
four unsigned word integer results.
-EVEX.NDS.256.66.0F.WIG F6 /r
+EVEX.256.66.0F.WIG F6 /r
VPSADBW ymm1, ymm2, ymm3/m256
C
@@ -147478,7 +147617,7 @@ packed unsigned byte integers from ymm3
differences are summed separately to produce
four unsigned word integer results.
-EVEX.NDS.512.66.0F.WIG F6 /r
+EVEX.512.66.0F.WIG F6 /r
VPSADBW zmm1, zmm2, zmm3/m512
C
@@ -147498,10 +147637,10 @@ PSADBW mm1, mm2/m64
66 0F F6 /r
PSADBW xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG F6 /r
+VEX.128.66.0F.WIG F6 /r
VPSADBW xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG F6 /r
+VEX.256.66.0F.WIG F6 /r
VPSADBW ymm1, ymm2, ymm3/m256
NOTES:
@@ -147812,7 +147951,7 @@ AVX2
Shuffle bytes in ymm2 according to contents of
ymm3/m256.
-EVEX.NDS.128.66.0F38.WIG 00 /r
+EVEX.128.66.0F38.WIG 00 /r
VPSHUFB xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -147822,7 +147961,7 @@ V/V
AVX512VL Shuffle bytes in xmm2 according to contents of
AVX512BW xmm3/m128 under write mask k1.
-EVEX.NDS.256.66.0F38.WIG 00 /r
+EVEX.256.66.0F38.WIG 00 /r
VPSHUFB ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -147832,7 +147971,7 @@ V/V
AVX512VL Shuffle bytes in ymm2 according to contents of
AVX512BW ymm3/m256 under write mask k1.
-EVEX.NDS.512.66.0F38.WIG 00 /r
+EVEX.512.66.0F38.WIG 00 /r
VPSHUFB zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -147845,9 +147984,9 @@ zmm3/m512 under write mask k1.
PSHUFB mm1, mm2/m64
66 0F 38 00 /r
PSHUFB xmm1, xmm2/m128
-VEX.NDS.128.66.0F38.WIG 00 /r
+VEX.128.66.0F38.WIG 00 /r
VPSHUFB xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F38.WIG 00 /r
+VEX.256.66.0F38.WIG 00 /r
VPSHUFB ymm1, ymm2, ymm3/m256
NOTES:
@@ -149173,17 +149312,17 @@ NP 0F 38 0A /r1
PSIGND mm1, mm2/m64
66 0F 38 0A /r
PSIGND xmm1, xmm2/m128
-VEX.NDS.128.66.0F38.WIG 08 /r
+VEX.128.66.0F38.WIG 08 /r
VPSIGNB xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F38.WIG 09 /r
+VEX.128.66.0F38.WIG 09 /r
VPSIGNW xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F38.WIG 0A /r
+VEX.128.66.0F38.WIG 0A /r
VPSIGND xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F38.WIG 08 /r
+VEX.256.66.0F38.WIG 08 /r
VPSIGNB ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F38.WIG 09 /r
+VEX.256.66.0F38.WIG 09 /r
VPSIGNW ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F38.WIG 0A /r
+VEX.256.66.0F38.WIG 0A /r
VPSIGND ymm1, ymm2, ymm3/m256
NOTES:
@@ -149456,7 +149595,7 @@ AVX2
Shift ymm2 left by imm8 bytes while shifting
in 0s and store result in ymm1.
-EVEX.NDD.128.66.0F.WIG 73 /7 ib
+EVEX.128.66.0F.WIG 73 /7 ib
VPSLLDQ xmm1,xmm2/ m128, imm8
C
@@ -149466,7 +149605,7 @@ V/V
AVX512VL Shift xmm2/m128 left by imm8 bytes while
AVX512BW shifting in 0s and store result in xmm1.
-EVEX.NDD.256.66.0F.WIG 73 /7 ib
+EVEX.256.66.0F.WIG 73 /7 ib
VPSLLDQ ymm1, ymm2/m256, imm8
C
@@ -149476,7 +149615,7 @@ V/V
AVX512VL Shift ymm2/m256 left by imm8 bytes while
AVX512BW shifting in 0s and store result in ymm1.
-EVEX.NDD.512.66.0F.WIG 73 /7 ib
+EVEX.512.66.0F.WIG 73 /7 ib
VPSLLDQ zmm1, zmm2/m512, imm8
C
@@ -149487,9 +149626,9 @@ AVX512BW Shift zmm2/m512 left by imm8 bytes while
shifting in 0s and store result in zmm1.
PSLLDQ xmm1, imm8
-VEX.NDD.128.66.0F.WIG 73 /7 ib
+VEX.128.66.0F.WIG 73 /7 ib
VPSLLDQ xmm1, xmm2, imm8
-VEX.NDD.256.66.0F.WIG 73 /7 ib
+VEX.256.66.0F.WIG 73 /7 ib
VPSLLDQ ymm1, ymm2, imm8
Instruction Operand Encoding
@@ -149830,21 +149969,21 @@ NP 0F 73 /6 ib1
PSLLQ mm, imm8
66 0F 73 /6 ib
PSLLQ xmm1, imm8
-VEX.NDS.128.66.0F.WIG F1 /r
+VEX.128.66.0F.WIG F1 /r
VPSLLW xmm1, xmm2, xmm3/m128
-VEX.NDD.128.66.0F.WIG 71 /6 ib
+VEX.128.66.0F.WIG 71 /6 ib
VPSLLW xmm1, xmm2, imm8
-VEX.NDS.128.66.0F.WIG F2 /r
+VEX.128.66.0F.WIG F2 /r
VPSLLD xmm1, xmm2, xmm3/m128
-VEX.NDD.128.66.0F.WIG 72 /6 ib
+VEX.128.66.0F.WIG 72 /6 ib
VPSLLD xmm1, xmm2, imm8
-VEX.NDS.128.66.0F.WIG F3 /r
+VEX.128.66.0F.WIG F3 /r
VPSLLQ xmm1, xmm2, xmm3/m128
-VEX.NDD.128.66.0F.WIG 73 /6 ib
+VEX.128.66.0F.WIG 73 /6 ib
VPSLLQ xmm1, xmm2, imm8
-VEX.NDS.256.66.0F.WIG F1 /r
+VEX.256.66.0F.WIG F1 /r
VPSLLW ymm1, ymm2, xmm3/m128
-VEX.NDD.256.66.0F.WIG 71 /6 ib
+VEX.256.66.0F.WIG 71 /6 ib
VPSLLW ymm1, ymm2, imm8
PSLLW/PSLLD/PSLLQ—Shift Packed Data Left Logical
@@ -149853,7 +149992,7 @@ Vol. 2B 4-429
INSTRUCTION SET REFERENCE, M-U
-VEX.NDS.256.66.0F.WIG F2 /r
+VEX.256.66.0F.WIG F2 /r
C
@@ -149891,7 +150030,7 @@ AVX2
Shift quadwords in ymm2 left by imm8 while
shifting in 0s.
-EVEX.NDS.128.66.0F.WIG F1 /r
+EVEX.128.66.0F.WIG F1 /r
VPSLLW xmm1 {k1}{z}, xmm2, xmm3/m128
G
@@ -149902,7 +150041,7 @@ AVX512VL Shift words in xmm2 left by amount specified in
AVX512BW xmm3/m128 while shifting in 0s using
writemask k1.
-EVEX.NDS.256.66.0F.WIG F1 /r
+EVEX.256.66.0F.WIG F1 /r
VPSLLW ymm1 {k1}{z}, ymm2, xmm3/m128
G
@@ -149913,7 +150052,7 @@ AVX512VL Shift words in ymm2 left by amount specified in
AVX512BW xmm3/m128 while shifting in 0s using
writemask k1.
-EVEX.NDS.512.66.0F.WIG F1 /r
+EVEX.512.66.0F.WIG F1 /r
VPSLLW zmm1 {k1}{z}, zmm2, xmm3/m128
G
@@ -149924,7 +150063,7 @@ AVX512BW Shift words in zmm2 left by amount specified in
xmm3/m128 while shifting in 0s using
writemask k1.
-EVEX.NDD.128.66.0F.WIG 71 /6 ib
+EVEX.128.66.0F.WIG 71 /6 ib
VPSLLW xmm1 {k1}{z}, xmm2/m128, imm8
E
@@ -149934,7 +150073,7 @@ V/V
AVX512VL Shift words in xmm2/m128 left by imm8 while
AVX512BW shifting in 0s using writemask k1.
-EVEX.NDD.256.66.0F.WIG 71 /6 ib
+EVEX.256.66.0F.WIG 71 /6 ib
VPSLLW ymm1 {k1}{z}, ymm2/m256, imm8
E
@@ -149944,7 +150083,7 @@ V/V
AVX512VL Shift words in ymm2/m256 left by imm8 while
AVX512BW shifting in 0s using writemask k1.
-EVEX.NDD.512.66.0F.WIG 71 /6 ib
+EVEX.512.66.0F.WIG 71 /6 ib
VPSLLW zmm1 {k1}{z}, zmm2/m512, imm8
E
@@ -149954,7 +150093,7 @@ V/V
AVX512BW Shift words in zmm2/m512 left by imm8 while
shifting in 0 using writemask k1.
-EVEX.NDS.128.66.0F.W0 F2 /r
+EVEX.128.66.0F.W0 F2 /r
VPSLLD xmm1 {k1}{z}, xmm2, xmm3/m128
G
@@ -149968,7 +150107,7 @@ Shift doublewords in xmm2 left by amount
specified in xmm3/m128 while shifting in 0s
under writemask k1.
-EVEX.NDS.256.66.0F.W0 F2 /r
+EVEX.256.66.0F.W0 F2 /r
VPSLLD ymm1 {k1}{z}, ymm2, xmm3/m128
G
@@ -149982,7 +150121,7 @@ Shift doublewords in ymm2 left by amount
specified in xmm3/m128 while shifting in 0s
under writemask k1.
-EVEX.NDS.512.66.0F.W0 F2 /r
+EVEX.512.66.0F.W0 F2 /r
VPSLLD zmm1 {k1}{z}, zmm2, xmm3/m128
G
@@ -149995,7 +150134,7 @@ Shift doublewords in zmm2 left by amount
specified in xmm3/m128 while shifting in 0s
under writemask k1.
-EVEX.NDD.128.66.0F.W0 72 /6 ib
+EVEX.128.66.0F.W0 72 /6 ib
VPSLLD xmm1 {k1}{z}, xmm2/m128/m32bcst,
imm8
@@ -150009,7 +150148,7 @@ AVX512F
Shift doublewords in xmm2/m128/m32bcst left
by imm8 while shifting in 0s using writemask k1.
-EVEX.NDD.256.66.0F.W0 72 /6 ib
+EVEX.256.66.0F.W0 72 /6 ib
VPSLLD ymm1 {k1}{z}, ymm2/m256/m32bcst,
imm8
@@ -150023,7 +150162,7 @@ AVX512F
Shift doublewords in ymm2/m256/m32bcst left
by imm8 while shifting in 0s using writemask k1.
-EVEX.NDD.512.66.0F.W0 72 /6 ib
+EVEX.512.66.0F.W0 72 /6 ib
VPSLLD zmm1 {k1}{z}, zmm2/m512/m32bcst,
imm8
@@ -150036,7 +150175,7 @@ AVX512F
Shift doublewords in zmm2/m512/m32bcst left
by imm8 while shifting in 0s using writemask k1.
-EVEX.NDS.128.66.0F.W1 F3 /r
+EVEX.128.66.0F.W1 F3 /r
VPSLLQ xmm1 {k1}{z}, xmm2, xmm3/m128
G
@@ -150050,7 +150189,7 @@ Shift quadwords in xmm2 left by amount
specified in xmm3/m128 while shifting in 0s
using writemask k1.
-EVEX.NDS.256.66.0F.W1 F3 /r
+EVEX.256.66.0F.W1 F3 /r
VPSLLQ ymm1 {k1}{z}, ymm2, xmm3/m128
G
@@ -150064,7 +150203,7 @@ Shift quadwords in ymm2 left by amount
specified in xmm3/m128 while shifting in 0s
using writemask k1.
-EVEX.NDS.512.66.0F.W1 F3 /r
+EVEX.512.66.0F.W1 F3 /r
VPSLLQ zmm1 {k1}{z}, zmm2, xmm3/m128
G
@@ -150078,11 +150217,11 @@ specified in xmm3/m128 while shifting in 0s
using writemask k1.
VPSLLD ymm1, ymm2, xmm3/m128
-VEX.NDD.256.66.0F.WIG 72 /6 ib
+VEX.256.66.0F.WIG 72 /6 ib
VPSLLD ymm1, ymm2, imm8
-VEX.NDS.256.66.0F.WIG F3 /r
+VEX.256.66.0F.WIG F3 /r
VPSLLQ ymm1, ymm2, xmm3/m128
-VEX.NDD.256.66.0F.WIG 73 /6 ib
+VEX.256.66.0F.WIG 73 /6 ib
VPSLLQ ymm1, ymm2, imm8
4-430 Vol. 2B
@@ -150091,7 +150230,7 @@ PSLLW/PSLLD/PSLLQ—Shift Packed Data Left Logical
INSTRUCTION SET REFERENCE, M-U
-EVEX.NDD.128.66.0F.W1 73 /6 ib
+EVEX.128.66.0F.W1 73 /6 ib
VPSLLQ xmm1 {k1}{z}, xmm2/m128/m64bcst,
imm8
@@ -150105,7 +150244,7 @@ AVX512F
Shift quadwords in xmm2/m128/m64bcst left
by imm8 while shifting in 0s using writemask k1.
-EVEX.NDD.256.66.0F.W1 73 /6 ib
+EVEX.256.66.0F.W1 73 /6 ib
VPSLLQ ymm1 {k1}{z}, ymm2/m256/m64bcst,
imm8
@@ -150119,7 +150258,7 @@ AVX512F
Shift quadwords in ymm2/m256/m64bcst left
by imm8 while shifting in 0s using writemask k1.
-EVEX.NDD.512.66.0F.W1 73 /6 ib
+EVEX.512.66.0F.W1 73 /6 ib
VPSLLQ zmm1 {k1}{z}, zmm2/m512/m64bcst,
imm8
@@ -150903,7 +151042,7 @@ AVX2
Shift doublewords in ymm2 right by imm8 while
shifting in sign bits.
-EVEX.NDS.128.66.0F.WIG E1 /r
+EVEX.128.66.0F.WIG E1 /r
VPSRAW xmm1 {k1}{z}, xmm2, xmm3/m128
G
@@ -150914,7 +151053,7 @@ AVX512VL Shift words in xmm2 right by amount specified in
AVX512BW xmm3/m128 while shifting in sign bits using
writemask k1.
-EVEX.NDS.256.66.0F.WIG E1 /r
+EVEX.256.66.0F.WIG E1 /r
VPSRAW ymm1 {k1}{z}, ymm2, xmm3/m128
G
@@ -150925,7 +151064,7 @@ AVX512VL Shift words in ymm2 right by amount specified in
AVX512BW xmm3/m128 while shifting in sign bits using
writemask k1.
-EVEX.NDS.512.66.0F.WIG E1 /r
+EVEX.512.66.0F.WIG E1 /r
VPSRAW zmm1 {k1}{z}, zmm2, xmm3/m128
G
@@ -150951,21 +151090,21 @@ NP 0F 72 /4 ib1
PSRAD mm, imm8
66 0F 72 /4 ib
PSRAD xmm1, imm8
-VEX.NDS.128.66.0F.WIG E1 /r
+VEX.128.66.0F.WIG E1 /r
VPSRAW xmm1, xmm2, xmm3/m128
-VEX.NDD.128.66.0F.WIG 71 /4 ib
+VEX.128.66.0F.WIG 71 /4 ib
VPSRAW xmm1, xmm2, imm8
-VEX.NDS.128.66.0F.WIG E2 /r
+VEX.128.66.0F.WIG E2 /r
VPSRAD xmm1, xmm2, xmm3/m128
-VEX.NDD.128.66.0F.WIG 72 /4 ib
+VEX.128.66.0F.WIG 72 /4 ib
VPSRAD xmm1, xmm2, imm8
-VEX.NDS.256.66.0F.WIG E1 /r
+VEX.256.66.0F.WIG E1 /r
VPSRAW ymm1, ymm2, xmm3/m128
-VEX.NDD.256.66.0F.WIG 71 /4 ib
+VEX.256.66.0F.WIG 71 /4 ib
VPSRAW ymm1, ymm2, imm8
-VEX.NDS.256.66.0F.WIG E2 /r
+VEX.256.66.0F.WIG E2 /r
VPSRAD ymm1, ymm2, xmm3/m128
-VEX.NDD.256.66.0F.WIG 72 /4 ib
+VEX.256.66.0F.WIG 72 /4 ib
VPSRAD ymm1, ymm2, imm8
PSRAW/PSRAD/PSRAQ—Shift Packed Data Right Arithmetic
@@ -150974,7 +151113,7 @@ Vol. 2B 4-441
INSTRUCTION SET REFERENCE, M-U
-EVEX.NDD.128.66.0F.WIG 71 /4 ib
+EVEX.128.66.0F.WIG 71 /4 ib
VPSRAW xmm1 {k1}{z}, xmm2/m128, imm8
E
@@ -150984,7 +151123,7 @@ V/V
AVX512VL Shift words in xmm2/m128 right by imm8 while
AVX512BW shifting in sign bits using writemask k1.
-EVEX.NDD.256.66.0F.WIG 71 /4 ib
+EVEX.256.66.0F.WIG 71 /4 ib
VPSRAW ymm1 {k1}{z}, ymm2/m256, imm8
E
@@ -150994,7 +151133,7 @@ V/V
AVX512VL Shift words in ymm2/m256 right by imm8 while
AVX512BW shifting in sign bits using writemask k1.
-EVEX.NDD.512.66.0F.WIG 71 /4 ib
+EVEX.512.66.0F.WIG 71 /4 ib
VPSRAW zmm1 {k1}{z}, zmm2/m512, imm8
E
@@ -151004,7 +151143,7 @@ V/V
AVX512BW Shift words in zmm2/m512 right by imm8 while
shifting in sign bits using writemask k1.
-EVEX.NDS.128.66.0F.W0 E2 /r
+EVEX.128.66.0F.W0 E2 /r
VPSRAD xmm1 {k1}{z}, xmm2, xmm3/m128
G
@@ -151018,7 +151157,7 @@ Shift doublewords in xmm2 right by amount
specified in xmm3/m128 while shifting in sign bits
using writemask k1.
-EVEX.NDS.256.66.0F.W0 E2 /r
+EVEX.256.66.0F.W0 E2 /r
VPSRAD ymm1 {k1}{z}, ymm2, xmm3/m128
G
@@ -151032,7 +151171,7 @@ Shift doublewords in ymm2 right by amount
specified in xmm3/m128 while shifting in sign bits
using writemask k1.
-EVEX.NDS.512.66.0F.W0 E2 /r
+EVEX.512.66.0F.W0 E2 /r
VPSRAD zmm1 {k1}{z}, zmm2, xmm3/m128
G
@@ -151045,7 +151184,7 @@ Shift doublewords in zmm2 right by amount
specified in xmm3/m128 while shifting in sign bits
using writemask k1.
-EVEX.NDD.128.66.0F.W0 72 /4 ib
+EVEX.128.66.0F.W0 72 /4 ib
VPSRAD xmm1 {k1}{z}, xmm2/m128/m32bcst,
imm8
@@ -151060,7 +151199,7 @@ Shift doublewords in xmm2/m128/m32bcst right
by imm8 while shifting in sign bits using
writemask k1.
-EVEX.NDD.256.66.0F.W0 72 /4 ib
+EVEX.256.66.0F.W0 72 /4 ib
VPSRAD ymm1 {k1}{z}, ymm2/m256/m32bcst,
imm8
@@ -151075,7 +151214,7 @@ Shift doublewords in ymm2/m256/m32bcst right
by imm8 while shifting in sign bits using
writemask k1.
-EVEX.NDD.512.66.0F.W0 72 /4 ib
+EVEX.512.66.0F.W0 72 /4 ib
VPSRAD zmm1 {k1}{z}, zmm2/m512/m32bcst,
imm8
@@ -151089,7 +151228,7 @@ Shift doublewords in zmm2/m512/m32bcst right
by imm8 while shifting in sign bits using
writemask k1.
-EVEX.NDS.128.66.0F.W1 E2 /r
+EVEX.128.66.0F.W1 E2 /r
VPSRAQ xmm1 {k1}{z}, xmm2, xmm3/m128
G
@@ -151103,7 +151242,7 @@ Shift quadwords in xmm2 right by amount
specified in xmm3/m128 while shifting in sign bits
using writemask k1.
-EVEX.NDS.256.66.0F.W1 E2 /r
+EVEX.256.66.0F.W1 E2 /r
VPSRAQ ymm1 {k1}{z}, ymm2, xmm3/m128
G
@@ -151117,7 +151256,7 @@ Shift quadwords in ymm2 right by amount
specified in xmm3/m128 while shifting in sign bits
using writemask k1.
-EVEX.NDS.512.66.0F.W1 E2 /r
+EVEX.512.66.0F.W1 E2 /r
VPSRAQ zmm1 {k1}{z}, zmm2, xmm3/m128
G
@@ -151130,7 +151269,7 @@ Shift quadwords in zmm2 right by amount
specified in xmm3/m128 while shifting in sign bits
using writemask k1.
-EVEX.NDD.128.66.0F.W1 72 /4 ib
+EVEX.128.66.0F.W1 72 /4 ib
VPSRAQ xmm1 {k1}{z}, xmm2/m128/m64bcst,
imm8
@@ -151145,7 +151284,7 @@ Shift quadwords in xmm2/m128/m64bcst right by
imm8 while shifting in sign bits using writemask
k1.
-EVEX.NDD.256.66.0F.W1 72 /4 ib
+EVEX.256.66.0F.W1 72 /4 ib
VPSRAQ ymm1 {k1}{z}, ymm2/m256/m64bcst,
imm8
@@ -151160,7 +151299,7 @@ Shift quadwords in ymm2/m256/m64bcst right by
imm8 while shifting in sign bits using writemask
k1.
-EVEX.NDD.512.66.0F.W1 72 /4 ib
+EVEX.512.66.0F.W1 72 /4 ib
VPSRAQ zmm1 {k1}{z}, zmm2/m512/m64bcst,
imm8
@@ -151765,7 +151904,7 @@ AVX2
Shift ymm1 right by imm8 bytes while shifting in
0s.
-EVEX.NDD.128.66.0F.WIG 73 /3 ib
+EVEX.128.66.0F.WIG 73 /3 ib
VPSRLDQ xmm1, xmm2/m128, imm8
C
@@ -151775,7 +151914,7 @@ V/V
AVX512VL Shift xmm2/m128 right by imm8 bytes while
AVX512BW shifting in 0s and store result in xmm1.
-EVEX.NDD.256.66.0F.WIG 73 /3 ib
+EVEX.256.66.0F.WIG 73 /3 ib
VPSRLDQ ymm1, ymm2/m256, imm8
C
@@ -151785,7 +151924,7 @@ V/V
AVX512VL Shift ymm2/m256 right by imm8 bytes while
AVX512BW shifting in 0s and store result in ymm1.
-EVEX.NDD.512.66.0F.WIG 73 /3 ib
+EVEX.512.66.0F.WIG 73 /3 ib
VPSRLDQ zmm1, zmm2/m512, imm8
C
@@ -151796,9 +151935,9 @@ AVX512BW Shift zmm2/m512 right by imm8 bytes while
shifting in 0s and store result in zmm1.
PSRLDQ xmm1, imm8
-VEX.NDD.128.66.0F.WIG 73 /3 ib
+VEX.128.66.0F.WIG 73 /3 ib
VPSRLDQ xmm1, xmm2, imm8
-VEX.NDD.256.66.0F.WIG 73 /3 ib
+VEX.256.66.0F.WIG 73 /3 ib
VPSRLDQ ymm1, ymm2, imm8
Instruction Operand Encoding
@@ -152141,21 +152280,21 @@ NP 0F 73 /2 ib1
PSRLQ mm, imm8
66 0F 73 /2 ib
PSRLQ xmm1, imm8
-VEX.NDS.128.66.0F.WIG D1 /r
+VEX.128.66.0F.WIG D1 /r
VPSRLW xmm1, xmm2, xmm3/m128
-VEX.NDD.128.66.0F.WIG 71 /2 ib
+VEX.128.66.0F.WIG 71 /2 ib
VPSRLW xmm1, xmm2, imm8
-VEX.NDS.128.66.0F.WIG D2 /r
+VEX.128.66.0F.WIG D2 /r
VPSRLD xmm1, xmm2, xmm3/m128
-VEX.NDD.128.66.0F.WIG 72 /2 ib
+VEX.128.66.0F.WIG 72 /2 ib
VPSRLD xmm1, xmm2, imm8
-VEX.NDS.128.66.0F.WIG D3 /r
+VEX.128.66.0F.WIG D3 /r
VPSRLQ xmm1, xmm2, xmm3/m128
-VEX.NDD.128.66.0F.WIG 73 /2 ib
+VEX.128.66.0F.WIG 73 /2 ib
VPSRLQ xmm1, xmm2, imm8
-VEX.NDS.256.66.0F.WIG D1 /r
+VEX.256.66.0F.WIG D1 /r
VPSRLW ymm1, ymm2, xmm3/m128
-VEX.NDD.256.66.0F.WIG 71 /2 ib
+VEX.256.66.0F.WIG 71 /2 ib
VPSRLW ymm1, ymm2, imm8
PSRLW/PSRLD/PSRLQ—Shift Packed Data Right Logical
@@ -152164,7 +152303,7 @@ Vol. 2B 4-453
INSTRUCTION SET REFERENCE, M-U
-VEX.NDS.256.66.0F.WIG D2 /r
+VEX.256.66.0F.WIG D2 /r
C
@@ -152202,7 +152341,7 @@ AVX2
Shift quadwords in ymm2 right by imm8 while
shifting in 0s.
-EVEX.NDS.128.66.0F.WIG D1 /r
+EVEX.128.66.0F.WIG D1 /r
VPSRLW xmm1 {k1}{z}, xmm2, xmm3/m128
G
@@ -152216,7 +152355,7 @@ Shift words in xmm2 right by amount specified
in xmm3/m128 while shifting in 0s using
writemask k1.
-EVEX.NDS.256.66.0F.WIG D1 /r
+EVEX.256.66.0F.WIG D1 /r
VPSRLW ymm1 {k1}{z}, ymm2, xmm3/m128
G
@@ -152230,7 +152369,7 @@ Shift words in ymm2 right by amount specified
in xmm3/m128 while shifting in 0s using
writemask k1.
-EVEX.NDS.512.66.0F.WIG D1 /r
+EVEX.512.66.0F.WIG D1 /r
VPSRLW zmm1 {k1}{z}, zmm2, xmm3/m128
G
@@ -152243,7 +152382,7 @@ Shift words in zmm2 right by amount specified
in xmm3/m128 while shifting in 0s using
writemask k1.
-EVEX.NDD.128.66.0F.WIG 71 /2 ib
+EVEX.128.66.0F.WIG 71 /2 ib
VPSRLW xmm1 {k1}{z}, xmm2/m128, imm8
E
@@ -152256,7 +152395,7 @@ AVX512BW
Shift words in xmm2/m128 right by imm8
while shifting in 0s using writemask k1.
-EVEX.NDD.256.66.0F.WIG 71 /2 ib
+EVEX.256.66.0F.WIG 71 /2 ib
VPSRLW ymm1 {k1}{z}, ymm2/m256, imm8
E
@@ -152269,7 +152408,7 @@ AVX512BW
Shift words in ymm2/m256 right by imm8
while shifting in 0s using writemask k1.
-EVEX.NDD.512.66.0F.WIG 71 /2 ib
+EVEX.512.66.0F.WIG 71 /2 ib
VPSRLW zmm1 {k1}{z}, zmm2/m512, imm8
E
@@ -152281,7 +152420,7 @@ AVX512BW
Shift words in zmm2/m512 right by imm8
while shifting in 0s using writemask k1.
-EVEX.NDS.128.66.0F.W0 D2 /r
+EVEX.128.66.0F.W0 D2 /r
VPSRLD xmm1 {k1}{z}, xmm2, xmm3/m128
G
@@ -152295,7 +152434,7 @@ Shift doublewords in xmm2 right by amount
specified in xmm3/m128 while shifting in 0s
using writemask k1.
-EVEX.NDS.256.66.0F.W0 D2 /r
+EVEX.256.66.0F.W0 D2 /r
VPSRLD ymm1 {k1}{z}, ymm2, xmm3/m128
G
@@ -152309,7 +152448,7 @@ Shift doublewords in ymm2 right by amount
specified in xmm3/m128 while shifting in 0s
using writemask k1.
-EVEX.NDS.512.66.0F.W0 D2 /r
+EVEX.512.66.0F.W0 D2 /r
VPSRLD zmm1 {k1}{z}, zmm2, xmm3/m128
G
@@ -152322,7 +152461,7 @@ Shift doublewords in zmm2 right by amount
specified in xmm3/m128 while shifting in 0s
using writemask k1.
-EVEX.NDD.128.66.0F.W0 72 /2 ib
+EVEX.128.66.0F.W0 72 /2 ib
VPSRLD xmm1 {k1}{z}, xmm2/m128/m32bcst,
imm8
@@ -152337,7 +152476,7 @@ Shift doublewords in xmm2/m128/m32bcst
right by imm8 while shifting in 0s using
writemask k1.
-EVEX.NDD.256.66.0F.W0 72 /2 ib
+EVEX.256.66.0F.W0 72 /2 ib
VPSRLD ymm1 {k1}{z}, ymm2/m256/m32bcst,
imm8
@@ -152352,7 +152491,7 @@ Shift doublewords in ymm2/m256/m32bcst
right by imm8 while shifting in 0s using
writemask k1.
-EVEX.NDD.512.66.0F.W0 72 /2 ib
+EVEX.512.66.0F.W0 72 /2 ib
VPSRLD zmm1 {k1}{z}, zmm2/m512/m32bcst,
imm8
@@ -152366,7 +152505,7 @@ Shift doublewords in zmm2/m512/m32bcst
right by imm8 while shifting in 0s using
writemask k1.
-EVEX.NDS.128.66.0F.W1 D3 /r
+EVEX.128.66.0F.W1 D3 /r
VPSRLQ xmm1 {k1}{z}, xmm2, xmm3/m128
G
@@ -152380,7 +152519,7 @@ Shift quadwords in xmm2 right by amount
specified in xmm3/m128 while shifting in 0s
using writemask k1.
-EVEX.NDS.256.66.0F.W1 D3 /r
+EVEX.256.66.0F.W1 D3 /r
VPSRLQ ymm1 {k1}{z}, ymm2, xmm3/m128
G
@@ -152394,7 +152533,7 @@ Shift quadwords in ymm2 right by amount
specified in xmm3/m128 while shifting in 0s
using writemask k1.
-EVEX.NDS.512.66.0F.W1 D3 /r
+EVEX.512.66.0F.W1 D3 /r
VPSRLQ zmm1 {k1}{z}, zmm2, xmm3/m128
G
@@ -152408,11 +152547,11 @@ specified in xmm3/m128 while shifting in 0s
using writemask k1.
VPSRLD ymm1, ymm2, xmm3/m128
-VEX.NDD.256.66.0F.WIG 72 /2 ib
+VEX.256.66.0F.WIG 72 /2 ib
VPSRLD ymm1, ymm2, imm8
-VEX.NDS.256.66.0F.WIG D3 /r
+VEX.256.66.0F.WIG D3 /r
VPSRLQ ymm1, ymm2, xmm3/m128
-VEX.NDD.256.66.0F.WIG 73 /2 ib
+VEX.256.66.0F.WIG 73 /2 ib
VPSRLQ ymm1, ymm2, imm8
4-454 Vol. 2B
@@ -152421,7 +152560,7 @@ PSRLW/PSRLD/PSRLQ—Shift Packed Data Right Logical
INSTRUCTION SET REFERENCE, M-U
-EVEX.NDD.128.66.0F.W1 73 /2 ib
+EVEX.128.66.0F.W1 73 /2 ib
VPSRLQ xmm1 {k1}{z}, xmm2/m128/m64bcst,
imm8
@@ -152436,7 +152575,7 @@ Shift quadwords in xmm2/m128/m64bcst
right by imm8 while shifting in 0s using
writemask k1.
-EVEX.NDD.256.66.0F.W1 73 /2 ib
+EVEX.256.66.0F.W1 73 /2 ib
VPSRLQ ymm1 {k1}{z}, ymm2/m256/m64bcst,
imm8
@@ -152451,7 +152590,7 @@ Shift quadwords in ymm2/m256/m64bcst
right by imm8 while shifting in 0s using
writemask k1.
-EVEX.NDD.512.66.0F.W1 73 /2 ib
+EVEX.512.66.0F.W1 73 /2 ib
VPSRLQ zmm1 {k1}{z}, zmm2/m512/m64bcst,
imm8
@@ -153150,7 +153289,7 @@ Subtract packed doubleword integers in
xmm2/mem128 from packed doubleword
integers in xmm1.
-VEX.NDS.128.66.0F.WIG F8 /r
+VEX.128.66.0F.WIG F8 /r
VPSUBB xmm1, xmm2, xmm3/m128
B
@@ -153162,7 +153301,7 @@ AVX
Subtract packed byte integers in xmm3/m128
from xmm2.
-VEX.NDS.128.66.0F.WIG F9 /r
+VEX.128.66.0F.WIG F9 /r
B
@@ -153173,7 +153312,7 @@ AVX
Subtract packed word integers in
xmm3/m128 from xmm2.
-VEX.NDS.128.66.0F.WIG FA /r
+VEX.128.66.0F.WIG FA /r
VPSUBD xmm1, xmm2, xmm3/m128
B
@@ -153185,7 +153324,7 @@ AVX
Subtract packed doubleword integers in
xmm3/m128 from xmm2.
-VEX.NDS.256.66.0F.WIG F8 /r
+VEX.256.66.0F.WIG F8 /r
VPSUBB ymm1, ymm2, ymm3/m256
B
@@ -153197,7 +153336,7 @@ AVX2
Subtract packed byte integers in ymm3/m256
from ymm2.
-VEX.NDS.256.66.0F.WIG F9 /r
+VEX.256.66.0F.WIG F9 /r
VPSUBW ymm1, ymm2, ymm3/m256
B
@@ -153209,7 +153348,7 @@ AVX2
Subtract packed word integers in
ymm3/m256 from ymm2.
-VEX.NDS.256.66.0F.WIG FA /r
+VEX.256.66.0F.WIG FA /r
VPSUBD ymm1, ymm2, ymm3/m256
B
@@ -153221,7 +153360,7 @@ AVX2
Subtract packed doubleword integers in
ymm3/m256 from ymm2.
-EVEX.NDS.128.66.0F.WIG F8 /r
+EVEX.128.66.0F.WIG F8 /r
VPSUBB xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -153232,7 +153371,7 @@ AVX512VL Subtract packed byte integers in xmm3/m128
AVX512BW from xmm2 and store in xmm1 using
writemask k1.
-EVEX.NDS.256.66.0F.WIG F8 /r
+EVEX.256.66.0F.WIG F8 /r
VPSUBB ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -153243,7 +153382,7 @@ AVX512VL Subtract packed byte integers in ymm3/m256
AVX512BW from ymm2 and store in ymm1 using
writemask k1.
-EVEX.NDS.512.66.0F.WIG F8 /r
+EVEX.512.66.0F.WIG F8 /r
VPSUBB zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -153254,7 +153393,7 @@ AVX512BW Subtract packed byte integers in zmm3/m512
from zmm2 and store in zmm1 using
writemask k1.
-EVEX.NDS.128.66.0F.WIG F9 /r
+EVEX.128.66.0F.WIG F9 /r
VPSUBW xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -153265,7 +153404,7 @@ AVX512VL Subtract packed word integers in
AVX512BW xmm3/m128 from xmm2 and store in xmm1
using writemask k1.
-EVEX.NDS.256.66.0F.WIG F9 /r
+EVEX.256.66.0F.WIG F9 /r
VPSUBW ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -153276,7 +153415,7 @@ AVX512VL Subtract packed word integers in
AVX512BW ymm3/m256 from ymm2 and store in ymm1
using writemask k1.
-EVEX.NDS.512.66.0F.WIG F9 /r
+EVEX.512.66.0F.WIG F9 /r
VPSUBW zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -153307,7 +153446,7 @@ Vol. 2B 4-465
INSTRUCTION SET REFERENCE, M-U
-EVEX.NDS.128.66.0F.W0 FA /r
+EVEX.128.66.0F.W0 FA /r
D
VPSUBD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst
@@ -153320,7 +153459,7 @@ Subtract packed doubleword integers in
xmm3/m128/m32bcst from xmm2 and store
in xmm1 using writemask k1.
-EVEX.NDS.256.66.0F.W0 FA /r
+EVEX.256.66.0F.W0 FA /r
D
VPSUBD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst
@@ -153333,7 +153472,7 @@ Subtract packed doubleword integers in
ymm3/m256/m32bcst from ymm2 and store
in ymm1 using writemask k1.
-EVEX.NDS.512.66.0F.W0 FA /r
+EVEX.512.66.0F.W0 FA /r
VPSUBD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst
V/V
@@ -153782,7 +153921,7 @@ AVX2
Subtract packed quadword integers in
ymm3/m256 from ymm2.
-EVEX.NDS.128.66.0F.W1 FB /r
+EVEX.128.66.0F.W1 FB /r
C
VPSUBQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst
@@ -153793,7 +153932,7 @@ AVX512F
xmm3/m128/m64bcst from xmm2 and store
in xmm1 using writemask k1.
-EVEX.NDS.256.66.0F.W1 FB /r
+EVEX.256.66.0F.W1 FB /r
C
VPSUBQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst
@@ -153804,7 +153943,7 @@ AVX512F
ymm3/m256/m64bcst from ymm2 and store
in ymm1 using writemask k1.
-EVEX.NDS.512.66.0F.W1 FB/r
+EVEX.512.66.0F.W1 FB/r
C
VPSUBQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst
@@ -153815,9 +153954,9 @@ AVX512F
PSUBQ mm1, mm2/m64
66 0F FB /r
PSUBQ xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG FB/r
+VEX.128.66.0F.WIG FB/r
VPSUBQ xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG FB /r
+VEX.256.66.0F.WIG FB /r
VPSUBQ ymm1, ymm2, ymm3/m256
Subtract packed quadword integers in
@@ -154076,7 +154215,7 @@ Subtract packed signed word integers in
ymm3/m256 from packed signed word integers
in ymm2 and saturate results.
-EVEX.NDS.128.66.0F.WIG E8 /r
+EVEX.128.66.0F.WIG E8 /r
VPSUBSB xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -154088,7 +154227,7 @@ AVX512BW xmm3/m128 from packed signed byte integers
in xmm2 and saturate results and store in
xmm1 using writemask k1.
-EVEX.NDS.256.66.0F.WIG E8 /r
+EVEX.256.66.0F.WIG E8 /r
VPSUBSB ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -154100,7 +154239,7 @@ AVX512BW ymm3/m256 from packed signed byte integers
in ymm2 and saturate results and store in
ymm1 using writemask k1.
-EVEX.NDS.512.66.0F.WIG E8 /r
+EVEX.512.66.0F.WIG E8 /r
VPSUBSB zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -154112,7 +154251,7 @@ zmm3/m512 from packed signed byte integers
in zmm2 and saturate results and store in zmm1
using writemask k1.
-EVEX.NDS.128.66.0F.WIG E9 /r
+EVEX.128.66.0F.WIG E9 /r
VPSUBSW xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -154124,7 +154263,7 @@ AVX512BW xmm3/m128 from packed signed word integers
in xmm2 and saturate results and store in
xmm1 using writemask k1.
-EVEX.NDS.256.66.0F.WIG E9 /r
+EVEX.256.66.0F.WIG E9 /r
VPSUBSW ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -154143,13 +154282,13 @@ NP 0F E9 /r1
PSUBSW mm, mm/m64
66 0F E9 /r
PSUBSW xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG E8 /r
+VEX.128.66.0F.WIG E8 /r
VPSUBSB xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG E9 /r
+VEX.128.66.0F.WIG E9 /r
VPSUBSW xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG E8 /r
+VEX.256.66.0F.WIG E8 /r
VPSUBSB ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F.WIG E9 /r
+VEX.256.66.0F.WIG E9 /r
VPSUBSW ymm1, ymm2, ymm3/m256
PSUBSB/PSUBSW—Subtract Packed Signed Integers with Signed Saturation
@@ -154158,7 +154297,7 @@ Vol. 2B 4-475
INSTRUCTION SET REFERENCE, M-U
-EVEX.NDS.512.66.0F.WIG E9 /r
+EVEX.512.66.0F.WIG E9 /r
VPSUBSW zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -154474,7 +154613,7 @@ Subtract packed unsigned word integers in
ymm3/m256 from packed unsigned word
integers in ymm2 and saturate result.
-EVEX.NDS.128.66.0F.WIG D8 /r
+EVEX.128.66.0F.WIG D8 /r
VPSUBUSB xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -154486,7 +154625,7 @@ AVX512BW xmm3/m128 from packed unsigned byte
integers in xmm2, saturate results and store
in xmm1 using writemask k1.
-EVEX.NDS.256.66.0F.WIG D8 /r
+EVEX.256.66.0F.WIG D8 /r
VPSUBUSB ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -154498,7 +154637,7 @@ AVX512BW ymm3/m256 from packed unsigned byte
integers in ymm2, saturate results and store
in ymm1 using writemask k1.
-EVEX.NDS.512.66.0F.WIG D8 /r
+EVEX.512.66.0F.WIG D8 /r
VPSUBUSB zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -154510,7 +154649,7 @@ zmm3/m512 from packed unsigned byte
integers in zmm2, saturate results and store
in zmm1 using writemask k1.
-EVEX.NDS.128.66.0F.WIG D9 /r
+EVEX.128.66.0F.WIG D9 /r
VPSUBUSW xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -154522,7 +154661,7 @@ AVX512BW xmm3/m128 from packed unsigned word
integers in xmm2 and saturate results and
store in xmm1 using writemask k1.
-EVEX.NDS.256.66.0F.WIG D9 /r
+EVEX.256.66.0F.WIG D9 /r
VPSUBUSW ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -154541,13 +154680,13 @@ NP 0F D9 /r1
PSUBUSW mm, mm/m64
66 0F D9 /r
PSUBUSW xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG D8 /r
+VEX.128.66.0F.WIG D8 /r
VPSUBUSB xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG D9 /r
+VEX.128.66.0F.WIG D9 /r
VPSUBUSW xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG D8 /r
+VEX.256.66.0F.WIG D8 /r
VPSUBUSB ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F.WIG D9 /r
+VEX.256.66.0F.WIG D9 /r
VPSUBUSW ymm1, ymm2, ymm3/m256
PSUBUSB/PSUBUSW—Subtract Packed Unsigned Integers with Unsigned Saturation
@@ -154556,7 +154695,7 @@ Vol. 2B 4-479
INSTRUCTION SET REFERENCE, M-U
-EVEX.NDS.512.66.0F.WIG D9 /r
+EVEX.512.66.0F.WIG D9 /r
VPSUBUSW zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -155205,7 +155344,7 @@ AVX
Interleave high-order doublewords from
xmm2 and xmm3/m128 into xmm1.
-VEX.NDS.128.66.0F.WIG 6D/r
+VEX.128.66.0F.WIG 6D/r
VPUNPCKHQDQ xmm1, xmm2, xmm3/m128
B
@@ -155217,7 +155356,7 @@ AVX
Interleave high-order quadword from xmm2
and xmm3/m128 into xmm1 register.
-VEX.NDS.256.66.0F.WIG 68 /r
+VEX.256.66.0F.WIG 68 /r
VPUNPCKHBW ymm1, ymm2, ymm3/m256
B
@@ -155229,7 +155368,7 @@ AVX2
Interleave high-order bytes from ymm2 and
ymm3/m256 into ymm1 register.
-VEX.NDS.256.66.0F.WIG 69 /r
+VEX.256.66.0F.WIG 69 /r
VPUNPCKHWD ymm1, ymm2, ymm3/m256
B
@@ -155241,7 +155380,7 @@ AVX2
Interleave high-order words from ymm2 and
ymm3/m256 into ymm1 register.
-VEX.NDS.256.66.0F.WIG 6A /r
+VEX.256.66.0F.WIG 6A /r
VPUNPCKHDQ ymm1, ymm2, ymm3/m256
B
@@ -155253,7 +155392,7 @@ AVX2
Interleave high-order doublewords from
ymm2 and ymm3/m256 into ymm1 register.
-VEX.NDS.256.66.0F.WIG 6D /r
+VEX.256.66.0F.WIG 6D /r
VPUNPCKHQDQ ymm1, ymm2, ymm3/m256
B
@@ -155265,7 +155404,7 @@ AVX2
Interleave high-order quadword from ymm2
and ymm3/m256 into ymm1 register.
-EVEX.NDS.128.66.0F.WIG 68 /r
+EVEX.128.66.0F.WIG 68 /r
VPUNPCKHBW xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -155276,7 +155415,7 @@ AVX512VL Interleave high-order bytes from xmm2 and
AVX512BW xmm3/m128 into xmm1 register using k1
write mask.
-EVEX.NDS.128.66.0F.WIG 69 /r
+EVEX.128.66.0F.WIG 69 /r
VPUNPCKHWD xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -155287,7 +155426,7 @@ AVX512VL Interleave high-order words from xmm2 and
AVX512BW xmm3/m128 into xmm1 register using k1
write mask.
-EVEX.NDS.128.66.0F.W0 6A /r
+EVEX.128.66.0F.W0 6A /r
VPUNPCKHDQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
@@ -155302,7 +155441,7 @@ Interleave high-order doublewords from
xmm2 and xmm3/m128/m32bcst into xmm1
register using k1 write mask.
-EVEX.NDS.128.66.0F.W1 6D /r
+EVEX.128.66.0F.W1 6D /r
VPUNPCKHQDQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
@@ -155330,11 +155469,11 @@ PUNPCKHDQ mm, mm/m64
PUNPCKHDQ xmm1, xmm2/m128
66 0F 6D /r
PUNPCKHQDQ xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 68/r
+VEX.128.66.0F.WIG 68/r
VPUNPCKHBW xmm1,xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG 69/r
+VEX.128.66.0F.WIG 69/r
VPUNPCKHWD xmm1,xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG 6A/r
+VEX.128.66.0F.WIG 6A/r
VPUNPCKHDQ xmm1, xmm2, xmm3/m128
PUNPCKHBW/PUNPCKHWD/PUNPCKHDQ/PUNPCKHQDQ— Unpack High Data
@@ -155343,7 +155482,7 @@ Vol. 2B 4-487
INSTRUCTION SET REFERENCE, M-U
-EVEX.NDS.256.66.0F.WIG 68 /r
+EVEX.256.66.0F.WIG 68 /r
VPUNPCKHBW ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -155354,7 +155493,7 @@ AVX512VL Interleave high-order bytes from ymm2 and
AVX512BW ymm3/m256 into ymm1 register using k1
write mask.
-EVEX.NDS.256.66.0F.WIG 69 /r
+EVEX.256.66.0F.WIG 69 /r
VPUNPCKHWD ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -155365,7 +155504,7 @@ AVX512VL Interleave high-order words from ymm2 and
AVX512BW ymm3/m256 into ymm1 register using k1
write mask.
-EVEX.NDS.256.66.0F.W0 6A /r
+EVEX.256.66.0F.W0 6A /r
VPUNPCKHDQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
@@ -155380,7 +155519,7 @@ Interleave high-order doublewords from
ymm2 and ymm3/m256/m32bcst into ymm1
register using k1 write mask.
-EVEX.NDS.256.66.0F.W1 6D /r
+EVEX.256.66.0F.W1 6D /r
VPUNPCKHQDQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
@@ -155395,7 +155534,7 @@ Interleave high-order quadword from ymm2
and ymm3/m256/m64bcst into ymm1
register using k1 write mask.
-EVEX.NDS.512.66.0F.WIG 68/r
+EVEX.512.66.0F.WIG 68/r
VPUNPCKHBW zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -155405,7 +155544,7 @@ V/V
AVX512BW Interleave high-order bytes from zmm2 and
zmm3/m512 into zmm1 register.
-EVEX.NDS.512.66.0F.WIG 69/r
+EVEX.512.66.0F.WIG 69/r
VPUNPCKHWD zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -155415,7 +155554,7 @@ V/V
AVX512BW Interleave high-order words from zmm2 and
zmm3/m512 into zmm1 register.
-EVEX.NDS.512.66.0F.W0 6A /r
+EVEX.512.66.0F.W0 6A /r
VPUNPCKHDQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -155429,7 +155568,7 @@ Interleave high-order doublewords from
zmm2 and zmm3/m512/m32bcst into zmm1
register using k1 write mask.
-EVEX.NDS.512.66.0F.W1 6D /r
+EVEX.512.66.0F.W1 6D /r
VPUNPCKHQDQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -156157,7 +156296,7 @@ AVX2
Interleave low-order quadword from ymm2
and ymm3/m256 into ymm1 register.
-EVEX.NDS.128.66.0F.WIG 60 /r
+EVEX.128.66.0F.WIG 60 /r
VPUNPCKLBW xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -156168,7 +156307,7 @@ AVX512VL Interleave low-order bytes from xmm2 and
AVX512BW xmm3/m128 into xmm1 register subject to
write mask k1.
-EVEX.NDS.128.66.0F.WIG 61 /r
+EVEX.128.66.0F.WIG 61 /r
VPUNPCKLWD xmm1 {k1}{z}, xmm2, xmm3/m128
C
@@ -156179,7 +156318,7 @@ AVX512VL Interleave low-order words from xmm2 and
AVX512BW xmm3/m128 into xmm1 register subject to
write mask k1.
-EVEX.NDS.128.66.0F.W0 62 /r
+EVEX.128.66.0F.W0 62 /r
VPUNPCKLDQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
@@ -156192,7 +156331,7 @@ AVX512F
and xmm3/m128/m32bcst into xmm1
register subject to write mask k1.
-EVEX.NDS.128.66.0F.W1 6C /r
+EVEX.128.66.0F.W1 6C /r
VPUNPCKLQDQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
@@ -156218,21 +156357,21 @@ PUNPCKLDQ mm, mm/m32
PUNPCKLDQ xmm1, xmm2/m128
66 0F 6C /r
PUNPCKLQDQ xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 60/r
+VEX.128.66.0F.WIG 60/r
VPUNPCKLBW xmm1,xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG 61/r
+VEX.128.66.0F.WIG 61/r
VPUNPCKLWD xmm1,xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG 62/r
+VEX.128.66.0F.WIG 62/r
VPUNPCKLDQ xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F.WIG 6C/r
+VEX.128.66.0F.WIG 6C/r
VPUNPCKLQDQ xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG 60 /r
+VEX.256.66.0F.WIG 60 /r
VPUNPCKLBW ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F.WIG 61 /r
+VEX.256.66.0F.WIG 61 /r
VPUNPCKLWD ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F.WIG 62 /r
+VEX.256.66.0F.WIG 62 /r
VPUNPCKLDQ ymm1, ymm2, ymm3/m256
-VEX.NDS.256.66.0F.WIG 6C /r
+VEX.256.66.0F.WIG 6C /r
VPUNPCKLQDQ ymm1, ymm2, ymm3/m256
PUNPCKLBW/PUNPCKLWD/PUNPCKLDQ/PUNPCKLQDQ—Unpack Low Data
@@ -156241,7 +156380,7 @@ Vol. 2B 4-497
INSTRUCTION SET REFERENCE, M-U
-EVEX.NDS.256.66.0F.WIG 60 /r
+EVEX.256.66.0F.WIG 60 /r
VPUNPCKLBW ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -156252,7 +156391,7 @@ AVX512VL Interleave low-order bytes from ymm2 and
AVX512BW ymm3/m256 into ymm1 register subject to
write mask k1.
-EVEX.NDS.256.66.0F.WIG 61 /r
+EVEX.256.66.0F.WIG 61 /r
VPUNPCKLWD ymm1 {k1}{z}, ymm2, ymm3/m256
C
@@ -156263,7 +156402,7 @@ AVX512VL Interleave low-order words from ymm2 and
AVX512BW ymm3/m256 into ymm1 register subject to
write mask k1.
-EVEX.NDS.256.66.0F.W0 62 /r
+EVEX.256.66.0F.W0 62 /r
VPUNPCKLDQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
@@ -156276,7 +156415,7 @@ AVX512F
and ymm3/m256/m32bcst into ymm1
register subject to write mask k1.
-EVEX.NDS.256.66.0F.W1 6C /r
+EVEX.256.66.0F.W1 6C /r
VPUNPCKLQDQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
@@ -156289,7 +156428,7 @@ AVX512F
and ymm3/m256/m64bcst into ymm1
register subject to write mask k1.
-EVEX.NDS.512.66.0F.WIG 60/r
+EVEX.512.66.0F.WIG 60/r
VPUNPCKLBW zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -156300,7 +156439,7 @@ AVX512BW Interleave low-order bytes from zmm2 and
zmm3/m512 into zmm1 register subject to
write mask k1.
-EVEX.NDS.512.66.0F.WIG 61/r
+EVEX.512.66.0F.WIG 61/r
VPUNPCKLWD zmm1 {k1}{z}, zmm2, zmm3/m512
C
@@ -156311,7 +156450,7 @@ AVX512BW Interleave low-order words from zmm2 and
zmm3/m512 into zmm1 register subject to
write mask k1.
-EVEX.NDS.512.66.0F.W0 62 /r
+EVEX.512.66.0F.W0 62 /r
VPUNPCKLDQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -156325,7 +156464,7 @@ Interleave low-order doublewords from zmm2
and zmm3/m512/m32bcst into zmm1
register subject to write mask k1.
-EVEX.NDS.512.66.0F.W1 6C /r
+EVEX.512.66.0F.W1 6C /r
VPUNPCKLQDQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -157769,7 +157908,7 @@ SSE2
Bitwise XOR of xmm2/m128 and xmm1.
-VEX.NDS.128.66.0F.WIG EF /r
+VEX.128.66.0F.WIG EF /r
VPXOR xmm1, xmm2, xmm3/m128
B
@@ -157780,7 +157919,7 @@ AVX
Bitwise XOR of xmm3/m128 and xmm2.
-VEX.NDS.256.66.0F.WIG EF /r
+VEX.256.66.0F.WIG EF /r
VPXOR ymm1, ymm2, ymm3/m256
B
@@ -157791,7 +157930,7 @@ AVX2
Bitwise XOR of ymm3/m256 and ymm2.
-EVEX.NDS.128.66.0F.W0 EF /r
+EVEX.128.66.0F.W0 EF /r
C
VPXORD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst
@@ -157800,7 +157939,7 @@ V/V
AVX512VL Bitwise XOR of packed doubleword integers in
AVX512F xmm2 and xmm3/m128 using writemask k1.
-EVEX.NDS.256.66.0F.W0 EF /r
+EVEX.256.66.0F.W0 EF /r
C
VPXORD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst
@@ -157809,7 +157948,7 @@ V/V
AVX512VL Bitwise XOR of packed doubleword integers in
AVX512F ymm2 and ymm3/m256 using writemask k1.
-EVEX.NDS.512.66.0F.W0 EF /r
+EVEX.512.66.0F.W0 EF /r
C
VPXORD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst
@@ -157817,7 +157956,7 @@ V/V
AVX512F
-EVEX.NDS.128.66.0F.W1 EF /r
+EVEX.128.66.0F.W1 EF /r
VPXORQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
@@ -157828,7 +157967,7 @@ V/V
AVX512VL Bitwise XOR of packed quadword integers in
AVX512F xmm2 and xmm3/m128 using writemask k1.
-EVEX.NDS.256.66.0F.W1 EF /r
+EVEX.256.66.0F.W1 EF /r
C
VPXORQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst
@@ -157837,7 +157976,7 @@ V/V
AVX512VL Bitwise XOR of packed quadword integers in
AVX512F ymm2 and ymm3/m256 using writemask k1.
-EVEX.NDS.512.66.0F.W1 EF /r
+EVEX.512.66.0F.W1 EF /r
C
VPXORQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst
@@ -159255,7 +159394,7 @@ values (bits[127:32]) from xmm2 are copied to
xmm1[127:32].
RCPSS xmm1, xmm2/m32
-VEX.NDS.LIG.F3.0F.WIG 53 /r
+VEX.LIG.F3.0F.WIG 53 /r
VRCPSS xmm1, xmm2, xmm3/m32
Instruction Operand Encoding
@@ -159878,10 +160017,9 @@ supported. The two counter types are:
General-purpose or special-purpose performance counters are specified with ECX[30] = 0: The number of
general-purpose performance counters on processor supporting architectural performance monitoring are
-reported by CPUID 0AH leaf. The number of general-purpose counters is model specific if the processor does
-not support architectural performance monitoring, see Chapter 18, “Performance Monitoring” of Intel® 64 and
-IA-32 Architectures Software Developer’s Manual, Volume 3B. Special-purpose counters are available only in
-selected processor members, see Table 4-16.
+reported by CPUID 0AH leaf. The availability of special-purpose counters, as well as the number of general-purpose counters if the processor does not support architectural performance monitoring, is model specific;
+see Chapter 18, “Performance Monitoring” of Intel® 64 and IA-32 Architectures Software Developer’s Manual,
+Volume 3B.
@@ -159889,196 +160027,9 @@ Fixed-function performance counters are specified with ECX[30] = 1. The number f
counters is enumerated by CPUID 0AH leaf. See Chapter 18, “Performance Monitoring” of Intel® 64 and IA-32
Architectures Software Developer’s Manual, Volume 3B. This counter type is selected if ECX[30] is set.
-The width of fixed-function performance counters and general-purpose performance counters on processor
+The width of fixed-function performance counters and general-purpose performance counters on processors
supporting architectural performance monitoring are reported by CPUID 0AH leaf. The width of general-purpose
performance counters are 40-bits for processors that do not support architectural performance monitoring counters. The width of special-purpose performance counters are implementation specific.
-Table 4-16 lists valid indices of the general-purpose and special-purpose performance counters according to the
-DisplayFamily_DisplayModel values of CPUID encoding for each processor family (see CPUID instruction in Chapter
-3, “Instruction Set Reference, A-L” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume
-2A).
-
-Table 4-16. Valid General and Special Purpose Performance Counter Index Range for RDPMC
-
-Processor Family | DisplayFamily_DisplayModel/Other Signatures | Valid PMC Index Range | General-purpose Counters
-P6 | 06H_01H, 06H_03H, 06H_05H, 06H_06H, 06H_07H, 06H_08H, 06H_0AH, 06H_0BH | 0, 1 | 0, 1
-Processors Based on Intel NetBurst microarchitecture (No L3) | 0FH_00H, 0FH_01H, 0FH_02H, 0FH_03H, 0FH_04H, 0FH_06H | ≥ 0 and ≤ 17 | ≥ 0 and ≤ 17
-Pentium M processors | 06H_09H, 06H_0DH | 0, 1 | 0, 1
-Processors Based on Intel NetBurst microarchitecture (No L3) | (0FH_03H, 0FH_04H) and (L3 is present) | ≥ 0 and ≤ 25 | ≥ 0 and ≤ 17
-Intel® Core™ Solo and Intel® Core™ Duo processors, Dual-core Intel® Xeon® processor LV | 06H_0EH | 0, 1 | 0, 1
-Intel® Core™2 Duo processor, Intel Xeon processor 3000, 5100, 5300, 7300 Series general-purpose PMC | 06H_0FH | 0, 1 | 0, 1
-Intel® Core™2 Duo processor family, Intel Xeon processor 3100, 3300, 5200, 5400 series - general-purpose PMC | 06H_17H | 0, 1 | 0, 1
-Intel Xeon processors 7400 series | (06H_1DH) | ≥ 0 and ≤ 9 | 0, 1
-45 nm and 32 nm Intel Atom™ processors | 06H_1CH, 06_26H, 06_27H, 06_35H, 06_36H | 0, 1 | 0, 1
-Intel® Atom™ processors based on Silvermont or Airmont microarchitectures | 06H_37H, 06_4AH, 06_4DH, 06_5AH, 06_5DH, 06_4CH | 0, 1 | 0, 1
-Next Generation Intel® Atom™ processors based on Goldmont microarchitecture | 06H_5CH, 06_5FH | 0-3 | 0-3
-Intel® processors based on the Nehalem, Westmere microarchitectures | 06H_1AH, 06H_1EH, 06H_1FH, 06_25H, 06_2CH, 06H_2EH, 06_2FH | 0-3 | 0-3
-Intel® processors based on the Sandy Bridge, Ivy Bridge microarchitecture | 06H_2AH, 06H_2DH, 06H_3AH, 06H_3EH | 0-3 (0-7 if HyperThreading is off) | 0-3 (0-7 if HyperThreading is off)
-Intel® processors based on the Haswell, Broadwell, SkyLake microarchitectures | 06H_3CH, 06H_45H, 06H_46H, 06H_3FH, 06_3DH, 06_47H, 4FH, 06_56H, 06_4EH, 06_5EH | 0-3 (0-7 if HyperThreading is off) | 0-3 (0-7 if HyperThreading is off)
-
-Processors based on Intel NetBurst microarchitecture support “fast” (32-bit) and “slow” (40-bit) reads on the first
-18 performance counters. Select this option using ECX[31]. If bit 31 is set, RDPMC reads only the low 32 bits of
-the selected performance counter. If bit 31 is clear, all 40 bits are read. A 32-bit result is returned in EAX and EDX
-is set to 0. A 32-bit read executes faster on these processors than a full 40-bit read.
-On processors based on Intel NetBurst microarchitecture with L3, performance counters with indices 18-25 are 32-bit counters. EDX is cleared after executing RDPMC for these counters.
-In Intel Core 2 processor family, Intel Xeon processor 3000, 5100, 5300 and 7400 series, the fixed-function performance counters are 40 bits wide; they can be accessed by RDPMC with ECX between 4000_0000H and
-4000_0002H.
-On Intel Xeon processor 7400 series, there are eight 32-bit special-purpose counters addressable with indices 2-9,
-ECX[30]=0.
When in protected or virtual 8086 mode, the performance-monitoring counters enabled (PCE) flag in register CR4
restricts the use of the RDPMC instruction as follows. When the PCE flag is set, the RDPMC instruction can be
executed at any privilege level; when the flag is clear, the instruction can only be executed at privilege level 0.
@@ -160090,16 +160041,16 @@ number of instructions decoded, number of interrupts received, or number of cach
the events that can be counted for various processors in the Intel 64 and IA-32 architecture families.
The RDPMC instruction is not a serializing instruction; that is, it does not imply that all the events caused by the
preceding instructions have been completed or that events caused by subsequent instructions have not begun. If
-
-4-534 Vol. 2B
+an exact event count is desired, software must insert a serializing instruction (such as the CPUID instruction)
+before and/or after the RDPMC instruction.
+Back-to-back fast reads are not guaranteed to be monotonic. To guarantee monotonicity on back-to-back reads, a serializing instruction must be placed between the two RDPMC instructions.
RDPMC—Read Performance-Monitoring Counters
+Vol. 2B 4-533
+
INSTRUCTION SET REFERENCE, M-U
-an exact event count is desired, software must insert a serializing instruction (such as the CPUID instruction)
-before and/or after the RDPMC instruction.
-Back-to-back fast reads are not guaranteed to be monotonic. To guarantee monotonicity on back-to-back reads, a serializing instruction must be placed between the two RDPMC instructions.
The RDPMC instruction can execute in 16-bit addressing mode or virtual-8086 mode; however, the full contents of
the ECX register are used to select the counter, and the event count is stored in the full EAX and EDX registers. The
RDPMC instruction was introduced into the IA-32 Architecture in the Pentium Pro processor and the Pentium
@@ -160107,67 +160058,12 @@ processor with MMX technology. The earlier Pentium processors have performance-m
must be read with the RDMSR instruction.
Operation
-(* Intel processors that support architectural performance monitoring *)
-Most significant counter bit (MSCB) = 47
-IF ((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0))
-THEN IF (ECX[30] = 1 and ECX[29:0] in valid fixed-counter range)
-EAX ← IA32_FIXED_CTR(ECX)[31:0];
-EDX ← IA32_FIXED_CTR(ECX)[MSCB:32];
-ELSE IF (ECX[30] = 0 and ECX[29:0] in valid general-purpose counter range)
-EAX ← PMC(ECX[30:0])[31:0];
-EDX ← PMC(ECX[30:0])[MSCB:32];
-ELSE (* ECX is not valid or CR4.PCE is 0 and CPL is 1, 2, or 3 and CR0.PE is 1 *)
-#GP(0);
-FI;
-(* Intel Core 2 Duo processor family and Intel Xeon processor 3000, 5100, 5300, 7400 series*)
-Most significant counter bit (MSCB) = 39
-IF ((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0))
-THEN IF (ECX[30] = 1 and ECX[29:0] in valid fixed-counter range)
-EAX ← IA32_FIXED_CTR(ECX)[31:0];
-EDX ← IA32_FIXED_CTR(ECX)[MSCB:32];
-ELSE IF (ECX[30] = 0 and ECX[29:0] in valid general-purpose counter range)
-EAX ← PMC(ECX[30:0])[31:0];
-EDX ← PMC(ECX[30:0])[MSCB:32];
-ELSE IF (ECX[30] = 0 and ECX[29:0] in valid special-purpose counter range)
-EAX ← PMC(ECX[30:0])[31:0]; (* 32-bit read *)
-ELSE (* ECX is not valid or CR4.PCE is 0 and CPL is 1, 2, or 3 and CR0.PE is 1 *)
-#GP(0);
-FI;
-(* P6 family processors and Pentium processor with MMX technology *)
-IF (ECX = 0 or 1) and ((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0))
+MSCB = Most Significant Counter Bit (* Model-specific *)
+IF (((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0)) and (ECX indicates a supported counter))
THEN
-EAX ← PMC(ECX)[31:0];
-EDX ← PMC(ECX)[39:32];
-ELSE (* ECX is not 0 or 1 or CR4.PCE is 0 and CPL is 1, 2, or 3 and CR0.PE is 1 *)
-#GP(0);
-FI;
-(* Processors based on Intel NetBurst microarchitecture *)
-IF ((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0))
-THEN IF (ECX[30:0] = 0:17)
-THEN IF ECX[31] = 0
-RDPMC—Read Performance-Monitoring Counters
-
-Vol. 2B 4-535
-
- INSTRUCTION SET REFERENCE, M-U
-
-THEN
-EAX ← PMC(ECX[30:0])[31:0]; (* 40-bit read *)
-EDX ← PMC(ECX[30:0])[39:32];
-ELSE (* ECX[31] = 1*)
-THEN
-EAX ← PMC(ECX[30:0])[31:0]; (* 32-bit read *)
-EDX ← 0;
-FI;
-ELSE IF (*64-bit Intel processor based on Intel NetBurst microarchitecture with L3 *)
-THEN IF (ECX[30:0] = 18:25 )
-EAX ← PMC(ECX[30:0])[31:0]; (* 32-bit read *)
-EDX ← 0;
-FI;
-ELSE (* Invalid PMC index in ECX[30:0], see Table 4-19. *)
-#GP(0);
-FI;
-ELSE (* CR4.PCE = 0 and (CPL = 1, 2, or 3) and CR0.PE = 1 *)
+EAX ← counter[31:0];
+EDX ← ZeroExtend(counter[MSCB:32]);
+ELSE (* ECX is not valid or CR4.PCE is 0 and CPL is 1, 2, or 3 and CR0.PE is 1 *)
#GP(0);
FI;
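The simplified Operation pseudocode above can be modeled in plain C. This is only an illustrative sketch, not Intel's definition: `rdpmc_result`, `rdpmc_model`, and all parameter names are hypothetical, the "ECX indicates a supported counter" test is passed in as a boolean because the valid index range is model-specific, and the model never executes RDPMC itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not an Intel-defined structure. */
typedef struct {
    bool     fault_gp;  /* true if the model would raise #GP(0) */
    uint32_t eax;       /* low 32 bits of the counter */
    uint32_t edx;       /* zero-extended counter[MSCB:32] */
} rdpmc_result;

/* Software model of the Operation pseudocode.
 * cr4_pce, cpl, cr0_pe : processor state named in the pseudocode
 * ecx_supported        : stands in for "ECX indicates a supported counter"
 * counter              : raw counter contents
 * mscb                 : most significant counter bit (model-specific) */
static rdpmc_result rdpmc_model(bool cr4_pce, unsigned cpl, bool cr0_pe,
                                bool ecx_supported, uint64_t counter,
                                unsigned mscb)
{
    rdpmc_result r = { true, 0, 0 };
    assert(mscb <= 63);

    /* Permission check: PCE set, or CPL 0, or real-address mode. */
    bool allowed = cr4_pce || cpl == 0 || !cr0_pe;
    if (!allowed || !ecx_supported)
        return r;  /* #GP(0) */

    /* Keep only bits MSCB:0 so that EDX is zero-extended above MSCB. */
    uint64_t mask = (mscb == 63) ? UINT64_MAX
                                 : ((UINT64_C(1) << (mscb + 1)) - 1);
    uint64_t value = counter & mask;

    r.fault_gp = false;
    r.eax = (uint32_t)value;
    r.edx = (uint32_t)(value >> 32);
    return r;
}
```

With a 48-bit counter (MSCB = 47), bits above bit 47 of the raw value never reach EDX, which is what `ZeroExtend(counter[MSCB:32])` expresses in the pseudocode.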
@@ -160178,7 +160074,7 @@ Protected Mode Exceptions
#GP(0)
If the current privilege level is not 0 and the PCE flag in the CR4 register is clear.
-If an invalid performance counter index is specified (see Table 4-16).
+If an invalid performance counter index is specified.
#UD
@@ -160187,7 +160083,7 @@ If the LOCK prefix is used.
Real-Address Mode Exceptions
#GP
-If an invalid performance counter index is specified (see Table 4-16).
+If an invalid performance counter index is specified.
#UD
@@ -160197,7 +160093,7 @@ Virtual-8086 Mode Exceptions
#GP(0)
If the PCE flag in the CR4 register is clear.
-If an invalid performance counter index is specified (see Table 4-16).
+If an invalid performance counter index is specified.
#UD
@@ -160210,11 +160106,11 @@ Same exceptions as in protected mode.
#GP(0)
If the current privilege level is not 0 and the PCE flag in the CR4 register is clear.
-If an invalid performance counter index is specified (see Table 4-16).
+If an invalid performance counter index is specified.
#UD
-4-536 Vol. 2B
+4-534 Vol. 2B
If the LOCK prefix is used.
@@ -160239,7 +160135,7 @@ Flag
Description
-0F C7 /6
+NFx 0F C7 /6
M
@@ -160269,9 +160165,9 @@ Read a 64-bit random number and store in the
destination register.
RDRAND r16
-0F C7 /6
+NFx 0F C7 /6
RDRAND r32
-REX.W + 0F C7 /6
+NFx REX.W + 0F C7 /6
RDRAND r64
Instruction Operand Encoding
@@ -160329,7 +160225,7 @@ The CF flag is set according to the result (see the “Operation” section abov
set to 0.
RDRAND—Read Random Number
-Vol. 2B 4-537
+Vol. 2B 4-535
INSTRUCTION SET REFERENCE, M-U
@@ -160350,7 +160246,6 @@ Protected Mode Exceptions
#UD
If the LOCK prefix is used.
-If the F2H or F3H prefix is used.
If CPUID.01H:ECX.RDRAND[bit 30] = 0.
Real-Address Mode Exceptions
@@ -160365,7 +160260,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-4-538 Vol. 2B
+4-536 Vol. 2B
RDRAND—Read Random Number
@@ -160392,9 +160287,9 @@ bit Mode
Support
V/V
-0F C7 /7
+NFx 0F C7 /7
RDSEED r16
-0F C7 /7
+NFx 0F C7 /7
RDSEED r32
M
@@ -160406,7 +160301,7 @@ RDSEED
Read a 32-bit NIST SP800-90B & C compliant random value and
store in the destination register.
-REX.W + 0F C7 /7
+NFx REX.W + 0F C7 /7
RDSEED r64
M
@@ -160477,7 +160372,7 @@ OF, SF, ZF, AF, PF ← 0;
RDSEED—Read Random SEED
-Vol. 2B 4-539
+Vol. 2B 4-537
INSTRUCTION SET REFERENCE, M-U
@@ -160494,38 +160389,33 @@ Protected Mode Exceptions
#UD
If the LOCK prefix is used.
-If the F2H or F3H prefix is used.
If CPUID.(EAX=07H, ECX=0H):EBX.RDSEED[bit 18] = 0.
Real-Address Mode Exceptions
#UD
If the LOCK prefix is used.
-If the F2H or F3H prefix is used.
If CPUID.(EAX=07H, ECX=0H):EBX.RDSEED[bit 18] = 0.
Virtual-8086 Mode Exceptions
#UD
If the LOCK prefix is used.
-If the F2H or F3H prefix is used.
If CPUID.(EAX=07H, ECX=0H):EBX.RDSEED[bit 18] = 0.
Compatibility Mode Exceptions
#UD
If the LOCK prefix is used.
-If the F2H or F3H prefix is used.
If CPUID.(EAX=07H, ECX=0H):EBX.RDSEED[bit 18] = 0.
64-Bit Mode Exceptions
#UD
If the LOCK prefix is used.
-If the F2H or F3H prefix is used.
If CPUID.(EAX=07H, ECX=0H):EBX.RDSEED[bit 18] = 0.
-4-540 Vol. 2B
+4-538 Vol. 2B
RDSEED—Read Random SEED
@@ -160625,7 +160515,7 @@ None.
1. A load is considered to become globally visible when the value to be loaded is determined.
RDTSC—Read Time-Stamp Counter
-Vol. 2B 4-541
+Vol. 2B 4-539
INSTRUCTION SET REFERENCE, M-U
@@ -160658,7 +160548,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-4-542 Vol. 2B
+4-540 Vol. 2B
RDTSC—Read Time-Stamp Counter
@@ -160756,7 +160646,7 @@ None.
1. A load is considered to become globally visible when the value to be loaded is determined.
RDTSCP—Read Time-Stamp Counter and Processor ID
-Vol. 2B 4-543
+Vol. 2B 4-541
INSTRUCTION SET REFERENCE, M-U
@@ -160790,7 +160680,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-4-544 Vol. 2B
+4-542 Vol. 2B
RDTSCP—Read Time-Stamp Counter and Processor ID
@@ -161229,7 +161119,7 @@ ES:[(E)DI].
REP/REPE/REPZ/REPNE/REPNZ—Repeat String Operation Prefix
-Vol. 2B 4-545
+Vol. 2B 4-543
INSTRUCTION SET REFERENCE, M-U
@@ -161421,9 +161311,9 @@ F3H is a mandatory prefix for POPCNT, LZCNT, and ADOX.
The REP prefixes apply only to one string instruction at a time. To repeat a block of instructions, use the LOOP
instruction or another looping construct. All of these repeat prefixes cause the associated instruction to be repeated
-until the count register is decremented to 0. See Table 4-17.
+until the count register is decremented to 0. See Table 4-16.
-Table 4-17. Repeat Prefixes
+Table 4-16. Repeat Prefixes
Repeat Prefix
Termination Condition 1*
@@ -161451,7 +161341,7 @@ ZF = 1
NOTES:
* Count register is CX, ECX or RCX by default, depending on attributes of the operating modes.
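The termination rules for the repeat prefixes can be illustrated with a small software model. The sketch below is an assumption-laden illustration rather than a description of the hardware: the function names are hypothetical, only byte operations with DF = 0 are modeled, and the count register is represented by a 64-bit integer regardless of operating mode.

```c
#include <stdint.h>

/* Models REP MOVSB with DF = 0: one byte is copied per iteration and the
 * count is decremented until it reaches 0. Returns the final count. */
static uint64_t rep_movsb_model(uint8_t *dst, const uint8_t *src,
                                uint64_t count)
{
    while (count != 0) {
        *dst++ = *src++;
        count--;
    }
    return count;
}

/* Models REPNE SCASB with DF = 0: repeats until the count reaches 0 or the
 * comparison sets ZF = 1. If count is 0 on entry, no iteration runs and
 * *zf is left unchanged. Returns the remaining count. */
static uint64_t repne_scasb_model(const uint8_t *buf, uint8_t al,
                                  uint64_t count, int *zf)
{
    while (count != 0) {
        uint8_t byte = *buf++;
        count--;
        *zf = (byte == al);
        if (*zf)
            break;  /* REPNE/REPNZ terminate when ZF = 1 */
    }
    return count;
}
```

A REPE/REPZ model would differ only in the break condition, stopping when ZF = 0 instead.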
-4-546 Vol. 2B
+4-544 Vol. 2B
REP/REPE/REPZ/REPNE/REPNZ—Repeat String Operation Prefix
@@ -161510,7 +161400,7 @@ None; however, the CMPS and SCAS instructions do set the status flags in the EFL
REP/REPE/REPZ/REPNE/REPNZ—Repeat String Operation Prefix
-Vol. 2B 4-547
+Vol. 2B 4-545
INSTRUCTION SET REFERENCE, M-U
@@ -161520,7 +161410,7 @@ Exceptions may be generated by an instruction associated with the prefix.
64-Bit Mode Exceptions
#GP(0)
-4-548 Vol. 2B
+4-546 Vol. 2B
If the memory address is in a non-canonical form.
@@ -161667,10 +161557,15 @@ In 64-bit mode, the default operation size of this instruction is the stack-addr
near returns, not far returns; the default operation size of far returns is 32 bits.
RET—Return from Procedure
-Vol. 2B 4-549
+Vol. 2B 4-547
INSTRUCTION SET REFERENCE, M-U
+Instruction ordering. Instructions following a far return may be fetched from memory before earlier instructions
+complete execution, but they will not execute (even speculatively) until all instructions prior to the far return have
+completed execution (the later instructions may execute before data stored by the earlier instructions have become
+globally visible).
+
Operation
(* Near return *)
IF instruction = near return
@@ -161718,17 +161613,17 @@ IF OperandSize = 32
THEN
IF top 8 bytes of stack not within stack limits
THEN #SS(0); FI;
-EIP ← Pop();
-CS ← Pop(); (* 32-bit pop, high-order 16 bits discarded *)
-ELSE (* OperandSize = 16 *)
-IF top 4 bytes of stack not within stack limits
-THEN #SS(0); FI;
-4-550 Vol. 2B
+4-548 Vol. 2B
RET—Return from Procedure
INSTRUCTION SET REFERENCE, M-U
+EIP ← Pop();
+CS ← Pop(); (* 32-bit pop, high-order 16 bits discarded *)
+ELSE (* OperandSize = 16 *)
+IF top 4 bytes of stack not within stack limits
+THEN #SS(0); FI;
tempEIP ← Pop();
tempEIP ← tempEIP AND 0000FFFFH;
IF tempEIP not within code segment limits
@@ -161778,22 +161673,21 @@ FI;
FI;
RETURN-TO-SAME-PRIVILEGE-LEVEL:
IF the return instruction pointer is not within the return code segment limit
-THEN #GP(0); FI;
-IF OperandSize = 32
-THEN
-EIP ← Pop();
-CS ← Pop(); (* 32-bit pop, high-order 16 bits discarded *)
RET—Return from Procedure
-Vol. 2B 4-551
+Vol. 2B 4-549
INSTRUCTION SET REFERENCE, M-U
+THEN #GP(0); FI;
+IF OperandSize = 32
+THEN
+EIP ← Pop();
+CS ← Pop(); (* 32-bit pop, high-order 16 bits discarded *)
ELSE (* OperandSize = 16 *)
EIP ← Pop();
EIP ← EIP AND 0000FFFFH;
CS ← Pop(); (* 16-bit pop *)
-
FI;
IF instruction has immediate operand
THEN (* Release parameters from stack *)
@@ -161838,12 +161732,7 @@ SP ← SP + SRC;
FI;
FI;
tempESP ← Pop();
-tempSS ← Pop(); (* 32-bit pop, high-order 16 bits discarded; seg. descriptor loaded *)
-ESP ← tempESP;
-SS ← tempSS;
-ELSE (* OperandSize = 16 *)
-EIP ← Pop();
-4-552 Vol. 2B
+4-550 Vol. 2B
RET—Return from Procedure
@@ -161851,6 +161740,11 @@ RET—Return from Procedure
FI;
+tempSS ← Pop(); (* 32-bit pop, high-order 16 bits discarded; seg. descriptor loaded *)
+ESP ← tempESP;
+SS ← tempSS;
+ELSE (* OperandSize = 16 *)
+EIP ← Pop();
EIP ← EIP AND 0000FFFFH;
CS ← Pop(); (* 16-bit pop; segment descriptor loaded *)
CS(RPL) ← CPL;
@@ -161897,17 +161791,17 @@ THEN #SS(0); FI;
ELSE
IF OperandSize = 16
THEN
-IF second word on stack is not within stack limits
-THEN #SS(0); FI;
-IF first or second word on stack is not in canonical space
-THEN #SS(0); FI;
-ELSE (* OperandSize = 64 *)
RET—Return from Procedure
-Vol. 2B 4-553
+Vol. 2B 4-551
INSTRUCTION SET REFERENCE, M-U
+IF second word on stack is not within stack limits
+THEN #SS(0); FI;
+IF first or second word on stack is not in canonical space
+THEN #SS(0); FI;
+ELSE (* OperandSize = 64 *)
IF first or second quadword on stack is not in canonical space
THEN #SS(0); FI;
FI
@@ -161956,17 +161850,17 @@ CS ← Pop(); (* 16-bit pop *)
ELSE (* OperandSize = 64 *)
RIP ← Pop();
CS ← Pop(); (* 64-bit pop, high-order 48 bits discarded *)
-FI;
-FI;
-IF instruction has immediate operand
-THEN (* Release parameters from stack *)
-IF StackAddressSize = 32
-4-554 Vol. 2B
+4-552 Vol. 2B
RET—Return from Procedure
INSTRUCTION SET REFERENCE, M-U
+FI;
+FI;
+IF instruction has immediate operand
+THEN (* Release parameters from stack *)
+IF StackAddressSize = 32
THEN
ESP ← ESP + SRC;
ELSE
@@ -162015,17 +161909,17 @@ THEN
EIP ← Pop();
CS ← Pop(); (* 32-bit pop, high-order 16 bits discarded, segment descriptor loaded *)
CS(RPL) ← CPL;
-IF instruction has immediate operand
-THEN (* Release parameters from called procedure’s stack *)
-IF StackAddressSize = 32
-THEN
-ESP ← ESP + SRC;
RET—Return from Procedure
-Vol. 2B 4-555
+Vol. 2B 4-553
INSTRUCTION SET REFERENCE, M-U
+IF instruction has immediate operand
+THEN (* Release parameters from called procedure’s stack *)
+IF StackAddressSize = 32
+THEN
+ESP ← ESP + SRC;
ELSE
IF StackAddressSize = 16
THEN
@@ -162075,16 +161969,18 @@ FI;
tempESP ← Pop();
tempSS ← Pop(); (* 64-bit pop; high-order 48 bits discarded; seg. desc. loaded *)
ESP ← tempESP;
-SS ← tempSS;
-FI;
-FI;
-FOR each segment register (ES, FS, GS, and DS)
-4-556 Vol. 2B
+4-554 Vol. 2B
RET—Return from Procedure
INSTRUCTION SET REFERENCE, M-U
+FI;
+
+SS ← tempSS;
+
+FI;
+FOR each segment register (ES, FS, GS, and DS)
DO
IF segment register points to data or non-conforming code segment
and CPL > segment descriptor DPL; (* DPL in hidden part of segment register *)
@@ -162146,6 +162042,12 @@ If a page fault occurs.
If an unaligned memory access occurs when the CPL is 3 and alignment checking is enabled.
+RET—Return from Procedure
+
+Vol. 2B 4-555
+
+ INSTRUCTION SET REFERENCE, M-U
+
Real-Address Mode Exceptions
#GP
@@ -162155,12 +162057,6 @@ If the return instruction pointer is not within the return code segment limit
If the top bytes of stack are not within stack limits.
-RET—Return from Procedure
-
-Vol. 2B 4-557
-
- INSTRUCTION SET REFERENCE, M-U
-
Virtual-8086 Mode Exceptions
#GP(0)
@@ -162225,7 +162121,7 @@ If a page fault occurs.
If alignment checking is enabled and an unaligned memory reference is made while the
current privilege level is 3.
-4-558 Vol. 2B
+4-556 Vol. 2B
RET—Return from Procedure
@@ -162314,14 +162210,11 @@ SIMD Floating-Point Exceptions
None
Other Exceptions
-See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29; additionally
-#UD
-
-If VEX.W = 1.
+See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29.
RORX — Rotate Right Logical Without Affecting Flags
-Vol. 2B 4-559
+Vol. 2B 4-557
INSTRUCTION SET REFERENCE, M-U
@@ -162412,7 +162305,7 @@ operand). The rounding process rounds each input floating-point value to an inte
result as a double-precision floating-point value.
The immediate operand specifies control fields for the rounding operation; three bit fields are defined and shown in
Figure 4-24. Bit 3 of the immediate byte controls processor behavior for a precision exception; bit 2 selects the
-source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (Table 4-18 lists the encoded
+source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (Table 4-17 lists the encoded
values for rounding-mode field).
The Precision Floating-Point Exception is signaled according to the immediate operand. If any source operand is an
SNaN, then it will be converted to a QNaN. If DAZ is set to 1, then denormals will be converted to zero before
@@ -162425,7 +162318,7 @@ VEX.256 encoded version: The source operand is a YMM register or a 256-bit memor
operand is a YMM register.
Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.
-4-560 Vol. 2B
+4-558 Vol. 2B
ROUNDPD — Round Packed Double Precision Floating-Point Values
@@ -162442,7 +162335,7 @@ RS — Rounding select; 1: MXCSR.RC, 0: Imm8.RC
RC — Rounding mode
Figure 4-24. Bit Control Fields of Immediate Byte for ROUNDxx Instruction
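The immediate-byte layout in Figure 4-24 can be decoded as in the following sketch. The type and function names are hypothetical, chosen only for illustration; the field meanings assumed here are bit 3 = 1 suppresses the precision exception, bit 2 = 1 takes the rounding mode from MXCSR.RC, and bits 1:0 use the RC encoding 00B (nearest even), 01B (down), 10B (up), 11B (truncate).

```c
#include <stdbool.h>
#include <stdint.h>

/* RC field encoding (bits 1:0 of imm8, same encoding as MXCSR.RC). */
typedef enum {
    RC_NEAREST_EVEN = 0,
    RC_DOWN         = 1,
    RC_UP           = 2,
    RC_TRUNCATE     = 3
} round_mode;

/* Hypothetical decoded view of the ROUNDxx immediate byte. */
typedef struct {
    bool       suppress_precision;  /* imm8[3]: 1 = precision exception not signaled */
    bool       use_mxcsr_rc;        /* imm8[2]: 1 = MXCSR.RC, 0 = imm8 RC field */
    round_mode rc;                  /* imm8[1:0] */
} round_control;

static round_control decode_round_imm8(uint8_t imm8)
{
    round_control c;
    c.suppress_precision = (imm8 & 0x08) != 0;
    c.use_mxcsr_rc       = (imm8 & 0x04) != 0;
    c.rc                 = (round_mode)(imm8 & 0x03);
    return c;
}

/* Rounding mode actually applied, given the current MXCSR.RC value. */
static round_mode effective_rounding(round_control c, round_mode mxcsr_rc)
{
    return c.use_mxcsr_rc ? mxcsr_rc : c.rc;
}
```

For example, an immediate of 09H decodes to round-down with the precision exception suppressed, while 04H defers entirely to MXCSR.RC.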
-Table 4-18. Rounding Modes and Encoding of Rounding Control (RC) Field
+Table 4-17. Rounding Modes and Encoding of Rounding Control (RC) Field
Rounding
Mode
@@ -162505,7 +162398,7 @@ DEST[255:192]  RoundToInteger(SRC[255:192] ], ROUND_CONTROL)
ROUNDPD — Round Packed Double Precision Floating-Point Values
-Vol. 2B 4-561
+Vol. 2B 4-559
INSTRUCTION SET REFERENCE, M-U
@@ -162526,7 +162419,7 @@ Other Exceptions
See Exceptions Type 2; additionally
#UD
-4-562 Vol. 2B
+4-560 Vol. 2B
If VEX.vvvv ≠ 1111B.
@@ -162622,7 +162515,7 @@ operand). The rounding process rounds each input floating-point value to an inte
result as a single-precision floating-point value.
The immediate operand specifies control fields for the rounding operation; three bit fields are defined and shown in
Figure 4-24. Bit 3 of the immediate byte controls processor behavior for a precision exception; bit 2 selects the
-source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (Table 4-18 lists the encoded
+source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (Table 4-17 lists the encoded
values for rounding-mode field).
The Precision Floating-Point Exception is signaled according to the immediate operand. If any source operand is an
SNaN, then it will be converted to a QNaN. If DAZ is set to 1, then denormals will be converted to zero before
@@ -162637,7 +162530,7 @@ Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b otherwise
ROUNDPS — Round Packed Single Precision Floating-Point Values
-Vol. 2B 4-563
+Vol. 2B 4-561
INSTRUCTION SET REFERENCE, M-U
@@ -162686,7 +162579,7 @@ __m256 _mm256_round_ps(__m256 s1, int iRoundMode);
__m256 _mm256_floor_ps(__m256 s1);
__m256 _mm256_ceil_ps(__m256 s1);
-4-564 Vol. 2B
+4-562 Vol. 2B
ROUNDPS — Round Packed Single Precision Floating-Point Values
@@ -162705,7 +162598,7 @@ If VEX.vvvv ≠ 1111B.
ROUNDPS — Round Packed Single Precision Floating-Point Values
-Vol. 2B 4-565
+Vol. 2B 4-563
INSTRUCTION SET REFERENCE, M-U
@@ -162740,7 +162633,7 @@ floating-point value in xmm2/m64 and place
the result in xmm1. The rounding mode is
determined by imm8.
-VEX.NDS.LIG.66.0F3A.WIG 0B /r ib
+VEX.LIG.66.0F3A.WIG 0B /r ib
VROUNDSD xmm1, xmm2, xmm3/m64, imm8
RVMI V/V
@@ -162792,7 +162685,7 @@ as a double precision floating-point value in the lowest position. The upper dou
the destination is retained.
The immediate operand specifies control fields for the rounding operation; three bit fields are defined and shown in
Figure 4-24. Bit 3 of the immediate byte controls processor behavior for a precision exception; bit 2 selects the
-source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (Table 4-18 lists the encoded
+source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (Table 4-17 lists the encoded
values for rounding-mode field).
The Precision Floating-Point Exception is signaled according to the immediate operand. If any source operand is an
SNaN, then it will be converted to a QNaN. If DAZ is set to 1, then denormals will be converted to zero before
@@ -162814,7 +162707,7 @@ ROUNDSD (128-bit Legacy SSE version)
DEST[63:0] ← RoundToInteger(SRC[63:0], ROUND_CONTROL)
DEST[MAXVL-1:64] (Unmodified)
-4-566 Vol. 2B
+4-564 Vol. 2B
ROUNDSD — Round Scalar Double Precision Floating-Point Values
@@ -162842,7 +162735,7 @@ See Exceptions Type 3.
ROUNDSD — Round Scalar Double Precision Floating-Point Values
-Vol. 2B 4-567
+Vol. 2B 4-565
INSTRUCTION SET REFERENCE, M-U
@@ -162877,7 +162770,7 @@ floating-point value in xmm2/m32 and place
the result in xmm1. The rounding mode is
determined by imm8.
-VEX.NDS.LIG.66.0F3A.WIG 0A /r ib
+VEX.LIG.66.0F3A.WIG 0A /r ib
VROUNDSS xmm1, xmm2, xmm3/m32, imm8
RVMI V/V
@@ -162931,7 +162824,7 @@ returns the result as a single-precision floating-point value in the lowest posi
floating-point values in the destination are retained.
The immediate operand specifies control fields for the rounding operation; three bit fields are defined and shown in
Figure 4-24. Bit 3 of the immediate byte controls processor behavior for a precision exception; bit 2 selects the
-source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (Table 4-18 lists the encoded
+source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (Table 4-17 lists the encoded
values for rounding-mode field).
The Precision Floating-Point Exception is signaled according to the immediate operand. If any source operand is an
SNaN, then it will be converted to a QNaN. If DAZ is set to 1, then denormals will be converted to zero before
@@ -162953,7 +162846,7 @@ ROUNDSS (128-bit Legacy SSE version)
DEST[31:0] ← RoundToInteger(SRC[31:0], ROUND_CONTROL)
DEST[MAXVL-1:32] (Unmodified)
-4-568 Vol. 2B
+4-566 Vol. 2B
ROUNDSS — Round Scalar Single Precision Floating-Point Values
@@ -162981,7 +162874,7 @@ See Exceptions Type 3.
ROUNDSS — Round Scalar Single Precision Floating-Point Values
-Vol. 2B 4-569
+Vol. 2B 4-567
INSTRUCTION SET REFERENCE, M-U
@@ -163078,7 +162971,7 @@ Same exceptions as in protected mode.
Compatibility Mode Exceptions
Same exceptions as in protected mode.
-4-570 Vol. 2B
+4-568 Vol. 2B
RSM—Resume from System Management Mode
@@ -163089,7 +162982,7 @@ Same exceptions as in protected mode.
RSM—Resume from System Management Mode
-Vol. 2B 4-571
+Vol. 2B 4-569
INSTRUCTION SET REFERENCE, M-U
@@ -163193,7 +163086,7 @@ VEX.256 encoded version: The first source operand is a YMM register. The second
register or a 256-bit memory location. The destination operand is a YMM register.
Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.
-4-572 Vol. 2B
+4-570 Vol. 2B
RSQRTPS—Compute Reciprocals of Square Roots of Packed Single-Precision Floating-Point Values
@@ -163242,7 +163135,7 @@ If VEX.vvvv ≠ 1111B.
RSQRTPS—Compute Reciprocals of Square Roots of Packed Single-Precision Floating-Point Values
-Vol. 2B 4-573
+Vol. 2B 4-571
INSTRUCTION SET REFERENCE, M-U
@@ -163289,7 +163182,7 @@ from xmm2 are copied to xmm1[127:32].
RSQRTSS xmm1, xmm2/m32
-VEX.NDS.LIG.F3.0F.WIG 52 /r
+VEX.LIG.F3.0F.WIG 52 /r
VRSQRTSS xmm1, xmm2, xmm3/m32
Instruction Operand Encoding
@@ -163350,7 +163243,7 @@ DEST[31:0]  APPROXIMATE(1/SQRT(SRC2[31:0]))
DEST[127:32] ← SRC1[127:32]
DEST[MAXVL-1:128] ← 0
-4-574 Vol. 2B
+4-572 Vol. 2B
RSQRTSS—Compute Reciprocal of Square Root of Scalar Single-Precision Floating-Point Value
@@ -163369,7 +163262,7 @@ See Exceptions Type 5.
RSQRTSS—Compute Reciprocal of Square Root of Scalar Single-Precision Floating-Point Value
-Vol. 2B 4-575
+Vol. 2B 4-573
INSTRUCTION SET REFERENCE, M-U
@@ -163460,7 +163353,7 @@ None.
Compatibility Mode Exceptions
None.
-4-576 Vol. 2B
+4-574 Vol. 2B
SAHF—Store AH into Flags
@@ -163474,7 +163367,7 @@ If the LOCK prefix is used.
SAHF—Store AH into Flags
-Vol. 2B 4-577
+Vol. 2B 4-575
INSTRUCTION SET REFERENCE, M-U
@@ -163972,7 +163865,7 @@ Valid
Multiply r/m32 by 2, once.
-4-578 Vol. 2B
+4-576 Vol. 2B
SAL/SAR/SHL/SHR—Shift
@@ -164292,7 +164185,7 @@ Figure 7-7 in the Intel® 64 and IA-32 Architectures Software Developer’s Manu
SAL/SAR/SHL/SHR—Shift
-Vol. 2B 4-579
+Vol. 2B 4-577
INSTRUCTION SET REFERENCE, M-U
@@ -164349,7 +164242,7 @@ THEN
DEST ← DEST ∗ 2;
ELSE
IF instruction is SAR
-4-580 Vol. 2B
+4-578 Vol. 2B
SAL/SAR/SHL/SHR—Shift
@@ -164433,7 +164326,7 @@ If the LOCK prefix is used.
SAL/SAR/SHL/SHR—Shift
-Vol. 2B 4-581
+Vol. 2B 4-579
INSTRUCTION SET REFERENCE, M-U
@@ -164483,7 +164376,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-4-582 Vol. 2B
+4-580 Vol. 2B
SAL/SAR/SHL/SHR—Shift
@@ -164507,17 +164400,17 @@ Feature
Flag
BMI2
-VEX.NDS.LZ.F3.0F38.W0 F7 /r
+VEX.LZ.F3.0F38.W0 F7 /r
SARX r32a, r/m32, r32b
-VEX.NDS.LZ.66.0F38.W0 F7 /r
+VEX.LZ.66.0F38.W0 F7 /r
SHLX r32a, r/m32, r32b
-VEX.NDS.LZ.F2.0F38.W0 F7 /r
+VEX.LZ.F2.0F38.W0 F7 /r
SHRX r32a, r/m32, r32b
-VEX.NDS.LZ.F3.0F38.W1 F7 /r
+VEX.LZ.F3.0F38.W1 F7 /r
SARX r64a, r/m64, r64b
-VEX.NDS.LZ.66.0F38.W1 F7 /r
+VEX.LZ.66.0F38.W1 F7 /r
SHLX r64a, r/m64, r64b
-VEX.NDS.LZ.F2.0F38.W1 F7 /r
+VEX.LZ.F2.0F38.W1 F7 /r
SHRX r64a, r/m64, r64b
Description
@@ -164615,7 +164508,7 @@ DEST[] ← DEST *2;
SARX/SHLX/SHRX — Shift Without Affecting Flags
-Vol. 2B 4-583
+Vol. 2B 4-581
INSTRUCTION SET REFERENCE, M-U
@@ -164641,12 +164534,9 @@ SIMD Floating-Point Exceptions
None
Other Exceptions
-See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29; additionally
-#UD
+See Section 2.5.1, “Exception Conditions for VEX-Encoded GPR Instructions”, Table 2-29.
-4-584 Vol. 2B
-
-If VEX.W = 1.
+4-582 Vol. 2B
SARX/SHLX/SHRX — Shift Without Affecting Flags
@@ -164991,7 +164881,7 @@ NA
SBB—Integer Subtraction with Borrow
-Vol. 2B 4-585
+Vol. 2B 4-583
INSTRUCTION SET REFERENCE, M-U
@@ -165072,7 +164962,7 @@ If a memory operand effective address is outside the SS segment limit.
If the LOCK prefix is used but the destination is not a memory operand.
-4-586 Vol. 2B
+4-584 Vol. 2B
SBB—Integer Subtraction with Borrow
@@ -165126,7 +165016,7 @@ If the LOCK prefix is used but the destination is not a memory operand.
SBB—Integer Subtraction with Borrow
-Vol. 2B 4-587
+Vol. 2B 4-585
INSTRUCTION SET REFERENCE, M-U
@@ -165293,7 +165183,7 @@ the DF flag in the EFLAGS register. If the DF flag is 0, the (E)DI register is i
register is decremented. The register is incremented or decremented by 1 for byte operations, by 2 for word operations, and by 4 for doubleword operations.
SCAS, SCASB, SCASW, SCASD, and SCASQ can be preceded by the REP prefix for block comparisons of ECX bytes,
words, doublewords, or quadwords. Often, however, these instructions will be used in a LOOP construct that takes
-4-588 Vol. 2B
+4-586 Vol. 2B
SCAS/SCASB/SCASW/SCASD—Scan String
@@ -165352,7 +165242,7 @@ FI;
SCAS/SCASB/SCASW/SCASD—Scan String
-Vol. 2B 4-589
+Vol. 2B 4-587
INSTRUCTION SET REFERENCE, M-U
@@ -165435,7 +165325,7 @@ If the LOCK prefix is used.
Compatibility Mode Exceptions
Same exceptions as in protected mode.
-4-590 Vol. 2B
+4-588 Vol. 2B
SCAS/SCASB/SCASW/SCASD—Scan String
@@ -165461,7 +165351,7 @@ If the LOCK prefix is used.
SCAS/SCASB/SCASW/SCASD—Scan String
-Vol. 2B 4-591
+Vol. 2B 4-589
INSTRUCTION SET REFERENCE, M-U
@@ -165949,7 +165839,7 @@ Valid
Set byte if not less or equal (ZF=0 and SF=OF).
-4-592 Vol. 2B
+4-590 Vol. 2B
SETcc—Set Byte on Condition
@@ -166261,7 +166151,7 @@ choosing the logically opposite condition for the SETcc instruction, then decrem
test for overflow, use the SETNO instruction, then decrement the result.
SETcc—Set Byte on Condition
-Vol. 2B 4-593
+Vol. 2B 4-591
INSTRUCTION SET REFERENCE, M-U
@@ -166348,7 +166238,7 @@ If a page fault occurs.
If the LOCK prefix is used.
-4-594 Vol. 2B
+4-592 Vol. 2B
SETcc—Set Byte on Condition
@@ -166431,7 +166321,7 @@ If the LOCK prefix is used.
SFENCE—Store Fence
-Vol. 2B 4-595
+Vol. 2B 4-593
INSTRUCTION SET REFERENCE, M-U
@@ -166517,7 +166407,7 @@ FI;
Flags Affected
None.
-4-596 Vol. 2B
+4-594 Vol. 2B
SGDT—Store Global Descriptor Table Register
@@ -166606,7 +166496,7 @@ If alignment checking is enabled and an unaligned memory reference is made while
SGDT—Store Global Descriptor Table Register
-Vol. 2B 4-597
+Vol. 2B 4-595
INSTRUCTION SET REFERENCE, M-U
@@ -166691,7 +166581,7 @@ FOR i = 1 to 3
A_(i +1) ← f (B_i, C_i, D_i) + (A_i ROL 5) + Wi + E_i + K;
B_(i +1) ← A_i;
-4-598 Vol. 2B
+4-596 Vol. 2B
SHA1RNDS4—Perform Four Rounds of SHA1 Operation
@@ -166716,7 +166606,7 @@ See Exceptions Type 4.
SHA1RNDS4—Perform Four Rounds of SHA1 Operation
-Vol. 2B 4-599
+Vol. 2B 4-597
INSTRUCTION SET REFERENCE, M-U
@@ -166787,7 +166677,7 @@ None
Other Exceptions
See Exceptions Type 4.
-4-600 Vol. 2B
+4-598 Vol. 2B
SHA1NEXTE—Calculate SHA1 State Variable E after Four Rounds
@@ -166863,7 +166753,7 @@ See Exceptions Type 4.
SHA1MSG1—Perform an Intermediate Calculation for the Next Four SHA1 Message Dwords
-Vol. 2B 4-601
+Vol. 2B 4-599
INSTRUCTION SET REFERENCE, M-U
@@ -166937,7 +166827,7 @@ None
Other Exceptions
See Exceptions Type 4.
-4-602 Vol. 2B
+4-600 Vol. 2B
SHA1MSG2—Perform a Final Calculation for the Next Four SHA1 Message Dwords
@@ -167026,7 +166916,7 @@ DEST[31:0]  F_2;
SHA256RNDS2—Perform Two Rounds of SHA256 Operation
-Vol. 2B 4-603
+Vol. 2B 4-601
INSTRUCTION SET REFERENCE, M-U
@@ -167039,7 +166929,7 @@ None
Other Exceptions
See Exceptions Type 4.
-4-604 Vol. 2B
+4-602 Vol. 2B
SHA256RNDS2—Perform Two Rounds of SHA256 Operation
@@ -167116,7 +167006,7 @@ See Exceptions Type 4.
SHA256MSG1—Perform an Intermediate Calculation for the Next Four SHA256 Message Dwords
-Vol. 2B 4-605
+Vol. 2B 4-603
INSTRUCTION SET REFERENCE, M-U
@@ -167189,7 +167079,7 @@ None
Other Exceptions
See Exceptions Type 4.
-4-606 Vol. 2B
+4-604 Vol. 2B
SHA256MSG2—Perform a Final Calculation for the Next Four SHA256 Message Dwords
@@ -167343,7 +167233,7 @@ ELSE
SHLD—Double Precision Shift Left
-Vol. 2B 4-607
+Vol. 2B 4-605
INSTRUCTION SET REFERENCE, M-U
@@ -167432,7 +167322,7 @@ If the LOCK prefix is used.
Compatibility Mode Exceptions
Same exceptions as in protected mode.
-4-608 Vol. 2B
+4-606 Vol. 2B
SHLD—Double Precision Shift Left
@@ -167462,7 +167352,7 @@ If the LOCK prefix is used.
SHLD—Double Precision Shift Left
-Vol. 2B 4-609
+Vol. 2B 4-607
INSTRUCTION SET REFERENCE, M-U
@@ -167612,7 +167502,7 @@ THEN
No operation;
ELSE
-4-610 Vol. 2B
+4-608 Vol. 2B
SHRD—Double Precision Shift Right
@@ -167705,7 +167595,7 @@ Same exceptions as in protected mode.
SHRD—Double Precision Shift Right
-Vol. 2B 4-611
+Vol. 2B 4-609
INSTRUCTION SET REFERENCE, M-U
@@ -167731,7 +167621,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-4-612 Vol. 2B
+4-610 Vol. 2B
SHRD—Double Precision Shift Right
@@ -167758,7 +167648,7 @@ SSE2
66 0F C6 /r ib
SHUFPD xmm1, xmm2/m128, imm8
-VEX.NDS.128.66.0F.WIG C6 /r ib
+VEX.128.66.0F.WIG C6 /r ib
VSHUFPD xmm1, xmm2, xmm3/m128,
imm8
@@ -167768,7 +167658,7 @@ V/V
AVX
-VEX.NDS.256.66.0F.WIG C6 /r ib
+VEX.256.66.0F.WIG C6 /r ib
VSHUFPD ymm1, ymm2, ymm3/m256,
imm8
@@ -167778,7 +167668,7 @@ V/V
AVX
-EVEX.NDS.128.66.0F.W1 C6 /r ib
+EVEX.128.66.0F.W1 C6 /r ib
VSHUFPD xmm1{k1}{z}, xmm2,
xmm3/m128/m64bcst, imm8
@@ -167789,7 +167679,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F.W1 C6 /r ib
+EVEX.256.66.0F.W1 C6 /r ib
VSHUFPD ymm1{k1}{z}, ymm2,
ymm3/m256/m64bcst, imm8
@@ -167800,7 +167690,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F.W1 C6 /r ib
+EVEX.512.66.0F.W1 C6 /r ib
VSHUFPD zmm1{k1}{z}, zmm2,
zmm3/m512/m64bcst, imm8
@@ -167905,7 +167795,7 @@ imm8[7:2) are ignored.
SHUFPD—Packed Interleave Shuffle of Pairs of Double-Precision Floating-Point Values
-Vol. 2B 4-613
+Vol. 2B 4-611
INSTRUCTION SET REFERENCE, M-U
@@ -167980,7 +167870,7 @@ i ← j * 64
IF k1[j] OR *no writemask*
THEN DEST[i+63:i] ← TMP_DEST[i+63:i]
ELSE
-4-614 Vol. 2B
+4-612 Vol. 2B
SHUFPD—Packed Interleave Shuffle of Pairs of Double-Precision Floating-Point Values
@@ -168043,7 +167933,7 @@ IF *merging-masking*
THEN *DEST[i+63:i] remains unchanged*
SHUFPD—Packed Interleave Shuffle of Pairs of Double-Precision Floating-Point Values
-Vol. 2B 4-615
+Vol. 2B 4-613
INSTRUCTION SET REFERENCE, M-U
@@ -168097,7 +167987,7 @@ SHUFPD __m128d _mm_shuffle_pd (__m128d a, __m128d b, const int select);
VSHUFPD __m128d _mm_mask_shuffle_pd(__m128d s, __mmask8 k, __m128d a, __m128d b, int imm);
VSHUFPD __m128d _mm_maskz_shuffle_pd( __mmask8 k, __m128d a, __m128d b, int imm);
-4-616 Vol. 2B
+4-614 Vol. 2B
SHUFPD—Packed Interleave Shuffle of Pairs of Double-Precision Floating-Point Values
@@ -168111,7 +168001,7 @@ EVEX-encoded instruction, see Exceptions Type E4NF.
SHUFPD—Packed Interleave Shuffle of Pairs of Double-Precision Floating-Point Values
-Vol. 2B 4-617
+Vol. 2B 4-615
INSTRUCTION SET REFERENCE, M-U
@@ -168135,13 +168025,13 @@ SSE
NP 0F C6 /r ib
SHUFPS xmm1, xmm3/m128, imm8
-VEX.NDS.128.0F.WIG C6 /r ib
+VEX.128.0F.WIG C6 /r ib
VSHUFPS xmm1, xmm2,
xmm3/m128, imm8
-VEX.NDS.256.0F.WIG C6 /r ib
+VEX.256.0F.WIG C6 /r ib
VSHUFPS ymm1, ymm2,
ymm3/m256, imm8
-EVEX.NDS.128.0F.W0 C6 /r ib
+EVEX.128.0F.W0 C6 /r ib
VSHUFPS xmm1{k1}{z}, xmm2,
xmm3/m128/m32bcst, imm8
@@ -168164,7 +168054,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.0F.W0 C6 /r ib
+EVEX.256.0F.W0 C6 /r ib
VSHUFPS ymm1{k1}{z}, ymm2,
ymm3/m256/m32bcst, imm8
@@ -168175,7 +168065,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.0F.W0 C6 /r ib
+EVEX.512.0F.W0 C6 /r ib
VSHUFPS zmm1{k1}{z}, zmm2,
zmm3/m512/m32bcst, imm8
@@ -168269,7 +168159,7 @@ register or a 128-bit memory location. The destination operand is a XMM register
the corresponding ZMM register destination are zeroed. Imm8[7:0] provides 4 select controls for each element of
the destination.
-4-618 Vol. 2B
+4-616 Vol. 2B
SHUFPS—Packed Interleave Shuffle of Quadruplets of Single-Precision Floating-Point Values
@@ -168366,7 +168256,7 @@ FOR j ← 0 TO KL-1
SHUFPS—Packed Interleave Shuffle of Quadruplets of Single-Precision Floating-Point Values
-Vol. 2B 4-619
+Vol. 2B 4-617
INSTRUCTION SET REFERENCE, M-U
@@ -168427,7 +168317,7 @@ DEST[i+31:i] ← 0
FI
FI;
ENDFOR
-4-620 Vol. 2B
+4-618 Vol. 2B
SHUFPS—Packed Interleave Shuffle of Quadruplets of Single-Precision Floating-Point Values
@@ -168474,7 +168364,7 @@ EVEX-encoded instruction, see Exceptions Type E4NF.
SHUFPS—Packed Interleave Shuffle of Quadruplets of Single-Precision Floating-Point Values
-Vol. 2B 4-621
+Vol. 2B 4-619
INSTRUCTION SET REFERENCE, M-U
@@ -168557,7 +168447,7 @@ FI;
Flags Affected
None.
-4-622 Vol. 2B
+4-620 Vol. 2B
SIDT—Store Interrupt Descriptor Table Register
@@ -168650,7 +168540,7 @@ If alignment checking is enabled and an unaligned memory reference is made while
SIDT—Store Interrupt Descriptor Table Register
-Vol. 2B 4-623
+Vol. 2B 4-621
INSTRUCTION SET REFERENCE, M-U
@@ -168761,7 +168651,7 @@ If alignment checking is enabled and an unaligned memory reference is made while
If the LOCK prefix is used.
-4-624 Vol. 2B
+4-622 Vol. 2B
SLDT—Store Local Descriptor Table Register
@@ -168802,7 +168692,7 @@ If the LOCK prefix is used.
SLDT—Store Local Descriptor Table Register
-Vol. 2B 4-625
+Vol. 2B 4-623
INSTRUCTION SET REFERENCE, M-U
@@ -168917,7 +168807,7 @@ DEST ← CR0[15:0];
Flags Affected
None.
-4-626 Vol. 2B
+4-624 Vol. 2B
SMSW—Store Machine Status Word
@@ -169008,7 +168898,7 @@ If the LOCK prefix is used.
SMSW—Store Machine Status Word
-Vol. 2B 4-627
+Vol. 2B 4-625
INSTRUCTION SET REFERENCE, M-U
@@ -169153,7 +169043,7 @@ zeroed.
register destination are unmodified.
Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructions will #UD.
-4-628 Vol. 2B
+4-626 Vol. 2B
SQRTPD—Square Root of Double-Precision Floating-Point Values
@@ -169214,7 +169104,7 @@ VSQRTPD __m128d _mm_maskz_sqrt_pd( __mmask8 k, __m128d a, int r);
SQRTPD—Square Root of Double-Precision Floating-Point Values
-Vol. 2B 4-629
+Vol. 2B 4-627
INSTRUCTION SET REFERENCE, M-U
@@ -169229,7 +169119,7 @@ If VEX.vvvv != 1111B.
EVEX-encoded instruction, see Exceptions Type E2.
#UD
-4-630 Vol. 2B
+4-628 Vol. 2B
If EVEX.vvvv != 1111B.
@@ -169380,7 +169270,7 @@ Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructio
SQRTPS—Square Root of Single-Precision Floating-Point Values
-Vol. 2B 4-631
+Vol. 2B 4-629
INSTRUCTION SET REFERENCE, M-U
@@ -169433,7 +169323,7 @@ DEST[95:64] ← SQRT(SRC[95:64])
DEST[127:96] ← SQRT(SRC[127:96])
DEST[MAXVL-1:128] (Unmodified)
-4-632 Vol. 2B
+4-630 Vol. 2B
SQRTPS—Square Root of Single-Precision Floating-Point Values
@@ -169464,7 +169354,7 @@ If EVEX.vvvv != 1111B.
SQRTPS—Square Root of Single-Precision Floating-Point Values
-Vol. 2B 4-633
+Vol. 2B 4-631
INSTRUCTION SET REFERENCE, M-U
@@ -169477,10 +169367,10 @@ En
F2 0F 51/r
SQRTSD xmm1,xmm2/m64
-VEX.NDS.LIG.F2.0F.WIG 51/r
+VEX.LIG.F2.0F.WIG 51/r
VSQRTSD xmm1,xmm2,
xmm3/m64
-EVEX.NDS.LIG.F2.0F.W1 51/r
+EVEX.LIG.F2.0F.W1 51/r
VSQRTSD xmm1 {k1}{z}, xmm2,
xmm3/m64{er}
@@ -169580,7 +169470,7 @@ writemask.
Software should ensure VSQRTSD is encoded with VEX.L=0. Encoding VSQRTSD with VEX.L=1 may encounter
unpredictable behavior across different processor generations.
-4-634 Vol. 2B
+4-632 Vol. 2B
SQRTSD—Compute Square Root of Scalar Double-Precision Floating-Point Value
@@ -169628,7 +169518,7 @@ EVEX-encoded instruction, see Exceptions Type E3.
SQRTSD—Compute Square Root of Scalar Double-Precision Floating-Point Value
-Vol. 2B 4-635
+Vol. 2B 4-633
INSTRUCTION SET REFERENCE, M-U
@@ -169652,7 +169542,7 @@ SSE
F3 0F 51 /r
SQRTSS xmm1, xmm2/m32
-VEX.NDS.LIG.F3.0F.WIG 51 /r
+VEX.LIG.F3.0F.WIG 51 /r
VSQRTSS xmm1, xmm2,
xmm3/m32
@@ -169662,7 +169552,7 @@ V/V
AVX
-EVEX.NDS.LIG.F3.0F.W0 51 /r
+EVEX.LIG.F3.0F.W0 51 /r
VSQRTSS xmm1 {k1}{z}, xmm2,
xmm3/m32{er}
@@ -169745,7 +169635,7 @@ writemask.
Software should ensure VSQRTSS is encoded with VEX.L=0. Encoding VSQRTSS with VEX.L=1 may encounter
unpredictable behavior across different processor generations.
-4-636 Vol. 2B
+4-634 Vol. 2B
SQRTSS—Compute Square Root of Scalar Single-Precision Value
@@ -169793,7 +169683,7 @@ EVEX-encoded instruction, see Exceptions Type E3.
SQRTSS—Compute Square Root of Scalar Single-Precision Value
-Vol. 2B 4-637
+Vol. 2B 4-635
INSTRUCTION SET REFERENCE, M-U
@@ -169891,7 +169781,7 @@ If the LOCK prefix is used.
If the CPL > 0.
If CPUID.(EAX=07H, ECX=0H):EBX.SMAP[bit 20] = 0.
-4-638 Vol. 2B
+4-636 Vol. 2B
STAC—Set AC Flag in EFLAGS Register
@@ -169960,7 +169850,7 @@ STC—Set Carry Flag
If the LOCK prefix is used.
-Vol. 2B 4-639
+Vol. 2B 4-637
INSTRUCTION SET REFERENCE, M-U
@@ -170023,7 +169913,7 @@ The DF flag is set. The CF, OF, ZF, SF, AF, and PF flags are unaffected.
Exceptions (All Operating Modes)
#UD
-4-640 Vol. 2B
+4-638 Vol. 2B
If the LOCK prefix is used.
@@ -170100,10 +169990,10 @@ VME mode (virtual-8086 mode extensions): CR0.PE = 1, EFLAGS.VM = 1, and CR4.VME
If IOPL < 3, EFLAGS.VIP = 1, and either VME mode or PVI mode is active, STI sets the VIF flag in the EFLAGS
register, leaving IF unaffected.
-Table 4-19 indicates the action of the STI instruction depending on the processor operating mode, IOPL, CPL, and
+Table 4-18 indicates the action of the STI instruction depending on the processor operating mode, IOPL, CPL, and
EFLAGS.VIP.
-Table 4-19. Decision Table for STI Results
+Table 4-18. Decision Table for STI Results
Mode
IOPL
@@ -170194,7 +170084,7 @@ NOTES:
STI—Set Interrupt Flag
-Vol. 2B 4-641
+Vol. 2B 4-639
INSTRUCTION SET REFERENCE, M-U
@@ -170253,7 +170143,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-4-642 Vol. 2B
+4-640 Vol. 2B
STI—Set Interrupt Flag
@@ -170344,7 +170234,7 @@ If VEX.vvvv ≠ 1111B.
STMXCSR—Store MXCSR Register State
-Vol. 2B 4-643
+Vol. 2B 4-641
INSTRUCTION SET REFERENCE, M-U
@@ -170513,7 +170403,7 @@ incremented or decremented according to the setting of the DF flag in the EFLAGS
register is incremented; if the DF flag is 1, the register is decremented (the register is incremented or decremented
by 1 for byte operations, by 2 for word operations, by 4 for doubleword operations).
-4-644 Vol. 2B
+4-642 Vol. 2B
STOS/STOSB/STOSW/STOSD/STOSQ—Store String
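The pointer update described in the context lines above (increment when DF is 0, decrement when DF is 1, by the operand size) can be sketched as follows. This is an illustrative model only: `stos_step` is a hypothetical name, memory is a flat `bytearray`, and segmentation, address-size selection, the REP prefix, and the 64-bit STOSQ form are deliberately left out.

```python
def stos_step(memory, edi, value, size, df):
    """Model one STOSB/STOSW/STOSD step.

    Stores the low `size` bytes of `value` at memory[edi] in little-endian
    order and returns the updated destination index.
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1 (byte), 2 (word) or 4 (doubleword)")
    truncated = value & ((1 << (8 * size)) - 1)
    memory[edi:edi + size] = truncated.to_bytes(size, "little")
    # DF = 0 moves forward through memory, DF = 1 moves backward.
    return edi - size if df else edi + size
```

For example, storing a word at offset 0 with DF clear returns 2, while storing a doubleword at offset 4 with DF set returns 0.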
@@ -170570,7 +170460,7 @@ THEN
DEST ← AX;
STOS/STOSB/STOSW/STOSD/STOSQ—Store String
-Vol. 2B 4-645
+Vol. 2B 4-643
INSTRUCTION SET REFERENCE, M-U
@@ -170649,7 +170539,7 @@ If the LOCK prefix is used.
Compatibility Mode Exceptions
Same exceptions as in protected mode.
-4-646 Vol. 2B
+4-644 Vol. 2B
STOS/STOSB/STOSW/STOSD/STOSQ—Store String
@@ -170675,7 +170565,7 @@ If the LOCK prefix is used.
STOS/STOSB/STOSW/STOSD/STOSQ—Store String
-Vol. 2B 4-647
+Vol. 2B 4-645
INSTRUCTION SET REFERENCE, M-U
@@ -170781,7 +170671,7 @@ The STR instruction is not recognized in virtual-8086 mode.
Compatibility Mode Exceptions
Same exceptions as in protected mode.
-4-648 Vol. 2B
+4-646 Vol. 2B
STR—Store Task Register
@@ -170812,7 +170702,7 @@ If the LOCK prefix is used.
STR—Store Task Register
-Vol. 2B 4-649
+Vol. 2B 4-647
INSTRUCTION SET REFERENCE, M-U
@@ -171157,7 +171047,7 @@ can be an immediate, register, or memory location. (However, two memory operands
instruction.) When an immediate value is used as an operand, it is sign-extended to the length of the destination
operand format.
-4-650 Vol. 2B
+4-648 Vol. 2B
SUB—Subtract
@@ -171262,7 +171152,7 @@ If the LOCK prefix is used but the destination is not a memory operand.
SUB—Subtract
-Vol. 2B 4-651
+Vol. 2B 4-649
INSTRUCTION SET REFERENCE, M-U
@@ -171286,17 +171176,17 @@ SSE2
66 0F 5C /r
SUBPD xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 5C /r
+VEX.128.66.0F.WIG 5C /r
VSUBPD xmm1,xmm2, xmm3/m128
-VEX.NDS.256.66.0F.WIG 5C /r
+VEX.256.66.0F.WIG 5C /r
VSUBPD ymm1, ymm2, ymm3/m256
-EVEX.NDS.128.66.0F.W1 5C /r
+EVEX.128.66.0F.W1 5C /r
VSUBPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F.W1 5C /r
+EVEX.256.66.0F.W1 5C /r
VSUBPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F.W1 5C /r
+EVEX.512.66.0F.W1 5C /r
VSUBPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst{er}
@@ -171412,7 +171302,7 @@ registers. The destination operand is conditionally updated according to the wri
128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding
register destination are unmodified.
-4-652 Vol. 2B
+4-650 Vol. 2B
SUBPD—Subtract Packed Double-Precision Floating-Point Values
@@ -171473,7 +171363,7 @@ DEST[MAXVL-1:256] ← 0
SUBPD—Subtract Packed Double-Precision Floating-Point Values
-Vol. 2B 4-653
+Vol. 2B 4-651
INSTRUCTION SET REFERENCE, M-U
@@ -171504,7 +171394,7 @@ Other Exceptions
VEX-encoded instructions, see Exceptions Type 2.
EVEX-encoded instructions, see Exceptions Type E2.
-4-654 Vol. 2B
+4-652 Vol. 2B
SUBPD—Subtract Packed Double-Precision Floating-Point Values
@@ -171530,17 +171420,17 @@ SSE
NP 0F 5C /r
SUBPS xmm1, xmm2/m128
-VEX.NDS.128.0F.WIG 5C /r
+VEX.128.0F.WIG 5C /r
VSUBPS xmm1,xmm2, xmm3/m128
-VEX.NDS.256.0F.WIG 5C /r
+VEX.256.0F.WIG 5C /r
VSUBPS ymm1, ymm2, ymm3/m256
-EVEX.NDS.128.0F.W0 5C /r
+EVEX.128.0F.W0 5C /r
VSUBPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.0F.W0 5C /r
+EVEX.256.0F.W0 5C /r
VSUBPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.0F.W0 5C /r
+EVEX.512.0F.W0 5C /r
VSUBPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst{er}
@@ -171657,7 +171547,7 @@ register destination are unmodified.
SUBPS—Subtract Packed Single-Precision Floating-Point Values
-Vol. 2B 4-655
+Vol. 2B 4-653
INSTRUCTION SET REFERENCE, M-U
@@ -171717,7 +171607,7 @@ DEST[191:160] ← SRC1[191:160] - SRC2[191:160]
DEST[223:192] ← SRC1[223:192] - SRC2[223:192]
DEST[255:224] ← SRC1[255:224] - SRC2[255:224].
DEST[MAXVL-1:256] ← 0
-4-656 Vol. 2B
+4-654 Vol. 2B
SUBPS—Subtract Packed Single-Precision Floating-Point Values
@@ -171756,7 +171646,7 @@ EVEX-encoded instructions, see Exceptions Type E2.
SUBPS—Subtract Packed Single-Precision Floating-Point Values
-Vol. 2B 4-657
+Vol. 2B 4-655
INSTRUCTION SET REFERENCE, M-U
@@ -171780,9 +171670,9 @@ SSE2
F2 0F 5C /r
SUBSD xmm1, xmm2/m64
-VEX.NDS.LIG.F2.0F.WIG 5C /r
+VEX.LIG.F2.0F.WIG 5C /r
VSUBSD xmm1,xmm2, xmm3/m64
-EVEX.NDS.LIG.F2.0F.W1 5C /r
+EVEX.LIG.F2.0F.W1 5C /r
VSUBSD xmm1 {k1}{z}, xmm2,
xmm3/m64{er}
@@ -171869,7 +171759,7 @@ EVEX encoded version: The low quadword element of the destination operand is upd
writemask.
Software should ensure VSUBSD is encoded with VEX.L=0. Encoding VSUBSD with VEX.L=1 may encounter unpredictable behavior across different processor generations.
-4-658 Vol. 2B
+4-656 Vol. 2B
SUBSD—Subtract Scalar Double-Precision Floating-Point Value
@@ -171919,7 +171809,7 @@ EVEX-encoded instructions, see Exceptions Type E3.
SUBSD—Subtract Scalar Double-Precision Floating-Point Value
-Vol. 2B 4-659
+Vol. 2B 4-657
INSTRUCTION SET REFERENCE, M-U
@@ -171943,9 +171833,9 @@ SSE
F3 0F 5C /r
SUBSS xmm1, xmm2/m32
-VEX.NDS.LIG.F3.0F.WIG 5C /r
+VEX.LIG.F3.0F.WIG 5C /r
VSUBSS xmm1,xmm2, xmm3/m32
-EVEX.NDS.LIG.F3.0F.W0 5C /r
+EVEX.LIG.F3.0F.W0 5C /r
VSUBSS xmm1 {k1}{z}, xmm2,
xmm3/m32{er}
@@ -172032,7 +171922,7 @@ EVEX encoded version: The low doubleword element of the destination operand is u
writemask.
Software should ensure VSUBSS is encoded with VEX.L=0. Encoding VSUBSS with VEX.L=1 may encounter unpredictable behavior across different processor generations.
-4-660 Vol. 2B
+4-658 Vol. 2B
SUBSS—Subtract Scalar Single-Precision Floating-Point Value
@@ -172082,7 +171972,7 @@ EVEX-encoded instructions, see Exceptions Type E3.
SUBSS—Subtract Scalar Single-Precision Floating-Point Value
-Vol. 2B 4-661
+Vol. 2B 4-659
INSTRUCTION SET REFERENCE, M-U
@@ -172179,7 +172069,7 @@ If Mode
Virtual-8086 Mode Exceptions
#UD
-4-662 Vol. 2B
+4-660 Vol. 2B
If Mode
@@ -172209,7 +172099,7 @@ If the LOCK prefix is used.
SWAPGS—Swap GS Base Register
-Vol. 2B 4-663
+Vol. 2B 4-661
INSTRUCTION SET REFERENCE, M-U
@@ -172277,6 +172167,10 @@ pointer, it is the responsibility of software to save the previous value of the
to executing SYSCALL, with software restoring the stack pointer with the instruction following SYSCALL (which will
be executed after SYSRET). Alternatively, the OS system-call handler may save the stack pointer and restore it
before executing SYSRET.
+Instruction ordering. Instructions following a SYSCALL may be fetched from memory before earlier instructions
+complete execution, but they will not execute (even speculatively) until all instructions prior to the SYSCALL have
+completed execution (the later instructions may execute before data stored by the earlier instructions have become
+globally visible).
Operation
IF (CS.L ≠ 1 ) or (IA32_EFER.LMA ≠ 1) or (IA32_EFER.SCE ≠ 1)
@@ -172299,18 +172193,20 @@ CS.Type ← 11;
CS.S ← 1;
CS.DPL ← 0;
CS.P ← 1;
+4-662 Vol. 2B
+
+SYSCALL—Fast System Call
+
+ INSTRUCTION SET REFERENCE, M-U
+
CS.L ← 1;
-(* Entry is to 64-bit mode *)
CS.D ← 0;
-(* Required if CS.L = 1 *)
CS.G ← 1;
-(* 4-KByte granularity *)
CPL ← 0;
-4-664 Vol. 2B
-
-SYSCALL—Fast System Call
- INSTRUCTION SET REFERENCE, M-U
+(* Entry is to 64-bit mode *)
+(* Required if CS.L = 1 *)
+(* 4-KByte granularity *)
SS.Selector ← IA32_STAR[47:32] + 8;
(* Set rest of SS to a fixed value *)
@@ -172362,7 +172258,7 @@ If the LOCK prefix is used.
SYSCALL—Fast System Call
-Vol. 2B 4-665
+Vol. 2B 4-663
INSTRUCTION SET REFERENCE, M-U
@@ -172468,7 +172364,7 @@ required return IP and processor state information if a return to the calling pr
operating system or executive procedures called with SYSENTER instructions must have access to and use this
saved return and state information when returning to the user code.
-4-666 Vol. 2B
+4-664 Vol. 2B
SYSENTER—Fast System Call
@@ -172488,6 +172384,10 @@ SYSENTER/SYSEXIT_Supported; FI;
FI;
When the CPUID instruction is executed on the Pentium Pro processor (model 1), the processor returns the SEP
flag as set, but does not support the SYSENTER/SYSEXIT instructions.
+Instruction ordering. Instructions following a SYSENTER may be fetched from memory before earlier instructions
+complete execution, but they will not execute (even speculatively) until all instructions prior to the SYSENTER have
+completed execution (the later instructions may execute before data stored by the earlier instructions have
+become globally visible).
Operation
IF CR0.PE = 0 OR IA32_SYSENTER_CS[15:2] = 0 THEN #GP(0); FI;
@@ -172530,26 +172430,27 @@ CS.G ← 1;
(* 4-KByte granularity *)
CPL ← 0;
SS.Selector ← CS.Selector + 8;
-(* Set rest of SS to a fixed value *)
-SS.Base ← 0;
-SS.Limit ← FFFFFH;
-SS.Type ← 3;
SYSENTER—Fast System Call
(* SS just above CS *)
-(* Flat segment *)
-(* With 4-KByte granularity, implies a 4-GByte limit *)
-(* Read/write data, accessed *)
-Vol. 2B 4-667
+Vol. 2B 4-665
INSTRUCTION SET REFERENCE, M-U
+(* Set rest of SS to a fixed value *)
+SS.Base ← 0;
+SS.Limit ← FFFFFH;
+SS.Type ← 3;
SS.S ← 1;
SS.DPL ← 0;
SS.P ← 1;
SS.B ← 1;
SS.G ← 1;
+(* Flat segment *)
+(* With 4-KByte granularity, implies a 4-GByte limit *)
+(* Read/write data, accessed *)
+
(* 32-bit stack segment*)
(* 4-KByte granularity *)
@@ -172583,7 +172484,7 @@ Same exceptions as in protected mode.
64-Bit Mode Exceptions
Same exceptions as in protected mode.
-4-668 Vol. 2B
+4-666 Vol. 2B
SYSENTER—Fast System Call
@@ -172698,10 +172599,15 @@ When the CPUID instruction is executed on the Pentium Pro processor (model 1), t
flag as set, but does not support the SYSENTER/SYSEXIT instructions.
SYSEXIT—Fast Return from Fast System Call
-Vol. 2B 4-669
+Vol. 2B 4-667
INSTRUCTION SET REFERENCE, M-U
+Instruction ordering. Instructions following a SYSEXIT may be fetched from memory before earlier instructions
+complete execution, but they will not execute (even speculatively) until all instructions prior to the SYSEXIT have
+completed execution (the later instructions may execute before data stored by the earlier instructions have become
+globally visible).
+
Operation
IF IA32_SYSENTER_CS[15:2] = 0 OR CR0.PE = 0 OR CPL ≠ 0 THEN #GP(0); FI;
IF operand size is 64-bit
@@ -172769,6 +172675,12 @@ SS.G ← 1;
Flags Affected
None.
+4-668 Vol. 2B
+
+SYSEXIT—Fast Return from Fast System Call
+
+ INSTRUCTION SET REFERENCE, M-U
+
Protected Mode Exceptions
#GP(0)
@@ -172777,16 +172689,10 @@ If CPL
#UD
-4-670 Vol. 2B
-
≠ 0.
If the LOCK prefix is used.
-SYSEXIT—Fast Return from Fast System Call
-
- INSTRUCTION SET REFERENCE, M-U
-
Real-Address Mode Exceptions
#GP
@@ -172819,7 +172725,7 @@ If the LOCK prefix is used.
SYSEXIT—Fast Return from Fast System Call
-Vol. 2B 4-671
+Vol. 2B 4-669
INSTRUCTION SET REFERENCE, M-U
@@ -172918,9 +172824,14 @@ canonical. The OS can address this possibility using one or more of the followin
— Using paging to ensure that the SYSCALL instruction will never save a non-canonical value into RCX.
— Using the IST mechanism for gate 13 (#GP) in the IDT.
+Instruction ordering. Instructions following a SYSRET may be fetched from memory before earlier instructions
+complete execution, but they will not execute (even speculatively) until all instructions prior to the SYSRET have
+completed execution (the later instructions may execute before data stored by the earlier instructions have become
+globally visible).
+
1. Regardless of the value of R11, the RF and VM flags are always 0 in RFLAGS after execution of SYSRET. In addition, all reserved bits
in RFLAGS retain the fixed values.
-4-672 Vol. 2B
+4-670 Vol. 2B
SYSRET—Return From Fast System Call
@@ -172939,7 +172850,7 @@ ELSE (* Return to Compatibility Mode *)
RIP ← ECX;
FI;
RFLAGS ← (R11 & 3C7FD7H) | 2;
-(* Clear RF, VM, reserved bits; set bit 2 *)
+(* Clear RF, VM, reserved bits; set bit 1 *)
IF (operand size is 64-bit)
THEN CS.Selector ← IA32_STAR[63:48]+16;
ELSE CS.Selector ← IA32_STAR[63:48];
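The comment corrected in this hunk can be checked against the mask itself. The sketch below assumes only the expression quoted above, `(R11 & 3C7FD7H) | 2`; the names `sysret_rflags`, `RF`, and `VM` are hypothetical labels for illustration, with RF and VM taken to be bits 16 and 17 of RFLAGS.

```python
RFLAGS_SYSRET_MASK = 0x3C7FD7  # mask quoted in the SYSRET pseudocode
RF = 1 << 16                   # resume flag
VM = 1 << 17                   # virtual-8086 mode flag


def sysret_rflags(r11):
    """Model RFLAGS <- (R11 & 3C7FD7H) | 2."""
    return (r11 & RFLAGS_SYSRET_MASK) | 2
```

The constant 2 is `1 << 1`, so the OR forces bit 1 (not bit 2) to one, which is what the November 2018 wording says. The mask has bits 16 and 17 clear, so RF and VM are always zero in the result.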
@@ -173000,7 +172911,7 @@ The SYSRET instruction is not recognized in protected mode.
SYSRET—Return From Fast System Call
-Vol. 2B 4-673
+Vol. 2B 4-671
INSTRUCTION SET REFERENCE, M-U
@@ -173033,7 +172944,7 @@ If CPL
If the return is to 64-bit mode and RCX contains a non-canonical address.
-4-674 Vol. 2B
+4-672 Vol. 2B
SYSRET—Return From Fast System Call
@@ -173288,7 +173199,7 @@ section for encoding data and limits.
TEST—Logical Compare
-Vol. 2B 4-675
+Vol. 2B 4-673
INSTRUCTION SET REFERENCE, M-U
@@ -173390,7 +173301,7 @@ current privilege level is 3.
If the LOCK prefix is used.
-4-676 Vol. 2B
+4-674 Vol. 2B
TEST—Logical Compare
@@ -173506,7 +173417,7 @@ unsigned __int64 _tzcnt_u64(unsigned __int64 src);
TZCNT — Count the Number of Trailing Zero Bits
-Vol. 2B 4-677
+Vol. 2B 4-675
INSTRUCTION SET REFERENCE, M-U
@@ -173578,7 +173489,7 @@ For a page fault.
If alignment checking is enabled and an unaligned memory reference is made while the
current privilege level is 3.
-4-678 Vol. 2B
+4-676 Vol. 2B
TZCNT — Count the Number of Trailing Zero Bits
@@ -173696,7 +173607,7 @@ OF, AF, SF ← 0; }
UCOMISD—Unordered Compare Scalar Double-Precision Floating-Point Values and Set EFLAGS
-Vol. 2B 4-679
+Vol. 2B 4-677
INSTRUCTION SET REFERENCE, M-U
@@ -173718,7 +173629,7 @@ If VEX.vvvv != 1111B.
EVEX-encoded instructions, see Exceptions Type E3NF.
-4-680 Vol. 2B
+4-678 Vol. 2B
UCOMISD—Unordered Compare Scalar Double-Precision Floating-Point Values and Set EFLAGS
@@ -173832,7 +173743,7 @@ OF, AF, SF ← 0; }
UCOMISS—Unordered Compare Scalar Single-Precision Floating-Point Values and Set EFLAGS
-Vol. 2B 4-681
+Vol. 2B 4-679
INSTRUCTION SET REFERENCE, M-U
@@ -173863,7 +173774,7 @@ If VEX.vvvv != 1111B.
EVEX-encoded instructions, see Exceptions Type E3NF.
-4-682 Vol. 2B
+4-680 Vol. 2B
UCOMISS—Unordered Compare Scalar Single-Precision Floating-Point Values and Set EFLAGS
@@ -173975,7 +173886,7 @@ UD—Undefined Instruction
Raises an invalid opcode exception in all operating modes.
-Vol. 2B 4-683
+Vol. 2B 4-681
INSTRUCTION SET REFERENCE, M-U
@@ -173999,19 +173910,19 @@ SSE2
66 0F 15 /r
UNPCKHPD xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 15 /r
+VEX.128.66.0F.WIG 15 /r
VUNPCKHPD xmm1,xmm2,
xmm3/m128
-VEX.NDS.256.66.0F.WIG 15 /r
+VEX.256.66.0F.WIG 15 /r
VUNPCKHPD ymm1,ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F.W1 15 /r
+EVEX.128.66.0F.W1 15 /r
VUNPCKHPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F.W1 15 /r
+EVEX.256.66.0F.W1 15 /r
VUNPCKHPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F.W1 15 /r
+EVEX.512.66.0F.W1 15 /r
VUNPCKHPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -174135,7 +174046,7 @@ EVEX.256 encoded version: The first source operand is a YMM register. The second
register, a 256-bit memory location, or a 256-bit vector broadcasted from a 64-bit memory location. The destination operand is a YMM register, conditionally updated using writemask k1.
EVEX.128 encoded version: The first source operand is a XMM register. The second source operand is a XMM
register, a 128-bit memory location, or a 128-bit vector broadcasted from a 64-bit memory location. The destination operand is a XMM register, conditionally updated using writemask k1.
-4-684 Vol. 2B
+4-682 Vol. 2B
UNPCKHPD—Unpack and Interleave High Packed Double-Precision Floating-Point Values
@@ -174176,7 +174087,7 @@ DEST[MAXVL-1:VL] ← 0
UNPCKHPD—Unpack and Interleave High Packed Double-Precision Floating-Point Values
-Vol. 2B 4-685
+Vol. 2B 4-683
INSTRUCTION SET REFERENCE, M-U
@@ -174232,7 +174143,7 @@ UNPCKHPD (128-bit Legacy SSE version)
DEST[63:0] SRC1[127:64]
DEST[127:64] SRC2[127:64]
DEST[MAXVL-1:128] (Unmodified)
-4-686 Vol. 2B
+4-684 Vol. 2B
UNPCKHPD—Unpack and Interleave High Packed Double-Precision Floating-Point Values
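The 128-bit legacy SSE operation quoted in this hunk's context lines amounts to taking the high quadword of each source. A minimal sketch, assuming each 128-bit register is represented as a hypothetical `(low, high)` pair of Python floats and using the made-up name `unpckhpd_128`:

```python
def unpckhpd_128(src1, src2):
    """Model UNPCKHPD on 128-bit operands.

    DEST[63:0]   <- SRC1[127:64]
    DEST[127:64] <- SRC2[127:64]
    Each operand is a (low, high) pair of double-precision values.
    """
    return (src1[1], src2[1])
```

Bits above 127 of the destination are not modeled here; the quoted pseudocode leaves them unmodified for the legacy SSE form.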
@@ -174256,7 +174167,7 @@ EVEX-encoded instructions, see Exceptions Type E4NF.
UNPCKHPD—Unpack and Interleave High Packed Double-Precision Floating-Point Values
-Vol. 2B 4-687
+Vol. 2B 4-685
INSTRUCTION SET REFERENCE, M-U
@@ -174280,13 +174191,13 @@ SSE
NP 0F 15 /r
UNPCKHPS xmm1, xmm2/m128
-VEX.NDS.128.0F.WIG 15 /r
+VEX.128.0F.WIG 15 /r
VUNPCKHPS xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.0F.WIG 15 /r
+VEX.256.0F.WIG 15 /r
VUNPCKHPS ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.0F.W0 15 /r
+EVEX.128.0F.W0 15 /r
VUNPCKHPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
@@ -174314,7 +174225,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.0F.W0 15 /r
+EVEX.256.0F.W0 15 /r
VUNPCKHPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
@@ -174325,7 +174236,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.0F.W0 15 /r
+EVEX.512.0F.W0 15 /r
VUNPCKHPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -174415,7 +174326,7 @@ the corresponding ZMM register destination are zeroed.
VEX.256 encoded version: The second source operand is a YMM register or a 256-bit memory location. The first
source operand and destination operands are YMM registers.
-4-688 Vol. 2B
+4-686 Vol. 2B
UNPCKHPS—Unpack and Interleave High Packed Single-Precision Floating-Point Values
@@ -174510,7 +174421,7 @@ FI;
UNPCKHPS—Unpack and Interleave High Packed Single-Precision Floating-Point Values
-Vol. 2B 4-689
+Vol. 2B 4-687
INSTRUCTION SET REFERENCE, M-U
@@ -174571,7 +174482,7 @@ THEN *DEST[i+31:i] remains unchanged*
ELSE *zeroing-masking*
; zeroing-masking
DEST[i+31:i] ← 0
-4-690 Vol. 2B
+4-688 Vol. 2B
UNPCKHPS—Unpack and Interleave High Packed Single-Precision Floating-Point Values
@@ -174621,7 +174532,7 @@ EVEX-encoded instructions, see Exceptions Type E4NF.
UNPCKHPS—Unpack and Interleave High Packed Single-Precision Floating-Point Values
-Vol. 2B 4-691
+Vol. 2B 4-689
INSTRUCTION SET REFERENCE, M-U
@@ -174645,19 +174556,19 @@ SSE2
66 0F 14 /r
UNPCKLPD xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 14 /r
+VEX.128.66.0F.WIG 14 /r
VUNPCKLPD xmm1,xmm2,
xmm3/m128
-VEX.NDS.256.66.0F.WIG 14 /r
+VEX.256.66.0F.WIG 14 /r
VUNPCKLPD ymm1,ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F.W1 14 /r
+EVEX.128.66.0F.W1 14 /r
VUNPCKLPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F.W1 14 /r
+EVEX.256.66.0F.W1 14 /r
VUNPCKLPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F.W1 14 /r
+EVEX.512.66.0F.W1 14 /r
VUNPCKLPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -174780,7 +174691,7 @@ register, a 256-bit memory location, or a 256-bit vector broadcasted from a 64-b
EVEX.128 encoded version: The first source operand is an XMM register. The second source operand is a XMM
register, a 128-bit memory location, or a 128-bit vector broadcasted from a 64-bit memory location. The destination operand is a XMM register, conditionally updated using writemask k1.
-4-692 Vol. 2B
+4-690 Vol. 2B
UNPCKLPD—Unpack and Interleave Low Packed Double-Precision Floating-Point Values
@@ -174821,7 +174732,7 @@ DEST[MAXVL-1:VL] ← 0
UNPCKLPD—Unpack and Interleave Low Packed Double-Precision Floating-Point Values
-Vol. 2B 4-693
+Vol. 2B 4-691
INSTRUCTION SET REFERENCE, M-U
@@ -174878,7 +174789,7 @@ DEST[63:0] ← SRC1[63:0]
DEST[127:64] ← SRC2[63:0]
DEST[MAXVL-1:128] (Unmodified)
-4-694 Vol. 2B
+4-692 Vol. 2B
UNPCKLPD—Unpack and Interleave Low Packed Double-Precision Floating-Point Values
@@ -174902,7 +174813,7 @@ EVEX-encoded instructions, see Exceptions Type E4NF.
UNPCKLPD—Unpack and Interleave Low Packed Double-Precision Floating-Point Values
-Vol. 2B 4-695
+Vol. 2B 4-693
INSTRUCTION SET REFERENCE, M-U
@@ -174926,19 +174837,19 @@ SSE
NP 0F 14 /r
UNPCKLPS xmm1, xmm2/m128
-VEX.NDS.128.0F.WIG 14 /r
+VEX.128.0F.WIG 14 /r
VUNPCKLPS xmm1,xmm2,
xmm3/m128
-VEX.NDS.256.0F.WIG 14 /r
+VEX.256.0F.WIG 14 /r
VUNPCKLPS
ymm1,ymm2,ymm3/m256
-EVEX.NDS.128.0F.W0 14 /r
+EVEX.128.0F.W0 14 /r
VUNPCKLPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.0F.W0 14 /r
+EVEX.256.0F.W0 14 /r
VUNPCKLPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.0F.W0 14 /r
+EVEX.512.0F.W0 14 /r
VUNPCKLPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -175056,7 +174967,7 @@ the corresponding ZMM register destination are zeroed.
VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM
register or a 256-bit memory location. The destination operand is a YMM register.
-4-696 Vol. 2B
+4-694 Vol. 2B
UNPCKLPS—Unpack and Interleave Low Packed Single-Precision Floating-Point Values
@@ -175152,7 +175063,7 @@ FOR j ← 0 TO KL-1
i ← j * 32
UNPCKLPS—Unpack and Interleave Low Packed Single-Precision Floating-Point Values
-Vol. 2B 4-697
+Vol. 2B 4-695
INSTRUCTION SET REFERENCE, M-U
@@ -175213,7 +175124,7 @@ ELSE *zeroing-masking*
DEST[i+31:i] ← 0
FI
FI;
-4-698 Vol. 2B
+4-696 Vol. 2B
UNPCKLPS—Unpack and Interleave Low Packed Single-Precision Floating-Point Values
@@ -175261,11 +175172,11 @@ EVEX-encoded instructions, see Exceptions Type E4NF.
UNPCKLPS—Unpack and Interleave Low Packed Single-Precision Floating-Point Values
-Vol. 2B 4-699
+Vol. 2B 4-697
INSTRUCTION SET REFERENCE, M-U
-4-700 Vol. 2B
+4-698 Vol. 2B
UNPCKLPS—Unpack and Interleave Low Packed Single-Precision Floating-Point Values
@@ -176002,10 +175913,10 @@ Flag
AVX512VL
AVX512F
-EVEX.NDS.128.66.0F3A.W0 03 /r ib
+EVEX.128.66.0F3A.W0 03 /r ib
VALIGND xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst, imm8
-EVEX.NDS.128.66.0F3A.W1 03 /r ib
+EVEX.128.66.0F3A.W1 03 /r ib
VALIGNQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst, imm8
@@ -176016,7 +175927,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F3A.W0 03 /r ib
+EVEX.256.66.0F3A.W0 03 /r ib
VALIGND ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst, imm8
@@ -176027,7 +175938,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F3A.W1 03 /r ib
+EVEX.256.66.0F3A.W1 03 /r ib
VALIGNQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst, imm8
@@ -176038,7 +175949,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F3A.W0 03 /r ib
+EVEX.512.66.0F3A.W0 03 /r ib
VALIGND zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst, imm8
@@ -176048,7 +175959,7 @@ V/V
AVX512F
-EVEX.NDS.512.66.0F3A.W1 03 /r ib
+EVEX.512.66.0F3A.W1 03 /r ib
VALIGNQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst, imm8
@@ -176262,22 +176173,22 @@ Flag
AVX512VL
AVX512F
-EVEX.NDS.128.66.0F38.W1 65 /r
+EVEX.128.66.0F38.W1 65 /r
VBLENDMPD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F38.W1 65 /r
+EVEX.256.66.0F38.W1 65 /r
VBLENDMPD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 65 /r
+EVEX.512.66.0F38.W1 65 /r
VBLENDMPD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst
-EVEX.NDS.128.66.0F38.W0 65 /r
+EVEX.128.66.0F38.W0 65 /r
VBLENDMPS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.NDS.256.66.0F38.W0 65 /r
+EVEX.256.66.0F38.W0 65 /r
VBLENDMPS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 65 /r
+EVEX.512.66.0F38.W0 65 /r
VBLENDMPS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst
@@ -181630,9 +181541,9 @@ Feature
Flag
AVX512F
-EVEX.NDS.LIG.F2.0F.W0 7B /r
+EVEX.LIG.F2.0F.W0 7B /r
VCVTUSI2SD xmm1, xmm2, r/m32
-EVEX.NDS.LIG.F2.0F.W1 7B /r
+EVEX.LIG.F2.0F.W1 7B /r
VCVTUSI2SD xmm1, xmm2, r/m64{er}
A
@@ -181742,9 +181653,9 @@ Feature
Flag
AVX512F
-EVEX.NDS.LIG.F3.0F.W0 7B /r
+EVEX.LIG.F3.0F.W0 7B /r
VCVTUSI2SS xmm1, xmm2, r/m32{er}
-EVEX.NDS.LIG.F3.0F.W1 7B /r
+EVEX.LIG.F3.0F.W1 7B /r
VCVTUSI2SS xmm1, xmm2, r/m64{er}
A
@@ -181856,11 +181767,11 @@ Flag
AVX512VL
AVX512BW
-EVEX.NDS.128.66.0F3A.W0 42 /r ib
+EVEX.128.66.0F3A.W0 42 /r ib
VDBPSADBW xmm1 {k1}{z}, xmm2,
xmm3/m128, imm8
-EVEX.NDS.256.66.0F3A.W0 42 /r ib
+EVEX.256.66.0F3A.W0 42 /r ib
VDBPSADBW ymm1 {k1}{z}, ymm2,
ymm3/m256, imm8
@@ -181871,7 +181782,7 @@ V/V
AVX512VL
AVX512BW
-EVEX.NDS.512.66.0F3A.W0 42 /r ib
+EVEX.512.66.0F3A.W0 42 /r ib
VDBPSADBW zmm1 {k1}{z}, zmm2,
zmm3/m512, imm8
@@ -183640,13 +183551,13 @@ Flag
AVX512VL
AVX512F
-EVEX.NDS.128.66.0F3A.W1 54 /r ib
+EVEX.128.66.0F3A.W1 54 /r ib
VFIXUPIMMPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst, imm8
-EVEX.NDS.256.66.0F3A.W1 54 /r ib
+EVEX.256.66.0F3A.W1 54 /r ib
VFIXUPIMMPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst, imm8
-EVEX.NDS.512.66.0F3A.W1 54 /r ib
+EVEX.512.66.0F3A.W1 54 /r ib
VFIXUPIMMPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst{sae}, imm8
@@ -183909,13 +183820,13 @@ Flag
AVX512VL
AVX512F
-EVEX.NDS.128.66.0F3A.W0 54 /r
+EVEX.128.66.0F3A.W0 54 /r
VFIXUPIMMPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst, imm8
-EVEX.NDS.256.66.0F3A.W0 54 /r
+EVEX.256.66.0F3A.W0 54 /r
VFIXUPIMMPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst, imm8
-EVEX.NDS.512.66.0F3A.W0 54 /r ib
+EVEX.512.66.0F3A.W0 54 /r ib
VFIXUPIMMPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst{sae}, imm8
@@ -184164,7 +184075,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F3A.W1 55 /r ib
+EVEX.LIG.66.0F3A.W1 55 /r ib
VFIXUPIMMSD xmm1 {k1}{z},
xmm2, xmm3/m64{sae}, imm8
@@ -184382,7 +184293,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F3A.W0 55 /r ib
+EVEX.LIG.66.0F3A.W0 55 /r ib
VFIXUPIMMSS xmm1 {k1}{z}, xmm2,
xmm3/m32{sae}, imm8
@@ -184600,49 +184511,49 @@ Instruction
Op/
En
-VEX.NDS.128.66.0F38.W1 98 /r
+VEX.128.66.0F38.W1 98 /r
VFMADD132PD xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W1 A8 /r
+VEX.128.66.0F38.W1 A8 /r
VFMADD213PD xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W1 B8 /r
+VEX.128.66.0F38.W1 B8 /r
VFMADD231PD xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.66.0F38.W1 98 /r
+VEX.256.66.0F38.W1 98 /r
VFMADD132PD ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.W1 A8 /r
+VEX.256.66.0F38.W1 A8 /r
VFMADD213PD ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.W1 B8 /r
+VEX.256.66.0F38.W1 B8 /r
VFMADD231PD ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F38.W1 98 /r
+EVEX.128.66.0F38.W1 98 /r
VFMADD132PD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.128.66.0F38.W1 A8 /r
+EVEX.128.66.0F38.W1 A8 /r
VFMADD213PD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.128.66.0F38.W1 B8 /r
+EVEX.128.66.0F38.W1 B8 /r
VFMADD231PD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F38.W1 98 /r
+EVEX.256.66.0F38.W1 98 /r
VFMADD132PD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.256.66.0F38.W1 A8 /r
+EVEX.256.66.0F38.W1 A8 /r
VFMADD213PD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.256.66.0F38.W1 B8 /r
+EVEX.256.66.0F38.W1 B8 /r
VFMADD231PD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 98 /r
+EVEX.512.66.0F38.W1 98 /r
VFMADD132PD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst{er}
-EVEX.NDS.512.66.0F38.W1 A8 /r
+EVEX.512.66.0F38.W1 A8 /r
VFMADD213PD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst{er}
-EVEX.NDS.512.66.0F38.W1 B8 /r
+EVEX.512.66.0F38.W1 B8 /r
VFMADD231PD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst{er}
@@ -185145,49 +185056,49 @@ Feature
Flag
FMA
-VEX.NDS.128.66.0F38.W0 98 /r
+VEX.128.66.0F38.W0 98 /r
VFMADD132PS xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W0 A8 /r
+VEX.128.66.0F38.W0 A8 /r
VFMADD213PS xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W0 B8 /r
+VEX.128.66.0F38.W0 B8 /r
VFMADD231PS xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.66.0F38.W0 98 /r
+VEX.256.66.0F38.W0 98 /r
VFMADD132PS ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.W0 A8 /r
+VEX.256.66.0F38.W0 A8 /r
VFMADD213PS ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.0 B8 /r
+VEX.256.66.0F38.0 B8 /r
VFMADD231PS ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F38.W0 98 /r
+EVEX.128.66.0F38.W0 98 /r
VFMADD132PS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.128.66.0F38.W0 A8 /r
+EVEX.128.66.0F38.W0 A8 /r
VFMADD213PS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.128.66.0F38.W0 B8 /r
+EVEX.128.66.0F38.W0 B8 /r
VFMADD231PS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.66.0F38.W0 98 /r
+EVEX.256.66.0F38.W0 98 /r
VFMADD132PS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.256.66.0F38.W0 A8 /r
+EVEX.256.66.0F38.W0 A8 /r
VFMADD213PS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.256.66.0F38.W0 B8 /r
+EVEX.256.66.0F38.W0 B8 /r
VFMADD231PS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 98 /r
+EVEX.512.66.0F38.W0 98 /r
VFMADD132PS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst{er}
-EVEX.NDS.512.66.0F38.W0 A8 /r
+EVEX.512.66.0F38.W0 A8 /r
VFMADD213PS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst{er}
-EVEX.NDS.512.66.0F38.W0 B8 /r
+EVEX.512.66.0F38.W0 B8 /r
VFMADD231PS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst{er}
@@ -185679,22 +185590,22 @@ Feature
Flag
FMA
-VEX.DDS.LIG.66.0F38.W1 99 /r
+VEX.LIG.66.0F38.W1 99 /r
VFMADD132SD xmm1, xmm2,
xmm3/m64
-VEX.DDS.LIG.66.0F38.W1 A9 /r
+VEX.LIG.66.0F38.W1 A9 /r
VFMADD213SD xmm1, xmm2,
xmm3/m64
-VEX.DDS.LIG.66.0F38.W1 B9 /r
+VEX.LIG.66.0F38.W1 B9 /r
VFMADD231SD xmm1, xmm2,
xmm3/m64
-EVEX.DDS.LIG.66.0F38.W1 99 /r
+EVEX.LIG.66.0F38.W1 99 /r
VFMADD132SD xmm1 {k1}{z}, xmm2,
xmm3/m64{er}
-EVEX.DDS.LIG.66.0F38.W1 A9 /r
+EVEX.LIG.66.0F38.W1 A9 /r
VFMADD213SD xmm1 {k1}{z}, xmm2,
xmm3/m64{er}
-EVEX.DDS.LIG.66.0F38.W1 B9 /r
+EVEX.LIG.66.0F38.W1 B9 /r
VFMADD231SD xmm1 {k1}{z}, xmm2,
xmm3/m64{er}
@@ -185937,22 +185848,22 @@ Feature
Flag
FMA
-VEX.DDS.LIG.66.0F38.W0 99 /r
+VEX.LIG.66.0F38.W0 99 /r
VFMADD132SS xmm1, xmm2,
xmm3/m32
-VEX.DDS.LIG.66.0F38.W0 A9 /r
+VEX.LIG.66.0F38.W0 A9 /r
VFMADD213SS xmm1, xmm2,
xmm3/m32
-VEX.DDS.LIG.66.0F38.W0 B9 /r
+VEX.LIG.66.0F38.W0 B9 /r
VFMADD231SS xmm1, xmm2,
xmm3/m32
-EVEX.DDS.LIG.66.0F38.W0 99 /r
+EVEX.LIG.66.0F38.W0 99 /r
VFMADD132SS xmm1 {k1}{z}, xmm2,
xmm3/m32{er}
-EVEX.DDS.LIG.66.0F38.W0 A9 /r
+EVEX.LIG.66.0F38.W0 A9 /r
VFMADD213SS xmm1 {k1}{z}, xmm2,
xmm3/m32{er}
-EVEX.DDS.LIG.66.0F38.W0 B9 /r
+EVEX.LIG.66.0F38.W0 B9 /r
VFMADD231SS xmm1 {k1}{z}, xmm2,
xmm3/m32{er}
@@ -186200,25 +186111,25 @@ Feature
Flag
FMA
-VEX.DDS.128.66.0F38.W1 96 /r
+VEX.128.66.0F38.W1 96 /r
VFMADDSUB132PD xmm1, xmm2,
xmm3/m128
-VEX.DDS.128.66.0F38.W1 A6 /r
+VEX.128.66.0F38.W1 A6 /r
VFMADDSUB213PD xmm1, xmm2,
xmm3/m128
-VEX.DDS.128.66.0F38.W1 B6 /r
+VEX.128.66.0F38.W1 B6 /r
VFMADDSUB231PD xmm1, xmm2,
xmm3/m128
-VEX.DDS.256.66.0F38.W1 96 /r
+VEX.256.66.0F38.W1 96 /r
VFMADDSUB132PD ymm1, ymm2,
ymm3/m256
-VEX.DDS.256.66.0F38.W1 A6 /r
+VEX.256.66.0F38.W1 A6 /r
VFMADDSUB213PD ymm1, ymm2,
ymm3/m256
-VEX.DDS.256.66.0F38.W1 B6 /r
+VEX.256.66.0F38.W1 B6 /r
VFMADDSUB231PD ymm1, ymm2,
ymm3/m256
-EVEX.DDS.128.66.0F38.W1 A6 /r
+EVEX.128.66.0F38.W1 A6 /r
VFMADDSUB213PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
@@ -186259,7 +186170,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.128.66.0F38.W1 B6 /r
+EVEX.128.66.0F38.W1 B6 /r
VFMADDSUB231PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
@@ -186270,7 +186181,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.128.66.0F38.W1 96 /r
+EVEX.128.66.0F38.W1 96 /r
VFMADDSUB132PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
@@ -186281,7 +186192,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.256.66.0F38.W1 A6 /r
+EVEX.256.66.0F38.W1 A6 /r
VFMADDSUB213PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
@@ -186292,7 +186203,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.256.66.0F38.W1 B6 /r
+EVEX.256.66.0F38.W1 B6 /r
VFMADDSUB231PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
@@ -186303,7 +186214,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.256.66.0F38.W1 96 /r
+EVEX.256.66.0F38.W1 96 /r
VFMADDSUB132PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
@@ -186381,10 +186292,10 @@ Feature
Flag
AVX512F
-EVEX.DDS.512.66.0F38.W1 A6 /r
+EVEX.512.66.0F38.W1 A6 /r
VFMADDSUB213PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
-EVEX.DDS.512.66.0F38.W1 B6 /r
+EVEX.512.66.0F38.W1 B6 /r
VFMADDSUB231PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
@@ -186394,7 +186305,7 @@ V/V
AVX512F
-EVEX.DDS.512.66.0F38.W1 96 /r
+EVEX.512.66.0F38.W1 96 /r
VFMADDSUB132PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
@@ -186825,25 +186736,25 @@ Feature
Flag
FMA
-VEX.DDS.128.66.0F38.W0 96 /r
+VEX.128.66.0F38.W0 96 /r
VFMADDSUB132PS xmm1, xmm2,
xmm3/m128
-VEX.DDS.128.66.0F38.W0 A6 /r
+VEX.128.66.0F38.W0 A6 /r
VFMADDSUB213PS xmm1, xmm2,
xmm3/m128
-VEX.DDS.128.66.0F38.W0 B6 /r
+VEX.128.66.0F38.W0 B6 /r
VFMADDSUB231PS xmm1, xmm2,
xmm3/m128
-VEX.DDS.256.66.0F38.W0 96 /r
+VEX.256.66.0F38.W0 96 /r
VFMADDSUB132PS ymm1, ymm2,
ymm3/m256
-VEX.DDS.256.66.0F38.W0 A6 /r
+VEX.256.66.0F38.W0 A6 /r
VFMADDSUB213PS ymm1, ymm2,
ymm3/m256
-VEX.DDS.256.66.0F38.W0 B6 /r
+VEX.256.66.0F38.W0 B6 /r
VFMADDSUB231PS ymm1, ymm2,
ymm3/m256
-EVEX.DDS.128.66.0F38.W0 A6 /r
+EVEX.128.66.0F38.W0 A6 /r
VFMADDSUB213PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
@@ -186884,13 +186795,13 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.128.66.0F38.W0 B6 /r
+EVEX.128.66.0F38.W0 B6 /r
VFMADDSUB231PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.DDS.128.66.0F38.W0 96 /r
+EVEX.128.66.0F38.W0 96 /r
VFMADDSUB132PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.DDS.256.66.0F38.W0 A6 /r
+EVEX.256.66.0F38.W0 A6 /r
VFMADDSUB213PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
@@ -186915,13 +186826,13 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.256.66.0F38.W0 B6 /r
+EVEX.256.66.0F38.W0 B6 /r
VFMADDSUB231PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.DDS.256.66.0F38.W0 96 /r
+EVEX.256.66.0F38.W0 96 /r
VFMADDSUB132PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.DDS.512.66.0F38.W0 A6 /r
+EVEX.512.66.0F38.W0 A6 /r
VFMADDSUB213PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
@@ -186945,10 +186856,10 @@ V/V
AVX512F
-EVEX.DDS.512.66.0F38.W0 B6 /r
+EVEX.512.66.0F38.W0 B6 /r
VFMADDSUB231PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
-EVEX.DDS.512.66.0F38.W0 96 /r
+EVEX.512.66.0F38.W0 96 /r
VFMADDSUB132PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
@@ -187431,25 +187342,25 @@ Feature
Flag
FMA
-VEX.DDS.128.66.0F38.W1 97 /r
+VEX.128.66.0F38.W1 97 /r
VFMSUBADD132PD xmm1, xmm2,
xmm3/m128
-VEX.DDS.128.66.0F38.W1 A7 /r
+VEX.128.66.0F38.W1 A7 /r
VFMSUBADD213PD xmm1, xmm2,
xmm3/m128
-VEX.DDS.128.66.0F38.W1 B7 /r
+VEX.128.66.0F38.W1 B7 /r
VFMSUBADD231PD xmm1, xmm2,
xmm3/m128
-VEX.DDS.256.66.0F38.W1 97 /r
+VEX.256.66.0F38.W1 97 /r
VFMSUBADD132PD ymm1, ymm2,
ymm3/m256
-VEX.DDS.256.66.0F38.W1 A7 /r
+VEX.256.66.0F38.W1 A7 /r
VFMSUBADD213PD ymm1, ymm2,
ymm3/m256
-VEX.DDS.256.66.0F38.W1 B7 /r
+VEX.256.66.0F38.W1 B7 /r
VFMSUBADD231PD ymm1, ymm2,
ymm3/m256
-EVEX.DDS.128.66.0F38.W1 97 /r
+EVEX.128.66.0F38.W1 97 /r
VFMSUBADD132PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
@@ -187490,7 +187401,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.128.66.0F38.W1 A7 /r
+EVEX.128.66.0F38.W1 A7 /r
VFMSUBADD213PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
@@ -187501,7 +187412,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.128.66.0F38.W1 B7 /r
+EVEX.128.66.0F38.W1 B7 /r
VFMSUBADD231PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
@@ -187512,7 +187423,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.256.66.0F38.W1 97 /r
+EVEX.256.66.0F38.W1 97 /r
VFMSUBADD132PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
@@ -187523,7 +187434,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.256.66.0F38.W1 A7 /r
+EVEX.256.66.0F38.W1 A7 /r
VFMSUBADD213PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
@@ -187534,7 +187445,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.256.66.0F38.W1 B7 /r
+EVEX.256.66.0F38.W1 B7 /r
VFMSUBADD231PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
@@ -187612,10 +187523,10 @@ Feature
Flag
AVX512F
-EVEX.DDS.512.66.0F38.W1 97 /r
+EVEX.512.66.0F38.W1 97 /r
VFMSUBADD132PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
-EVEX.DDS.512.66.0F38.W1 A7 /r
+EVEX.512.66.0F38.W1 A7 /r
VFMSUBADD213PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
@@ -187625,7 +187536,7 @@ V/V
AVX512F
-EVEX.DDS.512.66.0F38.W1 B7 /r
+EVEX.512.66.0F38.W1 B7 /r
VFMSUBADD231PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
@@ -188056,28 +187967,28 @@ Feature
Flag
FMA
-VEX.DDS.128.66.0F38.W0 97 /r
+VEX.128.66.0F38.W0 97 /r
VFMSUBADD132PS xmm1, xmm2,
xmm3/m128
-VEX.DDS.128.66.0F38.W0 A7 /r
+VEX.128.66.0F38.W0 A7 /r
VFMSUBADD213PS xmm1, xmm2,
xmm3/m128
-VEX.DDS.128.66.0F38.W0 B7 /r
+VEX.128.66.0F38.W0 B7 /r
VFMSUBADD231PS xmm1, xmm2,
xmm3/m128
-VEX.DDS.256.66.0F38.W0 97 /r
+VEX.256.66.0F38.W0 97 /r
VFMSUBADD132PS ymm1, ymm2,
ymm3/m256
-VEX.DDS.256.66.0F38.W0 A7 /r
+VEX.256.66.0F38.W0 A7 /r
VFMSUBADD213PS ymm1, ymm2,
ymm3/m256
-VEX.DDS.256.66.0F38.W0 B7 /r
+VEX.256.66.0F38.W0 B7 /r
VFMSUBADD231PS ymm1, ymm2,
ymm3/m256
-EVEX.DDS.128.66.0F38.W0 97 /r
+EVEX.128.66.0F38.W0 97 /r
VFMSUBADD132PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.DDS.128.66.0F38.W0 A7 /r
+EVEX.128.66.0F38.W0 A7 /r
VFMSUBADD213PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
@@ -188125,13 +188036,13 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.128.66.0F38.W0 B7 /r
+EVEX.128.66.0F38.W0 B7 /r
VFMSUBADD231PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.DDS.256.66.0F38.W0 97 /r
+EVEX.256.66.0F38.W0 97 /r
VFMSUBADD132PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.DDS.256.66.0F38.W0 A7 /r
+EVEX.256.66.0F38.W0 A7 /r
VFMSUBADD213PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
@@ -188156,13 +188067,13 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.256.66.0F38.W0 B7 /r
+EVEX.256.66.0F38.W0 B7 /r
VFMSUBADD231PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.DDS.512.66.0F38.W0 97 /r
+EVEX.512.66.0F38.W0 97 /r
VFMSUBADD132PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
-EVEX.DDS.512.66.0F38.W0 A7 /r
+EVEX.512.66.0F38.W0 A7 /r
VFMSUBADD213PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
@@ -188185,7 +188096,7 @@ V/V
AVX512F
-EVEX.DDS.512.66.0F38.W0 B7 /r
+EVEX.512.66.0F38.W0 B7 /r
VFMSUBADD231PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
@@ -188668,49 +188579,49 @@ Feature
Flag
FMA
-VEX.NDS.128.66.0F38.W1 9A /r
+VEX.128.66.0F38.W1 9A /r
VFMSUB132PD xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W1 AA /r
+VEX.128.66.0F38.W1 AA /r
VFMSUB213PD xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W1 BA /r
+VEX.128.66.0F38.W1 BA /r
VFMSUB231PD xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.66.0F38.W1 9A /r
+VEX.256.66.0F38.W1 9A /r
VFMSUB132PD ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.W1 AA /r
+VEX.256.66.0F38.W1 AA /r
VFMSUB213PD ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.W1 BA /r
+VEX.256.66.0F38.W1 BA /r
VFMSUB231PD ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F38.W1 9A /r
+EVEX.128.66.0F38.W1 9A /r
VFMSUB132PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
-EVEX.NDS.128.66.0F38.W1 AA /r
+EVEX.128.66.0F38.W1 AA /r
VFMSUB213PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
-EVEX.NDS.128.66.0F38.W1 BA /r
+EVEX.128.66.0F38.W1 BA /r
VFMSUB231PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F38.W1 9A /r
+EVEX.256.66.0F38.W1 9A /r
VFMSUB132PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
-EVEX.NDS.256.66.0F38.W1 AA /r
+EVEX.256.66.0F38.W1 AA /r
VFMSUB213PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
-EVEX.NDS.256.66.0F38.W1 BA /r
+EVEX.256.66.0F38.W1 BA /r
VFMSUB231PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 9A /r
+EVEX.512.66.0F38.W1 9A /r
VFMSUB132PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
-EVEX.NDS.512.66.0F38.W1 AA /r
+EVEX.512.66.0F38.W1 AA /r
VFMSUB213PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
-EVEX.NDS.512.66.0F38.W1 BA /r
+EVEX.512.66.0F38.W1 BA /r
VFMSUB231PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
@@ -189188,49 +189099,49 @@ Instruction
Op/E
n
-VEX.NDS.128.66.0F38.W0 9A /r
+VEX.128.66.0F38.W0 9A /r
VFMSUB132PS xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W0 AA /r
+VEX.128.66.0F38.W0 AA /r
VFMSUB213PS xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W0 BA /r
+VEX.128.66.0F38.W0 BA /r
VFMSUB231PS xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.66.0F38.W0 9A /r
+VEX.256.66.0F38.W0 9A /r
VFMSUB132PS ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.W0 AA /r
+VEX.256.66.0F38.W0 AA /r
VFMSUB213PS ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.0 BA /r
+VEX.256.66.0F38.0 BA /r
VFMSUB231PS ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F38.W0 9A /r
+EVEX.128.66.0F38.W0 9A /r
VFMSUB132PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.NDS.128.66.0F38.W0 AA /r
+EVEX.128.66.0F38.W0 AA /r
VFMSUB213PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.NDS.128.66.0F38.W0 BA /r
+EVEX.128.66.0F38.W0 BA /r
VFMSUB231PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.NDS.256.66.0F38.W0 9A /r
+EVEX.256.66.0F38.W0 9A /r
VFMSUB132PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.NDS.256.66.0F38.W0 AA /r
+EVEX.256.66.0F38.W0 AA /r
VFMSUB213PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.NDS.256.66.0F38.W0 BA /r
+EVEX.256.66.0F38.W0 BA /r
VFMSUB231PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 9A /r
+EVEX.512.66.0F38.W0 9A /r
VFMSUB132PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
-EVEX.NDS.512.66.0F38.W0 AA /r
+EVEX.512.66.0F38.W0 AA /r
VFMSUB213PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
-EVEX.NDS.512.66.0F38.W0 BA /r
+EVEX.512.66.0F38.W0 BA /r
VFMSUB231PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
@@ -189731,22 +189642,22 @@ Feature
Flag
FMA
-VEX.DDS.LIG.66.0F38.W1 9B /r
+VEX.LIG.66.0F38.W1 9B /r
VFMSUB132SD xmm1, xmm2,
xmm3/m64
-VEX.DDS.LIG.66.0F38.W1 AB /r
+VEX.LIG.66.0F38.W1 AB /r
VFMSUB213SD xmm1, xmm2,
xmm3/m64
-VEX.DDS.LIG.66.0F38.W1 BB /r
+VEX.LIG.66.0F38.W1 BB /r
VFMSUB231SD xmm1, xmm2,
xmm3/m64
-EVEX.DDS.LIG.66.0F38.W1 9B /r
+EVEX.LIG.66.0F38.W1 9B /r
VFMSUB132SD xmm1 {k1}{z},
xmm2, xmm3/m64{er}
-EVEX.DDS.LIG.66.0F38.W1 AB /r
+EVEX.LIG.66.0F38.W1 AB /r
VFMSUB213SD xmm1 {k1}{z},
xmm2, xmm3/m64{er}
-EVEX.DDS.LIG.66.0F38.W1 BB /r
+EVEX.LIG.66.0F38.W1 BB /r
VFMSUB231SD xmm1 {k1}{z},
xmm2, xmm3/m64{er}
@@ -189994,22 +189905,22 @@ Feature
Flag
FMA
-VEX.DDS.LIG.66.0F38.W0 9B /r
+VEX.LIG.66.0F38.W0 9B /r
VFMSUB132SS xmm1, xmm2,
xmm3/m32
-VEX.DDS.LIG.66.0F38.W0 AB /r
+VEX.LIG.66.0F38.W0 AB /r
VFMSUB213SS xmm1, xmm2,
xmm3/m32
-VEX.DDS.LIG.66.0F38.W0 BB /r
+VEX.LIG.66.0F38.W0 BB /r
VFMSUB231SS xmm1, xmm2,
xmm3/m32
-EVEX.DDS.LIG.66.0F38.W0 9B /r
+EVEX.LIG.66.0F38.W0 9B /r
VFMSUB132SS xmm1 {k1}{z},
xmm2, xmm3/m32{er}
-EVEX.DDS.LIG.66.0F38.W0 AB /r
+EVEX.LIG.66.0F38.W0 AB /r
VFMSUB213SS xmm1 {k1}{z},
xmm2, xmm3/m32{er}
-EVEX.DDS.LIG.66.0F38.W0 BB /r
+EVEX.LIG.66.0F38.W0 BB /r
VFMSUB231SS xmm1 {k1}{z},
xmm2, xmm3/m32{er}
@@ -190258,25 +190169,25 @@ Feature
Flag
FMA
-VEX.NDS.128.66.0F38.W1 9C /r
+VEX.128.66.0F38.W1 9C /r
VFNMADD132PD xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W1 AC /r
+VEX.128.66.0F38.W1 AC /r
VFNMADD213PD xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W1 BC /r
+VEX.128.66.0F38.W1 BC /r
VFNMADD231PD xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.66.0F38.W1 9C /r
+VEX.256.66.0F38.W1 9C /r
VFNMADD132PD ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.W1 AC /r
+VEX.256.66.0F38.W1 AC /r
VFNMADD213PD ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.W1 BC /r
+VEX.256.66.0F38.W1 BC /r
VFNMADD231PD ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F38.W1 9C /r
+EVEX.128.66.0F38.W1 9C /r
VFNMADD132PD xmm0 {k1}{z},
xmm1, xmm2/m128/m64bcst
@@ -190317,10 +190228,10 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.128.66.0F38.W1 AC /r
+EVEX.128.66.0F38.W1 AC /r
VFNMADD213PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
-EVEX.NDS.128.66.0F38.W1 BC /r
+EVEX.128.66.0F38.W1 BC /r
VFNMADD231PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
@@ -190338,7 +190249,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F38.W1 9C /r
+EVEX.256.66.0F38.W1 9C /r
VFNMADD132PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
@@ -190349,10 +190260,10 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F38.W1 AC /r
+EVEX.256.66.0F38.W1 AC /r
VFNMADD213PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
-EVEX.NDS.256.66.0F38.W1 BC /r
+EVEX.256.66.0F38.W1 BC /r
VFNMADD231PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
@@ -190388,13 +190299,13 @@ V/V
AVX512F
-EVEX.NDS.512.66.0F38.W1 9C /r
+EVEX.512.66.0F38.W1 9C /r
VFNMADD132PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
-EVEX.NDS.512.66.0F38.W1 AC /r
+EVEX.512.66.0F38.W1 AC /r
VFNMADD213PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
-EVEX.NDS.512.66.0F38.W1 BC /r
+EVEX.512.66.0F38.W1 BC /r
VFNMADD231PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
@@ -190782,49 +190693,49 @@ Instruction
Op/
En
-VEX.NDS.128.66.0F38.W0 9C /r
+VEX.128.66.0F38.W0 9C /r
VFNMADD132PS xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W0 AC /r
+VEX.128.66.0F38.W0 AC /r
VFNMADD213PS xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W0 BC /r
+VEX.128.66.0F38.W0 BC /r
VFNMADD231PS xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.66.0F38.W0 9C /r
+VEX.256.66.0F38.W0 9C /r
VFNMADD132PS ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.W0 AC /r
+VEX.256.66.0F38.W0 AC /r
VFNMADD213PS ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.0 BC /r
+VEX.256.66.0F38.0 BC /r
VFNMADD231PS ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F38.W0 9C /r
+EVEX.128.66.0F38.W0 9C /r
VFNMADD132PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.NDS.128.66.0F38.W0 AC /r
+EVEX.128.66.0F38.W0 AC /r
VFNMADD213PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.NDS.128.66.0F38.W0 BC /r
+EVEX.128.66.0F38.W0 BC /r
VFNMADD231PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.NDS.256.66.0F38.W0 9C /r
+EVEX.256.66.0F38.W0 9C /r
VFNMADD132PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.NDS.256.66.0F38.W0 AC /r
+EVEX.256.66.0F38.W0 AC /r
VFNMADD213PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.NDS.256.66.0F38.W0 BC /r
+EVEX.256.66.0F38.W0 BC /r
VFNMADD231PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 9C /r
+EVEX.512.66.0F38.W0 9C /r
VFNMADD132PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
-EVEX.NDS.512.66.0F38.W0 AC /r
+EVEX.512.66.0F38.W0 AC /r
VFNMADD213PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
-EVEX.NDS.512.66.0F38.W0 BC /r
+EVEX.512.66.0F38.W0 BC /r
VFNMADD231PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
@@ -191313,22 +191224,22 @@ Feature
Flag
FMA
-VEX.DDS.LIG.66.0F38.W1 9D /r
+VEX.LIG.66.0F38.W1 9D /r
VFNMADD132SD xmm1, xmm2,
xmm3/m64
-VEX.DDS.LIG.66.0F38.W1 AD /r
+VEX.LIG.66.0F38.W1 AD /r
VFNMADD213SD xmm1, xmm2,
xmm3/m64
-VEX.DDS.LIG.66.0F38.W1 BD /r
+VEX.LIG.66.0F38.W1 BD /r
VFNMADD231SD xmm1, xmm2,
xmm3/m64
-EVEX.DDS.LIG.66.0F38.W1 9D /r
+EVEX.LIG.66.0F38.W1 9D /r
VFNMADD132SD xmm1 {k1}{z},
xmm2, xmm3/m64{er}
-EVEX.DDS.LIG.66.0F38.W1 AD /r
+EVEX.LIG.66.0F38.W1 AD /r
VFNMADD213SD xmm1 {k1}{z},
xmm2, xmm3/m64{er}
-EVEX.DDS.LIG.66.0F38.W1 BD /r
+EVEX.LIG.66.0F38.W1 BD /r
VFNMADD231SD xmm1 {k1}{z},
xmm2, xmm3/m64{er}
@@ -191572,22 +191483,22 @@ Feature
Flag
FMA
-VEX.DDS.LIG.66.0F38.W0 9D /r
+VEX.LIG.66.0F38.W0 9D /r
VFNMADD132SS xmm1, xmm2,
xmm3/m32
-VEX.DDS.LIG.66.0F38.W0 AD /r
+VEX.LIG.66.0F38.W0 AD /r
VFNMADD213SS xmm1, xmm2,
xmm3/m32
-VEX.DDS.LIG.66.0F38.W0 BD /r
+VEX.LIG.66.0F38.W0 BD /r
VFNMADD231SS xmm1, xmm2,
xmm3/m32
-EVEX.DDS.LIG.66.0F38.W0 9D /r
+EVEX.LIG.66.0F38.W0 9D /r
VFNMADD132SS xmm1 {k1}{z},
xmm2, xmm3/m32{er}
-EVEX.DDS.LIG.66.0F38.W0 AD /r
+EVEX.LIG.66.0F38.W0 AD /r
VFNMADD213SS xmm1 {k1}{z},
xmm2, xmm3/m32{er}
-EVEX.DDS.LIG.66.0F38.W0 BD /r
+EVEX.LIG.66.0F38.W0 BD /r
VFNMADD231SS xmm1 {k1}{z},
xmm2, xmm3/m32{er}
@@ -191833,25 +191744,25 @@ Feature
Flag
FMA
-VEX.NDS.128.66.0F38.W1 9E /r
+VEX.128.66.0F38.W1 9E /r
VFNMSUB132PD xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W1 AE /r
+VEX.128.66.0F38.W1 AE /r
VFNMSUB213PD xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W1 BE /r
+VEX.128.66.0F38.W1 BE /r
VFNMSUB231PD xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.66.0F38.W1 9E /r
+VEX.256.66.0F38.W1 9E /r
VFNMSUB132PD ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.W1 AE /r
+VEX.256.66.0F38.W1 AE /r
VFNMSUB213PD ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.W1 BE /r
+VEX.256.66.0F38.W1 BE /r
VFNMSUB231PD ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F38.W1 9E /r
+EVEX.128.66.0F38.W1 9E /r
VFNMSUB132PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
@@ -191892,10 +191803,10 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.128.66.0F38.W1 AE /r
+EVEX.128.66.0F38.W1 AE /r
VFNMSUB213PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
-EVEX.NDS.128.66.0F38.W1 BE /r
+EVEX.128.66.0F38.W1 BE /r
VFNMSUB231PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
@@ -191913,7 +191824,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F38.W1 9E /r
+EVEX.256.66.0F38.W1 9E /r
VFNMSUB132PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
@@ -191924,10 +191835,10 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F38.W1 AE /r
+EVEX.256.66.0F38.W1 AE /r
VFNMSUB213PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
-EVEX.NDS.256.66.0F38.W1 BE /r
+EVEX.256.66.0F38.W1 BE /r
VFNMSUB231PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
@@ -191963,13 +191874,13 @@ V/V
AVX512F
-EVEX.NDS.512.66.0F38.W1 9E /r
+EVEX.512.66.0F38.W1 9E /r
VFNMSUB132PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
-EVEX.NDS.512.66.0F38.W1 AE /r
+EVEX.512.66.0F38.W1 AE /r
VFNMSUB213PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
-EVEX.NDS.512.66.0F38.W1 BE /r
+EVEX.512.66.0F38.W1 BE /r
VFNMSUB231PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst{er}
@@ -192350,49 +192261,49 @@ Instruction
Op/
En
-VEX.NDS.128.66.0F38.W0 9E /r
+VEX.128.66.0F38.W0 9E /r
VFNMSUB132PS xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W0 AE /r
+VEX.128.66.0F38.W0 AE /r
VFNMSUB213PS xmm1, xmm2,
xmm3/m128
-VEX.NDS.128.66.0F38.W0 BE /r
+VEX.128.66.0F38.W0 BE /r
VFNMSUB231PS xmm1, xmm2,
xmm3/m128
-VEX.NDS.256.66.0F38.W0 9E /r
+VEX.256.66.0F38.W0 9E /r
VFNMSUB132PS ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.W0 AE /r
+VEX.256.66.0F38.W0 AE /r
VFNMSUB213PS ymm1, ymm2,
ymm3/m256
-VEX.NDS.256.66.0F38.0 BE /r
+VEX.256.66.0F38.0 BE /r
VFNMSUB231PS ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F38.W0 9E /r
+EVEX.128.66.0F38.W0 9E /r
VFNMSUB132PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.NDS.128.66.0F38.W0 AE /r
+EVEX.128.66.0F38.W0 AE /r
VFNMSUB213PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.NDS.128.66.0F38.W0 BE /r
+EVEX.128.66.0F38.W0 BE /r
VFNMSUB231PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.NDS.256.66.0F38.W0 9E /r
+EVEX.256.66.0F38.W0 9E /r
VFNMSUB132PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.NDS.256.66.0F38.W0 AE /r
+EVEX.256.66.0F38.W0 AE /r
VFNMSUB213PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.NDS.256.66.0F38.W0 BE /r
+EVEX.256.66.0F38.W0 BE /r
VFNMSUB231PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 9E /r
+EVEX.512.66.0F38.W0 9E /r
VFNMSUB132PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
-EVEX.NDS.512.66.0F38.W0 AE /r
+EVEX.512.66.0F38.W0 AE /r
VFNMSUB213PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
-EVEX.NDS.512.66.0F38.W0 BE /r
+EVEX.512.66.0F38.W0 BE /r
VFNMSUB231PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst{er}
@@ -192880,22 +192791,22 @@ Feature
Flag
FMA
-VEX.DDS.LIG.66.0F38.W1 9F /r
+VEX.LIG.66.0F38.W1 9F /r
VFNMSUB132SD xmm1, xmm2,
xmm3/m64
-VEX.DDS.LIG.66.0F38.W1 AF /r
+VEX.LIG.66.0F38.W1 AF /r
VFNMSUB213SD xmm1, xmm2,
xmm3/m64
-VEX.DDS.LIG.66.0F38.W1 BF /r
+VEX.LIG.66.0F38.W1 BF /r
VFNMSUB231SD xmm1, xmm2,
xmm3/m64
-EVEX.DDS.LIG.66.0F38.W1 9F /r
+EVEX.LIG.66.0F38.W1 9F /r
VFNMSUB132SD xmm1 {k1}{z},
xmm2, xmm3/m64{er}
-EVEX.DDS.LIG.66.0F38.W1 AF /r
+EVEX.LIG.66.0F38.W1 AF /r
VFNMSUB213SD xmm1 {k1}{z},
xmm2, xmm3/m64{er}
-EVEX.DDS.LIG.66.0F38.W1 BF /r
+EVEX.LIG.66.0F38.W1 BF /r
VFNMSUB231SD xmm1 {k1}{z},
xmm2, xmm3/m64{er}
@@ -193141,22 +193052,22 @@ Feature
Flag
FMA
-VEX.DDS.LIG.66.0F38.W0 9F /r
+VEX.LIG.66.0F38.W0 9F /r
VFNMSUB132SS xmm1, xmm2,
xmm3/m32
-VEX.DDS.LIG.66.0F38.W0 AF /r
+VEX.LIG.66.0F38.W0 AF /r
VFNMSUB213SS xmm1, xmm2,
xmm3/m32
-VEX.DDS.LIG.66.0F38.W0 BF /r
+VEX.LIG.66.0F38.W0 BF /r
VFNMSUB231SS xmm1, xmm2,
xmm3/m32
-EVEX.DDS.LIG.66.0F38.W0 9F /r
+EVEX.LIG.66.0F38.W0 9F /r
VFNMSUB132SS xmm1 {k1}{z},
xmm2, xmm3/m32{er}
-EVEX.DDS.LIG.66.0F38.W0 AF /r
+EVEX.LIG.66.0F38.W0 AF /r
VFNMSUB213SS xmm1 {k1}{z},
xmm2, xmm3/m32{er}
-EVEX.DDS.LIG.66.0F38.W0 BF /r
+EVEX.LIG.66.0F38.W0 BF /r
VFNMSUB231SS xmm1 {k1}{z},
xmm2, xmm3/m32{er}
@@ -194068,12 +193979,12 @@ Feature
Flag
AVX2
-VEX.DDS.128.66.0F38.W1 92 /r
+VEX.128.66.0F38.W1 92 /r
VGATHERDPD xmm1, vm32x, xmm2
Description
-VEX.DDS.128.66.0F38.W1 93 /r
+VEX.128.66.0F38.W1 93 /r
VGATHERQPD xmm1, vm64x, xmm2
RMV
@@ -194085,7 +193996,7 @@ AVX2
Using qword indices specified in vm64x, gather double-precision FP values from memory conditioned on mask specified by xmm2. Conditionally gathered elements are merged
into xmm1.
-VEX.DDS.256.66.0F38.W1 92 /r
+VEX.256.66.0F38.W1 92 /r
VGATHERDPD ymm1, vm32x, ymm2
RMV
@@ -194097,7 +194008,7 @@ AVX2
Using dword indices specified in vm32x, gather double-precision FP values from memory conditioned on mask specified by ymm2. Conditionally gathered elements are merged
into ymm1.
-VEX.DDS.256.66.0F38.W1 93 /r
+VEX.256.66.0F38.W1 93 /r
VGATHERQPD ymm1, vm64y, ymm2
RMV
@@ -194163,8 +194074,8 @@ Vol. 2C 5-245
VEX.128 version: The instruction will gather two double-precision floating-point values. For dword indices, only the
lower two indices in the vector index register are used.
-VEX.256 version: The instruction will gather four double-precision floating-point values. For dword indices, only
-the lower four indices in the vector index register are used.
+VEX.256 version: The instruction will gather four double-precision floating-point values. For dword indices, only the
+lower four indices in the vector index register are used.
Note that:
@@ -194220,6 +194131,7 @@ SCALE: scale factor encoded by SIB:[7:6];
DISP: optional 1, 4 byte displacement;
MASK ← SRC3;
VGATHERDPD (VEX.128 version)
+MASK[MAXVL-1:128] ← 0;
FOR j ← 0 to 1
i ← j * 64;
IF MASK[63+i] THEN
@@ -194237,10 +194149,9 @@ DEST[i +63:i] ← FETCH_64BITS(DATA_ADDR); // a fault exits the instruction
FI;
MASK[i +63: i] ← 0;
ENDFOR
-MASK[MAXVL-1:128] ← 0;
DEST[MAXVL-1:128] ← 0;
-(non-masked elements of the mask register have the content of respective element cleared)
VGATHERQPD (VEX.128 version)
+MASK[MAXVL-1:128] ← 0;
FOR j ← 0 to 1
i ← j * 64;
IF MASK[63+i] THEN
@@ -194257,9 +194168,7 @@ DEST[i +63:i] ← FETCH_64BITS(DATA_ADDR); // a fault exits this instruction
FI;
MASK[i +63: i] ← 0;
ENDFOR
-MASK[MAXVL-1:128] ← 0;
DEST[MAXVL-1:128] ← 0;
-(non-masked elements of the mask register have the content of respective element cleared)
VGATHERDPD/VGATHERQPD — Gather Packed DP FP Values Using Signed Dword/Qword Indices
@@ -194268,6 +194177,7 @@ Vol. 2C 5-247
INSTRUCTION SET REFERENCE, V-Z
VGATHERQPD (VEX.256 version)
+MASK[MAXVL-1:256] ← 0;
FOR j ← 0 to 3
i ← j * 64;
IF MASK[63+i] THEN
@@ -194284,8 +194194,9 @@ DEST[i +63:i] ← FETCH_64BITS(DATA_ADDR); // a fault exits the instruction
FI;
MASK[i +63: i] ← 0;
ENDFOR
-(non-masked elements of the mask register have the content of respective element cleared)
+DEST[MAXVL-1:256] ← 0;
VGATHERDPD (VEX.256 version)
+MASK[MAXVL-1:256] ← 0;
FOR j ← 0 to 3
i ← j * 64;
IF MASK[63+i] THEN
@@ -194303,7 +194214,7 @@ DEST[i +63:i] ← FETCH_64BITS(DATA_ADDR); // a fault exits the instruction
FI;
MASK[i +63:i] ← 0;
ENDFOR
-(non-masked elements of the mask register have the content of respective element cleared)
+DEST[MAXVL-1:256] ← 0;
5-248 Vol. 2C
@@ -194355,12 +194266,12 @@ Feature
Flag
AVX2
-VEX.DDS.128.66.0F38.W0 92 /r
+VEX.128.66.0F38.W0 92 /r
VGATHERDPS xmm1, vm32x, xmm2
Description
-VEX.DDS.128.66.0F38.W0 93 /r
+VEX.128.66.0F38.W0 93 /r
VGATHERQPS xmm1, vm64x, xmm2
A
@@ -194373,7 +194284,7 @@ Using qword indices specified in vm64x, gather single-precision FP values from m
by xmm2. Conditionally gathered elements are merged into
xmm1.
-VEX.DDS.256.66.0F38.W0 92 /r
+VEX.256.66.0F38.W0 92 /r
VGATHERDPS ymm1, vm32y, ymm2
A
@@ -194386,7 +194297,7 @@ Using dword indices specified in vm32y, gather single-precision FP values from m
by ymm2. Conditionally gathered elements are merged into
ymm1.
-VEX.DDS.256.66.0F38.W0 93 /r
+VEX.256.66.0F38.W0 93 /r
VGATHERQPS xmm1, vm64y, xmm2
A
@@ -194513,6 +194424,7 @@ Vol. 2C 5-251
INSTRUCTION SET REFERENCE, V-Z
VGATHERDPS (VEX.128 version)
+MASK[MAXVL-1:128] ← 0;
FOR j ← 0 to 3
i ← j * 32;
IF MASK[31+i] THEN
@@ -194521,7 +194433,6 @@ ELSE
MASK[i +31:i] ← 0;
FI;
ENDFOR
-MASK[MAXVL-1:128] ← 0;
FOR j ← 0 to 3
i ← j * 32;
DATA_ADDR ← BASE_ADDR + (SignExtend(VINDEX[i+31:i])*SCALE + DISP;
@@ -194531,8 +194442,8 @@ FI;
MASK[i +31:i] ← 0;
ENDFOR
DEST[MAXVL-1:128] ← 0;
-(non-masked elements of the mask register have the content of respective element cleared)
VGATHERQPS (VEX.128 version)
+MASK[MAXVL-1:64] ← 0;
FOR j ← 0 to 3
i ← j * 32;
IF MASK[31+i] THEN
@@ -194541,7 +194452,6 @@ ELSE
MASK[i +31:i] ← 0;
FI;
ENDFOR
-MASK[MAXVL-1:128] ← 0;
FOR j ← 0 to 1
k ← j * 64;
i ← j * 32;
@@ -194551,9 +194461,7 @@ DEST[i +31:i] ← FETCH_32BITS(DATA_ADDR); // a fault exits the instruction
FI;
MASK[i +31:i] ← 0;
ENDFOR
-MASK[127:64] ← 0;
DEST[MAXVL-1:64] ← 0;
-(non-masked elements of the mask register have the content of respective element cleared)
5-252 Vol. 2C
@@ -194562,6 +194470,7 @@ VGATHERDPS/VGATHERQPS — Gather Packed SP FP values Using Signed Dword/Qword In
INSTRUCTION SET REFERENCE, V-Z
VGATHERDPS (VEX.256 version)
+MASK[MAXVL-1:256]  0;
FOR j 0 to 7
i  j * 32;
IF MASK[31+i] THEN
@@ -194578,8 +194487,9 @@ DEST[i +31:i]  FETCH_32BITS(DATA_ADDR); // a fault exits the instruction
FI;
MASK[i +31:i]  0;
ENDFOR
-(non-masked elements of the mask register have the content of respective element cleared)
+DEST[MAXVL-1:256]  0;
VGATHERQPS (VEX.256 version)
+MASK[MAXVL-1:128]  0;
FOR j 0 to 7
i  j * 32;
IF MASK[31+i] THEN
@@ -194597,9 +194507,7 @@ DEST[i +31:i]  FETCH_32BITS(DATA_ADDR); // a fault exits the instruction
FI;
MASK[i +31:i]  0;
ENDFOR
-MASK[MAXVL-1:128]  0;
DEST[MAXVL-1:128]  0;
-(non-masked elements of the mask register have the content of respective element cleared)
VGATHERDPS/VGATHERQPS — Gather Packed SP FP values Using Signed Dword/Qword Indices
@@ -194820,7 +194728,7 @@ Operation
BASE_ADDR stands for the memory operand base address (a GPR); may not exist
VINDEX stands for the memory operand vector of indices (a vector register)
SCALE stands for the memory operand scalar (1, 2, 4 or 8)
-DISP is the optional 1, 2 or 4 byte displacement
+DISP is the optional 1 or 4 byte displacement
VGATHERDPS (EVEX encoded version)
(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j  0 TO KL-1
@@ -195043,9 +194951,9 @@ Operation
BASE_ADDR stands for the memory operand base address (a GPR); may not exist
VINDEX stands for the memory operand vector of indices (a ZMM register)
SCALE stands for the memory operand scalar (1, 2, 4 or 8)
-DISP is the optional 1, 2 or 4 byte displacement
+DISP is the optional 1 or 4 byte displacement
VGATHERQPS (EVEX encoded version)
-(KL, VL) = (2, 64), (4, 128), (8, 256)
+(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j  0 TO KL-1
i  j * 32
k  j * 64
@@ -195635,7 +195543,7 @@ Instruction
Op/
En
-EVEX.NDS.LIG.66.0F38.W1 43 /r
+EVEX.LIG.66.0F38.W1 43 /r
VGETEXPSD xmm1 {k1}{z},
xmm2, xmm3/m64{sae}
@@ -195743,7 +195651,7 @@ Instruction
Op/
En
-EVEX.NDS.LIG.66.0F38.W0 43 /r
+EVEX.LIG.66.0F38.W0 43 /r
VGETEXPSS xmm1 {k1}{z}, xmm2,
xmm3/m32{sae}
@@ -196447,7 +196355,7 @@ Instruction
Op/
En
-EVEX.NDS.LIG.66.0F3A.W1 27 /r ib
+EVEX.LIG.66.0F3A.W1 27 /r ib
VGETMANTSD xmm1 {k1}{z}, xmm2,
xmm3/m64{sae}, imm8
@@ -196572,7 +196480,7 @@ Instruction
Op/
En
-EVEX.NDS.LIG.66.0F3A.W0 27 /r ib
+EVEX.LIG.66.0F3A.W0 27 /r ib
VGETMANTSS xmm1 {k1}{z}, xmm2,
xmm3/m32{sae}, imm8
@@ -196709,25 +196617,25 @@ Feature
Flag
AVX
-VEX.NDS.256.66.0F3A.W0 18 /r ib
+VEX.256.66.0F3A.W0 18 /r ib
VINSERTF128 ymm1, ymm2,
xmm3/m128, imm8
-EVEX.NDS.256.66.0F3A.W0 18 /r ib
+EVEX.256.66.0F3A.W0 18 /r ib
VINSERTF32X4 ymm1 {k1}{z}, ymm2,
xmm3/m128, imm8
-EVEX.NDS.512.66.0F3A.W0 18 /r ib
+EVEX.512.66.0F3A.W0 18 /r ib
VINSERTF32X4 zmm1 {k1}{z}, zmm2,
xmm3/m128, imm8
-EVEX.NDS.256.66.0F3A.W1 18 /r ib
+EVEX.256.66.0F3A.W1 18 /r ib
VINSERTF64X2 ymm1 {k1}{z}, ymm2,
xmm3/m128, imm8
-EVEX.NDS.512.66.0F3A.W1 18 /r ib
+EVEX.512.66.0F3A.W1 18 /r ib
VINSERTF64X2 zmm1 {k1}{z}, zmm2,
xmm3/m128, imm8
-EVEX.NDS.512.66.0F3A.W0 1A /r ib
+EVEX.512.66.0F3A.W0 1A /r ib
VINSERTF32X8 zmm1 {k1}{z}, zmm2,
ymm3/m256, imm8
-EVEX.NDS.512.66.0F3A.W1 1A /r ib
+EVEX.512.66.0F3A.W1 1A /r ib
VINSERTF64X4 zmm1 {k1}{z}, zmm2,
ymm3/m256, imm8
@@ -197055,25 +196963,25 @@ Feature
Flag
AVX2
-VEX.NDS.256.66.0F3A.W0 38 /r ib
+VEX.256.66.0F3A.W0 38 /r ib
VINSERTI128 ymm1, ymm2,
xmm3/m128, imm8
-EVEX.NDS.256.66.0F3A.W0 38 /r ib
+EVEX.256.66.0F3A.W0 38 /r ib
VINSERTI32X4 ymm1 {k1}{z}, ymm2,
xmm3/m128, imm8
-EVEX.NDS.512.66.0F3A.W0 38 /r ib
+EVEX.512.66.0F3A.W0 38 /r ib
VINSERTI32X4 zmm1 {k1}{z}, zmm2,
xmm3/m128, imm8
-EVEX.NDS.256.66.0F3A.W1 38 /r ib
+EVEX.256.66.0F3A.W1 38 /r ib
VINSERTI64X2 ymm1 {k1}{z}, ymm2,
xmm3/m128, imm8
-EVEX.NDS.512.66.0F3A.W1 38 /r ib
+EVEX.512.66.0F3A.W1 38 /r ib
VINSERTI64X2 zmm1 {k1}{z}, zmm2,
xmm3/m128, imm8
-EVEX.NDS.512.66.0F3A.W0 3A /r ib
+EVEX.512.66.0F3A.W0 3A /r ib
VINSERTI32X8 zmm1 {k1}{z}, zmm2,
ymm3/m256, imm8
-EVEX.NDS.512.66.0F3A.W1 3A /r ib
+EVEX.512.66.0F3A.W1 3A /r ib
VINSERTI64X4 zmm1 {k1}{z}, zmm2,
ymm3/m256, imm8
@@ -197398,7 +197306,7 @@ Mode
Feature
Flag
-VEX.NDS.128.66.0F38.W0 2C /r
+VEX.128.66.0F38.W0 2C /r
RVM V/V
@@ -197457,19 +197365,19 @@ Conditionally store packed double-precision values from
ymm2 using mask in ymm1.
VMASKMOVPS xmm1, xmm2, m128
-VEX.NDS.256.66.0F38.W0 2C /r
+VEX.256.66.0F38.W0 2C /r
VMASKMOVPS ymm1, ymm2, m256
-VEX.NDS.128.66.0F38.W0 2D /r
+VEX.128.66.0F38.W0 2D /r
VMASKMOVPD xmm1, xmm2, m128
-VEX.NDS.256.66.0F38.W0 2D /r
+VEX.256.66.0F38.W0 2D /r
VMASKMOVPD ymm1, ymm2, m256
-VEX.NDS.128.66.0F38.W0 2E /r
+VEX.128.66.0F38.W0 2E /r
VMASKMOVPS m128, xmm1, xmm2
-VEX.NDS.256.66.0F38.W0 2E /r
+VEX.256.66.0F38.W0 2E /r
VMASKMOVPS m256, ymm1, ymm2
-VEX.NDS.128.66.0F38.W0 2F /r
+VEX.128.66.0F38.W0 2F /r
VMASKMOVPD m128, xmm1, xmm2
-VEX.NDS.256.66.0F38.W0 2F /r
+VEX.256.66.0F38.W0 2F /r
VMASKMOVPD m256, ymm1, ymm2
Description
@@ -197637,9 +197545,9 @@ Feature
Flag
AVX2
-VEX.NDS.128.66.0F3A.W0 02 /r ib
+VEX.128.66.0F3A.W0 02 /r ib
VPBLENDD xmm1, xmm2, xmm3/m128, imm8
-VEX.NDS.256.66.0F3A.W0 02 /r ib
+VEX.256.66.0F3A.W0 02 /r ib
VPBLENDD ymm1, ymm2, ymm3/m256, imm8
RVMI
@@ -197768,22 +197676,22 @@ Flag
AVX512VL
AVX512BW
-EVEX.NDS.128.66.0F38.W0 66 /r
+EVEX.128.66.0F38.W0 66 /r
VPBLENDMB xmm1 {k1}{z},
xmm2, xmm3/m128
-EVEX.NDS.256.66.0F38.W0 66 /r
+EVEX.256.66.0F38.W0 66 /r
VPBLENDMB ymm1 {k1}{z},
ymm2, ymm3/m256
-EVEX.NDS.512.66.0F38.W0 66 /r
+EVEX.512.66.0F38.W0 66 /r
VPBLENDMB zmm1 {k1}{z},
zmm2, zmm3/m512
-EVEX.NDS.128.66.0F38.W1 66 /r
+EVEX.128.66.0F38.W1 66 /r
VPBLENDMW xmm1 {k1}{z},
xmm2, xmm3/m128
-EVEX.NDS.256.66.0F38.W1 66 /r
+EVEX.256.66.0F38.W1 66 /r
VPBLENDMW ymm1 {k1}{z},
ymm2, ymm3/m256
-EVEX.NDS.512.66.0F38.W1 66 /r
+EVEX.512.66.0F38.W1 66 /r
VPBLENDMW zmm1 {k1}{z},
zmm2, zmm3/m512
@@ -197953,22 +197861,22 @@ Flag
AVX512VL
AVX512F
-EVEX.NDS.128.66.0F38.W0 64 /r
+EVEX.128.66.0F38.W0 64 /r
VPBLENDMD xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
-EVEX.NDS.256.66.0F38.W0 64 /r
+EVEX.256.66.0F38.W0 64 /r
VPBLENDMD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 64 /r
+EVEX.512.66.0F38.W0 64 /r
VPBLENDMD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
-EVEX.NDS.128.66.0F38.W1 64 /r
+EVEX.128.66.0F38.W1 64 /r
VPBLENDMQ xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F38.W1 64 /r
+EVEX.256.66.0F38.W1 64 /r
VPBLENDMQ ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 64 /r
+EVEX.512.66.0F38.W1 64 /r
VPBLENDMQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -199460,10 +199368,10 @@ Flag
AVX512VL
AVX512BW
-EVEX.NDS.128.66.0F3A.W0 3F /r ib
+EVEX.128.66.0F3A.W0 3F /r ib
VPCMPB k1 {k2}, xmm2,
xmm3/m128, imm8
-EVEX.NDS.256.66.0F3A.W0 3F /r ib
+EVEX.256.66.0F3A.W0 3F /r ib
A
@@ -199471,7 +199379,7 @@ V/V
VPCMPB k1 {k2}, ymm2,
ymm3/m256, imm8
-EVEX.NDS.512.66.0F3A.W0 3F /r ib
+EVEX.512.66.0F3A.W0 3F /r ib
VPCMPB k1 {k2}, zmm2,
zmm3/m512, imm8
@@ -199484,7 +199392,7 @@ V/V
AVX512BW
-EVEX.NDS.128.66.0F3A.W0 3E /r ib
+EVEX.128.66.0F3A.W0 3E /r ib
A
@@ -199492,7 +199400,7 @@ V/V
VPCMPUB k1 {k2}, xmm2,
xmm3/m128, imm8
-EVEX.NDS.256.66.0F3A.W0 3E /r ib
+EVEX.256.66.0F3A.W0 3E /r ib
AVX512VL
AVX512BW
@@ -199512,7 +199420,7 @@ AVX512BW
VPCMPUB k1 {k2}, ymm2,
ymm3/m256, imm8
-EVEX.NDS.512.66.0F3A.W0 3E /r ib
+EVEX.512.66.0F3A.W0 3E /r ib
VPCMPUB k1 {k2}, zmm2,
zmm3/m512, imm8
@@ -199725,10 +199633,10 @@ Flag
AVX512VL
AVX512F
-EVEX.NDS.128.66.0F3A.W0 1F /r ib
+EVEX.128.66.0F3A.W0 1F /r ib
VPCMPD k1 {k2}, xmm2,
xmm3/m128/m32bcst, imm8
-EVEX.NDS.256.66.0F3A.W0 1F /r ib
+EVEX.256.66.0F3A.W0 1F /r ib
VPCMPD k1 {k2}, ymm2,
ymm3/m256/m32bcst, imm8
@@ -199739,7 +199647,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F3A.W0 1F /r ib
+EVEX.512.66.0F3A.W0 1F /r ib
VPCMPD k1 {k2}, zmm2,
zmm3/m512/m32bcst, imm8
@@ -199749,7 +199657,7 @@ V/V
AVX512F
-EVEX.NDS.128.66.0F3A.W0 1E /r ib
+EVEX.128.66.0F3A.W0 1E /r ib
VPCMPUD k1 {k2}, xmm2,
xmm3/m128/m32bcst, imm8
@@ -199760,7 +199668,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F3A.W0 1E /r ib
+EVEX.256.66.0F3A.W0 1E /r ib
VPCMPUD k1 {k2}, ymm2,
ymm3/m256/m32bcst, imm8
@@ -199771,7 +199679,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F3A.W0 1E /r ib
+EVEX.512.66.0F3A.W0 1E /r ib
VPCMPUD k1 {k2}, zmm2,
zmm3/m512/m32bcst, imm8
@@ -199962,10 +199870,10 @@ Flag
AVX512VL
AVX512F
-EVEX.NDS.128.66.0F3A.W1 1F /r ib
+EVEX.128.66.0F3A.W1 1F /r ib
VPCMPQ k1 {k2}, xmm2,
xmm3/m128/m64bcst, imm8
-EVEX.NDS.256.66.0F3A.W1 1F /r ib
+EVEX.256.66.0F3A.W1 1F /r ib
VPCMPQ k1 {k2}, ymm2,
ymm3/m256/m64bcst, imm8
@@ -199976,7 +199884,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F3A.W1 1F /r ib
+EVEX.512.66.0F3A.W1 1F /r ib
VPCMPQ k1 {k2}, zmm2,
zmm3/m512/m64bcst, imm8
@@ -199986,7 +199894,7 @@ V/V
AVX512F
-EVEX.NDS.128.66.0F3A.W1 1E /r ib
+EVEX.128.66.0F3A.W1 1E /r ib
VPCMPUQ k1 {k2}, xmm2,
xmm3/m128/m64bcst, imm8
@@ -199997,7 +199905,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F3A.W1 1E /r ib
+EVEX.256.66.0F3A.W1 1E /r ib
VPCMPUQ k1 {k2}, ymm2,
ymm3/m256/m64bcst, imm8
@@ -200008,7 +199916,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F3A.W1 1E /r ib
+EVEX.512.66.0F3A.W1 1E /r ib
VPCMPUQ k1 {k2}, zmm2,
zmm3/m512/m64bcst, imm8
@@ -200199,10 +200107,10 @@ Flag
AVX512VL
AVX512BW
-EVEX.NDS.128.66.0F3A.W1 3F /r ib
+EVEX.128.66.0F3A.W1 3F /r ib
VPCMPW k1 {k2}, xmm2,
xmm3/m128, imm8
-EVEX.NDS.256.66.0F3A.W1 3F /r ib
+EVEX.256.66.0F3A.W1 3F /r ib
A
@@ -200210,7 +200118,7 @@ V/V
VPCMPW k1 {k2}, ymm2,
ymm3/m256, imm8
-EVEX.NDS.512.66.0F3A.W1 3F /r ib
+EVEX.512.66.0F3A.W1 3F /r ib
VPCMPW k1 {k2}, zmm2,
zmm3/m512, imm8
@@ -200223,7 +200131,7 @@ V/V
AVX512BW
-EVEX.NDS.128.66.0F3A.W1 3E /r ib
+EVEX.128.66.0F3A.W1 3E /r ib
A
@@ -200231,7 +200139,7 @@ V/V
VPCMPUW k1 {k2}, xmm2,
xmm3/m128, imm8
-EVEX.NDS.256.66.0F3A.W1 3E /r ib
+EVEX.256.66.0F3A.W1 3E /r ib
AVX512VL
AVX512BW
@@ -200934,7 +200842,7 @@ En
Mode
Support
-VEX.NDS.256.66.0F3A.W0 06 /r ib
+VEX.256.66.0F3A.W0 06 /r ib
VPERM2F128 ymm1, ymm2, ymm3/m256, imm8
RVMI V/V
@@ -201064,7 +200972,7 @@ Instruction
Op/
En
-VEX.NDS.256.66.0F3A.W0 46 /r ib
+VEX.256.66.0F3A.W0 46 /r ib
VPERM2I128 ymm1, ymm2, ymm3/m256, imm8
RVMI
@@ -201200,13 +201108,13 @@ bit Mode
Support
V/V
-EVEX.NDS.128.66.0F38.W0 8D /r
+EVEX.128.66.0F38.W0 8D /r
VPERMB xmm1 {k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F38.W0 8D /r
+EVEX.256.66.0F38.W0 8D /r
VPERMB ymm1 {k1}{z}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F38.W0 8D /r
+EVEX.512.66.0F38.W0 8D /r
VPERMB zmm1 {k1}{z}, zmm2,
zmm3/m512
@@ -201323,37 +201231,36 @@ Instruction
Op /
En
+A
+
+64/32
+bit Mode
+Support
+V/V
+
+CPUID
+Feature
+Flag
+AVX2
-VEX.NDS.256.66.0F38.W0 36 /r
+VEX.256.66.0F38.W0 36 /r
VPERMD ymm1, ymm2, ymm3/m256
-EVEX.NDS.256.66.0F38.W0 36 /r
+EVEX.256.66.0F38.W0 36 /r
VPERMD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 36 /r
+EVEX.512.66.0F38.W0 36 /r
VPERMD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
-EVEX.NDS.128.66.0F38.W1 8D /r
+EVEX.128.66.0F38.W1 8D /r
VPERMW xmm1 {k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F38.W1 8D /r
+EVEX.256.66.0F38.W1 8D /r
VPERMW ymm1 {k1}{z}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F38.W1 8D /r
+EVEX.512.66.0F38.W1 8D /r
VPERMW zmm1 {k1}{z}, zmm2,
zmm3/m512
-A
-
-64/32
-bit Mode
-Support
-V/V
-
-CPUID
-Feature
-Flag
-AVX2
-
B
V/V
@@ -201381,20 +201288,12 @@ V/V
AVX512VL
AVX512BW
-Permute word integers in ymm3/m256 using indexes
-in ymm2 and store the result in ymm1 using writemask
-k1.
-
C
V/V
AVX512BW
-Permute word integers in zmm3/m512 using indexes
-in zmm2 and store the result in zmm1 using writemask
-k1.
-
Description
Permute doublewords in ymm3/m256 using indices in
ymm2 and store the result in ymm1.
@@ -201407,6 +201306,12 @@ writemask k1.
Permute word integers in xmm3/m128 using indexes
in xmm2 and store the result in xmm1 using writemask
k1.
+Permute word integers in ymm3/m256 using indexes
+in ymm2 and store the result in ymm1 using writemask
+k1.
+Permute word integers in zmm3/m512 using indexes
+in zmm2 and store the result in zmm1 using writemask
+k1.
Instruction Operand Encoding
Op/En
@@ -201588,7 +201493,7 @@ Flag
Description
-EVEX.DDS.128.66.0F38.W0 75 /r
+EVEX.128.66.0F38.W0 75 /r
VPERMI2B xmm1 {k1}{z}, xmm2,
xmm3/m128
@@ -201603,7 +201508,7 @@ Permute bytes in xmm3/m128 and xmm2 using
byte indexes in xmm1 and store the byte results
in xmm1 using writemask k1.
-EVEX.DDS.256.66.0F38.W0 75 /r
+EVEX.256.66.0F38.W0 75 /r
VPERMI2B ymm1 {k1}{z}, ymm2,
ymm3/m256
@@ -201618,7 +201523,7 @@ Permute bytes in ymm3/m256 and ymm2 using
byte indexes in ymm1 and store the byte results
in ymm1 using writemask k1.
-EVEX.DDS.512.66.0F38.W0 75 /r
+EVEX.512.66.0F38.W0 75 /r
VPERMI2B zmm1 {k1}{z}, zmm2,
zmm3/m512
@@ -201739,16 +201644,16 @@ Flag
AVX512VL
AVX512BW
-EVEX.DDS.128.66.0F38.W1 75 /r
+EVEX.128.66.0F38.W1 75 /r
VPERMI2W xmm1 {k1}{z}, xmm2,
xmm3/m128
-EVEX.DDS.256.66.0F38.W1 75 /r
+EVEX.256.66.0F38.W1 75 /r
VPERMI2W ymm1 {k1}{z}, ymm2,
ymm3/m256
-EVEX.DDS.512.66.0F38.W1 75 /r
+EVEX.512.66.0F38.W1 75 /r
VPERMI2W zmm1 {k1}{z}, zmm2,
zmm3/m512
-EVEX.DDS.128.66.0F38.W0 76 /r
+EVEX.128.66.0F38.W0 76 /r
VPERMI2D xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
@@ -201772,7 +201677,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.256.66.0F38.W0 76 /r
+EVEX.256.66.0F38.W0 76 /r
VPERMI2D ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
@@ -201783,7 +201688,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.512.66.0F38.W0 76 /r
+EVEX.512.66.0F38.W0 76 /r
VPERMI2D zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -201793,7 +201698,7 @@ V/V
AVX512F
-EVEX.DDS.128.66.0F38.W1 76 /r
+EVEX.128.66.0F38.W1 76 /r
VPERMI2Q xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
@@ -201804,7 +201709,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.256.66.0F38.W1 76 /r
+EVEX.256.66.0F38.W1 76 /r
VPERMI2Q ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
@@ -201815,7 +201720,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.512.66.0F38.W1 76 /r
+EVEX.512.66.0F38.W1 76 /r
VPERMI2Q zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -201825,7 +201730,7 @@ V/V
AVX512F
-EVEX.DDS.128.66.0F38.W0 77 /r
+EVEX.128.66.0F38.W0 77 /r
VPERMI2PS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
@@ -201836,7 +201741,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.256.66.0F38.W0 77 /r
+EVEX.256.66.0F38.W0 77 /r
VPERMI2PS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
@@ -201847,7 +201752,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.512.66.0F38.W0 77 /r
+EVEX.512.66.0F38.W0 77 /r
VPERMI2PS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -201928,10 +201833,10 @@ Flag
AVX512VL
AVX512F
-EVEX.DDS.128.66.0F38.W1 77 /r
+EVEX.128.66.0F38.W1 77 /r
VPERMI2PD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.DDS.256.66.0F38.W1 77 /r
+EVEX.256.66.0F38.W1 77 /r
VPERMI2PD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
@@ -201942,7 +201847,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.512.66.0F38.W1 77 /r
+EVEX.512.66.0F38.W1 77 /r
VPERMI2PD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -202244,9 +202149,9 @@ Feature
Flag
AVX
-VEX.NDS.128.66.0F38.W0 0D /r
+VEX.128.66.0F38.W0 0D /r
VPERMILPD xmm1, xmm2, xmm3/m128
-VEX.NDS.256.66.0F38.W0 0D /r
+VEX.256.66.0F38.W0 0D /r
VPERMILPD ymm1, ymm2, ymm3/m256
A
@@ -202255,13 +202160,13 @@ V/V
AVX
-EVEX.NDS.128.66.0F38.W1 0D /r
+EVEX.128.66.0F38.W1 0D /r
VPERMILPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F38.W1 0D /r
+EVEX.256.66.0F38.W1 0D /r
VPERMILPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 0D /r
+EVEX.512.66.0F38.W1 0D /r
VPERMILPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
VEX.128.66.0F3A.W0 05 /r ib
@@ -202709,7 +202614,7 @@ Feature
Flag
AVX
-VEX.NDS.128.66.0F38.W0 0C /r
+VEX.128.66.0F38.W0 0C /r
VPERMILPS xmm1, xmm2, xmm3/m128
VEX.128.66.0F3A.W0 04 /r ib
VPERMILPS xmm1, xmm2/m128, imm8
@@ -202720,7 +202625,7 @@ V/V
AVX
-VEX.NDS.256.66.0F38.W0 0C /r
+VEX.256.66.0F38.W0 0C /r
VPERMILPS ymm1, ymm2, ymm3/m256
A
@@ -202738,13 +202643,13 @@ V/V
AVX
-EVEX.NDS.128.66.0F38.W0 0C /r
+EVEX.128.66.0F38.W0 0C /r
VPERMILPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.66.0F38.W0 0C /r
+EVEX.256.66.0F38.W0 0C /r
VPERMILPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 0C /r
+EVEX.512.66.0F38.W0 0C /r
VPERMILPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
EVEX.128.66.0F3A.W0 04 /r ib
@@ -203198,10 +203103,10 @@ ymm2/m256/m64bcst, imm8
EVEX.512.66.0F3A.W1 01 /r ib
VPERMPD zmm1 {k1}{z},
zmm2/m512/m64bcst, imm8
-EVEX.NDS.256.66.0F38.W1 16 /r
+EVEX.256.66.0F38.W1 16 /r
VPERMPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 16 /r
+EVEX.512.66.0F38.W1 16 /r
VPERMPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -203469,10 +203374,10 @@ AVX2
VEX.256.66.0F38.W0 16 /r
VPERMPS ymm1, ymm2,
ymm3/m256
-EVEX.NDS.256.66.0F38.W0 16 /r
+EVEX.256.66.0F38.W0 16 /r
VPERMPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 16 /r
+EVEX.512.66.0F38.W0 16 /r
VPERMPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -203674,10 +203579,10 @@ ymm2/m256/m64bcst, imm8
EVEX.512.66.0F3A.W1 00 /r ib
VPERMQ zmm1 {k1}{z},
zmm2/m512/m64bcst, imm8
-EVEX.NDS.256.66.0F38.W1 36 /r
+EVEX.256.66.0F38.W1 36 /r
VPERMQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 36 /r
+EVEX.512.66.0F38.W1 36 /r
VPERMQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -203937,7 +203842,7 @@ Flag
Description
-EVEX.DDS.128.66.0F38.W0 7D /r
+EVEX.128.66.0F38.W0 7D /r
VPERMT2B xmm1 {k1}{z}, xmm2,
xmm3/m128
@@ -203952,7 +203857,7 @@ Permute bytes in xmm3/m128 and xmm1 using byte
indexes in xmm2 and store the byte results in xmm1
using writemask k1.
-EVEX.NDS.256.66.0F38.W0 7D /r
+EVEX.256.66.0F38.W0 7D /r
VPERMT2B ymm1 {k1}{z}, ymm2,
ymm3/m256
@@ -203967,7 +203872,7 @@ Permute bytes in ymm3/m256 and ymm1 using byte
indexes in ymm2 and store the byte results in ymm1
using writemask k1.
-EVEX.NDS.512.66.0F38.W0 7D /r
+EVEX.512.66.0F38.W0 7D /r
VPERMT2B zmm1 {k1}{z}, zmm2,
zmm3/m512
@@ -204086,7 +203991,7 @@ Flag
Description
-EVEX.DDS.128.66.0F38.W1 7D /r
+EVEX.128.66.0F38.W1 7D /r
VPERMT2W xmm1 {k1}{z}, xmm2,
xmm3/m128
@@ -204101,7 +204006,7 @@ Permute word integers from two tables in xmm3/m128
and xmm1 using indexes in xmm2 and store the result in
xmm1 using writemask k1.
-EVEX.DDS.256.66.0F38.W1 7D /r
+EVEX.256.66.0F38.W1 7D /r
VPERMT2W ymm1 {k1}{z}, ymm2,
ymm3/m256
@@ -204116,7 +204021,7 @@ Permute word integers from two tables in ymm3/m256
and ymm1 using indexes in ymm2 and store the result in
ymm1 using writemask k1.
-EVEX.DDS.512.66.0F38.W1 7D /r
+EVEX.512.66.0F38.W1 7D /r
VPERMT2W zmm1 {k1}{z}, zmm2,
zmm3/m512
@@ -204130,7 +204035,7 @@ Permute word integers from two tables in zmm3/m512
and zmm1 using indexes in zmm2 and store the result in
zmm1 using writemask k1.
-EVEX.DDS.128.66.0F38.W0 7E /r
+EVEX.128.66.0F38.W0 7E /r
VPERMT2D xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
@@ -204145,7 +204050,7 @@ Permute double-words from two tables in
xmm3/m128/m32bcst and xmm1 using indexes in xmm2
and store the result in xmm1 using writemask k1.
-EVEX.DDS.256.66.0F38.W0 7E /r
+EVEX.256.66.0F38.W0 7E /r
VPERMT2D ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
@@ -204160,7 +204065,7 @@ Permute double-words from two tables in
ymm3/m256/m32bcst and ymm1 using indexes in ymm2
and store the result in ymm1 using writemask k1.
-EVEX.DDS.512.66.0F38.W0 7E /r
+EVEX.512.66.0F38.W0 7E /r
VPERMT2D zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -204174,7 +204079,7 @@ Permute double-words from two tables in
zmm3/m512/m32bcst and zmm1 using indices in zmm2
and store the result in zmm1 using writemask k1.
-EVEX.DDS.128.66.0F38.W1 7E /r
+EVEX.128.66.0F38.W1 7E /r
VPERMT2Q xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
@@ -204189,7 +204094,7 @@ Permute quad-words from two tables in
xmm3/m128/m64bcst and xmm1 using indexes in xmm2
and store the result in xmm1 using writemask k1.
-EVEX.DDS.256.66.0F38.W1 7E /r
+EVEX.256.66.0F38.W1 7E /r
VPERMT2Q ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
@@ -204204,7 +204109,7 @@ Permute quad-words from two tables in
ymm3/m256/m64bcst and ymm1 using indexes in ymm2
and store the result in ymm1 using writemask k1.
-EVEX.DDS.512.66.0F38.W1 7E /r
+EVEX.512.66.0F38.W1 7E /r
VPERMT2Q zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -204218,7 +204123,7 @@ Permute quad-words from two tables in
zmm3/m512/m64bcst and zmm1 using indices in zmm2
and store the result in zmm1 using writemask k1.
-EVEX.DDS.128.66.0F38.W0 7F /r
+EVEX.128.66.0F38.W0 7F /r
VPERMT2PS xmm1 {k1}{z},
xmm2, xmm3/m128/m32bcst
@@ -204233,7 +204138,7 @@ Permute single-precision FP values from two tables in
xmm3/m128/m32bcst and xmm1 using indexes in xmm2
and store the result in xmm1 using writemask k1.
-EVEX.DDS.256.66.0F38.W0 7F /r
+EVEX.256.66.0F38.W0 7F /r
VPERMT2PS ymm1 {k1}{z},
ymm2, ymm3/m256/m32bcst
@@ -204248,7 +204153,7 @@ Permute single-precision FP values from two tables in
ymm3/m256/m32bcst and ymm1 using indexes in ymm2
and store the result in ymm1 using writemask k1.
-EVEX.DDS.512.66.0F38.W0 7F /r
+EVEX.512.66.0F38.W0 7F /r
VPERMT2PS zmm1 {k1}{z},
zmm2, zmm3/m512/m32bcst
@@ -204262,7 +204167,7 @@ Permute single-precision FP values from two tables in
zmm3/m512/m32bcst and zmm1 using indices in zmm2
and store the result in zmm1 using writemask k1.
-EVEX.DDS.128.66.0F38.W1 7F /r
+EVEX.128.66.0F38.W1 7F /r
VPERMT2PD xmm1 {k1}{z},
xmm2, xmm3/m128/m64bcst
@@ -204277,7 +204182,7 @@ Permute double-precision FP values from two tables in
xmm3/m128/m64bcst and xmm1 using indexes in xmm2
and store the result in xmm1 using writemask k1.
-EVEX.DDS.256.66.0F38.W1 7F /r
+EVEX.256.66.0F38.W1 7F /r
VPERMT2PD ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
@@ -204292,7 +204197,7 @@ Permute double-precision FP values from two tables in
ymm3/m256/m64bcst and ymm1 using indexes in ymm2
and store the result in ymm1 using writemask k1.
-EVEX.DDS.512.66.0F38.W1 7F /r
+EVEX.512.66.0F38.W1 7F /r
VPERMT2PD zmm1 {k1}{z},
zmm2, zmm3/m512/m64bcst
@@ -204862,10 +204767,10 @@ Feature
Flag
AVX2
-VEX.DDS.128.66.0F38.W0 90 /r
+VEX.128.66.0F38.W0 90 /r
VPGATHERDD xmm1, vm32x, xmm2
-VEX.DDS.128.66.0F38.W0 91 /r
+VEX.128.66.0F38.W0 91 /r
VPGATHERQD xmm1, vm64x, xmm2
RMV
@@ -204874,7 +204779,7 @@ V/V
AVX2
-VEX.DDS.256.66.0F38.W0 90 /r
+VEX.256.66.0F38.W0 90 /r
VPGATHERDD ymm1, vm32y, ymm2
RMV
@@ -204883,7 +204788,7 @@ V/V
AVX2
-VEX.DDS.256.66.0F38.W0 91 /r
+VEX.256.66.0F38.W0 91 /r
VPGATHERQD xmm1, vm64y, xmm2
RMV
@@ -205007,6 +204912,7 @@ SCALE: scale factor encoded by SIB:[7:6];
DISP: optional 1, 4 byte displacement;
MASK  SRC3;
VPGATHERDD (VEX.128 version)
+MASK[MAXVL-1:128]  0;
FOR j 0 to 3
i  j * 32;
IF MASK[31+i] THEN
@@ -205015,7 +204921,6 @@ ELSE
MASK[i +31:i]  0;
FI;
ENDFOR
-MASK[MAXVL-1:128]  0;
FOR j 0 to 3
i  j * 32;
DATA_ADDR  BASE_ADDR + (SignExtend(VINDEX[i+31:i])*SCALE + DISP;
@@ -205025,7 +204930,7 @@ FI;
MASK[i +31:i]  0;
ENDFOR
DEST[MAXVL-1:128]  0;
-(non-masked elements of the mask register have the content of respective element cleared)
+
5-382 Vol. 2C
VPGATHERDD/VPGATHERQD — Gather Packed Dword Values Using Signed Dword/Qword Indices
@@ -205033,6 +204938,7 @@ VPGATHERDD/VPGATHERQD — Gather Packed Dword Values Using Signed Dword/Qword In
INSTRUCTION SET REFERENCE, V-Z
VPGATHERQD (VEX.128 version)
+MASK[MAXVL-1:64]  0;
FOR j 0 to 3
i  j * 32;
IF MASK[31+i] THEN
@@ -205041,7 +204947,6 @@ ELSE
MASK[i +31:i]  0;
FI;
ENDFOR
-MASK[MAXVL-1:128]  0;
FOR j 0 to 1
k  j * 64;
i  j * 32;
@@ -205051,10 +204956,9 @@ DEST[i +31:i]  FETCH_32BITS(DATA_ADDR); // a fault exits the instruction
FI;
MASK[i +31:i]  0;
ENDFOR
-MASK[127:64]  0;
DEST[MAXVL-1:64]  0;
-(non-masked elements of the mask register have the content of respective element cleared)
VPGATHERDD (VEX.256 version)
+MASK[MAXVL-1:256]  0;
FOR j 0 to 7
i  j * 32;
IF MASK[31+i] THEN
@@ -205071,7 +204975,7 @@ DEST[i +31:i]  FETCH_32BITS(DATA_ADDR); // a fault exits the instruction
FI;
MASK[i +31:i]  0;
ENDFOR
-(non-masked elements of the mask register have the content of respective element cleared)
+DEST[MAXVL-1:256]  0;
VPGATHERDD/VPGATHERQD — Gather Packed Dword Values Using Signed Dword/Qword Indices
@@ -205080,6 +204984,7 @@ Vol. 2C 5-383
INSTRUCTION SET REFERENCE, V-Z
VPGATHERQD (VEX.256 version)
+MASK[MAXVL-1:128]  0;
FOR j 0 to 7
i  j * 32;
IF MASK[31+i] THEN
@@ -205097,9 +205002,7 @@ DEST[i +31:i]  FETCH_32BITS(DATA_ADDR); // a fault exits the instruction
FI;
MASK[i +31:i]  0;
ENDFOR
-MASK[MAXVL-1:128]  0;
DEST[MAXVL-1:128]  0;
-(non-masked elements of the mask register have the content of respective element cleared)
Intel C/C++ Compiler Intrinsic Equivalent
VPGATHERDD: __m128i _mm_i32gather_epi32 (int const * base, __m128i index, const int scale);
@@ -205287,7 +205190,7 @@ Operation
BASE_ADDR stands for the memory operand base address (a GPR); may not exist
VINDEX stands for the memory operand vector of indices (a ZMM register)
SCALE stands for the memory operand scalar (1, 2, 4 or 8)
-DISP is the optional 1, 2 or 4 byte displacement
+DISP is the optional 1 or 4 byte displacement
VPGATHERDD (EVEX encoded version)
(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j  0 TO KL-1
@@ -205363,10 +205266,10 @@ Feature
Flag
AVX2
-VEX.DDS.128.66.0F38.W1 90 /r
+VEX.128.66.0F38.W1 90 /r
VPGATHERDQ xmm1, vm32x, xmm2
-VEX.DDS.128.66.0F38.W1 91 /r
+VEX.128.66.0F38.W1 91 /r
VPGATHERQQ xmm1, vm64x, xmm2
A
@@ -205375,7 +205278,7 @@ V/V
AVX2
-VEX.DDS.256.66.0F38.W1 90 /r
+VEX.256.66.0F38.W1 90 /r
VPGATHERDQ ymm1, vm32x, ymm2
A
@@ -205384,7 +205287,7 @@ V/V
AVX2
-VEX.DDS.256.66.0F38.W1 91 /r
+VEX.256.66.0F38.W1 91 /r
VPGATHERQQ ymm1, vm64y, ymm2
A
@@ -205460,8 +205363,8 @@ VPGATHERDQ/VPGATHERQQ — Gather Packed Qword Values Using Signed Dword/Qword In
INSTRUCTION SET REFERENCE, V-Z
-VEX.256 version: The instruction will gather four qword values. For dword indices, only the lower four indices in
-the vector index register are used.
+VEX.256 version: The instruction will gather four qword values. For dword indices, only the lower four indices in the
+vector index register are used.
Note that:
@@ -205511,6 +205414,7 @@ SCALE: scale factor encoded by SIB:[7:6];
DISP: optional 1, 4 byte displacement;
MASK  SRC3;
VPGATHERDQ (VEX.128 version)
+MASK[MAXVL-1:128]  0;
FOR j 0 to 1
i  j * 64;
IF MASK[63+i] THEN
@@ -205528,7 +205432,6 @@ DEST[i +63:i]  FETCH_64BITS(DATA_ADDR); // a fault exits the instruction
FI;
MASK[i +63:i]  0;
ENDFOR
-MASK[MAXVL-1:128]  0;
DEST[MAXVL-1:128]  0;
VPGATHERDQ/VPGATHERQQ — Gather Packed Qword Values Using Signed Dword/Qword Indices
@@ -205536,8 +205439,8 @@ Vol. 2C 5-389
INSTRUCTION SET REFERENCE, V-Z
-(non-masked elements of the mask register have the content of respective element cleared)
VPGATHERQQ (VEX.128 version)
+MASK[MAXVL-1:128]  0;
FOR j 0 to 1
i  j * 64;
IF MASK[63+i] THEN
@@ -205554,10 +205457,9 @@ DEST[i +63:i]  FETCH_64BITS(DATA_ADDR); // a fault exits the instruction
FI;
MASK[i +63:i]  0;
ENDFOR
-MASK[MAXVL-1:128]  0;
DEST[MAXVL-1:128]  0;
-(non-masked elements of the mask register have the content of respective element cleared)
VPGATHERQQ (VEX.256 version)
+MASK[MAXVL-1:256]  0;
FOR j 0 to 3
i  j * 64;
IF MASK[63+i] THEN
@@ -205574,8 +205476,9 @@ DEST[i +63:i]  FETCH_64BITS(DATA_ADDR); // a fault exits the instruction
FI;
MASK[i +63:i]  0;
ENDFOR
-(non-masked elements of the mask register have the content of respective element cleared)
+DEST[MAXVL-1:256]  0;
VPGATHERDQ (VEX.256 version)
+MASK[MAXVL-1:256]  0;
FOR j 0 to 3
i  j * 64;
IF MASK[63+i] THEN
@@ -205587,19 +205490,19 @@ ENDFOR
FOR j 0 to 3
k  j * 32;
i  j * 64;
+DATA_ADDR  BASE_ADDR + (SignExtend(VINDEX1[k+31:k])*SCALE + DISP;
5-390 Vol. 2C
VPGATHERDQ/VPGATHERQQ — Gather Packed Qword Values Using Signed Dword/Qword Indices
INSTRUCTION SET REFERENCE, V-Z
-DATA_ADDR  BASE_ADDR + (SignExtend(VINDEX1[k+31:k])*SCALE + DISP;
IF MASK[63+i] THEN
DEST[i +63:i]  FETCH_64BITS(DATA_ADDR); // a fault exits the instruction
FI;
MASK[i +63:i]  0;
ENDFOR
-(non-masked elements of the mask register have the content of respective element cleared)
+DEST[MAXVL-1:256]  0;
Intel C/C++ Compiler Intrinsic Equivalent
VPGATHERDQ: __m128i _mm_i32gather_epi64 (__int64 const * base, __m128i index, const int scale);
@@ -205789,7 +205692,7 @@ Operation
BASE_ADDR stands for the memory operand base address (a GPR); may not exist
VINDEX stands for the memory operand vector of indices (a ZMM register)
SCALE stands for the memory operand scalar (1, 2, 4 or 8)
-DISP is the optional 1, 2 or 4 byte displacement
+DISP is the optional 1 or 4 byte displacement
VPGATHERQD (EVEX encoded version)
(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j  0 TO KL-1
@@ -206082,7 +205985,7 @@ Support
Description
-EVEX.DDS.128.66.0F38.W1 B5 /r
+EVEX.128.66.0F38.W1 B5 /r
VPMADD52HUQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
@@ -206097,7 +206000,7 @@ Multiply unsigned 52-bit integers in xmm2 and
xmm3/m128 and add the high 52 bits of the 104bit product to the qword unsigned integers in
xmm1 using writemask k1.
-EVEX.DDS.256.66.0F38.W1 B5 /r
+EVEX.256.66.0F38.W1 B5 /r
VPMADD52HUQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
@@ -206112,7 +206015,7 @@ Multiply unsigned 52-bit integers in ymm2 and
ymm3/m128 and add the high 52 bits of the 104bit product to the qword unsigned integers in
ymm1 using writemask k1.
-EVEX.DDS.512.66.0F38.W1 B5 /r
+EVEX.512.66.0F38.W1 B5 /r
VPMADD52HUQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -206225,7 +206128,7 @@ Support
Description
-EVEX.DDS.128.66.0F38.W1 B4 /r
+EVEX.128.66.0F38.W1 B4 /r
VPMADD52LUQ xmm1 {k1}{z},
xmm2,xmm3/m128/m64bcst
@@ -206241,7 +206144,7 @@ xmm3/m128 and add the low 52 bits of the 104-bit
product to the qword unsigned integers in xmm1
using writemask k1.
-EVEX.DDS.256.66.0F38.W1 B4 /r
+EVEX.256.66.0F38.W1 B4 /r
VPMADD52LUQ ymm1 {k1}{z},
ymm2, ymm3/m256/m64bcst
@@ -206257,7 +206160,7 @@ ymm3/m128 and add the low 52 bits of the 104-bit
product to the qword unsigned integers in ymm1
using writemask k1.
-EVEX.DDS.512.66.0F38.W1 B4 /r
+EVEX.512.66.0F38.W1 B4 /r
VPMADD52LUQ zmm1 {k1}{z},
zmm2,zmm3/m512/m64bcst
@@ -206375,21 +206278,21 @@ Feature
Flag
AVX2
-VEX.NDS.128.66.0F38.W0 8C /r
+VEX.128.66.0F38.W0 8C /r
VPMASKMOVD xmm1, xmm2, m128
-VEX.NDS.256.66.0F38.W0 8C /r
+VEX.256.66.0F38.W0 8C /r
VPMASKMOVD ymm1, ymm2, m256
-VEX.NDS.128.66.0F38.W1 8C /r
+VEX.128.66.0F38.W1 8C /r
VPMASKMOVQ xmm1, xmm2, m128
-VEX.NDS.256.66.0F38.W1 8C /r
+VEX.256.66.0F38.W1 8C /r
VPMASKMOVQ ymm1, ymm2, m256
-VEX.NDS.128.66.0F38.W0 8E /r
+VEX.128.66.0F38.W0 8E /r
VPMASKMOVD m128, xmm1, xmm2
-VEX.NDS.256.66.0F38.W0 8E /r
+VEX.256.66.0F38.W0 8E /r
VPMASKMOVD m256, ymm1, ymm2
-VEX.NDS.128.66.0F38.W1 8E /r
+VEX.128.66.0F38.W1 8E /r
VPMASKMOVQ m128, xmm1, xmm2
-VEX.NDS.256.66.0F38.W1 8E /r
+VEX.256.66.0F38.W1 8E /r
VPMASKMOVQ m256, ymm1, ymm2
RVM
@@ -209071,7 +208974,7 @@ Flag
Description
-EVEX.NDS.128.66.0F38.W1 83 /r
+EVEX.128.66.0F38.W1 83 /r
VPMULTISHIFTQB xmm1 {k1}{z},
xmm2,xmm3/m128/m64bcst
@@ -209086,7 +208989,7 @@ Select unaligned bytes from qwords in
xmm3/m128/m64bcst using control bytes in
xmm2, write byte results to xmm1 under k1.
-EVEX.NDS.256.66.0F38.W1 83 /r
+EVEX.256.66.0F38.W1 83 /r
VPMULTISHIFTQB ymm1 {k1}{z},
ymm2,ymm3/m256/m64bcst
@@ -209101,7 +209004,7 @@ Select unaligned bytes from qwords in
ymm3/m256/m64bcst using control bytes in
ymm2, write byte results to ymm1 under k1.
-EVEX.NDS.512.66.0F38.W1 83 /r
+EVEX.512.66.0F38.W1 83 /r
VPMULTISHIFTQB zmm1 {k1}{z},
zmm2,zmm3/m512/m64bcst
@@ -209218,31 +209121,31 @@ Flag
AVX512VL
AVX512F
-EVEX.NDS.128.66.0F38.W0 15 /r
+EVEX.128.66.0F38.W0 15 /r
VPROLVD xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDD.128.66.0F.W0 72 /1 ib
+EVEX.128.66.0F.W0 72 /1 ib
VPROLD xmm1 {k1}{z},
xmm2/m128/m32bcst, imm8
-EVEX.NDS.128.66.0F38.W1 15 /r
+EVEX.128.66.0F38.W1 15 /r
VPROLVQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDD.128.66.0F.W1 72 /1 ib
+EVEX.128.66.0F.W1 72 /1 ib
VPROLQ xmm1 {k1}{z},
xmm2/m128/m64bcst, imm8
-EVEX.NDS.256.66.0F38.W0 15 /r
+EVEX.256.66.0F38.W0 15 /r
VPROLVD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDD.256.66.0F.W0 72 /1 ib
+EVEX.256.66.0F.W0 72 /1 ib
VPROLD ymm1 {k1}{z},
ymm2/m256/m32bcst, imm8
-EVEX.NDS.256.66.0F38.W1 15 /r
+EVEX.256.66.0F38.W1 15 /r
VPROLVQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDD.256.66.0F.W1 72 /1 ib
+EVEX.256.66.0F.W1 72 /1 ib
VPROLQ ymm1 {k1}{z},
ymm2/m256/m64bcst, imm8
-EVEX.NDS.512.66.0F38.W0 15 /r
+EVEX.512.66.0F38.W0 15 /r
VPROLVD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -209301,13 +209204,13 @@ V/V
AVX512F
-EVEX.NDD.512.66.0F.W0 72 /1 ib
+EVEX.512.66.0F.W0 72 /1 ib
VPROLD zmm1 {k1}{z},
zmm2/m512/m32bcst, imm8
-EVEX.NDS.512.66.0F38.W1 15 /r
+EVEX.512.66.0F38.W1 15 /r
VPROLVQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
-EVEX.NDD.512.66.0F.W1 72 /1 ib
+EVEX.512.66.0F.W1 72 /1 ib
VPROLQ zmm1 {k1}{z},
zmm2/m512/m64bcst, imm8
@@ -209599,19 +209502,19 @@ Flag
AVX512VL
AVX512F
-EVEX.NDS.128.66.0F38.W0 14 /r
+EVEX.128.66.0F38.W0 14 /r
VPRORVD xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDD.128.66.0F.W0 72 /0 ib
+EVEX.128.66.0F.W0 72 /0 ib
VPRORD xmm1 {k1}{z},
xmm2/m128/m32bcst, imm8
-EVEX.NDS.128.66.0F38.W1 14 /r
+EVEX.128.66.0F38.W1 14 /r
VPRORVQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDD.128.66.0F.W1 72 /0 ib
+EVEX.128.66.0F.W1 72 /0 ib
VPRORQ xmm1 {k1}{z},
xmm2/m128/m64bcst, imm8
-EVEX.NDS.256.66.0F38.W0 14 /r
+EVEX.256.66.0F38.W0 14 /r
VPRORVD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
@@ -209643,16 +209546,16 @@ V/V
AVX512VL
AVX512F
-EVEX.NDD.256.66.0F.W0 72 /0 ib
+EVEX.256.66.0F.W0 72 /0 ib
VPRORD ymm1 {k1}{z},
ymm2/m256/m32bcst, imm8
-EVEX.NDS.256.66.0F38.W1 14 /r
+EVEX.256.66.0F38.W1 14 /r
VPRORVQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDD.256.66.0F.W1 72 /0 ib
+EVEX.256.66.0F.W1 72 /0 ib
VPRORQ ymm1 {k1}{z},
ymm2/m256/m64bcst, imm8
-EVEX.NDS.512.66.0F38.W0 14 /r
+EVEX.512.66.0F38.W0 14 /r
VPRORVD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -209683,13 +209586,13 @@ V/V
AVX512F
-EVEX.NDD.512.66.0F.W0 72 /0 ib
+EVEX.512.66.0F.W0 72 /0 ib
VPRORD zmm1 {k1}{z},
zmm2/m512/m32bcst, imm8
-EVEX.NDS.512.66.0F38.W1 14 /r
+EVEX.512.66.0F38.W1 14 /r
VPRORVQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
-EVEX.NDD.512.66.0F.W1 72 /0 ib
+EVEX.512.66.0F.W1 72 /0 ib
VPRORQ zmm1 {k1}{z},
zmm2/m512/m64bcst, imm8
@@ -210185,7 +210088,7 @@ Operation
BASE_ADDR stands for the memory operand base address (a GPR); may not exist
VINDEX stands for the memory operand vector of indices (a ZMM register)
SCALE stands for the memory operand scalar (1, 2, 4 or 8)
-DISP is the optional 1, 2 or 4 byte displacement
+DISP is the optional 1 or 4 byte displacement
VPSCATTERDD (EVEX encoded versions)
(KL, VL)= (4, 128), (8, 256), (16, 512)
FOR j ← 0 TO KL-1
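A minimal sketch of how the operands defined in this hunk could combine into one address per element, assuming the usual base + sign-extended index * scale + displacement form. The function names are hypothetical and not part of the manual's pseudocode, and the displacement is taken as an already-decoded integer rather than its 1- or 4-byte encoding.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def scatter_addresses(base_addr, vindex, scale, disp=0, index_bits=32):
    """Return one effective address per index element (hypothetical helper).

    base_addr  -- value of the base GPR, or 0 if no base register is used
    vindex     -- list of raw index elements from the index vector
    scale      -- 1, 2, 4 or 8
    disp       -- decoded displacement (encoded in 1 or 4 bytes)
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return [base_addr + sign_extend(idx, index_bits) * scale + disp
            for idx in vindex]
```

For example, an index element of 0xFFFFFFFF is treated as -1, so with base 0x1000, scale 4 and displacement 8 it addresses 0x1004.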
@@ -210295,9 +210198,9 @@ Feature
Flag
AVX2
-VEX.NDS.128.66.0F38.W0 47 /r
+VEX.128.66.0F38.W0 47 /r
VPSLLVD xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F38.W1 47 /r
+VEX.128.66.0F38.W1 47 /r
VPSLLVQ xmm1, xmm2, xmm3/m128
A
@@ -210306,7 +210209,7 @@ V/V
AVX2
-VEX.NDS.256.66.0F38.W0 47 /r
+VEX.256.66.0F38.W0 47 /r
VPSLLVD ymm1, ymm2, ymm3/m256
A
@@ -210315,7 +210218,7 @@ V/V
AVX2
-VEX.NDS.256.66.0F38.W1 47 /r
+VEX.256.66.0F38.W1 47 /r
VPSLLVQ ymm1, ymm2, ymm3/m256
A
@@ -210324,31 +210227,31 @@ V/V
AVX2
-EVEX.NDS.128.66.0F38.W1 12 /r
+EVEX.128.66.0F38.W1 12 /r
VPSLLVW xmm1 {k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F38.W1 12 /r
+EVEX.256.66.0F38.W1 12 /r
VPSLLVW ymm1 {k1}{z}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F38.W1 12 /r
+EVEX.512.66.0F38.W1 12 /r
VPSLLVW zmm1 {k1}{z}, zmm2,
zmm3/m512
-EVEX.NDS.128.66.0F38.W0 47 /r
+EVEX.128.66.0F38.W0 47 /r
VPSLLVD xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.66.0F38.W0 47 /r
+EVEX.256.66.0F38.W0 47 /r
VPSLLVD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 47 /r
+EVEX.512.66.0F38.W0 47 /r
VPSLLVD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
-EVEX.NDS.128.66.0F38.W1 47 /r
+EVEX.128.66.0F38.W1 47 /r
VPSLLVQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F38.W1 47 /r
+EVEX.256.66.0F38.W1 47 /r
VPSLLVQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 47 /r
+EVEX.512.66.0F38.W1 47 /r
VPSLLVQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -210552,7 +210455,7 @@ VPSLLVW/VPSLLVD/VPSLLVQ—Variable Bit Shift Left Logical
VPSLLVD (VEX.128 version)
COUNT_0 SRC2[31 : 0]
(* Repeat Each COUNT_i for the 2nd through 4th dwords of SRC2*)
-COUNT_3 SRC2[100 : 96];
+COUNT_3 SRC2[127 : 96];
IF COUNT_0 < 32 THEN
DEST[31:0] ZeroExtend(SRC1[31:0] << COUNT_0);
ELSE
@@ -210566,7 +210469,7 @@ DEST[MAXVL-1:128] 0;
VPSLLVD (VEX.256 version)
COUNT_0 SRC2[31 : 0];
(* Repeat Each COUNT_i for the 2nd through 7th dwords of SRC2*)
-COUNT_7 SRC2[228 : 224];
+COUNT_7 SRC2[255 : 224];
IF COUNT_0 < 32 THEN
DEST[31:0] ZeroExtend(SRC1[31:0] << COUNT_0);
ELSE
@@ -210619,7 +210522,7 @@ DEST[MAXVL-1:128] 0;
VPSLLVQ (VEX.256 version)
COUNT_0 SRC2[63 : 0];
(* Repeat Each COUNT_i for the 2nd through 4th dwords of SRC2*)
-COUNT_3 SRC2[197 : 192];
+COUNT_3 SRC2[255 : 192];
IF COUNT_0 < 64THEN
DEST[63:0] ZeroExtend(SRC1[63:0] << COUNT_0);
ELSE
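The COUNT corrections in the VPSLLVD/VPSLLVQ hunks above matter because the whole element of SRC2 is the shift count, not only its low 5 or 6 bits. The following is a rough model of that behavior under the assumption, consistent with the IF COUNT < 32 / < 64 tests in the pseudocode, that an out-of-range count yields zero. The function name and list-based register model are illustrative only.

```python
def variable_shift_left(src1, src2, width=32):
    """Per-element logical left shift; the full SRC2 element is the count.

    src1, src2 -- equal-length lists of unsigned element values
    width      -- element width in bits (32 for dwords, 64 for qwords)
    """
    if len(src1) != len(src2):
        raise ValueError("operands must have the same number of elements")
    mask = (1 << width) - 1
    dest = []
    for value, count in zip(src1, src2):
        count &= mask  # the count is the entire element, e.g. SRC2[127:96]
        if count < width:
            dest.append(((value & mask) << count) & mask)
        else:
            dest.append(0)  # count >= width clears the element
    return dest
```

With the old bit ranges, a count such as 0x100 would have read as 0 (its low five bits) and left the element unchanged; with the full-element reading it is greater than 31 and the result is 0.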
@@ -210712,12 +210615,12 @@ Feature
Flag
AVX2
-VEX.NDS.128.66.0F38.W0 46 /r
+VEX.128.66.0F38.W0 46 /r
VPSRAVD xmm1, xmm2, xmm3/m128
Description
-VEX.NDS.256.66.0F38.W0 46 /r
+VEX.256.66.0F38.W0 46 /r
VPSRAVD ymm1, ymm2, ymm3/m256
A
@@ -210726,16 +210629,16 @@ V/V
AVX2
-EVEX.NDS.128.66.0F38.W1 11 /r
+EVEX.128.66.0F38.W1 11 /r
VPSRAVW xmm1 {k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F38.W1 11 /r
+EVEX.256.66.0F38.W1 11 /r
VPSRAVW ymm1 {k1}{z}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F38.W1 11 /r
+EVEX.512.66.0F38.W1 11 /r
VPSRAVW zmm1 {k1}{z}, zmm2,
zmm3/m512
-EVEX.NDS.128.66.0F38.W0 46 /r
+EVEX.128.66.0F38.W0 46 /r
VPSRAVD xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
@@ -210766,7 +210669,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F38.W0 46 /r
+EVEX.256.66.0F38.W0 46 /r
VPSRAVD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
@@ -210777,7 +210680,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F38.W0 46 /r
+EVEX.512.66.0F38.W0 46 /r
VPSRAVD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -210787,7 +210690,7 @@ V/V
AVX512F
-EVEX.NDS.128.66.0F38.W1 46 /r
+EVEX.128.66.0F38.W1 46 /r
VPSRAVQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
@@ -210798,7 +210701,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F38.W1 46 /r
+EVEX.256.66.0F38.W1 46 /r
VPSRAVQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
@@ -210809,7 +210712,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F38.W1 46 /r
+EVEX.512.66.0F38.W1 46 /r
VPSRAVQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -210920,11 +210823,8 @@ second source operand (the third operand). As the bits in the data elements are
bits are set to the MSB (sign extension).
The count values are specified individually in each data element of the second source operand. If the unsigned
integer value specified in the respective data element of the second source operand is greater than 15 (for words),
-31 (for doublewords), or 63 (for a quadword), then the destination data element are filled with the corresponding
+31 (for doublewords), or 63 (for a quadword), then the destination data element is filled with the corresponding
sign bit of the source element.
-The count values are specified individually in each data element of the second source operand. If the unsigned
-integer value specified in the respective data element of the second source operand is greater than 16 (for word),
-31 (for doublewords), or 63 (for a quadword), then the destination data element are written with 0.
VEX.128 encoded version: The destination and first source operands are XMM registers. The count operand can be
either an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding destination register
are zeroed.
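The saturation rule in the description above (counts greater than 15, 31 or 63 fill the element with the sign bit) can be sketched as follows. This is an illustrative model only; the function name and the list-based representation of vector elements are assumptions, not the manual's pseudocode.

```python
def variable_shift_right_arithmetic(src1, src2, width=32):
    """Per-element arithmetic right shift with sign fill for large counts.

    src1, src2 -- equal-length lists of unsigned element values
    width      -- element width in bits (16, 32 or 64)
    """
    if len(src1) != len(src2):
        raise ValueError("operands must have the same number of elements")
    mask = (1 << width) - 1
    dest = []
    for value, count in zip(src1, src2):
        value &= mask
        # Reinterpret the element as a signed two's-complement integer.
        signed = value - (1 << width) if value >> (width - 1) else value
        # A count above width-1 behaves like a shift by width-1:
        # every result bit becomes a copy of the sign bit.
        count = min(count & mask, width - 1)
        dest.append((signed >> count) & mask)
    return dest
```

Python's `>>` on negative integers is already an arithmetic shift, which is why the signed reinterpretation is enough to get the sign extension.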
@@ -210975,7 +210875,7 @@ Vol. 2C 5-457
VPSRAVD (VEX.128 version)
COUNT_0 ← SRC2[31 : 0]
(* Repeat Each COUNT_i for the 2nd through 4th dwords of SRC2*)
-COUNT_3 ← SRC2[100 : 96];
+COUNT_3 ← SRC2[127 : 96];
DEST[31:0] ← SignExtend(SRC1[31:0] >> COUNT_0);
(* Repeat shift operation for 2nd through 4th dwords *)
DEST[127:96] ← SignExtend(SRC1[127:96] >> COUNT_3);
@@ -210983,7 +210883,7 @@ DEST[MAXVL-1:128]  0;
VPSRAVD (VEX.256 version)
COUNT_0 ← SRC2[31 : 0];
(* Repeat Each COUNT_i for the 2nd through 8th dwords of SRC2*)
-COUNT_7 ← SRC2[228 : 224];
+COUNT_7 ← SRC2[255 : 224];
DEST[31:0] ← SignExtend(SRC1[31:0] >> COUNT_0);
(* Repeat shift operation for 2nd through 7th dwords *)
DEST[255:224] ← SignExtend(SRC1[255:224] >> COUNT_7);
@@ -211137,9 +211037,9 @@ Feature
Flag
AVX2
-VEX.NDS.128.66.0F38.W0 45 /r
+VEX.128.66.0F38.W0 45 /r
VPSRLVD xmm1, xmm2, xmm3/m128
-VEX.NDS.128.66.0F38.W1 45 /r
+VEX.128.66.0F38.W1 45 /r
VPSRLVQ xmm1, xmm2, xmm3/m128
A
@@ -211148,7 +211048,7 @@ V/V
AVX2
-VEX.NDS.256.66.0F38.W0 45 /r
+VEX.256.66.0F38.W0 45 /r
VPSRLVD ymm1, ymm2, ymm3/m256
A
@@ -211157,7 +211057,7 @@ V/V
AVX2
-VEX.NDS.256.66.0F38.W1 45 /r
+VEX.256.66.0F38.W1 45 /r
VPSRLVQ ymm1, ymm2, ymm3/m256
A
@@ -211166,31 +211066,31 @@ V/V
AVX2
-EVEX.NDS.128.66.0F38.W1 10 /r
+EVEX.128.66.0F38.W1 10 /r
VPSRLVW xmm1 {k1}{z}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F38.W1 10 /r
+EVEX.256.66.0F38.W1 10 /r
VPSRLVW ymm1 {k1}{z}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F38.W1 10 /r
+EVEX.512.66.0F38.W1 10 /r
VPSRLVW zmm1 {k1}{z}, zmm2,
zmm3/m512
-EVEX.NDS.128.66.0F38.W0 45 /r
+EVEX.128.66.0F38.W0 45 /r
VPSRLVD xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.66.0F38.W0 45 /r
+EVEX.256.66.0F38.W0 45 /r
VPSRLVD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 45 /r
+EVEX.512.66.0F38.W0 45 /r
VPSRLVD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
-EVEX.NDS.128.66.0F38.W1 45 /r
+EVEX.128.66.0F38.W1 45 /r
VPSRLVQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F38.W1 45 /r
+EVEX.256.66.0F38.W1 45 /r
VPSRLVQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 45 /r
+EVEX.512.66.0F38.W1 45 /r
VPSRLVQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -211557,11 +211457,11 @@ Flag
AVX512VL
AVX512F
-EVEX.DDS.128.66.0F3A.W0 25 /r ib
+EVEX.128.66.0F3A.W0 25 /r ib
VPTERNLOGD xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst, imm8
-EVEX.DDS.256.66.0F3A.W0 25 /r ib
+EVEX.256.66.0F3A.W0 25 /r ib
VPTERNLOGD ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst, imm8
@@ -211572,7 +211472,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.512.66.0F3A.W0 25 /r ib
+EVEX.512.66.0F3A.W0 25 /r ib
VPTERNLOGD zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst, imm8
@@ -211582,7 +211482,7 @@ V/V
AVX512F
-EVEX.DDS.128.66.0F3A.W1 25 /r ib
+EVEX.128.66.0F3A.W1 25 /r ib
VPTERNLOGQ xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst, imm8
@@ -211593,7 +211493,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.256.66.0F3A.W1 25 /r ib
+EVEX.256.66.0F3A.W1 25 /r ib
VPTERNLOGQ ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst, imm8
@@ -211604,7 +211504,7 @@ V/V
AVX512VL
AVX512F
-EVEX.DDS.512.66.0F3A.W1 25 /r ib
+EVEX.512.66.0F3A.W1 25 /r ib
VPTERNLOGQ zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst, imm8
@@ -211947,25 +211847,25 @@ Flag
AVX512VL
AVX512BW
-EVEX.NDS.128.66.0F38.W0 26 /r
+EVEX.128.66.0F38.W0 26 /r
VPTESTMB k2 {k1}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F38.W0 26 /r
+EVEX.256.66.0F38.W0 26 /r
VPTESTMB k2 {k1}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F38.W0 26 /r
+EVEX.512.66.0F38.W0 26 /r
VPTESTMB k2 {k1}, zmm2,
zmm3/m512
-EVEX.NDS.128.66.0F38.W1 26 /r
+EVEX.128.66.0F38.W1 26 /r
VPTESTMW k2 {k1}, xmm2,
xmm3/m128
-EVEX.NDS.256.66.0F38.W1 26 /r
+EVEX.256.66.0F38.W1 26 /r
VPTESTMW k2 {k1}, ymm2,
ymm3/m256
-EVEX.NDS.512.66.0F38.W1 26 /r
+EVEX.512.66.0F38.W1 26 /r
VPTESTMW k2 {k1}, zmm2,
zmm3/m512
-EVEX.NDS.128.66.0F38.W0 27 /r
+EVEX.128.66.0F38.W0 27 /r
VPTESTMD k2 {k1}, xmm2,
xmm3/m128/m32bcst
@@ -212009,7 +211909,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F38.W0 27 /r
+EVEX.256.66.0F38.W0 27 /r
VPTESTMD k2 {k1}, ymm2,
ymm3/m256/m32bcst
@@ -212020,7 +211920,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F38.W0 27 /r
+EVEX.512.66.0F38.W0 27 /r
VPTESTMD k2 {k1}, zmm2,
zmm3/m512/m32bcst
@@ -212030,7 +211930,7 @@ V/V
AVX512F
-EVEX.NDS.128.66.0F38.W1 27 /r
+EVEX.128.66.0F38.W1 27 /r
VPTESTMQ k2 {k1}, xmm2,
xmm3/m128/m64bcst
@@ -212041,7 +211941,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F38.W1 27 /r
+EVEX.256.66.0F38.W1 27 /r
VPTESTMQ k2 {k1}, ymm2,
ymm3/m256/m64bcst
@@ -212052,7 +211952,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F38.W1 27 /r
+EVEX.512.66.0F38.W1 27 /r
VPTESTMQ k2 {k1}, zmm2,
zmm3/m512/m64bcst
@@ -212265,7 +212165,7 @@ bit Mode
Support
V/V
-EVEX.NDS.128.F3.0F38.W0 26 /r
+EVEX.128.F3.0F38.W0 26 /r
VPTESTNMB k2 {k1}, xmm2,
xmm3/m128
@@ -212276,7 +212176,7 @@ Bitwise NAND of packed byte integers in xmm2 and
xmm3/m128 and set mask k2 to reflect the zero/non-zero
status of each element of the result, under writemask k1.
-EVEX.NDS.256.F3.0F38.W0 26 /r
+EVEX.256.F3.0F38.W0 26 /r
VPTESTNMB k2 {k1}, ymm2,
ymm3/m256
@@ -212291,7 +212191,7 @@ Bitwise NAND of packed byte integers in ymm2 and
ymm3/m256 and set mask k2 to reflect the zero/non-zero
status of each element of the result, under writemask k1.
-EVEX.NDS.512.F3.0F38.W0 26 /r
+EVEX.512.F3.0F38.W0 26 /r
VPTESTNMB k2 {k1}, zmm2,
zmm3/m512
@@ -212306,7 +212206,7 @@ Bitwise NAND of packed byte integers in zmm2 and
zmm3/m512 and set mask k2 to reflect the zero/non-zero
status of each element of the result, under writemask k1.
-EVEX.NDS.128.F3.0F38.W1 26 /r
+EVEX.128.F3.0F38.W1 26 /r
VPTESTNMW k2 {k1}, xmm2,
xmm3/m128
@@ -212321,7 +212221,7 @@ Bitwise NAND of packed word integers in xmm2 and
xmm3/m128 and set mask k2 to reflect the zero/non-zero
status of each element of the result, under writemask k1.
-EVEX.NDS.256.F3.0F38.W1 26 /r
+EVEX.256.F3.0F38.W1 26 /r
VPTESTNMW k2 {k1}, ymm2,
ymm3/m256
@@ -212336,7 +212236,7 @@ Bitwise NAND of packed word integers in ymm2 and
ymm3/m256 and set mask k2 to reflect the zero/non-zero
status of each element of the result, under writemask k1.
-EVEX.NDS.512.F3.0F38.W1 26 /r
+EVEX.512.F3.0F38.W1 26 /r
VPTESTNMW k2 {k1}, zmm2,
zmm3/m512
@@ -212351,7 +212251,7 @@ Bitwise NAND of packed word integers in zmm2 and
zmm3/m512 and set mask k2 to reflect the zero/non-zero
status of each element of the result, under writemask k1.
-EVEX.NDS.128.F3.0F38.W0 27 /r
+EVEX.128.F3.0F38.W0 27 /r
VPTESTNMD k2 {k1}, xmm2,
xmm3/m128/m32bcst
@@ -212362,7 +212262,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.F3.0F38.W0 27 /r
+EVEX.256.F3.0F38.W0 27 /r
VPTESTNMD k2 {k1}, ymm2,
ymm3/m256/m32bcst
@@ -212373,7 +212273,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.F3.0F38.W0 27 /r
+EVEX.512.F3.0F38.W0 27 /r
VPTESTNMD k2 {k1}, zmm2,
zmm3/m512/m32bcst
@@ -212383,7 +212283,7 @@ V/V
AVX512F
-EVEX.NDS.128.F3.0F38.W1 27 /r
+EVEX.128.F3.0F38.W1 27 /r
VPTESTNMQ k2 {k1}, xmm2,
xmm3/m128/m64bcst
@@ -212394,7 +212294,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.256.F3.0F38.W1 27 /r
+EVEX.256.F3.0F38.W1 27 /r
VPTESTNMQ k2 {k1}, ymm2,
ymm3/m256/m64bcst
@@ -212405,7 +212305,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.F3.0F38.W1 27 /r
+EVEX.512.F3.0F38.W1 27 /r
VPTESTNMQ k2 {k1}, zmm2,
zmm3/m512/m64bcst
@@ -212615,11 +212515,11 @@ Flag
AVX512VL
AVX512DQ
-EVEX.NDS.128.66.0F3A.W1 50 /r ib
+EVEX.128.66.0F3A.W1 50 /r ib
VRANGEPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst, imm8
-EVEX.NDS.256.66.0F3A.W1 50 /r ib
+EVEX.256.66.0F3A.W1 50 /r ib
VRANGEPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst, imm8
@@ -212630,7 +212530,7 @@ V/V
AVX512VL
AVX512DQ
-EVEX.NDS.512.66.0F3A.W1 50 /r ib
+EVEX.512.66.0F3A.W1 50 /r ib
VRANGEPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst{sae}, imm8
@@ -213065,11 +212965,11 @@ Flag
AVX512VL
AVX512DQ
-EVEX.NDS.128.66.0F3A.W0 50 /r ib
+EVEX.128.66.0F3A.W0 50 /r ib
VRANGEPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst, imm8
-EVEX.NDS.256.66.0F3A.W0 50 /r ib
+EVEX.256.66.0F3A.W0 50 /r ib
VRANGEPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst, imm8
@@ -213080,7 +212980,7 @@ V/V
AVX512VL
AVX512DQ
-EVEX.NDS.512.66.0F3A.W0 50 /r ib
+EVEX.512.66.0F3A.W0 50 /r ib
VRANGEPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst{sae}, imm8
@@ -213284,7 +213184,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F3A.W1 51 /r
+EVEX.LIG.66.0F3A.W1 51 /r
VRANGESD xmm1 {k1}{z},
xmm2, xmm3/m64{sae}, imm8
@@ -213466,7 +213366,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F3A.W0 51 /r
+EVEX.LIG.66.0F3A.W0 51 /r
VRANGESS xmm1 {k1}{z},
xmm2, xmm3/m32{sae}, imm8
@@ -213838,7 +213738,7 @@ AVX512F
Description
-EVEX.NDS.LIG.66.0F38.W1 4D /r
+EVEX.LIG.66.0F38.W1 4D /r
VRCP14SD xmm1 {k1}{z}, xmm2,
xmm3/m64
@@ -213885,7 +213785,8 @@ zero only in case of FTZ bit set in MXCSR. Otherwise it will be treated correctl
written) with the sign of the operand. When a source value is a SNaN or QNaN, the SNaN is converted to a QNaN
or the source QNaN is returned. See Table 5-22 for special-case input values.
MXCSR exception flags are not affected by this instruction and floating-point exceptions are not reported.
-A numerically exact implementation of VRCP14xx can be found at https://software.intel.com/en-us/articles/reference-implementations-for-IA-approximation-instructions-vrcp14-vrsqrt14-vrcp28-vrsqrt28-vexp2.
+A numerically exact implementation of VRCP14xx can be found at:
+https://software.intel.com/en-us/articles/reference-implementations-for-IA-approximation-instructions-vrcp14vrsqrt14-vrcp28-vrsqrt28-vexp2.
Operation
VRCP14SD (EVEX version)
IF k1[0] OR *no writemask*
@@ -214056,7 +213957,8 @@ X = -2-n
-2n
* in this case the mantissa is shifted right by one or two bits
-A numerically exact implementation of VRCP14xx can be found at https://software.intel.com/en-us/articles/reference-implementations-for-IA-approximation-instructions-vrcp14-vrsqrt14-vrcp28-vrsqrt28-vexp2.
+A numerically exact implementation of VRCP14xx can be found at:
+https://software.intel.com/en-us/articles/reference-implementations-for-IA-approximation-instructions-vrcp14vrsqrt14-vrcp28-vrsqrt28-vexp2.
VRCP14PS—Compute Approximate Reciprocals of Packed Float32 Values
@@ -214113,7 +214015,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F38.W0 4D /r
+EVEX.LIG.66.0F38.W0 4D /r
VRCP14SS xmm1 {k1}{z}, xmm2,
xmm3/m32
@@ -214489,7 +214391,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F3A.W1 57
+EVEX.LIG.66.0F3A.W1 57
VREDUCESD xmm1 {k1}{z},
xmm2, xmm3/m64{sae},
imm8/r
@@ -214781,8 +214683,8 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F3A.W0 57
-/r /ib
+EVEX.LIG.66.0F3A.W0 57 /r
+/ib
VREDUCESS xmm1 {k1}{z},
xmm2, xmm3/m32{sae},
imm8
@@ -215173,7 +215075,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F3A.W1 0B /r ib
+EVEX.LIG.66.0F3A.W1 0B /r ib
VRNDSCALESD xmm1 {k1}{z}, xmm2,
xmm3/m64{sae}, imm8
@@ -215220,9 +215122,9 @@ ModRM:r/m (r)
Imm8
Description
-Rounds a double-precision floating-point value in the low quadword (see Figure 5-29) element the second source
+Rounds a double-precision floating-point value in the low quadword (see Figure 5-29) element of the second source
operand (the third operand) by the rounding mode specified in the immediate operand and places the result in the
-corresponding element of the destination operand (the third operand) according to the writemask. The quadword
+corresponding element of the destination operand (the first operand) according to the writemask. The quadword
element at bits 127:64 of the destination is copied from the first source operand (the second operand).
The destination and first source operands are XMM registers, the 2nd source operand can be an XMM register or
memory location. Bits MAXVL-1:128 of the destination register are cleared.
@@ -215527,7 +215429,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F3A.W0 0A /r ib
+EVEX.LIG.66.0F3A.W0 0A /r ib
VRNDSCALESS xmm1 {k1}{z}, xmm2,
xmm3/m32{sae}, imm8
@@ -215873,7 +215775,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F38.W1 4F /r
+EVEX.LIG.66.0F38.W1 4F /r
VRSQRT14SD xmm1 {k1}{z},
xmm2, xmm3/m64
@@ -216212,7 +216114,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F38.W0 4F /r
+EVEX.LIG.66.0F38.W0 4F /r
VRSQRT14SS xmm1 {k1}{z},
xmm2, xmm3/m32
@@ -216372,13 +216274,13 @@ Flag
AVX512VL
AVX512F
-EVEX.NDS.128.66.0F38.W1 2C /r
+EVEX.128.66.0F38.W1 2C /r
VSCALEFPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F38.W1 2C /r
+EVEX.256.66.0F38.W1 2C /r
VSCALEFPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F38.W1 2C /r
+EVEX.512.66.0F38.W1 2C /r
VSCALEFPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst{er}
@@ -216623,7 +216525,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F38.W1 2D /r
+EVEX.LIG.66.0F38.W1 2D /r
VSCALEFSD xmm1 {k1}{z}, xmm2,
xmm3/m64{er}
@@ -216755,13 +216657,13 @@ Flag
AVX512VL
AVX512F
-EVEX.NDS.128.66.0F38.W0 2C /r
+EVEX.128.66.0F38.W0 2C /r
VSCALEFPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.66.0F38.W0 2C /r
+EVEX.256.66.0F38.W0 2C /r
VSCALEFPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.66.0F38.W0 2C /r
+EVEX.512.66.0F38.W0 2C /r
VSCALEFPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst{er}
@@ -216938,7 +216840,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F38.W0 2D /r
+EVEX.LIG.66.0F38.W0 2D /r
VSCALEFSS xmm1 {k1}{z}, xmm2,
xmm3/m32{er}
@@ -217272,7 +217174,7 @@ Operation
BASE_ADDR stands for the memory operand base address (a GPR); may not exist
VINDEX stands for the memory operand vector of indices (a ZMM register)
SCALE stands for the memory operand scalar (1, 2, 4 or 8)
-DISP is the optional 1, 2 or 4 byte displacement
+DISP is the optional 1 or 4 byte displacement
VSCATTERDPS (EVEX encoded versions)
(KL, VL)= (4, 128), (8, 256), (16, 512)
FOR j ← 0 TO KL-1
@@ -217389,10 +217291,10 @@ Flag
AVX512VL
AVX512F
-EVEX.NDS.256.66.0F3A.W0 23 /r ib
+EVEX.256.66.0F3A.W0 23 /r ib
VSHUFF32X4 ymm1{k1}{z}, ymm2,
ymm3/m256/m32bcst, imm8
-EVEX.NDS.512.66.0F3A.W0 23 /r ib
+EVEX.512.66.0F3A.W0 23 /r ib
VSHUFF32x4 zmm1{k1}{z}, zmm2,
zmm3/m512/m32bcst, imm8
@@ -217402,7 +217304,7 @@ V/V
AVX512F
-EVEX.NDS.256.66.0F3A.W1 23 /r ib
+EVEX.256.66.0F3A.W1 23 /r ib
VSHUFF64X2 ymm1{k1}{z}, ymm2,
ymm3/m256/m64bcst, imm8
@@ -217413,7 +217315,7 @@ V/V
AVX512VL
AVX512F
-EVEX.NDS.512.66.0F3A.W1 23 /r ib
+EVEX.512.66.0F3A.W1 23 /r ib
VSHUFF64x2 zmm1{k1}{z}, zmm2,
zmm3/m512/m64bcst, imm8
@@ -217423,16 +217325,16 @@ V/V
AVX512F
-EVEX.NDS.256.66.0F3A.W0 43 /r ib
+EVEX.256.66.0F3A.W0 43 /r ib
VSHUFI32X4 ymm1{k1}{z}, ymm2,
ymm3/m256/m32bcst, imm8
-EVEX.NDS.512.66.0F3A.W0 43 /r ib
+EVEX.512.66.0F3A.W0 43 /r ib
VSHUFI32x4 zmm1{k1}{z}, zmm2,
zmm3/m512/m32bcst, imm8
-EVEX.NDS.256.66.0F3A.W1 43 /r ib
+EVEX.256.66.0F3A.W1 43 /r ib
VSHUFI64X2 ymm1{k1}{z}, ymm2,
ymm3/m256/m64bcst, imm8
-EVEX.NDS.512.66.0F3A.W1 43 /r ib
+EVEX.512.66.0F3A.W1 43 /r ib
VSHUFI64x2 zmm1{k1}{z}, zmm2,
zmm3/m512/m64bcst, imm8
@@ -219925,7 +219827,7 @@ Other Exceptions
#UD
CPUID.(EAX=7, ECX=0):EBX.RTM[bit 11] = 0.
-If LOCK or 66H or F2H or F3H prefix is used.
+If LOCK prefix is used.
#GP(0)
@@ -220696,19 +220598,19 @@ SSE2
66 0F 57/r
XORPD xmm1, xmm2/m128
-VEX.NDS.128.66.0F.WIG 57 /r
+VEX.128.66.0F.WIG 57 /r
VXORPD xmm1,xmm2,
xmm3/m128
-VEX.NDS.256.66.0F.WIG 57 /r
+VEX.256.66.0F.WIG 57 /r
VXORPD ymm1, ymm2,
ymm3/m256
-EVEX.NDS.128.66.0F.W1 57 /r
+EVEX.128.66.0F.W1 57 /r
VXORPD xmm1 {k1}{z}, xmm2,
xmm3/m128/m64bcst
-EVEX.NDS.256.66.0F.W1 57 /r
+EVEX.256.66.0F.W1 57 /r
VXORPD ymm1 {k1}{z}, ymm2,
ymm3/m256/m64bcst
-EVEX.NDS.512.66.0F.W1 57 /r
+EVEX.512.66.0F.W1 57 /r
VXORPD zmm1 {k1}{z}, zmm2,
zmm3/m512/m64bcst
@@ -220913,7 +220815,7 @@ SSE
NP 0F 57 /r
XORPS xmm1, xmm2/m128
-VEX.NDS.128.0F.WIG 57 /r
+VEX.128.0F.WIG 57 /r
VXORPS xmm1,xmm2, xmm3/m128
B
@@ -220922,7 +220824,7 @@ V/V
AVX
-VEX.NDS.256.0F.WIG 57 /r
+VEX.256.0F.WIG 57 /r
VXORPS ymm1, ymm2, ymm3/m256
B
@@ -220931,13 +220833,13 @@ V/V
AVX
-EVEX.NDS.128.0F.W0 57 /r
+EVEX.128.0F.W0 57 /r
VXORPS xmm1 {k1}{z}, xmm2,
xmm3/m128/m32bcst
-EVEX.NDS.256.0F.W0 57 /r
+EVEX.256.0F.W0 57 /r
VXORPS ymm1 {k1}{z}, ymm2,
ymm3/m256/m32bcst
-EVEX.NDS.512.0F.W0 57 /r
+EVEX.512.0F.W0 57 /r
VXORPS zmm1 {k1}{z}, zmm2,
zmm3/m512/m32bcst
@@ -223102,7 +223004,29 @@ SENTER function.
SENTER Global Enable: Must be set to ‘1’ to enable operation of GETSEC[SENTER].
-63:16
+16
+
+Reserved
+
+17
+
+SGX Launch Control Enable: Must be set to ‘1’ to enable runtime re-configuration of SGX Launch Control via the
+IA32_SGXLEPUBKEYHASHn MSR.
+
+18
+
+SGX Global Enable: Must be set to ‘1’ to enable Intel SGX leaf functions.
+
+19
+
+Reserved
+
+20
+
+LMCE On: When set, system software can program the MSRs associated with LMCE to configure delivery of some
+machine check exceptions to a single logical processor.
+
+63:21
Reserved
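A small sketch of how software might interpret the bit positions this hunk adds, given a 64-bit value already read from the MSR (this table appears to describe IA32_FEATURE_CONTROL; that identification, the helper names, and the dictionary layout are assumptions for illustration). Only the bits listed in the hunk are modeled; lower bits are ignored.

```python
# Bit positions taken from the added table rows; names are the row titles.
FEATURE_BITS = {
    17: "SGX Launch Control Enable",
    18: "SGX Global Enable",
    20: "LMCE On",
}

# Bits 16, 19 and 63:21 are listed as Reserved in the added rows.
RESERVED_MASK = (1 << 16) | (1 << 19) | (((1 << 64) - 1) & ~((1 << 21) - 1))


def decode_feature_bits(msr_value: int) -> dict:
    """Map each named bit to True/False for a 64-bit MSR value."""
    return {name: bool((msr_value >> bit) & 1)
            for bit, name in FEATURE_BITS.items()}


def reserved_bits_set(msr_value: int) -> bool:
    """Report whether any bit marked Reserved in this range is set."""
    return bool(msr_value & RESERVED_MASK)
```

Reading the MSR itself requires privileged access and is deliberately left out of the sketch.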
@@ -223144,6 +223068,10 @@ System software must first query for available GETSEC leaf functions by executin
CAPABILITIES leaf function returns a bit map of available GETSEC leaves. An attempt to execute an unsupported
leaf index results in an undefined opcode (#UD) exception.
+6-2 Vol. 2D
+
+ SAFER MODE EXTENSIONS REFERENCE
+
6.2.2.1
GETSEC[CAPABILITIES]
@@ -223151,11 +223079,6 @@ GETSEC[CAPABILITIES]
The SMX functionality provides an architectural interface for newer processor generations to extend SMX capabilities. Specifically, the GETSEC instruction provides a capability leaf function for system software to discover the
available GETSEC leaf functions that are supported in a processor. Table 6-2 lists the currently available GETSEC
leaf functions.
-
-6-2 Vol. 2D
-
- SAFER MODE EXTENSIONS REFERENCE
-
.
Table 6-2. GETSEC Leaf Functions
@@ -223254,6 +223177,10 @@ near pointer passed with the GETSEC[EXITAC] instruction.
The authenticated code execution area is no longer accessible after completion of GETSEC[EXITAC]. RBX (or EBX)
holds the address of the near absolute indirect target to be taken.
+Vol. 2D 6-3
+
+ SAFER MODE EXTENSIONS REFERENCE
+
6.2.2.4
GETSEC[SENTER]
@@ -223263,10 +223190,6 @@ GETSEC[SENTER] can be considered a superset of the ENTERACCS leaf, because it en
environment launch.
Measured environment startup consists of the following steps:
-Vol. 2D 6-3
-
- SAFER MODE EXTENSIONS REFERENCE
-
the ILP rendezvous the responding logical processors (RLPs) in the platform into a controlled state (At the
@@ -223330,6 +223253,10 @@ GETSEC[WAKEUP]. When the RLPs in SENTER sleep state wake up, these logical proce
entry point defined in a data structure held in system memory (pointed to by an chipset register LT.MLE.JOIN) in
TXT configuration space.
+6-4 Vol. 2D
+
+ SAFER MODE EXTENSIONS REFERENCE
+
6.2.3
Measured Environment and SMX
@@ -223337,11 +223264,6 @@ Measured Environment and SMX
This section gives a simplified view of a representative life cycle of a measured environment that is launched by a
system executive using SMX leaf functions. Intel® Trusted Execution Technology Measured Launched Environment
Programming Guide provides more detailed examples of using SMX and chipset resources (including chipset registers, Trusted Platform Module) to launch an MVMM.
-
-6-4 Vol. 2D
-
- SAFER MODE EXTENSIONS REFERENCE
-
The life cycle starts with the system executive (an OS, an OS loader, and so forth) loading the MLE and SINIT AC
module into available system memory. The system executive must validate and prepare the platform for the
measured launch. When the platform is properly configured, the system executive executes GETSEC[SENTER] on
@@ -223357,7 +223279,7 @@ capable of DMA, producing a hash of the MLE, storing the hash value in TPM PCR 1
When SINIT completes execution, it executes the GETSEC[EXITAC] instruction and transfers control the MLE at the
designated entry point.
Upon receiving control from the SINIT AC module, the MLE must establish its protection and isolation controls
-before enabling DMA and interrupts and transferring control to other software modules. It must also wakeup the
+before enabling DMA and interrupts and transferring control to other software modules. It must also wake up the
RLPs from their SENTER sleep state using the GETSEC[WAKEUP] instruction and bring them into its protection and
isolation environment.
While executing in a measured environment, the MVMM can access the Trusted Platform Module (TPM) in locality 2.
@@ -223381,27 +223303,26 @@ of an undefined opcode exception.
All GETSEC leaf functions are available in protected mode, including the compatibility sub-mode of IA-32e mode
and the 64-bit sub-mode of IA-32e mode. Unless otherwise noted, the behavior of all GETSEC functions and interactions related to the measured environment are independent of IA-32e mode. This also applies to the interpretation of register widths1 passed as input parameters to GETSEC functions and to register results returned as output
parameters.
-The GETSEC functions ENTERACCS, SENTER, SEXIT, and WAKEUP require a Intel® TXT capable-chipset to be
-present in the platform. The GETSEC[CAPABILITIES] returned bit vector in position 0 indicates an Intel® TXTcapable chipset has been sampled present2 by the processor.
-The processor's operating mode also affects the execution of the following GETSEC leaf functions: SMCTRL, ENTERACCS, EXITAC, SENTER, SEXIT, and WAKEUP. These functions are only allowed in protected mode at CPL = 0. They
+
1.
This chapter uses the 64-bit notation RAX, RIP, RSP, RFLAGS, etc. for processor registers because processors that support SMX also
support Intel 64 Architecture. The MVMM can be launched in IA-32e mode or outside IA-32e mode. The 64-bit notation of processor
registers also refer to its 32-bit forms if SMX is used in 32-bit environment. In some places, notation such as EAX is used to refer
specifically to lower 32 bits of the indicated register
-
-2. Sampled present means that the processor sent a message to the chipset and the chipset responded that it (a) knows about the
-message and (b) is capable of executing SENTER. This means that the chipset CAN support Intel® TXT, and is configured and WILLING
-to support it.
Vol. 2D 6-5
SAFER MODE EXTENSIONS REFERENCE
-
+The GETSEC functions ENTERACCS, SENTER, SEXIT, and WAKEUP require a Intel® TXT capable-chipset to be
+present in the platform. The GETSEC[CAPABILITIES] returned bit vector in position 0 indicates an Intel® TXTcapable chipset has been sampled present1 by the processor.
+The processor's operating mode also affects the execution of the following GETSEC leaf functions: SMCTRL, ENTERACCS, EXITAC, SENTER, SEXIT, and WAKEUP. These functions are only allowed in protected mode at CPL = 0. They
are not allowed while in SMM in order to prevent potential intra-mode conflicts. Further execution qualifications
exist to prevent potential architectural conflicts (for example: nesting of the measured environment or authenticated code execution mode). See the definitions of the GETSEC leaf functions for specific requirements.
For the purpose of performance monitor counting, the execution of GETSEC functions is counted as a single instruction with respect to retired instructions. The response by a responding logical processor (RLP) to messages associated with GETSEC[SENTER] or GTSEC[SEXIT] is transparent to the retired instruction count on the ILP.
+1. Sampled present means that the processor sent a message to the chipset and the chipset responded that it (a) knows about the
+message and (b) is capable of executing SENTER. This means that the chipset CAN support Intel® TXT, and is configured and WILLING
+to support it.
6-6 Vol. 2D
SAFER MODE EXTENSIONS REFERENCE
@@ -223428,10 +223349,10 @@ EAX. GETSEC[CAPABILITIES] may be executed at all privilege levels, but the CR4.S
With EBX = 0 upon execution of GETSEC[CAPABILITIES], EAX returns the a bit vector representing status on the
presence of a Intel® TXT-capable chipset and the first 30 available GETSEC leaf functions. The format of the
returned bit vector is provided in Table 6-3.
-If bit 0 is set to 1, then an Intel® TXT-capable chipset has been sampled present by the processor. If bits in the
-range of 1-30 are set, then the corresponding GETSEC leaf function is available. If the bit value at a given bit index
-is 0, then the GETSEC leaf function corresponding to that index is unsupported and attempted execution results in
-a #UD.
+If bit 0 is set to 1, then an Intel® TXT-capable chipset has been sampled present by the processor. If bits in the range
+of 1-30 are set, then the corresponding GETSEC leaf function is available. If the bit value at a given bit index is 0,
+then the GETSEC leaf function corresponding to that index is unsupported and attempted execution results in a
+#UD.
Bit 31 of EAX indicates if further leaf indexes are supported. If the Extended Leafs bit 31 is set, then additional leaf
functions are accessed by repeating GETSEC[CAPABILITIES] with EBX incremented by one. When the most significant bit of EAX is not set, then additional GETSEC leaf functions are not supported; indexing EBX to a higher value
results in EAX returning zero.
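
The bit layout described in this hunk (bit 0 for chipset presence, bits 1-30 for leaf availability, bit 31 for extended leafs) can be illustrated with a small decoder. This is only a sketch of the arithmetic, assuming the EAX value has already been obtained elsewhere; the function name, dictionary keys, and sample value are hypothetical and are not part of the manual or of any Intel-provided interface.

```python
# Sketch: decode the EAX value returned by GETSEC[CAPABILITIES] with EBX = 0.
# The bit positions follow the text above; all names here are illustrative
# assumptions, not an official API.

CHIPSET_PRESENT_BIT = 0   # bit 0: Intel TXT-capable chipset sampled present
EXTENDED_LEAFS_BIT = 31   # bit 31: more leaf indexes reachable via EBX + 1


def decode_capabilities(eax):
    """Split a 32-bit capabilities vector into its documented fields."""
    eax &= 0xFFFFFFFF  # keep only the low 32 bits
    return {
        "chipset_present": bool((eax >> CHIPSET_PRESENT_BIT) & 1),
        # Bit indexes 1..30 that are set; a clear bit means the leaf with
        # that index is unsupported and executing it results in #UD.
        "available_leaves": [i for i in range(1, 31) if (eax >> i) & 1],
        "extended_leafs": bool((eax >> EXTENDED_LEAFS_BIT) & 1),
    }


# Hypothetical sample value: bits 0, 2, 4 and 31 set.
print(decode_capabilities(0x80000015))
```

When `extended_leafs` is true, the text above implies the caller would repeat the query with EBX incremented by one and decode the next vector; that loop is omitted here because it requires executing the instruction itself.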
@@ -223992,7 +223913,6 @@ access to this normally restricted chipset state for the purpose of securing the
Once the authenticated code module is launched at the completion of GETSEC[ENTERACCS], it is free to enable
interrupts by setting EFLAGS.IF and enable NMI by execution of IRET. This presumes that it has re-established
interrupt handling support through initialization of the IDT, GDT, and corresponding interrupt handling code.
-
GETSEC[ENTERACCS] - Execute Authenticated Chipset Code
Vol. 2D 6-13
@@ -224285,8 +224205,6 @@ The content of the authenticated code execution area is invalidated by hardware
use or visibility. This internal processor storage area can no longer be used or relied upon after GETSEC[EXITAC].
Data structures need to be re-established outside of the authenticated code execution area if they are to be referenced after EXITAC. Since addressed memory content formerly mapped to the authenticated code execution area
may no longer be coherent with external system memory after EXITAC, processor TLBs in support of linear to physical address translation are also invalidated.
-Upon completion of GETSEC[EXITAC] a near absolute indirect transfer is performed with EIP loaded with the
-contents of EBX (based on the current operating mode size). In 64-bit mode, all 64 bits of RBX are loaded into RIP
6-18 Vol. 2D
@@ -224294,6 +224212,8 @@ GETSEC[EXITAC]—Exit Authenticated Code Execution Mode
SAFER MODE EXTENSIONS REFERENCE
+Upon completion of GETSEC[EXITAC] a near absolute indirect transfer is performed with EIP loaded with the
+contents of EBX (based on the current operating mode size). In 64-bit mode, all 64 bits of RBX are loaded into RIP
if REX.W precedes GETSEC[EXITAC]. Otherwise RBX is treated as 32 bits even while in 64-bit mode. Conventional
CS limit checking is performed as part of this control transfer. Any exception conditions generated as part of this
control transfer will be directed to the existing IDT; thus it is recommended that an IDTR should also be established
@@ -224342,6 +224262,12 @@ END;
Flags Affected
None.
+GETSEC[EXITAC]—Exit Authenticated Code Execution Mode
+
+Vol. 2D 6-19
+
+ SAFER MODE EXTENSIONS REFERENCE
+
Use of Prefixes
LOCK
@@ -224355,12 +224281,6 @@ Operand size
Causes #UD.
-GETSEC[EXITAC]—Exit Authenticated Code Execution Mode
-
-Vol. 2D 6-19
-
- SAFER MODE EXTENSIONS REFERENCE
-
Segment overrides Ignored.
Address size
@@ -224502,11 +224422,6 @@ thus synchronizing the RLP(s) with the ILP.
In response to a message signaling the completion of rendezvous, RLPs clear the bootstrap processor indicator flag
(IA32_APIC_BASE.BSP) and enter an SENTER sleep state. In this sleep state, RLPs enter an idle processor condition while waiting to be activated after a measured environment has been established by the system executive.
RLPs in the SENTER sleep state can only be activated by the GETSEC leaf function WAKEUP in a measured environment.
-A successful launch of the measured environment results in the initiating logical processor entering the authenticated code execution mode. Prior to reaching this point, the ILP performs the following steps internally:
-
-•
-
-Inhibit processor response to the external events: INIT, A20M, NMI, and SMI.
GETSEC[SENTER]—Enter a Measured Environment
@@ -224514,6 +224429,9 @@ Vol. 2D 6-21
SAFER MODE EXTENSIONS REFERENCE
+A successful launch of the measured environment results in the initiating logical processor entering the authenticated code execution mode. Prior to reaching this point, the ILP performs the following steps internally:
+
+•
@@ -224526,6 +224444,7 @@ Vol. 2D 6-21
+Inhibit processor response to the external events: INIT, A20M, NMI, and SMI.
Establish and check the location and size of the authenticated code module to be executed by the ILP.
Check for the existence of an Intel® TXT-capable chipset.
Verify the current power management configuration is acceptable.
@@ -224571,14 +224490,13 @@ purpose of this masking control is to prevent exposure to existing external even
has been put in place to directly handle these events. Masked external pin events may be unmasked conditionally
or unconditionally via the GETSEC[EXITAC], GETSEC[SEXIT], GETSEC[SMCTRL] or for specific VMX related operations such as a VM entry or the VMXOFF instruction (see respective GETSEC leaves and Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C for more details). The state of the A20M pin is masked and
forced internally to a de-asserted state so that external assertion is not recognized. A20M masking as set by
-GETSEC[SENTER] is undone only after taking down the measured environment with the GETSEC[SEXIT] instruction or processor reset. INTR is masked by simply clearing the EFLAGS.IF bit. It is the responsibility of system software to control the processor response to INTR through appropriate management of EFLAGS.
-
6-22 Vol. 2D
GETSEC[SENTER]—Enter a Measured Environment
SAFER MODE EXTENSIONS REFERENCE
+GETSEC[SENTER] is undone only after taking down the measured environment with the GETSEC[SEXIT] instruction or processor reset. INTR is masked by simply clearing the EFLAGS.IF bit. It is the responsibility of system software to control the processor response to INTR through appropriate management of EFLAGS.
To prevent other (logical) processors from interfering with the ILP operating in authenticated code execution mode,
memory (excluding implicit write-back transactions) and I/O activities originating from other processor agents are
blocked. This protection starts when the ILP enters into authenticated code execution mode. Only memory and I/O
@@ -225013,18 +224931,18 @@ REX
Ignored.
-Protected Mode Exceptions
-#UD
-
-If CR4.SMXE = 0.
-If GETSEC[SENTER] is not reported as supported by GETSEC[CAPABILITIES].
-
6-28 Vol. 2D
GETSEC[SENTER]—Enter a Measured Environment
SAFER MODE EXTENSIONS REFERENCE
+Protected Mode Exceptions
+#UD
+
+If CR4.SMXE = 0.
+If GETSEC[SENTER] is not reported as supported by GETSEC[CAPABILITIES].
+
#GP(0)
If CR0.CD = 1 or CR0.NW = 1 or CR0.NE = 0 or CR0.PE = 0 or CPL > 0 or EFLAGS.VM = 1.
@@ -226226,7 +226144,7 @@ Flag
Description
-EVEX.DDS.512.F2.0F38.W0 9A /r
+EVEX.512.F2.0F38.W0 9A /r
V4FMADDPS zmm1{k1}{z}, zmm2+3,
m128
@@ -226241,7 +226159,7 @@ values from source register block indicated by
zmm2 by values from m128 and accumulate the
result in zmm1.
-EVEX.DDS.512.F2.0F38.W0 AA /r
+EVEX.512.F2.0F38.W0 AA /r
V4FNMADDPS zmm1{k1}{z},
zmm2+3, m128
@@ -226363,7 +226281,7 @@ Flag
Description
-EVEX.DDS.LLIG.F2.0F38.W0 9B /r
+EVEX.LLIG.F2.0F38.W0 9B /r
V4FMADDSS xmm1{k1}{z},
xmm2+3, m128
@@ -226378,7 +226296,7 @@ values from source register block indicated by
xmm2 by values from m128 and accumulate the
result in xmm1.
-EVEX.DDS.LLIG.F2.0F38.W0 AB /r
+EVEX.LLIG.F2.0F38.W0 AB /r
V4FNMADDSS xmm1{k1}{z},
xmm2+3, m128
@@ -227111,7 +227029,7 @@ Flag
Description
-EVEX.DDS.512.F2.0F38.W0 53 /r
+EVEX.512.F2.0F38.W0 53 /r
VP4DPWSSDS zmm1{k1}{z},
zmm2+3, m128
@@ -227225,7 +227143,7 @@ Flag
Description
-EVEX.DDS.512.F2.0F38.W0 52 /r
+EVEX.512.F2.0F38.W0 52 /r
VP4DPWSSD zmm1{k1}{z}, zmm2+3,
m128
@@ -227549,7 +227467,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F38.W1 CB /r
+EVEX.LIG.66.0F38.W1 CB /r
VRCP28SD xmm1 {k1}{z}, xmm2,
xmm3/m64 {sae}
@@ -227897,7 +227815,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F38.W0 CB /r
+EVEX.LIG.66.0F38.W0 CB /r
VRCP28SS xmm1 {k1}{z},
xmm2, xmm3/m32 {sae}
@@ -228222,7 +228140,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F38.W1 CD /r
+EVEX.LIG.66.0F38.W1 CD /r
VRSQRT28SD xmm1 {k1}{z},
xmm2, xmm3/m64 {sae}
@@ -228527,7 +228445,7 @@ Instruction
Op /
En
-EVEX.NDS.LIG.66.0F38.W0 CD /r
+EVEX.LIG.66.0F38.W0 CD /r
VRSQRT28SS xmm1 {k1}{z},
xmm2, xmm3/m32 {sae}
@@ -251900,8 +251818,8 @@ Basic Architecture, Order Number 253665; Instruction Set Reference A-Z, Order Nu
System Programming Guide, Order Number 325384; Model-Specific Registers, Order Number 335592.
Refer to all four volumes when evaluating your design needs.
-Order Number: 325384-067US
-May 2018
+Order Number: 325384-068US
+November 2018
Intel technologies features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Learn
more at intel.com, or from the OEM or retailer.
@@ -252598,29 +252516,29 @@ IA32_MISC_ENABLE MSR. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Microcode Update Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-33
PROGRAMMING CONSIDERATIONS FOR HARDWARE MULTI-THREADING CAPABLE PROCESSORS . . . . . . . . . . . . . . . . . . . . . 8-34
Hierarchical Mapping of Shared Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-34
-Hierarchical Mapping of CPUID Extended Topology Leaf. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-35
-Hierarchical ID of Logical Processors in an MP System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-36
-Hierarchical ID of Logical Processors with x2APIC ID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-38
-Algorithm for Three-Level Mappings of APIC_ID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-38
-Identifying Topological Relationships in a MP System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-43
-MANAGEMENT OF IDLE AND BLOCKED CONDITIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-47
-HLT Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-47
-PAUSE Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-47
-Detecting Support MONITOR/MWAIT Instruction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-47
-MONITOR/MWAIT Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-48
-Monitor/Mwait Address Range Determination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-49
-Required Operating System Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-49
-Use the PAUSE Instruction in Spin-Wait Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-50
-Potential Usage of MONITOR/MWAIT in C0 Idle Loops. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-50
-Halt Idle Logical Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-51
-Potential Usage of MONITOR/MWAIT in C1 Idle Loops. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-52
-Guidelines for Scheduling Threads on Logical Processors Sharing Execution Resources . . . . . . . . . . . . . . . . . . . . . . . . . 8-52
-Eliminate Execution-Based Timing Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-53
-Place Locks and Semaphores in Aligned, 128-Byte Blocks of Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-53
-MP INITIALIZATION FOR P6 FAMILY PROCESSORS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-53
-Overview of the MP Initialization Process For P6 Family Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-53
-MP Initialization Protocol Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-54
-Error Detection and Handling During the MP Initialization Protocol. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-55
+Hierarchical Mapping of CPUID Extended Topology Leaf. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-36
+Hierarchical ID of Logical Processors in an MP System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-38
+Hierarchical ID of Logical Processors with x2APIC ID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-40
+Algorithm for Three-Level Mappings of APIC_ID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-40
+Identifying Topological Relationships in a MP System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-45
+MANAGEMENT OF IDLE AND BLOCKED CONDITIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-49
+HLT Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-49
+PAUSE Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-49
+Detecting Support MONITOR/MWAIT Instruction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-49
+MONITOR/MWAIT Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-50
+Monitor/Mwait Address Range Determination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-51
+Required Operating System Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-51
+Use the PAUSE Instruction in Spin-Wait Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-52
+Potential Usage of MONITOR/MWAIT in C0 Idle Loops. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-52
+Halt Idle Logical Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-53
+Potential Usage of MONITOR/MWAIT in C1 Idle Loops. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-54
+Guidelines for Scheduling Threads on Logical Processors Sharing Execution Resources . . . . . . . . . . . . . . . . . . . . . . . . . 8-54
+Eliminate Execution-Based Timing Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-55
+Place Locks and Semaphores in Aligned, 128-Byte Blocks of Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-55
+MP INITIALIZATION FOR P6 FAMILY PROCESSORS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-55
+Overview of the MP Initialization Process For P6 Family Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-55
+MP Initialization Protocol Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-56
+Error Detection and Handling During the MP Initialization Protocol. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-58
CHAPTER 9
PROCESSOR MANAGEMENT AND INITIALIZATION
@@ -253350,8 +253268,8 @@ Processor Model Specific Error Code Field Type B: Bus and Interconnect Error . .
16.2.2.2
Processor Model Specific Error Code Field Type C: Cache Bus Controller Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-7
16.3
-INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL
-SIGNATURE 06_1AH, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-7
+INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE
+06_1AH, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-7
16.3.1
Intel QPI Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-8
16.3.2
@@ -253359,8 +253277,8 @@ Internal Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . .
16.3.3
Memory Controller Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-9
16.4
-INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL
-SIGNATURE 06_2DH, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-10
+INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE
+06_2DH, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-10
16.4.1
Internal Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16-10
16.4.2
@@ -253368,15 +253286,15 @@ Intel QPI Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . .
16.4.3
Integrated Memory Controller Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16-11
16.5
-INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL
-SIGNATURE 06_3EH, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-13
+INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE
+06_3EH, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-13
16.5.1
Internal Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16-13
16.5.2
Integrated Memory Controller Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16-14
16.6
-INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL
-SIGNATURE 06_3FH, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-15
+INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE
+06_3FH, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-15
16.6.1
Internal Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16-16
16.6.2
@@ -253384,22 +253302,22 @@ Intel QPI Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . .
16.6.3
Integrated Memory Controller Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16-17
16.7
-INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL
-SIGNATURE 06_56H, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-19
+INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE
+06_56H, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-19
16.7.1
Internal Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16-19
16.7.2
Integrated Memory Controller Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16-20
16.8
-INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL
-SIGNATURE 06_4FH, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-21
+INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE
+06_4FH, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-21
16.8.1
Integrated Memory Controller Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16-21
16.8.2
Home Agent Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16-22
16.9
-INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL
-SIGNATURE 06_55H, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-22
+INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE
+06_55H, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-22
16.9.1
Internal Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16-22
16.9.2
@@ -253410,8 +253328,8 @@ Integrated Memory Controller Machine Check Errors . . . . . . . . . . . . . . .
M2M Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16-26
16.9.5
Home Agent Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16-27
-16.10 INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL
-SIGNATURE 06_5FH, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-28
+16.10 INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY WITH CPUID DISPLAYFAMILY_DISPLAYMODEL SIGNATURE
+06_5FH, MACHINE ERROR CODES FOR MACHINE CHECK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16-28
16.10.1
Integrated Memory Controller Machine Check Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16-28
16.11 INCREMENTAL DECODING INFORMATION: PROCESSOR FAMILY 0FH MACHINE ERROR CODES FOR MACHINE CHECK . . 16-29
@@ -253519,141 +253437,141 @@ Branch Trace Message Visibility . . . . . . . . . . . . . . . . . . . . . . . .
Branch Trace Store (BTS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-14
CPL-Qualified Branch Trace Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-14
Freezing LBR and Performance Counters on PMI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-14
-LBR Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-15
-LBR Stack and Intel® 64 Processors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-17
-LBR Stack and IA-32 Processors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-18
-Last Exception Records and Intel 64 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-18
+LBR Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-16
+LBR Stack and Intel® 64 Processors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-16
+LBR Stack and IA-32 Processors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-17
+Last Exception Records and Intel 64 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-17
BTS and DS Save Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-18
-64 Bit Format of the DS Save Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-21
-Setting Up the DS Save Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-23
-Setting Up the BTS Buffer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-24
-Setting Up CPL-Qualified BTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-25
-Writing the DS Interrupt Service Routine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-25
-LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (INTEL® CORE™ 2 DUO AND INTEL® ATOM™ PROCESSORS) . 17-26
-LBR Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-26
-LBR Stack in Intel Atom Processors based on the Silvermont Microarchitecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-27
+64 Bit Format of the DS Save Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-20
+Setting Up the DS Save Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-22
+Setting Up the BTS Buffer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-23
+Setting Up CPL-Qualified BTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-24
+Writing the DS Interrupt Service Routine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-24
+LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (INTEL® CORE™ 2 DUO AND INTEL® ATOM™ PROCESSORS) . 17-25
+LBR Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-25
+LBR Stack in Intel Atom Processors based on the Silvermont Microarchitecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-26
LAST BRANCH, CALL STACK, INTERRUPT, AND EXCEPTION RECORDING FOR PROCESSORS BASED ON GOLDMONT
+MICROARCHITECTURE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-26
+LAST BRANCH, CALL STACK, INTERRUPT, AND EXCEPTION RECORDING FOR PROCESSORS BASED ON GOLDMONT PLUS
MICROARCHITECTURE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-27
-LAST BRANCH, CALL STACK, INTERRUPT, AND EXCEPTION RECORDING FOR PROCESSORS BASED ON GOLDMONT
-PLUS MICROARCHITECTURE. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-28
-LAST BRANCH, INTERRUPT AND EXCEPTION RECORDING FOR INTEL® XEON PHI™ PROCESSOR 7200/5200/3200 . . . 17-28
+LAST BRANCH, INTERRUPT AND EXCEPTION RECORDING FOR INTEL® XEON PHI™ PROCESSOR 7200/5200/3200 . . . 17-27
LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING FOR PROCESSORS BASED ON INTEL® MICROARCHITECTURE
-CODE NAME NEHALEM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-28
-LBR Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-30
-Filtering of Last Branch Records. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-30
+CODE NAME NEHALEM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-27
+LBR Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-29
+Filtering of Last Branch Records. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-29
LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING FOR PROCESSORS BASED ON INTEL® MICROARCHITECTURE
-CODE NAME SANDY BRIDGE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-31
+CODE NAME SANDY BRIDGE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-30
LAST BRANCH, CALL STACK, INTERRUPT, AND EXCEPTION RECORDING FOR PROCESSORS BASED ON HASWELL
-MICROARCHITECTURE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-31
-LBR Stack Enhancement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-32
+MICROARCHITECTURE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-30
+LBR Stack Enhancement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-31
LAST BRANCH, CALL STACK, INTERRUPT, AND EXCEPTION RECORDING FOR PROCESSORS BASED ON SKYLAKE
-MICROARCHITECTURE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-33
-MSR_LBR_INFO_x MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-33
-Streamlined Freeze_LBRs_On_PMI Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-34
-LBR Behavior and Deep C-State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-34
+MICROARCHITECTURE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-32
+MSR_LBR_INFO_x MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-32
+Streamlined Freeze_LBRs_On_PMI Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-33
+LBR Behavior and Deep C-State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-33
LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (PROCESSORS BASED ON INTEL NETBURST®
-MICROARCHITECTURE). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-34
-MSR_DEBUGCTLA MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-35
+MICROARCHITECTURE). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-33
+MSR_DEBUGCTLA MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-34
Vol. 3A xv
CONTENTS
PAGE
17.13.2
-LBR Stack for Processors Based on Intel NetBurst® Microarchitecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-36
+LBR Stack for Processors Based on Intel NetBurst® Microarchitecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-35
17.13.3
-Last Exception Records . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-37
+Last Exception Records . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-36
17.14 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (INTEL® CORE™ SOLO AND INTEL® CORE™ DUO
-PROCESSORS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-37
-17.15 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (PENTIUM M PROCESSORS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-39
-17.16 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (P6 FAMILY PROCESSORS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-40
+PROCESSORS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-36
+17.15 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (PENTIUM M PROCESSORS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-38
+17.16 LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (P6 FAMILY PROCESSORS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-39
17.16.1
-DEBUGCTLMSR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-40
+DEBUGCTLMSR Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-39
17.16.2
-Last Branch and Last Exception MSRs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-41
+Last Branch and Last Exception MSRs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-40
17.16.3
-Monitoring Branches, Exceptions, and Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-41
-17.17 TIME-STAMP COUNTER . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-42
+Monitoring Branches, Exceptions, and Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-40
+17.17 TIME-STAMP COUNTER . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-41
17.17.1
-Invariant TSC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-43
+Invariant TSC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-42
17.17.2
-IA32_TSC_AUX Register and RDTSCP Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-43
+IA32_TSC_AUX Register and RDTSCP Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-42
17.17.3
-Time-Stamp Counter Adjustment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-44
+Time-Stamp Counter Adjustment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-43
17.17.4
-Invariant Time-Keeping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-44
-17.18 INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) MONITORING FEATURES. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-44
+Invariant Time-Keeping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-43
+17.18 INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) MONITORING FEATURES. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-43
17.18.1
-Overview of Cache Monitoring Technology and Memory Bandwidth Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-45
+Overview of Cache Monitoring Technology and Memory Bandwidth Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-44
17.18.2
-Enabling Monitoring: Usage Flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-45
+Enabling Monitoring: Usage Flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-44
17.18.3
-Enumeration and Detecting Support of Cache Monitoring Technology and Memory Bandwidth Monitoring . . . . . . . . .17-46
+Enumeration and Detecting Support of Cache Monitoring Technology and Memory Bandwidth Monitoring . . . . . . . . .17-45
17.18.4
-Monitoring Resource Type and Capability Enumeration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-46
+Monitoring Resource Type and Capability Enumeration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-45
17.18.5
-Feature-Specific Enumeration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-47
+Feature-Specific Enumeration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-46
17.18.5.1
-Cache Monitoring Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-48
+Cache Monitoring Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-47
17.18.5.2
-Memory Bandwidth Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-48
+Memory Bandwidth Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-47
17.18.6
-Monitoring Resource RMID Association. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-48
+Monitoring Resource RMID Association. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-47
17.18.7
-Monitoring Resource Selection and Reporting Infrastructure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-49
+Monitoring Resource Selection and Reporting Infrastructure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-48
17.18.8
-Monitoring Programming Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-50
+Monitoring Programming Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-49
17.18.8.1
-Monitoring Dynamic Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-51
+Monitoring Dynamic Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-50
17.18.8.2
-Monitoring Operation With Power Saving Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-51
+Monitoring Operation With Power Saving Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-50
17.18.8.3
-Monitoring Operation with Other Operating Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-51
+Monitoring Operation with Other Operating Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-50
17.18.8.4
-Monitoring Operation with RAS Features. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-51
-17.19 INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) ALLOCATION FEATURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-51
+Monitoring Operation with RAS Features. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-50
+17.19 INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) ALLOCATION FEATURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-50
17.19.1
-Introduction to Cache Allocation Technology (CAT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-51
+Introduction to Cache Allocation Technology (CAT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-50
17.19.2
-Cache Allocation Technology Architecture. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-52
+Cache Allocation Technology Architecture. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-51
17.19.3
-Code and Data Prioritization (CDP) Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-55
+Code and Data Prioritization (CDP) Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-54
17.19.4
-Enabling Cache Allocation Technology Usage Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-56
+Enabling Cache Allocation Technology Usage Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-55
17.19.4.1
-Enumeration and Detection Support of Cache Allocation Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-57
+Enumeration and Detection Support of Cache Allocation Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-56
17.19.4.2
-Cache Allocation Technology: Resource Type and Capability Enumeration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-57
+Cache Allocation Technology: Resource Type and Capability Enumeration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-56
17.19.4.3
-Cache Allocation Technology: Cache Mask Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-60
+Cache Allocation Technology: Cache Mask Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-59
17.19.4.4
-Class of Service to Cache Mask Association: Common Across Allocation Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-60
+Class of Service to Cache Mask Association: Common Across Allocation Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-59
17.19.5
-Code and Data Prioritization (CDP): Enumerating and Enabling L3 CDP Technology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-61
+Code and Data Prioritization (CDP): Enumerating and Enabling L3 CDP Technology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-60
17.19.5.1
-Mapping Between L3 CDP Masks and CAT Masks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-61
+Mapping Between L3 CDP Masks and CAT Masks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-60
17.19.6
-Code and Data Prioritization (CDP): Enumerating and Enabling L2 CDP Technology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-62
+Code and Data Prioritization (CDP): Enumerating and Enabling L2 CDP Technology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-61
17.19.6.1
-Mapping Between L2 CDP Masks and L2 CAT Masks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-63
+Mapping Between L2 CDP Masks and L2 CAT Masks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-62
17.19.6.2
-Common L2 and L3 CDP Programming Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-63
+Common L2 and L3 CDP Programming Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-62
17.19.6.3
-Cache Allocation Technology Dynamic Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-63
+Cache Allocation Technology Dynamic Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-62
17.19.6.4
-Cache Allocation Technology Operation With Power Saving Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-64
+Cache Allocation Technology Operation With Power Saving Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-63
17.19.6.5
-Cache Allocation Technology Operation with Other Operating Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-64
+Cache Allocation Technology Operation with Other Operating Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-63
17.19.6.6
-Associating Threads with CAT/CDP Classes of Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-64
+Associating Threads with CAT/CDP Classes of Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-63
17.19.7
Introduction to Memory Bandwidth Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-64
17.19.7.1
Memory Bandwidth Allocation Enumeration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-65
17.19.7.2
-Memory Bandwidth Allocation Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-67
+Memory Bandwidth Allocation Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-66
17.19.7.3
-Memory Bandwidth Allocation Usage Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-68
+Memory Bandwidth Allocation Usage Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-67
CHAPTER 18
PERFORMANCE MONITORING
@@ -253729,7 +253647,7 @@ Intel® Xeon® Processor E5 v2 and E7 v2 Family Uncore Performance Monitoring Fa
18.3.6
4th Generation Intel® Core™ Processor Performance Monitoring Facility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-47
18.3.6.1
-Processor Event Based Sampling (PEBS) Facility. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-47
+Processor Event Based Sampling (PEBS) Facility. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-48
18.3.6.2
PEBS Data Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-48
18.3.6.3
@@ -253747,7 +253665,7 @@ Intel® Xeon® Processor E5 v3 Family Uncore Performance Monitoring Facility . .
18.3.8
6th Generation, 7th Generation and 8th Generation Intel® Core™ Processor Performance Monitoring Facility . . . . 18-56
18.3.8.1
-Processor Event Based Sampling (PEBS) Facility. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-57
+Processor Event Based Sampling (PEBS) Facility. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-58
18.3.8.2
Off-core Response Performance Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-62
18.4
@@ -253781,23 +253699,23 @@ Performance Monitoring for Goldmont Plus Microarchitecture . . . . . . . . . . .
18.5.4.1
Extended PEBS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-82
18.5.4.2
-Reduced Skid PEBS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-84
+Reduced Skid PEBS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-83
18.6
-PERFORMANCE MONITORING (LEGACY INTEL PROCESSORS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-84
+PERFORMANCE MONITORING (LEGACY INTEL PROCESSORS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-83
18.6.1
-Performance Monitoring (Intel® Core™ Solo and Intel® Core™ Duo Processors) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-84
+Performance Monitoring (Intel® Core™ Solo and Intel® Core™ Duo Processors) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-83
18.6.2
Performance Monitoring (Processors Based on Intel® Core™ Microarchitecture) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-85
18.6.2.1
Fixed-function Performance Counters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-86
18.6.2.2
-Global Counter Control Facilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-87
+Global Counter Control Facilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-86
18.6.2.3
At-Retirement Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-88
18.6.2.4
-Processor Event Based Sampling (PEBS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-89
+Processor Event Based Sampling (PEBS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-88
18.6.3
-Performance Monitoring (Processors Based on Intel NetBurst® Microarchitecture) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-91
+Performance Monitoring (Processors Based on Intel NetBurst® Microarchitecture) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-90
18.6.3.1
ESCR MSRs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-94
18.6.3.2
@@ -253857,11 +253775,11 @@ PAGE
Processor Event-Based Sampling (PEBS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-106
Operating System Implications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-107
Performance Monitoring and Intel Hyper-Threading Technology in Processors Based on Intel NetBurst®
-Microarchitecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-108
+Microarchitecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-107
ESCR MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-108
-CCCR MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-109
+CCCR MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-108
IA32_PEBS_ENABLE MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-110
-Performance Monitoring Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-111
+Performance Monitoring Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-110
Counting Clocks on systems with Intel Hyper-Threading Technology in Processors Based on Intel NetBurst®
Microarchitecture. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-112
Performance Monitoring and Dual-Core Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-113
@@ -253886,8 +253804,7 @@ COUNTING CLOCKS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Non-Halted Reference Clockticks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-127
Cycle Counting and Opportunistic Processor Operation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-127
Determining the Processor Base Frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-128
-For Intel® Processors Based on Microarchitecture Code Name Sandy Bridge, Ivy Bridge, Haswell and
-Broadwell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-128
+For Intel® Processors Based on Microarchitecture Code Name Sandy Bridge, Ivy Bridge, Haswell and Broadwell18-128
For Intel® Processors Based on Microarchitecture Code Name Nehalem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-128
For Intel® Atom™ Processors Based on the Silvermont Microarchitecture (Including Intel Processors Based on
Airmont Microarchitecture). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-128
@@ -254356,6 +254273,7 @@ VIRTUAL MACHINE CONTROL STRUCTURES
24.6.17
24.6.18
24.6.19
+24.6.20
24.7
24.7.1
24.7.2
@@ -254374,7 +254292,6 @@ VIRTUAL MACHINE CONTROL STRUCTURES
24.11.1
24.11.2
24.11.3
-24.11.4
23-1
23-1
@@ -254390,56 +254307,58 @@ FORMAT OF THE VMCS REGION. . . . . . . . . . . . . . . . . . . . . . . . . . . .
ORGANIZATION OF VMCS DATA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-3
GUEST-STATE AREA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-4
Guest Register State. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-4
-Guest Non-Register State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-5
+Guest Non-Register State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-6
HOST-STATE AREA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-8
-VM-EXECUTION CONTROL FIELDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-8
-Pin-Based VM-Execution Controls. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-8
+VM-EXECUTION CONTROL FIELDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-9
+Pin-Based VM-Execution Controls. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-9
Processor-Based VM-Execution Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-9
Exception Bitmap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-12
I/O-Bitmap Addresses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-12
Time-Stamp Counter Offset and Multiplier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-12
-Guest/Host Masks and Read Shadows for CR0 and CR4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-12
+Guest/Host Masks and Read Shadows for CR0 and CR4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-13
CR3-Target Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-13
Controls for APIC Virtualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-13
MSR-Bitmap Address. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-14
-Executive-VMCS Pointer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-14
+Executive-VMCS Pointer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-15
Extended-Page-Table Pointer (EPTP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-15
Virtual-Processor Identifier (VPID) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-15
Controls for PAUSE-Loop Exiting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-15
VM-Function Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-16
VMCS Shadowing Bitmap Addresses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-16
ENCLS-Exiting Bitmap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-16
-Control Field for Page-Modification Logging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-16
-Controls for Virtualization Exceptions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-16
+ENCLV-Exiting Bitmap. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-16
+Control Field for Page-Modification Logging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-17
+Controls for Virtualization Exceptions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-17
XSS-Exiting Bitmap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-17
VM-EXIT CONTROL FIELDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-17
VM-Exit Controls. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-17
VM-Exit Controls for MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-18
-VM-ENTRY CONTROL FIELDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-18
+VM-ENTRY CONTROL FIELDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-19
VM-Entry Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-19
-VM-Entry Controls for MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-19
+VM-Entry Controls for MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-20
VM-Entry Controls for Event Injection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-20
VM-EXIT INFORMATION FIELDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-21
-Basic VM-Exit Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-21
+Basic VM-Exit Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-22
Information for VM Exits Due to Vectored Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-22
-Information for VM Exits That Occur During Event Delivery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-22
-Information for VM Exits Due to Instruction Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-23
-VM-Instruction Error Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-23
+Information for VM Exits That Occur During Event Delivery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-23
+Information for VM Exits Due to Instruction Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-24
+VM-Instruction Error Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-24
VMCS TYPES: ORDINARY AND SHADOW. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-24
-SOFTWARE USE OF THE VMCS AND RELATED STRUCTURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-24
-Software Use of Virtual-Machine Control Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-24
-VMREAD, VMWRITE, and Encodings of VMCS Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-25
+SOFTWARE USE OF THE VMCS AND RELATED STRUCTURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-25
+Software Use of Virtual-Machine Control Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-25
+VMREAD, VMWRITE, and Encodings of VMCS Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-26
Initializing a VMCS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-27
-Software Access to Related Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24-27
xxii Vol. 3A
CONTENTS
PAGE
+24.11.4
24.11.5
-VMXON Region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-27
+Software Access to Related Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-28
+VMXON Region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-28
CHAPTER 25
VMX NON-ROOT OPERATION
@@ -254475,20 +254394,20 @@ OTHER CAUSES OF VM EXITS . . . . . . . . . . . . . . . . . . . . . . . . . . . .
CHANGES TO INSTRUCTION BEHAVIOR IN VMX NON-ROOT OPERATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-6
OTHER CHANGES IN VMX NON-ROOT OPERATION. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-10
Event Blocking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-10
-Treatment of Task Switches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-10
+Treatment of Task Switches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-11
FEATURES SPECIFIC TO VMX NON-ROOT OPERATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-11
-VMX-Preemption Timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-11
+VMX-Preemption Timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-12
Monitor Trap Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-12
Translation of Guest-Physical Addresses Using EPT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-13
APIC Virtualization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-13
VM Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-13
Enabling VM Functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-13
-General Operation of the VMFUNC Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-13
+General Operation of the VMFUNC Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-14
EPTP Switching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-14
Virtualization Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-15
-Convertible EPT Violations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-15
+Convertible EPT Violations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-16
Virtualization-Exception Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-16
-Delivery of Virtualization Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-16
+Delivery of Virtualization Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-17
UNRESTRICTED GUESTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-17
CHAPTER 26
@@ -254524,7 +254443,6 @@ VM ENTRIES
26.5.1.2
26.5.1.3
26.5.2
-26.6
BASIC VM-ENTRY CHECKS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-2
CHECKS ON VMX CONTROLS AND HOST-STATE AREA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-2
@@ -254557,13 +254475,13 @@ Details of Vectored-Event Injection. . . . . . . . . . . . . . . . . . . . . . .
VM Exits During Event Injection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-20
Event Injection for VM Entries to Real-Address Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-20
Injection of Pending MTF VM Exits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-21
-SPECIAL FEATURES OF VM ENTRY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-21
Vol. 3A xxiii
CONTENTS
PAGE
+26.6
26.6.1
26.6.2
26.6.3
@@ -254576,6 +254494,7 @@ PAGE
26.7
26.8
+SPECIAL FEATURES OF VM ENTRY. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26-21
Interruptibility State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .26-21
Activity State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .26-22
Delivery of Pending Debug Exceptions after VM Entry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .26-22
@@ -254621,20 +254540,20 @@ Information for VM Exits During Event Delivery . . . . . . . . . . . . . . . . .
Information for VM Exits Due to Instruction Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-13
SAVING GUEST STATE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-21
Saving Control Registers, Debug Registers, and MSRs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-21
-Saving Segment Registers and Descriptor-Table Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-21
+Saving Segment Registers and Descriptor-Table Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-22
Saving RIP, RSP, and RFLAGS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-22
-Saving Non-Register State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-23
+Saving Non-Register State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-24
SAVING MSRS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-25
-LOADING HOST STATE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-25
+LOADING HOST STATE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-26
Loading Host Control Registers, Debug Registers, MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-26
Loading Host Segment and Descriptor-Table Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-27
-Loading Host RIP, RSP, and RFLAGS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-28
-Checking and Loading Host Page-Directory-Pointer-Table Entries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-28
-Updating Non-Register State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-28
+Loading Host RIP, RSP, and RFLAGS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-29
+Checking and Loading Host Page-Directory-Pointer-Table Entries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-29
+Updating Non-Register State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-29
Clearing Address-Range Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .27-29
-LOADING MSRS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-29
-VMX ABORTS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-29
-MACHINE-CHECK EVENTS DURING VM EXIT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-30
+LOADING MSRS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-30
+VMX ABORTS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-30
+MACHINE-CHECK EVENTS DURING VM EXIT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-31
CHAPTER 28
VMX SUPPORT FOR ADDRESS TRANSLATION
@@ -254948,15 +254867,16 @@ HANDLING ACTIVITY STATES BY VMM . . . . . . . . . . . . . . . . . . . . . . . .
CHAPTER 34
SYSTEM MANAGEMENT MODE
-
34.1
-SYSTEM MANAGEMENT MODE OVERVIEW. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-1
34.1.1
-System Management Mode and VMX Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-1
34.2
-SYSTEM MANAGEMENT INTERRUPT (SMI). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-2
34.3
-SWITCHING BETWEEN SMM AND THE OTHER PROCESSOR OPERATING MODES. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-2
+
+SYSTEM MANAGEMENT MODE OVERVIEW. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-1
+System Management Mode and VMX Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-1
+SYSTEM MANAGEMENT INTERRUPT (SMI). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-2
+SWITCHING BETWEEN SMM AND THE OTHER
+PROCESSOR OPERATING MODES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-2
34.3.1
Entering SMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-2
34.3.2
@@ -254980,7 +254900,8 @@ SMI Handler Operating Mode Switching . . . . . . . . . . . . . . . . . . . . . .
34.6
EXCEPTIONS AND INTERRUPTS WITHIN SMM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-10
34.7
-MANAGING SYNCHRONOUS AND ASYNCHRONOUS SYSTEM MANAGEMENT INTERRUPTS . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-11
+MANAGING SYNCHRONOUS AND ASYNCHRONOUS
+SYSTEM MANAGEMENT INTERRUPTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-11
34.7.1
I/O State Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-12
34.8
@@ -255035,15 +254956,15 @@ Checks on the Guest State Area . . . . . . . . . . . . . . . . . . . . . . . . .
Loading Guest State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-23
34.15.4.6
VMX-Preemption Timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-23
-34.15.4.7
-Updating the Current-VMCS and SMM-Transfer VMCS Pointers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-23
-34.15.4.8
-VM Exits Induced by VM Entry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34-24
Vol. 3A xxvii
CONTENTS
PAGE
+34.15.4.7
+Updating the Current-VMCS and SMM-Transfer VMCS Pointers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .34-23
+34.15.4.8
+VM Exits Induced by VM Entry. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .34-24
34.15.4.9
SMI Blocking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .34-24
34.15.4.10
@@ -255152,15 +255073,15 @@ IA32_RTIT_STATUS MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
IA32_RTIT_ADDRn_A and IA32_RTIT_ADDRn_B MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-21
35.2.7.6
IA32_RTIT_CR3_MATCH MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-21
-35.2.7.7
-IA32_RTIT_OUTPUT_BASE MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-21
-35.2.7.8
-IA32_RTIT_OUTPUT_MASK_PTRS MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-22
xxviii Vol. 3A
CONTENTS
PAGE
+35.2.7.7
+IA32_RTIT_OUTPUT_BASE MSR. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-21
+35.2.7.8
+IA32_RTIT_OUTPUT_MASK_PTRS MSR. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-22
35.2.8
Interaction of Intel® Processor Trace and Other Processor Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-23
35.2.8.1
@@ -255280,52 +255201,53 @@ MWAIT Packet. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Power Entry (PWRE) Packet. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-60
35.4.2.25
Power Exit (PWRX) Packet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-61
-35.5
-TRACING IN VMX OPERATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-61
-35.5.1
-VMX-Specific Packets and VMCS Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-62
Vol. 3A xxix
CONTENTS
PAGE
+35.5
+35.5.1
35.5.2
-Managing Trace Packet Generation Across VMX Transitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-63
35.5.2.1
-System-Wide Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-63
35.5.2.2
-Host-Only Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-64
35.5.2.3
-Guest-Only Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-64
35.5.2.4
-Virtualization of Guest Output Packet Streams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-64
35.5.2.5
-Emulation of Intel PT Traced State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-64
35.5.2.6
-TSC Scaling. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-65
35.5.2.7
-Failed VM Entry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-65
35.5.2.8
-VMX Abort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-65
35.6
-TRACING AND SMM TRANSFER MONITOR (STM) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-65
35.7
-PACKET GENERATION SCENARIOS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-65
35.8
-SOFTWARE CONSIDERATIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-77
35.8.1
-Tracing SMM Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-77
35.8.2
-Cooperative Transition of Multiple Trace Collection Agents. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-78
35.8.3
-Tracking Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-78
35.8.3.1
-Time Domain Relationships. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-78
35.8.3.2
-Estimating TSC within Intel PT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-79
35.8.3.3
-VMX TSC Manipulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-79
35.8.3.4
+
+TRACING IN VMX OPERATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-61
+VMX-Specific Packets and VMCS Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-62
+Managing Trace Packet Generation Across VMX Transitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-63
+System-Wide Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-63
+Host-Only Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-64
+Guest-Only Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-64
+Virtualization of Guest Output Packet Streams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-64
+Emulation of Intel PT Traced State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-64
+TSC Scaling. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-65
+Failed VM Entry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-65
+VMX Abort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-65
+TRACING AND SMM TRANSFER MONITOR (STM) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-65
+PACKET GENERATION SCENARIOS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-65
+SOFTWARE CONSIDERATIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-78
+Tracing SMM Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-78
+Cooperative Transition of Multiple Trace Collection Agents. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-78
+Tracking Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-78
+Time Domain Relationships. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-79
+Estimating TSC within Intel PT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-79
+VMX TSC Manipulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-80
Calculating Frequency with Intel PT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-80
CHAPTER 36
@@ -255379,8 +255301,6 @@ ENCLAVE ACCESS CONTROL AND DATA STRUCTURES
37.9.1.2
37.9.2
37.9.2.1
-37.9.2.2
-37.10
OVERVIEW OF ENCLAVE EXECUTION ENVIRONMENT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-1
TERMINOLOGY. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-1
@@ -255407,14 +255327,14 @@ EXITINFO. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
VECTOR Field Definition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-9
MISC Region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-9
EXINFO Structure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .38-10
-Page Fault Error Codes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .38-10
-PAGE INFORMATION (PAGEINFO) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-11
xxx Vol. 3A
CONTENTS
PAGE
+37.9.2.2
+37.10
37.11
37.11.1
37.11.2
@@ -255433,6 +255353,8 @@ PAGE
37.20.1
37.20.2
+Page Fault Error Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-10
+PAGE INFORMATION (PAGEINFO) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-11
SECURITY INFORMATION (SECINFO) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-11
SECINFO.FLAGS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-11
PAGE_TYPE Field Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-12
@@ -255495,8 +255417,6 @@ ENCLAVE OPERATION
38.5.12
38.6
38.6.1
-38.6.2
-38.6.3
CONSTRUCTING AN ENCLAVE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39-1
ECREATE. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39-2
@@ -255540,16 +255460,18 @@ Extending the EPCM Permissions of a Page . . . . . . . . . . . . . . . . . . . .
VMM Oversubscription of EPC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39-13
CHANGES TO INSTRUCTION BEHAVIOR INSIDE AN ENCLAVE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39-14
Illegal Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39-14
-RDRAND and RDSEED Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39-15
-PAUSE Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39-15
Vol. 3A xxxi
CONTENTS
PAGE
+38.6.2
+38.6.3
38.6.4
38.6.5
+RDRAND and RDSEED Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .39-15
+PAUSE Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .39-15
Executions of INT1 and INT3 Inside an Enclave . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .39-15
INVD Handling when Enclaves Are Enabled . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .39-15
@@ -255630,13 +255552,13 @@ ERESUME—Re-Enters an Enclave . . . . . . . . . . . . . . . . . . . . . . . . .
INTEL® SGX VIRTUALIZATION LEAF FUNCTION REFERENCE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-128
EDECVIRTCHILD—Decrement VIRTCHILDCNT in SECS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-129
EINCVIRTCHILD—Increment VIRTCHILDCNT in SECS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-133
-ESETCONTEXT—Set the ENCLAVECONTEXT Field in SECS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-137
-
xxxii Vol. 3A
CONTENTS
PAGE
+ESETCONTEXT—Set the ENCLAVECONTEXT Field in SECS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .41-136
+
CHAPTER 41
INTEL® SGX INTERACTIONS WITH IA32 AND INTEL® 64 ARCHITECTURE
41.1
@@ -255698,9 +255620,6 @@ INTEL® SGX INTERACTIONS WITH IA32 AND INTEL® 64 ARCHITECTURE
41.15
41.15.1
41.15.2
-41.15.3
-41.16
-41.17
INTEL® SGX AVAILABILITY IN VARIOUS PROCESSOR MODES. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42-1
IA32_FEATURE_CONTROL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42-1
@@ -255761,14 +255680,17 @@ INTEL® SGX INTERACTIONS WITH S STATES. . . . . . . . . . . . . . . . . . . . .
INTEL® SGX INTERACTIONS WITH MACHINE CHECK ARCHITECTURE (MCA) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42-14
Interactions with MCA Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42-14
Machine Check Enables (IA32_MCi_CTL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42-14
-CR4.MCE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42-14
-INTEL® SGX INTERACTIONS WITH PROTECTED MODE VIRTUAL INTERRUPTS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42-15
-INTEL SGX INTERACTION WITH PROTECTION KEYS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42-15
+
Vol. 3A xxxiii
CONTENTS
PAGE
+41.15.3
+CR4.MCE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .42-14
+41.16 INTEL® SGX INTERACTIONS WITH PROTECTED MODE VIRTUAL INTERRUPTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42-15
+41.17 INTEL SGX INTERACTION WITH PROTECTION KEYS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42-15
+
CHAPTER 42
ENCLAVE CODE DEBUG AND PROFILING
42.1
@@ -255875,6 +255797,11 @@ VMCS ENUMERATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
VPID AND EPT CAPABILITIES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-7
VM FUNCTIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-8
+xxxiv Vol. 3A
+
+ CONTENTS
+PAGE
+
APPENDIX B
FIELD ENCODING IN VMCS
B.1
@@ -255882,24 +255809,6 @@ B.1.1
B.1.2
B.1.3
B.2
-
-16-BIT FIELDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
-16-Bit Control Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
-16-Bit Guest-State Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
-16-Bit Host-State Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
-64-BIT FIELDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
-
-xxxiv Vol. 3A
-
-B-1
-B-1
-B-1
-B-2
-B-2
-
- CONTENTS
-PAGE
-
B.2.1
B.2.2
B.2.3
@@ -255915,6 +255824,11 @@ B.4.2
B.4.3
B.4.4
+16-BIT FIELDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-1
+16-Bit Control Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-1
+16-Bit Guest-State Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-1
+16-Bit Host-State Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-2
+64-BIT FIELDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-2
64-Bit Control Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-2
64-Bit Read-Only Data Field. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-4
64-Bit Guest-State Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-4
@@ -256149,10 +256063,10 @@ Example of Write Ordering in Multiple-Processor Systems . . . . . . . . . . . .
Interpretation of APIC ID in Early MP Systems. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-24
Local APICs and I/O APIC in MP System Supporting Intel HT Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-27
IA-32 Processor with Two Logical Processors Supporting Intel HT Technology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-28
-Generalized Four level Interpretation of the APIC ID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-35
-Conceptual Six-Level Topology and 32-bit APIC ID Composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-35
-Topological Relationships between Hierarchical IDs in a Hypothetical MP Platform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-37
-MP System With Multiple Pentium III Processors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-55
+Generalized Seven Level Interpretation of the APIC ID. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-35
+Conceptual Six-Level Topology and 32-bit APIC ID Composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-36
+Topological Relationships between Hierarchical IDs in a Hypothetical MP Platform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-39
+MP System With Multiple Pentium III Processors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-57
Contents of CR0 Register after Reset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-2
Version Information in the EDX Register after Reset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-5
Processor State After Reset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-15
@@ -256336,7 +256250,7 @@ Debug Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
DR6/DR7 Layout on Processors Supporting Intel® 64 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-7
IA32_DEBUGCTL MSR for Processors based on Intel Core microarchitecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-12
64-bit Address Layout of LBR MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-17
-DS Save Area Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-20
+DS Save Area Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-19
CONTENTS
PAGE
@@ -256407,41 +256321,41 @@ Figure 18-28.
Figure 18-29.
Figure 18-30.
-32-bit Branch Trace Record Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-21
-PEBS Record Format. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-21
-IA-32e Mode DS Save Area Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-22
-64-bit Branch Trace Record Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-22
-64-bit PEBS Record Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-23
-IA32_DEBUGCTL MSR for Processors based on Intel microarchitecture code name Nehalem . . . . . . . . . . . . . . . . . . . 17-29
-MSR_DEBUGCTLA MSR for Pentium 4 and Intel Xeon Processors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-35
-LBR MSR Branch Record Layout for the Pentium 4 and Intel Xeon Processor Family . . . . . . . . . . . . . . . . . . . . . . . . . . 17-37
-IA32_DEBUGCTL MSR for Intel Core Solo and Intel Core Duo Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-38
-LBR Branch Record Layout for the Intel Core Solo and Intel Core Duo Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-38
-MSR_DEBUGCTLB MSR for Pentium M Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-39
-LBR Branch Record Layout for the Pentium M Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-40
-DEBUGCTLMSR Register (P6 Family Processors). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-41
-Platform Shared Resource Monitoring Usage Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-46
-CPUID.(EAX=0FH, ECX=0H) Monitoring Resource Type Enumeration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-47
-L3 Cache Monitoring Capability Enumeration Data (CPUID.(EAX=0FH, ECX=1H) ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-47
-L3 Cache Monitoring Capability Enumeration Event Type Bit Vector (CPUID.(EAX=0FH, ECX=1H) ) . . . . . . . . . . . . . 17-48
-IA32_PQR_ASSOC MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-49
-IA32_QM_EVTSEL and IA32_QM_CTR MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-50
-Software Usage of Cache Monitoring Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-50
-Cache Allocation Technology Enables Allocation of More Resources to High Priority Applications. . . . . . . . . . . . . . . 17-52
-Examples of Cache Capacity Bitmasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-53
-Class of Service and Cache Capacity Bitmasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-54
-Code and Data Capacity Bitmasks of CDP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-56
-Cache Allocation Technology Usage Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-57
-CPUID.(EAX=10H, ECX=0H) Available Resource Type Identification. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-58
-L3 Cache Allocation Technology and CDP Enumeration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-58
-L2 Cache Allocation Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-59
-IA32_PQR_ASSOC, IA32_L3_MASK_n MSRs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-60
-IA32_L2_MASK_n MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-60
-Layout of IA32_L3_QOS_CFG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-61
-Layout of IA32_L2_QOS_CFG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-62
-A High-Level Overview of the MBA Feature. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-65
-CPUID.(EAX=10H, ECX=3H) MBA Feature Details Identification. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-67
-IA32_L2_QoS_Ext_BW_Thrtl_n MSR Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-68
+32-bit Branch Trace Record Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-20
+PEBS Record Format. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-20
+IA-32e Mode DS Save Area Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-21
+64-bit Branch Trace Record Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-21
+64-bit PEBS Record Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-22
+IA32_DEBUGCTL MSR for Processors based on Intel microarchitecture code name Nehalem . . . . . . . . . . . . . . . . . . . 17-28
+MSR_DEBUGCTLA MSR for Pentium 4 and Intel Xeon Processors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-34
+LBR MSR Branch Record Layout for the Pentium 4 and Intel Xeon Processor Family . . . . . . . . . . . . . . . . . . . . . . . . . . 17-36
+IA32_DEBUGCTL MSR for Intel Core Solo and Intel Core Duo Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-37
+LBR Branch Record Layout for the Intel Core Solo and Intel Core Duo Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-37
+MSR_DEBUGCTLB MSR for Pentium M Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-38
+LBR Branch Record Layout for the Pentium M Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-39
+DEBUGCTLMSR Register (P6 Family Processors). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-40
+Platform Shared Resource Monitoring Usage Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-45
+CPUID.(EAX=0FH, ECX=0H) Monitoring Resource Type Enumeration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-46
+L3 Cache Monitoring Capability Enumeration Data (CPUID.(EAX=0FH, ECX=1H) ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-46
+L3 Cache Monitoring Capability Enumeration Event Type Bit Vector (CPUID.(EAX=0FH, ECX=1H) ) . . . . . . . . . . . . . 17-47
+IA32_PQR_ASSOC MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-48
+IA32_QM_EVTSEL and IA32_QM_CTR MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-49
+Software Usage of Cache Monitoring Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-49
+Cache Allocation Technology Enables Allocation of More Resources to High Priority Applications. . . . . . . . . . . . . . . 17-51
+Examples of Cache Capacity Bitmasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-52
+Class of Service and Cache Capacity Bitmasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-53
+Code and Data Capacity Bitmasks of CDP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-55
+Cache Allocation Technology Usage Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-56
+CPUID.(EAX=10H, ECX=0H) Available Resource Type Identification. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-57
+L3 Cache Allocation Technology and CDP Enumeration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-57
+L2 Cache Allocation Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-58
+IA32_PQR_ASSOC, IA32_L3_MASK_n MSRs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-59
+IA32_L2_MASK_n MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-59
+Layout of IA32_L3_QOS_CFG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-60
+Layout of IA32_L2_QOS_CFG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-61
+A High-Level Overview of the MBA Feature. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-64
+CPUID.(EAX=10H, ECX=3H) MBA Feature Details Identification. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-66
+IA32_L2_QoS_Ext_BW_Thrtl_n MSR Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-67
Layout of IA32_PERFEVTSELx MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-4
Layout of IA32_FIXED_CTR_CTRL MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-7
Layout of IA32_PERF_GLOBAL_CTRL MSR. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-8
@@ -256552,18 +256466,18 @@ Request_Type Fields for MSR_OFFCORE_RSPx. . . . . . . . . . . . . . . . . . . .
Response_Supplier and Snoop Info Fields for MSR_OFFCORE_RSPx . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-74
Layout of IA32_PEBS_ENABLE MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-82
PEBS Programming Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-83
-Layout of MSR_PERF_FIXED_CTR_CTRL MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-87
+Layout of MSR_PERF_FIXED_CTR_CTRL MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-86
Layout of MSR_PERF_GLOBAL_CTRL MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-87
-Layout of MSR_PERF_GLOBAL_STATUS MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-88
+Layout of MSR_PERF_GLOBAL_STATUS MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-87
Layout of MSR_PERF_GLOBAL_OVF_CTRL MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-88
Event Selection Control Register (ESCR) for Pentium 4 and Intel Xeon Processors without Intel HT Technology
-Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-95
-Performance Counter (Pentium 4 and Intel Xeon Processors) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-96
+Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-94
+Performance Counter (Pentium 4 and Intel Xeon Processors) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-95
Counter Configuration Control Register (CCCR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-97
-Effects of Edge Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18-101
+Effects of Edge Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18-100
Event Selection Control Register (ESCR) for the Pentium 4 Processor, Intel Xeon Processor and Intel Xeon
Processor MP Supporting Hyper-Threading Technology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18-108
-Counter Configuration Control Register (CCCR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18-110
+Counter Configuration Control Register (CCCR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18-109
Block Diagram of 64-bit Intel Xeon Processor MP with 8-MByte L3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18-113
MSR_IFSB_IBUSQx, Addresses: 107CCH and 107CDH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18-114
MSR_IFSB_ISNPQx, Addresses: 107CEH and 107CFH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18-114
@@ -256739,12 +256653,12 @@ Exception Conditions Checked During a Task Switch . . . . . . . . . . . . . . .
Effect of a Task Switch on Busy Flag, NT Flag, Previous Task Link Field, and TS Flag . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13
Broadcast INIT-SIPI-SIPI Sequence and Choice of Timeouts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-22
Initial APIC IDs for the Logical Processors in a System that has Four Intel Xeon MP Processors Supporting Intel
-Hyper-Threading Technology1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-37
+Hyper-Threading Technology1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-39
Initial APIC IDs for the Logical Processors in a System that has Two Physical Processors Supporting Dual-Core
-and Intel Hyper-Threading Technology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-37
+and Intel Hyper-Threading Technology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-39
Example of Possible x2APIC ID Assignment in a System that has Two Physical Processors Supporting x2APIC
-and Intel Hyper-Threading Technology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-38
-Boot Phase IPI Message Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-54
+and Intel Hyper-Threading Technology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-40
+Boot Phase IPI Message Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-56
IA-32 and Intel 64 Processor States Following Power-up, Reset, or INIT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-2
Variance of RESET Values in Selected Intel Architecture Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-4
Recommended Settings of EM and MP Flags on IA-32 Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-6
@@ -257004,23 +256918,23 @@ Decoding Family 0FH Machine Check Codes for Cache Hierarchy Errors . . . . . . .
Breakpoint Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-6
Debug Exception Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17-8
Legacy and Streamlined Operation with Freeze_Perfmon_On_PMI = 1, Counter Overflowed . . . . . . . . . . . . . . . . . .17-15
-LBR Stack Size and TOS Pointer Range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-15
-IA32_DEBUGCTL Flag Encodings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-24
-CPL-Qualified Branch Trace Store Encodings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-25
-MSR_LASTBRANCH_x_TO_IP for the Goldmont Microarchitecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-27
-MSR_LASTBRANCH_x_FROM_IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-30
-MSR_LASTBRANCH_x_TO_IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-30
-LBR Stack Size and TOS Pointer Range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-30
-MSR_LBR_SELECT for Intel microarchitecture code name Nehalem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-30
-MSR_LBR_SELECT for Intel® microarchitecture code name Sandy Bridge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-31
-MSR_LBR_SELECT for Intel® microarchitecture code name Haswell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-31
-MSR_LASTBRANCH_x_FROM_IP with TSX Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-32
-LBR Stack Size and TOS Pointer Range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-33
-MSR_LBR_INFO_x. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-33
-LBR MSR Stack Size and TOS Pointer Range for the Pentium® 4 and the Intel® Xeon® Processor Family . . . . . . . . .17-36
-Monitoring Supported Event IDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-48
-Re-indexing of COS Numbers and Mapping to CAT/CDP Mask MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-62
-MBA Delay Value MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-68
+LBR Stack Size and TOS Pointer Range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-16
+IA32_DEBUGCTL Flag Encodings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-23
+CPL-Qualified Branch Trace Store Encodings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-24
+MSR_LASTBRANCH_x_TO_IP for the Goldmont Microarchitecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-26
+MSR_LASTBRANCH_x_FROM_IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-29
+MSR_LASTBRANCH_x_TO_IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-29
+LBR Stack Size and TOS Pointer Range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-29
+MSR_LBR_SELECT for Intel microarchitecture code name Nehalem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-29
+MSR_LBR_SELECT for Intel® microarchitecture code name Sandy Bridge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-30
+MSR_LBR_SELECT for Intel® microarchitecture code name Haswell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-30
+MSR_LASTBRANCH_x_FROM_IP with TSX Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-31
+LBR Stack Size and TOS Pointer Range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-32
+MSR_LBR_INFO_x. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-32
+LBR MSR Stack Size and TOS Pointer Range for the Pentium® 4 and the Intel® Xeon® Processor Family . . . . . . . . .17-35
+Monitoring Supported Event IDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-47
+Re-indexing of COS Numbers and Mapping to CAT/CDP Mask MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-61
+MBA Delay Value MSRs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17-67
UMask and Event Select Encodings for Pre-Defined Architectural Performance Events . . . . . . . . . . . . . . . . . . . . . . . . . 18-5
Association of Fixed-Function Performance Counters with Architectural Performance Events . . . . . . . . . . . . . . . . . . 18-9
PEBS Record Format for Intel Core i7 Processor Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18-19
@@ -257153,12 +257067,12 @@ MSR_OFFCORE_RSPx For L2 Miss and Outstanding Requests . . . . . . . . . . . . .
Core PMU Comparison Between the Goldmont Plus and Goldmont Microarchitectures . . . . . . . . . . . . . . . . . . . . . . . . . 18-81
Core Specificity Encoding within a Non-Architectural Umask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-84
Agent Specificity Encoding within a Non-Architectural Umask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-84
-HW Prefetch Qualification Encoding within a Non-Architectural Umask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-85
+HW Prefetch Qualification Encoding within a Non-Architectural Umask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-84
MESI Qualification Definitions within a Non-Architectural Umask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-85
Bus Snoop Qualification Definitions within a Non-Architectural Umask. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-85
-Snoop Type Qualification Definitions within a Non-Architectural Umask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-86
-At-Retirement Performance Events for Intel Core Microarchitecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-89
-PEBS Performance Events for Intel Core Microarchitecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-89
+Snoop Type Qualification Definitions within a Non-Architectural Umask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-85
+At-Retirement Performance Events for Intel Core Microarchitecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-88
+PEBS Performance Events for Intel Core Microarchitecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-88
Requirements to Program PEBS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-90
Vol. 3A xlv
@@ -257217,13 +257131,14 @@ Table 20-2.
Table 21-1.
Table 22-1.
Table 22-3.
+Table 22-2.
xlvi Vol. 3A
Performance Counter MSRs and Associated CCCR and ESCR MSRs (Processors Based on Intel NetBurst
-Microarchitecture) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18-92
-Event Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18-99
-CCR Names and Bit Positions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-103
-Effect of Logical Processor and CPL Qualification for Logical-Processor-Specific (TS) Events . . . . . . . . . . . . . . . . . 18-111
+Microarchitecture) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18-91
+Event Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18-98
+CCR Names and Bit Positions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-102
+Effect of Logical Processor and CPL Qualification for Logical-Processor-Specific (TS) Events . . . . . . . . . . . . . . . . . 18-110
Effect of Logical Processor and CPL Qualification for Non-logical-Processor-specific (TI) Events . . . . . . . . . . . . . . 18-112
Nominal Core Crystal Clock Frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18-128
Architectural Performance Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-2
@@ -257248,8 +257163,8 @@ Intel® Core™ i3-2xxx Processor Series and Intel® Xeon® Processors E3 and E5
Performance Events applicable only to the Processor core for 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™
i5-2xxx, Intel® Core™ i3-2xxx Processor Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .19-77
Performance Events Applicable only to the Processor Core of Intel® Xeon® Processor E5 Family. . . . . . . . . . . . . . . .19-79
-Performance Events In the Processor Uncore for 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx,
-Intel® Core™ i3-2xxx Processor Series. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .19-81
+Performance Events In the Processor Uncore for 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel®
+Core™ i3-2xxx Processor Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .19-81
Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500
Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .19-82
Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500
@@ -257266,8 +257181,7 @@ Performance Events for the Goldmont Microarchitecture . . . . . . . . . . . . .
Performance Events for Silvermont Microarchitecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-177
Performance Events for 45 nm, 32 nm Intel® Atom™ Processors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-181
Performance Events in Intel® Core™ Solo and Intel® Core™ Duo Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-195
-Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement
-Counting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-200
+Performance Monitoring Events Supported by Intel NetBurst® Microarchitecture for Non-Retirement Counting19-200
Performance Monitoring Events For Intel NetBurst® Microarchitecture for At-Retirement Counting . . . . . . . . . . 19-218
Intel NetBurst® Microarchitecture Model-Specific Performance Monitoring Events (For Model Encoding 3, 4
or 6) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19-222
@@ -257284,11 +257198,11 @@ Software Interrupt Handling Methods While in Virtual-8086 Mode . . . . . . . . .
Characteristics of 16-Bit and 32-Bit Program Modules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21-1
New Instruction in the Pentium Processor and Later IA-32 Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-4
EM and MP Flag Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .22-17
+Recommended Values of the EM, MP, and NE Flags for Intel486 SX Microprocessor/Intel 487 SX Math
CONTENTS
PAGE
-Table 22-2.
Table 22-4.
Table 22-5.
Table 22-6.
@@ -257351,8 +257265,8 @@ Table 35-1.
Table 35-2.
Table 35-3.
Table 35-4.
+Table 35-5.
-Recommended Values of the EM, MP, and NE Flags for Intel486 SX Microprocessor/Intel 487 SX Math
Coprocessor System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22-17
Exception Conditions for Legacy SIMD/MMX Instructions with FP Exception and 16-Byte Alignment . . . . . . . . . . . 22-22
Exception Conditions for Legacy SIMD/MMX Instructions with XMM and FP Exception . . . . . . . . . . . . . . . . . . . . . . . . . 22-23
@@ -257366,26 +257280,26 @@ Format of Access Rights . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Format of Interruptibility State. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-6
Format of Pending-Debug-Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-7
Definitions of Pin-Based VM-Execution Controls. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-9
-Definitions of Primary Processor-Based VM-Execution Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-9
+Definitions of Primary Processor-Based VM-Execution Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-10
Definitions of Secondary Processor-Based VM-Execution Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-11
Format of Extended-Page-Table Pointer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-15
Definitions of VM-Function Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-16
Definitions of VM-Exit Controls. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-17
-Format of an MSR Entry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-18
+Format of an MSR Entry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-19
Definitions of VM-Entry Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-19
-Format of the VM-Entry Interruption-Information Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-20
-Format of Exit Reason . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-21
-Format of the VM-Exit Interruption-Information Field. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-22
+Format of the VM-Entry Interruption-Information Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-21
+Format of Exit Reason . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-22
+Format of the VM-Exit Interruption-Information Field. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-23
Format of the IDT-Vectoring Information Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-23
-Structure of VMCS Component Encoding. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-25
+Structure of VMCS Component Encoding. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24-26
Format of the Virtualization-Exception Information Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25-16
Exit Qualification for Debug Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-4
Exit Qualification for Task Switch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-5
Exit Qualification for Control-Register Accesses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-6
Exit Qualification for MOV DR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-7
Exit Qualification for I/O Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-8
-Exit Qualification for APIC-Access VM Exits from Linear Accesses and Guest-Physical Accesses. . . . . . . . . . . . . . . . . 27-8
-Exit Qualification for EPT Violations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-9
+Exit Qualification for APIC-Access VM Exits from Linear Accesses and Guest-Physical Accesses. . . . . . . . . . . . . . . . . 27-9
+Exit Qualification for EPT Violations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-10
Format of the VM-Exit Instruction-Information Field as Used for INS and OUTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27-14
Format of the VM-Exit Instruction-Information Field as Used for INVEPT, INVPCID, and INVVPID . . . . . . . . . . . . . . . 27-15
Format of the VM-Exit Instruction-Information Field as Used for LIDT, LGDT, SIDT, or SGDT. . . . . . . . . . . . . . . . . . . . 27-16
@@ -257417,12 +257331,12 @@ COFI Type for Branch Instructions . . . . . . . . . . . . . . . . . . . . . . .
IP Filtering Packet Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-6
ToPA Table Entry Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-11
Algorithm to Manage Intel PT ToPA PMI and XSAVES/XRSTORS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-14
+Behavior on Restricted Memory Access. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35-15
Vol. 3A xlvii
CONTENTS
PAGE
-Table 35-5.
Table 35-6.
Table 35-7.
Table 35-9.
@@ -257487,9 +257401,9 @@ Table 37-7.
Table 37-8.
Table 37-9.
Table 37-10.
+Table 37-11.
xlviii Vol. 3A
-Behavior on Restricted Memory Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-15
IA32_RTIT_CTL MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-16
IA32_RTIT_STATUS MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-20
IA32_RTIT_OUTPUT_MASK_PTRS MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35-22
@@ -257554,11 +257468,11 @@ Top-to-Bottom Layout of an SSA Frame . . . . . . . . . . . . . . . . . . . . . .
Layout of GPRSGX Portion of the State Save Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-8
Layout of EXITINFO Field. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-9
Exception Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-9
+Layout of MISC region of the State Save Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .38-10
CONTENTS
PAGE
-Table 37-11.
Table 37-12.
Table 37-13.
Table 37-14.
@@ -257623,10 +257537,10 @@ Table 40-40.
Table 40-41.
Table 40-42.
Table 40-43.
+Table 40-44.
-Layout of MISC region of the State Save Area. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-10
Layout of EXINFO Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-10
-Page Fault Error Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-10
+Page Fault Error Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-10
Layout of PAGEINFO Data Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-11
Layout of SECINFO Data Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-11
Layout of SECINFO.FLAGS Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38-11
@@ -257689,12 +257603,12 @@ Base Concurrency Restrictions of ERDINFO . . . . . . . . . . . . . . . . . . . .
Additional Concurrency Restrictions of ERDINFO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-65
EREMOVE Return Value in RAX. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-68
Base Concurrency Restrictions of EREMOVE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-69
+Additional Concurrency Restrictions of EREMOVE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-69
Vol. 3A xlix
CONTENTS
PAGE
-Table 40-44.
Table 40-45.
Table 40-46.
Table 40-47.
@@ -257753,7 +257667,6 @@ Table C-1.
l Vol. 3A
-Additional Concurrency Restrictions of EREMOVE. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .41-69
ETRACK Return Value in RAX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .41-72
Base Concurrency Restrictions of ETRACK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .41-72
Additional Concurrency Restrictions of ETRACK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .41-72
@@ -257787,8 +257700,8 @@ Base Concurrency Restrictions of EDECVIRTCHILD. . . . . . . . . . . . . . . . .
Additional Concurrency Restrictions of EDECVIRTCHILD. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-130
Base Concurrency Restrictions of EINCVIRTCHILD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-133
Additional Concurrency Restrictions of EINCVIRTCHILD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-134
-Base Concurrency Restrictions of ESETCONTEXT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-137
-Additional Concurrency Restrictions of ESETCONTEXT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-138
+Base Concurrency Restrictions of ESETCONTEXT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-136
+Additional Concurrency Restrictions of ESETCONTEXT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41-137
SGX Conflict Exit Qualification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42-4
SMRAM Synthetic States on Asynchronous Enclave Exit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .42-11
Layout of the IA32_SGX_SVN_STATUS MSR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .42-13
@@ -263960,14 +263873,17 @@ D T vd
Ignored
X
D
-4
+
+Reserved
X
D
Reserved
-Reserved
+4
+
+4
Address of
2MB page frame
@@ -263978,6 +263894,8 @@ Reserved
Address of page table
+4
+
Reserved
Address of 4KB page frame
@@ -264874,6 +264792,7 @@ page
Ignored
X
D
+3
Prot.
Key4
@@ -264885,6 +264804,8 @@ Ignored
Ignored
+3
+
Rsvd.
Address of
@@ -264900,14 +264821,16 @@ Ign.
I PPUR
PDPTE:
+0 g A C W /S
+/ 1
page
-0 g A C W /S / 1
n DT W
directory
Ignored
X
D
+3
Prot.
Key4
@@ -264919,6 +264842,8 @@ Ignored
Ignored
+3
+
Rsvd.
Rsvd.
@@ -264932,6 +264857,8 @@ Reserved
Address of page table
+3
+
Prot.
Key4
@@ -264957,7 +264884,8 @@ PDE:
page
I PPUR
-0 g A C W /S / 1
+0 g A C W /S
+/ 1
n DT W
PDE:
@@ -270361,7 +270289,7 @@ a. The segment selector and stack pointer for the stack to be used by the handle
for the currently executing task. On this new stack, the processor pushes the stack segment selector and
stack pointer of the interrupted procedure.
b. The processor then saves the current state of the EFLAGS, CS, and EIP registers on the new stack (see
-Figures 6-4).
+Figure 6-4).
c.
@@ -270369,8 +270297,8 @@ c.
If an exception causes an error code to be saved, it is pushed on the new stack after the EIP value.
If the handler procedure is going to be executed at the same privilege level as the interrupted procedure:
-a. The processor saves the current state of the EFLAGS, CS, and EIP registers on the current stack (see
-Figures 6-4).
+a. The processor saves the current state of the EFLAGS, CS, and EIP registers on the current stack (see Figure
+6-4).
b. If an exception causes an error code to be saved, it is pushed on the current stack after the EIP value.
6-12 Vol. 3A
@@ -272372,7 +272300,7 @@ To enable alignment checking, the following conditions must be true:
AM flag in CR0 register is set.
AC flag in the EFLAGS register is set.
-The CPL is 3 (protected mode or virtual-8086 mode).
+The CPL is 3 (including virtual-8086 mode).
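The three conditions above can be expressed as a single predicate. The following is a minimal C sketch, not code from the manual: the function name is hypothetical, and the register values and CPL are passed in as plain integers instead of being read from the processor. It assumes the architectural bit positions CR0.AM = bit 18 and EFLAGS.AC = bit 18.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR0_AM    (1u << 18)  /* CR0.AM: alignment mask flag     */
#define EFLAGS_AC (1u << 18)  /* EFLAGS.AC: alignment check flag */

/* Hypothetical helper: true only when all three conditions hold. */
bool alignment_checking_enabled(uint32_t cr0, uint32_t eflags, unsigned cpl)
{
    return (cr0 & CR0_AM) != 0 &&
           (eflags & EFLAGS_AC) != 0 &&
           cpl == 3;
}
```

With both flags set, the predicate is true at CPL 3 and false at CPL 0.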
Alignment-check exceptions (#AC) are generated only when operating at privilege level 3 (user mode). Memory
references that default to privilege level 0, such as segment descriptor loads, do not generate alignment-check
@@ -275905,7 +275833,7 @@ systems.
-Read Initial APIC ID (If the process does not support CPUID leaf 0BH) — An APIC ID is assigned to a logical
+Read Initial APIC ID (If the processor does not support CPUID leaf 0BH) — An APIC ID is assigned to a logical
processor during power up. This is the initial APIC ID reported by CPUID.1:EBX[31:24] and may be different
from the current value read from the local APIC. The initial APIC ID can be used to determine the topological
relationship between logical processors for multi-processor systems that do not support CPUID leaf 0BH.
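As a small illustration of the field described above, the following C sketch extracts the 8-bit initial APIC ID from bits 31:24 of the EBX value returned by CPUID leaf 01H. The function name is hypothetical, and the EBX value is supplied by the caller; how CPUID is executed is outside this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: initial APIC ID = CPUID.1:EBX[31:24]. */
uint8_t initial_apic_id_from_ebx(uint32_t cpuid1_ebx)
{
    return (uint8_t)(cpuid1_ebx >> 24);
}
```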
@@ -276011,14 +275939,13 @@ CPUID instruction provides several sets of parameter information to aid software
Hardware Multi-Threading feature flag (CPUID.1:EDX[28] = 1) — Indicates when set that the physical
package is capable of supporting Intel Hyper-Threading Technology and/or multiple cores.
-•
-
-Processor topology enumeration parameters for 8-bit APIC ID:
-
8-24 Vol. 3A
MULTIPLE-PROCESSOR MANAGEMENT
+•
+
+Processor topology enumeration parameters for 8-bit APIC ID:
— Addressable IDs for Logical processors in the same Package (CPUID.1:EBX[23:16]) — Indicates
the maximum number of addressable IDs for logical processors in a physical package. Within a physical
package, there may be addressable IDs that are not occupied by any logical processors. This parameter
@@ -276660,60 +276587,86 @@ The APIC_ID value associated with each logical processor in a multi-processor sy
sub-fields, where each sub-field corresponds to a hierarchical level of the topological mapping of hardware resources.
The decomposition of an APIC_ID may consist of several sub fields representing the topology within a physical
processor package, the higher-order bits of an APIC ID may also be used by cluster vendors to represent the
-topology of cluster nodes of each coherent multiprocessor systems. If the processor does not support CPUID leaf
-0BH, the 8-bit initial APIC ID can represent 4 levels of hierarchy:
+topology of cluster nodes of each coherent multiprocessor systems:
Cluster — Some multi-threading environments consist of multiple clusters of multi-processor systems. The
-CLUSTER_ID sub-field is usually supported by vendor firmware to distinguish different clusters. For nonclustered systems, CLUSTER_ID is usually 0 and system topology is reduced to three levels of hierarchy.
+CLUSTER_ID sub-field is usually supported by vendor firmware to distinguish different clusters. For nonclustered systems, CLUSTER_ID is usually 0 and system topology is reduced.
-Package — A multi-processor system consists of two or more sockets, each mates with a physical processor
-package. The PACKAGE_ID sub-field distinguishes different physical packages within a cluster.
+Package — A physical processor package mates with a socket. A package may contain one or more software
+visible die. The PACKAGE_ID sub-field distinguishes different physical packages within a cluster.
-Core — A physical processor package consists of one or more processor cores. The CORE_ID sub-field distinguishes processor cores in a package. For a single-core processor, the width of this bit field is 0.
+Die — A software-visible chip inside a package. The DIE_ID sub-field distinguishes different die within a
+package. If there are no software visible die, the width of this bit field is 0.
-SMT — A processor core provides one or more logical processors sharing execution resources. The SMT_ID
-sub-field distinguishes logical processors in a core. The width of this bit field is non-zero if a processor core
-provides more than one logical processors.
+Tile — A set of cores, possibly within modules, that share certain resources. The TILE_ID sub-field distinguishes different tiles. If there are no software visible tiles, the width of this bit field is 0.
-SMT and CORE sub-fields are bit-wise contiguous in the APIC_ID field (see Figure 8-5).
+•
+
+Module — A set of cores that share certain resources. The MODULE_ID sub-field distinguishes different
+modules. If there are no software visible modules, the width of this bit field is 0.
8-34 Vol. 3A
MULTIPLE-PROCESSOR MANAGEMENT
-0
+•
+
+Core — Processor cores may be contained within modules, within tiles, on software-visible die, or appear
+directly at the package level. The CORE_ID sub-field distinguishes processor cores. For a single-core processor,
+the width of this bit field is 0.
+
+•
-X=31 if x2APIC is supported
+SMT — A processor core provides one or more logical processors sharing execution resources. The SMT_ID
+sub-field distinguishes logical processors in a core. The width of this bit field is non-zero if a processor core
+provides more than one logical processor.
+
+SMT and CORE sub-fields are bit-wise contiguous in the APIC_ID field (see Figure 8-5).
+X=31 if x2APIC supported, else X=7
X
-Otherwise X= 7
+0
Reserved
+
Cluster ID
Package ID
+Die ID
+Tile ID
+Module ID
Core ID
SMT ID
-Figure 8-5. Generalized Four level Interpretation of the APIC ID
-If the processor supports CPUID leaf 0BH, the 32-bit APIC ID can represent cluster plus several levels of topology
-within the physical processor package. The exact number of hierarchical levels within a physical processor package
-must be enumerated through CPUID leaf 0BH. Common processor families may employ topology similar to that
-represented by 8-bit Initial APIC ID. In general, CPUID leaf 0BH can support topology enumeration algorithm that
-decompose a 32-bit APIC ID into more than four sub-fields (see Figure 8-6).
+Figure 8-5. Generalized Seven Level Interpretation of the APIC ID
+If the processor supports CPUID leaf 0BH and leaf 1FH, the 32-bit APIC ID can represent cluster plus several levels
+of topology within the physical processor package. The exact number of hierarchical levels within a physical
+processor package must be enumerated through CPUID leaf 0BH and leaf 1FH. Common processor families may
+employ topology similar to that represented by 8-bit Initial APIC ID. In general, CPUID leaf 0BH and leaf 1FH can
+support a topology enumeration algorithm that decomposes a 32-bit APIC ID into more than four sub-fields (see
+Figure 8-6).
+
+NOTE
+CPUID leaf 0BH and leaf 1FH can have differences in the number of level types reported (CPUID leaf
+1FH defines additional level types). If the processor supports CPUID leaf 1FH, usage of this leaf is
+preferred over leaf 0BH. CPUID leaf 0BH is available for legacy compatibility going forward.
The width of each sub-field depends on hardware and software configurations. Field widths can be determined at
-runtime using the algorithm discussed below (Example 8-16 through Example 8-20).
+runtime using the algorithm discussed below (Example 8-16 through Example 8-21).
Figure 7-6 depicts the relationships of three of the hierarchical sub-fields in a hypothetical MP system. The value of
valid APIC_IDs need not be contiguous across package boundary or core boundaries.
+Vol. 3A 8-35
+
+ MULTIPLE-PROCESSOR MANAGEMENT
+
PACKAGE
Q
CORE
@@ -276744,28 +276697,25 @@ Figure 8-6. Conceptual Six-Level Topology and 32-bit APIC ID Composition
Hierarchical Mapping of CPUID Extended Topology Leaf
-CPUID leaf 0BH provides enumeration parameters for software to identify each hierarchy of the processor topology
-in a deterministic manner. Each hierarchical level of the topology starting from the SMT level is represented numerically by a sub-leaf index within the CPUID 0BH leaf. Each level of the topology is mapped to a sub-field in the APIC
-ID, following the general relationship depicted in Figure 8-6. This mechanism allows software to query the exact
-number of levels within a physical processor package and the bit-width of each sub-field of x2APIC ID directly. For
-example,
+CPUID leaf 0BH and leaf 1FH provide enumeration parameters for software to identify each hierarchy of the
+processor topology in a deterministic manner. Each hierarchical level of the topology starting from the SMT level is
+represented numerically by a sub-leaf index within the CPUID 0BH leaf and 1FH leaf. Each level of the topology is
+mapped to a sub-field in the APIC ID, following the general relationship depicted in Figure 8-6. This mechanism
+allows software to query the exact number of levels within a physical processor package and the bit-width of each
+sub-field of x2APIC ID directly. For example,
-Starting from sub-leaf index 0 and incrementing ECX until CPUID.(EAX=0BH, ECX=N):ECX[15:8] returns an
-invalid “level type” encoding. The number of levels within the physical processor package is “N” (excluding
-PACKAGE). Using Figure 8-6 as an example, CPUID.(EAX=0BH, ECX=3):ECX[15:8] will report 00H, indicating
-sub leaf 03H is invalid. This is also depicted by a pseudo code example:
-
-Vol. 3A 8-35
-
- MULTIPLE-PROCESSOR MANAGEMENT
+Starting from sub-leaf index 0 and incrementing ECX until CPUID.(EAX=0BH or 1FH, ECX=N):ECX[15:8]
+returns an invalid “level type” encoding. The number of levels within the physical processor package is “N”
+(excluding PACKAGE). Using Figure 8-6 as an example, CPUID.(EAX=0BH or 1FH, ECX=4):ECX[15:8] will
+report 00H, indicating sub leaf 04H is invalid. This is also depicted by a pseudo code example:
Example 8-16. Number of Levels Below the Physical Processor Package
Byte type = 1;
s = 0;
While ( type ) {
-EAX = 0BH; // query each sub leaf of CPUID leaf 0BH
+EAX = 0BH or 1FH; // query each sub leaf of CPUID leaf 0BH or 1FH; CPUID leaf 1FH is preferred over leaf 0BH if available
ECX = s;
CPUID;
type = ECX[15:8]; // examine level type encoding
@@ -276776,28 +276726,108 @@ N = ECX[7:0];
Sub-leaf index 0 (ECX= 0 as input) provides enumeration parameters to extract the SMT sub-field of x2APIC
-ID. If EAX = 0BH, and ECX =0 is specified as input when executing CPUID, CPUID.(EAX=0BH,
+ID. If EAX = 0BH or 1FH, and ECX =0 is specified as input when executing CPUID, CPUID.(EAX=0BH or 1FH,
ECX=0):EAX[4:0] reports a value (a right-shift count) that allows software to extract part of x2APIC ID to
distinguish the next higher topological entities above the SMT level. This value also corresponds to the bit-width
of the sub-field of x2APIC ID corresponding to the hierarchical level with sub-leaf index 0.
-For each subsequent higher sub-leaf index m, CPUID.(EAX=0BH, ECX=m):EAX[4:0] reports the right-shift
-count that will allow software to extract part of x2APIC ID to distinguish higher-level topological entities. This
-means the right-shift value at of sub-leaf m, corresponds to the least significant (m+1) subfields of the 32-bit
-x2APIC ID.
+For each subsequent higher sub-leaf index m, CPUID.(EAX=0BH or 1FH, ECX=m):EAX[4:0] reports the right-shift count that allows software to extract part of x2APIC ID to distinguish higher-level topological entities.
+This means the right-shift value at sub-leaf m corresponds to the least significant (m+1) subfields of the 32-bit x2APIC ID.
+
+8-36 Vol. 3A
+
+ MULTIPLE-PROCESSOR MANAGEMENT
Example 8-17. BitWidth Determination of x2APIC ID Subfields
For m = 0, m < N, m ++;
-{ cumulative_width[m] = CPUID.(EAX=0BH, ECX= m): EAX[4:0]; }
+{ cumulative_width[m] = CPUID.(EAX=0BH or 1FH, ECX= m): EAX[4:0]; }
BitWidth[0] = cumulative_width[0];
For m = 1, m < N, m ++;
BitWidth[m] = cumulative_width[m] - cumulative_width[m-1];
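The level count and bit-width computation can be sketched in C as below. This is an illustration under stated assumptions, not the manual's code: the struct topo_subleaf type and both function names are hypothetical, and the sub-leaf outputs are supplied as a pre-filled array rather than obtained by executing CPUID, so the logic can be exercised on any machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one sub-leaf's output from CPUID leaf 0BH or 1FH. */
struct topo_subleaf {
    uint32_t eax;  /* EAX[4:0]  = cumulative right-shift count    */
    uint32_t ecx;  /* ECX[15:8] = level type; 0 means invalid     */
};

/* Count levels below the package: stop at the first invalid level type. */
size_t count_levels(const struct topo_subleaf *leaf, size_t max_subleaves)
{
    size_t n = 0;
    while (n < max_subleaves && ((leaf[n].ecx >> 8) & 0xFFu) != 0)
        n++;
    return n;
}

/* Width of each sub-field = difference of consecutive cumulative shifts. */
void level_bit_widths(const struct topo_subleaf *leaf, size_t n, unsigned *width)
{
    unsigned prev = 0;
    for (size_t m = 0; m < n; m++) {
        unsigned cumulative = leaf[m].eax & 0x1Fu;
        assert(cumulative >= prev);  /* shift counts must not decrease */
        width[m] = cumulative - prev;
        prev = cumulative;
    }
}
```

For a table reporting shift counts 1 (SMT) and 4 (core) followed by an invalid sub-leaf, the sketch yields two levels with widths 1 and 3.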
-Currently, only the following encoding of hierarchical level type are defined: 0 (invalid), 1 (SMT), and 2 (core). Software must not assume any “level type“ encoding value to be related to any sub-leaf index, except sub-leaf 0.
-Example 8-16 and Example 8-17 represent the general technique for using CPUID leaf 0BH to enumerate processor
-topology of more than two levels of hierarchy inside a physical package. Most processor families to date requires
-only “SMT” and “CORE” levels within a physical package. The examples in later sections will focus on these threelevel topology only.
+
+NOTE
+CPUID leaf 1FH is a preferred superset to leaf 0BH. Leaf 1FH defines additional level types, and it
+must be parsed by an algorithm that can handle the addition of future level types.
+Previously, only the following encoding of hierarchical level types were defined: 0 (invalid), 1 (SMT), and 2 (core).
+With the additional hierarchical level types available (see Section 8.9.1, “Hierarchical Mapping of Shared
+Resources” and Figure 8-5, “Generalized Seven Level Interpretation of the APIC ID”), software must not assume
+any “level type” encoding value to be related to any sub-leaf index, except sub-leaf 0.
+Example 8-18. Support Routines for Identifying Package, Die, Core and Logical Processors from 32-bit x2APIC ID
+a.
+
+Derive the extraction bitmask for logical processors in a processor core and associated mask offset for different
+cores.
+
+//
+// This example shows how to enumerate CPU topology level types (level types may or may not be known/supported by the software)
+//
+// Below is the list of sample level types used in the example.
+// Refer to the CPUID Leaf 1FH definition for the actual level type numbers: “V2 Extended Topology Enumeration Leaf” .
+//
+// SMT
+// CORE
+// MODULE
+// TILE
+// DIE
+// PACKAGE
+//
+// The example shows how to identify and derive the extraction bitmask for the levels with level type SMT/CORE/DIE/PACKAGE
+//
+int DeriveSMT_Mask_Offsets (void)
+{
+IF (!HWMTSupported()) return -1;
+execute cpuid with EAX = 0BH or 1FH, ECX = 0;
+IF (returned level type encoding in ECX[15:8] does not match SMT) return -1;
+Mask_SMT_shift = EAX[4:0];
+//# bits shift right of APIC ID to distinguish different cores
+SMT_MASK =~( (-1) << Mask_SMT_shift);
+//shift left to derive extraction bitmask for SMT_ID
+return 0;
+}
+
+Vol. 3A 8-37
+
+ MULTIPLE-PROCESSOR MANAGEMENT
+
+b.
+
+Derive the extraction bitmask for processor cores in a physical processor package and associated mask offset for
+different packages.
+
+int DeriveCore_Mask_Offsets (void)
+{
+IF (!HWMTSupported()) return -1;
+execute cpuid with EAX = 0BH or 1FH, ECX = 0;
+WHILE( ECX[15:8] ) {
+//level type encoding is valid
+Mask_last_known_shift = EAX[4:0];
+IF (returned level type encoding in ECX[15:8] matches CORE) {
+Mask_Core_shift = EAX[4:0];
+}
+ELSE IF (returned level type encoding in ECX[15:8] matches DIE) {
+Mask_Die_shift = EAX[4:0];
+}
+//
+// Keep enumerating. Check if the next level is the desired level and if not, keep enumerating until you reach a known level
+// or the invalid level (“0” level type). If there are more levels between DIE and PACKAGE, the unknown levels will be ignored
+// and treated as an extension of the last known level (i.e., DIE in this case).
+//
+ECX++;
+execute cpuid with EAX = 0BH or 1FH;
+}
+COREPlusSMT_MASK = ~( (-1) << Mask_Core_shift);
+DIEPlusCORE_MASK = ~( (-1) << Mask_Die_shift);
+//
+// Treat levels between DIE and physical package as an extension of DIE for software choosing not to implement or recognize
+// these unknown levels.
+//
+CORE_MASK = COREPlusSMT_MASK ^ SMT_MASK;
+DIE_MASK = DIEPlusCORE_MASK ^ COREPlusSMT_MASK;
+PACKAGE_MASK = (-1) << Mask_last_known_shift;
+return 0;
+}
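The mask arithmetic in the routine above can be sketched in C as follows. This is a minimal illustration, not the manual's code: the struct topo_masks type and the names low_bits and derive_masks are hypothetical, and the shift counts are passed in as plain integers rather than read from CPUID. Unsigned arithmetic replaces the (-1) << n idiom, because left-shifting a negative signed value is undefined in C. A caller with no DIE level is assumed to pass die_shift equal to core_shift, which yields an empty DIE mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the extraction bit masks. */
struct topo_masks {
    uint32_t smt, core, die, package;
};

/* Mask with the low `shift` bits set; saturates at 32 bits. */
uint32_t low_bits(unsigned shift)
{
    return shift >= 32 ? 0xFFFFFFFFu : ((1u << shift) - 1u);
}

/* Each argument is the cumulative right-shift count (EAX[4:0]) of a level. */
struct topo_masks derive_masks(unsigned smt_shift, unsigned core_shift,
                               unsigned die_shift, unsigned last_known_shift)
{
    struct topo_masks m;
    uint32_t core_plus_smt = low_bits(core_shift);
    uint32_t die_plus_core = low_bits(die_shift);

    assert(smt_shift <= core_shift && core_shift <= die_shift);
    m.smt     = low_bits(smt_shift);
    m.core    = core_plus_smt ^ m.smt;          /* bits between SMT and CORE */
    m.die     = die_plus_core ^ core_plus_smt;  /* bits between CORE and DIE */
    m.package = ~low_bits(last_known_shift);    /* everything above          */
    return m;
}
```

With shift counts 1, 4, 6 and a last known shift of 6, the masks come out as 0x1, 0xE, 0x30 and 0xFFFFFFC0.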
8.9.3
@@ -276807,18 +276837,19 @@ For Intel 64 and IA-32 processors, system hardware establishes an 8-bit initial
processor supports CPUID leaf 0BH) that is unique for each logical processor following power-up or RESET (see
Section 8.6.1). Each logical processor on the system is allocated an initial APIC ID. BIOS may implement features
that tell the OS to support fewer than the total number of logical processors on the system bus. Those logical processors that are not available to applications at runtime are halted during the OS boot process. As a result, the number of
-valid local APIC_IDs that can be queried by affinitizing-current-thread-context (See Example 8-22) is limited to the
+valid local APIC_IDs that can be queried by affinitizing-current-thread-context (See Example 8-23) is limited to the
number of logical processors enabled at runtime by the OS boot process.
Table 8-2 shows an example of the 8-bit APIC IDs that are initially reported for logical processors in a system with
four Intel Xeon MP processors that support Intel Hyper-Threading Technology (a total of 8 logical processors, each
physical package has two processor cores and supports Intel Hyper-Threading Technology). Of the two logical
-processors within a Intel Xeon processor MP, logical processor 0 is designated the primary logical processor and
-logical processor 1 as the secondary logical processor.
-8-36 Vol. 3A
+8-38 Vol. 3A
MULTIPLE-PROCESSOR MANAGEMENT
+processors within an Intel Xeon processor MP, logical processor 0 is designated the primary logical processor and
+logical processor 1 as the secondary logical processor.
+
T0
T1
@@ -277006,7 +277037,7 @@ SMT ID
1H
-Vol. 3A 8-37
+Vol. 3A 8-39
MULTIPLE-PROCESSOR MANAGEMENT
@@ -277175,12 +277206,12 @@ total number logical processors available in the platform hardware.
valid ECX index. The ECX index starts at 0.
11. The maximum number of addressable IDs for processor cores sharing the target cache level is obtained by executing CPUID with EAX = 4
and the ECX index corresponding to the target cache level.
-8-38 Vol. 3A
+8-40 Vol. 3A
MULTIPLE-PROCESSOR MANAGEMENT
The extraction algorithm (for three-level mappings from an APIC ID) uses the general procedure depicted in
-Example 8-18, and is supplemented by more detailed descriptions on the derivation of topology enumeration
+Example 8-19, and is supplemented by more detailed descriptions on the derivation of topology enumeration
parameters for extraction bit masks:
1. Detect hardware multi-threading support in the processor.
2. Derive a set of bit masks that can extract the sub ID of each hierarchical level of the topology. The algorithm to
@@ -277230,11 +277261,11 @@ d. Derive the extraction bit masks using respective address sizes corresponding
PACKAGE_ID, starting from SMT_ID.
e. Apply each extraction bit mask to the 8-bit initial APIC ID to extract sub-field IDs.
-Vol. 3A 8-39
+Vol. 3A 8-41
MULTIPLE-PROCESSOR MANAGEMENT
-Example 8-18. Support Routines for Detecting Hardware Multi-Threading and Identifying the Relationships Between Package,
+Example 8-19. Support Routines for Detecting Hardware Multi-Threading and Identifying the Relationships Between Package,
Core and Logical Processors
1.
//
@@ -277263,7 +277294,7 @@ return (feature_flag_edx & HWMT_BIT); // bit 28
}
return 0;
}
-Example 8-19. Support Routines for Identifying Package, Core and Logical Processors from 32-bit x2APIC ID
+Example 8-20. Support Routines for Identifying Package, Core and Logical Processors from 32-bit x2APIC ID
a.
Derive the extraction bitmask for logical processors in a processor core and associated mask offset for different
@@ -277279,7 +277310,7 @@ SMT_MASK = ~( (-1) << Mask_SMT_shift); // shift left to derive extraction bitmas
return 0;
}
-8-40 Vol. 3A
+8-42 Vol. 3A
MULTIPLE-PROCESSOR MANAGEMENT
@@ -277319,7 +277350,7 @@ store returned value of edx
return (unsigned) (reg_edx) ;
}
-Example 8-20. Support Routines for Identifying Package, Core and Logical Processors from 8-bit Initial APIC ID
+Example 8-21. Support Routines for Identifying Package, Core and Logical Processors from 8-bit Initial APIC ID
a.
Find the size of address space for logical processors in a physical processor package.
@@ -277337,7 +277368,7 @@ store returned value of ebx
return (unsigned char) ((reg_ebx & NUM_LOGICAL_BITS) >> 16);
}
-Vol. 3A 8-41
+Vol. 3A 8-43
MULTIPLE-PROCESSOR MANAGEMENT
@@ -277375,7 +277406,7 @@ store returned value of ebx
return (unsigned) ((reg_ebx & INITIAL_APIC_ID_BITS) >> 24);
}
-8-42 Vol. 3A
+8-44 Vol. 3A
MULTIPLE-PROCESSOR MANAGEMENT
@@ -277422,7 +277453,7 @@ Return SubID;
}
Software must not assume local APIC_ID values in an MP system are consecutive. Non-consecutive local APIC_IDs
may be the result of hardware configurations or debug features implemented in the BIOS or OS.
-An identifier for each hierarchical level can be extracted from an 8-bit APIC_ID using the support routines illustrated in Example 8-20. The appropriate bit mask and shift value to construct the appropriate bit mask for each
+An identifier for each hierarchical level can be extracted from an 8-bit APIC_ID using the support routines illustrated in Example 8-21. The appropriate bit mask and shift value to construct the appropriate bit mask for each
level must be determined dynamically at runtime.
8.9.5
@@ -277435,10 +277466,10 @@ following procedures are recommended:
Extract the three-level identifiers from the APIC ID of each logical processor enabled by system software. The
-sequence is as follows (See the pseudo code shown in Example 8-21 and support routines shown in Example
-8-18):
+sequence is as follows (See the pseudo code shown in Example 8-22 and support routines shown in Example
+8-19):
-Vol. 3A 8-43
+Vol. 3A 8-45
MULTIPLE-PROCESSOR MANAGEMENT
@@ -277467,48 +277498,48 @@ each node of a clustered system is symmetric.
Assemble the three-level identifiers of SMT_ID, CORE_ID, PACKAGE_IDs into arrays for each enabled logical
-processor. This is shown in Example 8-22a.
+processor. This is shown in Example 8-23a.
To detect the number of physical packages: use PACKAGE_ID to identify those logical processors that reside in
-the same physical package. This is shown in Example 8-22b. This example also depicts a technique to construct
+the same physical package. This is shown in Example 8-23b. This example also depicts a technique to construct
a mask to represent the logical processors that reside in the same package.
To detect the number of processor cores: use CORE_ID to identify those logical processors that reside in the
-same core. This is shown in Example 8-22. This example also depicts a technique to construct a mask to
+same core. This is shown in Example 8-23. This example also depicts a technique to construct a mask to
represent the logical processors that reside in the same core.
-In Example 8-21, the numerical ID value can be obtained from the value extracted with the mask by shifting it right
+In Example 8-22, the numerical ID value can be obtained from the value extracted with the mask by shifting it right
by shift count. Algorithms below do not shift the value. The assumption is that the SubID values can be compared
for equivalence without the need to shift.
-Example 8-21. Pseudo Code Depicting Three-level Extraction Algorithm
+Example 8-22. Pseudo Code Depicting Three-level Extraction Algorithm
For Each local_APIC_ID{
// Calculate SMT_MASK, the bit mask pattern to extract SMT_ID,
// SMT_MASK is determined using topology enumeration parameters
-// from CPUID leaf 0BH (Example 8-19);
-// otherwise, SMT_MASK is determined using CPUID leaf 01H and leaf 04H (Example 8-20).
+// from CPUID leaf 0BH (Example 8-20);
+// otherwise, SMT_MASK is determined using CPUID leaf 01H and leaf 04H (Example 8-21).
// This algorithm assumes there is symmetry across core boundary, i.e. each core within a
// package has the same number of logical processors
// SMT_ID always starts from bit 0, corresponding to the right-most bit-field
SMT_ID = APIC_ID & SMT_MASK;
// Extract CORE_ID:
-// CORE_MASK is determined in Example 8-19 or Example 8-20
+// CORE_MASK is determined in Example 8-20 or Example 8-21
CORE_ID = (APIC_ID & CORE_MASK) ;
// Extract PACKAGE_ID:
// Assume single cluster.
// Shift out the mask width for maximum logical processors per package
-// PACKAGE_MASK is determined in Example 8-19 or Example 8-20
+// PACKAGE_MASK is determined in Example 8-20 or Example 8-21
PACKAGE_ID = (APIC_ID & PACKAGE_MASK) ;
}
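The three-level extraction above leaves each sub-ID in place without shifting. Where numeric IDs are wanted, the masked values can be shifted right by the corresponding shift count. The following C sketch shows this under stated assumptions: the struct topo_ids type and the extract_ids name are hypothetical, smt_shift is the width of the SMT field, and core_shift is the combined width of the SMT and CORE fields, both supplied by the caller rather than derived from CPUID.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three numeric sub-IDs. */
struct topo_ids {
    uint32_t smt_id, core_id, package_id;
};

struct topo_ids extract_ids(uint32_t apic_id, unsigned smt_shift,
                            unsigned core_shift)
{
    struct topo_ids ids;
    uint32_t smt_mask, core_plus_smt, core_mask, package_mask;

    assert(smt_shift <= core_shift && core_shift < 32);
    smt_mask      = (1u << smt_shift) - 1u;
    core_plus_smt = (1u << core_shift) - 1u;
    core_mask     = core_plus_smt ^ smt_mask;
    package_mask  = ~core_plus_smt;

    /* Mask first, then shift right to obtain a zero-based numeric ID. */
    ids.smt_id     = apic_id & smt_mask;
    ids.core_id    = (apic_id & core_mask) >> smt_shift;
    ids.package_id = (apic_id & package_mask) >> core_shift;
    return ids;
}
```

For APIC ID 1DH with a 1-bit SMT field and a 2-bit CORE field (smt_shift = 1, core_shift = 3), the sketch gives SMT_ID 1, CORE_ID 2 and PACKAGE_ID 3.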
-8-44 Vol. 3A
+8-46 Vol. 3A
MULTIPLE-PROCESSOR MANAGEMENT
-Example 8-22. Compute the Number of Packages, Cores, and Processor Relationships in a MP System
+Example 8-23. Compute the Number of Packages, Cores, and Processor Relationships in a MP System
a) Assemble lists of PACKAGE_ID, CORE_ID, and SMT_ID for each enabled logical processor
//The BIOS and/or OS may limit the number of logical processors available to applications
// after system boot. The below algorithm will compute topology for the processors visible
@@ -277527,9 +277558,9 @@ while (ThreadAffinityMask ≠ 0 && ThreadAffinityMask <= SystemAffinity) {
if (ThreadAffinityMask & SystemAffinity){
Set thread to run on the processor specified in ThreadAffinityMask
Wait if necessary and ensure thread is running on specified processor
-APIC_ID = GetAPIC_ID(); // 32 bit ID in Example 8-19 or 8-bit ID in Example 8-20
+APIC_ID = GetAPIC_ID(); // 32 bit ID in Example 8-20 or 8-bit ID in Example 8-21
Extract the Package_ID, Core_ID and SMT_ID as explained in three level extraction
-algorithm of Example 8-21
+algorithm of Example 8-22
PackageID[ProcessorNum] = PACKAGE_ID;
CoreID[ProcessorNum] = CORE_ID;
SmtID[ProcessorNum] = SMT_ID;
@@ -277555,7 +277586,7 @@ PackageIDBucket[0] = PackageID[0];
ProcessorMask = 1;
PackageProcessorMask[0] = ProcessorMask;
-Vol. 3A 8-45
+Vol. 3A 8-47
MULTIPLE-PROCESSOR MANAGEMENT
@@ -277609,7 +277640,7 @@ CoreNum++;
// CoreProcessorMask[] array has the processor set of each core
Other processor relationships such as processor mask of sibling cores can be computed from set operations of the
PackageProcessorMask[] and CoreProcessorMask[].
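The set operations mentioned above can be sketched as follows. This is a minimal Python sketch under stated assumptions: the inputs are the per-processor PACKAGE_ID and CORE_ID lists gathered earlier, bit n of a mask stands for logical processor n, and the function names build_masks and sibling_core_mask are hypothetical, not taken from the manual.

```python
def build_masks(package_ids, core_ids):
    """Group logical processors into per-package and per-core bit masks."""
    package_masks = {}  # PACKAGE_ID -> mask of logical processors in that package
    core_masks = {}     # (PACKAGE_ID, CORE_ID) -> mask of logical processors in that core
    for n, (pkg, core) in enumerate(zip(package_ids, core_ids)):
        bit = 1 << n
        package_masks[pkg] = package_masks.get(pkg, 0) | bit
        core_masks[(pkg, core)] = core_masks.get((pkg, core), 0) | bit
    return package_masks, core_masks


def sibling_core_mask(package_masks, core_masks, pkg, core):
    """Processors in the same package that belong to other cores."""
    return package_masks[pkg] & ~core_masks[(pkg, core)]
```

Keying the core masks by the (package, core) pair avoids merging cores that happen to share a CORE_ID value in different packages.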
-8-46 Vol. 3A
+8-48 Vol. 3A
MULTIPLE-PROCESSOR MANAGEMENT
@@ -277665,13 +277696,13 @@ availability of MONITOR and MWAIT:
Use CPUID to query the MONITOR bit (CPUID.1.ECX[3] = 1).
If CPUID indicates support, execute MONITOR inside a TRY/EXCEPT exception handler and trap for an
exception. If an exception occurs, MONITOR and MWAIT are not supported at a privilege level greater than 0.
-See Example 8-23.
+See Example 8-24.
-Vol. 3A 8-47
+Vol. 3A 8-49
MULTIPLE-PROCESSOR MANAGEMENT
-Example 8-23. Verifying MONITOR/MWAIT Support
+Example 8-24. Verifying MONITOR/MWAIT Support
boolean MONITOR_MWAIT_works = TRUE;
try {
_asm {
@@ -277733,7 +277764,7 @@ Architectural TLB invalidations including writes to CR0, CR3, CR4 and certain MS
Power management related events (such as Thermal Monitor 2 or chipset driven STPCLK# assertion) will not cause
the monitor event pending flag to be cleared. Faults will not cause the monitor event pending flag to be cleared.
-8-48 Vol. 3A
+8-50 Vol. 3A
MULTIPLE-PROCESSOR MANAGEMENT
@@ -277807,7 +277838,7 @@ Hyper-Threading Technology. It also describes optimizations that can help an ope
are representative of the types of modifications that appear in Windows* XP and Linux* kernel 2.4.0 operating
systems for Intel processors supporting Intel Hyper-Threading Technology. Additional optimizations for processors
-Vol. 3A 8-49
+Vol. 3A 8-51
MULTIPLE-PROCESSOR MANAGEMENT
@@ -277849,8 +277880,8 @@ instruction.
Potential Usage of MONITOR/MWAIT in C0 Idle Loops
-An operating system may implement different handlers for different idle states. A typical OS idle loop on an ACPI-compatible OS is shown in Example 8-24:
-Example 8-24. A Typical OS Idle Loop
+An operating system may implement different handlers for different idle states. A typical OS idle loop on an ACPI-compatible OS is shown in Example 8-25:
+Example 8-25. A Typical OS Idle Loop
// WorkQueue is a memory location indicating there is a thread
// ready to run. A non-zero value for WorkQueue is assumed to
// indicate the presence of work to be scheduled on the processor.
@@ -277864,7 +277895,7 @@ ELSE {
// on Idle time accumulated
IF (IdleTime >= IdleTimeThreshhold) THEN {
// Call appropriate C1, C2, C3 state handler, C1 handler
-8-50 Vol. 3A
+8-52 Vol. 3A
MULTIPLE-PROCESSOR MANAGEMENT
@@ -277878,7 +277909,7 @@ VOID C1Handler()
HLT
}
The MONITOR and MWAIT instructions may be considered for use in the C0 idle state loops, if MONITOR and MWAIT are supported.
-Example 8-25. An OS Idle Loop with MONITOR/MWAIT in the C0 Idle Loop
+Example 8-26. An OS Idle Loop with MONITOR/MWAIT in the C0 Idle Loop
// WorkQueue is a memory location indicating there is a thread
// ready to run. A non-zero value for WorkQueue is assumed to
// indicate the presence of work to be scheduled on the processor.
@@ -277921,7 +277952,7 @@ for runnable software tasks. Logical processors that execute idle loops consume
execution resources that might otherwise be used by the other logical processors in the physical package. For this
reason, halting idle logical processors optimizes the performance.12 If all logical processors within a physical
package are halted, the processor will enter a power-saving state.
-Vol. 3A 8-51
+Vol. 3A 8-53
MULTIPLE-PROCESSOR MANAGEMENT
@@ -277930,8 +277961,8 @@ Vol. 3A 8-51
Potential Usage of MONITOR/MWAIT in C1 Idle Loops
An operating system may also consider replacing HLT with MONITOR/MWAIT in its C1 idle loop. An example is
-shown in Example 8-26:
-Example 8-26. An OS Idle Loop with MONITOR/MWAIT in the C1 Idle Loop
+shown in Example 8-27:
+Example 8-27. An OS Idle Loop with MONITOR/MWAIT in the C1 Idle Loop
// WorkQueue is a memory location indicating there is a thread
// ready to run. A non-zero value for WorkQueue is assumed to
// indicate the presence of work to be scheduled on the processor.
@@ -277989,7 +278020,7 @@ thread’s code and data when it is dispatched for execution after being suspend
12. Excessive transitions into and out of the HALT state could also incur performance penalties. Operating systems should evaluate the
performance trade-offs for their operating system.
-8-52 Vol. 3A
+8-54 Vol. 3A
MULTIPLE-PROCESSOR MANAGEMENT
@@ -278069,7 +278100,7 @@ whether it should execute the BIOS boot-strap code (if it is the BSP) or enter a
AP).
The following special-purpose interprocessor interrupts (IPIs) are used during the boot phase of the MP initialization protocol. These IPIs are broadcast on the APIC bus.
-Vol. 3A 8-53
+Vol. 3A 8-55
MULTIPLE-PROCESSOR MANAGEMENT
@@ -278194,7 +278225,7 @@ BIPI to be handled, so processor 1 becomes the BSP.)
5. The newly established BSP broadcasts an FIPI message to “all including self.” The FIPI is guaranteed to be
handled only after the completion of the BIPIs that were issued by the non-BSP processors.
-8-54 Vol. 3A
+8-56 Vol. 3A
MULTIPLE-PROCESSOR MANAGEMENT
@@ -278263,6 +278294,10 @@ of the STPCLK# pin.
See Section 8.4.4, “MP Initialization Example,” for an annotated example of the use of the MP protocol to boot IA-32
processors in an MP system. This code should run on any IA-32 processor that uses the MP protocol.
+Vol. 3A 8-57
+
+ MULTIPLE-PROCESSOR MANAGEMENT
+
8.11.2.1
Error Detection and Handling During the MP Initialization Protocol
@@ -278277,15 +278312,11 @@ The MP initialization protocol makes the following assumptions regarding errors
If errors are detected on the APIC bus during execution of the MP initialization protocol, the processors that
detect the errors are shut down.
-Vol. 3A 8-55
-
- MULTIPLE-PROCESSOR MANAGEMENT
-
The MP initialization protocol will be executed by processors even if they fail their BIST sequences.
-8-56 Vol. 3A
+8-58 Vol. 3A
CHAPTER 9
PROCESSOR MANAGEMENT AND INITIALIZATION
@@ -291906,8 +291937,8 @@ support for multiple processor extended states along with FP/SSE states that may
requiring the system executive to be modified each time a new processor state extension is introduced. The XSAVE
feature set provides mechanisms to enumerate the supported extended states, to enable some or all of them for software use, to save and restore the states with dedicated instructions, and to enumerate the layout of the states when saved to memory.
XSAVE/XRSTOR instructions are part of the XSAVE feature set. These instructions are introduced after the introduction of FP/SSE states but can be used to manage legacy FP/SSE state along with processor extended states. See
-CHAPTER 13 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 for information
-about XSAVE feature set.
+Chapter 13 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 for information about
+XSAVE feature set.
System programming for managing processor extended states is described in sections 13.5 through 13.6. XSAVE
feature set is designed to be compatible with FXSAVE/FXRSTOR and hence much of the material through sections
13.1 to 13.4 related to SSE state also applies to XSAVE feature set with the exception of enumeration and
@@ -292462,7 +292493,7 @@ FPU and SSE state, it should execute the CLTS instruction to clear the TS flag.
THE XSAVE FEATURE SET AND PROCESSOR EXTENDED STATE
MANAGEMENT
-The architecture of XSAVE feature set is described in CHAPTER 13 of Intel® 64 and IA-32 Architectures Software
+The architecture of XSAVE feature set is described in Chapter 13 of Intel® 64 and IA-32 Architectures Software
Developer’s Manual, Volume 1. The XSAVE feature set includes the following:
@@ -304687,10 +304718,14 @@ Handler re-enables IA32_PERF_GLOBAL_CTRL
None
-17.4.8
-
Reduced software overhead
+Vol. 3B 17-15
+
+ DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
+
+17.4.8
+
LBR Stack
The last branch record stack and top-of-stack (TOS) pointer MSRs are supported across Intel 64 and IA-32
@@ -304714,27 +304749,17 @@ Range of TOS Pointer
FROM_IP, TO_IP
0 to 31
-
-Vol. 3B 17-15
-
- DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
-
-Table 17-4. LBR Stack Size and TOS Pointer Range (Contd.)
-DisplayFamily_DisplayModel
-
-Size of LBR Stack
-
-Component of an LBR Entry
-LBR_INFO1
-
-Range of TOS Pointer
+1
06_4EH, 06_5EH, 06_8EH, 06_9EH, 06_55H,
-06_66H, 06_7AH
+06_66H, 06_7AH, 06_67H, 06_6AH, 06_6CH,
+06_7DH, 06_7EH
32
-FROM_IP, TO_IP,
+FROM_IP, TO_IP, LBR_INFO
+
+0 to 31
06_3DH, 06_47H, 06_4FH, 06_56H, 06_3CH,
06_45H, 06_46H, 06_3FH, 06_2AH, 06_2DH,
@@ -304767,13 +304792,6 @@ FROM_IP, TO_IP
NOTES:
1. See Section 17.12.
-
-17-16 Vol. 3B
-
-0 to 31
-
- DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
-
The last branch recording mechanism tracks not only branch instructions (like JMP, Jcc, LOOP and CALL instructions), but also other operations that cause a change in the instruction pointer (like external interrupts, traps and
faults). The branch recording mechanism generally employs a set of MSRs, referred to as the last branch record (LBR)
stack. The size and exact locations of the LBR stack are generally model-specific (see Chapter 2, “Model-Specific
@@ -304802,6 +304820,10 @@ LBR Stack and Intel® 64 Processors
LBR MSRs are 64-bits. In 64-bit mode, last branch records store the full address. Outside of 64-bit mode, the upper
32-bits of branch addresses will be stored as 0.
+17-16 Vol. 3B
+
+ DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
+
MSR_LASTBRANCH_0_FROM_IP through MSR_LASTBRANCH_(N-1)_FROM_IP
0
@@ -304833,11 +304855,6 @@ address) of respective source/destination. Misprediction, TSX, and elapsed cycle
are reported in the LBR_INFO MSR stack.
— 000110B (64-bit LIP record format), Flags, Cycles — Stores 64-bit linear address (CS.Base +
effective address) of respective source/destination. Misprediction info is reported in the upper bits of
-
-Vol. 3B 17-17
-
- DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
-
'FROM' registers in the LBR stack. Elapsed cycles since the last LBR update are reported in the upper 16 bits
of the 'TO' registers in the LBR stack (see Section 17.6).
— 000111B (64-bit LIP record format), Flags, LBR_INFO — Stores 64-bit linear address (CS.Base +
@@ -304862,6 +304879,10 @@ exception or an interrupt. The location of the last exception record (LER) MSRs
store last exception records are 64-bits. If IA-32e mode is disabled, only the lower 32-bits of the address are
recorded. If IA-32e mode is enabled, the processor writes 64-bit values into the MSR. In 64-bit mode, last exception records store 64-bit addresses; in compatibility mode, the upper 32-bits of last exception records are cleared.
+Vol. 3B 17-17
+
+ DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
+
17.4.9
BTS and DS Save Area
@@ -304907,10 +304928,6 @@ counter is automatically reset to a specified value, and event counting begins a
PEBS record varies across different implementations that support PEBS. See Section 18.6.2.4.2 for details of
enumerating PEBS record format.
-17-18 Vol. 3B
-
- DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
-
NOTES
Prior to processors based on the Goldmont microarchitecture, PEBS facility only supports a subset
of implementation-specific precise events. See Section 18.5.3.1 for a PEBS enhancement that can
@@ -304942,8 +304959,12 @@ should be the same as the address in the BTS buffer base field.
-BTS absolute maximum — Linear address of the next byte past the end of the BTS buffer. This address
-should be a multiple of the BTS record size (12 bytes) plus 1.
+BTS absolute maximum — Linear address of the next byte past the end of the BTS buffer. This address should
+be a multiple of the BTS record size (12 bytes) plus 1.
+
+17-18 Vol. 3B
+
+ DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -304979,11 +305000,6 @@ handled prior to processor writing the PEBS absolute maximum record.
PEBS counter reset value — A 64-bit value that the counter is to be set to when a PEBS record is written. Bits
beyond the size of the counter are ignored. This value allows state information to be collected regularly every
time the specified number of events occur.
-
-Vol. 3B 17-19
-
- DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
-
IA32_DS_AREA MSR
DS Buffer Management Area
BTS Buffer Base
@@ -305038,6 +305054,11 @@ PEBS Record n
Figure 17-5. DS Save Area Example1
NOTES:
1. This example represents the format for a system that supports PEBS on only one counter.
+
+Vol. 3B 17-19
+
+ DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
+
Figure 17-6 shows the structure of a 12-byte branch record in the BTS buffer. The fields in each record are as
follows:
@@ -305056,10 +305077,6 @@ service routine.
Branch predicted — Bit 4 of field indicates whether the branch that was taken was predicted (set) or not
predicted (clear).
-17-20 Vol. 3B
-
- DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
-
31
4
@@ -305134,7 +305151,7 @@ When DTES64 = 1 (CPUID.1.ECX[2] = 1), the structure of the DS save area is shown
When DTES64 = 0 (CPUID.1.ECX[2] = 0) and IA-32e mode is active, the structure of the DS save area is shown in
Figure 17-8. If IA-32e mode is not active the structure of the DS save area is as shown in Figure 17-5.
-Vol. 3B 17-21
+17-20 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -305216,7 +305233,7 @@ Branch Predicted
Figure 17-9. 64-bit Branch Trace Record Format
-17-22 Vol. 3B
+Vol. 3B 17-21
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -305308,7 +305325,7 @@ and Section 17.4.9.1, “64 Bit Format of the DS Save Area”). Also see the add
10.5.1, “Local Vector Table.”
4. Establish an interrupt handler in the IDT for the vector associated with the performance counter entry in the
xAPIC LVT.
-Vol. 3B 17-23
+17-22 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -305319,9 +305336,9 @@ The following restrictions should be applied to the DS save area.
The three DS save area sections should be allocated from a non-paged pool, and marked accessed and dirty. It
-is the responsibility of the operating system to keep the pages that contain the buffer present and to mark them
-accessed and dirty. The implication is that the operating system cannot do “lazy” page-table entry propagation
-for these pages.
+is the responsibility of the operating system to keep the pages that contain the buffer present and to mark
+them accessed and dirty. The implication is that the operating system cannot do “lazy” page-table entry
+propagation for these pages.
@@ -305422,7 +305439,7 @@ MSR_DEBUGCTLB for Pentium M processors).
3. Clear the BTINT flag in the corresponding IA32_DEBUGCTL (or MSR_DEBUGCTLA MSR; or MSR_DEBUGCTLB)
if a circular BTS buffer is desired.
-17-24 Vol. 3B
+Vol. 3B 17-23
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -305602,7 +305619,7 @@ access to the DS save area. This is done by clearing TR flag in the IA32_DEBUGCT
and by clearing the precise event enable flag in the MSR_PEBS_ENABLE MSR. These settings should be
restored to their original values when exiting the ISR.
-Vol. 3B 17-25
+17-24 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -305690,15 +305707,15 @@ store source addresses
— MSR_LASTBRANCH_0_TO_IP (address 60H) through MSR_LASTBRANCH_3_TO_IP (address 63H) store
destination addresses
-17-26 Vol. 3B
+Vol. 3B 17-25
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
Last Branch Record Top-of-Stack (TOS) Pointer — The least significant 2 bits of the TOS Pointer MSR
-(MSR_LASTBRANCH_TOS, address 1C9H) contain a pointer to the MSR in the LBR stack that contains the
-most recent branch, interrupt, or exception recorded.
+(MSR_LASTBRANCH_TOS, address 1C9H) contain a pointer to the MSR in the LBR stack that contains the most
+recent branch, interrupt, or exception recorded.
Eight pairs of MSRs are supported in the LBR stack for 45 nm and 32 nm Intel Atom processors:
@@ -305713,8 +305730,8 @@ destination addresses
Last Branch Record Top-of-Stack (TOS) Pointer — The least significant 3 bits of the TOS Pointer MSR
-(MSR_LASTBRANCH_TOS, address 1C9H) contain a pointer to the MSR in the LBR stack that contains the
-most recent branch, interrupt, or exception recorded.
+(MSR_LASTBRANCH_TOS, address 1C9H) contain a pointer to the MSR in the LBR stack that contains the most
+recent branch, interrupt, or exception recorded.
The address format written in the FROM_IP/TO_IP MSRS may differ between processors. Software should query
IA32_PERF_CAPABILITIES[5:0] and consult Section 17.4.8.1. The behavior of the MSR_LER_TO_LIP and the
@@ -305793,7 +305810,7 @@ R/W
Elapsed core clocks since last update to the LBR stack.
-Vol. 3B 17-27
+17-26 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -305834,8 +305851,8 @@ destination addresses.
Last Branch Record Top-of-Stack (TOS) Pointer — The least significant 3 bits of the TOS Pointer MSR
-(MSR_LASTBRANCH_TOS, address 1C9H) contain a pointer to the MSR in the LBR stack that contains the most
-recent branch, interrupt, or exception recorded.
+(MSR_LASTBRANCH_TOS, address 1C9H) contain a pointer to the MSR in the LBR stack that contains the
+most recent branch, interrupt, or exception recorded.
LBR filtering is supported. Filtering of LBRs based on a combination of CPL and branch type conditions is supported.
When LBR filtering is enabled, the LBR stack only captures the subset of branches that are specified by
@@ -305866,7 +305883,7 @@ operations. See Section 17.4.1 for a description of the flags. See Figure 17-11
Last branch record (LBR) stack — There are 16 MSR pairs that store the source and destination addresses
related to recently executed branches. See Section 17.9.1.
-17-28 Vol. 3B
+Vol. 3B 17-27
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -305937,7 +305954,7 @@ LBR — Last branch/interrupt/exception
Figure 17-11. IA32_DEBUGCTL MSR for Processors based
on Intel microarchitecture code name Nehalem
-Vol. 3B 17-29
+17-28 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -306116,10 +306133,10 @@ Reserved
63:9
-17-30 Vol. 3B
-
Must be zero
+Vol. 3B 17-29
+
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
17.10
@@ -306309,7 +306326,7 @@ R/W
When set, do not capture near relative jumps except near relative calls.
-Vol. 3B 17-31
+17-30 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -306432,7 +306449,7 @@ R/W
When set, indicates the entry occurred in a TSX region
-17-32 Vol. 3B
+Vol. 3B 17-31
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -306556,7 +306573,7 @@ When set, indicates either the target of the branch was mispredicted and/or the
direction (taken/non-taken) was mispredicted; otherwise, the target branch was
predicted.
-Vol. 3B 17-33
+17-32 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -306567,8 +306584,7 @@ Streamlined Freeze_LBRs_On_PMI Operation
The FREEZE_LBRS_ON_PMI feature causes the LBRs to be frozen on a hardware request for a PMI. This prevents
the LBRs from being overwritten by new branches, allowing the PMI handler to examine the control flow that
preceded the PMI generation. Architectural performance monitoring version 4 and above supports a streamlined
-FREEZE_LBRs_ON_PMI operation for PMI service routine that replaces the legacy FREEZE_LBRs_ON_PMI operation
-(see Section 17.4.7).
+FREEZE_LBRs_ON_PMI operation for PMI service routine that replaces the legacy FREEZE_LBRs_ON_PMI operation (see Section 17.4.7).
While the legacy FREEZE_LBRS_ON_PMI clears the LBR bit in the IA32_DEBUGCTL MSR on a PMI request, the
streamlined FREEZE_LBRS_ON_PMI will set the LBR_FRZ bit in IA32_PERF_GLOBAL_STATUS. Branches will not
cause the LBRs to be updated when LBR_FRZ is set. Software can clear LBR_FRZ at the same time as it clears overflow bits by setting the LBR_FRZ bit as well as the needed overflow bit when writing to
@@ -306633,7 +306649,7 @@ IA32_MISC_ENABLE MSR — Indicates that the processor provides the BTS facilitie
Last branch record top-of-stack (TOS) pointer — The TOS Pointer MSR contains a 2-bit pointer (0-3) to
-the MSR in the LBR stack that contains the most recent branch, interrupt, or exception recorded for the Pentium
+the MSR in the LBR stack that contains the most recent branch, interrupt, or exception recorded for the
Last branch record (LBR) stack — The LBR stack is a circular stack that consists of four MSRs
(MSR_LASTBRANCH_0 through MSR_LASTBRANCH_3) for the Pentium 4 and Intel Xeon processor family
@@ -306642,13 +306658,13 @@ Last branch record (LBR) stack — The LBR stack is a circular stack that consis
MSR_LASTBRANCH_0_TO_IP through MSR_LASTBRANCH_15_TO_IP) for the Pentium 4 and Intel Xeon
processor family [CPUID family 0FH, model 03H].
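Because the LBR stack is circular, software that wants the records from newest to oldest can walk backward from the TOS index and wrap around. A minimal Python sketch, assuming the stack size is a power of two (4 or 16 for the families above) and that entries older than the TOS sit at decreasing indices; lbr_read_order is a hypothetical helper, and reading the MSRs themselves is outside its scope.

```python
def lbr_read_order(tos_msr_value, stack_size):
    """Return LBR stack indices ordered from most recent to oldest record."""
    if stack_size <= 0 or stack_size & (stack_size - 1):
        raise ValueError("stack_size is assumed to be a power of two")
    mask = stack_size - 1
    # Only the low bits of the TOS MSR form the pointer (2 bits for 4 entries,
    # 4 bits for 16 entries); higher bits are ignored here.
    tos = tos_msr_value & mask
    # Step backward from the TOS entry, wrapping around the circular stack.
    return [(tos - age) & mask for age in range(stack_size)]
```

With a 4-entry stack and a TOS value of 1, the indices are visited in the order 1, 0, 3, 2.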
-17-34 Vol. 3B
+Vol. 3B 17-33
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
-4 and Intel Xeon processor family [CPUID family 0FH, models 0H-02H]. This pointer becomes a 4-bit pointer
-(0-15) for the Pentium 4 and Intel Xeon processor family [CPUID family 0FH, model 03H]. See also: Table
-17-17, Figure 17-12, and Section 17.13.2, “LBR Stack for Processors Based on Intel NetBurst® Microarchitecture.”
+Pentium 4 and Intel Xeon processor family [CPUID family 0FH, models 0H-02H]. This pointer becomes a 4-bit
+pointer (0-15) for the Pentium 4 and Intel Xeon processor family [CPUID family 0FH, model 03H]. See also:
+Table 17-17, Figure 17-12, and Section 17.13.2, “LBR Stack for Processors Based on Intel NetBurst® Microarchitecture.”
@@ -306722,7 +306738,7 @@ BTS_OFF_USR (disable ring 0 branch trace store) flag (bit 6) — When set, enabl
skip sending/logging non-CPL_0 BTMs to the memory-resident BTS buffer. See Section 17.13.2, “LBR Stack for
Processors Based on Intel NetBurst® Microarchitecture.”
-Vol. 3B 17-35
+17-34 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -306786,10 +306802,10 @@ is the linear address of the next instruction to be executed upon returning from
Exception — If the record is for an exception, the “from” address is the linear address of the instruction that
-caused the exception to be generated and the “to” address is the address of the first instruction in the exception
-handler routine.
+caused the exception to be generated and the “to” address is the address of the first instruction in the
+exception handler routine.
-17-36 Vol. 3B
+Vol. 3B 17-35
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -306831,8 +306847,8 @@ To Linear Address
Figure 17-13. LBR MSR Branch Record Layout for the Pentium 4
and Intel Xeon Processor Family
Additional information is saved if an exception or interrupt occurs in conjunction with a branch instruction. If a
-branch instruction generates a trap type exception, two branch records are stored in the LBR stack: a branch
-record for the branch instruction followed by a branch record for the exception.
+branch instruction generates a trap type exception, two branch records are stored in the LBR stack: a branch record
+for the branch instruction followed by a branch record for the exception.
If a branch instruction is immediately followed by an interrupt, a branch record is stored in the LBR stack for the
branch instruction followed by a record for the interrupt.
@@ -306868,7 +306884,8 @@ exception being generated) in the last branch record (LBR) stack. For more infor
Branch Record (LBR) Stack” below.
— BTF (single-step on branches) flag (bit 1) — When set, the processor treats the TF flag in the EFLAGS
register as a “single-step on branches” flag rather than a “single-step on instructions” flag. This mechanism
-Vol. 3B 17-37
+
+17-36 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -306913,8 +306930,7 @@ at 40H). See Figure 17-15.
-Last Branch Record Top-of-Stack (TOS) Pointer — The TOS Pointer MSR contains a 3-bit pointer (bits 2-0)
-to the MSR in the LBR stack that contains the most recent branch, interrupt, or exception recorded. For Intel
+Last Branch Record Top-of-Stack (TOS) Pointer — The TOS Pointer MSR contains a 3-bit pointer (bits 2-0) to the MSR in the LBR stack that contains the most recent branch, interrupt, or exception recorded. For Intel
Core Solo and Intel Core Duo processors, this MSR is located at register address 01C9H.
For compatibility, the Intel Core Solo and Intel Core Duo processors provide two 32-bit MSRs (the
@@ -306937,7 +306953,7 @@ From Linear Address
Figure 17-15. LBR Branch Record Layout for the Intel Core Solo
and Intel Core Duo Processor
-17-38 Vol. 3B
+Vol. 3B 17-37
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -306998,7 +307014,7 @@ Debug store (DS) feature flag (bit 21), returned by the CPUID instruction — In
processor provides the debug store (DS) mechanism, which allows BTMs to be stored in a memory-resident
BTS buffer. See Section 17.4.5, “Branch Trace Store (BTS).”
-Vol. 3B 17-39
+17-38 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -307010,9 +307026,8 @@ Processors, these pairs are located at register addresses 040H-047H. See Figure
-Last Branch Record Top-of-Stack (TOS) Pointer — The TOS Pointer MSR contains a 3-bit pointer (bits 2-0)
-to the MSR in the LBR stack that contains the most recent branch, interrupt, or exception recorded. For Pentium
-M Processors, this MSR is located at register address 01C9H.
+Last Branch Record Top-of-Stack (TOS) Pointer — The TOS Pointer MSR contains a 3-bit pointer (bits 2-0) to the MSR in the LBR stack that contains the most recent branch, interrupt, or exception recorded. For
+Pentium M Processors, this MSR is located at register address 01C9H.
MSR_LASTBRANCH_0
@@ -307061,7 +307076,7 @@ MSRs) for the last branch and the last exception or interrupt taken by the proce
being generated. The processor clears this flag whenever a debug exception, such as an instruction or data
breakpoint or single-step trap occurs.
-17-40 Vol. 3B
+Vol. 3B 17-39
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -307112,7 +307127,8 @@ last branch, interrupt, or exception that the processor took prior to a debug ex
branch occurs, the processor loads the address of the branch instruction into the LastBranchFromIP MSR and loads
the target address for the branch into the LastBranchToIP MSR.
When an interrupt or exception occurs (other than a debug exception), the address of the instruction that was
-interrupted by the exception or interrupt is loaded into the LastBranchFromIP MSR and the address of the exception or interrupt handler that is called is loaded into the LastBranchToIP MSR.
+interrupted by the exception or interrupt is loaded into the LastBranchFromIP MSR and the address of the exception
+or interrupt handler that is called is loaded into the LastBranchToIP MSR.
The LastExceptionToIP and LastExceptionFromIP MSRs (also 32-bit registers) record the instruction pointers for
the last branch that the processor took prior to an exception or interrupt being generated. When an exception or
interrupt occurs, the contents of the LastBranchToIP and LastBranchFromIP MSRs are copied into these registers
@@ -307126,13 +307142,12 @@ records for the Pentium 4 and Intel Xeon processors.
Monitoring Branches, Exceptions, and Interrupts
-When the LBR flag in the DEBUGCTLMSR register is set, the processor automatically begins recording branches
-that it takes, exceptions that are generated (except for debug exceptions), and interrupts that are serviced. Each
-time a branch, exception, or interrupt occurs, the processor records the to and from instruction pointers in the
-LastBranchToIP and LastBranchFromIP MSRs. In addition, for interrupts and exceptions, the processor copies the
+When the LBR flag in the DEBUGCTLMSR register is set, the processor automatically begins recording branches that
+it takes, exceptions that are generated (except for debug exceptions), and interrupts that are serviced. Each time
+a branch, exception, or interrupt occurs, the processor records the to and from instruction pointers in the LastBranchToIP and LastBranchFromIP MSRs. In addition, for interrupts and exceptions, the processor copies the
contents of the LastBranchToIP and LastBranchFromIP MSRs into the LastExceptionToIP and LastExceptionFromIP
MSRs prior to recording the to and from addresses of the interrupt or exception.
-Vol. 3B 17-41
+17-40 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -307143,9 +307158,8 @@ addresses of the last branch prior to an interrupt or exception are retained in
The debugger can use the last branch, interrupt, and/or exception addresses in combination with code-segment
selectors retrieved from the stack to reset breakpoints in the breakpoint-address registers (DR0 through DR3),
allowing a backward trace from the manifestation of a particular bug toward its source. Because the instruction
-pointers recorded in the LastBranchToIP, LastBranchFromIP, LastExceptionToIP, and LastExceptionFromIP MSRs are
-offsets into a code segment, software must determine the segment base address of the code segment associated
-with the control transfer to calculate the linear address to be placed in the breakpoint-address registers. The
+pointers recorded in the LastBranchToIP, LastBranchFromIP, LastExceptionToIP, and LastExceptionFromIP MSRs
+are offsets into a code segment, software must determine the segment base address of the code segment associated with the control transfer to calculate the linear address to be placed in the breakpoint-address registers. The
segment base address can be determined by reading the segment selector for the code segment from the stack
and using it to locate the segment descriptor for the segment in the GDT or LDT. The segment base address can
then be read from the segment descriptor.
@@ -307204,7 +307218,7 @@ The specific processor configuration determines the behavior. Constant TSC behav
of each clock tick is uniform and supports the use of the TSC as a wall clock timer even if the processor core
changes frequency. This is the architectural behavior moving forward.
-17-42 Vol. 3B
+Vol. 3B 17-41
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -307225,10 +307239,11 @@ emulate the instruction through a user-accessible programming interface.
The RDTSC instruction is not serializing or ordered with other instructions. It does not necessarily wait until all
previous instructions have been executed before reading the counter. Similarly, subsequent instructions may begin
execution before the RDTSC instruction operation is performed.
-The RDMSR and WRMSR instructions read and write the time-stamp counter, treating the time-stamp counter as
-an ordinary MSR (address 10H). In the Pentium 4, Intel Xeon, and P6 family processors, all 64-bits of the timestamp counter are read using RDMSR (just as with RDTSC). When WRMSR is used to write the time-stamp counter
-on processors before family [0FH], models [03H, 04H]: only the low-order 32-bits of the time-stamp counter can
-be written (the high-order 32 bits are cleared to 0). For family [0FH], models [03H, 04H, 06H]; for family [06H],
+The RDMSR and WRMSR instructions read and write the time-stamp counter, treating the time-stamp counter as an
+ordinary MSR (address 10H). In the Pentium 4, Intel Xeon, and P6 family processors, all 64-bits of the time-stamp
+counter are read using RDMSR (just as with RDTSC). When WRMSR is used to write the time-stamp counter on
+processors before family [0FH], models [03H, 04H]: only the low-order 32-bits of the time-stamp counter can be
+written (the high-order 32 bits are cleared to 0). For family [0FH], models [03H, 04H, 06H]; for family [06H],
model [0EH, 0FH]; for family [06H], DisplayModel [17H, 1AH, 1CH, 1DH]: all 64 bits are writable.
17.17.1
@@ -307258,7 +307273,7 @@ controlled by CR4.TSD (Time Stamp Disable flag).
User mode software can use RDTSCP to detect if CPU migration has occurred between successive reads of the TSC.
It can also be used to adjust for per-CPU differences in TSC values in a NUMA system.
-Vol. 3B 17-43
+17-42 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -307324,7 +307339,7 @@ L3 cache monitoring (currently the last level cache in most server platforms).
Memory Bandwidth Monitoring (MBM), introduced in the Intel® Xeon® processor E5 v4 family, builds on the CMT
infrastructure to allow monitoring of bandwidth from one level of the cache hierarchy to the next - in this case
2. IA32_TSC_ADJUST MSR and the TSC-offset field in the VM execution controls of VMCS are some of the common interfaces that privileged software can use to manage the time stamp counter for keeping time
-17-44 Vol. 3B
+Vol. 3B 17-43
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -307401,10 +307416,10 @@ controllers on the same package).
Enabling Monitoring: Usage Flow
-Figure 17-19 illustrates the key steps for OS/VMM to detect support of shared resource monitoring features such
-as CMT and enable resource monitoring for available resource types and monitoring events.
+Figure 17-19 illustrates the key steps for OS/VMM to detect support of shared resource monitoring features such as
+CMT and enable resource monitoring for available resource types and monitoring events.
-Vol. 3B 17-45
+17-44 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -307450,8 +307465,9 @@ processor provides the following programming interfaces for shared resource moni
CPUID leaf function 0FH (Shared Resource Monitoring Enumeration leaf) provides information on available
-resource types (see Section 17.18.4), and monitoring capabilities for each resource type (see Section 17.18.5).
-Note CMT and MBM capabilities are enumerated as separate event vectors using shared enumeration infrastructure under a given resource type.
+resource types (see Section 17.18.4), and monitoring capabilities for each resource type (see Section
+17.18.5). Note CMT and MBM capabilities are enumerated as separate event vectors using shared enumeration
+infrastructure under a given resource type.
@@ -307492,8 +307508,7 @@ enumeration data:
Monitoring leaf sub-function 0 enumerates available resources that support monitoring, i.e. executing CPUID
with EAX=0FH and ECX=0H. In the initial implementation, L3 cache is the only resource type available. Each
-
-17-46 Vol. 3B
+Vol. 3B 17-45
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -307501,8 +307516,8 @@ supported resource type is represented by a bit in CPUID.(EAX=0FH, ECX=0):EDX[31
corresponds to the sub-leaf index (ResID) that software must use to query details of the monitoring capability
of that resource type (see Figure 17-21 and Figure 17-22). Reserved bits of CPUID.(EAX=0FH,
ECX=0):EDX[31:2] correspond to unsupported sub-leaves of the CPUID.0FH leaf. Additionally,
-CPUID.(EAX=0FH, ECX=0H):EBX reports the highest RMID value of any resource type that supports
-monitoring in the processor.
+CPUID.(EAX=0FH, ECX=0H):EBX reports the highest RMID value of any resource type that supports monitoring
+in the processor.
CPUID.(EAX=0FH, ECX=0H) Output: (EAX: Reserved; ECX: Reserved)
31
@@ -307564,7 +307579,7 @@ The raw numerical value reported from IA32_QM_CTR can be converted to the final
bandwidth in bytes per sampled time period) by multiplying the counter value by the value from CPUID.(EAX=0FH,
ECX=1H).EBX, see Figure 17-21.
-Vol. 3B 17-47
+17-46 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -307608,10 +307623,9 @@ platforms, this represents memory bandwidth.
CPUID.(EAX=0FH, ECX=1H).EDX[bit 2]: indicates L3 local memory bandwidth monitoring event is supported if
set. This event monitors the L3 external bandwidth satisfied by the local memory. In most platforms that
-support this event, L3 requests are likely serviced by a memory system with non-uniform memory architecture.
-This allows bandwidth to off-package memory resources to be tracked by subtracting local from total bandwidth
-(for instance, bandwidth over QPI to a memory controller on another physical processor could be tracked by
-subtraction).
+support this event, L3 requests are likely serviced by a memory system with non-uniform memory architecture. This allows bandwidth to off-package memory resources to be tracked by subtracting local from total
+bandwidth (for instance, bandwidth over QPI to a memory controller on another physical processor could be
+tracked by subtraction).
The corresponding Event ID can be looked up from Table 17-18. The L3 bandwidth data accumulated in
IA32_QM_CTR can be converted to total bandwidth (in bytes) using CPUID.(EAX=0FH, ECX=1H).EBX.
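
A small sketch of the conversion described in this hunk, assuming the raw IA32_QM_CTR values and the upscaling factor from CPUID.(EAX=0FH, ECX=1H).EBX have already been read by privileged code and are passed in as integers. The function names are illustrative only.

```python
def counter_to_bytes(raw_count: int, upscaling_factor: int) -> int:
    """Convert a raw IA32_QM_CTR value to bytes using the CPUID-reported factor."""
    return raw_count * upscaling_factor


def remote_bandwidth_bytes(total_raw: int, local_raw: int, upscaling_factor: int) -> int:
    """Estimate off-package bandwidth by subtracting local from total bandwidth."""
    return counter_to_bytes(total_raw - local_raw, upscaling_factor)
```

Both raw values should come from the same RMID and the same sampling period for the subtraction to be meaningful.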
@@ -307656,7 +307670,7 @@ first step is to associate a given software thread (or multiple threads as part
Note that the process of associating an RMID with a given software thread is the same for all shared resource monitoring features (CMT, MBM), and a given RMID number has the same meaning from the viewpoint of any logical
processors in a package. Stated another way, a thread may be associated in a 1:1 mapping with an RMID, and that
-17-48 Vol. 3B
+Vol. 3B 17-47
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -307725,7 +307739,7 @@ Software can program an RMID / Event ID pair into the IA32_QM_EVTSEL MSR bit fie
read a particular counter for a given resource. The currently supported list of Monitoring Event IDs is discussed
in Section 17.18.5, which covers feature-specific details.
-Vol. 3B 17-49
+17-48 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -307824,7 +307838,7 @@ logarithm of the total cache size divided by the Upscaling Factor from CPUID.
In Memory Bandwidth Monitoring the initial counter size is 24 bits, and retrieving the value at 1Hz or faster is sufficient to ensure at most one rollover per sampling period. Any future changes to counter width will be enumerated
to software.
-17-50 Vol. 3B
+Vol. 3B 17-49
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
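
Because the MBM counter described in the hunk above is initially 24 bits wide, software that samples it has to tolerate a rollover between two reads. A minimal sketch, assuming at most one rollover per sampling period (which the text says holds when sampling at 1 Hz or faster); the function name and default width parameter are illustrative.

```python
def counter_delta(previous: int, current: int, width_bits: int = 24) -> int:
    """Return the increase between two counter reads, allowing for one rollover."""
    return (current - previous) % (1 << width_bits)
```

If a future processor enumerates a different counter width, only `width_bits` needs to change.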
@@ -307846,7 +307860,8 @@ out of L3.
17.18.8.3 Monitoring Operation with Other Operating Modes
The states in IA32_PQR_ASSOC and the monitoring counters are unmodified across an SMI delivery. Thus, the execution of SMM handler code and SMM handler’s data can manifest as spurious contribution in the monitored data.
-It is possible for an SMM handler to minimize the impact of spurious contribution in the QOS monitoring counters by reserving a dedicated RMID for monitoring the SMM handler. Such an SMM handler can save the previously
+It is possible for an SMM handler to minimize the impact of spurious contribution in the QOS monitoring counters
+by reserving a dedicated RMID for monitoring the SMM handler. Such an SMM handler can save the previously
configured QOS Monitoring state immediately upon entering SMM, and restore the QOS monitoring state back to
the pre-SMM RMID upon exit.
@@ -307880,7 +307895,7 @@ below for OS/Hypervisor with respect to ring 3 software and virtual guests). Dep
or L3 cache allocation capability may be provided, and the technology is designed to scale across multiple cache
levels and technology generations.
-Vol. 3B 17-51
+17-50 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -307959,7 +307974,7 @@ of COS tags per resource for instance, the COS management overhead is constant.
bitmask associated with that class. Bitmasks are configured via the IA32_resourceType_MASK_n MSRs, where
resourceType indicates a resource type (e.g. “L3” for the L3 cache) and “n” indicates a COS number.
The basic ingredients of Cache Allocation Technology are as follows:
-17-52 Vol. 3B
+Vol. 3B 17-51
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -307990,7 +308005,8 @@ Implementation-dependent mechanisms to indicate which COS is associated with a m
enforce the cache allocation on a per COS basis.
A capacity bitmask (CBM) provides a hint to the hardware indicating the cache space an application should be
-limited to as well as providing an indication of overlap and isolation in the CAT-capable cache from other applications contending for the cache. The bit length of the capacity mask available generally depends on the configuration of the cache and is specified in the enumeration process for CAT in CPUID (this may vary between models in a
+limited to as well as providing an indication of overlap and isolation in the CAT-capable cache from other applications contending for the cache. The bit length of the capacity mask available generally depends on the configuration
+of the cache and is specified in the enumeration process for CAT in CPUID (this may vary between models in a
processor family as well). Similarly, other parameters such as the number of supported COS may vary for each
resource type, and these details can be enumerated via CPUID.
@@ -308177,7 +308193,7 @@ contiguous '1's (including zero) will result in a general protection fault (#GP(
way-based implementations, one capacity mask bit corresponds to some number of ways in cache, but the specific
mapping is implementation-dependent. In all cases, a mask bit set to '1' specifies that a particular Class of Service
can allocate into the cache subset represented by that bit. A value of '0' in a mask bit specifies that a Class of
-Vol. 3B 17-53
+17-52 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
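
The hunk above notes that a capacity bitmask without contiguous '1's (including an all-zero mask) causes #GP(0). A pre-write check could be sketched as follows, assuming the CBM length has already been obtained from CPUID; the function name is hypothetical.

```python
def is_valid_cbm(mask: int, cbm_len: int) -> bool:
    """Return True if mask is non-zero, fits in cbm_len bits, and its set bits are contiguous."""
    if mask <= 0 or mask >> cbm_len:
        return False
    # Shift out trailing zeros; a contiguous run then has the form 2**k - 1.
    lowest_set_index = (mask & -mask).bit_length() - 1
    run = mask >> lowest_set_index
    return (run & (run + 1)) == 0
```

Checking in software avoids taking the fault on the WRMSR; it does not replace the hardware check.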
@@ -308188,9 +308204,9 @@ as 8-bit vectors, though this may vary depending on the implementation and how t
COS are implementation-dependent) have full access to the cache. The second case shows an overlapped case,
which would allow some lower-priority threads to share cache space with the highest priority threads. The third case
shows various non-overlapped partitioning schemes. As a matter of software policy for extensibility COS0 should
-typically be considered and configured as the highest priority COS, followed by COS1, and so on, though there is no
-hardware restriction enforcing this mapping. When the system boots all threads are initialized to COS0, which has
-full access to the cache by default.
+typically be considered and configured as the highest priority COS, followed by COS1, and so on, though there is
+no hardware restriction enforcing this mapping. When the system boots all threads are initialized to COS0, which
+has full access to the cache by default.
Though the representation of the CBMs looks similar to a way-based mapping they are independent of any specific
enforcement implementation (e.g. way partitioning.) Rather, this is a convenient manner to represent capacity,
overlap and isolation of cache space. For example, executing a POPCNT instruction (population count of set bits) on
@@ -308271,7 +308287,7 @@ can be programmed via the global set of CAT configuration registers (in the case
IA32_L3_MASK_n MSRs, where “n” is the Class of Service, starting from zero). In all architectural implementations
supporting CPUID it is possible to change the CBMs dynamically, during program execution, unless stated otherwise by Intel.
The currently running application's Class of Service is communicated to the hardware through the per-logical-processor PQR MSR (IA32_PQR_ASSOC MSR). When the OS schedules an application thread on a logical processor,
-17-54 Vol. 3B
+Vol. 3B 17-53
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -308306,7 +308322,6 @@ By default, CDP is disabled on the processor. If the CAT MSRs are used without e
the CAT mask MSRs are re-mapped into interleaved pairs of mask MSRs for data or code fetches (see
-2
Figure 17-29),
@@ -308325,7 +308340,7 @@ where CDP is enabled, and each COS number maps 1:2 to two masks, one for code an
code and data to be either overlapped or isolated to varying degrees either globally or on a per-COS basis,
depending on application and system needs.
-Vol. 3B 17-55
+17-54 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -308573,8 +308588,6 @@ COS1.Data
0
COS1.Code
-Other COS.Data
-Other COS.Code
0
@@ -308672,13 +308685,16 @@ Other COS.Code
1
+Other COS.Data
+Other COS.Code
+
CAT with
CDP
Figure 17-29. Code and Data Capacity Bitmasks of CDP
When CDP is enabled, the existing mask space for CAT-only operation is split. As an example if the system supports
16 CAT-only COS, when CDP is enabled the same MSR interfaces are used, however half of the masks correspond
-to code, half correspond to data, and the effective number of COS is reduced by half. Code/Data masks are defined
+to code, half correspond to data, and the effective number2of COS is reduced by half. Code/Data masks are defined
per-COS and interleaved in the MSR space as described in subsequent sections.
In cases where CPUID exposes a non-even number of supported Classes of Service for the CAT or CDP features,
software using CDP should use the lower matched pairs of code/data masks, and any upper unpaired masks should
@@ -308693,7 +308709,7 @@ Enabling Cache Allocation Technology Usage Flow
Figure 17-30 illustrates the key steps for OS/VMM to detect support of Cache Allocation Technology and enable
priority-based resource allocation for a CAT-capable resource.
-17-56 Vol. 3B
+Vol. 3B 17-55
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -308762,8 +308778,8 @@ same level as the L3 cache). Software may determine which logical processors sha
to a core, or shared across multiple cores) by performing a write to one of these MSRs and noting which logical
threads observe the change. Example flows for a similar method to determine register scope are described in
Section 15.5.2, “System Software Recommendation for Managing CMCI and Machine Check Resources”.
-Software may also use CPUID leaf 4 to determine the maximum number of logical processor IDs that may
-share a given level of the cache.
+Software may also use CPUID leaf 4 to determine the maximum number of logical processor IDs that may share
+a given level of the cache.
@@ -308779,7 +308795,8 @@ CPUID leaf function 10H (Cache Allocation Technology Enumeration leaf) provides
CAT Enumeration leaf sub-function 0 enumerates available resource types that support allocation control, i.e.
by executing CPUID with EAX=10H and ECX=0H. Each supported resource type is represented by a bit field in
-Vol. 3B 17-57
+
+17-56 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -308884,7 +308901,7 @@ minus-one notation, i.e. a value of 15 corresponds to the capability bitmask hav
— CPUID.(EAX=10H, ECX=1):EBX[31:0] reports a bit mask. Each set bit within the length of the CBM
indicates the corresponding unit of the L3 allocation may be used by other entities in the platform (e.g. an
-17-58 Vol. 3B
+Vol. 3B 17-57
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -308964,7 +308981,7 @@ may result if COS are migrated frequently. This is aligned with the industry-sta
caches after a migration. In general, for best performance, minimize thread migration and COS migration across
processor logical threads and processor cores.
-Vol. 3B 17-59
+17-58 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -308975,7 +308992,8 @@ corresponding IA32_resourceType_MASK_n register, where 'n' corresponds to a numb
range of COS, i.e. the range between 0 and CPUID.(EAX=10H, ECX=ResID):EDX[15:0], inclusive, and
'resourceType' corresponds to a specific resource as enumerated by the set bits of CPUID.(EAX=10H,
ECX=0):EAX[31:1], for instance, ‘L2’ or ‘L3’ cache.
-A hierarchy of MSRs is reserved for Cache Allocation Technology registers of the form IA32_resourceType_MASK_n:
+A hierarchy of MSRs is reserved for Cache Allocation Technology registers of the form
+IA32_resourceType_MASK_n:
@@ -309024,9 +309042,9 @@ Figure 17-34. IA32_PQR_ASSOC, IA32_L3_MASK_n MSRs
-Within the same CAT range hierarchy, another set of registers is defined for resourceType 'L2', corresponding to
-the L2 cache in a platform, and MSRs IA32_L2_MASK_n are defined for n=[0,63] at addresses 0D10H through
-0D4FH (inclusive).
+Within the same CAT range hierarchy, another set of registers is defined for resourceType 'L2', corresponding
+to the L2 cache in a platform, and MSRs IA32_L2_MASK_n are defined for n=[0,63] at addresses 0D10H
+through 0D4FH (inclusive).
Figure 17-34 and Figure 17-35 provide an overview of the relevant registers.
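
The address range quoted in this hunk implies a simple base-plus-index mapping for the L2 mask MSRs. A minimal sketch, with an illustrative function name:

```python
IA32_L2_MASK_BASE = 0xD10  # address of IA32_L2_MASK_0, from the range 0D10H through 0D4FH


def l2_mask_msr(n: int) -> int:
    """Return the MSR address of IA32_L2_MASK_n for n in [0, 63]."""
    if not 0 <= n <= 63:
        raise ValueError("COS number must be in the range 0..63")
    return IA32_L2_MASK_BASE + n
```

The architectural range covers 64 masks; the number actually usable on a given processor is whatever CPUID enumerates.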
@@ -309060,11 +309078,11 @@ Allocation Features”.
17.19.4.4 Class of Service to Cache Mask Association: Common Across Allocation Features
After configuring the available classes of service with the preferred set of capacity bitmasks, the OS/VMM can set
the IA32_PQR_ASSOC.COS of a logical processor to the class of service with the desired CBM when a thread
-context switch occurs. This allows the OS/VMM to indicate which class of service an executing thread/VM belongs
-17-60 Vol. 3B
+Vol. 3B 17-59
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
+context switch occurs. This allows the OS/VMM to indicate which class of service an executing thread/VM belongs
within. Each logical processor contains an instance of the IA32_PQR_ASSOC register at MSR location 0C8FH, and
Figure 17-34 shows the bit field layout for this register. Bits[63:32] contain the COS field for each logical processor.
Note that placing the RMID field within the same PQR register enables both RMID and CLOS to be swapped at
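
Since the COS field occupies bits 63:32 of IA32_PQR_ASSOC and the RMID shares the same register, both can be composed into one value for a single WRMSR. A hedged sketch: it assumes the RMID sits in the low-order bits starting at bit 0 (its exact width is not stated in this excerpt), and the helper names are illustrative.

```python
IA32_PQR_ASSOC = 0xC8F  # MSR address given in the text


def pqr_assoc_value(rmid: int, cos: int) -> int:
    """Compose a 64-bit IA32_PQR_ASSOC value from an RMID and a class of service."""
    if not 0 <= cos < (1 << 32):
        raise ValueError("COS must fit in bits 63:32")
    if not 0 <= rmid < (1 << 32):
        raise ValueError("RMID must fit in the low 32 bits")
    return (cos << 32) | rmid


def pqr_assoc_cos(value: int) -> int:
    """Extract the COS field (bits 63:32) from an IA32_PQR_ASSOC value."""
    return (value >> 32) & 0xFFFFFFFF
```

Writing both fields at once is what lets a context switch update monitoring and allocation state together.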
@@ -309121,7 +309139,7 @@ socket). Refer to Section 17.19.7 for software considerations while enabling or
When CDP is enabled, the existing CAT mask MSR space is re-mapped to provide a code mask and a data mask per
COS. The re-mapping is shown in Table 17-19.
-Vol. 3B 17-61
+17-60 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
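
The re-mapping under CDP can be illustrated with a short sketch. Table 17-19 is not reproduced in this excerpt, so the ordering used here (data mask at index 2n, code mask at index 2n+1) is an assumption to be checked against that table; the function names are hypothetical.

```python
def cdp_effective_cos(cat_cos_count: int) -> int:
    """Number of usable COS with CDP enabled: half the CAT-only count, unpaired masks dropped."""
    return cat_cos_count // 2


def cdp_mask_indices(cos: int) -> tuple:
    """Return (data_mask_index, code_mask_index) for a COS under CDP.

    Assumption: data masks use even indices and code masks use odd indices.
    """
    return (2 * cos, 2 * cos + 1)
```

With 16 CAT-only COS this yields 8 CDP classes, and an odd count such as 15 yields 7 matched pairs.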
@@ -309239,7 +309257,7 @@ L2 CDP Enable
Figure 17-37. Layout of IA32_L2_QOS_CFG
-17-62 Vol. 3B
+Vol. 3B 17-61
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -309301,7 +309319,7 @@ When reading an IA32_resourceType_MASK_n register the current capacity bit mask
As noted previously, software should minimize migrations of COS across logical processors (across threads or
cores), as a reduction in the accuracy of the Cache Allocation feature may result if COS are migrated frequently.
-Vol. 3B 17-63
+17-62 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -309323,8 +309341,9 @@ cache of a portion thereof may be powered off. Upon resuming an active state any
will be filled subject to the cache capacity bitmasks. Any data in the cache prior to the cache shrink or power off
may have been flushed to memory during the process of entering the idle state, however, and is not guaranteed to
remain in the cache. If differentiation between threads is the goal of system software then this model allows
-substantial power savings while continuing to deliver performance differentiation. If system software needs optimal
-determinism then power saving modes which flush portions of the caches and power them off should be disabled.
+substantial power savings while continuing to deliver performance differentiation. If system software needs
+optimal determinism then power saving modes which flush portions of the caches and power them off should be
+disabled.
NOTE
IA32_PQR_ASSOC is saved and restored across C6 entry/exit. Similarly, the mask register contents
@@ -309355,17 +309374,17 @@ When CDP is enabled,
— Two mask sets exist for each COS number, one for code, one for data.
— Masks for code/data are interleaved in the MSR address space (see Table 17-19).
+Vol. 3B 17-63
+
+ DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
+
17.19.7
Introduction to Memory Bandwidth Allocation
The Memory Bandwidth Allocation (MBA) feature provides indirect and approximate control over memory bandwidth available per-core, and was introduced on the Intel Xeon Processor Scalable Family. This feature provides a
-17-64 Vol. 3B
-
- DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
-
-method to control applications which may be over-utilizing bandwidth relative to their priority in environments
-such as the data-center.
+method to control applications which may be over-utilizing bandwidth relative to their priority in environments such
+as the data-center.
The MBA feature uses existing constructs from the Resource Director Technology (RDT) feature set including
Classes of Service (CLOS). A given CLOS used for L3 CAT for instance means the same thing as a CLOS used for
MBA. Infrastructure such as the MSR used to associate a thread with a CLOS (the IA32_PQR_ASSOC_MSR) and
@@ -309419,12 +309438,12 @@ control is needed, the Memory Bandwidth Monitoring (MBM) feature can be used as
which makes decisions about the MBA throttling level to apply.
Enumeration and configuration details are discussed below followed by usage model considerations.
-17.19.7.1 Memory Bandwidth Allocation Enumeration
-Similar to other RDT features, enumeration of the presence and details of the MBA feature is provided via a subleaf of the CPUID instruction.
-Vol. 3B 17-65
+17-64 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
+17.19.7.1 Memory Bandwidth Allocation Enumeration
+Similar to other RDT features, enumeration of the presence and details of the MBA feature is provided via a subleaf of the CPUID instruction.
Key components of the enumeration are as follows.
@@ -309473,7 +309492,7 @@ common subset or enable more flexibility by selectively applying resource contro
CLOS and thread mapping. In all cases, CLOS[0] supports all RDT resource control features present on the platform.
Discussion on the interpretation and usage of the MBA delay values is provided in Section 17.19.7.2 on MBA configuration.
-17-66 Vol. 3B
+Vol. 3B 17-65
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
@@ -309547,7 +309566,7 @@ IA32_L2_QoS_Ext_BW_Thrtl_n MSRs to update the delay values applied for a specifi
CPUID.(EAX=10H, ECX=ResID=1):EDX[15:0] as described in Section 17.19.7.1. For instance, if 16 CLOS are
supported then the valid MSR range will extend from D50H through D5FH inclusive.
-Vol. 3B 17-67
+17-66 Vol. 3B
DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
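
The MSR range in the hunk above follows from a base address of D50H plus the CLOS number. A minimal sketch, assuming `cos_max` is the highest COS number reported in CPUID.(EAX=10H, ECX=3):EDX[15:0]; the function name is illustrative.

```python
MBA_THRTL_BASE = 0xD50  # address of the delay-value MSR for CLOS 0


def mba_delay_msr(clos: int, cos_max: int) -> int:
    """Return the address of the MBA delay-value MSR for the given CLOS."""
    if not 0 <= clos <= cos_max:
        raise ValueError("CLOS is outside the enumerated range")
    return MBA_THRTL_BASE + clos
```

For 16 supported CLOS (`cos_max` of 15) the last valid address is D5FH, matching the example in the text.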
@@ -309577,8 +309596,8 @@ IA32_L2_QoS_Ext_BW_Thrtl_'COS_MAX'
D50H + COS_MAX from CPUID.10H.3
The definition for the MBA delay value MSRs is provided in Figure 17-39. The lower 16 bits are used for MBA delay
-values, and values from zero to the maximum from the CPUID MBA_MAX-1 value are supported. Values outside this
-range will generate #GP(0).
+values, and values from zero to the maximum from the CPUID MBA_MAX-1 value are supported. Values outside
+this range will generate #GP(0).
If linear input throttling values are indicated by CPUID.(EAX=10H, ECX=ResID=3):ECX[bit 2] then values from
zero through the MBA_MAX field from CPUID.(EAX=10H, ECX=ResID=3):EAX[11:0] are supported as inputs. In
the linear mode the input precision is defined as 100-(MBA_MAX). For instance, if the MBA_MAX value is 90, the
@@ -309617,6 +309636,10 @@ As control is provided per processor core (the max of the delay values of the pe
care should be taken in scheduling threads so as to not inadvertently place a high-priority thread (with zero
intended MBA throttling) next to a low-priority thread (with MBA throttling intended), which would lead to inadvertent throttling of the high-priority thread.
+Vol. 3B 17-67
+
+ DEBUG, BRANCH PROFILE, TSC, AND INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) FEATURES
+
17-68 Vol. 3B
CHAPTER 18
@@ -309792,14 +309815,14 @@ Architectural performance monitoring provides a CPUID mechanism for enumerating
-Number of performance monitoring counters available in a logical processor (each IA32_PERFEVTSELx MSR is
-paired to the corresponding IA32_PMCx MSR)
+Number of performance monitoring counters available to software in a logical processor (each
+IA32_PERFEVTSELx MSR is paired to the corresponding IA32_PMCx MSR).
-Number of bits supported in each IA32_PMCx
-Number of architectural performance monitoring events supported in a logical processor
+Number of bits supported in each IA32_PMCx.
+Number of architectural performance monitoring events supported in a logical processor.
Software can use CPUID to discover architectural performance monitoring availability (CPUID.0AH). The architectural performance monitoring leaf provides an identifier corresponding to the version number of architectural
performance monitoring available in the processor.
@@ -309819,7 +309842,9 @@ event select registers. These MSRs have the following properties:
IA32_PMCx MSRs start at address 0C1H and occupy a contiguous block of MSR address space; the number of
-MSRs per logical processor is reported using CPUID.0AH:EAX[15:8].
+MSRs per logical processor is reported using CPUID.0AH:EAX[15:8]. Note that this may vary from the number
+of physical counters present on the hardware, because an agent running at a higher privilege level (e.g., a
+VMM) may not expose all counters.
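
Software that follows this guidance would size its counter usage from CPUID rather than from the microarchitecture. A hedged sketch, assuming the EAX value returned by CPUID leaf 0AH has already been obtained and is passed in as an integer. The version (bits 7:0) and counter width (bits 23:16) fields are taken from the full manual rather than from this excerpt, and the function names are illustrative.

```python
IA32_PMC0 = 0xC1  # IA32_PMCx MSRs start at 0C1H and are contiguous


def decode_perfmon_eax(eax: int) -> dict:
    """Split CPUID.0AH:EAX into the fields used to size performance monitoring."""
    return {
        "version": eax & 0xFF,                # EAX[7:0]
        "num_counters": (eax >> 8) & 0xFF,    # EAX[15:8], counters visible to software
        "counter_width": (eax >> 16) & 0xFF,  # EAX[23:16]
    }


def pmc_msr_addresses(eax: int) -> list:
    """List the IA32_PMCx MSR addresses that software may access."""
    count = decode_perfmon_eax(eax)["num_counters"]
    return [IA32_PMC0 + i for i in range(count)]
```

Under a VMM the reported count may be lower than the number of physical counters, so the list can be shorter than the hardware would otherwise allow.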
@@ -310495,12 +310520,15 @@ Vol. 3B 18-11
PERFORMANCE MONITORING
-Note: The number of general-purpose performance monitoring counters (i.e. N in Figure 18-9) can vary across
-processor generations within a processor family, across processor families, or could be different depending on
-the configuration chosen at boot time in the BIOS regarding Intel Hyper Threading Technology, (e.g. N=2 for 45
-nm Intel Atom processors; N =4 for processors based on the Nehalem microarchitecture; for processors based
-on the Sandy Bridge microarchitecture, N = 4 if Intel Hyper Threading Technology is active and N=8 if not
-active).
+NOTE
+The number of general-purpose performance monitoring counters (i.e., N in Figure 18-9) can vary
+across processor generations within a processor family, across processor families, or could be
+different depending on the configuration chosen at boot time in the BIOS regarding Intel Hyper
+Threading Technology, (e.g. N=2 for 45 nm Intel Atom processors; N =4 for processors based on
+the Nehalem microarchitecture; for processors based on the Sandy Bridge microarchitecture, N =
+4 if Intel Hyper Threading Technology is active and N=8 if not active). In addition, the number of
+counters may vary from the number of physical counters present on the hardware, because an
+agent running at a higher privilege level (e.g., a VMM) may not expose all counters.
Global Enable Controls IA32_PERF_GLOBAL_CTRL
35 34 33 32 31
@@ -310997,20 +311025,26 @@ Counting off-core response requires additional event qualification configuration
IA32_PERFEVTSELx. Two off-core response MSRs are provided to use in conjunction with specific event codes
that must be specified with IA32_PERFEVTSELx.
+NOTE
+The number of counters available to software may vary from the number of physical counters
+present on the hardware, because an agent running at a higher privilege level (e.g., a VMM) may
+not expose all counters. CPUID.0AH:EAX[15:8] reports the MSRs available to software; see Section
+18.2.1.
+
18.3.1.1.1 Processor Event Based Sampling (PEBS)
All four general-purpose performance counters, IA32_PMCx, can be used for PEBS if the performance event
supports PEBS. Software uses IA32_MISC_ENABLE[7] and IA32_MISC_ENABLE[12] to detect whether the performance monitoring facility and PEBS functionality are supported in the processor. The MSR IA32_PEBS_ENABLE
provides 4 bits that software must use to enable which IA32_PMCx overflow condition will cause the PEBS record to
be captured.
-Additionally, the PEBS record is expanded to allow latency information to be captured. The MSR
-IA32_PEBS_ENABLE provides 4 additional bits that software must use to enable latency data recording in the PEBS
-record upon the respective IA32_PMCx overflow condition. The layout of IA32_PEBS_ENABLE for processors based
-on Intel microarchitecture code name Nehalem is shown in Figure 18-15.
18-18 Vol. 3B
PERFORMANCE MONITORING
+Additionally, the PEBS record is expanded to allow latency information to be captured. The MSR
+IA32_PEBS_ENABLE provides 4 additional bits that software must use to enable latency data recording in the PEBS
+record upon the respective IA32_PMCx overflow condition. The layout of IA32_PEBS_ENABLE for processors based
+on Intel microarchitecture code name Nehalem is shown in Figure 18-15.
When a counter is enabled to capture machine state (PEBS_EN_PMCx = 1), the processor will write machine state
information to a memory buffer specified by software as detailed below. When the counter IA32_PMCx overflows
from maximum count to zero, the PEBS hardware is armed.
@@ -311115,6 +311149,19 @@ R/ESI
R15
+Vol. 3B 18-19
+
+ PERFORMANCE MONITORING
+
+Table 18-3. PEBS Record Format for Intel Core i7 Processor Family
+Byte Offset
+
+Field
+
+Byte Offset
+
+Field
+
38H
R/EDI
@@ -311139,19 +311186,6 @@ A0H
Data Source Encoding
-Vol. 3B 18-19
-
- PERFORMANCE MONITORING
-
-Table 18-3. PEBS Record Format for Intel Core i7 Processor Family
-Byte Offset
-
-Field
-
-Byte Offset
-
-Field
-
50H
R8
@@ -311550,8 +311584,8 @@ DMND_IFETCH
2
-(R/W). Counts the number of demand and DCU prefetch instruction cacheline reads. Does not count L2
-code read prefetches.
+(R/W). Counts the number of demand instruction cacheline reads and L1 instruction cacheline
+prefetches.
WB
@@ -312500,6 +312534,9 @@ thread
3
+Use CPUID to enumerate # of
+counters. See Section 18.2.1.
+
# of general-purpose
counters per core
@@ -312507,6 +312544,9 @@ counters per core
8
+Use CPUID to enumerate # of
+counters. See Section 18.2.1.
+
Counter width (R,W)
R:48, W: 32/48
@@ -312524,7 +312564,7 @@ threads)
4
Use CPUID to enumerate # of
-counters.
+counters. See Section 18.2.1.
PMI Overhead Mitigation
@@ -312556,8 +312596,6 @@ not support PEBS.
Box
Comment
-Use CPUID to enumerate # of
-counters.
Vol. 3B 18-33
@@ -312733,13 +312771,12 @@ Vol. 3B 18-35
Counter Coalescence
In processors based on Intel microarchitecture code name Sandy Bridge, each processor core implements eight
-general-purpose counters. CPUID.0AH:EAX[15:8] will report either 4 or 8 depending specific processor’s product
-features.
-If a processor core is shared by two logical processors, each logical processors can access 4 counters (IA32_PMC0-IA32_PMC3). This is the same as in the prior generation for processors based on Intel microarchitecture code name
-Nehalem.
-If a processor core is not shared by two logical processors, all eight general-purpose counters are visible, and
-CPUID.0AH:EAX[15:8] reports 8. IA32_PMC4-IA32_PMC7 occupy MSR addresses 0C5H through 0C8H. Each
-counter is accompanied by an event select MSR (IA32_PERFEVTSEL4-IA32_PERFEVTSEL7).
+general-purpose counters. CPUID.0AH:EAX[15:8] will report the number of counters visible to software.
+If a processor core is shared by two logical processors, each logical processors can access up to four counters
+(IA32_PMC0-IA32_PMC3). This is the same as in the prior generation for processors based on Intel microarchitecture code name Nehalem.
+If a processor core is not shared by two logical processors, up to eight general-purpose counters are visible. If
+CPUID.0AH:EAX[15:8] reports 8 counters, then IA32_PMC4-IA32_PMC7 would occupy MSR addresses 0C5H
+through 0C8H. Each counter is accompanied by an event select MSR (IA32_PERFEVTSEL4-IA32_PERFEVTSEL7).
If CPUID.0AH:EAX[15:8] reports 4, access to IA32_PMC4-IA32_PMC7 will cause #GP.
Writing 1s to bit positions 7:4 of IA32_PERF_GLOBAL_CTRL, IA32_PERF_GLOBAL_STATUS, or
IA32_PERF_GLOBAL_OVF_CTL will also cause #GP.
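Reviewer note on the Counter Coalescence hunk: the revised wording makes CPUID.0AH:EAX[15:8] the sole authority on how many general-purpose counters software may touch. The following is a minimal sketch of that rule, not code from the manual. It assumes the raw EAX value from CPUID leaf 0AH has already been obtained elsewhere and is passed in as an integer; the function names and the sample EAX values are hypothetical. The base addresses 0C1H (IA32_PMC0) and 186H (IA32_PERFEVTSEL0) are the architectural MSR addresses, which agree with the 0C5H through 0C8H range quoted above for IA32_PMC4-IA32_PMC7.

```python
# Architectural MSR base addresses (counters and event selects are contiguous).
IA32_PMC0 = 0x0C1
IA32_PERFEVTSEL0 = 0x186


def gp_counter_count(eax):
    """Return CPUID.0AH:EAX[15:8], the number of visible general-purpose counters."""
    return (eax >> 8) & 0xFF


def gp_counter_msrs(eax):
    """Return (IA32_PMCx, IA32_PERFEVTSELx) address pairs that are safe to access.

    Anything beyond the enumerated count is left out, since accessing it
    would raise #GP according to the text above.
    """
    count = gp_counter_count(eax)
    return [(IA32_PMC0 + i, IA32_PERFEVTSEL0 + i) for i in range(count)]


def global_ctrl_gp_mask(eax):
    """Return the low bits of IA32_PERF_GLOBAL_CTRL that may be set to 1."""
    return (1 << gp_counter_count(eax)) - 1


# Hypothetical EAX values: version 3, 48-bit counters, 8 or 4 counters visible.
EAX_EIGHT = 0x07300803
EAX_FOUR = 0x07300403

for pmc, evtsel in gp_counter_msrs(EAX_EIGHT):
    print(f"PMC at {pmc:03X}H, event select at {evtsel:03X}H")
print(f"writable GLOBAL_CTRL bits with 4 counters: {global_ctrl_gp_mask(EAX_FOUR):#x}")
```

With four counters enumerated, the mask covers only bits 3:0, which matches the statement that writing 1s to bits 7:4 of the global control MSRs causes #GP.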
@@ -312849,12 +312886,12 @@ SAMPLING Restriction
Small SAV(CountDown) value incur higher overhead than prior
generation.
+Only IA32_PM