III Floating-Point Exceptions and Specification of Elementary Arithmetic Instructions

The IEEE 754 standard is a partial specification of behavior for a variety of floating-point operations, intended to ensure consistent results across all conforming platforms. In its treatment of exceptional conditions, however, the standard allows some latitude, resulting in significant differences between compliant architectures. For example, the condition of underflow, which occurs when a non-zero result lies below the normal range of the target format, is treated by an x86 processor as a property of the rounded result, but is detected by an ARM processor before rounding. One of the six industry-standard floating-point exceptional conditions, the denormal operand, is not addressed by the IEEE standard at all.

Even within the x86 architecture, there are inconsistencies in the treatment of exceptions by different instruction sets, especially in the unmasked cases. We shall focus on the two main classes of x86 instructions:

- The
*Streaming SIMD (single instruction, multiple data) Extensions*, or*SSE*instructions, were introduced by Intel in 1999 in the Pentium III processor series and have continually expanded ever since. These instructions, which are commonly used in multimedia and graphics applications, operate on single- and double-precision data stored in 128-bit and 256-bit registers and provide a significant boost in performance by executing an operation on several sets of data elements in parallel. - The older
*x87*instructions were introduced as an extension of the 8086 architecture and were originally implemented in a floating-point coprocessor to provide hardware and microcode support for mathematical and scientific computations that had previously been performed by software. Since the introduction of SSE, the x87 instructions have not been as essential as they once were, but they remain important for numerical calculations that are sensitive to rounding error and require the higher precision and extended range provided by the 80-bit double extended precision format.

We shall also address the floating-point instructions of the ARM AArch32 architecture, which are similar to the SSE instructions but exhibit some significant differences in the handling of exceptions. For each of these three instruction sets, we shall limit our analysis to the most common elementary arithmetic operations: addition, subtraction, multiplication, division, square root, and multiply-accumulate (SSE and ARM only). For each of these operations, we present a comprehensive specification that may serve as a basis for formal verification of an implementation. Furthermore, the executable ACL2 formalizations of these specifications may be used in the testing of compliant implementations.

In the absence of exceptional conditions, the IEEE specification of the data result of an arithmetic operation, as presented at the beginning of Part II, is unambiguous and consistent with each of the architectures addressed here. As discussed in the preface to this manual, however, the formalization of this specification in the ACL2 logic, which is limited to rational arithmetic, presents a challenge in the case of the square root operation. In the next chapter, before presenting the details of these architectures, we discuss our solution to this technical problem. This chapter may be omitted by the reader who is not interested in the ACL2 formalization.

- 7 IEEE-Compliant Square Root

- 8 SSE Floating-Point Instructions
- 8.1 The SSE Control and Status Register
- 8.2 Overview of SSE Floating-Point Exceptions
- 8.3 SSE Pre-Computation Exceptions
- 8.4 SSE Post-Computation Exceptions

- 9 x87 Instructions
- 9.1 x87 Control Word
- 9.2 x87 Status Word
- 9.3 Overview of x87 Exceptions
- 9.4 x87 Pre-Computation Exceptions
- 9.5 x87 Post-Computation Exceptions

- 10 ARM AArch32 Floating-Point Instructions