5.1  Classification of Formats

A floating-point format is characterized by the precision $ p$ with which representable numbers are differentiated, and the number $ q$ of bits allocated to the exponent, which determines the range of representable numbers. Some formats represent all $ p$ bits of a number's significand explicitly, but a common optimization is to omit the integer bit. Thus, a format is defined as a triple:

Definition 5.1.1   (formatp, prec, expw, sigw) A floating-point format is a triple $ F = \langle e, p, q \rangle$, where

(a) $ e$ is a boolean value indicating whether the integer bit is explicitly represented; the format is said to be explicit or implicit accordingly;

(b) $ p = \mathit{prec}(F)$ is an integer, $ p \geq 2$, the precision of $ F$;

(c) $ q = \mathit{expw}(F)$ is an integer, $ q \geq 2$, the exponent width of $ F$.


The significand width of $ F$ is $ \mathit{sigw}(F) = \left\{\begin{array}{ll}
p & \mbox{if $F$ is explicit}\\
p-1 & \mbox{if $F$ is implicit.}\end{array}\right.$

In this chapter, every “format” will be understood to be a floating-point format.

Definition 5.1.2   (encodingp) An encoding for a format $ F$ is a bit vector of width $ \mathit{expw}(F)+\mathit{sigw}(F)+1$.
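As an illustration only, Definitions 5.1.1 and 5.1.2 can be sketched in Python. The class name `Format`, its field names, and the helper `encoding_width` are assumptions made for this sketch and do not appear in the text; `formatp`, `sigw`, and `encodingp` merely borrow the names listed in the definitions, and a bit vector is modeled here as a non-negative integer.

```python
from typing import NamedTuple


class Format(NamedTuple):
    """A format <e, p, q> as in Definition 5.1.1 (field names are illustrative)."""
    explicit: bool  # e: True if the integer bit is stored explicitly
    prec: int       # p: precision, required to be >= 2
    expw: int       # q: exponent width, required to be >= 2


def formatp(f: Format) -> bool:
    """Check the constraints that Definition 5.1.1 places on the triple."""
    return isinstance(f.explicit, bool) and f.prec >= 2 and f.expw >= 2


def sigw(f: Format) -> int:
    """Significand width: p if the format is explicit, p - 1 if implicit."""
    return f.prec if f.explicit else f.prec - 1


def encoding_width(f: Format) -> int:
    """Width of an encoding: exponent bits + significand bits + 1 sign bit."""
    return f.expw + sigw(f) + 1


def encodingp(x: int, f: Format) -> bool:
    """Model a bit vector of the required width as an integer in [0, 2^width)."""
    return 0 <= x < 2 ** encoding_width(f)


single = Format(explicit=False, prec=24, expw=8)
print(sigw(single), encoding_width(single))  # prints "23 32"
```

Modeling an encoding as a bounded integer is one possible choice for the sketch; any fixed-width bit-vector type would serve equally well.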

The most common implicit formats are the IEEE basic single ($ p=24$, $ q=8$) and double precision ($ p=53$, $ q=11$) formats, at least one of which must be implemented by any IEEE-compliant floating-point unit. Explicit formats include most implementations of the single extended ($ p = 32$, $ q=11$) and double extended ($ p = 64$, $ q = 15$) formats, as well as the higher-precision formats that are typically used for internal computations in floating-point units.

We establish the following notation for the formats that are used by the x86 elementary arithmetic operations discussed in Part III:

Definition 5.1.3   (hp, sp, dp, ep) The half, single, double, and (double) extended formats are as follows:

$\displaystyle \mathit{HP} = \langle NIL, 11, 5 \rangle;
$

$\displaystyle \mathit{SP} = \langle NIL, 24, 8 \rangle;
$

$\displaystyle \mathit{DP} = \langle NIL, 53, 11 \rangle;
$

$\displaystyle \mathit{EP} = \langle T, 64, 15 \rangle.
$
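As a quick check on these definitions, the following Python sketch computes the significand width and total encoding width of each of the four formats. The tuple representation and the helper names `sigw` and `width` are assumptions of the sketch, with `False` and `True` standing in for $NIL$ and $T$.

```python
# Each format is modeled as a tuple (explicit, p, q); this layout is illustrative.
HP = (False, 11, 5)
SP = (False, 24, 8)
DP = (False, 53, 11)
EP = (True, 64, 15)


def sigw(fmt):
    """Significand width: p for an explicit format, p - 1 for an implicit one."""
    explicit, p, _ = fmt
    return p if explicit else p - 1


def width(fmt):
    """Encoding width: exponent width + significand width + 1 sign bit."""
    _, _, q = fmt
    return q + sigw(fmt) + 1


for name, fmt in [("HP", HP), ("SP", SP), ("DP", DP), ("EP", EP)]:
    print(name, sigw(fmt), width(fmt))
# prints:
# HP 10 16
# SP 23 32
# DP 52 64
# EP 64 80
```

The resulting widths of 16, 32, 64, and 80 bits match the familiar storage sizes of these formats.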

The sign, exponent, and significand fields of an encoding are defined as illustrated in Figures 5.1 and 5.2. We also define the mantissa field as the significand field without the integer bit, if present:

Definition 5.1.4   (sgnf, expf, sigf, manf) If $ x$ is an encoding for a format $ F$, then

(a) $ \mathit{sgnf}(x, F) = x[\mathit{expw}(F)+\mathit{sigw}(F)]$;

(b) $ \mathit{expf}(x, F) = x[\mathit{expw}(F)+\mathit{sigw}(F)-1:\mathit{sigw}(F)]$;

(c) $ \mathit{sigf}(x, F) = x[\mathit{sigw}(F)-1:0]$;

(d) $ \mathit{manf}(x, F) = x[\mathit{prec}(F)-2:0]$.
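The field extractors can be sketched in Python as follows, reading $x[i:j]$ as the bit slice from index $i$ down to index $j$ and $x[i]$ as a single bit, with the significand and mantissa fields extending down to bit 0. The helper `bits` and the tuple representation `(explicit, p, q)` of a format are assumptions of this sketch, and encodings are modeled as non-negative integers.

```python
def bits(x, i, j):
    """Return the bit slice x[i:j], i.e. bits i down to j of x (i >= j >= 0)."""
    return (x >> j) & ((1 << (i - j + 1)) - 1)


def sigw(fmt):
    """Significand width of a format given as (explicit, p, q)."""
    explicit, p, _ = fmt
    return p if explicit else p - 1


def sgnf(x, fmt):
    """Sign field: the single bit just above the exponent field."""
    top = fmt[2] + sigw(fmt)
    return bits(x, top, top)


def expf(x, fmt):
    """Exponent field: the q bits just above the significand field."""
    return bits(x, fmt[2] + sigw(fmt) - 1, sigw(fmt))


def sigf(x, fmt):
    """Significand field: the low sigw(F) bits."""
    return bits(x, sigw(fmt) - 1, 0)


def manf(x, fmt):
    """Mantissa field: the low p - 1 bits (significand without integer bit)."""
    return bits(x, fmt[1] - 2, 0)


SP = (False, 24, 8)
EP = (True, 64, 15)

x = 0xC0490FDB  # a single-precision encoding with sign bit set
print(sgnf(x, SP), expf(x, SP), hex(sigf(x, SP)), hex(manf(x, SP)))
# prints "1 128 0x490fdb 0x490fdb"

y = (0x3FFF << 64) | (1 << 63)  # an extended encoding with integer bit set
print(sgnf(y, EP), hex(expf(y, EP)), hex(sigf(y, EP)), manf(y, EP))
# prints "0 0x3fff 0x8000000000000000 0"
```

For an implicit format the significand and mantissa fields coincide, as the first example shows; for an explicit format they differ exactly in the integer bit, as the second shows.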

Figure 5.1: A Floating-Point Format with Implicit Integer Bit

[Figure: fields from most to least significant, with $ p = \mathit{prec}(F)$ and $ q = \mathit{expw}(F)$: sign (bit $ p+q-1$), exponent (bits $ p+q-2$ down to $ p-1$), mantissa (bits $ p-2$ down to $ 0$).]

Figure 5.2: A Floating-Point Format with Explicit Integer Bit

[Figure: fields from most to least significant, with $ p = \mathit{prec}(F)$ and $ q = \mathit{expw}(F)$: sign (bit $ p+q$), exponent (bits $ p+q-1$ down to $ p$), significand (bits $ p-1$ down to $ 0$), with the integer bit at index $ p-1$.]

The encodings for a given format are partitioned into several classes determined primarily by the exponent field, as described in the next three sections.

David Russinoff 2017-08-01