We turn now to the problem of adding two rational numbers that are encoded in a floating-point format. The following procedure represents a naive approach to the design of a floating point adder:
While a large left shift might be required (in the case of subtraction, if massive cancellation occurs), and a large right shift might be required (if the exponents are vastly different), only one of these possibilities will be realized for any given pair of inputs. Thus, an efficient adder typically includes two data paths, called the close and far paths. On the close path, the sum is computed under the assumption that subtraction is to be performed and the exponents differ by at most 1. Thus, the summands are aligned quickly, but two cycles are dedicated to Steps (4) and (5). On the far path, which handles the remaining case, Steps (1) and (2) require two cycles, but the sum is easily normalized. A concurrent analysis of the exponents determines which of these paths is actually used to produce the final result. Since each path is traversed in three cycles, the total latency of the operation is reduced from six to four.
However, in order for the operation to be effectively pipelined without duplicating the adder hardware, the addition must be performed on the same cycle along both paths; moreover, the paths must merge before the addition is performed. This is made possible by a technique known as leading one prediction, which allows the left shift (of the close path) to be determined, and in fact performed, in advance of the subtraction. Consequently, steps (4) and (5) of the close path may be executed concurrently with steps (1) and (2) of the far path. Meanwhile, the exponent analysis is performed in time to select the inputs to the adder.
The objective of leading one prediction is the location of the highest index at which a one occurs in the difference of two bit vectors. Although the precise computation of this index is in general as complex as the subtraction itself, a useful approximate solution may be obtained more quickly.
Let
and
be integers of width
, with
. We shall
compute, in constant time (independent of
and
), a positive
integer
such that
is either
or
. We begin by defining a function
that returns the
desired exponent
. We are not concerned with the efficiency
of this computation, since it is not actually implemented in our final
solution, but is used only in the proof of its correctness.
The computation of
may be thought of as a traversal of the
vectors
and
from left to right. At each stage, we examine the
bits of
and
at some index
under the assumption that
. Accordingly, one of the following four steps
is executed, resulting either in termination or a decrement of
:
The correctness of this computation is established by the following lemma.
PROOF: More generally, we shall show, by induction on
, that if
and
If
Suppose
. Then by Lemma 2.3.14,
If
, then
and by inductive hypothesis,
Suppose that
. Then
and induction yields
Thus,
and
Finally, we have
Thus, we require an efficient method for computing a number
such that
. First, we handle
the relatively simple case
. Note that the time required
for the computation is two gate delays.
PROOF: Since
it will suffice to show, according to Lemma 6.2.1, that
Using induction, we shall prove the more general result that for all
First note that by Lemmas 3.2.9, 3.1.7, and 2.2.20,
and
Suppose
. By Lemma 2.3.10,
.
If
and
, then
,
, and
by Definition 6.2.1 and inductive hypothesis,
On the other hand, if either
The next lemma covers the more complicated case
, which
requires three gate delays. Figure 6.2 contains an illustration
of this case.
PROOF: Let
. Since
,
and therefore it will suffice to show that
For the case
We proceed by induction. Let
. Note that for
,
It follows that for
| if |
|||
| if |
| if |
|||
| if |
If
, then
; hence,
and
Next, suppose
and
. If
, then
,
hence
and
Finally, suppose
and
. If
, then
,
, and