14  SRT Division and Square Root

Sweeney-Robertson-Tocher (SRT) algorithms for division and square root extraction[ROB58,TOC58] are ubiquitous in contemporary microprocessor design and notoriously prone to implementation error. They are, therefore, an important application of formal verification.

This chapter is a comprehensive analysis of one such algorithm that that performs both division and square root extraction with arbitrary radix $ 2^\rho$. Since our main concern is the reliability of our results, we have ignored various well known opportunities for optimization in order to simplify the proof. In particular, our analysis is limited to the case of “maximal redundancy,” which allows all quotient digit values in the set $ \{1-2^\rho,\ldots,2^\rho-1\}$.

The algorithm generates a sequence of approximations of the desired quotient or square root. Our primary objective is to prove convergence by establishing a suitable error bound for these approximations. We do not treat the problem of rounding, which may be solved by the general methods of Chapter 6, e.g., an an application of Lemma 6.5.20.

In Section 14.1, we develop a criterion for quotient digit selection tables of arbitrarily high radix, which is proved necessary and sufficient to produce correct quotients and remainders. We also present a simple procedure that determines whether there exists a table of size $ 2^M
\times 2^N$ that meets this criterion, for given $ \rho$, $ M$, and $ N$, and another that generates such a table if possible.

One difference between the SRT algorithms for division and square root extraction is that the latter requires an initial approximation of the root to be used as input to the table. That is, the first several iterations must be performed by some other means before the table may be invoked on subsequent iterations. In Section 14.2, we establish a criterion for a table that may be used for square root extraction after $ K$ iterations and show that any table that satisfies this criterion for some $ K$ also satisfies the criterion for division. We also define computable procedures that determine the existence of a table of given dimensions that meets this criterion, and another that generates it if possible. In Section 14.3, we derive a formula for computing the entries of a “seed table” through which an approximation of the square root, accurate to $ K\rho$ bits, is derived from an approximation of the radicand.

Applying our results to the case $ \rho =3$, we find that the smallest admissible table for division is given by $ M=7$ and $ N=3$, and that the optimal parameters for square root are $ M=7$, $ N=4$, and $ K=2$.

Another complication of the square root algorithm is that the digit selection index is derived from the partial root, which changes on every iteration, instead of a fixed divisor. In Section 14.3, we show that an appropriate adjustment of the initial approximation ensures that the leading bits of the partial quotient actually remain constant. This requires an initial full-width comparison to determine whether the approximation derived from the seed table is an underestimate of the square root, which adds a cycle of latency but simplifies the table access logic without increasing the table size.

David Russinoff 2017-08-01