4.2  Exactness

In order for a significand, which has the form

to be represented by an -bit field, we must have for all , or equivalently, a right shift of the radix point by places must result in an integer. This motivates our definition of exactness.

Definition 4.2.1   (exactp) Let and . Then

If , then we shall say that is -exact.

The definition may be restated as follows.

Lemma 4.2.1   (exactp2) For all and , is -exact if and only if .

The next three lemmas are consequences of the observation that -exactness of is a property of .

Lemma 4.2.2   (exactp-sig) For all and , is -exact if and only if is -exact.

(minus-exactp, exactp-neg) For all and , is -exact if and only if is -exact.

PROOF: By Lemma 4.1.13,

(exactp-shift) For all , , and , is -exact if and only if is -exact.

PROOF: By Lemma 4.1.14,

It is clear that is is representable in a given field of bits, then it is also representable in any wider field.

(exactp-<=) For all , , and , if and is -exact, then is -exact.

PROOF: Since and ,

Any power of 2 has a 1-bit significand.

(exactp-2**n) Let and . If , then is -exact.

PROOF: By Lemma 4.1.4,

A bit vector is exact with respect to its width.

(bvecp-exactp) Let and . If is a bit vector of width , then is -exact.

PROOF: Since 0 is -exact for all , we may assume , and therefore . Now since , Lemma 4.1.3 implies , and hence

We have the following formula for the exactness of a product.

(exactp-prod) Let , , , and . If is -exact and is -exact, then is -exact.

PROOF: If and are integers, then by Lemma 4.1.15, so is

In particular, if is -exact, then is -exact. The converse also holds.

(exactp-x2) Let and . If is -exact, then is -exact.

PROOF: We shall require an elementary fact of arithmetic:

If and , then .
In order to establish this result, let , where and are integers with no common prime factor, and assume that . If were even, say , then , implying that is even, which is impossible. Thus, cannot have a prime factor that does not also divide , and hence .

Now to show that is -exact, i.e., , it will suffice to show that

But since is -exact, we have , and since Lemma 4.1.15 implies ,

(exactp-factors) Let , , and . Assume that and are non-zero and -exact for some . If is -exact, then so are and .

PROOF: Let and be the smallest integers such that and are integers. Thus, and are odd. By Lemma 4.2.4, , , or , respectively, is -exact iff , , or is -exact. Consequently,, we may replace and with and . That is, we may assume without loss of generality that and are odd integers.

By Lemma 4.2.1, an odd integer is -exact iff , or equivalently, acording to Lemma 4.1.3, . Thus, since is -exact, , which implies and , and hence and are -exact.

Exactness of a bit vector may be formulated in various ways.

(exact-bits-1, exact-bits-2, exact-bits-3) Let , , and . If and , then the following are equivalent:

 (a) is -exact (b) (c) (d) .

PROOF: The equivalence between (a) and (b) follows from Lemma 4.2.1, since

Now by Lemmas 2.2.5 and 2.2.22,

and hence

But by Lemma 2.2.1, , which implies that (b), (c), and (d) are all equivalent.

Corollary 4.2.12   (exact-k+1) Let , , and such that and , Assume that is -exact. Then is -exact if and only if .

The next lemma gives a formula for exactness of a difference.

(exactp-diff) Let , , , and . Assume that and . If x and y are both n-exact and , then is -exact.

PROOF: Since is -exact and ,

by Lemma 4.2.1.. Similarly, . Thus,

Lemma 4.2.13 is often applied with , in the following weaker form.

Corollary 4.2.14   (exactp-diff-cor) Let , , and with . If x and y are both n-exact and , then is -exact.

If is positive and -exact, then the least -exact number exceeding is computed as follows.

Definition 4.2.2   (fp+) Let , , and , . Then

(fp+1) Let , , and , . If is -exact, then is -exact.

PROOF: Let and . Since is -exact, , and by Lemma 4.1.3, . Thus,

which implies . Therefore, and . If , then is -exact by Lemma 4.2.6. Otherwise, , , and

(fp+2) Let , , , and , . If and are -exact, then .

PROOF: Let and . Since is -exact, . Similar, since is -exact, . But by Corollary 4.1.5, and hence

as well. Now since , , which implies

Thus,

(fp+expo) Let , , and , . If is -exact and , then .

PROOF: Since , Lemma 4.1.5 implies . Therefore, we have , which, according to Lemma 4.1.3, implies . On the other hand, since is -exact by Lemma 4.2.6, Lemma 4.2.16 implies

(expo-diff-min) Let , , and with . If x and y are n-exact and , then

PROOF: Since the case is trivial, we shall assume that and, without loss of generality, that . But then by Lemma 4.2.16,

Similarly, the following definition computes the greatest -exact number that is less than a given -exact number.

Definition 4.2.3   (fp-) Let , , and , . Then

.

(exactp-fp-) Let , , and , . If is -exact, then is -exact.

PROOF: Let and .

Suppose first that . Then , and since

Definition 4.1.1 implies . Therefore,

and is -exact by Lemma 4.2.1.

Now suppose . Then by Lemma 4.2.15,

Now , and hence . Thus,

which implies is -exact.

(fp+-, fp-+) Let , , and , . If is -exact, then

PROOF: Let , and . We shall prove first that .

As noted in the proof of Lemma 4.2.15,

If , then

But if , then

Next, we prove that . Note that . If , then , and since ,

We may assume, therefore, that . By Lemma 4.2.15, since is -exact, , and hence, . Thus,

(fp-2) Let , , , and , . If and are -exact, then .

PROOF: Suppose . Then since and are both -exact, Lemma 4.2.15 implies , a contradiction.

David Russinoff 2017-08-01