# [Oberon] Wrong results of selected REAL multiplications

Michael Schierl schierlm at gmx.de
Fri Oct 13 23:54:54 CEST 2023

```
Hello,

On 13.10.2023 at 18:39, Hans Klaver wrote:

>> The problem does not occur on an April 2016 version of Peter's
>> emulator. This leads me to suspect that the problem may have been
>> introduced during the changes to the Floating Point-related Verilog
>> files that were made between 8 Aug 2016 and 3 Oct 2016 as
>> summarised here:
>>
>> https://people.inf.ethz.ch/wirth/news.txt

For the record, the changes talked about are in this diff:
<https://github.com/Spirit-of-Oberon/wirth-personal/commit/359240b9d1c63d23aff6c90aa90fdd46917c743b>.

> Thanks for checking this.
>
> As I don't know much about Verilog programming I am afraid that I
> can't be of much help to pinpoint this bug and propose a solution.

I don't think experience with Verilog programming really helps me much
in understanding this very dense code of parallel bit-twiddling (and a
significant part of the twiddling and its state machine is for doing the
"integer" multiplication of the mantissae).

> Hoping others in the community are able and willing to look into
> this.

Just starting from the facts you stated (sometimes, when multiplying a
value with mantissa # 1.0 with a value with mantissa # 1.0 and the
result has a mantissa = 1.0, the exponent is off by one), and when
looking at the changes to FPMultiplier in the commit above, I have a
suspicion ...

Just picking a few lines of the assignments from the new version (I
slightly reordered them for the argument, but that does not matter as
all those assignments happen in parallel anyway):

reg [47:0] P; // product
wire [24:0] z0;

assign e1 = e0 - 127 + P[47];

assign z0 = P[47] ? P[47:23]+1 : P[46:22]+1;  // round and normalize

assign z = (xe == 0) | (ye == 0) ? 0 :
    (~e1[8]) ? {sign, e1[7:0], z0[23:1]} :
    (~e1[7]) ? {sign, 8'b11111111, z0[23:1]} : 0;

Or condensed to the part I'd like to point out

assign e1 = <something> + P[47];

assign z0 = P[47] ? P[47:23]+1 : P[46:22]+1;  // round and normalize

assign z = <some corner cases omitted, but the main case is>
    {sign, e1[7:0], z0[23:1]};

z is our final output, and in the final step of the state machine it
gets set by using e1 as exponent and the "middle" of z0 as mantissa
(while z0 is a 25-bit value, we omit the first and last bit of it. First
bit has to be 1 and is omitted in IEEE float representation, and last
bit was there for better rounding).

Both exponent and mantissa have special logic for handling the case
where the most significant bit of the intermediate product, P[47], is 1.
In that case we increase the exponent by 1, while at the same time
taking the mantissa from one bit position higher. If P[47] is not 1,
P[46] is 1, so either way we end up with a normalized mantissa.

Or at least, that is what you might think at first glance.

Now consider the case, where (the relevant part of) P has the value of

0111111111111111111111111

The most significant bit is 0. Yet for the actual computation of the
mantissa, we add 1 after shifting (this bit is cut off afterwards and as
the comment suggests, this is to improve rounding of results), resulting
in a mantissa of all zeros (as the top bit is cut off as well), instead
of a mantissa of 1000000000000000000000000. The leading bit does not
matter, though, as we will discard it anyway, assuming it to be a 1.

But what does matter: For the computation of the exponent we don't take
care of the overflow of the addition and do not add an extra 1.

An easy fix would be to remove the rounding (the two +1), but then your
result would be 437FFFFFH instead of 43800000H (still a lot better than
43000000H, though). But a correct fix that keeps the rounding does not
seem to be trivial.

In the emulator source, I could easily fix this by adding an additional
if statement that does the addition again before testing the most
significant bit for incrementing the exponent, but I don't feel able to
do this in Verilog (while still keeping the whole circuit fast),
especially since I don't have a way to test it on real hardware. Maybe
somebody else wants to take a stab at it?

So either change the computation of P in the previous states to add 1
there (though for the best rounding you would have to add 2 in case the
value is later shifted by one bit), or account for the extra 1 in the
exponent in case of overflow.

As a compromise, one could also just omit the rounding in case the value
before rounding but after normalization (or at least a significant part
of its digits) happens to be all ones; in that case, the rounding would
still happen in most cases but not in the one that currently triggers
the bug.

Or did I miss anything obvious? :)

Regards,

Michael
```
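
The argument in the message can be checked with a small software model. The following is a minimal sketch in Python, not the emulator's or the circuit's actual code. It assumes that P holds the full 48-bit product of the two 24-bit mantissae when the state machine finishes, and that `e0 = xe + ye` is kept in 9 bits. The function name `fp_mul_model`, the `mode` parameter, and the operands are hypothetical; the operands were picked so that P[47] is 0 and P[46:22] is all ones. The `"carry"` mode only illustrates the "account for the extra 1 in the exponent" idea in software and says nothing about how to do it cheaply in Verilog.

```python
import struct

MASK25 = (1 << 25) - 1  # z0 is declared as wire [24:0]


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def fp_mul_model(x, y, mode="original"):
    """Bit-level model of the quoted assignments (not the real circuit).

    mode "original" mirrors the quoted code, "no_round" drops the two +1,
    "carry" adds the lost rounding carry to the exponent (hypothetical fix).
    """
    sign = ((x ^ y) >> 31) & 1
    xe = (x >> 23) & 0xFF
    ye = (y >> 23) & 0xFF
    if xe == 0 or ye == 0:
        return 0
    # Assumption: P ends up holding the full 48-bit mantissa product.
    P = ((x & 0x7FFFFF) | 0x800000) * ((y & 0x7FFFFF) | 0x800000)
    p47 = (P >> 47) & 1
    window = (P >> (23 if p47 else 22)) & MASK25  # P[47:23] or P[46:22]
    rounded = window + (0 if mode == "no_round" else 1)
    z0 = rounded & MASK25   # a carry out of bit 24 is dropped here
    carry = rounded >> 25   # 1 exactly when the window was all ones
    # Assumption: e0 = xe + ye, held in 9 bits like e1.
    e1 = (xe + ye - 127 + p47) & 0x1FF
    if mode == "carry":
        e1 = (e1 + carry) & 0x1FF
    frac = (z0 >> 1) & 0x7FFFFF  # z0[23:1]
    if not (e1 >> 8) & 1:
        return (sign << 31) | ((e1 & 0xFF) << 23) | frac
    if not (e1 >> 7) & 1:
        return (sign << 31) | (0xFF << 23) | frac
    return 0


# Hypothetical operands: (2 - 2^-22) * (1 + 2^-23), which rounds to 2.0.
x, y = 0x3FFFFFFE, 0x3F800001
for mode in ("original", "no_round", "carry"):
    z = fp_mul_model(x, y, mode)
    print(f"{mode:9s} {z:08X} {bits_to_float(z)}")
```

For these operands the model returns 3F800000H (1.0) in the original mode, 3FFFFFFFH without rounding, and 40000000H (2.0) when the carry is added to the exponent. That is the same pattern as 43000000H, 437FFFFFH and 43800000H in the message.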