[Oberon] REAL I/O in module Texts

Sat Jan 14 19:37:51 CET 2023

Hi all,

Two years ago I reported unexpected wrong output for various borderline REAL values from the present versions of the PO procedures Texts.WriteReal and Texts.WriteRealFix.
See: https://lists.inf.ethz.ch/pipermail/oberon/2020/015007.html 

Most of those who replied to my post agreed with me that this behaviour is not as it should be and deserves improvement.

Recently I made an attempt at improving these procedures.

A few examples may illustrate the output of the present and of my new versions. Find the source code of a simple TestRealI0.Mod below. An extended TestRealIO.Mod, Texts1.Mod and related modules can be found in a GitHub repository: https://github.com/hansklav/Oberon-REAL-IO

When Texts.Scan (via TestRealIO.Do) reads one of the values x in the first column, together with the parameters n (for WriteReal) and k (for WriteRealFix), the present procedures produce the output shown in the Texts.WriteReal and Texts.WriteRealFix columns (wrong values are marked by an asterisk); the last two columns show the output of my improved procedures in Texts1.

TestRealIO.Do^      Texts.WriteReal Texts.WriteRealFix Texts1.WriteReal Texts1.WriteRealFix
                    Texts.Scan      Texts.Scan         Texts1.Scan      Texts1.Scan     
x            n  k

1554.70E5    6  2   1.E+08 *        151826.17 *        1.6E+08          155470000.  
1554.70E5    7  2   6.E+08 *        151826.17 *        1.6E+08          155470000.  
1554.70E5    8  2   2.E+08          151826.17 *        1.6E+08          155470000.  
1554.70E5    9  2   1.6E+08         151826.17 *        1.6E+08          155470000.  
12345.678    14 3   1.234567E+04    12345.679          1.234567E+04     12345.679
22345.678    14 3   2.234567E+04    11172.835 *        2.234567E+04     22345.670
22345.6789   14 2   2.234567E+04    22345.67           2.234567E+04     22345.67
12345678.0   14 0   1.234567E+04    12345679.          1.234567E+04     12345680.
22345678.0   14 0   5.568462E+06 *  5568462.  *        TRAP 7 Overflow  TRAP 7 Overflow
123456789.0  14 0   6.016277E+06 *  6016277.  *        TRAP 7 Overflow  TRAP 7 Overflow
1.0E6        14 1   1.000000E+06    1000000.1          1.000000E+06     1000000.1
1.0E6        14 2   1.000000E+06    125000.00 *        1.000000E+06     1000000.00
1.0E7        14 0   1.000000E+07    10000001.          1.000000E+07     10000001.
1.0E8        14 0   1.000000E+08    12500000. *        1.000000E+08     100000000.
1.232596E-32 14 1   1.232596E-32    .0                 1.232596E-32     0.0
1.232595E-32 14 1   6.162975E-33 *  .0                 Underflow        0.0

It became clear to me that in cases where both Texts.WriteReal and Texts.WriteRealFix give wrong output, the cause is a wrong result from Texts.Scan, due to overflow of FLT(n) if n > 16777215 (= 2^24-1). When Texts.Scan is not used (e.g. by giving the same value as a literal REAL parameter to both output procedures, or by entering the INTEGER encoding of a REAL value directly with TestRealIO.DoInt), in nearly all cases only Texts.WriteRealFix produces wrong output. This time the cause is overflow of FLOOR(x), which (to my surprise) has the same upper limit as FLT, i.e. FLOOR(x) overflows for x > 16777215.0. The output of Texts.WriteReal is nearly always OK; I could find only a few special cases where it is wrong:
  n < 8 
  x =  16777215.0 
  x = -16777215.0
  x = Reals.Real(07F800000H) (= +infinity) 
  x = Reals.Real(0FF800000H) (= -infinity)
  ABS(x) < 1.232596E-32  (< Reals.Real(0A7FFFFFH), when the significand = 8388607 = 2^23-1)
This is shown by the test procedures TestRealIO.WriteNumbers, TestRealIO.WriteNumbers1 and TestRealIO.DivByZero.
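
A minimal sketch of the first way to bypass Texts.Scan, assuming the standard PO modules Texts and Oberon. The module name LitReal and the command Go are made up for this illustration and are not part of TestRealIO:

  MODULE LitReal;  (* hypothetical name, illustration only *)
    IMPORT Texts, Oberon;
    VAR W: Texts.Writer;

    PROCEDURE Go*;
      VAR x: REAL;
    BEGIN x := 1.0E8;  (* literal: converted by the compiler, not by Texts.Scan *)
      Texts.WriteReal(W, x, 14); Texts.WriteLn(W);
      Texts.WriteRealFix(W, x, 0, 0); Texts.WriteLn(W);
      Texts.Append(Oberon.Log, W.buf)
    END Go;

  BEGIN Texts.OpenWriter(W)
  END LitReal.

Because the value never passes through Texts.Scan, a wrong line in the log cannot be blamed on the scanner.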

-WriteReal-
An implementation for Texts.WriteReal that handles the above values correctly is given in Texts1.WriteReal below. It handles the four constants as special cases and values ABS(x) < 1.232596E-32 are signalled as underflow. It also fixes the minor bug for n < 8 I wrote about earlier (https://lists.inf.ethz.ch/pipermail/oberon/2020/015003.html); run TestRealIO.WriteReal to see this latter bug and the effect of my fix.

-Scan-
To prevent Texts.Scan from returning wrong REAL values for corner cases the simplest solution is to catch problematic input with a trap. This is done in my Texts1.Scan below (use TestRealIO.SwapScan to let the procedures of TestRealIO alternate between Texts.Scan and the improved Texts1.Scan).
I expect that with this implementation of the Scan procedure all problematic input is trapped, and other cases are handled correctly: in both fixed point notation and E-notation at most 7 digits before the decimal point are allowed (even 8 digits if x < 16777216.0). Excess digits after the decimal point are ignored, and if ABS(x) > 3.402823E+38 a NaN is returned.

-WriteRealFix-
To develop a better alternative for Texts.WriteRealFix that stays close to the original, I first looked at code by Bernd Moesli from the 1990s. Because he used 64-bit LONGREALs I could not use it. Then I stumbled upon code by Josef Templ for his Ofront, which uses an idea that I had thought of myself but didn't know how to implement. It can be found here: https://github.com/vishaps/voc/blob/master/src/runtime/Reals.Mod (in Reals.ConvertL).
His approach uses a two-step trick to extract the decimal mantissa of a LONGREAL (64-bit) using FLOOR (ENTIER). I adapted this code to make a 32-bit Oberon-07 version, Reals.Convert, which can be used by the original implementation of Texts.WriteRealFix by Jürg Gutknecht (see pp. 117 and 118 of https://people.inf.ethz.ch/wirth/ProjectOberon1992.pdf). A slightly extended version of this latter procedure can be found in Texts1.WriteRealFix on my GitHub pages.
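
The two-step idea can be illustrated on its own. The following is only a sketch under stated assumptions, not the code from Ofront or from Reals.Mod; the procedure name Split is made up:

  PROCEDURE Split (x: REAL; VAR hi, lo: INTEGER);
  (* assumes 0.0 <= x and x / 1.0E7 <= 16777215.0, so that neither FLOOR nor FLT overflows *)
  BEGIN
    hi := FLOOR(x / 1.0E7);            (* the 8th and higher decimal digits *)
    lo := FLOOR(x - FLT(hi) * 1.0E7)   (* the lower 7 decimal digits *)
  END Split;

For x = 155470000.0 exact arithmetic gives hi = 15 and lo = 5470000. Both are far below 2^24, so the two FLOOR calls stay within range although FLOOR(x) itself would overflow.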

This new version of WriteRealFix can output more than 7 significant digits (at most 9), but often only 7 will be correct; this is a general limitation of the 32-bit floating-point type. The least significant correct digit (the rightmost one: the 7th, 8th or 9th) is not guaranteed to be rounded correctly, but it is mostly off by only 1 unit. Rounding of the more significant digits is guaranteed to be correct (rounded to the nearest integer, with halves rounded away from zero), with one exception (see below). Texts1.WriteRealFix often outputs 9 digits correctly for REAL representations of integer numbers; for larger numbers it falls back to Texts1.WriteReal (E-notation).

In cases 0.0 < x < 1.0 it outputs quite small values (at most 11 zeros after the decimal point without losing significant digits) without falling back to E-notation. Unfortunately in these cases Texts1.WriteRealFix has one bug which I was unable to fix: the rightmost zero is not rounded correctly, e.g. Texts1.WriteRealFix(0.000654321, 0, 3) gives 0.000 instead of 0.001 (this bug was also present in the 1992 version of DOS Oberon). So remember: if you find only zeros in the output, make parameter k larger to check the real magnitude.

I did tests with a large number of inputs and never found that Texts1.WriteRealFix outputs nonsense results as Texts.WriteRealFix sometimes does. In extreme cases it will seek refuge in Texts1.WriteReal.

I did my best to make Texts1.WriteRealFix tabulate numbers properly, as is shown by TestRealIO.WriteTable1 (compare with TestRealIO.WriteTable, which uses the present Texts.WriteRealFix procedure).

While doing the tests (using Peter de Wachter's emulator) I found that the FPGA RISC does a nice job in implementing floating point operations. For instance I noticed that: 
   9.0 / 0.0             -> 07F800000H  (+INF)  
  -9.0 / 0.0             -> 0FF800000H  (-INF)
   3.402824E+38          -> 07F800001H  (NaN)
   1000.0 * 3.402824E+38 -> 07F800002H  (NaN)
   1.0    / 3.402824E+38 -> 0
   0.3 * 1.232595E-32    -> 09999998H   (3.697785E-33)
On the other hand, at least one operation is not handled correctly:
   0.0 / 0.0             -> 0           (but should be: NaN)
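
Encodings like these can be made visible with ORD and Texts.WriteHex. A minimal sketch, assuming the standard PO modules Texts and Oberon; the module and command names are made up for this illustration (the actual test is TestRealIO.DivByZero):

  MODULE ShowBits;  (* hypothetical name, illustration only *)
    IMPORT Texts, Oberon;
    VAR W: Texts.Writer;

    PROCEDURE Div*;
      VAR a, b, q: REAL;
    BEGIN a := 9.0; b := 0.0;
      q := a / b;                  (* operands in variables, so the division is done at run time *)
      Texts.WriteHex(W, ORD(q));   (* ORD(q) is the INTEGER encoding of the REAL q *)
      Texts.WriteLn(W); Texts.Append(Oberon.Log, W.buf)
    END Div;

  BEGIN Texts.OpenWriter(W)
  END ShowBits.

With these operands the log should show the encoding listed above for 9.0 / 0.0; changing a and b reproduces the other rows.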

Operations involving reals with absolute values between 1.232596E-32 and 0.20E-37 are handled well, but the results cannot be output correctly as decimal reals by the present Texts procedures or by my Texts1 procedures; they can only be output as integer encodings by using Texts1.WriteRealHex(ORD(x)) or Texts1.WriteRealHex(Reals.Int(x)).

I suspect that even subnormal REAL values are supported, because it can be shown that the following
can be calculated:  0.2E-37 / 2.0 = 0059C7DCH (which is 8.2450551E-39).
Note that the compiler can't cope with negative exponents smaller than -37, so 2.0E-38 will be seen as 0.0
but 0.20E-37 will not!  See TestRealIO.Limits3.

To verify the integer encoding of REAL values the following web page is helpful: https://float.exposed
Just paste the hexadecimal value (output by Texts1.WriteRealHex, but without the trailing H) into the field 'Raw Hexadecimal Integer Value' and press Enter. The accompanying explanatory web page is very informative (click on the name of its author, Bartosz Ciechanowski, in the bottom line).
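
The same decoding can also be done in Oberon. A sketch, assuming the usual IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits); the procedure name Fields is made up for this illustration:

  PROCEDURE Fields (x: REAL; VAR sign, exp, frac: INTEGER);
    VAR u: INTEGER;
  BEGIN u := ORD(x);              (* INTEGER encoding of x *)
    sign := ASR(u, 31) MOD 2;     (* bit 31 *)
    exp  := ASR(u, 23) MOD 100H;  (* bits 30..23, biased by 127 *)
    frac := u MOD 800000H         (* bits 22..0 *)
  END Fields;

For the encoding 07F800000H this gives sign = 0, exp = 255 and frac = 0, the pattern for +infinity; exp = 255 with frac # 0 is a NaN.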

For the future it might be interesting to make Oberon-07 implementations of more 'ideal' WriteReal and/or WriteRealFix procedures along the lines of Steele and White's Dragon4 or Florian Loitsch's Grisu3 algorithms. See:
  http://kurtstephens.com/files/p372-steele.pdf
  https://en.wikipedia.org/wiki/Floating-point_arithmetic#Binary-to-decimal_conversion_with_minimal_number_of_digits
Also an improvement to the REAL part of Texts.Scan that accommodates all possible input values correctly would be welcome. 

I have not attempted this yet.

Tell me if you like my implementations, and please inform me of any deficiencies you encounter.

Kind regards,

Hans Klaver

------------

MODULE TestRealI0;
(* The full version of this test module, TestRealIO.Mod, can be found on https://github.com/hansklav *)
  IMPORT Texts, Oberon;

  VAR W: Texts.Writer; 

  PROCEDURE Do*;
    VAR x: REAL;  n, k, beg, end, time: INTEGER; 
      T: Texts.Text;  S: Texts.Scanner;
  BEGIN Texts.OpenScanner(S, Oberon.Par.text, Oberon.Par.pos);  Texts.Scan(S);
    IF S.class = Texts.Char THEN
      IF S.c = "^" THEN
        Oberon.GetSelection(T, beg, end, time);
        IF time >= 0 THEN Texts.OpenScanner(S, T, beg);  Texts.Scan(S);
          IF S.class = Texts.Real THEN x := S.x;  Texts.Scan(S);
            IF S.class = Texts.Int THEN n := S.i;  Texts.Scan(S);
              IF S.class = Texts.Int THEN k := S.i; 
                Texts.WriteReal(W, x, n); Texts.WriteLn(W);
                Texts.WriteRealFix(W, x, 0, k); Texts.WriteLn(W)
              ELSE Texts.WriteString(W, "parameter error"); Texts.WriteLn(W)
              END
            ELSE Texts.WriteString(W, "parameter error"); Texts.WriteLn(W)
            END
          ELSE Texts.WriteString(W, "parameter error"); Texts.WriteLn(W)
          END
        ELSE Texts.WriteString(W, "no selection"); Texts.WriteLn(W);
        END
      END 
    END; Texts.WriteLn(W);  Texts.Append(Oberon.Log, W.buf)
  END Do;

BEGIN 
  Texts.OpenWriter(W)
END TestRealI0.

TestRealI0.Do^

  1554.70E5     7 2
  12345.678    14 3
  22345.678    14 3
  22345.6789   14 2
  12345678.0   14 0
  22345678.0   14 0       
  123456789.0  14 0
  1.0E6        14 1
  1.0E6        14 2
  1.0E7        14 1
  1.0E8        14 1
  1.0E9        14 1        

--------------------

The changes I made to Texts.WriteReal are in the ELSIF clauses, and in one line further down.
(For the context see Texts1.WriteReal in Texts1.Mod on my GitHub pages)

PROCEDURE WriteReal* (VAR W: Texts.Writer; x: REAL; n: INTEGER);
(* NW 10.1.2019 / hk 27.12.2023 (ELSIF clauses) *)
  VAR e, i, k, m: INTEGER;
    d: ARRAY 16 OF CHAR;
BEGIN
  e := ASR(ORD(x), 23) MOD 100H;  (* binary exponent = shifted exponent of 2 (0 <= e < 256) *)
  IF e = 0 THEN seq(W, " ", n-14); Texts.WriteString(W, "  0     ");  seq(W, " ", n-8)
  ELSIF e = 255 THEN 
    IF    ORD(x) = 07F800000H THEN seq(W, " ", n-14); Texts.WriteString(W, " +INF   "); seq(W, " ", n-8)
    ELSIF ORD(x) = 0FF800000H THEN seq(W, " ", n-14); Texts.WriteString(W, " -INF   "); seq(W, " ", n-8)
    ELSE seq(W, " ", n-14); Texts.WriteString(W, " NaN    "); seq(W, " ", n-8)
    END
  ELSIF ORD(x) = 04B7FFFFFH THEN seq(W, " ", n-14); Texts.WriteString(W, "  1.677721E+07"); 
  ELSIF ORD(x) = 0CB7FFFFFH THEN seq(W, " ", n-14); Texts.WriteString(W, " -1.677721E+07");
  ELSIF ORD(ABS(x)) < 0A800000H THEN  (* ABS(x) < 1.232596E-32 *)
      Texts.WriteString(W, " Underflow in Texts1.WriteReal");  Texts.Append(Oberon.Log, W.buf);
      (*ASSERT(FALSE)*)
  ELSE Texts.Write(W, " ");
    WHILE n >= 15 DO DEC(n); Texts.Write(W, " ") END ;
    IF n < 9 THEN n := 9 END;                                          (* hk 03.10.2020 *)
    (* 2 < n < 9 digits to be written*)
    IF x < 0.0 THEN Texts.Write(W, "-"); x := -x ELSE Texts.Write(W, " ") END ;
    e := (e - 127) * 77 DIV 256 - 6;  (* decimal exponent = exponent of 10 *)
    IF e >= 0 THEN x := x / Ten(e) ELSE x := Ten(-e) * x END;
    m := FLOOR(x + 0.5);  (* significand or mantissa; 7 or 8 digits; last digit rounded *)
    IF m >= 10000000 THEN INC(e); m := m DIV 10 END ;  (* 7 digits *)
    i := 0; k := 13-n;
    REPEAT
      IF i = k THEN INC(m, 5) END;  (* rounding of k-th decimal *)
      d[i] := CHR(m MOD 10 + 30H); m := m DIV 10; INC(i)
    UNTIL m = 0;
    DEC(i); Texts.Write(W, d[i]); Texts.Write(W, ".");
    IF i < n-7 THEN n := 0 ELSE n := 14 - n END ;
    WHILE i > n DO DEC(i); Texts.Write(W, d[i]) END ;
    Texts.Write(W, "E"); INC(e, 6);
    IF e < 0 THEN Texts.Write(W, "-"); e := -e ELSE Texts.Write(W, "+") END ;
    Texts.Write(W, CHR(e DIV 10 + 30H)); Texts.Write(W, CHR(e MOD 10 + 30H))
  END
END WriteReal;

--------------------

The relevant section of Texts.Scan that I changed (also see Texts1.Scan in Texts1.Mod on my GitHub pages):

PROCEDURE Scan* (VAR S: Scanner);
  CONST maxExp = 38;  maxM = 16777216 - 1; (* 2^24 - 1 *)                         (* hk 26.12.2022 *)
  (...)
    IF ("0" <= ch) & (ch <= "9") THEN (*number*)
      n := ORD(ch) - 30H; h := n; Read(S, ch);
      WHILE ("0" <= ch) & (ch <= "9") OR ("A" <= ch) & (ch <= "F") DO
        IF ch <= "9" THEN d := ORD(ch) - 30H ELSE d := ORD(ch) - 37H; hex := TRUE END ;
        n := 10*n + d; h := 10H*h + d; Read(S, ch)
      END ;
      IF ch = "H" THEN (*hex integer*) Texts.Read(S, ch); S.i := h; S.class := Texts.Int  (*neg?*)
      ELSIF ch = "." THEN (*real number*)
        (* Allowed are numbers with max. 7 (or 8, if x < 16777216.0) digits before the decimal     *)
        (* point. Most values are trapped by the first Boolean, values with 8 digits > 16777215.0  *)
        (* by the second, -2147483648.0 (the only value in which the sign is not flipped) is       *)
        (* trapped by the last Boolean:                                                            *)
        (* ************************************************************************* hk 26.12.2022 *)
        IF (ABS(n) # n) OR (ABS(n) > maxM) OR (n < -2147483647) THEN 
          Texts.WriteString(W, "Overflow of FLT(x) in Texts1.Scan"); Texts.Append(Oberon.Log, W.buf);
          ASSERT(FALSE)
        END;
        (* ************************************************************************* hk 26.12.2022 *)
        Texts.Read(S, ch); x := 0.0; e := 0; j := 0;
        WHILE ("0" <= ch) & (ch <= "9") DO (*fraction*)
      (...)

---------------------

Reals.Convert, after Reals.ConvertL by Josef Templ:
(See my GitHub pages for Reals.Mod and for Texts1.WriteRealFix, in which Reals.Convert is called)

PROCEDURE Convert* (x: REAL; n: INTEGER; VAR d: ARRAY OF CHAR);
(** Convert REAL: Write the positive integer value of x into array d.
  The value is stored backwards, i.e. the least significant digit
  first; n digits are written, padded with zeros at the high end.
  On entry x has been scaled to the number of digits required.
*)
  VAR i, j, k: INTEGER;
BEGIN
  IF x < 0.0 THEN x := -x END;
  k := 0;

  IF n > 7 THEN
    (* There are more decimal digits than can be handled by FLOOR(x) and FLT(i):
       - FLT(i) for PO RISC V5 can't handle more than 24 bits: maxFLT = 16777215 (2^24 - 1)
       - FLOOR(x) overflows with x > 16777215.0 (Reals.Int(x) (* or ORD(x) *) > 4B7FFFFFH)
    *)
    i := FLOOR(x /           10000000.0);   (* The 8th and higher digits *)
    j := FLOOR(x - (FLT(i) * 10000000.0));  (* The lower 7 digits *)
    (* First generate the lower 7 digits *)
    IF j < 0 THEN j := 0 END;
    WHILE k < 7 DO
      d[k] := CHR(j MOD 10 + 48); j := j DIV 10; INC(k)
    END;
    (* Fall through to generate the upper digits *)
  ELSE
    (* We can generate all the digits in one go *)
    i := FLOOR(x);
  END;

  WHILE k < n DO
    d[k] := CHR(i MOD 10 + 48); i := i DIV 10; INC(k)
  END
END Convert;