[Oberon] A CASE quiz
dsar at eml.cc
Sat Feb 17 15:29:23 CET 2018
On Fri, Feb 16, 2018, at 2:04 PM, Chris Burrows wrote:
> Compiles are typically about 10% faster after implementing the PO2013
> Oberon compiler scanner function ORS.Get as a CASE statement. The number
> of lines of code in ORS reduced by about 3% as a result.
It's worthy to say that Wirth's scanner with IFs is optimised to reduce the number of tests per character. They are also ordered on the frequency in an Oberon source code. Of course, a flatter IF sequence would be a lot slower.
However, I'm very surprised about that 10%. I expect a lot more since the scanner is the most sensible part in a single-pass compiler.
In the past I wrote a very efficient scanner that used the CASE statement as the main selector and various lookup tables to reduce the time of scanning, something like this:
(* all elements must be initialised to true *)
VAR nonAlphaNumerics : ARRAY 256 OF BOOLEAN;
VAR nonWhiteSpaces : ARRAY 256 OF BOOLEAN;
nonAlphaNumerics[ORD("A")] := FALSE;
nonAlphaNumerics[ORD("B")] := FALSE;
nonWhiteSpaces[ORD(20X)] := FALSE; (* TAB *)
nonWhiteSpaces[ORD(09X)] := FALSE; (* Blank space *)
In the identifier scanner all the tests were simplified in:
REPEAT [...] UNTIL nonAlphaNumerics[ORD(ch)];
REPEAT [...] UNTIL (ch < "0") OR (ch > "9") & (ch < "A") OR (ch > "Z") & (ch < "a") OR (ch > "z");
With CASE statement as main selector and lookup tables for scanning I achieved a truly constant time (the number of characters in the text), while the equivalent with a lot of tests was slower in a range of 70% to 300%, based on the text scanned. These were outstanding results.
I reduced memory requirement by using SETs instead of ARRAYs, with a small impact on speed (I don't remember how much in detail in this moment).
Thinking more about it, the speed benefit of jump and lookup tables are very sensible to the computer architecture, a very small and slow cache/memory could be an issue for these techniques.
So depending on the architecture the benefits of jump and lookup tables are not cost effective. It would be nice to test these techniques in a more limited architecture.
In my opinion, however, the CASE statement is fundamental, also if its performance depends on various factor.
More information about the Oberon