[Oberon] Numerical CASE Statements in Project Oberon
Jan de Kruyf
jan.de.kruyf at gmail.com
Sat Nov 21 12:42:17 CET 2015
JohnR good morning to you.
Thank you for your long and thoughtful contribution.
It sums up most of the criteria in the discussion.
> Prof. Wirth intended that IF-ELSIF-ELSIF-ELSE statement chains be used
when the subset is sparse. . .
I think that is how most people understand and use it today. but sometimes
a case is more readable.
Here are some quotes from my Scott 'Programming Language Pragmatics' book,
besides the speed facts you already quoted:
---
The case statements of Algol W and its descendants provide alternative
syntax for
a special case of nested if . . . then . . . else . When each condition
compares the
same integer expression to a different compile-time constant . . .
Rather than test its expression sequentially against a series of
possible values,
the case statement is meant to compute an address to which it jumps in a
single
instruction. . . .
. . . . .. it can decide which form of target code to generate. For the
sake
of simplicity, most compilers employ only some of the possible
implementations.
Many use binary search in lieu of hashing. Some generate only indexed jump
tables;
others only that plus sequential testing. Users of less sophisticated
compilers may
need to restructure their case statements if the generated code turns out
to be
unexpectedly large or slow.
. . . . . More significantly, languages differ in whether they
permit label ranges, whether they permit (or require) a default ( else )
clause, and
in how they handle a value that fails to match any label at run time.
Standard Pascal does not permit a default clause: all values on which to
take
action must appear explicitly in label lists. It is a dynamic semantic
error for
the expression to evaluate to a value that does not appear. Most Pascal
compilers
permit the programmer to add a default clause, labeled either else or
otherwise ,
as a language extension. Modula allows an optional else clause. If one does
not
appear in a given case statement, then it is a dynamic semantic error for
the
tested expression to evaluate to a missing value. Ada requires arm labels
to cover
all possible values in the domain of the tested expression's type . . .
--------
So
1. it does appear that an ELSE is a valued addition to a case statement.
Compiler implementations
of any laguage in general gravitate towards it. Also your statement :
"The programmer will presumably also have provided a friendlier error
mechanism than a raw error trap."
points that way.
2. The Ada implementation is there because Ada encourages subranges like
Pascal.
But in practice the 'Others' arm functions as a catch all.
3. I did a short comparison and although a jumptable is theoretically the
fastest,
it needs some associated code that makes most smallish case statements,
at least on RISC,
as fast with a binary search algorithm. Only in biggish case statements
(like in ORS.Mod for instance)
will you notice a difference.
And then there is the space implication on small machines for a table,
as you observed.
> the hybrid case of dense subsubsets. . . .
This is basically search tree with page lookup. There is certainly merit in
that for big data handling, but
we are talking Oberon on micro controllers here.
One thing has been missing from the discussion so far, that is:
"How does the average user use this feature."
I think the biggest I have seen is in industrial ethernet comms packages.
(PC to microcontroller board etc.)
And there, if speed is really an issue, one could certainly do page lookup
first, say with a case of ranges,
which is fast in the construction I have cooked up.
Ok thats enough talk for today.
Cheers,
j.
On Sat, Nov 21, 2015 at 9:26 AM, John R. Strohm <strohm at airmail.net> wrote:
> After watching this discussion, here's my read:
>
> Consider first the problem of selecting an action according to the value
> of a variable whose type consists of a set of possible selector values, as
> a (small) subset of the integers.
>
> Prof. Wirth intended that IF-ELSIF-ELSIF-ELSE statement chains be used
> when the subset is sparse, that is, there are many values not handled, and
> CASE statements be used when the subset is dense (no holes). The CASE
> statement in Oberon-07 does not admit a DEFAULT (ELSE) value, indicating
> that giving it an invalid (missing) selector value is an error.
>
> If one were seriously contemplating implementing the CASE statement as an
> IF-ELSIF-ELSIF-ELSE chain, one would of course be advised to employ a
> binary search technique, rather than a linear search technique. However,
> COMMA, this is the sort of "optimization" ("pessimization") decision that
> should be made by the programmer, not the compiler. It is trivial to show
> that a linear IF-ELSIF-ELSIF-ELSE chain takes O(N) time, a binary chain
> takes O(log2 N) time, and a branch table takes O(1) time. This goes back
> to the "start by generating good code" philosophy.
>
> Now, since Oberon does not provide subrange types (as PASCAL did), a CASE
> statement implementation must necessarily start by verifying that the
> selector variable's value is in the range specified by the individual
> branch selector values. If it is out of range, an error trap is taken.
> (If the compiler does flow analysis, and the programmer is intelligent, the
> compiler won't need to generate this check, as the programmer will already
> have done it, and the compiler now knows that the value must be in range.
> The programmer will presumably also have provided a friendlier error
> mechanism than a raw error trap.) This is of course trivial, no harder
> than verifying that a subscript is in range.
>
> Yes, the programmer certainly could use large selector value ranges, with
> most cases "missing". Using a branch table approach would generate large
> branch tables, with most values taking error traps. "Doctor, it hurts when
> I do this." "Well, don't do that!"
>
> One would, in the hybrid case of dense subsubsets forming a sparse subset,
> expect the programmer to "hybridize" his code, allowing sparse blocks of
> dense subsets, using IF-ELSIF-ELSIF-ELSE chains to select the block, and
> individual CASE statements to handle the dense subset in each block.
>
> --John R. Strohm
>
> -----Original Message----- From: Chris Burrows
> Sent: Friday, November 20, 2015 10:33 PM
> To: 'ETH Oberon and related systems'
>
> Subject: Re: [Oberon] Numerical CASE Statements in Project Oberon
>
> -----Original Message-----
>> From: Oberon [mailto:oberon-bounces at lists.inf.ethz.ch] On Behalf Of
>> John Stout
>> Sent: Friday, 20 November 2015 12:40 AM
>> To: oberon at lists.inf.ethz.ch
>> Subject: Re: [Oberon] Numerical CASE Statements in Project Oberon
>>
>> Jan
>>
>> I wasn't suggesting that we NOT implement it, just that compiling it
>> (implementing it) to a series of compiled IF THEN statements may be
>> more in keeping with NW's view.
>>
>>
> I have not been able to find any NW reference that advocates implementing
> CASE by emulating IF THEN ELSE. Every such reference I am aware of
> advocates the use of jump tables. e.g.
>
> Project Oberon (2005), 12.2 Code Patterns:
>
> "Case statements serve to select a statement sequence from a set of cases
> according to an index value. Selection is represented by a direct branch to
> the selected case; the CASE instruction takes the branch distance from a
> table using the indexed addressing mode. We conclude from the following
> code that missing cases yield a table entry leading to a trap instruction
> (BPT 16). The table of offsets is located in the module's area for
> constants."
>
> Compiler Construction (2005), 11.3:
>
> "Consider the implementation of the case statement of Oberon (see Appendix
> A.2). Its essential property is that it uses a table of jump addresses for
> the various cases, and an indexed jump instruction."
>
> An Oberon Compiler for the ARM Processor (2007), 15. The Case Statement
>
> "The case statement serves to select a statement according to an integer,
> just as an element is selected from an array. This helps to avoid long
> cascades of if-elsif statements and increases efficiency. However, case
> statements are recommended only if the set of selectable statements is
> reasonably large. The compiler generates an internal array of the labels of
> the component statements. The specific jump address (label) is chosen by
> the index specified in the case clause."
>
> Regards,
> Chris Burrows
>
> CFB Software
> http://www.astrobe.com
>
>
> --
> Oberon at lists.inf.ethz.ch mailing list for ETH Oberon and related systems
> https://lists.inf.ethz.ch/mailman/listinfo/oberon
>
> ---
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
>
>
> --
> Oberon at lists.inf.ethz.ch mailing list for ETH Oberon and related systems
> https://lists.inf.ethz.ch/mailman/listinfo/oberon
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.inf.ethz.ch/pipermail/oberon/attachments/20151121/26e623d6/attachment.html>
More information about the Oberon
mailing list