[Oberon] CHAR to/from BYTE conversion

Chris Burrows chris at cfbsoftware.com
Sat Jul 9 04:03:33 CEST 2016


> -----Original Message-----
> From: Oberon [mailto:oberon-bounces at lists.inf.ethz.ch] On Behalf Of
> Jörg Straube
> Sent: Saturday, 9 July 2016 5:06 AM
> To: ETH Oberon and related systems
> Cc: paulreed at paddedcell.com
> Subject: Re: [Oberon] CHAR to/from BYTE conversion
> 
> Don't open that can of worms :-)
> - Should LEN(s) return the number of characters in s (logical length)
> or the number of bytes of s (physical length)?

Neither.

LEN is the number of elements of an array. An array declared as 

  VAR s: ARRAY 12 OF CHAR 

will always have 12 elements no matter how long the string is that is currently stored in the array. 

LEN("abc") = 4  (* including the terminating NULL character *)

s := "abc";
LEN(s) = 12

Similarly for an array declared as

  VAR a: ARRAY 12 OF INTEGER

LEN(a) = 12.

SYSTEM.SIZE will give you the number of bytes allocated to a variable. SYSTEM indicates that the value that SIZE returns is implementation-dependent.

The length of a string is the number of characters up to but not including the null terminating character. The longest string that can be stored in the array s where LEN(s) = 12 is 11 characters.
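
To make the distinction concrete, here is a minimal Oberon-07 sketch of a string-length function. The module name StrLen and the procedure Length are hypothetical names chosen only for illustration; they are not taken from the Report or from any particular library.

```oberon
MODULE StrLen;

  (* Hypothetical helper: returns the number of characters before the
     first 0X, or LEN(s) if the array contains no terminator at all. *)
  PROCEDURE Length*(s: ARRAY OF CHAR): INTEGER;
    VAR i: INTEGER;
  BEGIN
    i := 0;
    (* & is short-circuit, so s[i] is never read out of bounds *)
    WHILE (i < LEN(s)) & (s[i] # 0X) DO INC(i) END;
    RETURN i
  END Length;

END StrLen.
```

With s declared as ARRAY 12 OF CHAR and s := "abc", Length(s) is 3 while LEN(s) is still 12.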

> - shouldn't the type CHAR better be called "ASCII"?

No. Since 2013 the Oberon Report has not required it be Latin-1 / ASCII. CHAR is now defined as 'the characters of a standard character set'. 

> - As Unicode characters need up to 32 bit, should CHAR be 32 bit?
> 

Nothing stops an implementer of Oberon from defining CHAR as a 32-bit (or, for that matter, 16-bit) item if the implementation is meant for developing applications that need a character set other than ASCII.

> Strings are quite tricky, especially a compact internal
> representation of them.
> 

True.

Regards,

Chris Burrows
CFB Software
http://www.astrobe.com


> > On 08.07.2016 at 20:50, Skulski, Wojciech <skulski at pas.rochester.edu> wrote:
> >
> >
> >> That means both CHAR and BYTE hold values 0 to 255.
> >
> >> No, CHAR holds characters, 0X to 0FFX, including "A", "B" etc.; and
> >> BYTE holds 0 to 255, a sub-range of the INTEGER range.
> >
> >> Your statement is indicative of how much damage C has done to the
> >> world :)
> >
> > Characters should not be just English characters.
> >
> > Your statement is indicative of how much damage ASCII has done to the
> > world :)
> >
> > W.
> > --
> > Oberon at lists.inf.ethz.ch mailing list for ETH Oberon and related
> > systems https://lists.inf.ethz.ch/mailman/listinfo/oberon
> 


