[Oberon] CHAR to/from BYTE conversion
Jörg Straube
joerg.straube at iaeth.ch
Sat Jul 9 08:58:17 CEST 2016
Chris
I just realized that using the second rule in this definition
string = """ {character} """ | digit {hexdigit} "X" .
makes your code non-portable.
The expression ("a" = 61X) can be TRUE in one implementation and FALSE in another, depending on the character coding each one uses.
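
To make this concrete, here is a small sketch in Python (not Oberon) that compares the ordinal value of "a" under two codings. The helper name `ordinal` is made up for illustration, and code page 037 stands in for "EBCDIC" as an assumption; an actual Oberon implementation could pick any coding.

```python
def ordinal(ch: str, encoding: str) -> int:
    """Return the ordinal value of a single character under the given coding.

    Hypothetical helper: it plays the role of ORD(ch) in an implementation
    that uses `encoding` as its character set.
    """
    data = ch.encode(encoding)
    if len(data) != 1:
        raise ValueError("character does not fit in one code unit")
    return data[0]


# 61X in Oberon notation is the hexadecimal value 0x61.
LITERAL_61X = 0x61

# ASCII places "a" at 0x61, so ("a" = 61X) holds there.
a_is_61x_in_ascii = ordinal("a", "ascii") == LITERAL_61X

# EBCDIC (code page 037 assumed here) places "a" at 0x81, so it does not.
a_is_61x_in_ebcdic = ordinal("a", "cp037") == LITERAL_61X

print(a_is_61x_in_ascii, a_is_61x_in_ebcdic)
```

The same source text therefore gives two different results, which is why the hex form of a string literal is the non-portable one.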
Jörg
> On 09.07.2016 at 08:34, Jörg Straube <joerg.straube at iaeth.ch> wrote:
>
> Chris
>
> Indeed, you're right: the Oberon-07 report allows any character coding, as the formulation "ordinal value" (of type INTEGER) is quite open. Even EBCDIC would do.
> One small remark though, the Oberon EBNF is not complete as the rule "character" (used in the definition of "string") is not defined.
>
> The ProjectOberon implementation supports CHAR in the range 0X to 7FX.
>
> If the implementation opts for Unicode, you should probably take UTF-32 as the internal representation of CHAR, and INTEGER has to have at least 32 bits. If you took UTF-16 or UTF-8, the rather simple ch := s[5] gets quite complicated to implement.
> If you restrict yourself to the Unicode BMP (no emoji), UTF-16 will do and INTEGERs can have 16 bits.
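
A rough sketch of why ch := s[5] is harder with a variable-width coding, written in Python for illustration only. The function names are hypothetical, and the buffers are assumed to hold well-formed UTF-32 (little-endian, no BOM) and UTF-8 respectively.

```python
def char_at_utf32(buf: bytes, i: int) -> str:
    """Fixed width: element i always starts at byte offset 4 * i."""
    return buf[4 * i:4 * i + 4].decode("utf-32-le")


def utf8_seq_len(lead: int) -> int:
    """Length in bytes of the UTF-8 sequence that starts with this lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte")


def char_at_utf8(buf: bytes, i: int) -> str:
    """Variable width: every preceding character must be skipped first."""
    pos = 0
    for _ in range(i):
        pos += utf8_seq_len(buf[pos])
    return buf[pos:pos + utf8_seq_len(buf[pos])].decode("utf-8")


text = "héllo😀wörld"
print(char_at_utf32(text.encode("utf-32-le"), 5))
print(char_at_utf8(text.encode("utf-8"), 5))
```

With UTF-32 the index is a single multiplication; with UTF-8 it is a scan whose cost grows with the index.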
>
> Jörg
>
> On 09.07.2016 at 04:03, Chris Burrows <chris at cfbsoftware.com> wrote:
>
>>> -----Original Message-----
>>> From: Oberon [mailto:oberon-bounces at lists.inf.ethz.ch] On Behalf Of
>>> Jörg Straube
>>> Sent: Saturday, 9 July 2016 5:06 AM
>>> To: ETH Oberon and related systems
>>> Cc: paulreed at paddedcell.com
>>> Subject: Re: [Oberon] CHAR to/from BYTE conversion
>>>
>>> Don't open that can of worms :-)
>>> - Should LEN(s) return the number of characters in s (logical length)
>>> or the number of bytes of s (physical length)?
>>
>> Neither.
>>
>> LEN is the number of elements of an array. An array declared as
>>
>> VAR s: ARRAY 12 OF CHAR
>>
>> will always have 12 elements no matter how long the string is that is currently stored in the array.
>>
>> LEN("abc") = 4 (* including the terminating NULL character *)
>>
>> s := "abc";
>> LEN(s) = 12
>>
>> Similarly for an array declared as
>>
>> VAR a: ARRAY 12 OF INTEGER
>>
>> LEN(a) = 12.
>>
>> SYSTEM.SIZE will give you the number of bytes allocated to a variable. SYSTEM indicates that the value that SIZE returns is implementation-dependent.
>>
>> The length of a string is the number of characters up to but not including the null terminating character. The longest string that can be stored in the array s where LEN(s) = 12 is 11 characters.
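
The distinction between LEN and the length of the stored string can be modelled roughly as follows. This is a Python sketch, not Oberon; `CAPACITY`, `assign` and `string_length` are names invented for the example, and the array is assumed to be padded with null characters.

```python
CAPACITY = 12  # models VAR s: ARRAY 12 OF CHAR


def assign(text: str) -> list[str]:
    """Model s := text: copy the characters and pad with null characters."""
    if len(text) > CAPACITY - 1:
        raise ValueError("string too long: one element is needed for the terminator")
    return list(text) + ["\0"] * (CAPACITY - len(text))


def string_length(arr: list[str]) -> int:
    """Count characters up to, but not including, the first null character."""
    n = 0
    while n < len(arr) and arr[n] != "\0":
        n += 1
    return n


s = assign("abc")
print(len(s))            # number of elements, the analogue of LEN(s)
print(string_length(s))  # length of the string currently stored
```

The element count stays at 12 whatever is stored, while the string length varies from 0 to 11.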
>>
>>> - shouldn't the type CHAR better be called "ASCII"?
>>
>> No. Since 2013 the Oberon Report has not required it be Latin-1 / ASCII. CHAR is now defined as 'the characters of a standard character set'.
>>
>>> - As Unicode characters need up to 32 bits, should CHAR be 32 bits?
>>>
>>
>> There is nothing stopping an implementer of Oberon from defining characters as 32-bit (or 16-bit for that matter) items if they wanted it to be used to develop applications that required a character set other than ASCII.
>>
>>> Strings are quite tricky, especially a compact internal
>>> representation of them.
>>>
>>
>> True.
>>
>> Regards,
>>
>> Chris Burrows
>> CFB Software
>> http://www.astrobe.com
>>
>>
>>>> On 08.07.2016 at 20:50, Skulski, Wojciech <skulski at pas.rochester.edu> wrote:
>>>>
>>>>
>>>>> That means both CHAR and BYTE hold values 0 to 255.
>>>>
>>>>> No, CHAR holds characters, 0X to 0FFX, including "A", "B" etc.; and
>>>>> BYTE holds 0 to 255, a sub-range of the INTEGER range.
>>>>
>>>>> Your statement is indicative of how much damage C has done to the
>>>>> world :)
>>>>
>>>> Characters should not be just English characters.
>>>>
>>>> Your statement is indicative of how much damage ASCII has done to the
>>>> world :)
>>>>
>>>> W.
>>>> --
>>>> Oberon at lists.inf.ethz.ch mailing list for ETH Oberon and related
>>>> systems https://lists.inf.ethz.ch/mailman/listinfo/oberon