[Oberon] RISC emulator

Tue Mar 18 10:11:39 CET 2014

Hi
After some thought, I'm no longer sure that the "international Project Oberon" really needs to hold ALL Unicode characters, i.e. all 17 planes. I think we can limit ourselves to plane 0, the Basic Multilingual Plane (BMP).
I'm now rather in favor of the following scheme:
- Redefine CHAR as a type occupying TWO bytes.
- Use UCS-2 coding internally (almost identical to UTF-16, but restricted to plane 0).
- When writing text to files, use UTF-8 coding.
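
As a rough sketch of the scheme above (in Python rather than Oberon, just to show the bit layout): two-byte CHAR values are held as plain integers in memory and converted to UTF-8 only at the file boundary. The names ucs2_to_utf8 and utf8_to_ucs2 are made up for illustration and are not part of any Oberon module. Since only plane 0 is allowed, a character never needs more than three bytes in UTF-8.

```python
def ucs2_to_utf8(units):
    """Encode a sequence of 16-bit code points (plane 0 only) as UTF-8 bytes."""
    out = bytearray()
    for cp in units:
        if not 0 <= cp <= 0xFFFF:
            raise ValueError("code point outside plane 0")
        if 0xD800 <= cp <= 0xDFFF:
            raise ValueError("surrogates are not characters in UCS-2")
        if cp < 0x80:                      # 1 byte: 0xxxxxxx
            out.append(cp)
        elif cp < 0x800:                   # 2 bytes: 110xxxxx 10xxxxxx
            out.append(0xC0 | (cp >> 6))
            out.append(0x80 | (cp & 0x3F))
        else:                              # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out.append(0xE0 | (cp >> 12))
            out.append(0x80 | ((cp >> 6) & 0x3F))
            out.append(0x80 | (cp & 0x3F))
    return bytes(out)


def utf8_to_ucs2(data):
    """Decode UTF-8 bytes into a list of 16-bit code points (plane 0 only).

    Simplified: overlong encodings are not rejected.
    """
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            cp, n = b, 1
        elif (b & 0xE0) == 0xC0:
            cp, n = b & 0x1F, 2
        elif (b & 0xF0) == 0xE0:
            cp, n = b & 0x0F, 3
        else:
            # 4-byte sequences would encode planes 1..16, which this scheme excludes
            raise ValueError("invalid lead byte or character outside plane 0")
        if i + n > len(data):
            raise ValueError("truncated UTF-8 sequence")
        for k in range(1, n):
            c = data[i + k]
            if (c & 0xC0) != 0x80:
                raise ValueError("invalid continuation byte")
            cp = (cp << 6) | (c & 0x3F)
        units.append(cp)
        i += n
    return units
```

With this, a string is stored internally as one 16-bit value per character and only grows to variable length when it is written out.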

On the other hand, since the Oberon fonts only support a very limited set of additional Latin letters, full UCS-2 coding is a bit of overkill. We could just as well stay with Unicode but restrict the code points to those that fit in one byte. We could call that coding "reduced ISO-8859-1" :-)
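
To illustrate the one-byte variant (again a Python sketch, with a made-up helper name): the first 256 Unicode code points coincide with ISO-8859-1, so an umlaut such as "ö" (code point 0F6H) still fits into a single one-byte CHAR, whereas its UTF-8 form needs two bytes.

```python
def fits_one_byte(text):
    """True if every character has a code point in 0..0FFH (hypothetical helper)."""
    return all(ord(ch) <= 0xFF for ch in text)


name = "Jörg"
one_byte_form = name.encode("latin-1")   # one byte per character
utf8_form = name.encode("utf-8")         # "ö" takes two bytes here

print(fits_one_byte(name), len(one_byte_form), len(utf8_form))
```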

br, Jörg

> On 17.03.2014 at 17:51, Claudio Nieder <private at claudio.ch> wrote:
> 
> Hi,
> 
>> Disadvantage: Implementation is more complex. You have to convert strings into UTF8 and back. Umlauts do not fit into one CHAR anymore.
> 
> The question is: must CHAR be one byte, or should there rather be a distinction between a type BYTE, which is 8 bits, and a type CHAR, which can hold all Unicode character values from 0 to 10FFFFH and is therefore represented internally in some "different" way?
> 
> claudio
> -- 
> Claudio Nieder, Talweg 6, CH-8610 Uster, Tel +4179 357 6743, www.claudio.ch