<div dir="ltr">In some Oberon versions, type CHAR has a size of 2 or 4 bytes.<div>(For example, BlackBox has a 2-byte CHAR, Active Oberon has a 4-byte CHAR.)<br><div><br></div><div>In the latest Oberon language report (2016), the size of type CHAR</div><div>is not defined; instead, CHAR is said to hold "the characters of a</div><div>standard character set". A new type BYTE has been added that is said to</div><div>hold "the integers between 0 and 255". BYTE is now used instead of</div><div>CHAR wherever binary data (or files) have to be handled.</div><div><br></div><div>Thus, in Project Oberon, module Files now has the following procedures:</div><div><br></div><div>PROCEDURE ReadByte*(VAR r: Rider; VAR x: BYTE);<br>PROCEDURE ReadBytes*(VAR r: Rider; VAR x: ARRAY OF BYTE; n: INTEGER);<br>PROCEDURE Read*(VAR r: Rider; VAR ch: CHAR);<br>PROCEDURE ReadString*(VAR R: Rider; VAR x: ARRAY OF CHAR);<br><br>PROCEDURE WriteByte*(VAR r: Rider; x: BYTE);<br>PROCEDURE WriteBytes*(VAR r: Rider; x: ARRAY OF BYTE; n: INTEGER);<br>PROCEDURE Write*(VAR r: Rider; ch: CHAR);<br>PROCEDURE WriteString*(VAR R: Rider; x: ARRAY OF CHAR);<br></div><div><br></div><div>The procedure Write is internally the same as WriteByte, and likewise</div><div>procedure Read is the same as ReadByte; only the signatures differ.</div><div>The only difference in the body is that</div><div>&nbsp;&nbsp;&nbsp;&nbsp;r.buf.data[r.bpos] := ORD(ch)<br></div><div>is used in Write instead of</div><div>&nbsp;&nbsp;&nbsp;&nbsp;r.buf.data[r.bpos] := x</div><div>(as in WriteByte).<br></div><div><br></div><div>In Project Oberon, the size of CHAR is 1 byte.</div><div><br></div><div>But if CHAR were 2 bytes, module Files should provide a way to read and</div><div>write characters in a form that is convenient for later use of</div><div>the file. 
If CHARs were written to a file raw, as 2-byte integers, then</div><div>the file would be encoded in UTF-16, UCS-2 or similar (without a BOM),</div><div>and thus it would probably not display properly in many modern text viewers.</div><div><br></div><div>My proposal (in case of 2-byte or 4-byte CHARs) is to make Files.Read</div><div>and Files.Write work with CHARs in the following manner:</div><div><br></div><div>1. The file is assumed to be UTF-8 encoded.</div><div>2. Files.Read gets one or more bytes from the file and constructs</div><div>&nbsp;&nbsp;&nbsp;&nbsp;a value of type CHAR from them.</div><div>3. Files.Write converts the given CHAR to UTF-8 and writes one</div><div>&nbsp;&nbsp;&nbsp;&nbsp;or more bytes to the file.</div><div><br></div><div>The number of bytes read or written for a 2-byte CHAR can be 1, 2 or 3,</div><div>because UTF-8 uses some bits of every byte as markers.</div><div><br></div><div>For your information:</div><div>The 2-byte range of Unicode covers all modern languages of the world, including</div><div>Chinese, Japanese, Korean and Thai. The code points beyond that range are used to</div><div>encode ancient scripts, emoji, and some strange things like playing card icons,</div><div>Mahjong tiles, and even dominoes.</div><div><br></div><div>Additionally, two procedures WriteChar and ReadChar could be added that</div><div>write and read the values of CHARs directly (for fast, local, non-portable data storage).</div><div><br></div><div>Kind regards,</div><div>Arthur Yefimov</div><div><a href="https://free.oberon.org/en">https://free.oberon.org/en</a></div><div><br></div></div></div>
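<div dir="ltr"><div>P.S. The proposed UTF-8 behaviour of Files.Read and Files.Write could be sketched roughly as follows. This is only an illustration, not code from Project Oberon: it assumes a compiler in which CHAR is 2 bytes (so ORD(ch) lies in 0..0FFFFH), and the names Utf8Sketch, WriteUtf8 and ReadUtf8 are hypothetical. Only Files.ReadByte and Files.WriteByte from the list above are used. The reader does no validation of malformed input, and surrogate values are not treated specially.</div><div><br></div><pre>MODULE Utf8Sketch; (* hypothetical module name *)
  IMPORT Files;

  (* Assumption: CHAR is 2 bytes, so ORD(ch) is in the range 0..0FFFFH. *)
  PROCEDURE WriteUtf8*(VAR r: Files.Rider; ch: CHAR);
    VAR c: INTEGER;
  BEGIN c := ORD(ch);
    IF c &lt; 80H THEN (* 1 byte: 0xxxxxxx *)
      Files.WriteByte(r, c)
    ELSIF c &lt; 800H THEN (* 2 bytes: 110xxxxx 10xxxxxx *)
      Files.WriteByte(r, 0C0H + c DIV 40H);
      Files.WriteByte(r, 80H + c MOD 40H)
    ELSE (* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx *)
      Files.WriteByte(r, 0E0H + c DIV 1000H);
      Files.WriteByte(r, 80H + c DIV 40H MOD 40H);
      Files.WriteByte(r, 80H + c MOD 40H)
    END
  END WriteUtf8;

  (* No validation: a 4-byte sequence cannot fit into a 2-byte CHAR
     and would be decoded incorrectly by this sketch. *)
  PROCEDURE ReadUtf8*(VAR r: Files.Rider; VAR ch: CHAR);
    VAR b, b1, b2: BYTE; c: INTEGER;
  BEGIN Files.ReadByte(r, b);
    IF b &lt; 80H THEN (* single byte *)
      c := b
    ELSIF b &lt; 0E0H THEN (* lead byte of a 2-byte sequence *)
      Files.ReadByte(r, b1);
      c := b MOD 20H * 40H + b1 MOD 40H
    ELSE (* lead byte of a 3-byte sequence *)
      Files.ReadByte(r, b1); Files.ReadByte(r, b2);
      c := (b MOD 10H * 40H + b1 MOD 40H) * 40H + b2 MOD 40H
    END;
    ch := CHR(c)
  END ReadUtf8;

END Utf8Sketch.</pre><div>DIV and MOD are used instead of bit operations, since Oberon-07 has no bitwise AND on INTEGER. In an actual implementation the same logic would presumably live inside Files.Write and Files.Read themselves.</div></div>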