<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"Segoe UI Emoji";
panose-1:2 11 5 2 4 2 4 2 2 3;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0cm;
margin-right:0cm;
margin-bottom:0cm;
margin-left:36.0pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:70.85pt 70.85pt 2.0cm 70.85pt;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l2
{mso-list-id:1475872616;
mso-list-type:hybrid;
mso-list-template-ids:1190719198 783950112 134676483 134676485 134676481 134676483 134676485 134676481 134676483 134676485;}
@list l2:level1
{mso-level-start-at:0;
mso-level-number-format:bullet;
mso-level-text:-;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-font-family:Calibri;}
@list l2:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l2:level3
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l2:level4
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l2:level5
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l2:level6
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l2:level7
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l2:level8
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l2:level9
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l3
{mso-list-id:1629237978;
mso-list-type:hybrid;
mso-list-template-ids:-847848414 697058262 134676483 134676485 134676481 134676483 134676485 134676481 134676483 134676485;}
@list l3:level1
{mso-level-start-at:0;
mso-level-number-format:bullet;
mso-level-text:-;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-font-family:Calibri;}
@list l3:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l3:level3
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l3:level4
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l3:level5
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l3:level6
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l3:level7
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Symbol;}
@list l3:level8
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l3:level9
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;
font-family:Wingdings;}
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=DE-CH link=blue vlink=purple style='word-wrap:break-word'><div class=WordSection1><p class=MsoNormal><span style='mso-fareast-language:EN-US'>Arthur<o:p></o:p></span></p><p class=MsoNormal><span style='mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>One additional remark: I recommended representing CHAR internally as 4 bytes, so that it can hold all possible Unicode characters.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>If you want to cut memory usage in half, you could represent a CHAR internally as 2 bytes and restrict yourself to the Unicode BMP (Basic Multilingual Plane). The BMP covers quite a few languages, but unfortunately hardly any emojis </span><span lang=EN-US style='font-family:"Segoe UI Emoji",sans-serif;mso-fareast-language:EN-US'>😊</span><span lang=EN-US style='mso-fareast-language:EN-US'><o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>In FPGA Oberon (with only 1 MB of RAM) you have to design your font data structure carefully, as Unicode fonts easily fill up the whole 1 MB of memory.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>My recommendation: break the font up into blocks of 128 CHARs and load them dynamically when needed. Most programmers write programs (and use text) in ONE language only.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>Latin OR Cyrillic OR Greek. 
It is very rare to have to render text in Latin AND Cyrillic AND Greek AND …<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>In Unicode those scripts are grouped into contiguous ranges, so each one fits into a few 128-character blocks (Greek: 03xx, Cyrillic: 04xx, Hebrew: 05xx, Arabic: 06xx and so on)<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>Emojis did not fit in the BMP anymore; they can be found in the SMP (Supplementary Multilingual Plane), mostly in the range 1F3xx to 1FAxx <o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>br<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>Jörg<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'><o:p> </o:p></span></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b><span lang=EN-US>From:</span></b><span lang=EN-US> Jörg &lt;joerg.straube@iaeth.ch&gt; <br><b>Sent:</b> Wednesday, July 21, 2021 6:46 PM<br><b>To:</b> 'ETH Oberon and related systems' &lt;oberon@lists.inf.ethz.ch&gt;<br><b>Subject:</b> RE: [Oberon] Files.Write and 2-byte CHARs<o:p></o:p></span></p></div></div><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><p class=MsoNormal><span style='mso-fareast-language:EN-US'>Arthur<o:p></o:p></span></p><p class=MsoNormal><span style='mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>The implementation of CHAR is not defined in the Oberon report: its length is not defined, the charset is not defined, and the encoding is not defined either.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>It’s all up to the implementation.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>The charset can be ASCII or EBCDIC or 
Unicode or any other. If it’s Unicode, it can be encoded as UTF-8, UCS-2, UTF-16 or others.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>As the implementor, you can decide to represent CHAR differently internally and externally.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>In Project Oberon the implementation is as follows:<o:p></o:p></span></p><ul style='margin-top:0cm' type=disc><li class=MsoListParagraph style='margin-left:0cm;mso-list:l3 level1 lfo3'><span lang=EN-US style='mso-fareast-language:EN-US'>CHAR takes one byte<o:p></o:p></span></li><li class=MsoListParagraph style='margin-left:0cm;mso-list:l3 level1 lfo3'><span lang=EN-US style='mso-fareast-language:EN-US'>CHAR charset is 7-bit ASCII<o:p></o:p></span></li><li class=MsoListParagraph style='margin-left:0cm;mso-list:l3 level1 lfo3'><span lang=EN-US style='mso-fareast-language:EN-US'>internal representation = external representation<o:p></o:p></span></li></ul><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>If you want to extend CHAR to more than 7 bits, I recommend doing it as follows:<o:p></o:p></span></p><ul style='margin-top:0cm' type=disc><li class=MsoListParagraph style='margin-left:0cm;mso-list:l2 level1 lfo6'><span lang=EN-US style='mso-fareast-language:EN-US'>Adapt the compiler to store CHAR internally as INTEGER (4 bytes). 
Don’t use UTF-8 internally, as ARRAY OF CHAR would no longer be a real array: s[i] would no longer address the i-th character.<o:p></o:p></span></li><li class=MsoListParagraph style='margin-left:0cm;mso-list:l2 level1 lfo6'><span lang=EN-US style='mso-fareast-language:EN-US'>When reading/writing a 4-byte CHAR, decode/encode it on the external medium as UTF-8<o:p></o:p></span></li></ul><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>You have to make quite a few adaptations to the system:<o:p></o:p></span></p><ul style='margin-top:0cm' type=disc><li class=MsoListParagraph style='margin-left:0cm;mso-list:l2 level1 lfo6'><span lang=EN-US style='mso-fareast-language:EN-US'>Rewrite Fonts.Mod<o:p></o:p></span></li><li class=MsoListParagraph style='margin-left:0cm;mso-list:l2 level1 lfo6'><span lang=EN-US style='mso-fareast-language:EN-US'>Decide how to store file names (32 CHARs or 32 bytes) </span><span lang=EN-US style='mso-fareast-language:EN-US'>→</span><span lang=EN-US style='mso-fareast-language:EN-US'> adapt FileDir.Mod<br>(recommendation: stick to 32 bytes and treat them as UTF-8. Consequence: file names can hold fewer than 32 CHARs if non-ASCII characters are used)<o:p></o:p></span></li><li class=MsoListParagraph style='margin-left:0cm;mso-list:l2 level1 lfo6'><span lang=EN-US style='mso-fareast-language:EN-US'>Decide how to store module names </span><span lang=EN-US style='mso-fareast-language:EN-US'>→</span><span lang=EN-US style='mso-fareast-language:EN-US'> adapt Modules.Mod<br>(recommendation: stick to 32 bytes and treat them as UTF-8. 
Consequence: module names can hold fewer than 32 CHARs if non-ASCII characters are used) <o:p></o:p></span></li><li class=MsoListParagraph style='margin-left:0cm;mso-list:l2 level1 lfo6'><span lang=EN-US style='mso-fareast-language:EN-US'>Decide how to store string constants in the module </span><span lang=EN-US style='mso-fareast-language:EN-US'>→</span><span lang=EN-US style='mso-fareast-language:EN-US'> adapt the compiler<br>(recommendation: store them as 4-byte CHARs so that string assignments such as s := "Hi </span><span lang=EN-US style='font-family:"Segoe UI Emoji",sans-serif;mso-fareast-language:EN-US'>😎</span><span lang=EN-US style='mso-fareast-language:EN-US'>" do not become too complex)<o:p></o:p></span></li></ul><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'><o:p> </o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>br<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'>Jörg<o:p></o:p></span></p><p class=MsoNormal><span lang=EN-US style='mso-fareast-language:EN-US'><o:p> </o:p></span></p><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b><span lang=EN-US>From:</span></b><span lang=EN-US> Oberon &lt;<a href="mailto:oberon-bounces@lists.inf.ethz.ch">oberon-bounces@lists.inf.ethz.ch</a>&gt; <b>On Behalf Of </b>Arthur Yefimov<br><b>Sent:</b> Wednesday, July 21, 2021 5:31 PM<br><b>To:</b> <a href="mailto:oberon@lists.inf.ethz.ch">oberon@lists.inf.ethz.ch</a><br><b>Subject:</b> [Oberon] Files.Write and 2-byte CHARs<o:p></o:p></span></p></div><p class=MsoNormal><span lang=EN-US><o:p> </o:p></span></p><div><p class=MsoNormal>In some Oberon versions, type CHAR has a size of 2 or 4 bytes.<o:p></o:p></p><div><p class=MsoNormal>(e.g. 
BlackBox has a 2-byte CHAR, Active Oberon has a 4-byte CHAR.)<o:p></o:p></p><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>In the latest Oberon language report (2016), the size of type CHAR is not defined; instead, CHAR is said to hold "the characters of a standard character set". A new type BYTE has been added that is said to hold "the integers between 0 and 255". BYTE is now used instead of CHAR wherever it is necessary to work with binary data (or files).<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Thus, in Project Oberon, module Files now has the following procedures:<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>PROCEDURE ReadByte*(VAR r: Rider; VAR x: BYTE);<br>PROCEDURE ReadBytes*(VAR r: Rider; VAR x: ARRAY OF BYTE; n: INTEGER);<br>PROCEDURE Read*(VAR r: Rider; VAR ch: CHAR);<br>PROCEDURE ReadString*(VAR R: Rider; VAR x: ARRAY OF CHAR);<br><br>PROCEDURE WriteByte*(VAR r: Rider; x: BYTE);<br>PROCEDURE WriteBytes*(VAR r: Rider; x: ARRAY OF BYTE; n: INTEGER);<br>PROCEDURE Write*(VAR r: Rider; ch: CHAR);<br>PROCEDURE WriteString*(VAR R: Rider; x: ARRAY OF CHAR);<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>The procedure Write is internally the same as WriteByte, and likewise procedure Read is the same as ReadByte, but with different signatures.<o:p></o:p></p></div><div><p class=MsoNormal>The only difference is, for example, 
that<o:p></o:p></p></div><div><p class=MsoNormal>  r.buf.data[r.bpos] := ORD(ch)<o:p></o:p></p></div><div><p class=MsoNormal>is used in Write instead of<o:p></o:p></p></div><div><p class=MsoNormal>  r.buf.data[r.bpos] := x<o:p></o:p></p></div><div><p class=MsoNormal>(as in WriteByte).<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>In Project Oberon, the size of CHAR is 1 byte.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>But if CHAR were 2 bytes, module Files should provide a way to read and write characters in a form that is convenient for further use of the file. If CHARs were written to a file raw, as 2-byte integers, the file would be encoded as UTF-16, UCS-2 or similar (without BOM), and thus it would probably not display properly in most modern text viewers.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>My proposal (for 2-byte or 4-byte CHARs) is to make Files.Read and Files.Write work with CHARs in the following manner:<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>1. The file is assumed to be UTF-8 encoded.<o:p></o:p></p></div><div><p class=MsoNormal>2. Files.Read reads one or more bytes from the file and constructs a CHAR value.<o:p></o:p></p></div><div><p class=MsoNormal>3. 
Files.Write converts the given CHAR to UTF-8 and writes one or more bytes to the file.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>The number of bytes read or written for a 2-byte CHAR can be 1, 2 or 3 (up to 4 for a 4-byte CHAR), as UTF-8 uses some bits of every byte for its own framing.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>For your information:<o:p></o:p></p></div><div><p class=MsoNormal>The 2-byte range of Unicode (the Basic Multilingual Plane) covers practically all modern languages of the world, including Chinese, Japanese, Korean and Thai. The code points beyond it are used to encode ancient scripts, emoji, and some strange things like playing card icons, tiles of the game Mahjong, and even dominoes.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Additionally, two procedures WriteChar and ReadChar can be added that write and read the values of CHARs directly (for fast, local, non-portable data storage).<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Kind regards,<o:p></o:p></p></div><div><p class=MsoNormal>Arthur Yefimov<o:p></o:p></p></div><div><p class=MsoNormal><a href="https://free.oberon.org/en">https://free.oberon.org/en</a><o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div></div></div></div></body></html>