[Oberon] Coping with Word documents
John Drake
jmdrake_98 at yahoo.com
Thu Feb 10 00:02:55 CET 2005
--- easlab at absamail.co.za wrote:
>
> Thu Jan 13 21:49:09 MET 2005, Soren Renner said,
> > > ... we have what is clearly the best OS and
> > > language project in the world*1, yet almost
> > > no one will pay any attention to it.
>
> Peter E. wrote:
> > In many environments, editing of Word documents
> > is necessary. Ref. http://www.abisource.com/
>
> ] a native OS X frontend
>
> So it could run under Linux?
>
> Yes, coping with Word documents is as essential as
> coping with flu-virus or taxes. This is what I
> regularly [have to] do:-
>
> * apparently all the formatting info is at the
>   beginning and end of the file - although once I
>   received a Word document containing pictures.
>
> * cut off the head and the tail
>   = find where the head stops and the real text starts
>   == eliminate the (line-len > display-width) in order
>      to be able to READ the text.
>      I use: Mail.CutLines 76 *
>   = this can be problematic, since it cuts the line(s)
>     at word breaks, so there can be problems when there
>     are no 'word breaks' for a string of say 500 chars.
>   === I use various heuristics to find the beginning
>       of the 'real text', which is possible given the
>       speed of n-o's cording.
> ----------------
> Since the header-formatting-info is mostly non-ascii,
> perhaps someone who is comfortable with RX.Tool can
> write a few lines to:
>   delete all the non-ascii chars ?
> Although perhaps:
> 1. this is too slow,
> 2. RX.Tool also has problems with lines being too
>    long.
>
> == Chris Glur.
This will strip out all non-ASCII characters
in a file. It still leaves a lot of junk
characters in Word documents though.
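MODULE StripNonASCII;

IMPORT
  Files, In;

(* Copy the input file to the output file, keeping only
   characters with codes 10..127. Note that this also
   drops TAB (code 9). *)
PROCEDURE Do*;
VAR
  infile, outfile : Files.File;
  infname, outfname : ARRAY 50 OF CHAR;
  in, out : Files.Rider;
  ch : CHAR;
BEGIN
  In.Open;
  In.String(infname);
  In.String(outfname);
  infile := Files.Old(infname);
  (* Do nothing if a name is missing or the input file
     does not exist. *)
  IF In.Done & (infile # NIL) THEN
    outfile := Files.New(outfname);
    Files.Set(in, infile, 0);
    Files.Set(out, outfile, 0);
    (* Read before testing eof, so the loop body only
       sees characters that were actually read. *)
    Files.Read(in, ch);
    WHILE ~in.eof DO
      IF (ORD(ch) <= 127) & (ORD(ch) >= 10) THEN
        Files.Write(out, ch)
      END;
      Files.Read(in, ch)
    END;
    Files.Register(outfile);
    Files.Close(infile)
  END
END Do;

END StripNonASCII.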
StripNonASCII.Do "Document.doc" "Document.txt" ~
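
Since the filter above still lets junk through, one possible
refinement is to keep only runs of printable ASCII that are
at least a few characters long, much as the Unix "strings"
utility does. The following is only a sketch under stated
assumptions, not a tested tool: the module name ExtractText
and the threshold MinRun = 4 are made up for illustration,
and it uses the same Files and In calls as the module above.
It will miss text that Word has stored as 16-bit characters,
because every second byte there is 0X and no run gets long
enough.

```oberon
MODULE ExtractText;  (* hypothetical module name *)

IMPORT
  Files, In;

CONST
  MinRun = 4;  (* assumed threshold: shorter runs count as junk *)
  CR = 0DX;    (* Oberon line separator *)

PROCEDURE Do*;
VAR
  infile, outfile : Files.File;
  infname, outfname : ARRAY 50 OF CHAR;
  in, out : Files.Rider;
  buf : ARRAY MinRun OF CHAR;  (* start of a run not yet long enough *)
  ch : CHAR;
  len, i : INTEGER;
BEGIN
  In.Open;
  In.String(infname);
  In.String(outfname);
  infile := Files.Old(infname);
  IF In.Done & (infile # NIL) THEN
    outfile := Files.New(outfname);
    Files.Set(in, infile, 0);
    Files.Set(out, outfile, 0);
    len := 0;
    Files.Read(in, ch);
    WHILE ~in.eof DO
      IF (ch >= " ") & (ch < 7FX) THEN
        IF len < MinRun THEN
          buf[len] := ch; INC(len);
          IF len = MinRun THEN
            (* the run just qualified: write its buffered start *)
            i := 0;
            WHILE i < MinRun DO Files.Write(out, buf[i]); INC(i) END
          END
        ELSE
          (* already inside a qualifying run *)
          Files.Write(out, ch)
        END
      ELSE
        (* a non-printable character ends the run; finish a
           written run with a line break *)
        IF len >= MinRun THEN Files.Write(out, CR) END;
        len := 0
      END;
      Files.Read(in, ch)
    END;
    Files.Register(outfile);
    Files.Close(infile)
  END
END Do;

END ExtractText.
```

ExtractText.Do "Document.doc" "Document.txt" ~

Raising MinRun removes more junk at the cost of losing very
short words that stand alone between formatting bytes.
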
Regards,
John M. Drake