[Oberon] Coping with Word documents

John Drake jmdrake_98 at yahoo.com
Thu Feb 10 00:02:55 CET 2005


--- easlab at absamail.co.za wrote:

> 
> Thu Jan 13 21:49:09 MET 2005, Soren Renner said,
> > > ... we have what is clearly the best OS and
> > > language project in the world*1, yet almost
> > > no one will pay any attention to it.
> 
>  Peter E.  wrote:
> > In many environments, editing of Word documents 
> > is necessary.  Ref. http://www.abisource.com/
> 
> ] a native OS X frontend
> 
> So it could run under Linux?
> 
> Yes, coping with Word documents is as essential as coping with
> flu-virus or taxes.  This is what I regularly [have to] do:
> 
> * apparently all the formatting info is at the beginning and end
> of the file - although once I received a Word document containing
> pictures.
> 
> * cut off the head and the tail
> = find where the head stops and the real text starts
> == eliminate the (line-len > display-width) in order to be able to READ
>    the text.
> I use:  Mail.CutLines 76 *
> = this can be problematic, since it cuts the line(s) at word breaks, so
>  there can be problems when there are no 'word breaks' for a string
> of say 500 chars.
> === I use various heuristics to find the beginning of the 'real text',
>     which is possible given the speed of n-o's cording.
> ----------------
> Since the header-formatting-info is mostly non-ascii, perhaps someone
> who is comfortable with RX.Tool can write a few lines to:
>   delete all the non-ascii chars  ?
> Although perhaps:
> 1.  this is too slow,
> 2. RX.Tool also has problems with lines being too long.
> 
> 
> == Chris Glur.

This will strip out all non-ASCII characters
in a file.  It still leaves a lot of junk
characters in Word documents though.

MODULE StripNonASCII;

  IMPORT
    Files, In;

  (* Copy the input file to the output file, keeping only bytes 10..127. *)
  PROCEDURE Do*;
  VAR
    infile, outfile : Files.File;
    infname, outfname : ARRAY 50 OF CHAR;
    in, out : Files.Rider;
    ch : CHAR;

  BEGIN
    In.Open;
    In.String(infname);
    In.String(outfname);
    infile := Files.Old(infname);
    IF infile # NIL THEN  (* Files.Old returns NIL if the file does not exist *)
      outfile := Files.New(outfname);
      Files.Set(in, infile, 0);
      Files.Set(out, outfile, 0);
      Files.Read(in, ch);
      WHILE ~in.eof DO  (* eof is set only after a read past the end *)
        IF (ORD(ch) <= 127) & (ORD(ch) >= 10) THEN
          Files.Write(out, ch);
        END;
        Files.Read(in, ch);
      END;
      Files.Register(outfile);
      Files.Close(infile);
    END;
  END Do;
END StripNonASCII.

StripNonASCII.Do "Document.doc" "Document.txt" ~
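
A possible stricter variant, as a sketch only: the range test above
keeps every byte from 10 to 127, so control characters (0BX, 0CX,
0EX..1FX) and DEL (7FX) survive, which may account for some of the
remaining junk.  Assuming the same Files and In modules, the
hypothetical module StripControl below (the module name and the Keep
helper are made up for illustration) keeps only printable ASCII plus
tab, line feed and carriage return.  It has not been tried on real
Word files.

MODULE StripControl;  (* hypothetical name, for illustration only *)

  IMPORT
    Files, In;

  (* TRUE for printable ASCII and for tab, line feed, carriage return. *)
  PROCEDURE Keep(ch : CHAR) : BOOLEAN;
  BEGIN
    RETURN ((ORD(ch) >= 32) & (ORD(ch) <= 126))
      OR (ch = 09X) OR (ch = 0AX) OR (ch = 0DX)
  END Keep;

  PROCEDURE Do*;
  VAR
    infile, outfile : Files.File;
    infname, outfname : ARRAY 50 OF CHAR;
    in, out : Files.Rider;
    ch : CHAR;

  BEGIN
    In.Open;
    In.String(infname);
    In.String(outfname);
    infile := Files.Old(infname);
    IF infile # NIL THEN  (* skip silently if the input is missing *)
      outfile := Files.New(outfname);
      Files.Set(in, infile, 0);
      Files.Set(out, outfile, 0);
      Files.Read(in, ch);
      WHILE ~in.eof DO
        IF Keep(ch) THEN Files.Write(out, ch) END;
        Files.Read(in, ch)
      END;
      Files.Register(outfile)
    END
  END Do;

END StripControl.

StripControl.Do "Document.doc" "Document.txt" ~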

Regards,

John M. Drake


		