[Oberon] Re. Wirth compiler book text (was: report-back on linux-oberon)

easlab at absamail.co.za easlab at absamail.co.za
Tue Jun 6 06:23:53 CEST 2006

> -----Original Message-----
> > From: oberon-bounces at lists.inf.ethz.ch 
> > [mailto:oberon-bounces at lists.inf.ethz.ch] On Behalf Of 
> > easlab at absamail.co.za
> > Sent: Monday, 5 June 2006 3:18 PM
> > To: oberon at lists.inf.ethz.ch
> > Subject: [Oberon] Re: report-back on linux-oberon
> > 
> > 
> > But Wirth's compiler-book is an 'analog' image, like a photograph.
Chris Burrows wrote:
> Which book exactly are you referring to? 
I can't find the URL right now, but I remember commenting on this 
mailing list that I wasn't happy to download over 6MB.

> If it is Wirth's book titled 'Compiler Construction', the PDF copy that you 
> can download is NOT an 'analog' image, like a photograph. 
The file size is 6'291'456 bytes.
It has 69 pdf-pages & the last is blank and pdf-pg68 ends with:

> All of the text is contained there. 
Yes, as a graphics image.

> It looks as though it is in a compressed or encrypted format which might 
> have misled you.
Well yes, I was misled by linux mc, which is [too smart !] and transforms
files [without notifying the user] into different view-formats.
See below.
> > So it needs human intelligence or OCR [which will give some 
> > errors] to convert it to digital - eg. ascii text.
> > 
> No it does not. It just needs a copy of Adobe Acrobat to interpret it. 

Yes, the pdf-viewer is an interpreter, which 'interprets' the sequence of 
bytes of the image of a rose into a picture for you, but not into the 4
chars: "r", "o", "s", "e".
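
One rough way to tell the two cases apart without a viewer is to count
image objects versus font objects in the raw bytes. The following is only a
sketch under assumptions: the function name count_markers and the sample
bytes are made up for illustration, and it only sees dictionaries that are
stored uncompressed in the file, so a low font count is a hint, not proof.

```python
import re

def count_markers(data: bytes) -> dict:
    """Count image and font dictionaries visible in raw PDF bytes.

    Heuristic only: dictionaries inside compressed object streams
    are invisible to this scan.
    """
    # Image XObjects are declared with /Subtype /Image
    images = len(re.findall(rb"/Subtype\s*/Image", data))
    # Font dictionaries are declared with /Type /Font
    # (\b keeps /FontDescriptor from being counted)
    fonts = len(re.findall(rb"/Type\s*/Font\b", data))
    return {"images": images, "fonts": fonts}

# Made-up sample bytes, NOT taken from the actual book file
sample = (b"<< /Type /XObject /Subtype /Image /Width 1700 >> "
          b"<< /Type /Font /Subtype /Type1 >> "
          b"<< /Type/XObject/Subtype/Image >>")
counts = count_markers(sample)
```

A file with about one image per page and only a handful of fonts would fit
the 'scanned pages plus generated titles' picture.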

> Using Acrobat you can save it to text, but if you want to retain the 
> formatting, RTF or Word Document format is better.
I've just realised that [linux] ETH-oberon can look at the file without
the deceptive filters, to show what's really there.

So I searched for [a common string] "The" and amazingly found:
"Chapter 6 The ..".
So it DOES contain 'text'.
But NO, it only contains the 'Chapter numbers & titles'.
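
The search above can be repeated outside Oberon with a few lines that pull
the printable runs out of the raw bytes, much like the unix 'strings' tool.
This is a minimal sketch: the names printable_runs and find_word and the
sample bytes are hypothetical, and text sitting inside compressed streams
would not show up this way either.

```python
import re

def printable_runs(data: bytes, min_len: int = 4):
    """Yield runs of printable ASCII at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    for match in re.finditer(pattern, data):
        yield match.group().decode("ascii")

def find_word(data: bytes, word: str, min_len: int = 4):
    """Return the printable runs that contain the given word."""
    return [run for run in printable_runs(data, min_len) if word in run]

# Made-up sample: one readable title between bytes of binary data
sample = (b"\x00\x01/Title (Chapter 6 The ..)\x00\xff\x8a"
          b"\x03binary\x9c\x00image data\x01")
hits = find_word(sample, "The")
misses = find_word(sample, "consists")
```

If body words such as "consists" never turn up while the chapter titles do,
that fits the picture of titles stored as text and pages stored as images.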

What shows that it's an 'image' is, e.g., that the 3 "s" in the word
"consists" all look different. OTOH the 'Chapter numbers & titles' all look 
consistent & 'computer generated'.

Because the box with unix-ETH-oberon & *pdf-viewer isn't set up
to email to this mail-list, I'll post the vital contents of the *pdf to

BTW linux ETH-oberon doesn't want to scroll down to more than 
70% of the 6MB text/frame. But I doubt that the un-encoded-text
is all hidden down there.


== Chris Glur.

PS. From: AntiSpam UOL <f.passold.sspam at uol.com.br>
   who posts in inappropriate html, should be taken off this mail-list ?
