[Oberon] Re: report-back on linux-oberon

Thomas Frey thomas.frey at alumni.ethz.ch
Tue Jun 6 11:00:15 CEST 2006


> But Wirth's compiler-book is an 'analog' image, like a photograph.
> So it needs human intelligence or OCR [which will give some errors]
> to convert it to digital - eg. ascii text.
>
> The next question is: since ps/pdf are designed for text and our
> book is only an 'analog-image', was pdf better than other image
> formats ?
PDF offers quite good compression methods for black and white images
that are not supported by standard image processing programs. For
example, PNG at the same resolution would take about 15 MB. Another
advantage of PDF is that the pages are kept in one file and that
switching pages is easier than, for example, opening the next PNG page
from a ZIP/TAR archive.

I have now OCRed the images in the PDF file and tried to format the
result as a Word document. The scan quality of the pages in the PDF is
not perfect, so the OCRed text still needs more editing. The page
structure is the same as in the printed book for easier reference.
The doc can be found in zipped form at
http://www.bbos.org/downloads/compilerbook.zip (795kb)
If somebody improves the document, it could perhaps replace (possibly
converted back to PDF) the image-only PDF on the ETH server.

--Thomas

>
> PS. Dijkstra [one of the other high-priests who died recently] had
> his personal notes also published on the net.  From pre-computing
> [what were those 'wax-papers'/rolling-machines called ?] typewriter
> days.  They were calling for volunteers to transcribe [Texas Uni
> AFAI-remember]. I think they initially used OCR and needed editors.
>
> > You get the complete book in Oberon Text format, pretty
> > unreadable though imo because all formatting
> > (including indentation) is gone.
>
> I don't know that the compiler book which I'm talking about is
> on the net in text, although I'd expect the original 'computer
> file' to be still in archive ?
>
> Which is the other point: if you/I get text which is not perfect, then
> the process of editing/cleaning it while also colouring sections, really
> helps me to absorb it.   After all, the material has only reached its
> peak value once it's added to YOUR knowledge.  Which is the top
> of the hierarchy ?
>
> Brantley Coile wrote:
> > If you want to play with the source code from an Oberon text file
> > on other systems, like Linux, you can easily convert the files.
> > Remove all the bytes at the beginning of the file up to the word MODULE.
> > Next, convert all the CRs to NLs.
> > Last, either set indentation in your editor to two spaces or just
> > expand the tabs to 2 spaces.  (I do the first).
>
> Yes but you don't have to do that 'manually', because others have
> already made the utilities, eg. ET.OpenAscii  ^, ET.StoreAscii * ....etc.
>
>
> == Chris Glur.
> --
> Oberon at lists.inf.ethz.ch mailing list for ETH Oberon and related systems
> https://lists.inf.ethz.ch/mailman/listinfo/oberon
>
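
For anyone without an Oberon system at hand, the three conversion steps
Brantley Coile describes in the quoted text above could be sketched in
Python roughly as follows. This is only a minimal sketch: the function
name oberon_text_to_ascii, the Latin-1 decoding and the error handling
are my own assumptions, and it relies on the first occurrence of the
bytes "MODULE" really being the start of the source text.

```python
def oberon_text_to_ascii(data: bytes, tab_width: int = 2) -> str:
    """Convert the raw bytes of an Oberon text file to plain text.

    Hypothetical helper following the three steps from the quoted mail.
    """
    # Step 1: drop everything before the word MODULE (the file header).
    start = data.find(b"MODULE")
    if start == -1:
        raise ValueError("no MODULE keyword found; not a module source?")
    body = data[start:]

    # Step 2: convert all CRs to NLs.
    body = body.replace(b"\r", b"\n")

    # Assumption: decode as Latin-1 so that every byte maps to a character.
    text = body.decode("latin-1")

    # Step 3: expand each tab to tab_width spaces (2 by default).
    return text.replace("\t", " " * tab_width)
```

It could be used like this, where Hello.Mod is just a placeholder file
name:

```python
with open("Hello.Mod", "rb") as src:
    plain = oberon_text_to_ascii(src.read())
with open("Hello.txt", "w", encoding="latin-1", newline="\n") as dst:
    dst.write(plain)
```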

