[Oberon] Text search is THE main task.

easlab at absamail.co.za easlab at absamail.co.za
Sat Jun 12 16:17:58 CEST 2004


Who needs pre-emptive multitasking except for real-time control
systems ?

Newsgroups: comp.lang.oberon
Subject: Re: Questions from Unix world...

> WLad  wrote: 
> > Dear colleague!
> > 
> > I show Aos to one Plan9 programmer.
> > He asked some questions:
> > 
> >  > in Oberon/bluebottle, do we have
> >  > 1. something like 'grep' to search regular expressions (or the like) 
> >  within textfiles;
> 
(jmdrake) wrote: 
> Yes.  There is a package called "RX" that comes with Oberon System 3, NO
> and BlueBottle.  See:
> 
> http://www.oberon.ethz.ch/software/RX.html
> 
> Also there is another regular expressions package called "Regul".
> 
> http://www.oberon.ethz.ch/software/Regul.html
> 
I was persuaded to fetch this, because I didn't realise that it 
couldn't do what is needed [as described below].
It's a monster, because it is part of the meta compiler Bable
and needs several other packages.  You must buy the complete
automobile if you want to test the cigarette lighter!

> >  > 2. something like 'awk' to work with flatfile databases, or some 
> >  other database system;
> 
> Nothing like "awk" exists, but you can make calls to the RX module
> from inside another program, do pattern matching and accomplish much
> of what someone might do with awk inside a simple Oberon program.  

Efficient searching for text is undoubtedly becoming a key to 
productivity, besides minimising possible great frustration.
As a 'heavy' user of NO, this is what helps me [PLUS what I still need]:-
* file spaces are organised by topic into disk-partitions: say 5 to 20.
* new incoming 'articles' get appended to their appropriate file,
   under their 'chapter number & title', and the 'chapter number & title'
   appears at the file header as an index, thus:
file = AImisc:ExprtSys4Legal
1. Overview of AI Research Groups in The Netherlands
2. Legal advisory system - EDI within EC
3. A Pragmatic Legal Expert System
4. Some URLs
5. About Legal informatics
6. expert systems
7. lexit.at/resource
8. A Pragmatic Legal Expert System
9. Book: Representation and Reasoning in Law

Guided by colours-info [qed with NO], it's a 'wipe & a click'
[750 ms.] to get to the beginning of the "6. expert systems" chapter.

* For searching for a <text string> in a partition:-
  in order to avoid the redundant *.Bak files [half of the total], have a
  pre-built file list in e.g. ACTnRule:SearchTemplate, which looks like this:---
  ACTnRule:SearchTemplate ==
  Updated 2004 Jan 25  Find.All ^

Find.Domain
ACTnRule:AI.ES.4law
ACTnRule:AbortionAct
ACTnRule:Aces2InfoAct
ACTnRule:AmdCompaniesAct
...
ACTnRule:UsuryAct
ACTnRule:WillsAct.html
~  ---- end of file:   ACTnRule:SearchTemplate

    The 2 commands are nicely coloured for highlighting:
    'Find.Domain' selects the set of files [which was created by 
    System.Directory  <partitionID.>:* , then Store the Directory, then
    RX.Grep Directory \i   ".Bak"; to remove the redundant *.Bak].
    Then select <keyString> and click 'Find.All ^' to rapidly list all
    files of the partition containing <keyString>.
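
For readers outside Oberon, the list-cleaning step can be sketched
roughly as below. This is only an illustration in Python, not what RX
does internally: the function name drop_backups is made up, and it
assumes a stored directory listing with one file name per line and
backup names ending in ".Bak".

```python
def drop_backups(listing, suffix=".Bak"):
    """Return the file names in `listing` that do not end in `suffix`.

    Assumption: `listing` is plain text with one file name per line.
    """
    names = (line.strip() for line in listing.splitlines())
    # Compare case-insensitively so ".bak" and ".Bak" are both dropped.
    return [n for n in names if n and not n.lower().endswith(suffix.lower())]
```

The result is the kind of list that would be pasted under Find.Domain.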
    
What is needed !!
  List all files [in the list of files {which BTW can be further partitioned 
  by date - using SmartDir.Tool ; it's very common to want to limit access
  to files that you've been working on recently} ] containing <string1>
  and <string2> ...<stringN> in any order in files of the file set.
  
 I mistakenly thought that I could hack this function by running the 
 output of 'Find.All ^' 
 [a text of lines of: <FileID> <Tab> <subString containing Key> ]
  through RX to remove the <Tab> <subString containing Key>,
  and then use the reduced set of 'satisfying files' to test for the
  next key.
  
Of course this is no good since multiple finds of a single key
produce multiple FileIDs for the next stage.
What I need is another quick hack to remove duplicate lines
from a text of one FileID per line. Any solutions welcomed !
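
One possible shape for such a hack, sketched in Python rather than
Oberon and only as an illustration: it assumes the 'Find.All ^' output
is available as plain text, one '<FileID> <Tab> <subString>' per line,
and the name unique_file_ids is made up. It strips everything from the
first tab onward and keeps each FileID once, in order of first
appearance.

```python
def unique_file_ids(find_output):
    """Return each FileID once, in order of first appearance.

    Assumption: each non-empty line is '<FileID><Tab><rest>'; a line
    without a tab is treated as a bare FileID.
    """
    seen = set()
    result = []
    for line in find_output.splitlines():
        # Keep only the part before the first tab.
        file_id = line.split("\t", 1)[0].strip()
        if file_id and file_id not in seen:
            seen.add(file_id)
            result.append(file_id)
    return result
```

The returned list could then serve as the reduced file set for the
next key.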

Alternatively, it would be valuable if someone [ELSE ;-] could add 
a command to Find.Mod [perhaps by extending existing code] to 
search for multiple strings, in a short-circuited-AND way.  
It should just output the fileIDs which pass, i.e. contain ALL of the
strings - in any order.
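
To make the wanted behaviour concrete, here is a minimal sketch in
Python [not an extension of Find.Mod, and not tested against Oberon
text files]. It assumes the files are plain text readable from an
ordinary file system; the name files_containing_all is hypothetical.

```python
def files_containing_all(paths, keys):
    """Return the paths whose contents contain every string in `keys`.

    Assumption: each path names a plain-text file; unreadable or
    missing files are skipped silently.
    """
    hits = []
    for path in paths:
        try:
            with open(path, "r", encoding="utf-8", errors="replace") as f:
                text = f.read()
        except OSError:
            continue
        # all() stops at the first key that is absent (short-circuit AND).
        if all(key in text for key in keys):
            hits.append(path)
    return hits
```

Because each file is tested against all keys in one pass, the
duplicate-FileID problem of the staged approach does not arise.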

Related to this: it looks as if "western civilisation" is getting overly
dependent on google ?  The disappearance of these facilities would
be disastrous for me.

== Chris Glur.



