[Oberon] Text search is THE main task.
easlab at absamail.co.za
Sat Jun 12 16:17:58 CEST 2004
Who needs pre-emptive multitasking except for real-time control
systems ?
Newsgroups: comp.lang.oberon
Subject: Re: Questions from Unix world...
> WLad wrote:
> > Dear colleague!
> >
> > I show Aos to one Plan9 programmer.
> > He asked some questions:
> >
> > > in Oberon/bluebottle, do we have
> > > 1. something like 'grep' to search regular expressions (or the like)
> > within textfiles;
>
(jmdrake) wrote:
> Yes. There is a package called "RX" that comes with Oberon System 3, NO
> and BlueBottle. See:
>
> http://www.oberon.ethz.ch/software/RX.html
>
> Also there is another regular expressions package called "Regul".
>
> http://www.oberon.ethz.ch/software/Regul.html
>
I was persuaded to fetch this, because I didn't realise that it
couldn't do [as described below] what is needed.
It's a monster, because it is part of the meta compiler: Bable;
and needs several other packages. You must buy the complete
automobile if you want to test the cigarette lighter !
> > > 2. something like 'awk' to work with flatfile databases, or some
> > other database system;
>
> Nothing like "awk" exists, but you can make calls to the RX module
> from inside another program, do pattern matching and accomplish much
> of what someone might do with awk inside a simple Oberon program.
Efficient searching for text is undoubtedly becoming a key to
productivity, besides minimising what can be great frustration.
As a 'heavy' user of NO, this is what helps me [PLUS what I still need]:-
* file spaces are organised by topic into disk-partitions: say 5 to 20.
* new incoming 'articles' get appended to their appropriate file,
under their 'chapter number & title' and the 'chapter number & title'
appears at the file header as an index thus:
file = AImisc:ExprtSys4Legal
1. Overview of AI Research Groups in The Netherlands
2. Legal advisory system - EDI within EC
3. A Pragmatic Legal Expert System
4. Some URLs
5. About Legal informatics
6. expert systems
7. lexit.at/resource
8. A Pragmatic Legal Expert System
9. Book: Representation and Reasoning in Law
Guided by colours-info [qed with NO], it's a 'wipe & a click'
[750 ms.] to get to the beginning of the "6. expert systems" chapter.
* For searching a <text string> in a partition:-
in order to avoid half the redundant *.Bak files, have a pre-built
file list in e.g. ACTnRule:SearchTemplate which looks like this:---
ACTnRule:SearchTemplate ==
Updated 2004 Jan 25 Find.All ^
Find.Domain
ACTnRule:AI.ES.4law
ACTnRule:AbortionAct
ACTnRule:Aces2InfoAct
ACTnRule:AmdCompaniesAct
...
ACTnRule:UsuryAct
ACTnRule:WillsAct.html
~ ---- end of file: ACTnRule:SearchTemplate
The 2 commands are nicely coloured for highlighting:
'Find.Domain' selects the set of files [which was created by
System.Directory <partitionID.>:* and Store the Directory and
RX.Grep Directory \i ".Bak"; to remove the redundant *.Bak].
Then select <keyString> and click 'Find.All ^', which rapidly lists all
files of the partition containing <keyString>.
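For readers outside Oberon, the *.Bak filtering step can be sketched in
Python. This is only an illustration of the idea, not what RX.Grep does
internally; the function name drop_backups and the one-name-per-line
listing format are assumptions.

```python
def drop_backups(listing_lines, suffix=".Bak"):
    """Return the directory listing without backup entries.

    Assumes one file name per line; blank lines are skipped.
    """
    kept = []
    for line in listing_lines:
        name = line.strip()
        if not name:
            continue  # ignore empty lines in the listing
        if name.endswith(suffix):
            continue  # skip the redundant backup copy
        kept.append(name)
    return kept
```

The surviving names are what would be pasted under 'Find.Domain' in the
search template.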
What is needed !!
List all files [in the list of files {which BTW can be further partitioned
by date - using SmartDir.Tool ; it's very common to want to limit access
to files that you've been working on recently} ] containing <string1>
and <string2> ...<stringN> in any order in files of the file set.
I mistakenly thought that I could hack this function by running the
output of 'Find.All ^'
[a text of lines of: <FileID> <Tab> <subString containing Key> ]
through RX to remove the <Tab> <subString containing Key>,
and then use the reduced set of 'satisfying files' to test for the
next key.
Of course this is no good since multiple finds of a single key
produce multiple FileIDs for the next stage.
What I need is another quick hack to remove duplicate lines
in a text of one-FileID-per-line. Any solutions welcomed !
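As a rough sketch of that duplicate-removal hack, here it is in Python
rather than Oberon. It assumes the 'Find.All ^' output format described
above, <FileID> <Tab> <subString containing Key>, and keeps the first
occurrence of each FileID in its original order. The function name
unique_file_ids is made up for the illustration.

```python
def unique_file_ids(find_output_lines):
    """Reduce Find.All-style output to one FileID per line, no duplicates."""
    seen = set()
    result = []
    for line in find_output_lines:
        # Everything before the first Tab is taken to be the FileID.
        file_id = line.split("\t", 1)[0].strip()
        if file_id and file_id not in seen:
            seen.add(file_id)
            result.append(file_id)
    return result
```

The resulting list could then serve as the reduced file set for the
next key.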
Alternatively, it would be valuable if someone [ELSE ;-] could add
a command to Find.Mod [perhaps by extending existing code] to
search for multiple strings, in a short-circuited-AND way.
And just output the fileIDs which pass: contain ALL of the strings
- in any order.
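To make the request concrete, here is a minimal Python sketch of the
short-circuited AND search. It is not a proposal for how Find.Mod should
be coded; files_with_all_keys, read_file and the read_text parameter are
hypothetical names, and plain substring matching is assumed.

```python
from pathlib import Path


def read_file(file_id):
    """Default reader: treats the FileID as an ordinary path (an assumption)."""
    return Path(file_id).read_text(errors="replace")


def files_with_all_keys(file_ids, keys, read_text=read_file):
    """Return the FileIDs whose text contains ALL keys, in any order."""
    matches = []
    for file_id in file_ids:
        text = read_text(file_id)
        # all() stops at the first missing key: the short-circuited AND.
        if all(key in text for key in keys):
            matches.append(file_id)
    return matches
```

Each file is read once and yields at most one output line, so the
duplicate-FileID problem of the staged approach does not arise.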
Related to this: it looks as if "western civilisation" is getting overly
dependent on google ? The disappearance of the facilities would
be disastrous for me.
== Chris Glur.