[Oberon] Oberon performance (long)

Vasile Rotaru vrotaru at seznam.cz
Thu Apr 17 23:37:56 CEST 2003


> This is correct. A few instructions could be removed.
> There are basically two approaches: finding a small local optimization
> which achieves this (e.g. the register allocator could remember the
> last value loaded in a register and propose the same register instead
> of loading it again), or implementing CSE (common subexpression
> elimination) in the compiler.
> The compiler first emits the code in an intermediate representation (LIR)
> (defined in PCLIR). Then the installed backend (PCG386 + PCO for i386) will
> translate the LIR to i386 after a small optimization step to reconstruct the
> complex addressing modes instead (and at the same time remove a few
> instructions).

Well, that was the first thing I thought of, and maybe I'll go that way (or both ways). But just now
a small modification to PCC.Field compiled my toy example rather well, and I'm interested to see whether
it can be done at the level of intermediate code generation. Here are the relevant bits.

PROCEDURE Field*(code: Code;  VAR x: Item;  fld: PCT.Field);
BEGIN
	(* new code: reuse the previous address load when x names the same base *)
	IF (lastLoad.level = x.level) & (lastLoad.offs = x.offs) THEN
		x := lastLoad
	ELSE
		LoadAdr(code, x);
		lastLoad := x
	END; (* if *)
	(* old code
	LoadAdr(code, x);
	*)
	x.mode := RegRel;  x.offs := fld.adr(PCBT.Variable).offset;  x.type := fld.type;
	x.deref := FALSE
END Field;
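
To make the idea behind this change easier to follow, here is a small Python model of it. It is not Oberon compiler code: Item, Emitter, load_adr, field and compile_sum are hypothetical names made up for the illustration, and the register names R0, R1, ... stand in for whatever the real allocator would pick. The only parts taken from the procedure above are the (level, offs) comparison and the "remember the last load" step.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Item:
    level: int                  # scope level of the base pointer (model only)
    offs: int                   # frame offset of the base pointer
    reg: Optional[str] = None   # register holding the loaded address, if any


class Emitter:
    """Toy code emitter with a one-entry 'last load' cache (hypothetical)."""

    def __init__(self) -> None:
        self.code: List[str] = []
        self.last_load = Item(level=-1, offs=0)  # level -1: nothing cached yet
        self.next_reg = 0

    def load_adr(self, x: Item) -> None:
        # Emit a load of the base pointer into a fresh register.
        x.reg = f"R{self.next_reg}"
        self.next_reg += 1
        self.code.append(f"MOV {x.reg}, {x.offs}[EBP]")

    def field(self, x: Item, fld_offset: int) -> str:
        # Same test as in the modified Field: same level and same offset
        # means the base address is assumed to be in a register already.
        same = (self.last_load.level, self.last_load.offs) == (x.level, x.offs)
        if same:
            x.reg = self.last_load.reg
        else:
            self.load_adr(x)
            self.last_load = Item(x.level, x.offs, x.reg)
        return f"{fld_offset}[{x.reg}]"


def compile_sum() -> Emitter:
    # Models "RETURN x + y + z" where all three fields hang off 8[EBP].
    em = Emitter()
    x, y, z = (em.field(Item(level=0, offs=8), off) for off in (0, 4, 8))
    em.code += [f"MOV EAX, {x}", f"ADD EAX, {y}", f"ADD EAX, {z}"]
    return em


if __name__ == "__main__":
    print("\n".join(compile_sum().code))
```

In this model the three field accesses share a single address load, which is the effect the modification is after. The model deliberately ignores everything that can make the cached register stale.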

  Of course, ``lastLoad'' is a new variable of type ``Item'', and it is initialized like this:
  lastLoad.level := -1;
  As for the results:

MODULE Simple;

TYPE
	Simple	= OBJECT
		x, y, z: LONGINT;
		PROCEDURE Init;
		BEGIN
			x := 1; y := 2; z := 3;
		END Init;
		PROCEDURE Sum(): LONGINT;
		BEGIN
			RETURN x + y + z
		END Sum;
	END Simple;
END Simple.

Decoder.Decode Simple.Obx ~

0007H: 55                                  PUSH    EBP
0008H: 8B EC                               MOV     EBP,ESP
000AH: 8B 5D 08                            MOV     EBX,8[EBP]
000DH: C7 03 01 00 00 00                   MOV     0[EBX],1
0013H: C7 43 04 02 00 00 00                MOV     4[EBX],2
001AH: C7 43 08 03 00 00 00                MOV     8[EBX],3
0021H: 8B E5                               MOV     ESP,EBP
0023H: 5D                                  POP     EBP
0024H: C2 04 00                            RET     4

0027H: 55                                  PUSH    EBP
0028H: 8B EC                               MOV     EBP,ESP
002AH: 8B 5D 08                            MOV     EBX,8[EBP]
002DH: 8B 03                               MOV     EAX,0[EBX]
002FH: 03 43 04                            ADD     EAX,4[EBX]
0032H: 03 43 08                            ADD     EAX,8[EBX]
0035H: 8B E5                               MOV     ESP,EBP
0037H: 5D                                  POP     EBP
0038H: C2 04 00                            RET     4
003BH: 6A 03                               PUSH    3
003DH: CC                                  INT     3

  (Of course, it fails on any real-world program.)
  I have tried to generalize this scheme, but without much success. In fact, none at all.
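
One plausible reason for the failure, stated here as an assumption rather than something I have verified in the compiler, is that the cached entry is never invalidated: a store to the pointer variable, a call, or a branch target between two field accesses can leave the remembered register stale. The Python sketch below models only that point. LastLoadCache, count_loads and the event tuples are hypothetical names for the illustration.

```python
class LastLoadCache:
    """One-entry cache: which register holds the pointer at (level, offs)."""

    def __init__(self):
        self.key = None   # (level, offs) of the cached pointer variable
        self.reg = None   # register assumed to hold its value

    def lookup(self, level, offs):
        return self.reg if self.key == (level, offs) else None

    def remember(self, level, offs, reg):
        self.key, self.reg = (level, offs), reg

    def invalidate(self):
        self.key = self.reg = None


def count_loads(events, invalidate_on_hazard):
    """Count the address loads emitted for a sequence of events.

    events holds tuples: ("field", level, offs) is a field access through
    the pointer at (level, offs); any other tuple, such as ("store", ...)
    or ("call",), is treated as a hazard that may change the pointer or
    clobber the register.
    """
    cache = LastLoadCache()
    loads = 0
    for ev in events:
        if ev[0] == "field":
            _, level, offs = ev
            if cache.lookup(level, offs) is None:
                loads += 1                      # emit a real address load
                cache.remember(level, offs, "EBX")
        elif invalidate_on_hazard:
            cache.invalidate()                  # forget the stale register
    return loads


if __name__ == "__main__":
    # p.x ; p := q ; p.y   (the second access needs a fresh load of p)
    events = [("field", 0, 8), ("store", 0, 8), ("field", 0, 8)]
    print(count_loads(events, invalidate_on_hazard=False))
    print(count_loads(events, invalidate_on_hazard=True))
```

Without invalidation the model emits one load for a sequence that needs two, which is the kind of wrong code the naive scheme would produce. With invalidation, straight-line accesses still share a single load.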

  There is one more question. Is the code generation done concurrently?
And if so, how can I know which data belongs to which process? Using AosActive.ActiveObject()?
Could this be a problem if I choose to modify the register allocator in the x86 backend?
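
Whatever the answer turns out to be, the concern can be illustrated independently of the compiler. If two code generators ever ran at the same time, a single module-level lastLoad would be shared between them, whereas state kept in a per-compilation object would not. The Python sketch below shows only that design choice. CodeGenContext, compile_unit and run are hypothetical names, and nothing here says how the Aos compiler actually behaves.

```python
import threading


class CodeGenContext:
    """Per-compilation state: each context owns its own last-load entry."""

    def __init__(self):
        self.last_load = None   # (level, offs) of the last loaded base
        self.loads = 0          # number of address loads emitted so far

    def field(self, level, offs):
        key = (level, offs)
        if self.last_load != key:
            self.loads += 1     # a real address load would be emitted here
            self.last_load = key


def compile_unit(name, accesses, results):
    # Each compilation gets a private context, so nothing is shared.
    ctx = CodeGenContext()
    for level, offs in accesses:
        ctx.field(level, offs)
    results[name] = ctx.loads


def run():
    jobs = {
        "A": [(0, 8), (0, 8), (0, 8)],     # same base three times
        "B": [(0, 8), (0, 12), (0, 8)],    # base changes on every access
    }
    results = {}
    threads = [
        threading.Thread(target=compile_unit, args=(name, acc, results))
        for name, acc in jobs.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run())
```

Because each thread works on its own context, the load counts do not depend on how the two compilations interleave.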

  Anyway, I will have to look deeply into the code. But this is a nice intellectual challenge and I like it.


How apt the poor are to be proud.
		-- William Shakespeare, "Twelfth-Night"

