[Oberon] New Oberon to Lua Transpiler
rochus.keller at bluewin.ch
Mon Oct 14 11:55:56 CEST 2019
@ Luca Boasso:
Thanks for the data.
It has been a long time since I last dealt with JVM bytecode. I seemed to remember that there is a way to get the address of a local variable, but maybe I am mixing it up with CIL/CLR. Allocating a new array for each local variable looks like a rather expensive operation, and I'm not sure the LuaJIT optimizer would get rid of it. I already have this concept with structured thunks (i.e. call-by-reference to structure/array elements), but currently it looks like this was one of the bottlenecks. A much cheaper approach would be to pass the changed values back as multiple return values, but as far as I remember the JVM doesn't support that (in contrast to LuaJIT).
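To make the two options concrete, here is a minimal sketch of both ways to emulate a VAR parameter. It is written in Python purely for illustration (neither generator emits Python), and the names `swap_boxed`, `swap_multi` and `demo` are hypothetical, not taken from either compiler.

```python
def swap_boxed(a, b):
    """Emulate VAR parameters with one-element lists used as boxes."""
    a[0], b[0] = b[0], a[0]


def swap_multi(a, b):
    """Emulate VAR parameters by returning the updated values."""
    return b, a


def demo():
    x, y = 1, 2

    # Boxed variant: one extra allocation per VAR argument at every call,
    # plus copying the values into and out of the boxes.
    box_x, box_y = [x], [y]
    swap_boxed(box_x, box_y)
    boxed_result = (box_x[0], box_y[0])

    # Multiple-return variant: the caller assigns the results back to its
    # own locals, so no box has to be allocated.
    x, y = swap_multi(x, y)
    return boxed_result, (x, y)
```

Both variants leave the caller with the same values; the difference is only whether a temporary container has to be allocated for each call.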
But anyway, I already had a short-term success: I was able to move the allocation of thunk functions away from the call and use local variables to reference them instead. I found a quite decent way to do that in my current generator without a full redesign. The JIT is now able to find the relevant traces without hitting FNEW and aborting. The speedup is significant. Here are some numbers for comparison (bear with my figures being higher than yours, since I ran this on a ten-year-old 32-bit Linux laptop):
Test    OBNC      OBNLC 2019-10-08   OBNLC 2019-10-13
Perm    11        237                239
Towers  10        11                 11
Queens  9         278                25
Intmm   8         51                 55
Mm      14        53                 58
Quick   10        1030               32
Bubble  14        15                 16
Tree    11        39                 39
FFT     24        22                 22
NFP     152.19    3123               705.93
FP      302.55    3376               975.11
As you can see, my most recent implementation is now pretty close to the performance one would expect from LuaJIT compared to a native implementation. I will still try other optimizations, especially for Perm.
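The restructuring of the thunks mentioned above can be sketched roughly as follows. This is again Python for illustration only, with hypothetical names and an assumed thunk shape (one function that both reads and writes); the real generator emits Lua, and Python has no trace compiler, so the sketch shows only the change in code shape, not the speedup.

```python
def increment(ref):
    """Callee with a VAR parameter that is accessed through a thunk.

    ref(False, None) reads the referenced element, ref(True, v) writes it.
    """
    ref(True, ref(False, None) + 1)


def increment_all_fresh(values):
    """Create a new thunk closure for every call (the slow pattern)."""
    data = list(values)
    for i in range(len(data)):
        def ref(write, v):          # a new closure in every iteration
            if write:
                data[i] = v
            else:
                return data[i]
        increment(ref)
    return data


def increment_all_hoisted(values):
    """Create the thunk once and let it refer to a local variable."""
    data = list(values)
    i = 0                           # local that the thunk refers to

    def ref(write, v):              # created once, outside the loop
        if write:
            data[i] = v
        else:
            return data[i]

    for i in range(len(data)):      # updating i retargets the thunk
        increment(ref)
    return data
```

Both functions produce the same result; in the second one the loop body no longer creates a closure, which is the property that matters for avoiding FNEW inside a hot loop.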
Best
R.