[Oberon] RISC-5 and memory

Thu Oct 5 04:12:06 CEST 2017

Joerg:

  I am not a HW guru. I just happen to design and build some of the most advanced FPGA-based boards in my corner of the research community, to the extent that some national labs are buying them. But that does not mean that my words should be taken as a reference.

My position is simple. I need color. So I assume that others (or some others) also need color. I need lots of RAM for buffering or histogramming my data. So I assume that others also need some more RAM, sometimes. I heard that the more advanced Oberon Systems need more than 1 meg of RAM to be implemented. The reason that we do not have System-3 or V4 or Component Pascal on RISC5 is not only the strong opinions of some (those too), but simply the lack of RAM. So I believe that, while your points are well taken, we need more RAM. Note that some newer commercial boards, like Arty from Digilent, provide lots of RAM, albeit the Arty design leaves a bit to be desired. (I am studying it right now.) Nevertheless, this board exists and is reasonably priced, especially for educational customers. So designing a board is not a necessary precondition for moving forward. That said, a dedicated Oberon FPGA board would be of some advantage for the community.

Now concerning the memories. I would consider the following technologies: (1) SRAM, (2) ZBT SRAM, (3) SDRAM, (4) DDR and LPDDR, and (5) DDRx SDRAM. These are listed in order of increasing performance and also increasing difficulty.

It is sort of easy to interface SRAM. It is discussed in textbooks. (Either Pedroni or Chu, I forgot which one, devoted an entire chapter to SRAM.) ZBT SRAM is similar. It offers 2x performance. ZBT interfaces are available from both Xilinx and Altera, as well as from OpenCores. Then we enter SDRAM, and here things get hairy because of the refresh cycles. I would not venture into this territory on my own. It is wiser to read about SDRAM interfacing first. For example:

opencores.org/project,sdr_ctrl
hamsterworks.co.nz/mediawiki/index.php/SDRAM_Memory_Controller

If you wish, I can also send you a master's thesis by Mohammad Talal Bonny (131 pages) on SDRAM interfacing (Technische Universität Braunschweig 2002). The author is now an EE professor at Sharjah University. I am sure he would be happy to offer some advice.

Finally, we enter DDR3 territory. An interesting story is here: opencores.org/project,wbddr3. The simple message seems to be "don't". Don't do it yourself. Use the Xilinx Core Generator, which will interface to the hard silicon logic built into the Xilinx FPGA banks. This is what Arty is doing. The performance will depend on the fine details of clocking inside the FPGA, as well as on the board layout. (Arty performance is sort of low because they did not provide the reference voltage, despite the Xilinx recommendation. I wonder why.)

The bottom line: if you want to use high speed DDRx memories at close to their ratings, you cannot achieve it with LOLA-generated Verilog. You need to follow the FPGA manufacturer's hard silicon solution, which is provided with their design tools. Do I need to say it is a total mess? Yes it is. But there is no other route.

Note that a well designed DDRx controller may offer a sort of cache, which will make it look a bit like SRAM from the CPU perspective. I also think that the RISC5 "stall" can remedy the short hiccups due to refresh cycles, but I have not studied this topic.
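
To get a feeling for how short those refresh hiccups might be, here is a back-of-the-envelope sketch in Python. All four constants are my assumptions for illustration (typical-looking SDRAM figures and a 25 MHz CPU clock), not values taken from any datasheet or from the RISC5 sources, so please plug in the numbers for your actual part.

```python
import math

# All constants below are illustrative assumptions, not datasheet values.
REFRESH_COMMANDS = 8192    # assumed number of refresh commands per window
REFRESH_WINDOW_S = 64e-3   # assumed retention window (64 ms)
T_RFC_S = 70e-9            # assumed duration of one auto-refresh command
CPU_CLOCK_HZ = 25e6        # assumed CPU clock

def refresh_overhead(commands, window_s, t_rfc_s):
    """Fraction of total time the memory spends refreshing."""
    return commands * t_rfc_s / window_s

def stall_cycles(t_rfc_s, clock_hz):
    """CPU cycles needed to wait out one refresh command (rounded up)."""
    return math.ceil(t_rfc_s * clock_hz)

interval_s = REFRESH_WINDOW_S / REFRESH_COMMANDS
overhead = refresh_overhead(REFRESH_COMMANDS, REFRESH_WINDOW_S, T_RFC_S)
stall = stall_cycles(T_RFC_S, CPU_CLOCK_HZ)

print(f"one refresh every {interval_s * 1e6:.2f} us")
print(f"refresh overhead: {overhead * 100:.2f} % of the time")
print(f"stall per collision: about {stall} CPU cycles")
```

Under these assumptions the memory is busy refreshing for under one percent of the time, and a collision costs a couple of CPU cycles. A real controller also has to close and reopen rows around a refresh, so treat the stall figure as a lower bound.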

Finally, there is the video controller, which can pull the bitmaps from SDRAM of any kind. I would start from here: opencores.org/project,vga_lcd. This project by Richard Herveille provides a 46-page manual which seems very well written. I have not studied the actual HDL yet.

My conclusion is that LOLA-generated Verilog is a great proof of principle and very educational. But if we want to see V4, S3, or CP running on the FPGA (which are worthy goals!), then we need to tackle the subject matter with the full repository of available tools and IP cores. Otherwise we will stay where we are now. We can consider building a board with lots of SRAM or ZBT RAM instead of SDRAM or DDRx, but this will get both expensive and unwieldy at more than a few megs.

Just my two groszes. (A "grosz" is a Polish penny.)

W.
________________________________________
From: Oberon [oberon-bounces at lists.inf.ethz.ch] on behalf of Jörg [joerg.straube at iaeth.ch]
Sent: Wednesday, October 4, 2017 5:25 PM
To: ETH Oberon and related systems
Subject: [Oberon] RISC-5 and memory

Wojtek

some remarks on memory.

> On the other hand, the one megabyte FPGA card just happened to be available. It is not the only possible solution even in the FPGA world. Let me compare memory prices to make it clear. Two 0.5 MB chips type IS61LV25616AL-10TL comprising the 1 megabyte cost $9.26 at DigiKey. A single 512 megabyte chip type AS4C256M16D3LB-12BCN costs $10.59. So we are looking at the cost effectiveness differing by orders of magnitude.
>
> These are different technologies. One is simple to use, while the other is much harder. On the other hand, boards using the latter have been built. (For example, Arty from Digilent.) It is not that clear to me that the language definition and compiler technology should stay at the level of asynchronous SRAM rather than advance into the era of DDR3L.
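
To put a number on the quoted price comparison, here is a quick calculation in Python. The prices and capacities are the ones quoted above; the variable and function names are just for illustration.

```python
# Prices and capacities as quoted in the message above (DigiKey).
sram_price_usd = 9.26     # two IS61LV25616AL-10TL chips, 0.5 MB each
sram_capacity_mb = 1.0
ddr3_price_usd = 10.59    # one AS4C256M16D3LB-12BCN chip
ddr3_capacity_mb = 512.0

def usd_per_mb(price_usd, capacity_mb):
    """Cost of one megabyte in US dollars."""
    return price_usd / capacity_mb

sram_cost = usd_per_mb(sram_price_usd, sram_capacity_mb)
ddr3_cost = usd_per_mb(ddr3_price_usd, ddr3_capacity_mb)
ratio = sram_cost / ddr3_cost

print(f"SRAM:  {sram_cost:.2f} USD/MB")
print(f"DDR3L: {ddr3_cost:.4f} USD/MB")
print(f"SRAM costs about {ratio:.0f}x more per megabyte")
```

The ratio comes out at roughly 450, so between two and three orders of magnitude per megabyte.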

First, let me start with the statement: I’m not a HW guru. So, I can be completely wrong.
I took your arguments and started to investigate the reasons why NW chose SRAM instead of the much larger and cheaper SDRAM.

The key thing in ProjectOberon is not the language or the compiler; the key topic is his own CPU, the "RISC-5".
As with the Oberon language, the Oberon OS and the Oberon compiler, NW seems to follow the same principle for the Oberon CPU: make it simple but not simpler.

I studied the RISC-5 Verilog code and googled a bit because I was wondering how today’s CPUs tackle the fact that SDRAM is MUCH slower than SRAM. I found that today's CPUs implement several optimization techniques, e.g. pipelining. But I think the fundamental way to overcome the low speed of SDRAM is that today’s architectures use caches: either only an L1 cache or a combination of L1 and L2 caches sits in front of the slow SDRAM.

To keep the CPU design simple, the RISC-5 implements neither a pipeline nor a two-stage cache in front of the RAM.
I come to the conclusion that, in today’s CPU terminology, the whole SRAM in ProjectOberon can be seen as one big cache.

Or in other words: we would have to add a cache strategy to the RISC-5 environment to use SDRAM.
If we did that, I have no clue whether we would then also be forced to introduce special video RAM.
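
To illustrate what I mean by a cache strategy, here is a toy direct-mapped cache model in Python. It is only a sketch: the cache geometry, the 1-cycle hit and 10-cycle miss costs, and the loop workload are all assumptions I made up for illustration, and nothing like this exists in the RISC-5 sources.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks only tags, not data."""

    def __init__(self, num_lines, words_per_line):
        self.num_lines = num_lines
        self.words_per_line = words_per_line
        self.tags = [None] * num_lines
        self.hits = 0
        self.misses = 0

    def access(self, word_addr):
        """Record one access; return True on a hit, False on a miss."""
        block = word_addr // self.words_per_line
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag   # line fill from the slow memory
        self.misses += 1
        return False

def average_access_cycles(cache, hit_cycles, miss_cycles):
    """Mean cost per access for the accesses recorded so far."""
    total = cache.hits + cache.misses
    return (cache.hits * hit_cycles + cache.misses * miss_cycles) / total

# Hypothetical geometry: 64 lines of 8 words each.
cache = DirectMappedCache(num_lines=64, words_per_line=8)

# Hypothetical workload: a 100-word loop executed 50 times.
for _ in range(50):
    for addr in range(100):
        cache.access(addr)

hit_rate = cache.hits / (cache.hits + cache.misses)
avg = average_access_cycles(cache, hit_cycles=1, miss_cycles=10)
print(f"hit rate {hit_rate:.4f}, average {avg:.4f} cycles per access")
```

In this toy case only the first pass through the loop misses (13 line fills), so the average cost stays close to the hit cost. Real programs and a video controller competing for the same memory would behave less kindly.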

I’d like to get feedback.

br
Jörg

--
Oberon at lists.inf.ethz.ch mailing list for ETH Oberon and related systems
https://lists.inf.ethz.ch/mailman/listinfo/oberon