[Oberon] RISC-5 and memory

Fri Oct 6 03:48:00 CEST 2017

Magnus:

  thank you for a detailed response. Before I respectfully disagree with some of your points, let me say that today we connected your Pepino to a VGA monitor, a keyboard, and a mouse. Pepino is now blinking a red LED, waiting for two students who tomorrow will start learning how to set up the Oberon System. Perhaps they will also learn a few things about the FPGA.

Now let me explain my goal and position. First, I want to use the Oberon System embedded in the FPGA, rather than connected outside the FPGA. We have a gigahertz ARM to do the latter. A photo of my MicroBone is available at skutek.com --> ARM System on module (left column). This board has 512 MB of DDR3, a video chip, and even a GPU. We will use it for deeply embedded data processing. The video chip is a leftover from the BeagleBone design. I decided to keep it as a future option, though we are not using the display now. Anyone can take this board and develop the Oberon System running on the Sitara ARM, with keyboard, display, etc. The same is true of the BeagleBone, which is every inch a desktop computer despite recent opinions in this forum. These opinions were misinformed. I can tell you a lot about both the BeagleBone and the MicroBone, because I developed the latter based on the former. The Beagle/Micro combo can run a native Oberon the same way as a Raspberry Pi can, or many other similar boards. But that port would first have to be developed, which is rather unlikely.

So why am I looking at the 25 MHz RISC5, if I have my own board running a gigahertz ARM? A few short answers: (1) If a CPU is embedded, then you do not need to write the data out. You can process the data in situ. (2) I am sick and tired of Linux, though I am using it and I will keep using it in future. (3) I want to experiment. (4) This CPU exists and it is running Oberon, though not exactly the way I want. 

The last point is incredibly important. There were many "coulds" in the recent discussions, while the FPGA-based Oberon is a "can", though it leaves a few things to be desired. I may list misfeatures of this system, but there is one undeniable feature: it runs. The students will start experiencing it tomorrow. It is a big asset.

Now concerning the possible future. It turns out that the Series-7 can interface a single 256 MB DDR3 chip using one IO bank, using a 16-bit chip such as AS4C128M16D3L. It is a very reasonable RAM size for a small system. Among the Series-7 parts, Artix-7 seems the most reasonable (to me) for a small system. The max data rate is then limited to 800 Mb/s per pin, which is half of what the memory chip could do (Artix-7 data sheet, Table 16). It means that a single memory chip, costing about $7, can provide a max burst data rate of 16*800 = 12,800 megabits/s = 1,600 MB/s. I do understand it is a peak rate. I do understand there are the refresh cycles. I do understand we will need some buffering to provide a smooth video stream. But I would maintain that the disparity between the RISC5 CPU (or any other soft CPU) and the memory bandwidth is sufficiently huge that we (probably) do not need to worry about the memory bandwidth too much. This memory can (probably) swamp any soft core with data faster than the core can process it, even if the core is running at close to 100 MHz. (I hope that RISC5 can get beefed up to about 100 MHz.)
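
To make the arithmetic easy to check, here is a minimal Python sketch of the peak-rate estimate. The function name is just an illustration. The 16-bit / 800 Mb/s figures are the ones above, and the 32-bit / 25 MHz figures are the single-cycle SRAM numbers from the quoted message below. These are peak rates only; refresh, bank management, and bus turnaround are ignored.

```python
def peak_bandwidth_mb_per_s(bus_width_bits, rate_mbit_per_pin):
    """Peak transfer rate in MB/s: pins * Mb/s per pin / 8 bits per byte."""
    return bus_width_bits * rate_mbit_per_pin / 8

# One 16-bit DDR3 chip at 800 Mb/s per pin (Artix-7 limit cited above).
ddr3_peak = peak_bandwidth_mb_per_s(16, 800)

# Present 32-bit SRAM, one access per 25 MHz clock (25 Mb/s per data pin).
sram_peak = peak_bandwidth_mb_per_s(32, 25)

print(f"DDR3 peak: {ddr3_peak:.0f} MB/s")
print(f"SRAM peak: {sram_peak:.0f} MB/s")
print(f"Ratio:     {ddr3_peak / sram_peak:.0f}x")
```

On paper the single DDR3 chip comes out 16 times faster than the present SRAM. How much of that survives a real controller is exactly the part I would like to see challenged.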

So I am quite optimistic that a design with one Artix-7, speed grade 2, and one DDR3L chip with a 16-bit interface (e.g., AS4C128M16D3L) can provide enough bandwidth for 8-bit color, or even 16-bit.
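
A rough sketch of that comparison, in Python, using the 75 MHz pixel clock and the 100 MB/s SRAM figure from the quoted message below, plus my 1,600 MB/s DDR3 peak estimate. It assumes a pixel is fetched on every pixel clock, as the quoted estimate does, so it is an upper bound: no pixels are fetched during blanking. The names are illustrative only.

```python
PIXEL_CLOCK_MHZ = 75      # display pixel clock from the quoted message
SRAM_PEAK_MB_S = 100.0    # 32-bit single-cycle SRAM at 25 MHz
DDR3_PEAK_MB_S = 1600.0   # 16 pins * 800 Mb/s, peak estimate only

def video_bandwidth_mb_per_s(bits_per_pixel, pixel_clock_mhz=PIXEL_CLOCK_MHZ):
    """Upper bound on display refresh traffic in MB/s."""
    return pixel_clock_mhz * bits_per_pixel / 8

rows = []
for bpp in (1, 8, 16):
    need = video_bandwidth_mb_per_s(bpp)
    rows.append((bpp, need, need / SRAM_PEAK_MB_S, need / DDR3_PEAK_MB_S))

for bpp, need, sram_share, ddr3_share in rows:
    print(f"{bpp:2d} bpp: {need:7.3f} MB/s, "
          f"{sram_share:7.2%} of SRAM peak, {ddr3_share:6.2%} of DDR3 peak")
```

At 16 bits/pixel the display needs 150 MB/s, which is 1.5 times the SRAM peak but less than a tenth of the DDR3 peak. The open question is the sustained rate, not the peak.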

For my needs the 8-bit color would be entirely OK, if it is mapped to 24-bit via the usual color lookup table. The IP core described on OpenCores is doing just that: opencores.org/project,vga_lcd.
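
To illustrate what I mean by the lookup table, here is a small software model in Python. It is only a sketch of the general technique, not the logic of the OpenCores core: the 3-3-2 palette and both function names are my own assumptions for the example. In hardware the table would be a 256 x 24-bit RAM sitting between the frame buffer and the video output.

```python
def make_rgb332_palette():
    """Build a hypothetical 256-entry palette of 24-bit 0xRRGGBB values.

    The index is split as 3 bits red, 3 bits green, 2 bits blue, and each
    field is scaled to the full 0..255 range.
    """
    palette = []
    for index in range(256):
        r3 = (index >> 5) & 0x7
        g3 = (index >> 2) & 0x7
        b2 = index & 0x3
        r = r3 * 255 // 7
        g = g3 * 255 // 7
        b = b2 * 255 // 3
        palette.append((r << 16) | (g << 8) | b)
    return palette

def expand_scanline(indices, palette):
    """Map 8-bit frame buffer pixels to 24-bit colors, one lookup per pixel."""
    return [palette[i] for i in indices]

palette = make_rgb332_palette()
scanline = bytes([0x00, 0xE0, 0x1C, 0x03, 0xFF])  # black, red, green, blue, white
for color in expand_scanline(scanline, palette):
    print(f"{color:06X}")
```

The point is that the frame buffer traffic stays at one byte per pixel, while the palette can be reloaded by software to pick any 256 of the 24-bit colors.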

Finally, concerning the HDL development. I am a big fan of the Oberon software. I am a big fan of RISC5 because it is running this software, though honestly I would welcome any other soft core doing this. (If it is not taking the entire FPGA!) I am *not* a big fan of the LOLA-based coding style. One of the first things which must happen is a rework of the present HDL code. The CPU must get interfaced to the Wishbone. The present peripherals must either be replaced with their Wishbone counterparts, or get a face lift to get interfaced. 

The present memory controller and the vid.v module must be replaced with the Wishbone-based replacements using the DDR3 PHY interface which Artix is providing.

Now please feel free to poke holes in my estimates.

Thank you,
W.

PS: I removed most of the previous posts, except yours.

________________________________________
From: Oberon [oberon-bounces at lists.inf.ethz.ch] on behalf of Magnus Karlsson [magnus at saanlima.com]
Sent: Thursday, October 5, 2017 1:04 PM
To: oberon at lists.inf.ethz.ch
Subject: Re: [Oberon] RISC-5 and memory

There are two different needs discussed here - more memory and color
display.

A limited amount of memory can be added without any change to the RISC-5
architecture.  I used to sell a 2MB Pepino board well suited for Project
Oberon for $25 more than the 1MB version, and in principle an 8MB
SRAM-based RISC-5 board should be straightforward to design.  However,
in my mind SRAM makes no sense from a cost standpoint beyond 8MB.  Note
that using non-SRAM memory will have a huge impact on the RISC-5
architecture on many levels - there needs to be a fairly complex memory
controller implemented and both instruction and data caches need to be
added to avoid performance loss, and the current deterministic behavior
of the RISC-5 system will be lost once you have a cache-based system,
which might have a big impact on realtime systems.

Adding color display is a different story since you need both more
memory to store the pixels, and more memory bandwidth to pump out the
pixels to the screen. Today the RISC-5 system runs at 25 MHz so the
single cycle 32-bit SRAM memory can provide up to 100 MB/s of
bandwidth.  The display pixel clock is 75 MHz and with 1 bit/pixel the
memory bandwidth requirement is 75 MHz/8 = 9.375 MB/s.  This memory
bandwidth is "stolen" from the CPU by stalling it for one clock about
10% of the time.  Going to color with say 16 bits/pixel would increase the
memory bandwidth requirement to 150 MB/s (75 MHz * 2 bytes/pixel), which
is way more than the RISC-5 memory architecture can provide and will
force a radical architecture change to a memory system that can provide
much higher bandwidth.  And then you will have the performance side
effect of having to render 16 bits/pixel vs rendering 1 bit/pixel so
there might be a need to increase the CPU clock rate from 25 MHz to
compensate for the performance loss.  All this points to a completely
different architecture instead of RISC-5 if you truly want color.  If
you want to stay in a Xilinx FPGA-based system that would mean a switch
to either a Microblaze or a MIPS-based soft-CPU architecture with a
high-performance memory system, which is readily available to use at
fairly low effort level, but comes at the expense of porting the
Oberon-07 system code to a different CPU architecture.  See this forum
thread about running DOOM on Pipistrello, which shows the Pipistrello
board with 64 MB of LPDDR memory that implements a Microblaze 100 MHz
CPU with instruction and data caches and 16 bit/pixel video system that
could be a starting point for an Oberon-07 system implemented on an FPGA
board  (sorry about the gun violence in the videos):
saanlima.com/forum/viewtopic.php?f=9&t=1232

Then of course you could decide to ditch the FPGA requirement
altogether and do it on one of the many ARM-based boards out there for
very little money.

Magnus