[Oberon] RISC5 implementation issues.

Walter Gallegos walter at waltergallegos.com
Wed Feb 17 23:16:48 CET 2016

Hi Magnus,

You are welcome to continue with FPGA-specific topics by private e-mail 
if you want.


On 2016-02-17 at 18:30, Magnus Karlsson wrote:
> Hi Walter,
> Since this is really Paul's design, I guess it would be more 
> appropriate to discuss it with him; I was just trying to explain why 
> it looks the way it does.
> Cheers,
> Magnus
> On 2/17/2016 1:15 PM, Walter Gallegos wrote:
>> Magnus,
>> Some messages were delayed, so I will continue from here to avoid 
>> overloading the list.
>> If I understand you correctly, you justify an uncontrolled delay 
>> because it simplifies the SRAM handling.
>> Sorry, but that is like using the old circuit with an AND gate and an 
>> inverter to generate a pulse. If you need a delayed signal, you should 
>> use the DCM 90°, 180° or 270° clock outputs and keep everything under 
>> control; I don't think you need a state machine in this case.
>> About ISE warnings: be careful, the absence of warnings does not mean 
>> good methodology.
>> About the Xilinx docs: really, I don't remember. I cover this problem 
>> in my training courses, first as a Xilinx ATP and now as an independent 
>> consultant. Having an uncontrolled delay in a clock opens the door to 
>> random problems. FPGA design must be synchronous at all times; no 
>> exceptions.
>> Regards,
>> Walter
>> On 2016-02-17 at 14:41, Magnus Karlsson wrote:
>>> Walter,
>>> I agree with you that the "pure" way of doing this is as you stated, 
>>> with a DCM to directly generate both clk and pclk.  So how come Paul 
>>> didn't do that?  It's not like he doesn't know how to use the DCM; 
>>> after all, the current code generates pclk from clk using a DCM, and 
>>> there would probably be less code to do it like you suggest.  No, 
>>> the reason for this is very subtle and is easy to miss if you just 
>>> take a quick look at the code, and it has to do with the asynchronous 
>>> SRAM interface.
>>> One of the most critical aspects of using SRAM is to control the 
>>> write signal - ideally the write signal should be asserted after all 
>>> other control signals (like address, data, byte-enable, read, oe) 
>>> are valid, and should be de-asserted well before any of the other 
>>> control signals go invalid, to avoid spurious writes. However, this 
>>> is not that easy to do in a synchronous system where all signals 
>>> change at the clock edge.  The most common way to do this is to have 
>>> a state machine that is clocked at, say, 4x the CPU clock so that you 
>>> can divide the SRAM access cycle into several phases and assert the 
>>> write signal on some of those phases.
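>>> A minimal sketch of such a phased write strobe, under stated 
>>> assumptions: the names SramWritePhases, clk4x, wr_req and sram_we_n 
>>> are hypothetical, and address and data are assumed to change only at 
>>> the boundary between phase 3 and phase 0 (deriving the CPU clock so 
>>> that this holds is not shown).
>>>
>>> ```verilog
>>> // Sketch only: a phase counter clocked at 4x the CPU clock.
>>> // All names here are hypothetical, not taken from RISC5.
>>> module SramWritePhases(
>>>     input  clk4x,                  // 4x the CPU clock
>>>     input  wr_req,                 // held high for the whole write cycle
>>>     output reg sram_we_n = 1'b1);  // active-low SRAM write enable
>>> reg [1:0] phase = 2'd0;            // four phases per CPU cycle
>>> always @(posedge clk4x) begin
>>>     phase <= phase + 2'd1;
>>>     // The registered output lags the sampled phase by one tick, so
>>>     // WE is low during phases 1 and 2 only. Phase 0 gives address
>>>     // and data setup time, phase 3 gives hold time.
>>>     sram_we_n <= ~(wr_req & ((phase == 2'd0) | (phase == 2'd1)));
>>> end
>>> endmodule
>>> ```
>>>
>>> The cost is the extra 4x clock and the counter, which is what the 
>>> approach described next avoids.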
>>> However, this is not the way Paul chose to do it; instead he chose 
>>> to do a less "pure" clock generation by generating clk from a 
>>> flip-flop rather than from a DCM.  By doing so, he actually 
>>> generates an early version of the clock signal called clk that is 
>>> leading the global clock signal clk_BUFG by the delay of the BUFG 
>>> buffer.  Since this early version of the clock signal is generated 
>>> like any other logic signal, he could use this signal to gate the 
>>> write signal to the SRAM such that the write signal will be 
>>> de-asserted well before the other control signals (clocked by 
>>> clk_BUFG) change, thus avoiding the need to have a state machine 
>>> controlling the write signal.  The price for this is that the clock 
>>> signal is now generated in a less "pure" way, but still a valid way 
>>> as long as you know what you are doing.  The BUFG clock driver can 
>>> be driven from a PLL, a DCM or from the logic fabric.  The first two 
>>> are speed-optimized paths going directly from the PLL or DCM to the 
>>> BUFG and can be clocked at a much higher clock rate, while the logic 
>>> fabric path is limited by the maximum clock rate of the logic 
>>> fabric.  However, at the clock rate we use (25 MHz) this is not an 
>>> issue.  When you do this there are no warnings generated by ISE that 
>>> this is not a good idea, and I have not read anywhere in the Xilinx 
>>> clocking resource guide that you should avoid doing this. Basically, 
>>> the BUFG clock driver is designed to do this, the tool will allow 
>>> you to do it and at the clock rate we use it has no performance 
>>> implications.  As I see it, this is another place where the goal of 
>>> simplification has driven the implementation of the system at the 
>>> expense of a slightly less "pure" clock generation.
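>>> The gating described above can be sketched roughly as follows. This 
>>> is an illustration under assumptions, not the RISC5 source: the names 
>>> EarlyClockWrite, wr_in, wr and sram_we_n are hypothetical, and the 
>>> BUFG is instantiated explicitly here only to make the two clock nets 
>>> visible (it comes from the Xilinx primitive library).
>>>
>>> ```verilog
>>> // Sketch only: gating the SRAM write enable with the early clock.
>>> module EarlyClockWrite(
>>>     input  CLK50M,      // 50 MHz board oscillator
>>>     input  wr_in,       // write request from the rest of the design
>>>     output sram_we_n);  // active-low SRAM write enable
>>> reg  clk = 1'b0;        // early 25 MHz clock, an ordinary logic signal
>>> wire clk_BUFG;          // the same clock after the global buffer
>>> reg  wr  = 1'b0;
>>> always @(posedge CLK50M) clk <= ~clk;      // divide 50 MHz by 2
>>> BUFG clk_buf (.I(clk), .O(clk_BUFG));      // adds the BUFG delay
>>> always @(posedge clk_BUFG) wr <= wr_in;    // control changes on clk_BUFG
>>> // WE is low only while wr is set and the early clock is low. It goes
>>> // high on the early rising edge, before the registers clocked by
>>> // clk_BUFG (wr, and likewise address and data) can change.
>>> assign sram_we_n = ~wr | clk;
>>> endmodule
>>> ```
>>>
>>> The margin between the write enable rising and the other signals 
>>> changing is set by the BUFG and routing delay, which is the delay 
>>> Walter calls uncontrolled.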
>>> Just my 2c
>>> Magnus
>>> On 2/17/2016 4:18 AM, Walter Gallegos wrote:
>>>> Hi Paul,
>>>> My apologies for going off topic; Project Oberon is basically a 
>>>> learning tool, so some hardware comments should not hurt.
>>>> Yes, the tools add the clock buffers, because the flip-flops' 
>>>> edge-triggered clock inputs must be connected to the clock 
>>>> distribution tree, but this does not correct the issue.
>>>> The problem is that, to connect a flip-flop output (as the clk 
>>>> signal is) to the clock buffer input, the signal needs to be routed 
>>>> through general-purpose lines and the interconnect matrix. This 
>>>> generates an uncontrolled delay.
>>>> This issue has only a minor effect in RISC5 because it is a very 
>>>> special case where the whole project is self-contained.
>>>> A correct technique could be this: RISC5 already uses a DCM to 
>>>> generate 75 MHz, so use the same DCM to generate both 25 MHz (CLKDV) 
>>>> and 75 MHz (CLKFX) from 50 MHz (CLKIN).
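>>>> A minimal sketch of this suggestion, not the actual RISC5 source. 
>>>> The names ClockGen, clk25, clk75 and locked are hypothetical, the 
>>>> feedback wiring is only one possible choice, and the DCM port and 
>>>> parameter names should be checked against the Xilinx libraries 
>>>> guide for the target device.
>>>>
>>>> ```verilog
>>>> // Sketch only: one DCM derives both clocks from the 50 MHz input.
>>>> module ClockGen(
>>>>     input  CLK50M,   // 50 MHz board oscillator (CLKIN)
>>>>     output clk25,    // 25 MHz: CLKDV = CLKIN / 2
>>>>     output clk75,    // 75 MHz: CLKFX = CLKIN * 3 / 2
>>>>     output locked);  // high once the DCM outputs are stable
>>>> wire clk0, clk0_buf, clkdv, clkfx;
>>>> DCM #(
>>>>     .CLKIN_PERIOD(20.0),   // 50 MHz input period in ns
>>>>     .CLK_FEEDBACK("1X"),   // feedback taken from CLK0
>>>>     .CLKDV_DIVIDE(2.0),    // 50 MHz / 2 = 25 MHz
>>>>     .CLKFX_MULTIPLY(3),    // 50 MHz * 3 / 2 = 75 MHz
>>>>     .CLKFX_DIVIDE(2)
>>>> ) dcm (
>>>>     .CLKIN(CLK50M), .CLKFB(clk0_buf), .RST(1'b0),
>>>>     .PSEN(1'b0), .PSCLK(1'b0), .PSINCDEC(1'b0),  // phase shift unused
>>>>     .CLK0(clk0), .CLKDV(clkdv), .CLKFX(clkfx), .LOCKED(locked));
>>>> // Dedicated routes from the DCM outputs to the global clock buffers.
>>>> BUFG fb_buf (.I(clk0),  .O(clk0_buf));
>>>> BUFG dv_buf (.I(clkdv), .O(clk25));
>>>> BUFG fx_buf (.I(clkfx), .O(clk75));
>>>> endmodule
>>>> ```
>>>>
>>>> Both clocks then reach the clock tree through dedicated paths, with 
>>>> no general-purpose routing on the clock.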
>>>> Regards,
>>>> Walter
>>>> On 2016-02-17 at 06:39, Paul Reed wrote:
>>>>> Hi Walter,
>>>>>> So, RISC5 uses general-purpose resources to route a clock signal.
>>>>> I agree with Magnus, the tools add the relevant clock buffer as part of
>>>>> their job, and the source code is kept simple and clear.
>>>>> FPGAs are a little off-topic for many Oberoners, but hopefully the below
>>>>> simple hardware LED counter for the Spartan 3 board (easily adapted to
>>>>> almost any other board!) might be indulged, and interesting for enough
>>>>> people :)
>>>>> If you create a project in Xilinx ISE for the xc3s200-4ft256 and add these
>>>>> source files, then "Generate Programming File", then as far as I can see
>>>>> from the reports, the tools add the appropriate clock buffers and global
>>>>> resources - correct me if I'm wrong!
>>>>> Cheers,
>>>>> Paul
>>>>> (test.v)
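>>>>> `timescale 1ns / 1ps
>>>>> module TestTop(
>>>>>      input CLK50M,          // 50 MHz board oscillator
>>>>>      output [7:0] leds);
>>>>> reg clk = 1'b0;             // 25 MHz clock made by a flip-flop
>>>>> reg [31:0] cnt = 32'd0;     // initial values give simulation a known start
>>>>> assign leds = cnt[31:24];   // show the top byte of the counter
>>>>> always @(posedge clk)       // 25 MHz
>>>>>    cnt <= cnt + 1;
>>>>> always @(posedge CLK50M) clk <= ~clk;  // divide 50 MHz by 2
>>>>> endmodule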
>>>>> (test.ucf)
>>>>> NET "CLK50M" LOC = "T9" ;
>>>>> NET "leds[0]" LOC = "K12";
>>>>> NET "leds[1]" LOC = "P14";
>>>>> NET "leds[2]" LOC = "L12";
>>>>> NET "leds[3]" LOC = "N14";
>>>>> NET "leds[4]" LOC = "P13";
>>>>> NET "leds[5]" LOC = "N12";
>>>>> NET "leds[6]" LOC = "P12";
>>>>> NET "leds[7]" LOC = "P11";
>>>>> -- 
>>>>> Oberon at lists.inf.ethz.ch mailing list for ETH Oberon and related 
>>>>> systems
>>>>> https://lists.inf.ethz.ch/mailman/listinfo/oberon


Walter Daniel Gallegos
Programmable Logic & Software
Consulting, Design, Training.
Montevideo, Uruguay
EMAIL walter at waltergallegos.com
Tel +598 26 23 44 60 | Cel +598 99 18 58 88
