[Oberon] RISC5 implementation issues.

Wed Feb 17 22:15:26 CET 2016

Magnus,

Some of the messages were delayed, so I will continue from here to avoid 
overloading the list.

If I understand you correctly, you justify an uncontrolled delay because 
it simplifies the SRAM handling.
Sorry, but that is like using the old circuit with an AND gate and an 
inverter to generate a pulse. If you need a delayed signal, you should use 
the DCM 90°, 180° or 270° clock outputs and keep everything under control; 
I don't think you need a state machine in this case.
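
A minimal sketch of the DCM arrangement suggested earlier in this thread, assuming the Spartan-3 DCM, IBUFG and BUFG primitives from the Xilinx unisim library. The module name ClockGen and all net names are illustrative and are not taken from the RISC5 sources. One DCM derives 25 MHz (CLKDV), 75 MHz (CLKFX) and a 180° output from the 50 MHz input, so every clock reaches its BUFG over a dedicated path:

```verilog
// Sketch only: requires the Xilinx unisim primitives (IBUFG, DCM, BUFG).
// ClockGen and the net names below are hypothetical, not from RISC5.
module ClockGen(
    input  CLK50M,   // 50 MHz board oscillator
    output clk,      // 25 MHz system clock (CLKDV = CLKIN / 2)
    output pclk,     // 75 MHz clock        (CLKFX = CLKIN * 3 / 2)
    output clk180,   // 50 MHz, shifted 180 degrees from CLK0
    output locked);  // high once the DCM outputs are stable

wire clkin, clk0, clk0_fb, clkdv, clkfx, clk180_u;

IBUFG ibuf(.I(CLK50M), .O(clkin));

DCM #(
    .CLKIN_PERIOD(20.0),     // 50 MHz input
    .CLK_FEEDBACK("1X"),     // feedback taken from CLK0
    .CLKDV_DIVIDE(2.0),      // 50 MHz / 2     = 25 MHz
    .CLKFX_MULTIPLY(3),      // 50 MHz * 3 / 2 = 75 MHz
    .CLKFX_DIVIDE(2)
) dcm (
    .CLKIN(clkin), .CLKFB(clk0_fb), .RST(1'b0),
    .PSEN(1'b0), .PSINCDEC(1'b0), .PSCLK(1'b0), .DSSEN(1'b0),
    .CLK0(clk0), .CLK180(clk180_u), .CLKDV(clkdv), .CLKFX(clkfx),
    .LOCKED(locked));

// Every DCM output goes straight to a global clock buffer.
BUFG b_fb  (.I(clk0),     .O(clk0_fb));
BUFG b_clk (.I(clkdv),    .O(clk));
BUFG b_pclk(.I(clkfx),    .O(pclk));
BUFG b_180 (.I(clk180_u), .O(clk180));

endmodule
```

Whether a phase-shifted output such as clk180 gives enough margin for a particular SRAM depends on that part's timing, which this sketch does not address.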

About ISE warnings: be careful, the absence of warnings does not mean good 
methodology.

About the Xilinx docs: really, I don't remember. I give training courses, 
first as a Xilinx ATP and now as an independent consultant, and I cover 
this problem in them. Having an uncontrolled delay in a clock opens a big 
door to random problems. FPGA design must be synchronous at all times; no 
exceptions.

Regards,
Walter

On 2016-02-17 at 14:41, Magnus Karlsson wrote:
> Walter,
>
> I agree with you that the "pure" way of doing this is as you stated, 
> with a DCM to directly generate both clk and pclk.  So how come Paul 
> didn't do that?  It's not like he doesn't know how to use the DCM, 
> after all the current code generates pclk from clk using a DCM, and 
> there would probably be less code to do it like you suggest.  No, the 
> reason for this is very subtle and is easy to miss if you just take a 
> quick look at the code, and it has to do with the asynchronous SRAM 
> interface.
>
> One of the most critical aspects of using SRAM is to control the write 
> signal - ideally the write signal should be asserted after all other 
> control signals (like address, data, byte-enable, read, oe) are valid, 
> and should be de-asserted well before any of the other control signals 
> go invalid, to avoid spurious writes. However, this is not that easy 
> to do in a synchronous system where all signals change at the clock 
> edge.  The most common way to do this is to have a state machine that 
> is clocked at, say, 4x the CPU clock so that you can divide the SRAM 
> access cycle into several phases and assert the write signal on some 
> of those phases.
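>
> As a rough illustration of that state-machine approach (not code from 
> the RISC5 sources; the module and signal names are hypothetical, and it 
> assumes rst is released in step with a CPU clock edge), the write 
> strobe below is active only in the middle two of four phases:
>
> ```verilog
> // Sketch only: SramWritePhase, clk4x, wr and we_n are hypothetical names.
> module SramWritePhase(
>     input      clk4x,  // 4x the CPU clock: one SRAM access = 4 phases
>     input      rst,    // synchronous reset, restarts the phase count at 0
>     input      wr,     // write request, stable for the whole CPU cycle
>     output reg we_n);  // active-low SRAM write enable
>
> reg [1:0] phase;       // current phase of the access cycle, 0..3
>
> always @(posedge clk4x) begin
>     if (rst) begin
>         phase <= 2'd0;
>         we_n  <= 1'b1;
>     end else begin
>         phase <= phase + 2'd1;
>         // we_n is registered, so it is low during phases 1 and 2 only:
>         // address and data settle in phase 0 and are held through phase 3.
>         we_n  <= ~(wr & (phase == 2'd0 || phase == 2'd1));
>     end
> end
>
> endmodule
> ```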
>
> However, this is not the way Paul chose to do it; instead he chose 
> to do a less "pure" clock generation by generating clk from a 
> flip-flop rather than from a DCM.  By doing so, he actually generates 
> an early version of the clock signal called clk that is leading the 
> global clock signal clk_BUFG by the delay of the BUFG buffer.  Since 
> this early version of the clock signal is generated like any other 
> logic signal, he could use this signal to gate the write signal to the 
> SRAM such that the write signal will be de-asserted well before the 
> other control signals (clocked by clk_BUFG) change, thus avoiding 
> the need to have a state machine controlling the write signal.  The 
> price for this is that the clock signal is now generated in a less 
> "pure" way, but still a valid way as long as you know what you are 
> doing.  The BUFG clock driver can be driven from a PLL, a DCM or from 
> the logic fabric.  The first two are speed optimized paths going 
> directly from the PLL or DCM to the BUFG and can be clocked at a much 
> higher clock rate, while the logic fabric path is limited by the 
> maximum clock rate of the logic fabric.  However, at the clock rate we 
> use (25 MHz) this is not an issue.  When you do this there are no 
> warnings generated by ISE that this is not a good idea, and I have not 
> read anywhere in the Xilinx clocking resource guide that you should 
> avoid doing this.  Basically, the BUFG clock driver is designed to do 
> this, the tool will allow you to do it and at the clock rate we use it 
> has no performance implications.  As I see it, this is another place 
> where the goal of simplification has driven the implementation of the 
> system at the expense of a slightly less "pure" clock generation.
>
> Just my 2c
>
> Magnus
>
>
> On 2/17/2016 4:18 AM, Walter Gallegos wrote:
>> Hi Paul,
>>
>> My apologies for going off topic; Project Oberon is basically a 
>> learning tool, so some hardware comments should not hurt.
>>
>> Yes, the tools add the clock buffers because the FF clock edge 
>> detectors must be connected to the clock distribution tree, but this 
>> does not correct the issue.
>> The problem is that, to connect an FF output (as the clk signal is) to 
>> the clock buffer input, the signal needs to be routed through 
>> general-purpose lines and the interconnect matrix. This generates an 
>> uncontrolled delay.
>> This issue has only a minor effect in RISC5 because it is a very 
>> special case where the whole project is self-contained.
>>
>> A correct technique could be this: RISC5 already uses a DCM to 
>> generate 75 MHz, so use the same DCM to generate both 25 MHz (CLKDV) 
>> and 75 MHz (CLKFX) from 50 MHz (CLKIN).
>>
>> Regards,
>> Walter
>>
>>
>> On 2016-02-17 at 06:39, Paul Reed wrote:
>>> Hi Walter,
>>>
>>>> So, RISC5 uses general-purpose resources to route a clock signal.
>>> I agree with Magnus, the tools add the relevant clock buffer as part of
>>> their job, and the source code is kept simple and clear.
>>>
>>> FPGAs are a little off-topic for many Oberoners, but hopefully the
>>> simple hardware LED counter below for the Spartan 3 board (easily
>>> adapted to almost any other board!) might be indulged, and interesting
>>> for enough people :)
>>>
>>> If you create a project in Xilinx ISE for the xc3s200-4ft256 and add
>>> these source files, then "Generate Programming File", then as far as I
>>> can see from the reports, the tools add the appropriate clock buffers
>>> and global resources - correct me if I'm wrong!
>>>
>>> Cheers,
>>> Paul
>>>
>>>
>>> (test.v)
>>>
>>> `timescale 1ns / 1ps
>>>
>>> module TestTop(
>>>      input CLK50M,   //50MHz
>>>      output [7:0] leds);
>>>
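>>> reg clk = 0;         // 25MHz clock, toggled on every CLK50M edge
>>> reg [31:0] cnt = 0;  // free-running counter; top byte drives the LEDs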
>>>
>>> assign leds = cnt[31:24];
>>>
>>> always @(posedge clk) //25MHz
>>>    cnt <= cnt + 1;
>>>
>>> always @(posedge CLK50M) clk <= ~clk;
>>>
>>> endmodule
>>>
>>> (test.ucf)
>>>
>>> NET "CLK50M" LOC = "T9" ;
>>> NET "leds[0]" LOC = "K12";
>>> NET "leds[1]" LOC = "P14";
>>> NET "leds[2]" LOC = "L12";
>>> NET "leds[3]" LOC = "N14";
>>> NET "leds[4]" LOC = "P13";
>>> NET "leds[5]" LOC = "N12";
>>> NET "leds[6]" LOC = "P12";
>>> NET "leds[7]" LOC = "P11";
>>>
>>>
>>> -- 
>>> Oberon at lists.inf.ethz.ch mailing list for ETH Oberon and related 
>>> systems
>>> https://lists.inf.ethz.ch/mailman/listinfo/oberon
>>>
>>
>

-- 

Walter Daniel Gallegos
Programmable Logic & Software
Consulting, Design, Training.
Montevideo, Uruguay
EMAIL walter at waltergallegos.com
Tel +598 26 23 44 60 | Cel +598 99 18 58 88