[Oberon] RISC5 implementation issues.

Walter Gallegos walter at waltergallegos.com
Wed Feb 24 18:46:28 CET 2016


Apologies, Spanish typo in my post.  "de" => "the".

On 2016-02-24 at 11:07, Walter Gallegos wrote:
>
> DCM #(.CLKFX_MULTIPLY(3), .CLKIN_DIVIDE_BY_2("TRUE"),
>       .CLKIN_PERIOD(20.000))
>   dcm(.CLKIN(CLK50M), .CLKFB(clk), .RST(1'b0), .PSEN(1'b0),
>       .PSINCDEC(1'b0), .PSCLK(1'b0), .DSSEN(1'b0), .CLKFX(pclk),
>       .CLK0(clk), .CLK90(clk90));
>
> clk = CLK0 = CLKIN = CLK50M.  Can this version of RISC5 run at 50 MHz
> as is?
>
> Here is my simulation, using the Cypress SRAM VHDL simulation model with
> the ALDEC simulator.
>
> Label "Potential Issue": if you read the SRAM and, in the next clk
> cycle, try to write on the same SRAM data bus, we could expect some ns
> of bus contention at this point.
> If RISC5 can read and then write in consecutive clk cycles, the issue is
> easily worked around by capturing the SRAM data on the rising edge of
> the 90° clock and releasing oen sooner.
> Important note: this simple functional test bench does not take
> internal RISC5 timing issues into account.
>
> Question to RISC5 fathers and adopters: has someone built an "official"
> test bench set for RISC5?  That means module by module, plus one for
> the whole design.
> Is the test bench available?
>
>
>
> I'm more interested in practical uses, such as an industrial-grade
> embedded RISC5; how many on this list, in addition to those we already
> know, are walking in the same direction?
>
> Best regards
> Walter
>
>
> On 2016-02-23 at 13:49, Magnus Karlsson wrote:
>> Walter,
>>
>> I tried your proposed solution and couldn't make it work.  It 
>> "almost" works in that it starts the boot process and reads from the 
>> SD-card, but then hangs with the LEDs showing 0x84.
>> Maybe I didn't do it correctly; I instantiated the ODDR2 buffer
>> directly at the top module and modified the DCM instantiation to 
>> create the clk90 clock.
>>
>> Here is the code, can you look it over and see if this is what you 
>> had in mind or if something is wrong:
>>
>> DCM #(.CLKFX_MULTIPLY(3), .CLKIN_DIVIDE_BY_2("TRUE"), 
>> .CLKIN_PERIOD(20.000))
>>   dcm(.CLKIN(CLK50M), .CLKFB(clk), .RST(1'b0), .PSEN(1'b0),
>>       .PSINCDEC(1'b0), .PSCLK(1'b0), .DSSEN(1'b0), .CLKFX(pclk), 
>> .CLK0(clk),
>>       .CLK90(clk90));
>>
>> ODDR2 #(.INIT(1'b1))
>>   oddr2(.Q(wr_enable), .C0(clk90), .C1(~clk90), .CE(1'b1), .D0(1'b1),
>>       .D1(~wr), .R(1'b0), .S(1'b0));
>>
>> assign SRwe = wr_enable;
>>
>>
>> wr is the active-high write signal from the RISC5 CPU, and SRwe is 
>> the short active-low write-enable signal to the SRAM.
>>
>> Just to recap, this is the code I used to generate the short write 
>> pulse, using a 2x clk instead.  This version seems to run fine:
>>
>> reg wr_enable;
>>
>> DCM #(.CLKFX_MULTIPLY(3), .CLKFX_DIVIDE(2), .CLKDV_DIVIDE(2), 
>> .CLKIN_PERIOD(20.000))
>>   dcm(.CLKIN(CLK50M), .CLKFB(clk2x), .RST(1'b0), .PSEN(1'b0),
>>       .PSINCDEC(1'b0), .PSCLK(1'b0), .DSSEN(1'b0), .CLKFX(pclk), 
>> .CLKDV(clk), .CLK0(clk2x));
>>
>> always @(negedge clk2x)
>>   wr_enable <= wr & ~wr_enable;
>>
>> assign SRwe = ~wr_enable;
>>
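A minimal simulation sketch of the 2x-clock pulse generator quoted above, for anyone who wants to see where the pulse lands inside the CPU cycle. It assumes ideal clocks in place of the DCM outputs; the testbench name wr_pulse_tb, the divide-by-2 clock and the stimulus are illustrative assumptions, not part of the Pepino project.

```verilog
`timescale 1ns/1ps
// Hypothetical testbench: ideal clocks stand in for the DCM outputs.
module wr_pulse_tb;
  reg clk2x = 1'b0;        // 50 MHz (CLK0 in the post), 20 ns period
  reg clk   = 1'b0;        // 25 MHz (CLKDV in the post), derived below
  reg wr    = 1'b0;        // active-high write request from the CPU
  reg wr_enable = 1'b0;
  wire SRwe = ~wr_enable;  // active-low write enable to the SRAM

  always #10 clk2x = ~clk2x;
  always @(posedge clk2x) clk <= ~clk;  // divide by 2, edges follow clk2x

  // Pulse generator as quoted above.
  always @(negedge clk2x)
    wr_enable <= wr & ~wr_enable;

  // Print the edges so the pulse position can be read off the log.
  always @(posedge clk) $display("%0d ns: clk rises", $time);
  always @(SRwe)        $display("%0d ns: SRwe = %b", $time, SRwe);

  initial begin
    @(posedge clk) wr <= 1'b1;  // request a single write cycle
    @(posedge clk) wr <= 1'b0;
    #40 $finish;
  end
endmodule
```

With these ideal clocks the log should show SRwe low for 20 ns in the middle of the 40 ns clk cycle, starting about 10 ns after one rising edge of clk and ending about 10 ns before the next.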
>>
>> Any idea what's wrong?
>> Magnus
>>
>>
>> On 2/19/2016 9:58 AM, Walter Gallegos wrote:
>>> Let us concentrate on the problem...
>>>
>>> How to generate an output pulse shorter than the clock period?
>>>
>>> You can make a shorter pulse with the help of a DCM and DDR (dual data
>>> rate) output registers; see the simulation capture.
>>>
>>>
>>>
>>> clk25f90 is the DCM clock output with 90° phase shift.
>>>
>>> The VHDL code is (without the DCM instantiation):
>>>
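>>> LIBRARY IEEE;
>>> USE IEEE.STD_LOGIC_1164.ALL;
>>> LIBRARY UNISIM;
>>> USE UNISIM.VCOMPONENTS.ALL;  -- Xilinx primitives, including ODDR2
>>>
>>> ENTITY PulseShaper IS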
>>>     PORT  (CLK25F90 :  IN STD_LOGIC;
>>>              RESET :  IN STD_LOGIC;
>>>                 WE : IN STD_LOGIC;
>>>             WESRAM : OUT STD_LOGIC
>>>          );
>>> END PulseShaper;
>>>
>>> ARCHITECTURE RTL OF PulseShaper IS
>>>
>>>    CONSTANT zero : STD_LOGIC := '0';
>>>    CONSTANT one : STD_LOGIC := '1';
>>>
>>>    SIGNAL clk90n: STD_LOGIC;
>>>
>>> BEGIN
>>>
>>>    -- This is valid because the tool uses the clock inverter
>>>    -- available in the IOB.
>>>    clk90n <= NOT(CLK25F90);
>>>
>>>    Dly : ODDR2
>>>    PORT MAP (
>>>       Q => WESRAM,
>>>       C0 => CLK25F90,
>>>       C1 => clk90n,
>>>       CE => one,
>>>       D0 => one,
>>>       D1 => WE,
>>>       R => zero,
>>>       S => zero
>>>    );
>>>
>>> END RTL;
>>>
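For readers without the Xilinx simulation libraries, here is a rough Verilog sketch of the same arrangement. It is a sketch under assumptions, not the vendor model: oddr2_model is a simplified stand-in that drives Q from D0 on the rising edge of C0 and from D1 on the rising edge of C1, with CE, R and S left out; the delayed clock replaces the DCM 90° output; and the write request is assumed to be active low, like WESRAM. All names in the testbench are hypothetical.

```verilog
`timescale 1ns/1ps
// Simplified behavioural stand-in for ODDR2 (assumption: Q takes D0 on the
// rising edge of C0 and D1 on the rising edge of C1; CE, R and S omitted).
module oddr2_model(output reg q, input wire c0, input wire c1,
                   input wire d0, input wire d1);
  initial q = 1'b1;
  always @(posedge c0) q <= d0;
  always @(posedge c1) q <= d1;
endmodule

// Hypothetical testbench: an ideal delayed clock stands in for the DCM.
module pulse_shaper_tb;
  reg clk   = 1'b0;   // 25 MHz, 40 ns period
  reg clk90 = 1'b0;   // clk delayed by 10 ns (90 degrees of 40 ns)
  reg we    = 1'b1;   // write request, assumed active low like WESRAM
  wire wesram;

  always #20 clk = ~clk;
  always @(clk) clk90 <= #10 clk;

  // Same port mapping as the PulseShaper above: D0 = '1', D1 = WE.
  oddr2_model dly(.q(wesram), .c0(clk90), .c1(~clk90),
                  .d0(1'b1), .d1(we));

  // Print the edges so the pulse can be compared against clk.
  always @(posedge clk) $display("%0d ns: clk rises", $time);
  always @(wesram)      $display("%0d ns: WESRAM = %b", $time, wesram);

  initial begin
    @(posedge clk) we <= 1'b0;  // request a single write cycle
    @(posedge clk) we <= 1'b1;
    #60 $finish;
  end
endmodule
```

The printed edges of WESRAM can then be compared against the rising edges of clk to check where the pulse sits relative to the other SRAM control signals.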
>>> Does anyone know if an FPGA testbench for RISC5 exists?
>>>
>>> Regards,
>>> Walter
>>>
>>> On 2016-02-19 at 11:33, Magnus Karlsson wrote:
>>>> In all fairness, since I have generated and tested code for one way 
>>>> to solve the problem, can you then give us your code proposal for 
>>>> generating the SRAM write signal, with 25MHz and 75MHz generated by 
>>>> a DCM?  The write signal must be asserted after all other SRAM 
>>>> control signals are valid, and be de-asserted before any of the 
>>>> control signals go invalid, and last for at least 5 ns.  The SRAM
>>>> control signals are generated by the 25MHz clock and last for one 
>>>> clock cycle.
>>>>
>>>> I will be more than happy to try it out on a board and report the 
>>>> result.
>>>>
>>>> Magnus
>>>>
>>>>
>>>> On 2/19/2016 4:57 AM, Walter Gallegos wrote:
>>>>> Having a solid and coherent clock distribution is basic for FPGA
>>>>> design; my proposal was to keep both 25 MHz and 75 MHz, generated
>>>>> by the DCM.
>>>>>
>>>>> Running everything at 75 MHz is unnecessary; also, the clock enable
>>>>> approach adds unnecessary complexity to the design. Making a design
>>>>> synchronous doesn't necessarily mean using the same clock for the
>>>>> whole design.
>>>>>
>>>>> Continue using 25 MHz for the core and peripherals; the only
>>>>> concern is the CDC (clock domain crossing). As both clocks are
>>>>> generated by the same DCM, this is a minor problem: corresponding
>>>>> rising edges are aligned by design in DCMs, so metastability is not
>>>>> an issue. When using DCMs and constraining the input clock (50 MHz),
>>>>> the constraint propagation rules constrain both clocks, 25 MHz and
>>>>> 75 MHz. If there are no methodology errors, the whole design is
>>>>> constrained from the first to the last register element in the
>>>>> chain. Beware of combinational logic outside these elements.
>>>>>
>>>>> In my opinion, taking care of the CDC and keeping both clocks is the
>>>>> appropriate solution.
>>>>>
>>>>> Best regards,
>>>>> Walter
>>>>>
>>>>> On 2016-02-18 at 19:58, Magnus Karlsson wrote:
>>>>>> So I have been thinking about this some more and decided to 
>>>>>> modify/update the design to remove all the concerns raised by 
>>>>>> Walter and Wojtek.
>>>>>>
>>>>>> Just to recap, Walter's concern is that the clocks are generated 
>>>>>> using flip-flops and use logic fabric interconnect instead of 
>>>>>> dedicated clocking elements and pathways, and that all clocks 
>>>>>> should be generated by a DCM module instead (DCM = Digital Clock 
>>>>>> Manager). Wojtek's concern is that there are unspecified timing 
>>>>>> relations between the 25MHz and the 75MHz clock domains.
>>>>>>
>>>>>> Both concerns are valid and in my opinion the correct way to fix 
>>>>>> both issues is to make the design completely synchronous.  This 
>>>>>> means that all clocked elements in the design (like flip-flops, 
>>>>>> memories etc.) should be clocked with a single clock signal, 
>>>>>> which in this case is the 75MHz clock.  The CPU and I/O 
>>>>>> subsystem, which before was clocked by a separate 25MHz clock, 
>>>>>> are now also clocked by the 75MHz clock but are only enabled to 
>>>>>> be clocked on every third clock cycle.  This means that all
>>>>>> "always @ (posedge clk)" statements have been changed to include
>>>>>> "if (enable) ...", where "enable" is a signal that is true on
>>>>>> every third clock cycle.  The asynchronous SRAM interface is also 
>>>>>> changed so that the write signal is asserted on the middle-third 
>>>>>> clock phase of the three clock CPU cycle.
>>>>>>
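The scheme described above can be sketched roughly as follows. This is not the code from the Pepino repository; the module name ce_div3 and the signals clk75, phase, enable and SRwe are hypothetical, and the sketch assumes that wr changes only on clock edges where enable is high.

```verilog
// Hypothetical sketch of a divide-by-3 clock enable with the SRAM write
// strobe placed in the middle third of the CPU cycle.
module ce_div3(
  input  wire clk75,   // single 75 MHz clock from the DCM
  input  wire wr,      // active-high write request from the CPU
  output wire enable,  // high during one of every three clk75 cycles
  output reg  SRwe     // active-low SRAM write enable
);
  reg [1:0] phase = 2'd0;  // counts 0, 1, 2 within one CPU cycle
  initial SRwe = 1'b1;

  always @(posedge clk75)
    phase <= (phase == 2'd2) ? 2'd0 : phase + 2'd1;

  // CPU and I/O registers update on the edge that ends phase 2.
  assign enable = (phase == 2'd2);

  // Registered strobe: sampled at the end of phase 0, so SRwe is low only
  // while phase == 1, the middle third of the three-clock CPU cycle.
  always @(posedge clk75)
    SRwe <= ~(wr & (phase == 2'd0));
endmodule
```

A CPU-side register then takes the form "always @(posedge clk75) if (enable) r <= next_r;", so it changes only once per three clk75 cycles.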
>>>>>> While the Verilog changes to do this are very straightforward,
>>>>>> one complication here is that the Xilinx ISE tool used to create
>>>>>> the bit file for the FPGA does not understand that the CPU and I/O
>>>>>> subsystem are only clocked on every third clock and will
>>>>>> basically try to make the CPU run at 75MHz, and will fail since
>>>>>> this is too fast for the FPGA.  The solution to this problem is to
>>>>>> tell the tool that all clock paths in the CPU and I/O subsystem
>>>>>> can actually take three clocks to complete (this is called
>>>>>> multi-cycle paths). With the multi-cycle paths added to the .ucf
>>>>>> file the design compiles with no timing violations.
>>>>>>
>>>>>> With those changes the 75MHz clock is now generated by a DCM and 
>>>>>> the unspecified timing relations that Wojtek brought up are now 
>>>>>> gone since everything is clocked with a single clock.  The
>>>>>> modified design has been tested on Pepino and seems to run fine.
>>>>>>
>>>>>> The complete ISE project with those changes is available at the
>>>>>> Pepino GitHub repository: 
>>>>>> https://github.com/Saanlima/Pepino/tree/master/Projects/RISC5Verilog_Pepino
>>>>>>
>>>>>> Any comments or critique are welcome.
>>>>>>
>>>>>> Cheers,
>>>>>> Magnus
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 2/17/2016 2:16 PM, Walter Gallegos wrote:
>>>>>>> Hi Magnus,
>>>>>>>
>>>>>>> You are welcome to continue with FPGA specific topics by private 
>>>>>>> e-mail if you want.
>>>>>>>
>>>>>>> Regards
>>>>>>> Walter
>>>>>>>
>>>>>>> On 2016-02-17 at 18:30, Magnus Karlsson wrote:
>>>>>>>> Hi Walter,
>>>>>>>>
>>>>>>>> Since this is really Paul's design, I guess it would be more 
>>>>>>>> appropriate to discuss it with him, I was just trying to 
>>>>>>>> explain why it looks like it does.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Magnus
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2/17/2016 1:15 PM, Walter Gallegos wrote:
>>>>>>>>> Magnus,
>>>>>>>>>
>>>>>>>>> Some of the messages were delayed, so I continue from here so as
>>>>>>>>> not to overload the list.
>>>>>>>>>
>>>>>>>>> If I understand you correctly, you justify an uncontrolled
>>>>>>>>> delay because it simplifies the SRAM handling.
>>>>>>>>> Sorry, but that is like using the old circuit with an AND gate
>>>>>>>>> and an inverter to generate a pulse. If you need a delayed signal
>>>>>>>>> you should use the DCM 90°, 180° or 270° clock outputs and keep
>>>>>>>>> everything under control; I think you don't need a state machine
>>>>>>>>> in this case.
>>>>>>>>>
>>>>>>>>> About ISE warnings: be careful, no warnings does not mean good
>>>>>>>>> methodology.
>>>>>>>>>
>>>>>>>>> About the Xilinx docs: really, I don't remember. Doing training,
>>>>>>>>> first as a Xilinx ATP and now as an independent consultant, I
>>>>>>>>> touch on this problem in my courses. Having an uncontrolled delay
>>>>>>>>> in a clock opens a big door to random problems. FPGA designs must
>>>>>>>>> be synchronous at all times; no exceptions.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Walter
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2016-02-17 at 14:41, Magnus Karlsson wrote:
>>>>>>>>>> Walter,
>>>>>>>>>>
>>>>>>>>>> I agree with you that the "pure" way of doing this is as you 
>>>>>>>>>> stated, with a DCM to directly generate both clk and pclk. So 
>>>>>>>>>> how come Paul didn't do that?  It's not like he doesn't know 
>>>>>>>>>> how to use the DCM, after all the current code generates pclk 
>>>>>>>>>> from clk using a DCM, and there would probably be less code 
>>>>>>>>>> to do it like you suggest. No, the reason for this is very 
>>>>>>>>>> subtle and is easy to miss if you just take a quick look at 
>>>>>>>>>> the code, and it has to do with the asynchronous SRAM interface.
>>>>>>>>>>
>>>>>>>>>> One of the most critical aspects of using SRAM is to control 
>>>>>>>>>> the write signal - ideally the write signal should be 
>>>>>>>>>> asserted after all other control signals (like address, data, 
>>>>>>>>>> byte-enable, read, oe) are valid, and should be de-asserted 
>>>>>>>>>> well before any of the other control signals go invalid, to 
>>>>>>>>>> avoid spurious writes. However, this is not that easy to do 
>>>>>>>>>> in a synchronous system where all signals change at the clock 
>>>>>>>>>> edge.  The most common way to do this is to have a state 
>>>>>>>>>> machine that is clocked at say 4x the CPU clock so that you 
>>>>>>>>>> can divide the SRAM access cycle into several phases and
>>>>>>>>>> assert the write signal on some of those phases.
>>>>>>>>>>
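As a rough illustration of the phase-based approach described above, the sketch below asserts the write strobe only in the two middle quarters of the access cycle. It is a hypothetical example, not code from the RISC5 sources: the names sram_we_4x, clk4x, phase and we_n are made up, and the phase counter is simply assumed to be aligned so that phase 0 starts together with the CPU cycle.

```verilog
// Hypothetical sketch: SRAM write strobe derived from a 4x clock.
module sram_we_4x(
  input  wire clk4x,  // 4x the CPU clock
  input  wire wr,     // active-high write request, stable for one CPU cycle
  output reg  we_n    // active-low SRAM write enable
);
  // Assumed to be aligned so that phase 0 begins with the CPU cycle.
  reg [1:0] phase = 2'd0;
  initial we_n = 1'b1;

  always @(posedge clk4x) begin
    phase <= phase + 2'd1;  // wraps 3 -> 0
    // Sampled at the end of phases 0 and 1, so we_n is low during
    // phases 1 and 2 and high during phases 0 and 3.
    we_n <= ~(wr & (phase == 2'd0 || phase == 2'd1));
  end
endmodule
```

Keeping we_n high in the first and last quarter leaves address, data and byte-enable one 4x clock period to settle before the strobe and one after it.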
>>>>>>>>>> However, this is not the way Paul chose to do it; instead he
>>>>>>>>>> chose to do a less "pure" clock generation by generating clk
>>>>>>>>>> from a flip-flop rather than from a DCM. By doing so, he 
>>>>>>>>>> actually generates an early version of the clock signal 
>>>>>>>>>> called clk that is leading the global clock signal clk_BUFG 
>>>>>>>>>> by the delay of the BUFG buffer.  Since this early version of 
>>>>>>>>>> the clock signal is generated like any other logic signal, he 
>>>>>>>>>> could use this signal to gate the write signal to the SRAM 
>>>>>>>>>> such that write signal will be de-asserted well before the 
>>>>>>>>>> other control signals (clocked by clk_BUFG) will change, and 
>>>>>>>>>> thus avoiding the need to have a state machine controlling 
>>>>>>>>>> the write signal.  The price for this is that the clock 
>>>>>>>>>> signal is now generated in a less "pure" way, but still a 
>>>>>>>>>> valid way as long as you know what you are doing. The BUFG 
>>>>>>>>>> clock driver can be driven from a PLL, a DCM or from the 
>>>>>>>>>> logic fabric.  The first two are speed optimized paths going 
>>>>>>>>>> directly from the PLL or DCM to the BUFG and can be clocked at a
>>>>>>>>>> much higher clock rate, while the logic fabric path is
>>>>>>>>>> limited by the maximum clock rate of the logic fabric. 
>>>>>>>>>> However, at the clock rate we use (25 MHz) this is not an 
>>>>>>>>>> issue.  When you do this there are no warnings generated by 
>>>>>>>>>> ISE that this is not a good idea, and I have not read 
>>>>>>>>>> anywhere in the Xilinx clocking resource guide that you 
>>>>>>>>>> should avoid doing this. Basically, the BUFG clock driver is 
>>>>>>>>>> designed to do this, the tool will allow you to do it and at 
>>>>>>>>>> the clock rate we use it has no performance implications. As 
>>>>>>>>>> I see it, this is another place where the goal of 
>>>>>>>>>> simplification has driven the implementation of the system at 
>>>>>>>>>> the expense of a slightly less "pure" clock generation.
>>>>>>>>>>
>>>>>>>>>> Just my 2c
>>>>>>>>>>
>>>>>>>>>> Magnus
>>>>>>
>>>>>> -- 
>>>>>> Oberon at lists.inf.ethz.ch mailing list for ETH Oberon and related 
>>>>>> systems
>>>>>> https://lists.inf.ethz.ch/mailman/listinfo/oberon
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>> -- 
>>>
>>> Walter Daniel Gallegos
>>> Programmable Logic & Software
>>> Consulting, Design, Training.
>>> Montevideo, Uruguay
>>> EMAIL walter at waltergallegos.com
>>> Tel +598 26 23 44 60 | Cel +598 99 18 58 88
>>>
>>>
>>> This e-mail and any attachment is confidential and is intended solely for the
>>> addressee(s). If you are not intended recipient please inform the sender
>>> immediately, answering this e-mail and delete it as well as the attached files.
>>> Any use, circulation or copy of this e-mail by any person or entity that is not
>>> the specific addressee(s) is prohibited.
>>>
>>>
>>>
>>
>>
>>
>


[Two PNG image attachments were scrubbed by the list archive.]