[Oberon] RISC5 implementation issues.
Walter Gallegos
walter at waltergallegos.com
Fri Feb 19 18:58:06 CET 2016
Let us concentrate on the problem...
How do we generate an output pulse shorter than the clock period?
You can make a shorter pulse with the help of a DCM and DDR (double data
rate) output registers; see the simulation capture.
clk25f90 is the DCM clock output with a 90° phase shift.
The VHDL code (without the DCM instantiation) is:
-- Library clauses added so the fragment compiles; ODDR2 is declared in
-- the Xilinx UNISIM library.
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
LIBRARY UNISIM;
USE UNISIM.VCOMPONENTS.ALL;

ENTITY PulseShaper IS
    PORT (CLK25F90 : IN  STD_LOGIC;
          RESET    : IN  STD_LOGIC;
          WE       : IN  STD_LOGIC;
          WESRAM   : OUT STD_LOGIC
    );
END PulseShaper;

ARCHITECTURE RTL OF PulseShaper IS
    CONSTANT zero : STD_LOGIC := '0';
    CONSTANT one  : STD_LOGIC := '1';
    SIGNAL clk90n : STD_LOGIC;
BEGIN
    -- This is valid because the tool uses the clock inverter available
    -- in the IOB.
    clk90n <= NOT(CLK25F90);

    Dly : ODDR2
        PORT MAP (
            Q  => WESRAM,
            C0 => CLK25F90,
            C1 => clk90n,
            CE => one,
            D0 => one,
            D1 => WE,
            R  => zero,
            S  => zero
        );
END RTL;
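
To reproduce a waveform like the simulation capture, a minimal testbench
sketch for the entity above could look like the following. It is only an
illustration under stated assumptions: the names tb_PulseShaper, clk25,
clk25f90, T25 and the stimulus timing are hypothetical and not part of
the original design, the DCM 90° output is modelled by a plain
quarter-period delay instead of a real DCM, WE is assumed to be active
low and to change on the rising edge of the 25 MHz clock, and the Xilinx
UNISIM simulation library is assumed to be compiled for the simulator.

```vhdl
-- Simulation-only sketch; every name introduced here is hypothetical.
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

ENTITY tb_PulseShaper IS
END tb_PulseShaper;

ARCHITECTURE sim OF tb_PulseShaper IS
    CONSTANT T25    : TIME := 40 ns;       -- 25 MHz clock period
    SIGNAL clk25    : STD_LOGIC := '0';    -- reference 25 MHz clock
    SIGNAL clk25f90 : STD_LOGIC := '0';    -- clk25 delayed by 90 degrees
    SIGNAL we       : STD_LOGIC := '1';    -- assumed active-low request
    SIGNAL wesram   : STD_LOGIC;
BEGIN
    -- Free-running 25 MHz clock.
    clk25 <= NOT clk25 AFTER T25 / 2;

    -- Stand-in for the DCM CLK90 output: a fixed T25/4 (10 ns) delay.
    clk25f90 <= clk25 AFTER T25 / 4;

    uut : ENTITY work.PulseShaper
        PORT MAP (
            CLK25F90 => clk25f90,
            RESET    => '0',
            WE       => we,
            WESRAM   => wesram
        );

    stimulus : PROCESS
    BEGIN
        WAIT FOR 5 * T25;                  -- a few idle cycles first
        WAIT UNTIL rising_edge(clk25);
        we <= '0';                         -- request lasts one clk25 cycle
        WAIT UNTIL rising_edge(clk25);
        we <= '1';
        WAIT;                              -- stimulus done; clocks keep running
    END PROCESS;
END sim;
```

Running this for about 1 us and viewing clk25, clk25f90, we and wesram
together shows where the WESRAM pulse falls relative to the 25 MHz clock
edges.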
Does anyone know whether an FPGA testbench for RISC5 exists?
Regards,
Walter
On 2016-02-19 at 11:33, Magnus Karlsson wrote:
> In all fairness, since I have generated and tested code for one way to
> solve the problem, can you then give us your code proposal for
> generating the SRAM write signal, with 25MHz and 75MHz generated by a
> DCM? The write signal must be asserted after all other SRAM control
> signals are valid, and be de-asserted before any of the control
> signals go invalid, and last for at least 5 ns. The SRAM control
> signals are generated by the 25MHz clock and last for one clock cycle.
>
> I will be more than happy to try it out on a board and report the result.
>
> Magnus
>
>
> On 2/19/2016 4:57 AM, Walter Gallegos wrote:
>> Having a solid and coherent clock distribution is basic to FPGA
>> design; my proposal was to keep both 25 MHz and 75 MHz, generated by
>> a DCM.
>>
>> Running everything at 75 MHz is unnecessary; also, the clock-enable
>> approach adds unnecessary complexity to the design. Making a design
>> synchronous doesn't necessarily mean using the same clock for the
>> whole design.
>>
>> Continue using 25 MHz for the core and peripherals; the only concern
>> is the CDC (clock domain crossing). As both clocks are generated by
>> the same DCM, it is a minor problem: corresponding rising edges are
>> aligned by design in DCMs, so metastability is not an issue. With
>> DCMs and a constraint on the input clock (50 MHz), the constraint
>> propagation rules constrain both clocks, 25 MHz and 75 MHz. If there
>> are no methodology errors, the whole design is constrained from the
>> first to the last register element in the chain. Beware of
>> combinational logic outside these elements.
>>
>> In my opinion, taking care of the CDC and keeping both clocks is the
>> appropriate solution.
>>
>> Best regards,
>> Walter
>>
>> On 2016-02-18 at 19:58, Magnus Karlsson wrote:
>>> So I have been thinking about this some more and decided to
>>> modify/update the design to remove all the concerns raised by Walter
>>> and Wojtek.
>>>
>>> Just to recap, Walter's concern is that the clocks are generated
>>> using flip-flops and use logic fabric interconnect instead of
>>> dedicated clocking elements and pathways, and that all clocks should
>>> be generated by a DCM module instead (DCM = Digital Clock Manager).
>>> Wojtek's concern is that there are unspecified timing relations
>>> between the 25MHz and the 75MHz clock domains.
>>>
>>> Both concerns are valid and in my opinion the correct way to fix
>>> both issues is to make the design completely synchronous. This means
>>> that all clocked elements in the design (like flip-flops, memories
>>> etc.) should be clocked with a single clock signal, which in this
>>> case is the 75MHz clock. The CPU and I/O subsystem, which before
>>> was clocked by a separate 25MHz clock, are now also clocked by the
>>> 75MHz clock but are only enabled to be clocked on every third clock
>>> cycle. This means that all "always @ (posedge clk)" statements have
>>> been changed to include "if (enable) ...", where "enable" is a
>>> signal that is true on every third clock cycle. The asynchronous
>>> SRAM interface is also changed so that the write signal is asserted
>>> on the middle-third clock phase of the three clock CPU cycle.
>>>
>>> While the Verilog changes to do this are very straightforward, one
>>> complication here is that the Xilinx ISE tool used to create the bit
>>> file for the FPGA does not understand that the CPU and I/O subsystem
>>> are only clocked on every third clock and will basically try to make
>>> the CPU run at 75MHz, and will fail since this is too fast for the
>>> FPGA. The solution to this problem is to tell the tool that all
>>> clock paths in the CPU and I/O subsystem can actually take three
>>> clocks to complete (this is called multi-cycle paths). With the
>>> multi-cycle paths added to the .ucf file the design compiles with no
>>> timing violations.
>>>
>>> With those changes the 75MHz clock is now generated by a DCM and the
>>> unspecified timing relations that Wojtek brought up are now gone
>>> since everything is clocked with a single clock. The modified
>>> design has been tested on Pepino and seems to run fine.
>>>
>>> The complete ISE project with those changes is available at the
>>> Pepino GitHub repository:
>>> https://github.com/Saanlima/Pepino/tree/master/Projects/RISC5Verilog_Pepino
>>>
>>> Any comments or critique are welcome.
>>>
>>> Cheers,
>>> Magnus
>>>
>>>
>>>
>>> On 2/17/2016 2:16 PM, Walter Gallegos wrote:
>>>> Hi Magnus,
>>>>
>>>> You are welcome to continue with FPGA specific topics by private
>>>> e-mail if you want.
>>>>
>>>> Regards
>>>> Walter
>>>>
>>>> On 2016-02-17 at 18:30, Magnus Karlsson wrote:
>>>>> Hi Walter,
>>>>>
>>>>> Since this is really Paul's design, I guess it would be more
>>>>> appropriate to discuss it with him, I was just trying to explain
>>>>> why it looks like it does.
>>>>>
>>>>> Cheers,
>>>>> Magnus
>>>>>
>>>>>
>>>>> On 2/17/2016 1:15 PM, Walter Gallegos wrote:
>>>>>> Magnus,
>>>>>>
>>>>>> Some of the messages were delayed, so I will continue from here
>>>>>> to avoid overloading the list.
>>>>>>
>>>>>> If I understand you correctly, you justify an uncontrolled delay
>>>>>> because it simplifies the SRAM handling.
>>>>>> Sorry, but that is like using the old circuit with an AND gate
>>>>>> and an inverter to generate a pulse. If you need a delayed signal,
>>>>>> you should use the DCM 90°, 180° or 270° clock outputs and keep
>>>>>> everything under control; I don't think a state machine is needed
>>>>>> in this case.
>>>>>>
>>>>>> About ISE warnings: be careful, the absence of warnings does not
>>>>>> mean good methodology.
>>>>>>
>>>>>> About the Xilinx docs: really, I don't remember. In my training
>>>>>> courses, first as a Xilinx ATP and now as an independent
>>>>>> consultant, I cover this problem. Having an uncontrolled delay in
>>>>>> a clock opens a big door to random problems. FPGA designs must be
>>>>>> synchronous at all times; no exceptions.
>>>>>>
>>>>>> Regards,
>>>>>> Walter
>>>>>>
>>>>>>
>>>>>> On 2016-02-17 at 14:41, Magnus Karlsson wrote:
>>>>>>> Walter,
>>>>>>>
>>>>>>> I agree with you that the "pure" way of doing this is as you
>>>>>>> stated, with a DCM to directly generate both clk and pclk. So
>>>>>>> how come Paul didn't do that? It's not like he doesn't know how
>>>>>>> to use the DCM, after all the current code generates pclk from
>>>>>>> clk using a DCM, and there would probably be less code to do it
>>>>>>> like you suggest. No, the reason for this is very subtle and is
>>>>>>> easy to miss if you just take a quick look at the code, and it
>>>>>>> has to do with the asynchronous SRAM interface.
>>>>>>>
>>>>>>> One of the most critical aspects of using SRAM is to control the
>>>>>>> write signal - ideally the write signal should be asserted after
>>>>>>> all other control signals (like address, data, byte-enable,
>>>>>>> read, oe) are valid, and should be de-asserted well before any
>>>>>>> of the other control signals go invalid, to avoid spurious
>>>>>>> writes. However, this is not that easy to do in a synchronous
>>>>>>> system where all signals change at the clock edge. The most
>>>>>>> common way to do this is to have a state machine that is clocked
>>>>>>> at say 4x the CPU clock so that you can divide the SRAM access
>>>>>>> cycle into several phases and assert the write signal on some of
>>>>>>> those phases.
>>>>>>>
>>>>>>> However, this is not the way Paul chose to do it; instead he
>>>>>>> chose to do a less "pure" clock generation by generating clk
>>>>>>> from a flip-flop rather than from a DCM. By doing so, he
>>>>>>> actually generates an early version of the clock signal called
>>>>>>> clk that is leading the global clock signal clk_BUFG by the
>>>>>>> delay of the BUFG buffer. Since this early version of the clock
>>>>>>> signal is generated like any other logic signal, he could use
>>>>>>> this signal to gate the write signal to the SRAM such that write
>>>>>>> signal will be de-asserted well before the other control signals
>>>>>>> (clocked by clk_BUFG) will change, and thus avoiding the need to
>>>>>>> have a state machine controlling the write signal. The price
>>>>>>> for this is that the clock signal is now generated in a less
>>>>>>> "pure" way, but still a valid way as long as you know what you
>>>>>>> are doing. The BUFG clock driver can be driven from a PLL, a DCM
>>>>>>> or from the logic fabric. The first two are speed optimized
>>>>>>> paths going directly from the PLL or DCM to the BUFG and can be
>>>>>>> clocked at a much higher clock rate, while the logic fabric path is
>>>>>>> limited by the maximum clock rate of the logic fabric. However,
>>>>>>> at the clock rate we use (25 MHz) this is not an issue. When
>>>>>>> you do this there are no warnings generated by ISE that this is
>>>>>>> not a good idea, and I have not read anywhere in the Xilinx
>>>>>>> clocking resource guide that you should avoid doing this.
>>>>>>> Basically, the BUFG clock driver is designed to do this, the
>>>>>>> tool will allow you to do it and at the clock rate we use it has
>>>>>>> no performance implications. As I see it, this is another place
>>>>>>> where the goal of simplification has driven the implementation
>>>>>>> of the system at the expense of a slightly less "pure" clock
>>>>>>> generation.
>>>>>>>
>>>>>>> Just my 2c
>>>>>>>
>>>>>>> Magnus
>>>
>>> --
>>> Oberon at lists.inf.ethz.ch mailing list for ETH Oberon and related
>>> systems
>>> https://lists.inf.ethz.ch/mailman/listinfo/oberon
>>>
>>
>
--
Walter Daniel Gallegos
Programmable Logic & Software
Consulting, Design, Training.
Montevideo, Uruguay
EMAIL walter at waltergallegos.com
Tel +598 26 23 44 60 | Cel +598 99 18 58 88
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ffgabeje.png
Type: image/png
Size: 6951 bytes
Desc: not available
URL: <http://lists.inf.ethz.ch/pipermail/oberon/attachments/20160219/7a146b1b/attachment.png>