[Oberon] oberonnet of things

Wed May 27 07:08:30 CEST 2015

I agree, this zero-copy design is nice.
Your approach would be doable if I chose a different Ethernet chip.
If I see it correctly, the W5500 on the wiz550io comes with its own memory and its own state machine.
So, unfortunately, I cannot take the zero-copy approach, as the RISC cannot access this memory directly :-(

Br, Jörg

On 27.05.2015 at 06:09, <skulski at pas.rochester.edu> wrote:

>> As I don't want to modify the existing behavior of the RISC5, my
>> intention is to extend the IO addresses by one additional address,
>> namely -24. Simply said: By writing to this address with
>> "SYSTEM.PUT(-24, x)" I will put one byte on the Ethernet interface.
>> When I read from this address with "SYSTEM.GET(-24, x)",
>> I will get one byte from the Ethernet interface.
>> The actual Oberon code to drive the wiz550io is a little more complex
>> than that, but this should give you an idea.
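
The byte-wise port described above can be modelled in a few lines of C. This is only a host-side sketch: EthSim, eth_put, eth_get and eth_send are hypothetical names, the 64-byte capacity is an assumption, and no real W5500 or SPI framing is modelled. It just shows that every byte of a frame costs one access to address -24.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* -24 seen as an unsigned 32-bit address. */
#define ETH_PORT_ADDR 0xFFFFFFE8u
#define SIM_CAP 64 /* assumed capacity of the simulated buffers */

/* Hypothetical stand-in for the Ethernet module behind the port. */
typedef struct {
    uint8_t tx[SIM_CAP];
    size_t  tx_len;          /* bytes written by the CPU so far */
    uint8_t rx[SIM_CAP];
    size_t  rx_len, rx_pos;  /* bytes waiting to be read, read position */
} EthSim;

/* Counterpart of SYSTEM.PUT(-24, x): one byte out per call. */
static void eth_put(EthSim *e, uint32_t addr, uint8_t x) {
    assert(addr == ETH_PORT_ADDR && e->tx_len < SIM_CAP);
    e->tx[e->tx_len++] = x;
}

/* Counterpart of SYSTEM.GET(-24, x): one byte in per call. */
static uint8_t eth_get(EthSim *e, uint32_t addr) {
    assert(addr == ETH_PORT_ADDR && e->rx_pos < e->rx_len);
    return e->rx[e->rx_pos++];
}

/* Sending a frame means one port access per byte (the per-byte copy). */
static void eth_send(EthSim *e, const uint8_t *frame, size_t len) {
    for (size_t i = 0; i < len; i++) {
        eth_put(e, ETH_PORT_ADDR, frame[i]);
    }
}
```

The loop in eth_send is the part that a buffer-based design would hand over to hardware.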
>> 
>> Now to make this happen I need some "HW glue logic" to map the Oberon
>> address -24 to the underlying HW mechanism of the Pipistrello board
>> (FPGA) and the wiz550io. This HW behavior is defined in Verilog; hence I
>> adapt "RISC5Top.v" accordingly.
> 
> It is interesting to recall how Cypress achieved near full USB-2 speed in
> their EZ-USB-FX devices, which combined an 8-bit 8051 and some
> programmable logic. (They did not open up the programmable logic to the
> users, and perhaps they implemented it with ASIC silicon, but it obviously
> looks like a small FPGA-on-chip.) The 8051 runs at 48 MHz with 4 clocks
> per instruction, which translates into 12 MIPS. The transfer rate was
> about 30 MB/s despite this low CPU power.
> 
> The architecture employed zero-copy "end point buffers", which were filled
> by the hardware. The buffer, after it was filled with USB data, was
> switched to the "endpoint domain" and it became inaccessible to the CPU.
> The buffer content was pumped via the USB channel, and then the buffer was
> returned to the CPU domain. The CPU could either poll the status bit (not
> recommended) or it could receive an interrupt when the buffer was returned
> for reuse by the CPU. There were multiple such buffers, four if I remember
> correctly. The CPU could work on filling some buffers, while the others
> were being transmitted by the programmable HW domain.
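
The ownership handoff described above can be sketched as a small host-side C simulation. All names (EpBuf, EpPool, cpu_submit, hw_step) are hypothetical, the buffer size of 64 bytes is an assumption, and hw_step merely stands in for the hardware state machine. It is not how Cypress implemented it, only an illustration of the principle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_BUFS 4   /* "four if I remember correctly" */
#define BUF_SIZE 64  /* assumed buffer size */

typedef enum { OWNER_CPU = 0, OWNER_HW = 1 } Owner;

typedef struct {
    uint8_t data[BUF_SIZE];
    size_t  len;
    Owner   owner;   /* the status bit the CPU would poll */
} EpBuf;

typedef struct {
    EpBuf  bufs[NUM_BUFS];
    size_t next_fill;   /* next buffer the CPU fills */
    size_t next_send;   /* next buffer the hardware drains */
    size_t bytes_sent;  /* total bytes pumped out by the hardware */
} EpPool;

/* All buffers start empty and owned by the CPU (OWNER_CPU == 0). */
static void pool_init(EpPool *p) {
    assert(p != NULL);
    memset(p, 0, sizeof *p);
}

/* CPU side: fill the next buffer and hand it to the hardware domain.
   Returns 0 on success, -1 if the buffer is still owned by the hardware
   or the payload does not fit. */
static int cpu_submit(EpPool *p, const uint8_t *src, size_t len) {
    EpBuf *b = &p->bufs[p->next_fill];
    if (b->owner != OWNER_CPU || len > BUF_SIZE) return -1;
    memcpy(b->data, src, len);  /* stands in for the CPU producing data in place */
    b->len = len;
    b->owner = OWNER_HW;        /* from now on the CPU must not touch it */
    p->next_fill = (p->next_fill + 1) % NUM_BUFS;
    return 0;
}

/* Hardware side: drain one buffer and return it to the CPU domain.
   Returns 1 if a buffer was transmitted, 0 if none was pending. */
static int hw_step(EpPool *p) {
    EpBuf *b = &p->bufs[p->next_send];
    if (b->owner != OWNER_HW) return 0;
    p->bytes_sent += b->len;
    b->len = 0;
    b->owner = OWNER_CPU;       /* real hardware could raise an interrupt here */
    p->next_send = (p->next_send + 1) % NUM_BUFS;
    return 1;
}
```

When cpu_submit returns -1, all buffers are in flight and the CPU has to wait for the hardware side to return one.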
> 
> It is important that these buffers were NOT allocated from the general
> purpose RAM. They were rather preassigned somewhere in silicon. Their
> addresses were fixed in the memory space. It was obvious from the
> documentation that the USB transfer was handled by a state machine in the
> USB domain.
> 
> In the current picture such buffers would be crafted from BRAM, filled by
> the RISC5, and then passed to the custody of a state machine that would
> pump the data to the Wiznet chip. BTW, the Wiznet W5100 has both a SPI and
> a byte-wide interface. The latter is obviously faster than SPI.
> 
> Cypress distributed a very neat embedded operating system for their EZ-USB
> chips. They named it "Frameworks" rather than OS, but in fact it was an OS,
> with a task scheduler, interrupt handlers, and all the usual stuff. It was
> small, very well written in C, and well documented. In a certain way it
> was a gem of software design.
> 
> The lesson from the Cypress design was that in order to achieve good
> performance in hardware one needs to think in hardware. Transferring data
> in and out of a buffer to a communication channel is a better task for a
> state machine than for a CPU. The software can never achieve the speed
> that a state machine can run at. The keys to performance are: (1) working
> with buffers rather than individual bytes; (2) delegating the transfers to
> state machines; (3) having multiple parallel buffers in order to execute
> various tasks in parallel with both the CPU and the state machines.
> 
> You can also say that Cypress implemented a multicore system with one
> slower 8051 core, and one (or several, who knows how they did it) fast
> cores dedicated to just sending and receiving the buffers. Inter-core
> synchronization was done either with status bits polled by the slow CPU,
> or with the interrupts executed by the slow CPU when buffers became
> available.
> 
> It would not hurt if the Cypress design were studied a bit more because it was
> an example of good engineering and excellent documentation. The basic
> principles can be employed in the RISC5 systems. Working along these
> lines, one can also implement multicore RISC5 systems-on-chip.
> 
> I think that the above discussion can point towards the areas where RISC5
> can offer unique advantages, namely joint hardware-software codesign. In
> order to reach those areas one has to divide the tasks between both
> hardware and CPU domains, just like Cypress did. A skilled engineer can
> design embedded systems running in low-capacity FPGAs, somewhat similar to
> the Cypress SoC which achieved sufficient performance despite using a
> low-end CPU.
> 
> In order to reach the efficient codesign one has to part with an illusion
> that the HW part can be implemented with just a few lines of portable LOLA
> or Verilog. Efficient hardware cannot be casually implemented as a "glue
> register logic". In modern FPGAs the efficiency can only be achieved when
> one knows the underlying hardware resources which one is using.
> 
> Proof-of-principle HW can be implemented with generic HDL, but efficient
> hardware cannot be built this way.
> 
> W.