[Barrelfish-users] [Barrelfish] Intel SCC latency measurements for MPB operations
troscoe at inf.ethz.ch
Tue Mar 1 16:59:52 CET 2011
I'm not sure which communication stack your graphs are using (bare-metal
RCCE, RCCE over Linux TCP, etc.). I'm also assuming this isn't using
Barrelfish (as we only released SCC support an hour or two ago).
On Barrelfish we require a trap to kernel mode for inter-core messages,
which in practice dominates the time taken to access the MPB (once
you're in the kernel, we can transfer a cache line to another core's
on-tile MPB in a hundred clocks or so, as Intel advertise). Also, due to
queuing issues, Barrelfish's interconnect driver for the SCC passes
message payloads in DDR3, and uses the MPB for the metadata.
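
To put the two figures in this thread on the same scale, here is a rough
sketch of the unit conversion. The core clock frequency is not stated
anywhere in this thread; 533 MHz is an assumed value used purely for
illustration, and the function and variable names are made up for this
example. Only the "hundred clocks or so" and the 5 microsecond ping-pong
figure come from the messages themselves.

```python
# Rough unit conversion between clock cycles and microseconds.
# ASSUMPTION: a 533 MHz core clock. The thread does not state the
# frequency, so adjust this to the configuration actually in use.
ASSUMED_CORE_HZ = 533e6


def cycles_to_us(cycles, hz=ASSUMED_CORE_HZ):
    """Convert a clock-cycle count to microseconds at frequency hz."""
    return cycles / hz * 1e6


def us_to_cycles(us, hz=ASSUMED_CORE_HZ):
    """Convert a duration in microseconds to clock cycles at frequency hz."""
    return us * 1e-6 * hz


# Figures quoted in the thread.
mpb_access_cycles = 100   # "a hundred clocks or so" for one cache line
rcce_pingpong_us = 5.0    # minimum RCCE ping-pong latency

mpb_access_us = cycles_to_us(mpb_access_cycles)
rcce_cycles = us_to_cycles(rcce_pingpong_us)
ratio = rcce_cycles / mpb_access_cycles

print(f"one MPB access : {mpb_access_us:.2f} us")
print(f"RCCE ping-pong : {rcce_cycles:.0f} clocks")
print(f"gap            : about {ratio:.0f}x a single MPB access")
```

Under that assumed frequency a single MPB access is roughly 0.19
microseconds and the 5 microsecond ping-pong is about 2,665 clocks, so
the gap is a factor of a few tens. That is the part that software
overhead (and, on Barrelfish, the kernel trap) has to account for.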
On 03/01/2011 04:53 PM, Haas, Werner wrote:
> I work at Intel Labs so let me try answering the RCCE-related part: The
> latency table reflects the numbers from looking at the actual hardware,
> i.e. without taking software operation into account. The RCCE round-trip
> times, however, were measured by running an actual application, i.e.
> they rather reflect the efficiency of one particular communication
> algorithm than hardware properties. I do not know the precise number but
> there are actually several MPB accesses involved in passing data via RCCE.
> Please also note that the bypass mode should _not_ be used as we have
> a hardware bug which can lead to reading incorrect data. Unfortunately
> this greatly reduces the benefit of using the on-die SRAM vs. off-die DDR3.
> Best regards,
> *From:* Konstantin Zertsekel [mailto:zertsekel at gmail.com]
> *Sent:* Tuesday, March 01, 2011 4:32 PM
> *To:* barrelfish-users at lists.inf.ethz.ch
> *Cc:* Dan Tsafrir; Konstantin Zertsekel; Roei; Ido Shamay; Avi
> Mendelson; Prof. Assaf Schuster
> *Subject:* [Barrelfish-users] Intel SCC latency measurements for MPB
> Hi all,
> I am working on a project with the Intel SCC chip in which communication
> latency is an important factor.
> Now, according to the RCCE inter-core Ping-Pong test, the minimum latency is
> 5 microseconds (see this graph
> But according to Intel's latency table for various memory accesses (see
> this table
> the minimum latency (when accessing the local MPB with bypass) is
> measured in ~100 clocks, not microseconds.
> What is the reason for such a wide gap between the RCCE implementation
> and the hardware latency table?
> I guess memory access latency measurements of all kinds are must-know
> information for porting Barrelfish to SCC...
> Thanks, KostaZ.