[Barrelfish-users] Threads using sockets may block

Kornilios Kourtis kornilios.kourtis at inf.ethz.ch
Wed Feb 13 11:05:27 CET 2013


Hi Zaheer,

On Mon, Feb 11, 2013 at 03:44:59PM +0000, Chothia Zaheer wrote:
> Hello,
> 
> When multiple threads use the sockets API some calls may block indefinitely.
> It seems this is because they use the default waitset -> lib/posixcompat/sockets.c:
> 
>   ssize_t recv(int sockfd, void *buf, size_t len, int flags)
>                     // XXX: Assume it was on the default waitset
>                     err = us->u.active.binding->change_waitset
>                         (us->u.active.binding, get_default_waitset());
> 
> A simple server-client example is attached.  Output looks like this:
>  
[snip]
>   client: owner has the IP address 10.110.4.21
>   [server] listening on port 5000.
>   [client] created socket: fd = 4
>   [client] connecting to server at 10.110.4.21:5000 ...
>   [client] connected to server at 10.110.4.21:5000
>   [client] calling read on socket
>   netconn_recv called on [0x805d93d8]
[snip]

Just adding to the comments: AFAICT you are running the client and the server
on the same machine, which typically requires some kind of loopback mechanism
on the network stack for the routing. Doing some naive grepping in lwip, I got
the following:

lib/lwip/src/core/ipv4/ip.c:
655-#if (LWIP_NETIF_LOOPBACK || LWIP_HAVE_LOOPIF)
656-    if (ip_addr_cmp(dest, &netif->ip_addr)) {
657:        /* Packet to self, enqueue it for loopback */
658-        LWIP_DEBUGF(IP_DEBUG, ("netif_loop_output()"));
659-        return netif_loop_output(netif, p, dest);
660-    } else
661-#endif                          /* (LWIP_NETIF_LOOPBACK || LWIP_HAVE_LOOPIF) */

I'm not sure what our configuration options are, but it might be worth making
sure that the problem is not in loopback routing (e.g., by using two different
machines, or two different domains).
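
One quick way to rule the configuration in or out is to check whether the two
macros guarding that branch are enabled in the build. The snippet below is only
a sketch under assumptions: I believe the defaults live in lwIP's opt.h and can
be overridden in lwipopts.h, but I have not verified what the Barrelfish tree
sets. The helper names (loopback_path_compiled_in, report_loopback_config) are
made up for illustration, and the fallback #defines exist only so the snippet
compiles on its own without the lwIP headers.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: in a real build these come from the lwIP option headers.
 * The fallbacks below only keep this sketch compilable standalone. */
#ifndef LWIP_NETIF_LOOPBACK
#define LWIP_NETIF_LOOPBACK 0
#endif
#ifndef LWIP_HAVE_LOOPIF
#define LWIP_HAVE_LOOPIF 0
#endif

/* Hypothetical helper: returns 1 if the "packet to self" branch in ip.c
 * would be compiled in, 0 otherwise. Uses the same condition as the #if
 * in the grep output above. */
static int loopback_path_compiled_in(void)
{
#if (LWIP_NETIF_LOOPBACK || LWIP_HAVE_LOOPIF)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: print both option values and the resulting verdict. */
static void report_loopback_config(FILE *out)
{
    assert(out != NULL);
    fprintf(out,
            "LWIP_NETIF_LOOPBACK=%d LWIP_HAVE_LOOPIF=%d -> loopback path %s\n",
            LWIP_NETIF_LOOPBACK, LWIP_HAVE_LOOPIF,
            loopback_path_compiled_in() ? "compiled in" : "NOT compiled in");
}
```

Calling report_loopback_config(stdout) early in the client or server (with the
real lwIP headers included before the fallbacks) should show whether packets
addressed to the machine's own IP can take the netif_loop_output() path at all.
If neither option is set, the same-machine test cannot work through that branch,
regardless of the waitset issue.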

cheers,
Kornilios.

-- 
Kornilios Kourtis


