   
 
 

xen-devel


To: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] Unmatched decrementing of net device reference count
From: Glauber de Oliveira Costa <gcosta@xxxxxxxxxx>
Date: Wed, 20 Dec 2006 11:21:51 -0200
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Wed, 20 Dec 2006 05:21:12 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <E1Gwoqk-00051M-00@xxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20061219153755.GD19551@xxxxxxxxxx> <E1Gwoqk-00051M-00@xxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.11
On Wed, Dec 20, 2006 at 10:58:22AM +1100, Herbert Xu wrote:
> Glauber de Oliveira Costa <gcosta@xxxxxxxxxx> wrote:
> > 
> > This bug was found when heavily stressing the netfront
> > attach/detach mechanism with the following script:
> > 
> >   for i in $(seq 200); 
> >   do 
> >     xm network-attach <domid>;  
> >     xm network-detach <domid> $i;
> >   done
> > 
> > Guest kernel shows the following messages:
> > 
> > unregister_netdevice: waiting for eth1 to become free. Usage count = -1
> > 
> > After this patch, it ran okay in multiple iterations
> 
> Could you please use in-line patches? It's much easier to comment on.
It is. I could swear I inlined it, but maybe I forgot.
 
> Your patch description doesn't make sense.  unregister_netdev()
> cannot possibly cause the device to be freed.  Otherwise the
> subsequent free_netdev() call which you kept would be wrong.

Indeed. I read it again, and it was confusing (I was confused myself).

I'll try to rephrase (I dug deeper and cleared things up, so this
should be more precise now):

unregister_netdev() works as a barrier in this case. The call to
netif_disconnect_backend() introduces a new carrier watch, which hold()s a
reference that will be put() at some later time. If we call free right
after that, put() may end up being called after the free. Nothing
prevents that memory region from having been allocated again to another
device by then.
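
To make the race concrete, here is a minimal user-space sketch in Python
(not kernel code). Slot and free_then_late_put() are made-up names for
illustration only; the model just assumes that the put() is deferred and
that freed memory is immediately reused by another device.

```python
class Slot:
    """Hypothetical stand-in for the memory region backing a net device."""

    def __init__(self):
        self.owner = None   # name of the device currently living here
        self.refcnt = 0     # reference count stored in that memory


def free_then_late_put():
    """Free right after the carrier watch is queued, then run the late put()."""
    slot = Slot()
    deferred = []           # work that only runs "at some later time"

    def put():
        slot.refcnt -= 1

    # eth1 lives in the slot; the disconnect path queues a carrier watch.
    slot.owner, slot.refcnt = "eth1", 0
    slot.refcnt += 1        # hold()
    deferred.append(put)    # the matching put() is deferred

    # Free immediately, without waiting for the deferred put().
    slot.owner, slot.refcnt = None, 0

    # The same memory region is handed out again to another device.
    slot.owner, slot.refcnt = "eth2", 0

    # The stale put() finally runs and lands on the new owner.
    for work in deferred:
        work()
    return slot.owner, slot.refcnt


print(free_then_late_put())
```

In this toy model the late put() drives the count of the unrelated
device below zero, which is the kind of unmatched decrement described
above.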

unregister_netdev() takes the rtnl lock. This means that when the lock is
released, netdev_run_todo() (which is set up by unregister_netdev()
itself, via net_set_todo()) will call netdev_wait_allrefs(), which
takes care of running linkwatch_run_queue(). Calling unregister_netdev()
between the carrier watch and free_netdev() guarantees that the device
is only freed after the watches have been handled.
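
The ordering argument can be sketched the same way, again as a Python
toy model and not as kernel code. ToyDevice, detach_with_barrier() and
the inner wait_allrefs() are hypothetical names that only loosely mirror
the kernel functions mentioned above; the sketch assumes every
outstanding put() is sitting in the pending watch queue.

```python
class ToyDevice:
    """Hypothetical net device that only tracks a refcount and a freed flag."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 0
        self.freed = False


def detach_with_barrier():
    """Drain pending watch work before freeing, and record the event order."""
    dev = ToyDevice("eth1")
    linkwatch = []          # deferred carrier-watch work
    events = []             # order in which things happen

    def put():
        dev.refcnt -= 1
        events.append("put")

    # Disconnect: the carrier watch takes a reference, its put() is deferred.
    dev.refcnt += 1
    linkwatch.append(put)

    # Unregister step: run pending watch work until no references remain.
    # (The real loop sleeps and retries; this toy version assumes the
    # queue already holds every outstanding put().)
    def wait_allrefs():
        while dev.refcnt != 0:
            linkwatch.pop(0)()
        events.append("refs drained")

    wait_allrefs()

    # Only now is the device freed.
    dev.freed = True
    events.append("free")
    return dev.refcnt, events


print(detach_with_barrier())
```

Here the put() is always recorded before the free, and the count is
back at zero when the device goes away.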

There are most probably other ways to guarantee that, such as
calling linkwatch_run_queue() directly. But I think we lose nothing
by calling unregister_netdev() in the middle, and we gain serialization
of the free.


> So most likely what's happening is that free_netdev() is occurring
> without a preceding unregister_netdev(), which implies that there
> is a bug in the frontend state transition.

That is not the case; see above.
 
-- 
Glauber de Oliveira Costa
Red Hat Inc.
"Free as in Freedom"
