
Re: [Xen-devel] [PATCH v2] xen-netfront: Improve error handling during initialization



On 02/07/2017 09:55 AM, Ross Lagerwall wrote:
> This fixes a crash when running out of grant refs when creating many
> queues across many netdevs.
>
> * If creating queues fails (i.e. there are no grant refs available),
> call xenbus_dev_fatal() to ensure that the xenbus device is set to the
> closed state.
> * If no queues are created, don't call xennet_disconnect_backend as
> netdev->real_num_tx_queues will not have been set correctly.
> * If setup_netfront() fails, ensure that all the queues created are
> cleaned up, not just those that have been set up.
> * If any queues were set up and an error occurs, call
> xennet_destroy_queues() to clean up the napi context.
> * If any fatal error occurs, unregister and destroy the netdev to avoid
> leaving around a half setup network device.
>
> Signed-off-by: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
> ---
>
> Changed in V2:
> * Retested on top of v4.10-rc7 + "xen-netfront: Delete rx_refill_timer
>   in xennet_disconnect_backend()".
> * Don't move setup_timer as it is not necessary.
>
>  drivers/net/xen-netfront.c | 33 +++++++++++++++------------------
>  1 file changed, 15 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 722fe9f..5399a86 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -1823,27 +1823,23 @@ static int talk_to_netback(struct xenbus_device *dev,
>               xennet_destroy_queues(info);
>  
>       err = xennet_create_queues(info, &num_queues);
> -     if (err < 0)
> -             goto destroy_ring;
> +     if (err < 0) {
> +             xenbus_dev_fatal(dev, err, "creating queues");
> +             if (num_queues > 0) {
> +                     goto destroy_ring;

The only way to have (err < 0) && (num_queues > 0) is when
xennet_create_queues() returns -ENOMEM right at the top, before any queue
has been initialized, isn't it? In that case there is nothing to disconnect
or destroy, it seems to me, and you can 'goto out' directly.
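
A minimal user-space sketch of that reasoning, assuming
xennet_create_queues() lowers num_queues only when per-queue initialization
fails and returns -ENOMEM before touching it. The names model_queue and
model_create_queues, and the two failure knobs, are hypothetical and are
not the driver's code:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct netfront_queue. */
struct model_queue {
	unsigned int id;
};

/*
 * Simplified model of the queue-creation step.
 * fail_alloc and fail_init_at are illustration-only knobs that force
 * the two failure paths; the real function has no such parameters.
 */
static int model_create_queues(struct model_queue **queues,
			       unsigned int *num_queues,
			       int fail_alloc, unsigned int fail_init_at)
{
	unsigned int i;

	assert(queues && num_queues && *num_queues > 0);

	/* Allocation "right at the top": on failure nothing was set up
	 * and *num_queues still holds the requested count. */
	*queues = fail_alloc ? NULL : calloc(*num_queues, sizeof(**queues));
	if (!*queues)
		return -ENOMEM;

	for (i = 0; i < *num_queues; i++) {
		if (i == fail_init_at) {
			/* Partial success: shrink the count to what works. */
			*num_queues = i;
			break;
		}
		(*queues)[i].id = i;
	}

	/* The only other error: no queue could be initialized at all.
	 * The array stays allocated; the caller has to free it. */
	if (*num_queues == 0)
		return -EINVAL;
	return 0;
}
```

In this model an error with a non-zero count can only come from the
allocation failure, where there are no queues to tear down. An error with a
zero count leaves the array allocated, which is the case the 'else' branch
in the patch frees.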

-boris

> +             } else {
> +                     kfree(info->queues);
> +                     info->queues = NULL;
> +                     goto out;
> +             }
> +     }
>  
>       /* Create shared ring, alloc event channel -- for each queue */
>       for (i = 0; i < num_queues; ++i) {
>               queue = &info->queues[i];
>               err = setup_netfront(dev, queue, feature_split_evtchn);
> -             if (err) {
> -                     /* setup_netfront() will tidy up the current
> -                      * queue on error, but we need to clean up
> -                      * those already allocated.
> -                      */
> -                     if (i > 0) {
> -                             rtnl_lock();
> -                             netif_set_real_num_tx_queues(info->netdev, i);
> -                             rtnl_unlock();
> -                             goto destroy_ring;
> -                     } else {
> -                             goto out;
> -                     }
> -             }
> +             if (err)
> +                     goto destroy_ring;
>       }
>  
>  again:
> @@ -1933,9 +1929,10 @@ static int talk_to_netback(struct xenbus_device *dev,
>       xenbus_transaction_end(xbt, 1);
>   destroy_ring:
>       xennet_disconnect_backend(info);
> -     kfree(info->queues);
> -     info->queues = NULL;
> +     xennet_destroy_queues(info);
>   out:
> +     unregister_netdev(info->netdev);
> +     xennet_free_netdev(info->netdev);
>       return err;
>  }
>  



