
Re: [Xen-devel] [PATCH] xen-netback: fix occasional leak of grant ref mappings under memory pressure




> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On Behalf Of 
> Paul Durrant
> Sent: 28 February 2019 11:22
> To: Wei Liu <wei.liu2@xxxxxxxxxx>
> Cc: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>; Wei Liu 
> <wei.liu2@xxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxxx; 
> davem@xxxxxxxxxxxxx
> Subject: Re: [Xen-devel] [PATCH] xen-netback: fix occasional leak of grant 
> ref mappings under memory
> pressure
> 
> > -----Original Message-----
> > From: Wei Liu [mailto:wei.liu2@xxxxxxxxxx]
> > Sent: 28 February 2019 11:02
> > To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> > Cc: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>; 
> > xen-devel@xxxxxxxxxxxxxxxxxxxx;
> > netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Wei Liu 
> > <wei.liu2@xxxxxxxxxx>;
> > davem@xxxxxxxxxxxxx
> > Subject: Re: [PATCH] xen-netback: fix occasional leak of grant ref mappings 
> > under memory pressure
> >
> > On Thu, Feb 28, 2019 at 09:46:57AM +0000, Paul Durrant wrote:
> > > > -----Original Message-----
> > > > From: Igor Druzhinin [mailto:igor.druzhinin@xxxxxxxxxx]
> > > > Sent: 28 February 2019 02:03
> > > > To: xen-devel@xxxxxxxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; 
> > > > linux-kernel@xxxxxxxxxxxxxxx
> > > > Cc: Wei Liu <wei.liu2@xxxxxxxxxx>; Paul Durrant 
> > > > <Paul.Durrant@xxxxxxxxxx>; davem@xxxxxxxxxxxxx; Igor
> > > > Druzhinin <igor.druzhinin@xxxxxxxxxx>
> > > > Subject: [PATCH] xen-netback: fix occasional leak of grant ref mappings 
> > > > under memory pressure
> > > >
> > > > Zero-copy callback flag is not yet set on frag list skb at the moment
> > > > xenvif_handle_frag_list() returns -ENOMEM. This eventually results in
> > > > leaking grant ref mappings since xenvif_zerocopy_callback() is never
> > > > called for these fragments. Those eventually build up and cause Xen
> > > > to kill Dom0 as the slots get reused for new mappings.
> > > >
> > > > That behavior is observed under certain workloads where sudden spikes
> > > > of page cache usage for writes coexist with active atomic skb 
> > > > allocations.
> > > >
> > > > Signed-off-by: Igor Druzhinin <igor.druzhinin@xxxxxxxxxx>
> > > > ---
> > > >  drivers/net/xen-netback/netback.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> > > > index 80aae3a..2023317 100644
> > > > --- a/drivers/net/xen-netback/netback.c
> > > > +++ b/drivers/net/xen-netback/netback.c
> > > > @@ -1146,9 +1146,12 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
> > > >
> > > >                 if (unlikely(skb_has_frag_list(skb))) {
> > > >                         if (xenvif_handle_frag_list(queue, skb)) {
> > > > +                               struct sk_buff *nskb =
> > > > +                                               skb_shinfo(skb)->frag_list;
> > > >                                 if (net_ratelimit())
> > > >                                         netdev_err(queue->vif->dev,
> > > >                                                    "Not enough memory to consolidate frag_list!\n");
> > > > +                               xenvif_skb_zerocopy_prepare(queue, nskb);
> > > >                                 xenvif_skb_zerocopy_prepare(queue, skb);
> > > >                                 kfree_skb(skb);
> > > >                                 continue;
> > >
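
(For context on why flagging nskb matters: xenvif_skb_zerocopy_prepare() is
assumed to do roughly the following. This is a sketch, not the literal driver
code, which lives elsewhere in xen-netback. Without the zerocopy flag set on
the frag_list skb, xenvif_zerocopy_callback() never runs for it when it is
freed, so its grant ref mappings are never released.)

	/* Sketch: mark the skb so the core networking code invokes the
	 * zerocopy callback when the skb is freed, and count it as an
	 * in-flight zerocopy packet for this queue.
	 */
	void xenvif_skb_zerocopy_prepare(struct xenvif_queue *queue,
					 struct sk_buff *skb)
	{
		skb_shinfo(skb)->tx_flags |= SKBTX_DEV_ZEROCOPY;
		atomic_inc(&queue->inflight_packets);
	}
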
> > > Whilst this fix will do the job, I think it would be better to get rid of
> > > the kfree_skb() from inside xenvif_handle_frag_list() and always deal with
> > > it here rather than having it happen in two different places. Something
> > > like the following...
> >
> > +1 for having only one place.
> >
> > >
> > > ---8<---
> > > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> > > index 80aae3a32c2a..093c7b860772 100644
> > > --- a/drivers/net/xen-netback/netback.c
> > > +++ b/drivers/net/xen-netback/netback.c
> > > @@ -1027,13 +1027,13 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
> > >  /* Consolidate skb with a frag_list into a brand new one with local pages on
> > >   * frags. Returns 0 or -ENOMEM if can't allocate new pages.
> > >   */
> > > -static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *skb)
> > > +static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *skb,
> > > +                                  struct sk_buff *nskb)
> > >  {
> > >         unsigned int offset = skb_headlen(skb);
> > >         skb_frag_t frags[MAX_SKB_FRAGS];
> > >         int i, f;
> > >         struct ubuf_info *uarg;
> > > -       struct sk_buff *nskb = skb_shinfo(skb)->frag_list;
> > >
> > >         queue->stats.tx_zerocopy_sent += 2;
> > >         queue->stats.tx_frag_overflow++;
> > > @@ -1072,11 +1072,6 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
> > >                 skb_frag_size_set(&frags[i], len);
> > >         }
> > >
> > > -       /* Copied all the bits from the frag list -- free it. */
> > > -       skb_frag_list_init(skb);
> > > -       xenvif_skb_zerocopy_prepare(queue, nskb);
> > > -       kfree_skb(nskb);
> > > -
> > >         /* Release all the original (foreign) frags. */
> > >         for (f = 0; f < skb_shinfo(skb)->nr_frags; f++)
> > >                 skb_frag_unref(skb, f);
> > > @@ -1145,7 +1140,11 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
> > >                 xenvif_fill_frags(queue, skb);
> > >
> > >                 if (unlikely(skb_has_frag_list(skb))) {
> > > -                       if (xenvif_handle_frag_list(queue, skb)) {
> > > +                       struct sk_buff *nskb = skb_shinfo(skb)->frag_list;
> > > +
> > > +                       xenvif_skb_zerocopy_prepare(queue, nskb);
> > > +
> > > +                       if (xenvif_handle_frag_list(queue, skb, nskb)) {
> > >                                 if (net_ratelimit())
> > >                                         netdev_err(queue->vif->dev,
> > >                                                    "Not enough memory to consolidate frag_list!\n");
> > > @@ -1153,6 +1152,10 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
> > >                                 kfree_skb(skb);
> > >                                 continue;
> > >                         }
> > > +
> > > +                       /* Copied all the bits from the frag list. */
> > > +                       skb_frag_list_init(skb);
> > > +                       kfree(nskb);
> >
> > I think you want kfree_skb here?
> 
> No. nskb is the frag list... it is unlinked from skb by the call to
> skb_frag_list_init() and then it can be freed on its own. The skb is what we
> need to retain, because that now contains all the data.

Sorry, I misread/misunderstood what you were getting at. Yes, I meant
kfree_skb(nskb).
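
With that change, the tail of the block in the diff above would end up looking
something like this (illustrative only, following the proposed diff rather than
any final patch; skb_frag_list_init() simply clears skb's frag_list pointer,
leaving nskb as a standalone skb that must go through kfree_skb() so that its
destructor and zerocopy completion still run):

			if (xenvif_handle_frag_list(queue, skb, nskb)) {
				if (net_ratelimit())
					netdev_err(queue->vif->dev,
						   "Not enough memory to consolidate frag_list!\n");
				xenvif_skb_zerocopy_prepare(queue, skb);
				/* nskb is still on skb's frag_list here and
				 * goes down with it.
				 */
				kfree_skb(skb);
				continue;
			}

			/* Copied all the bits from the frag list; nskb is now
			 * unlinked from skb and is freed as an skb in its own
			 * right, not with kfree().
			 */
			skb_frag_list_init(skb);
			kfree_skb(nskb);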

  Paul

> 
>   Cheers,
> 
>     Paul
> 
> >
> > Wei.
> >
> > >                 }
> > >
> > >                 skb->dev      = queue->vif->dev;
> > > ---8<---
> > >
> > > What do you think?
> > >
> > >   Paul
> > >
> > > > --
> > > > 2.7.4
> > >
> 
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

