[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness



On Thu, 2014-03-06 at 21:39 +0000, Zoltan Kiss wrote:
> On 06/03/14 17:30, Tim Deegan wrote:
> > At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote:
> >> On 06/03/14 15:53, Ian Campbell wrote:
> >>> On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
> >>>> By my understanding, there is no way rsp could be smaller than req, so
> >>>> there is no point having this. Am I missing something?
> >>>
> >>> It happens during wraparound, i.e. after req has wrapped but rsp hasn't
> >>> yet.
> >>
> >> The name of the macro suggest we are interested whether the ring has
> >> unconsumed requests, and netback uses it that way. The answer to that
> >> question is req_prod - req_cons. And it works if prod wrapped but cons
> >> didn't.
> >
> > Yes.
> >
> >> rsp calculates the number of "consumed but not responded" requests (it
> >> also works well if req_cons wrapped but rsp_prod_pvt didn't), then
> >> subtract it from the ring size.
> >
> > That is indeed an odd thing to check, since it seems like it could only
> > be relevant if the request producer overran the response producer.
> > It's been there in one form or another since the original ring.h,
> > and RING_REQUEST_CONS_OVERFLOW does something similar.
> >
> > I can't remember the original reasoning, and so I'm reluctant to
> > suggest removing it without some more eyes on the code...
> 
> I've added the following printk before the "req < rsp" part:
> 
>       if (rsp < req)                                                  \
>               pr_err("req %u rsp %u req_prod %u req_cons %u rsp_prod_pvt 
> %u\n", req, 
> rsp, (_r)->sring->req_prod, (_r)->req_cons, (_r)->rsp_prod_pvt); \
> 
> And it gave me such results:
> 
> xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod 
> 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755
> 
> So it can happen that req_prod is behind req_cons, sometimes even with 
> 17! But it always happen in this callback of my new grant mapping 
> series, which runs outside the NAPI instance. My theory why this can happen:
> - callback reads req_prod
> - frontend writes it
> - backend picks it up, and consumes those slots
> - callback reads req_cons

I'm a bit confused by what you mean by "it" in your theory, so perhaps I
misunderstand. Can you use the actual variable names for clarity.

Are you sure this is a problem with the actual code and not just with
your debug print? I would expect the real code to be snapshotting things
as appropriate etc, and also for the public/private state to not
necessarily be totally in sync when RING_HAS_UNCONSUMED_REQUESTS is
being called.

> So req can be near UINT_MAX if you call this macro outside the backend. 
> The only place where the actual return value of this macro matters is 
> xenvif_tx_build_gops, and it should be correct there. At other places we 
> are only looking for the fact whether the ring has unconsumed requests 
> or not. If prod is smaller than cons, we clearly read a wrong value. I 
> think what we can do:
> 1. try again until its correct
> 2. just return a non-zero value, it shouldn't cause too much trouble if 
> we say yes here
> 3. we can't see rsp_cons, so try to figure out if the ring is full of 
> consumed but not responded requests, and return zero then, otherwise a 
> positive value. That's what we do know.

s/know/now/? Is #3 here the status quo? 

> Does this make sense? Should we rather go option 1? Should I post a 
> comment patch to document this, and spare a few hours for future 
> generations? :)

Docs are always good IMHO. 

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.