[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] RING_HAS_UNCONSUMED_REQUESTS oddness



On Thu, Mar 06, 2014 at 09:39:40PM +0000, Zoltan Kiss wrote:
> On 06/03/14 17:30, Tim Deegan wrote:
> >At 16:31 +0000 on 06 Mar (1394119880), Zoltan Kiss wrote:
> >>On 06/03/14 15:53, Ian Campbell wrote:
> >>>On Thu, 2014-03-06 at 15:47 +0000, Zoltan Kiss wrote:
> >>>>By my understanding, there is no way rsp could be smaller than req, so
> >>>>there is no point having this. Am I missing something?
> >>>
> >>>It happens during wraparound, i.e. after req has wrapped but rsp hasn't
> >>>yet.
> >>
> >>The name of the macro suggest we are interested whether the ring has
> >>unconsumed requests, and netback uses it that way. The answer to that
> >>question is req_prod - req_cons. And it works if prod wrapped but cons
> >>didn't.
> >
> >Yes.
> >
> >>rsp calculates the number of "consumed but not responded" requests (it
> >>also works well if req_cons wrapped but rsp_prod_pvt didn't), then
> >>subtract it from the ring size.
> >
> >That is indeed an odd thing to check, since it seems like it could only
> >be relevant if the request producer overran the response producer.
> >It's been there in one form or another since the original ring.h,
> >and RING_REQUEST_CONS_OVERFLOW does something similar.
> >
> >I can't remember the original reasoning, and so I'm reluctant to
> >suggest removing it without some more eyes on the code...
> 
> I've added the following printk before the "req < rsp" part:
> 
>       if (rsp < req)                                                  \
>               pr_err("req %u rsp %u req_prod %u req_cons %u rsp_prod_pvt 
> %u\n",
> req, rsp, (_r)->sring->req_prod, (_r)->req_cons,
> (_r)->rsp_prod_pvt); \
> 
> And it gave me such results:
> 
> xen_netback:xenvif_zerocopy_callback: req 4294967279 rsp 52 req_prod
> 1770663942 req_cons 1770663959 rsp_prod_pvt 1770663755
> 
> So it can happen that req_prod is behind req_cons, sometimes even
> with 17! But it always happen in this callback of my new grant
> mapping series, which runs outside the NAPI instance. My theory why
> this can happen:
> - callback reads req_prod
> - frontend writes it
> - backend picks it up, and consumes those slots
> - callback reads req_cons
> 

So there's three parties accessing the ring? I *think* the ring is only
designed to be lockfree for two parties so I don't know whether this
model is legit.

Wei.

> So req can be near UINT_MAX if you call this macro outside the
> backend. The only place where the actual return value of this macro
> matters is xenvif_tx_build_gops, and it should be correct there. At
> other places we are only looking for the fact whether the ring has
> unconsumed requests or not. If prod is smaller than cons, we clearly
> read a wrong value. I think what we can do:
> 1. try again until its correct
> 2. just return a non-zero value, it shouldn't cause too much trouble
> if we say yes here
> 3. we can't see rsp_cons, so try to figure out if the ring is full
> of consumed but not responded requests, and return zero then,
> otherwise a positive value. That's what we do know.
> 
> Does this make sense? Should we rather go option 1? Should I post a
> comment patch to document this, and spare a few hours for future
> generations? :)
> 
> Zoli

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.