[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] Is: Telling QEMU to re-use PIRQ value Was: Re: [PATCH 1/2] xen, libxc: init msix addr/data with value from qemu via hypercall

On Fri, May 10, 2013 at 08:55:46AM +0100, Jan Beulich wrote:
> >>> On 10.05.13 at 09:39, Zhenzhong Duan <zhenzhong.duan@xxxxxxxxxx> wrote:
> > On 2013-05-10 14:37, Jan Beulich wrote:
> >>>>> On 10.05.13 at 04:49, Zhenzhong Duan <zhenzhong.duan@xxxxxxxxxx> wrote:
> >>> On 2013-05-10 03:05, Jan Beulich wrote:
> >>>>>>> Zhenzhong Duan <zhenzhong.duan@xxxxxxxxxx> 05/09/13 5:02 AM >>>
> >>>>> On 2013/5/8 20:03, Jan Beulich wrote:
> >>>>>> But of course I still don't really understand why all of the sudden
> >>>>>> this needs to be passed in rather than being under the full control
> >>>>>> of the hypervisor at all times. Perhaps this is related to me not
> >>>>>> understanding why the kernel would read these values at all:
> >>>>>> There's no other place in the kernel where the message would
> >>>>>> be read before first getting written (in fact, apart from the
> >>>>>> use of __read_msi_msg() by the Xen code, there's only one
> >>>>>> other user under arch/powerpc/, and there - according to the
> >>>>>> accompanying comment - this is just to save away the data for
> >>>>>> later use during resume).
> >>>>> There is a bug if msi_ad is not passed in.
> >>>>>
> >>>>> when driver first load,
> >>>>>
> >>>>> kernel.__read_msi_msg()
> >>>>> (got all zero)
> >>>> But you don't even comment on the apparently bogus use of the function 
> >>>> here.
> >>> This pattern is used only when hvm_pirq is enabled. kernel need to check
> >>> It's not a issue if data is 0 at first driver load, kernel will call
> >>> __write_msi_msg with pirq and  XEN_PIRQ_MSI_DATA set.
> >> But this doesn't make the use of __read_msi_msg() less bogus. It's
> >> not clear on what basis this mechanism got invented in the first
> >> place.
> > It's there since hvm_irq introduced. But it works indeed.
> But that doesn't in any way mean the concept is sound.
> >>>>> kernel.__write_msi_msg(pirq)
> >>>>> (ioreq passed to qemu as no msixtbl_entry established yet)
> >>>>> qemu.pt_msi_update_one()
> >>>>> xc_domain_update_msi_irq()
> >>>>> (msixtbl_entry dynamicly allocated with msi_ad all zero)
> >>>>>
> >>>>> then driver unload,
> >>>>> ...
> >>>>> driver load again,
> >>>>>
> >>>>> kernel.__read_msi_msg()
> >>>>> (got all zero from xen as accelerated entry just established with all 
> >>>>> zero)
> >>>> If all zeroes get returned, why would the flow here be different then 
> >>>> above?
> >>> Because pirq and related mapping and binding are not freed between
> >>> driver load-unload-load. They are freed when device detach.
> >>> We should try to use the last pirq.
> >> But then you need to solve the problem generically, i.e. not just
> >> for the driver reload case, but also for e.g. the kexec one (where
> >> __read_msi_msg() returning other than all zeros wouldn't help you
> >> as xen_irq_from_pirq() would then return -1, and you'd be back to
> >> the same problem.
> > No, not only kexec ones, it's driver unload that makes xen_irq_from_pirq 
> > return -1. So there is also a bug in kernel side.
> > I have sent a patch about kernel. I think you miss it.
> > http://www.gossamer-threads.com/lists/xen/devel/281498 
> >> IOW I think the prior IRQ needs to be freed
> >> anyway rather than an attempt be made to reuse it.
> > I have ever thought about this idea, but when to free the pirq is a problem.
> > When driver unload? qemu has no idea of if driver unloaded.
> But the kernel does, and hence could deal with this. As much as
> the setup is being done when the driver gets loaded, cleanup
> should be done when the driver gets unloaded. _If_ there
> already is such an odd protocol between kernel and qemu, then
> if that can't be dropped, it surely can be leveraged to also deal
> with the cleanup side of things? No need to fiddle with the

I don't know if such thing exists. Stefano, is there a way
to tell QEMU to re-use the PIRQ? Writting zero to the MSI?

> hypervisor interfaces for something that it's not supposed to
> know about anyway.
> > When msix entry masked? kernel mask and unmask msix entry 
> > intermittently, especially when irqbalance enabled.
> > 
> > So based on above, I think it's better to reuse same pirq, only free it 
> > when device detached.
> I continue to disagree. Also from a theoretical perspective - if you
> have a lot of devices that no driver is loaded for, you'd keep a lot
> of IRQs allocated without any need.

The guest has to use PHYSDEVOP_get_free_pirq to allocate it. And 
in this case we don't have a 'free_pirq' hypercall to release it. 

The Linux Xen<->IRQ drivers drops all of the information it has on
the PIRQ once the driver is unloaded (rightly so - the driver after
does not need the IRQ anymore and the PIRQ<->events connection has
been broken).

I wrote a tiny patch that needs improvements that would cache the
last seen BDF and PIRQ (that part is missing). That would allow us
to re-use the PIRQ and not call PHYSDEVOP_get_free_pirq until we
exhaust the allocation we have.

In other words - this can be fixed in the kernel.

But if there is a 'magic' value that can be written to QEMU to tell
it to re-use the PIRQ.. that would good too.
> Jan

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.