
Re: [Xen-users] Re: [Xen-devel] Re: HVM DomU, msi_translate=0, MSI/MSI-X PCI passthrough fails.



Wednesday, December 8, 2010, 4:48:57 PM, you wrote:

> On Wed, Dec 08, 2010 at 03:05:50PM +0100, Sander Eikelenboom wrote:
>> 
>> Wednesday, December 8, 2010, 2:48:48 PM, you wrote:
>> 
>> > On Wed, Dec 08, 2010 at 02:37:15PM +0100, Sander Eikelenboom wrote:
>> >> Hello Mark,
>> 
>> > Hi
>> 
>> >> 
>> >> Just a recap:
>> >>      you pass through:
>> >>      - 3 physical nics/IGB
>> >>      - 1 ISDN pci ISDN box
>> 
>> > The redfone box runs on 1 of the NICs - it's not separate. It converts
>> > ISDN to TDMoE, see here: http://www.red-fone.com/
>> 
>> So the problem is probably with the igb's.
>> Searching showed http://forums.virtualbox.org/viewtopic.php?f=7&t=32171 , 
>> perhaps worth a try ?

> Tried this - doesn't help.

>> 
>> Have you tried with just 1 IGB, and/or another simple 1Gb NIC (non-Intel) to 
>> see if it's due to any of the special offload features?

> Haven't got any other NICs to try unfortunately. Even if it did work
> with 1, it would be no use to me as I need 3.

I understand, but simplifying the setup to isolate the problem
could clarify things.

I also read your previous thread, and I saw that you hid 02:00.0 and 03:00.0
with xen-pciback (e1000e driver) there, but now you seem to be passing through
08:00.0 and 08:00.1 (igb)?
So I assume you have already tried 2 different kinds of NIC.

Note, though, that http://download.intel.com/design/network/specupdt/82574.pdf
lists errata regarding MSI-X interrupts and timing issues, with workarounds,
on the 82574 (02:00.0 and 03:00.0) NICs.
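
To make runs easier to compare while isolating the problem, the
pci_msix_writel errors quoted further down could be tallied per MSI-X entry.
Below is a minimal sketch in Python, assuming the qemu-dm log is available as
plain text; the function name count_msix_errors is hypothetical, and the
pattern is taken only from the message wording quoted in this thread, so it may
need adjusting for other qemu-dm versions.

```python
import re
from collections import Counter

# Pattern based on the qemu-dm message quoted in this thread, e.g.
# "pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is already function."
# Only the part up to the entry number is matched, so line wrapping after it is harmless.
MSIX_ERROR = re.compile(r"pci_msix_writel: Error: Can't update msix entry (\d+)")


def count_msix_errors(log_text):
    """Return a Counter mapping MSI-X entry number -> number of rejected updates."""
    return Counter(int(m.group(1)) for m in MSIX_ERROR.finditer(log_text))


if __name__ == "__main__":
    # Hypothetical sample built from the message format above.
    sample = "\n".join(
        "pci_msix_writel: Error: Can't update msix entry %d since MSI-X is already function." % e
        for e in (0, 0, 0, 1, 1, 1)
    )
    for entry, n in sorted(count_msix_errors(sample).items()):
        print("entry %d: %d rejected updates" % (entry, n))
```

Comparing these counts between a run with msitranslate=0 and one without it
would at least show whether the errors track that setting.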

--
Sander


>> 
>> 
>> >>      - all using msi/msi-x interrupts ?
>> 
>> > I tried using msi/msi-x interrupts, but it caused the raid card to drop
>> > off (after some use) and provided seemingly even worse performance than
>> > pegging everything back to legacy.
>> 
>> >> 
>> >> Have you tried using a PV domU instead of an HVM domU?
>> 
>> > I initially tried PV but had issues with the igb NICs. There was
>> > another thread somewhere about my issues with that.
>> 
>> 
>> >> Have you tried passing through only the ISDN box, and letting the network run 
>> >> with the Xen backend/frontend to rule out the IGB/network stuff?
>> >> 
>> >> 
>> >> --
>> >> Sander
>> >> 
>> >> 
>> >> 
>> >> Wednesday, December 8, 2010, 1:58:55 PM, you wrote:
>> >> 
>> >> > Hi - Apologies for top-posting, but after a lot of testing, I believe
>> >> > there must be an issue with IRQs going missing between domU and dom0.
>> >> > Unfortunately I have no data to prove this!
>> >> 
>> >> > With msitranslate=0 as detailed below, and pci=nomsi in the guest kernel
>> >> > grub config, all 3 NICs appear OK in the domU; however, I still had
>> >> > issues with the red-fone ISDN box. The interrupts were showing correctly
>> >> > (2000/s) in the domU but communication to the device via the NIC was
>> >> > still being interrupted (as shown in the asterisk console). Note that to
>> >> > get the igb driver to allow this many interrupts, the
>> >> > InterruptThrottleRate was set to 0. The same config (red-fone box,
>> >> > asterisk etc) works fine with a physical server.
>> >> 
>> >> > There is also the additional issue that I could not get the passthrough
>> >> > NICs to show correctly when I also had a bridge set up.
>> >> 
>> >> > Throughout my testing however, I could not get the machine to crash.
>> >> 
>> >> > Not sure where to go with this one. For now we are keeping our VoIP
>> >> > servers physical when ISDN connections are required.
>> >> 
>> >> > Regards,
>> >> > Mark
>> >> 
>> >> > On Mon, Nov 29, 2010 at 11:36:35AM -0500, Konrad Rzeszutek Wilk wrote:
>> >> >> > 
>> >> >> > In my new test setup, I have seen some strange behaviour. 1 of the 
>> >> >> > HVM's
>> >> >> > (with identical config in dom0 and domU) suddenly would not allow the
>> >> >> > igb driver to be loaded in domU, even though the device was visible 
>> >> >> > in
>> >> >> 
>> >> >> Let's create a new thread for this other issue.
>> >> >> 
>> >> >> > lspci. Shutting the machine down, removing the power cord, waiting 5
>> >> >> > seconds then plugging it in again corrected that issue - Is this
>> >> >> > possibly a motherboard bug? I have also disabled the SR-IOV
>> >> >> > functionality in the BIOS in case this is causing any issues.
>> >> >> > 
>> >> >> > In addition, to try to correct the MSI issue noted above, I have 
>> >> >> > changed
>> >> >> > my pci= line to the following:
>> >> >> > 
>> >> >> > pci=[ '08:00.0,msitranslate=0', '08:00.1,msitranslate=0' ]
>> >> >> 
>> >> >> With the msi_translate=1 turned on the DomU HVM guests did work, right?
>> >> >> 
>> >> >> > 
>> >> >> > This has stopped the "already in use on device" log, and the devices
>> >> >> > appear to show correctly in the domU. Is it safe to disable
>> >> >> > msitranslate? As I understand it, it's for allowing multifunction 
>> >> >> > devices
>> >> >> > to be seen as such in domU. Is that correct?
>> >> >> > 
>> >> >> > I haven't been able to reproduce the dropped raid issue yet, but I am
>> >> >> > awaiting delivery of the Red-Fone boxes (ISDN VoIP) which seem to 
>> >> >> > cause
>> >> >> > this due to their very high interrupt usage (2000 per second).
>> >> >> 
>> >> >> OK.
>> >> >> > 
>> >> >> > In the mean time, I can see the following in the qemu-dm logs now 
>> >> >> > with
>> >> >> > the msitranslate=0 enabled. Is it anything to worry about?
>> >> >> 
>> >> >> Well, the "Error" ones are pretty bad, though I am having a hard time
>> >> >> understanding what they mean. Let's copy some of the QEMU folks on this.
>> >> >> 
>> >> >> > pt_pci_write_config: Warning: Guest attempt to set address to unused 
>> >> >> > Base Address Register. [00:05.0][Offset:14h][Length:4]
>> >> >> > pt_ioport_map: e_phys=ffff pio_base=e880 len=32 index=2 first_map=0
>> >> >> > pt_ioport_map: e_phys=c220 pio_base=e880 len=32 index=2 first_map=0
>> >> >> > pt_pci_write_config: Warning: Guest attempt to set address to unused 
>> >> >> > Base Address Register. [00:06.0][Offset:14h][Length:4]
>> >> >> > pt_ioport_map: e_phys=ffff pio_base=ec00 len=32 index=2 first_map=0
>> >> >> > pt_ioport_map: e_phys=c240 pio_base=ec00 len=32 index=2 first_map=0
>> >> >> > pt_msix_update_one: Update msix entry 0 with pirq 4f gvec 59
>> >> >> > pt_msix_update_one: Update msix entry 1 with pirq 4e gvec 61
>> >> >> > pt_msix_update_one: Update msix entry 2 with pirq 4d gvec 69
>> >> >> > pt_msix_update_one: Update msix entry 3 with pirq 4c gvec 71
>> >> >> > pt_msix_update_one: Update msix entry 4 with pirq 4b gvec 79
>> >> >> > pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is 
>> >> >> > already function.
>> >> >> > pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is 
>> >> >> > already function.
>> >> >> > pci_msix_writel: Error: Can't update msix entry 0 since MSI-X is 
>> >> >> > already function.
>> >> >> > pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is 
>> >> >> > already function.
>> >> >> > pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is 
>> >> >> > already function.
>> >> >> > pci_msix_writel: Error: Can't update msix entry 1 since MSI-X is 
>> >> >> > already function.
>> >> >> > pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is 
>> >> >> > already function.
>> >> >> > pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is 
>> >> >> > already function.
>> >> >> > pci_msix_writel: Error: Can't update msix entry 2 since MSI-X is 
>> >> >> > already function.
>> >> >> > pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is 
>> >> >> > already function.
>> >> >> > pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is 
>> >> >> > already function.
>> >> >> > pci_msix_writel: Error: Can't update msix entry 3 since MSI-X is 
>> >> >> > already function.
>> >> >> > pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is 
>> >> >> > already function.
>> >> >> > pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is 
>> >> >> > already function.
>> >> >> > pci_msix_writel: Error: Can't update msix entry 4 since MSI-X is 
>> >> >> > already function.
>> >> >> > 
>> >> >> > > 
>> >> >> > > Not yet. Need the serial log of the Linux kernel and the Xen 
>> >> >> > > hypervisor when your
>> >> >> > > machine is toast. I mentioned in the previous email the key 
>> >> >> > > sequences - look on Google
>> >> >> > > on how to pass in SysRQ if you are using a serial concentrator.
>> >> >> > 
>> >> >> > I will do this when I can get the machine to crash.
>> >> >> > 
>> >> >> > Best Regards,
>> >> >> > Mark
>> >> >> > 
>> >> >> > _______________________________________________
>> >> >> > Xen-devel mailing list
>> >> >> > Xen-devel@xxxxxxxxxxxxxxxxxxx
>> >> >> > http://lists.xensource.com/xen-devel
>> >> 
>> >> 
>> >> 
>> >> 
>> >> 
>> >> -- 
>> >> Best regards,
>> >>  Sander                            mailto:linux@xxxxxxxxxxxxxx
>> >> 
>> 
>> 
>> 
>> -- 
>> Best regards,
>>  Sander                            mailto:linux@xxxxxxxxxxxxxx
>> 
>> 
>> _______________________________________________
>> Xen-users mailing list
>> Xen-users@xxxxxxxxxxxxxxxxxxx
>> http://lists.xensource.com/xen-users




-- 
Best regards,
 Sander                            mailto:linux@xxxxxxxxxxxxxx


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

