
Re: [Xen-devel] Re: Comments on Xen bug 1732



On Mon, 2011-01-31 at 13:18 +0000, Jan Beulich wrote:
> >>> On 31.01.11 at 05:54, Haitao Shan <maillists.shan@xxxxxxxxx> wrote:
> After taking a closer look:
> 
> > As you may already notice the bug 1732, (
> > http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1732), the culprit is
> > c/s 22182.
> 
> The warnings are a result of the c/s, but if there are functionality
> problems, they shouldn't be caused by this: The MSI-X table's base
> address was always determined from the value passed from Dom0
> (the raw address found in the BAR) plus the table offset as found
> in the MSI-X capability structure.

Actually, I do have some functionality problems, and they coincide with
these WARN()s.

> > I see the following attached code in your patch. It is pointless to check
> > msi->table_base against the value read from physical device if this function
> > is a virtual function of SR-IOV device. VFs are required to have BARs zeroed
> > by specifications. And for VFs, unless you can read these values from
> > corresponding PF, you will have to trust the "table_base" passed from dom0
> > via hypercall. Actually, this parameter is specifically introduced for
> > enabling SR-IOV.
> 
> One important question then is whether there's a way for Xen to
> determine the PF for the VF and the correct BAR to use without
> additional help from Dom0. If that's not possible, passing down the
> BAR contents needed for the PBA base address calculation on a
> VF would be necessary, which would require a new sub-hypercall.

In my case (HVM) it looks like qemu has figured out the correct base
address for the PBA.

> The only exception to this would be if both use the same BAR (and
> really if that's a common case, a simple initial fix could be to use
> the passed down table_base value also for pba_paddr if the two
> BIRs match).
> 
> In any case I am of the opinion that all of the warnings make
> sense currently, with the sole exception of the VF case of the
> msi->table_base != read_pci_mem_bar() one (avoiding this
> would require Xen to at least have a way to recognize a given
> <bus>:<dev>.<func> is a VF).

I see.

> > BTW: I vaguely recall that MSI-X table base might not be the first page of
> > the corresponding BAR register.
> 
> Indeed - that's what is being accounted for using table_offset (read
> from MSI-X capability structure + msix_table_offset_reg()).

In my case the device is ixgbe and yes, it seems to follow the 8KB
alignment recommendation.
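
For reference, the address arithmetic discussed above can be sketched
roughly as follows. This is only an illustration in Python, not Xen code:
the function names are made up, and the register values in the note after
the block are hypothetical. The only thing taken from the PCI
specification is that the low three bits of the Table Offset/BIR and PBA
Offset/BIR registers select the BAR and the remaining bits are the offset.
The same-BAR shortcut for the PBA is the one suggested in the quoted text.

```python
BIR_MASK = 0x7  # low 3 bits of an Offset/BIR register select the BAR


def split_offset_bir(reg):
    """Return (bir, offset) for a 32-bit Table or PBA Offset/BIR value."""
    return reg & BIR_MASK, reg & 0xFFFFFFF8


def msix_addresses(bar_base, table_reg, pba_reg):
    """Return (table_addr, pba_addr) given the base of the table's BAR.

    pba_addr is None when the PBA lives in a different BAR, because
    bar_base then says nothing about where the PBA is.
    """
    table_bir, table_off = split_offset_bir(table_reg)
    pba_bir, pba_off = split_offset_bir(pba_reg)
    table_addr = bar_base + table_off
    pba_addr = bar_base + pba_off if pba_bir == table_bir else None
    return table_addr, pba_addr
```

With a BAR base of 0xdd800000, a hypothetical table register of 0x3 and a
hypothetical PBA register of 0x2003 give a table at 0xdd800000 and a PBA
at 0xdd802000. A PBA register of 0x2004 (a different BIR) gives None for
the PBA, which is the case that would need extra help from Dom0.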

The actual symptom I am seeing is a lot of output like this in the guest
with the VF passed through:

ixgbevf: eth: ixgbevf_reset: PF still resetting
ixgbevf: eth: ixgbevf_open: Unable to start - perhaps the PFDriver isn't up yet
ixgbevf: eth: ixgbevf_check_tx_hang: Detected Tx Unit Hang
  Tx Queue             <0>
  TDH, TDT             <0>, <1>
  next_to_use          <1>
  next_to_clean        <0>
tx_buffer_info[next_to_clean]
  time_stamp           <fffd2f6b>
  jiffies              <fffd3db4>
ixgbevf: eth: ixgbevf_clean_tx_irq: tx hang 3 detected, resetting adapter
ixgbevf: eth: ixgbevf_watchdog_task: NIC Link is Up 10 Gbps

And correspondingly no Tx or Rx traffic at all. It all seems very much
like a lack of interrupts, but /proc/interrupts shows good numbers:

201:        146       PCI-MSI-X  eth-rx-0
209:        140       PCI-MSI-X  eth-tx-0
217:          8       PCI-MSI-X  eth:mbx
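
For scale, the time_stamp and jiffies values in the Tx hang dump above
can be subtracted to see how long the stuck descriptor had been pending.
A small Python sketch, assuming the counter is 32 bits wide as printed
(the helper name is made up):

```python
def jiffies_delta(now, then, bits=32):
    """Ticks elapsed from `then` to `now`, tolerating counter wraparound."""
    return (now - then) & ((1 << bits) - 1)


# Values copied from the ixgbevf_check_tx_hang dump.
stale_ticks = jiffies_delta(0xfffd3db4, 0xfffd2f6b)
print(stale_ticks)  # 3657
```

Converting those ticks to seconds depends on the guest kernel's HZ, which
the log does not show.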

Furthermore, this used to work on Xen 3.4 but fails on 4.1, so it seems
to be a regression. One other notable change is the assignment of the
MSI-X vectors that I see when hitting the Q debug key:

On 3.4:
(XEN) 04:10.0 - dom 1   - MSIs < 66 74 82 >

On 4.1:
(XEN) 04:10.1 - dom 0   - MSIs < 117 118 119 >

However qemu seems happy with it all in either case:

Mar 15 18:00:30 localhost qemu.1[10344]: pt_register_regions: IO region registered (size=0x00004000 base_addr=0xdd700004)
Mar 15 18:00:30 localhost qemu.1[10344]: pt_register_regions: IO region registered (size=0x00004000 base_addr=0xdd800004)
Mar 15 18:00:30 localhost qemu.1[10344]: pt_msix_init: get MSI-X table bar base dd800000
Mar 15 18:00:30 localhost qemu.1[10344]: pt_msix_init: table_off = 0, total_entries = 3
Mar 15 18:00:30 localhost qemu.1[10344]: pt_msix_init: errno = 2
Mar 15 18:00:30 localhost qemu.1[10344]: pt_msix_init: mapping physical MSI-X table to b5d91000
Mar 15 18:00:30 localhost qemu.1[10344]: register_real_device: Real physical device 04:10.1 registered successfuly! IRQ type = INTx
...
Mar 15 18:01:13 localhost qemu.1[10344]: pt_msix_update_one: Update msix entry 0 with pirq 56 gvec b9
Mar 15 18:01:13 localhost qemu.1[10344]: pt_msix_update_one: Update msix entry 1 with pirq 55 gvec c1
Mar 15 18:01:13 localhost qemu.1[10344]: pt_msix_update_one: Update msix entry 2 with pirq 54 gvec c9
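
As an aside on the base_addr values in the log: the trailing 4 is not
part of the address. A rough Python sketch of how a raw memory BAR value
splits into flags and base, following the standard PCI BAR layout (the
function name is made up for illustration, and only the low 32 bits are
handled):

```python
def decode_mem_bar(raw):
    """Split the low dword of a raw memory BAR into base address and flags."""
    if raw & 0x1:
        raise ValueError("bit 0 set: this is an I/O BAR, not a memory BAR")
    return {
        "base": raw & 0xFFFFFFF0,             # bits 31:4 hold the address
        "is_64bit": ((raw >> 1) & 0x3) == 2,  # type field, bits 2:1
        "prefetchable": bool(raw & 0x8),      # bit 3
    }


bar = decode_mem_bar(0xdd800004)
```

For 0xdd800004 this gives a base of 0xdd800000 in a 64-bit,
non-prefetchable BAR, which is consistent with the "get MSI-X table bar
base dd800000" line.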

Any ideas?

Gianni

