
Re: [Xen-devel] segfault in xl create for HVM with PCI passthrough



Ian,
apologies for the ping, but I am not sure whether you are expecting anything more from me, beyond the answers in my last message (copied below), before you can judge where the issue lies and what it might be.

Many thanks in advance, Atom2

P.S. In case you need the attachments to my last message again, please let me know.

On 29.10.14 at 01:26, Atom2 wrote:
To keep the thread together I am resubmitting the relevant parts of
my last answer (which, due to an error on my part, originally went out
to Ian only; I forwarded it to the list only afterwards, which resulted
in an out-of-thread appearance) together with the (new) results of my
gdb exercise. Sorry for any confusion this may have caused.

On 28.10.14 at 17:04, Ian Campbell wrote:
[...]
With regard to gdb: I can certainly run the command under gdb after
adding debug support to the executables - that's no big deal.
I would, however, ask for your advice as to what I need to recompile
with debugger support? Is xen-tools (which includes xl) sufficient

I think just the Xen bits would be sufficient, at least to start with.

  or
would you think that I also need to include debug support for gcc, as the
library that is mentioned in /var/log/messages (libgcc_s.so.1) seems to
belong to the gcc package? Or is this library a red herring that merely
acts as the catch-all code that ends up handling the segfault?

I'd recommend ignoring it for now; that might change if the backtrace
from just the Xen bits suggests a gcc issue. My money right now is on
it being a Xen issue though.

After recompiling xen-tools with gdb debug support I started the
following command:
# gdb --args /usr/sbin/xl create pfsense -c

Please find the command's screen output after its start up to the
segfault including the output of the bt command after the segfault in
the attached document named "create".

Furthermore I did the same for the destroy command:
# gdb --args /usr/sbin/xl destroy pfsense

The output of this command is in the attached document named "destroy".

I haven't got much experience with gdb yet, so I am unable to interpret
the output of either. If more or different information is required,
please advise me what to do next. Thanks.
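
For reference, the two gdb sessions above could also be captured
non-interactively. What follows is only a minimal sketch, not something
taken from this thread: the helper name capture_bt, the GDB override
variable and the log file names are hypothetical, while -batch, -ex and
--args are standard gdb options. It assumes gdb is installed and the
binaries were built with debug symbols.

```shell
# Sketch: run a command under gdb in batch mode, print a full backtrace
# once the program stops (e.g. on SIGSEGV), and keep a copy in a log file.
# capture_bt, GDB and the log file name are illustrative, not from the thread.
capture_bt() {
    log="$1"; shift
    # "run" starts the program; "bt full" is executed after it stops or exits.
    "${GDB:-gdb}" -batch -ex run -ex "bt full" --args "$@" 2>&1 | tee "$log"
}
```

It might be called as "capture_bt create.log /usr/sbin/xl create pfsense -c"
or "capture_bt destroy.log /usr/sbin/xl destroy pfsense". If the program
does not crash, "bt full" simply reports that there is no stack, and with
-c any console output presumably ends up in the log as well.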

[...]
pci          = [ '04:00.0', '0a:08.0', '0a:0b.0' ]

You say in $subject that the failure is with PCI, is that because
you've tried an HVM domain without and it is ok, or is it just that all
your HVM domains happen to have passthrough enabled?
I haven't tried HVM domains without PCI passthrough so far (only PV
domains without PCI passthrough, and those did not segfault), as all my
HVM domains require PCI devices (either at least a network card for
pfsense - in actual fact more than one is being passed through - or a
SATA controller for my second HVM, which is used as a storage VM).

The VM doesn't need to be fully functional, it just needs to boot
without crashing the toolstack. Just running your existing VM with the
pci line commented out would be useful.
Before recompiling xen-tools I made a quick test as you suggested
and commented out the pci line in my config file ... and the boot menu
showed up (which it had not before, when the segfault happened).
I did not boot the pfsense VM any further as this might lead to a change
in my configuration due to missing devices, but at first sight this
seemed to indicate that it has to do with the PCI passthrough
functionality.
However, as I did not want to boot the machine (and "xl shutdown" did
not work, not even with -F), I then decided to run
     xl destroy pfsense
and that printed a segmentation fault message (in both the shell window
where I started the command from and the console window where the
boot-menu was shown) despite no PCI devices being passed through.

To also check PCI passthrough with a PV domain: I added a pci device to
a config file for a PV domain and started that with
     xl create voip -c
The boot menu appeared without issues. I then also tried
     xl destroy voip
from another window and that issued the following error messages in the
shell window (without using any -vvv option):

# xl destroy voip
libxl: error: libxl_pci.c:1247:do_pci_remove: xc_domain_irq_permission irq=17
libxl: error: libxl_device.c:1127:libxl__wait_for_backend: Backend /local/domain/0/backend/pci/4/0 not ready
libxl: error: libxl_pci.c:1247:do_pci_remove: xc_domain_irq_permission irq=16
libxl: error: libxl_device.c:1127:libxl__wait_for_backend: Backend /local/domain/0/backend/pci/4/0 not ready
libxl: error: libxl_pci.c:1247:do_pci_remove: xc_domain_irq_permission irq=23
libxl: error: libxl_device.c:1127:libxl__wait_for_backend: Backend /local/domain/0/backend/pci/4/0 not ready
Segmentation fault

The "Segmentation fault" message also appeared in both the console
window for the domU and the shell window.

This all seems a bit strange to me at the moment, but I am sure with
your help we will get to the bottom of this.

Thanks and regards Atom


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

