
Re: [Xen-devel] Fwd: Re: Xen 4.3 / 4.4 - concurrent APIs, VGA Passthru


  • To: Gordan Bobic <gordan@xxxxxxxxxx>
  • From: Georg Bege <therion@xxxxxxxxxxxx>
  • Date: Sun, 03 Aug 2014 03:09:31 +0200
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Sun, 03 Aug 2014 01:10:57 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>

Hi

I have now tried your suggestions with Xen 4.3.2-r4 (Gentoo).
On Win7 64-bit it is not working at all; I cannot pass through the GFX.
I rolled back to a snapshot where I had installed everything except the
nvidia package.

I then installed it; that works, but after the reboot I get a message like:
"This device reported an error and had to be stopped, code 43."
in Device Manager.
This happens whether I give the DomU 8GB or 1024MB; same result.
On Xen 4.4 I can pass through the GFX without problems, but it is very
slow (as I have already written).

On WinXP 64-bit it is not working either; there I get the ominous error
that you think is memory corruption related to the IOMMU.
Either the machine freezes immediately, or it breaks some memory tables (I
think in nouveau, because I got error messages about this in the Dom0
dmesg) and everything runs in extreme slow motion; I can barely do anything.
This happens whether I give the DomU 8GB or just 1024MB, so I could not
really try what you suggested: 1GB of memory did not fix it for me, and
this time it happened not only on xl destroy but immediately once the
system (the WinXP DomU) came up.

However, on Xen 4.4 at least the WinXP DomU is working fine. You already
warned me that this does not mean there is no hidden corruption at all,
but so far it has worked for me with numerous games/applications,
including copying a lot of data via Samba.

regards,
Georg

On 29.07.2014 12:38, Gordan Bobic wrote:
> BTW, why off-list?
>
> Replies inline below.
>
> On 2014-07-29 08:39, Georg Bege wrote:
>> Hi
>>
>> On 29.07.2014 09:10, Gordan Bobic wrote:
>>> On 07/29/2014 02:12 AM, Georg Bege wrote:
>>>> Hi Again,
>>>>
>>>> no, with Xen 4.3 it is not working for me; in Win XP64 I get kind of
>>>> abstract distortions on the screen,
>>>
>>> That sounds suspiciously like the IOMEM memory stomp I see with my
>>> motherboard.
>>>
>>>> not much later I get a kernel panic on Dom0 regarding IRQ16 (something
>>>> like "IRQ16: nobody cared.. oops"); the kernel gets damaged by it and
>>>> then the whole machine runs as if in slow motion (everything in X11,
>>>> moving the mouse etc.); all I can do then is a hard reboot...
>>>
>>> I have seen the interrupt issue before but _only_ when I xl destroy
>>> the domUs. It never happened on a clean shutdown/reboot of domUs. The
>>> interrupt you mention - does it happen to be the IRQ used by your USB
>>> controller?
>> I'm not sure, but IRQ 16 is used by:
>>  16:    2065542          0          0          0          0          0          0          0  xen-pirq-ioapic-level  ehci_hcd:usb1, snd_emu10k1, nouveau
>
> ehci_hcd:usb1 sounds like USB.
>
> If you look at lspci -vvv, look for "IRQ 16". It should match.
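
For what it's worth, the matching step above could be scripted roughly
like this. It is only a sketch: the function name devices_on_irq and the
sample text are made up for illustration, and it assumes lspci -vvv
prints one unindented header line per device followed by indented detail
lines that mention "IRQ <n>".

```python
import re


def devices_on_irq(lspci_text, irq):
    """Return the header lines of devices whose details mention 'IRQ <irq>'."""
    # \b keeps "IRQ 16" from also matching "IRQ 160".
    pattern = re.compile(rf"\bIRQ {irq}\b")
    devices = []
    current = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # Unindented line: start of a new device entry.
            current = line.strip()
        elif current and pattern.search(line) and current not in devices:
            devices.append(current)
    return devices


# Hypothetical sample in the assumed lspci -vvv layout (not real output).
SAMPLE = """\
00:1a.0 USB controller: Example EHCI controller
\tInterrupt: pin A routed to IRQ 16
01:00.0 VGA compatible controller: Example GPU
\tInterrupt: pin A routed to IRQ 16
02:00.0 Ethernet controller: Example NIC
\tInterrupt: pin A routed to IRQ 18
"""

shared = devices_on_irq(SAMPLE, 16)
```

To use it on a real box, feed it the text of lspci -vvv (run as root so
the interrupt details are shown) instead of SAMPLE, then compare the
result with the drivers listed for that IRQ in /proc/interrupts.
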
>
> Unfortunately, I never got to the bottom of what the root cause of
> the fault is. However, I only ever see it when:
>
> 1) I force destroy the domU (xl destroy).
>
> 2) The domU crashes due to the memory stomp when it tries to
> write to a memory address range that is in the PCI IOMEM range
> which due to a bug isn't getting remapped by the IOMMU.
>
> I thus assumed this was all related to the same IOMMU bug,
> but I am not 100% certain.
>
> Try a fresh, clean reboot and start your domU with only 1GB of RAM
> and see if the problem goes away.
>
> If that works OK, look at the output of the following:
> # lspci -vvv | grep "Region .: Memory at" | sed -e 's/.* Memory at //' | sort
>
> Look at the first value it returns. Convert that from hex to
> decimal, and set your domU memory to 1MB below that (if that is
> less than the 1GB you tried above, try it anyway). If it still
> works without a crash, there is a good chance you are up against
> the same bug as me.
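
The lookup and arithmetic described above can also be sketched in
Python. This is a rough illustration only: the function names and the
sample text are invented here, and it assumes the "Region N: Memory at
<hex>" line format that the grep in the command matches. One
difference from the shell pipeline: it takes the numeric minimum rather
than relying on a text sort, which could misorder hex addresses of
different lengths.

```python
import re

# Matches lines such as "Region 0: Memory at f6000000 (32-bit, ...)".
# Entries like "Memory at <unassigned>" are skipped because they are not hex.
REGION_RE = re.compile(r"Region \d+: Memory at ([0-9a-fA-F]+)\b")


def lowest_iomem_address(lspci_text):
    """Return the lowest PCI memory region address (in bytes) found in the text."""
    addresses = [int(m.group(1), 16) for m in REGION_RE.finditer(lspci_text)]
    if not addresses:
        raise ValueError("no 'Region N: Memory at <hex>' lines found")
    return min(addresses)


def domu_memory_mib(lowest_address):
    """Return a memory size in MiB that ends at least 1MB below the address."""
    return lowest_address // (1024 * 1024) - 1


# Hypothetical sample in the assumed lspci -vvv layout (not real output).
SAMPLE = """\
\tRegion 0: Memory at f6000000 (32-bit, non-prefetchable) [size=16M]
\tRegion 1: Memory at e0000000 (64-bit, prefetchable) [size=256M]
\tRegion 3: Memory at f0000000 (64-bit, prefetchable) [size=32M]
\tRegion 5: I/O ports at e000 [size=128]
"""

lowest = lowest_iomem_address(SAMPLE)   # 0xe0000000 = 3758096384 bytes
memory = domu_memory_mib(lowest)        # 3584 MiB - 1 = 3583
```

With this made-up sample the result would correspond to "memory = 3583"
in the domU config; on a real machine the number depends entirely on
what lspci reports.
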
>
> For Windows 7 domU you could potentially work around the
> problem by using bcdedit to mark the IOMEM aperture memory
> blocks as faulty so the OS will never try to use it.
>
> My motherboard only knows how to assign IOMEM below 4GB,
> so I found it easier to just mark everything between 1GB
> and 4GB as "reserved" in the domU e820 map. If your BIOS
> knows how to map things above 4GB, you may need to use the
> mentioned trick of marking the memory areas as faulty.
>
> Caveat - I don't actually know what happens if the domU
> IOMEM mapping for the GPU overlaps an IOMEM mapping in dom0.
> Thankfully, QEMU doesn't map my GPUs to such IOMEM ranges
> so I never had to find out. There has been a proposal to
> introduce an e820_host option to HVM guests (it already
> exists in PV guests), which might help, and with the
> work recently going into PVH domains, this option may become
> available to work around the hardware bug I'm talking about.
> Either way, if your domU GPU apertures aren't overlapping
> any dom0 apertures, it should be fine.
>
>>> I always assumed that this was to do with other IOMEM issues to do
>>> with my motherboard, but since it only happens when I have to
>>> forcefully destroy domUs it doesn't bother me too much.
>>>
>>> What motherboard do you use?
>> Intel DX79SI
>>>
>>> How much RAM are you giving to your domUs? Can you check if it happens
>>> when you only give the domU 1GB of RAM?
>> I could try, maybe later on. Right now I'm on Xen 4.4 again and just
>> got my XP x64 running fine;
>> maybe I'll also test Xen 4.5 on that box later.
>
> The "running fine" can be deceptive. If this is the bug I think it
> is, there is a good chance the problem will not manifest until you
> overrun the PCI IOMEM region. Try loading a program that fills up
> all or most of memory and see if that triggers weird artifacting
> and crashing. When I was doing my testing I found that firing up
> Borderlands 2 and waiting for the intro was a very easy way to cause
> the crash to occur. I'm sure there are many other games that are
> similarly resource intensive that will cause it to happen.
>
> One thing to be aware of, though - if you are unlucky and the
> IOMEM stomp clobbers the aperture used by your disk controller
> you could end up with data corrupted on the disk.
>
>> Btw. I read this in advance, just realized you are the same guy
>> http://xen.1045712.n5.nabble.com/GPU-passthrough-on-Xen-4-4-0-FLReset-td5722964.html
>>
>
> Yes.
>
> Gordan

