
Re: [Xen-devel] AMD GPU passthrough in Xen



Wednesday, September 24, 2014, 3:58:57 PM, you wrote:

> Sander,

> You have given me some interesting information.  First off, I would like to 
> ask you about one of your comments;

>> Yes, from what i see the problem seems to be that the Radeon card is not
>> reset to a state where it will regard itself as "unposted". On the first
>> boot of the guest i see the driver report it hasn't posted (which is
>> correct), it starts to post, and it works. On subsequent boots of the domU
>> i don't see the unposted message

> I am curious how you can tell if the driver is posting or not?  Which message 
> are you looking at? 

I'm looking at the domU dmesg, with domU kernel 3.17-rc5. Older kernels had a
problem where the secondary card got the shadow ROM from the primary (emulated
cirrus) card, see https://lkml.org/lkml/2014/2/13/109 (i forgot the version
number where it got fixed), so a recent domU kernel is best :)

In the domU on first boot it says it's not posted:

[    4.346649] [drm] Initialized drm 1.1.0 20060810
[    4.359722] [drm] radeon kernel modesetting enabled.
[    4.380435] xen: --> pirq=32 -> irq=36 (gsi=36)
[    4.383891] [drm] initializing kernel modesetting (TURKS 0x1002:0x6759 0x174B:0xE193).
[    4.405597] [drm] register mmio base: 0xF3060000
[    4.418262] [drm] register mmio size: 131072
[    4.559911] ATOM BIOS: ELIXIR
[    4.569063] [drm] GPU not posted. posting now...
[    4.585542] radeon 0000:00:05.0: VRAM: 1024M 0x0000000000000000 - 0x000000003FFFFFFF (1024M used)
[    4.609664] radeon 0000:00:05.0: GTT: 1024M 0x0000000040000000 - 0x000000007FFFFFFF
....

In the domU on subsequent boots that message is not there, and i end up with
garbage on the screen (it looks somewhat like white noise) and errors like:

[    4.321952] [drm] Initialized drm 1.1.0 20060810
[    4.335246] [drm] radeon kernel modesetting enabled.
[    4.355349] xen: --> pirq=32 -> irq=36 (gsi=36)
[    4.359541] [drm] initializing kernel modesetting (TURKS 0x1002:0x6759 0x174B:0xE193).
[    4.385655] [drm] register mmio base: 0xF3060000
[    4.401814] [drm] register mmio size: 131072
[    4.599903] ATOM BIOS: ELIXIR
[    4.609170] radeon 0000:00:05.0: VRAM: 1024M 0x0000000000000000 - 0x000000003FFFFFFF (1024M used)
[    4.632895] radeon 0000:00:05.0: GTT: 1024M 0x0000000040000000 - 0x000000007FFFFFFF
....
[    5.126844] [drm:radeon_pm_init_dpm] *ERROR* radeon: dpm initialization failed
....
[    5.555322] [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
....
 
(complete dmesg from both boots attached)
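
A minimal sketch of checking a domU dmesg for the two symptoms above: the
"GPU not posted" line and any [drm:...] *ERROR* lines. The function name and
the inline sample logs are only illustrative; the sample lines are copied from
the excerpts above with the mail line wrapping undone.

```python
import re

# Message the radeon driver prints when it finds the card unposted.
POSTING_MSG = "GPU not posted. posting now"
# Matches lines such as "[drm:r600_ring_test] *ERROR* radeon: ..."
ERROR_RE = re.compile(r"\[drm:(\w+)\] \*ERROR\* (.*)")


def summarize_radeon_boot(dmesg_text):
    """Return (posted_this_boot, errors) for one boot's dmesg text.

    errors is a list of (function_name, message) tuples.
    """
    posted = False
    errors = []
    for line in dmesg_text.splitlines():
        if POSTING_MSG in line:
            posted = True
        match = ERROR_RE.search(line)
        if match:
            errors.append((match.group(1), match.group(2).strip()))
    return posted, errors


FIRST_BOOT = """\
[    4.559911] ATOM BIOS: ELIXIR
[    4.569063] [drm] GPU not posted. posting now...
"""

SECOND_BOOT = """\
[    4.599903] ATOM BIOS: ELIXIR
[    5.126844] [drm:radeon_pm_init_dpm] *ERROR* radeon: dpm initialization failed
[    5.555322] [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
"""

if __name__ == "__main__":
    for name, log in (("first boot", FIRST_BOOT), ("second boot", SECOND_BOOT)):
        posted, errors = summarize_radeon_boot(log)
        print(name, "posted:", posted, "errors:", [fn for fn, _ in errors])
```

The same function could be fed the saved output of dmesg from each boot; it
only looks at message text, so it assumes each kernel message is on one line.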

--
Sander

> Thanks,
> Kelly

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@xxxxxxxxxxxxxx]
>> Sent: Wednesday, September 24, 2014 9:22 AM
>> To: Zytaruk, Kelly
>> Cc: Peter Kay; xen-devel@xxxxxxxxxxxxx; Gordan Bobic; Konrad Rzeszutek Wilk
>> Subject: Re: [Xen-devel] AMD GPU passthrough in Xen
>> 
>> 
>> Wednesday, September 24, 2014, 2:12:38 PM, you wrote:
>> 
>> >> Good to know there is interest from AMD into this area :-)
>> 
>> > I am taking a personal interest in this and would like to improve AMD
>> > support and presence within the Xen community.
>> 
>> Thanks for that !
>> 
>> > Gordan has also reported problems restarting a guest.
>> 
>> Added Konrad to the CC as pciback maintainer.
>> 
>> Yes, from what i see the problem seems to be that the Radeon card is not
>> reset to a state where it will regard itself as "unposted". On the first
>> boot of the guest i see the driver report it hasn't posted (which is
>> correct), it starts to post, and it works. On subsequent boots of the domU
>> i don't see the unposted message but some failures that are probably due to
>> the driver regarding the card as having already been posted, skipping some
>> init logic and reusing wrong values.
>> 
>> So current xen-pciback logic doesn't seem to be able to reset the card in a
>> "non-posted" state.
>> 
>> I don't know if you know what kind of reset is required for radeon cards to
>> regard themselves as "not posted" ?
>> 
>> Current xen-pciback logic uses the "__pci_reset_function_locked(dev);"
>> function (see pcistub.c pcistub_device_release()) to try to reset the
>> device. That function in turn tries some possible reset functions in a
>> specific order and bails out at the first one reporting "success". However
>> it could be that this level of reset is not enough for this specific case
>> (in my case it always uses and succeeds at the first one,
>> pci_dev_specific_reset(dev, probe);).
>> 
>> When looking at the code for vfio/kvm in "drivers/vfio/pci/vfio_pci.c
>> vfio_pci_disable()", it seems to use:
>> - A different order for disabling, resetting and config save/restore
>> - Always trying a slot/bus reset on the way out ..
>> 
>> I'm trying to experiment with that in 2 ways:
>> 1) overruling the logic in the domU radeon driver to do the reposting no
>>    matter what state it thinks it is in.
>> 2) Trying to change the xen-pciback reset logic, but at present that delivers
>>    bad irq's to dom0 for completely different irq's than the passed-through
>>    device has (which i have seen before .. and is a bit worrying)
>> 
>> > I have been trying to reproduce the problem but have not had any luck.  As
>> > a secondary it restarts for me every time.  I don't know if I inadvertently
>> > made a change that indirectly fixed it in my code base or what the
>> > difference might be.
>> 
>> Perhaps in the older qemu-traditional there is already some sort of reset 
>> done ?
>> 
>> > What Xen version are you testing with?
>> I'm almost always living on the edge with Xen-unstable ;) (the latest and
>> greatest, which is actually pretty stable)
>> 
>> --
>> Sander
>> 
>> > Thanks,
>> > Kelly
>> 
>> >> -----Original Message-----
>> >> From: Sander Eikelenboom [mailto:linux@xxxxxxxxxxxxxx]
>> >> Sent: Tuesday, September 23, 2014 9:45 AM
>> >> To: Zytaruk, Kelly
>> >> Cc: Peter Kay; xen-devel@xxxxxxxxxxxxx
>> >> Subject: Re: [Xen-devel] AMD GPU passthrough in Xen
>> >>
>> >> Good to know there is interest from AMD into this area :-)
>> >>
>> >> I'm experimenting for a while with:
>> >>
>> >> - xen-unstable (and thus xl)
>> >> - latest kernels (both dom0 and domU)
>> >> - qemu-xen
>> >> - Radeon HD 6570
>> >> - secondary passthrough
>> >> - Debian linux (sid) with the opensource (in kernel) radeon driver
>> >>   (i also tried fglrx with success, but it's a real PITA to build with
>> >>   every new kernel, so i ditched that)
>> >>
>> >> It used to work, but something is broken at the moment. That could
>> >> also be due to the changes to the systemd cruft that Debian jessie/sid is
>> >> currently undergoing (or something else, since i regularly update all
>> >> components).
>> >>
>> >> The problems are mostly with restarting the domU, it differs a bit:
>> >> - sometimes screen goes ok, sometimes it's garbage.
>> >> - the radeon powercontrols only seem to work on the first boot and
>> >> give errors on any subsequent one.
>> >>
>> >> But when it works it does:
>> >> - the powercontrol.
>> >> - opengl and opencl benchmarks with (near) native results.
>> >> - hardware video acceleration in xbmc for instance.
>> >>
>> >> So one of the main problems at present seems to be proper resetting
>> >> of the whole device on domain shutdown/start. I did do some
>> >> experiments with the opensource radeon driver, but didn't get conclusive
>> >> results out of that yet.
>> >>
>> >> --
>> >> Sander
>> >>
>> >> Tuesday, September 23, 2014, 3:19:41 PM, you wrote:
>> >>
>> >> > Hi Peter / Sander,
>> >>
>> >> > Yes, I have AMD GPU passthru working as both primary and secondary
>> >> passthru.  Secondary was easy but primary is a bit tricky.
>> >>
>> >> > Getting on to your questions;
>> >>
>> >> >> Is there any specific reason you're using Xen 4.2 rather than 4.4.1?
>> >>
>> >> > I am working on a project that is based on Xen 4.2 (I can't say any
>> >> > more than that).  I have looked at some of the more recent versions just
>> >> > to check if some of the bugs that I have seen have been fixed but I have
>> >> > not studied the newer versions in detail.  At some point in the future I
>> >> > would like to make the jump to a more recent version but I don't know
>> >> > the scheduling of that.
>> >>
>> >> >> In 4.2, using xl or xm?
>> >>
>> >> > xl
>> >>
>> >> >> qemu-traditional (with rombios) or "upstream"
>> >>
>> >> > qemu-traditional
>> >>
>> >> >> Primary or secondary passthrough?
>> >>
>> >> > Both but I am focusing on secondary right now.
>> >>
>> >> >> Presumably 64 bit versions of Windows?
>> >>
>> >> > 32 bit and 64 bit Win7.  I have tested Win8.1 and it works but my
>> >> > focus is currently Win7
>> >>
>> >> >> I am quite willing to test various scenarios. I've a 6950, 6450 and 
>> >> >> 5450.
>> >>
>> >> > Awesome.  My goal right now is obtaining stability on Xen 4.2.
>> >> > Since 4.2 is past its feature cutoff I won't be able to submit any open
>> >> > source changes for it.  I would like to eventually work with the
>> >> > community to get passthru working with a recent version of "upstream".
>> >>
>> >> > Thanks,
>> >> > Kelly
>> >>
>> >>
>> >>
>> >> >> -----Original Message-----
>> >> >> From: Sander Eikelenboom [mailto:linux@xxxxxxxxxxxxxx]
>> >> >> Sent: Monday, September 22, 2014 8:38 AM
>> >> >> To: Peter Kay
>> >> >> Cc: xen-devel@xxxxxxxxxxxxx; Zytaruk, Kelly
>> >> >> Subject: Re: [Xen-devel] AMD GPU passthrough in Xen
>> >> >>
>> >> >>
>> >> >> Monday, September 22, 2014, 2:16:58 PM, you wrote:
>> >> >>
>> >> >> > Hi Kelly, list
>> >> >>
>> >> >> > I see you're having AMD GPU success with Xen 4.2 and Linux 3.4.9.
>> >> >> > I've been less than successful getting passthrough working at all in
>> >> >> > Xen (although it's fine in KVM primary passthrough as long as the BIOS
>> >> >> > is supplied as a file). Could I confirm the following :
>> >> >>
>> >> >> > Is there any specific reason you're using Xen 4.2 rather than
>> >> >> > 4.4.1? I know in
>> >> >> some ways 4.4 suffers as it's now xl only and some of the xm
>> >> >> functionality has not come across.
>> >> >>
>> >> >> > In 4.2, using xl or xm?
>> >> >>
>> >> >> Another interesting question/aspect would be qemu-traditional
>> >> >> (with
>> >> >> rombios) or "upstream" (with seabios) ?
>> >> >>
>> >> >> > Primary or secondary passthrough?
>> >> >>
>> >> >> > Presumably 64 bit versions of Windows?
>> >> >>
>> >> >> > My system is a bit old (Core2Quad) but as mentioned AMD
>> >> >> > passthrough works
>> >> >> in KVM but I've found it tricky in Xen.
>> >> >>
>> >> >> > I am quite willing to test various scenarios. I've a 6950, 6450 and 
>> >> >> > 5450.
>> >> >>
>> >> >> > Thanks
>> >> >>
>> >> >> > Peter
>> >> >>
>> >> >>
>> >>
>> >>
>> 
>> 

Attachment: dmesg-firstboot.txt
Description: Text document

Attachment: dmesg-secondboot.txt
Description: Text document


 

