
[Xen-devel] Re: ATI radeon fails with "iommu=soft swiotlb=force" (seen on RV730/RV740 and RS780/RS800)

On Wed, Sep 30, 2009 at 11:21:49AM -0400, Konrad Rzeszutek Wilk wrote:
> With those two options defined in the boot line, the radeon kernel driver 
> spits out:
> [  244.190885] [drm] Loading RV730/RV740 PFP Microcode
> [  244.190911] [drm] Loading RV730/RV740 CP Microcode
> [  244.205974] [drm] Resetting GPU
> [  244.310103] [drm] writeback test failed
> [  251.220092] [drm] Resetting GPU
> And the video is useless (red little characters at the top, nothing else).
> If I don't define those two arguments the video driver works fine.

.. snip ..
> Right now I am instrumenting the code and trying to narrow
> down the issue. But I was wondering if somebody has seen this in the past
> and can say "Oh yeah, I saw that back in 1980, try this."

I found the fault, which I am going to try to describe. 

The radeon_dri driver (user-land) calls 'drm_sg_alloc' via ioctl.
'drm_sg_alloc' allocates a 32MB area using 'vmalloc_32'. On non-Xen
machines the physical addresses are indeed under the 4GB mark. On Xen,
though, the physical addresses are somewhere in the stratosphere. The
radeon_dri driver saves this vmalloc-ed virtual address in its structures.

The next step is a bit more complex. radeon_dri, via ioctl, calls
radeon_cp_init, whose job is to initialize the command processing engine.
The argument passed in is the virtual address obtained via 'drm_sg_alloc'.
At that point the 32MB area is partitioned (the first 1MB is for the
command processing ring, the next 64KB for a ring pointer, and so on) and
both the user-land driver and the kernel save these addresses in their
structures. Keep in mind that these are virtual addresses pointing into
the vmalloc-ed area, not the virtual addresses obtained from
page_address(vmalloc_to_page(i)).

Then radeon_cp_init sets up a GPU page-table using the physical
addresses. This is where it starts to fall apart: during this setup,
r600_page_table_init calls pci_map_page to ensure that each physical
address is below 4GB. On Xen (and on non-Xen with swiotlb=force
iommu=soft), the physical addresses obtained end up being taken
from the IOMMU (bounce buffers). This is important.

After that, the writeback test is done on the MMIO region. The GPU
happily looks at the page-table, finds the physical address (which
is _not_ pointing to the vmalloc area, but to the IOMMU bounce buffer)
and writes to that memory region. Meanwhile the writeback test reads
data via the saved ring_rptr, which reads memory from the vmalloc area -
for which the GPU has no page-table entry.

1). The current workaround is to boot Xen with mem=3GB, which ensures
the vmalloc area's physical addresses are under 4GB.

2). I experimented a bit with making drm_sg_alloc use dma_alloc_coherent,
but 32MB is way too big for it:

       if (order >= MAX_ORDER) {
                WARN_ON_ONCE(!(gfp_mask & __GFP_NOWARN));

(MAX_ORDER is defined to be 11, and a 32MB allocation is order 13 with
4KB pages.)

3). Restructure r600_cp_init to refresh its structures when the page
   table ends up using areas other than the original physical addresses.
   That would imply keeping a copy of the "old", now unused, vmalloc
   virtual addresses (the user-land driver uses the virtual address as a
   handle) and updating the ring pointer, command pointer, etc. to use
   the virtual addresses obtained from
   phys_to_virt(page_to_phys(entry->pagelist[X])).
   But the code assumes (and rightly so) that the area obtained from
   drm_sg_alloc is 32-bit addressable, so..

4). Ensure that drm_sg_alloc gets an area whose physical addresses are
   under the 4GB mark. I was thinking of utilizing
   xen_create_contiguous_region, but the PFNs returned by vmalloc_32 are
   actually MFNs, so xen_create_contiguous_region is out of the picture.

5). Go a step further back. Ensure that vmalloc_32 always returns
    addresses under the 4GB mark? Since virt_to_pfn for these addresses
    gives the real physical addresses, perhaps make this function enforce
    that the MFNs are under the 4GB mark?

6). Other suggestions?

Xen-devel mailing list
