
Re: [Xen-devel] Nouveau on dom0


  • To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
  • From: Arvind R <arvino55@xxxxxxxxx>
  • Date: Wed, 3 Mar 2010 03:04:19 +0530
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 02 Mar 2010 13:38:31 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

On Mon, Mar 1, 2010 at 9:31 PM, Konrad Rzeszutek Wilk
<konrad.wilk@xxxxxxxxxx> wrote:
> On Fri, Feb 26, 2010 at 09:04:33PM +0530, Arvind R wrote:
>> On Thu, Feb 25, 2010 at 11:14 PM, Konrad Rzeszutek Wilk
>> <konrad.wilk@xxxxxxxxxx> wrote:
>> > On Thu, Feb 25, 2010 at 09:01:48AM -0800, Arvind R wrote:
>> >> On Thu, Feb 25, 2010 at 6:25 PM, Konrad Rzeszutek Wilk
>> >> <konrad.wilk@xxxxxxxxxx> wrote:
>> >> > On Thu, Feb 25, 2010 at 02:16:07PM +0530, Arvind R wrote:
>> >> >> Hi all,
>> >> >> I merged the drm-tree from 2.6.33-rc8 into jeremy's 2.6.31.6 master and
>> >> ======= snip =======
>> >> > is not. Would it be possible to trace down who allocates that *chan? You
>> >> > say it is 'PRAMIN' - is that allocated via pci_alloc_* call?
>> ======= snip =======
>> >> So, there must be a mmap call somewhere to map the area to user-space
>> >> for that problem write to work on non-Xen boots. Will try to track down
>> >> some more and post. With mmaps and PCIGARTs - it will be some hunt!
>>  ======= snip =======
>> > to the drm_radeon driver which used it as a ring buffer. Took a bit of
>> > hopping around to find who allocated it in the first place.
>> >
>> After a lot of reboots and log viewing:
>> The pushbuf (FIFO/RING) is the only means of programming the card's DMA
>> activity. It is exposed to user-space by mmap of the drm_device (PCI) handle
>> with different offsets for each channel. Parameters are associated with the
>> DMA command using ioctls to bind channels/sub-channels/contexts. This mmap is
>> in the libdrm2 library. Libdrm does the channel/accelerator initialization
>> and setup chores, and the DDX driver (xf86-video-nouveau) more or less acts
>> through libdrm.
>
> Ok, that is the DRM_NOUVEAU_CHANNEL_ALLOC ioctl, which ends up calling
> the 'ttm_bo_init'. I remember Pasi having an issue with this on Radeon
> and I provided a hack to see if it would work. Take a look at this
> e-mail:
>
> http://lists.xensource.com/archives/cgi-bin/extract-mesg.cgi?a=xen-devel&m=2010-01&i=20100115071856.GD17978%40reaktio.net
>
>>
>> My suspicion is that Xen has some problems with mmap of PCI(E) device
>> memory. How is iomem handled in a mmap?
>
> It looks to be using 'ioremap' which is Xen safe. Unless your card has
> an AGP bridge on it, at which point it would end up using
> dma_alloc_coherent in all likelihood.
>
>>
>> As of now, the accelerator on Xen stops right at the initialisation stage -
>> when libdrm tries to set up the accelerator engine in the course of
>> ScreenInit. It fails there because it cannot write the command that sets up
>> the basic 2D engine.
>
> I think that the ttm_bo calls set up pages in the 4KB size, but the
> initial channel requests a 64KB one. I think it also sets up

Got that far. I tried some dirty patches of my own, which broke the framebuffer.
Your ttm patch using dma_alloc_coherent instead of alloc_page resulted in
the same problem as in the Radeon report - leaking pages, erroneous page count.

> page-table directory so that when the GPU accesses the addresses, it
> gets the real bus address. I wonder if it fails at that, though -
> meaning that the addresses that are written to the page table are
> actually the guest page numbers (gpfn) instead of the machine page numbers
> (mfn).

No, I don't think that's how it works. The user-space write triggers an
aio-write (I got that in a trace that my patch caused), which page-faults and
leads to ttm_bo_vm_fault. I tried to alloc_pages in ttm_bo_vm_fault, but I
think I got the remap_pfn_range address parameter wrong. This patch made a
bare-metal boot crash the same way as Xen does with or without the patch! So it
is clearly the mmap of pushbuf that's the block. ttm_bo_vm_fault is the pivot
for the pushbuf_bo allocation.

My patch in ttm_bo_vm_fault:
if (io_mem) {
    /* retain the orig. speculative pre-fault code */
    ...
}
else {
    /* ttm_bo_get_pages is modified __ttm_tt_get_page using alloc_pages
        Irrespective of where fault occurs, fault-in the whole buffer */
    pages = ttm_bo_get_pages(ttm, get_order(bo->num_pages));
    pfn = page_to_pfn(page);
    remap_pfn_range(vma, bo->buffer_start, pfn, bo->num_pages << PAGE_SHIFT,
              vma->vm_page_prot); /* Triggers Kernel BUG invalid opcode */
}
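
One thing that may be worth checking in the fragment above, offered as a
sketch and not as a diagnosis of the BUG: in the kernel, get_order() takes a
size in bytes, not a page count. The userspace stand-in below mimics that
arithmetic for the 64KB channel buffer mentioned earlier in the thread. The
names sketch_get_order, sketch_pages_in_order, sketch_report and
SKETCH_PAGE_SHIFT are hypothetical, and 4 KiB pages are an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: 4 KiB pages, as on x86. */
#define SKETCH_PAGE_SHIFT 12UL
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical userspace stand-in for the kernel's get_order():
 * smallest order such that (SKETCH_PAGE_SIZE << order) >= size_in_bytes.
 */
static unsigned int sketch_get_order(unsigned long size_in_bytes)
{
    unsigned int order = 0;
    unsigned long span = SKETCH_PAGE_SIZE;

    assert(size_in_bytes > 0);
    while (span < size_in_bytes) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Number of pages an allocation of the given order provides. */
static unsigned long sketch_pages_in_order(unsigned int order)
{
    return 1UL << order;
}

/* Print the order obtained from a page count versus from a byte size. */
static void sketch_report(unsigned long num_pages)
{
    unsigned int by_count = sketch_get_order(num_pages);
    unsigned int by_bytes = sketch_get_order(num_pages << SKETCH_PAGE_SHIFT);

    printf("num_pages=%lu: order from page count=%u (%lu page(s)), "
           "order from byte size=%u (%lu page(s))\n",
           num_pages,
           by_count, sketch_pages_in_order(by_count),
           by_bytes, sketch_pages_in_order(by_bytes));
}
```

Calling sketch_report(16) for a 64KB buffer shows the gap: passing the page
count yields an order covering a single page, while the later remap spans
num_pages << PAGE_SHIFT bytes. Whether that mismatch plays any part in the
crash is unverified.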

BTW, ttm_bo_vm_fault is the ONLY user of vm_insert_mixed in the kernel tree!

I tried to use split_page() - it resulted in an undefined symbol!

> The other issue might be that your back-port broke the AGP allocation.
>
Nope - untouched and same.



 

