On 03/16/2010 06:19 PM, Konrad Rzeszutek Wilk wrote:
>> The serial output is attached.
>>
>> The patch I used to instrument the fb_mmap function and the output it
>> produced for a couple of runs are also attached.
>>
>> And I tossed in my kernel .config for good measure.
>>
>> What else is needed?
>>
> It looks like I confused your email with another person's. You don't seem
> to run the nvidia fb, but rather the radeon one.
>
The current machine I am using has Intel integrated graphics but I can
also reproduce the problem on a laptop with nvidia graphics (it runs the
vesafb framebuffer). After I send this mail I'll recompile on that
machine and see what happens.

I recompiled Xen and pvops/next today. I included your instrumentation
patch below for i915_gem_fault, but it doesn't trigger. No
instrumentation messages appear. I even put a print statement at the
top of the function but it never prints.

I have attached the serial console output and dmesg output. The
initcall and drm debug stuff is present.

Also, I get something new when I run the test program. It prints out:
# ./silly
Mapped /dev/fb0 at 0x7f3237175000
Killed
Message from syslogd@moss-flapper at Mar 25 19:25:52 ...
kernel:Bad pagetable: 000f [#1] SMP
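
silly.c is attached rather than inlined. For readers without the
attachment, the following is only a hypothetical sketch of a test of
this kind, not the attached program: it maps one page of a file and
copies a stack buffer filled with 0xcafeababdeadbeef into the mapping.
The function name fill_mapping, the one-page length and the use of
memcpy are assumptions (rcx=0x200 and rdx=0x1000 in the register dump
are consistent with a 4 KiB copy, but that is an inference).

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAP_LEN 4096u                  /* assumed: one 4 KiB page */
#define PATTERN 0xcafeababdeadbeefULL  /* pattern seen in the stack dump */

/*
 * Hypothetical helper: map MAP_LEN bytes of `path` shared and writable,
 * then copy a pattern-filled stack buffer into the mapping.
 * Returns 0 on success, -1 on failure.
 */
int fill_mapping(const char *path)
{
    uint64_t buf[MAP_LEN / sizeof(uint64_t)];
    size_t i;
    int fd;
    void *map;

    for (i = 0; i < MAP_LEN / sizeof(uint64_t); i++)
        buf[i] = PATTERN;

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    map = mmap(NULL, MAP_LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }
    printf("Mapped %s at %p\n", path, map);

    /* The first store into the mapping is where a bad PTE would fault. */
    memcpy(map, buf, MAP_LEN);

    munmap(map, MAP_LEN);
    close(fd);
    return 0;
}
```

Calling fill_mapping("/dev/fb0") on the affected machine would be the
rough equivalent of running ./silly; on an ordinary file it simply
writes the pattern into the first page.
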
And I get the following on the serial console (the deadbeef stuff is the
buffer I just wrote into the mmap):
moss-flapper login: (XEN) d0:v1: reserved bit in page table (ec=000F)
(XEN) Pagetable walk from 00007f3237175000:
(XEN) L4[0x0fe] = 000000001154a067 00000000001deaec
(XEN) L3[0x0c8] = 000000001492f067 00000000001db6d1
(XEN) L2[0x1b8] = 0000000015bc7067 00000000001da569
(XEN) L1[0x175] = fffff7fffffff22f ffffffffffffffff
(XEN) ----[ Xen-4.0.0-rc8-pre x86_64 debug=n Not tainted ]----
(XEN) CPU: 1
(XEN) RIP: e033:[<0000003002e8305b>]
(XEN) RFLAGS: 0000000000010206 EM: 0 CONTEXT: pv guest
(XEN) rax: 00007f3237175000 rbx: 0000000000000000 rcx: 0000000000000200
(XEN) rdx: 0000000000001000 rsi: 00007fff42cc42e0 rdi: 00007f3237175000
(XEN) rbp: 00007fff42cc52f0 rsp: 00007fff42cc42c8 r8: 0000000000000001
(XEN) r9: 0000000000000001 r10: 00000000ffffffff r11: 0000000000001000
(XEN) r12: 00000000004005d0 r13: 00007fff42cc53d0 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000000026f0
(XEN) cr3: 00000000116da000 cr2: 00007f3237175000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
(XEN) Guest stack trace from rsp=00007fff42cc42c8:
(XEN) 00000000004007e0 cafeababdeadbeef 0000000000000000 cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
(XEN) cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef cafeababdeadbeef
silly: Corrupted page table at address 7f3237175000
PGD 1deaec067 PUD 1db6d1067 PMD 1da569067 PTE fffffffffffff22f
Bad pagetable: 000f [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
CPU 1
Modules linked in: nfs fscache bridge stp llc ipt_MASQUERADE iptable_nat nf_nat
nfsd lockd nfs_acl auth_rpcgss export]
Pid: 1775, comm: silly Not tainted 2.6.32-pvops-dom0 #23 OptiPlex 960
RIP: e033:[<0000003002e8305b>] [<0000003002e8305b>] 0x3002e8305b
RSP: e02b:00007fff42cc42c8 EFLAGS: 00010206
RAX: 00007f3237175000 RBX: 0000000000000000 RCX: 0000000000000200
RDX: 0000000000001000 RSI: 00007fff42cc42e0 RDI: 00007f3237175000
RBP: 00007fff42cc52f0 R08: 0000000000000001 R09: 0000000000000001
R10: 00000000ffffffff R11: 0000000000001000 R12: 00000000004005d0
R13: 00007fff42cc53d0 R14: 0000000000000000 R15: 0000000000000000
FS: 00007f3237162700(0000) GS:ffff880028054000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f3237175000 CR3: 00000001df03a000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process silly (pid: 1775, threadinfo ffff8801df02a000, task ffff8801db68ae60)
RIP [<0000003002e8305b>] 0x3002e8305b
RSP <00007fff42cc42c8>
---[ end trace e07c6ddec4199123 ]---
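
To make the dump above easier to read, here is a small userspace sketch
(not part of any attached file) that decodes three pieces of it: the
page-fault error code, the table indices in the Xen walk, and the
reserved bits of the L1 entry. The function names and the maxphyaddr
parameter are made up for illustration. The physical address width of
this CPU is not stated in this mail, so any value passed for it is an
assumption.

```c
#include <stdint.h>
#include <stdio.h>

/* x86 page-fault error code bits. */
#define PF_PROT  0x01u  /* fault on a present page */
#define PF_WRITE 0x02u  /* write access */
#define PF_USER  0x04u  /* user-mode access */
#define PF_RSVD  0x08u  /* reserved bit set in a paging entry */
#define PF_INSTR 0x10u  /* instruction fetch */

/* Table index for a 4-level walk: level 1 = L1 (PTE) ... level 4 = L4. */
unsigned pt_index(uint64_t va, int level)
{
    return (unsigned)((va >> (12 + 9 * (level - 1))) & 0x1ff);
}

/*
 * Bits of a paging entry from maxphyaddr up to bit 51 must be zero.
 * Returns the offending bits; maxphyaddr is assumed to be in 32..52.
 */
uint64_t pte_reserved_bits(uint64_t entry, unsigned maxphyaddr)
{
    uint64_t below_52 = (1ULL << 52) - 1;
    uint64_t below_max = (1ULL << maxphyaddr) - 1;

    return entry & below_52 & ~below_max;
}

/* Print a short, human-readable summary of one fault. */
void describe_fault(unsigned ec, uint64_t va, uint64_t l1e,
                    unsigned maxphyaddr)
{
    int level;

    printf("ec=%04x:%s%s%s%s%s\n", ec,
           (ec & PF_PROT)  ? " present" : " not-present",
           (ec & PF_WRITE) ? " write" : " read",
           (ec & PF_USER)  ? " user" : " kernel",
           (ec & PF_RSVD)  ? " reserved-bit" : "",
           (ec & PF_INSTR) ? " ifetch" : "");

    for (level = 4; level >= 1; level--)
        printf("L%d[0x%03x]\n", level, pt_index(va, level));

    printf("reserved bits set in L1 entry: %016llx\n",
           (unsigned long long)pte_reserved_bits(l1e, maxphyaddr));
}
```

With va = 0x7f3237175000 the indices come out as 0x0fe, 0x0c8, 0x1b8
and 0x175, matching the Xen walk. The L1 entry fffff7fffffff22f has
bit 51 set, so the reserved mask is non-zero for any address width up
to 51 bits, which agrees with the "reserved bit in page table" message
and the RSVD bit in ec=000F.
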
> .. snip ..
>
>> Non-volatile memory driver v1.3
>> Linux agpgart interface v0.103
>> agpgart-intel 0000:00:00.0: Intel Q45/Q43 Chipset
>> agpgart-intel 0000:00:00.0: detected 32764K stolen memory
>> agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0xd0000000
>> tpm_tis 00:08: 1.2 TPM (device-id 0x4A10, rev-id 78)
>> [drm] Initialized drm 1.1.0 20060810
>> [drm] radeon defaulting to kernel modesetting.
>> [drm] radeon kernel modesetting enabled.
>> xen_allocate_pirq: returning irq 16 for gsi 16
>> Already setup the GSI :16
>> i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
>> [drm] set up 31M of stolen space
>> [drm] TMDS-8: set mode 1280x1024 17
>> Console: switching to colour frame buffer device 160x64
>> fb0: inteldrmfb frame buffer device
>> registered panic notifier
>> [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
>>
> You appear to have an i915 framebuffer on your box.
>
> I *think* that the i915 is not using KMS and the TTM stuff, so the
> patch that Arvind posted would probably not help you.
> http://www.mail-archive.com/dri-devel@xxxxxxxxxxxxxxxxxxxxx/msg48668.html
>
> So, let's boot your kernel with these command line parameters to get more
> data: debug initcall_debug drm.debug=255
>
> That should spew out some more details.
>
> Next thing I would suggest is to instrument i915_gem_fault. Attached is
> a patch that does it (though it is not compile tested nor actually
> booted so it might need some hand crafting - sorry).
>
> And the other thing is to read through the steps that Arvind took in the
> e-mail thread titled: "Nouveau on dom0". It covers the gamut of things
> to troubleshoot this.
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index fba37e9..cfcaafd 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -33,6 +33,8 @@
> #include "intel_drv.h"
> #include <linux/swap.h>
> #include <linux/pci.h>
> +#include <xen/xen.h>
> +#include <asm/xen/page.h>
>
> #define I915_GEM_GPU_DOMAINS (~(I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT))
>
> @@ -1145,6 +1147,143 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
> return 0;
> }
>
> +void print_pte(struct vm_area_struct *vma, char *what, struct page *page, unsigned int pfn, unsigned long address)
> +{
> + static const char * const level_name[] =
> + { "NONE", "4K", "2M", "1G", "NUM" };
> + unsigned long addr = 0;
> + pte_t *pte = NULL;
> + pteval_t val = (pteval_t)0;
> + unsigned int level = 0;
> + unsigned offset;
> + unsigned long phys;
> + pgprotval_t prot;
> + char buf[90];
> + char *str;
> +
> + str = buf;
> + // Figure out if the address is pagetable.
> + if (address == 0 && !page && pfn>0) {
> + page = pfn_to_page(pfn);
> + }
> + if (address == 0 && page)
> + addr = (u64)page_address(page);
> +
> + if (address && !page)
> + addr = address;
> +
> + if (address && page) {
> + addr = (u64)page_address(page);
> + if (address != addr) {
> + if (addr == 0) {
> + str += sprintf(str, "addr(page)==0");
> + addr = address;
> + }
> + }
> + }
> +
> + if (pfn != 0 && page) {
> + if (pfn != page_to_pfn(page)) // Gosh!?
> + str += sprintf(str, "pfn!=pfn(page)");
> + }
> + if (pfn != 0 && addr != 0) {
> + if (pfn != virt_to_pfn(addr))
> + str += sprintf(str,"pfn(addr)!=pfn");
> + }
> + pte = lookup_address(addr, &level);
> + if (!pte) {
> + str += sprintf(str,"!pte(addr)");
> + goto print;
> + }
> + offset = addr & ~PAGE_MASK;
> +
> + if (xen_domain()) {
> + phys = (pte_mfn(*pte) << PAGE_SHIFT) + offset;
> + val = pte_val_ma(*pte);
> +
> + if (pfn > 0) {
> + if (pte_mfn(*pte) == pfn) {
> + if (vma->vm_flags & VM_IO)
> + str += sprintf(str,"PHYS");
> + else
> + str += sprintf(str,"BUG: VM_IO not set!");
> + }
> + /* It is a pseudo page ... and the VM_IO flag is set */
> + if (pte_mfn(*pte) != pfn) {
> + if (vma->vm_flags & VM_IO)
> + str += sprintf(str,"BUG: VM_IO flag set!");
> + else
> + str += sprintf(str, "PSEUDO");
> + }
> + } else {
> + str += sprintf(str,"pfn==0");
> + }
> +
> + } else {
> + phys = (pte_pfn(*pte) << PAGE_SHIFT) + offset;
> + val = pte_val(*pte);
> + }
> + prot = pgprot_val(pte_pgprot(*pte));
> +
> + if (!prot)
> + str += sprintf(str, "Not present.");
> + else {
> + if (prot & _PAGE_USER)
> + str += sprintf(str, "USR ");
> + else
> + str += sprintf(str, " ");
> + if (prot & _PAGE_RW)
> + str += sprintf(str, "RW ");
> + else
> + str += sprintf(str, "ro ");
> + if (prot & _PAGE_PWT)
> + str += sprintf(str, "PWT ");
> + else
> + str += sprintf(str, " ");
> + if (prot & _PAGE_PCD)
> + str += sprintf(str, "PCD ");
> + else
> + str += sprintf(str, " ");
> +
> + /* Bit 7 has a different meaning on level 3 vs 4 */
> + if (level <= 3) {
> + if (prot & _PAGE_PSE)
> + str += sprintf(str, "PSE ");
> + else
> + str += sprintf(str, " ");
> + } else {
> + if (prot & _PAGE_PAT)
> + str += sprintf(str, "pat ");
> + else
> + str += sprintf(str, " ");
> + }
> + if (prot & _PAGE_GLOBAL)
> + str += sprintf(str, "GLB ");
> + else
> + str += sprintf(str, " ");
> + if (prot & _PAGE_NX)
> + str += sprintf(str, "NX ");
> + else
> + str += sprintf(str, "x ");
> +#ifdef _PAGE_IOMEM
> + if (prot & _PAGE_IOMEM)
> + str += sprintf(str, "IO ");
> + else
> + str += sprintf(str, " ");
> +#endif
> +
> + }
> +
> +print:
> + printk(KERN_INFO "[%16s]PFN: 0x%lx PTE: 0x%lx (val:%lx): [%s] [%s]\n",
> + what,
> + (unsigned long)pfn,
> + (pte) ? (unsigned long)(pte->pte) : 0,
> + (unsigned long)val,
> + buf,
> + level_name[level]);
> +}
> +
> /**
> * i915_gem_fault - fault a page into the GTT
> * vma: VMA in question
> @@ -1200,8 +1339,10 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> pfn = ((dev->agp->base + obj_priv->gtt_offset) >> PAGE_SHIFT) +
> page_offset;
>
> + print_pte(vma,"before", NULL, pfn, 0);
> /* Finally, remap it using the new GTT offset */
> ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address, pfn);
> + print_pte(vma, "after", NULL, pfn, (unsigned long)vmf->virtual_address);
> unlock:
> mutex_unlock(&dev->struct_mutex);
>
>
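
As a side note on the instrumentation above: print_pte() builds its
message with sprintf() into a fixed 90-byte buffer. The following is a
userspace sketch of the same kind of flag decoding with every append
bounds-checked. It is not a replacement for the patch; the table
layout and the name format_pte_flags are assumptions for illustration.
In the kernel, scnprintf() would be the natural counterpart to the
snprintf() used here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One x86 PTE flag: its bit, and the text to emit when set or clear. */
struct flag_name {
    uint64_t mask;
    const char *set;    /* printed when the bit is set */
    const char *clear;  /* printed when clear, or NULL for nothing */
};

static const struct flag_name flag_names[] = {
    { 1ULL << 2,  "USR", NULL },  /* _PAGE_USER   */
    { 1ULL << 1,  "RW",  "ro" },  /* _PAGE_RW     */
    { 1ULL << 3,  "PWT", NULL },  /* _PAGE_PWT    */
    { 1ULL << 4,  "PCD", NULL },  /* _PAGE_PCD    */
    { 1ULL << 8,  "GLB", NULL },  /* _PAGE_GLOBAL */
    { 1ULL << 63, "NX",  "x"  },  /* _PAGE_NX     */
};

/*
 * Write a space-separated flag summary into buf (len bytes).
 * Returns the string length, or -1 if the output did not fit.
 */
int format_pte_flags(uint64_t prot, char *buf, size_t len)
{
    size_t used = 0;
    size_t i;

    if (len == 0)
        return -1;
    buf[0] = '\0';

    for (i = 0; i < sizeof flag_names / sizeof flag_names[0]; i++) {
        const char *name = (prot & flag_names[i].mask)
                               ? flag_names[i].set
                               : flag_names[i].clear;
        int n;

        if (!name)
            continue;

        /* snprintf never writes past len - used bytes. */
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? " " : "", name);
        if (n < 0 || (size_t)n >= len - used)
            return -1;  /* truncated: report it instead of overflowing */
        used += (size_t)n;
    }
    return (int)used;
}
```

Fed the PTE value from the oops (fffffffffffff22f), this yields
"USR RW PWT NX"; with a buffer that is too small it returns -1 rather
than writing past the end.
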
--
Eamon Walsh
National Security Agency
Attachment: dmesg.txt (text document)
Attachment: serialoutput.txt (text document)
Attachment: silly.c (text document)