
[Xen-devel] Re: BUG: unable to handle kernel NULL pointer dereference at IP: [<ffffffff8105ae4c>] process_one_work+

On 06/14/2011 09:55 AM, Konrad Rzeszutek Wilk wrote:
But the curious thing is that you have two CPUs assigned to Dom0 and
while CPU0 looks to be bouncing back and forth, CPU1 is doing
something. The RIP is 0xffffffff8108820c. Can you try to run this
through System.map? Or the whole bunch of these:

ffffffff8108820c ffffffff81088100 ffffffff810881a7 ffffffff8108811a
ffffffff816101a8 ffffffff81006c32 ffffffff816114a4 ffffffff8108803a
ffffffff8105f5bd ffffffff81618564 ffffffff81617973 ffffffff816117a1
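
For anyone following along, running an address "through System.map" means
finding the nearest symbol at or below that address and reporting the
offset into it. A minimal sketch of that lookup is below. It assumes the
usual three-column System.map layout (hex address, type, name); the helper
names load_system_map and resolve are made up for illustration and are not
part of any kernel tool.

```python
import bisect


def load_system_map(lines):
    """Parse '<hex addr> <type> <name>' lines into a sorted list of (addr, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        try:
            addr = int(parts[0], 16)
        except ValueError:
            continue  # first column was not a hex address
        symbols.append((addr, parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return 'name+0xoffset' for the nearest symbol at or below addr, else None."""
    addrs = [a for a, _ in symbols]
    # Index of the last symbol whose address is <= addr.
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # addr lies below every known symbol
    base, name = symbols[i]
    return f"{name}+{addr - base:#x}"
```

With the System.map that matches the running kernel, something like
resolve(load_system_map(open("System.map")), 0xffffffff8108820c) would
name the function containing that RIP. The result only means anything if
the System.map comes from exactly the kernel build that crashed.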

     I grabbed code snippets for each of these locations and put them here:


The other idea is to limit Dom0 to only run on one CPU. You can do
this by having 'dom0_max_vcpus=1 dom0_vcpus_pin' and see if it fails
somewhere else? It probably will die in the 0xffffffff810013aa :-(
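
Note that both of these are Xen hypervisor options, so they go on the
hypervisor (xen.gz) line of the boot entry, not on the Dom0 kernel line.
A minimal sketch, assuming a GRUB 2 setup that generates its Xen entries
through /etc/grub.d/20_linux_xen (the thread does not say which boot
loader is in use, so treat the file and variable as assumptions):

```text
# /etc/default/grub (assumed location; regenerate grub.cfg after editing)
GRUB_CMDLINE_XEN_DEFAULT="dom0_max_vcpus=1 dom0_vcpus_pin"
```

On a hand-edited boot entry the same two options would simply be appended
to the existing xen.gz line.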

     After setting dom0_max_vcpus=1 and dom0_vcpus_pin, the boot got to
"Trying to unpack rootfs image as initramfs..." and hung there.  The
serial console as well as the CTRL_A(x3) * outputs are here:


But regardless of what I mentioned above, we need to find out why
process_one_work got a toxic parameter. Can you disassemble
0xffffffff8105ae4c and see what it does and how it corresponds to
'process_one_work' in kernel/workqueue.c?

     I put the disassembly of it in the hailstorm-debugnotes.txt file
that I mentioned above.  Let me know if you need more than that.

You can also instrument the code to find out what:

1804         work_func_t f = work->func;


     I think this request is starting to go a little beyond what I know
how to do.

Scott Garron



