
Re: [Xen-devel] additional domain.c memory allocation causes "xm create" to fail



On 04/09/12 19:22, misiu godfrey wrote:
> Hello Xen Developers,
>
> I am currently attempting an experiment to modify Xen in such a way
> that it will flush the L2 cache every time it performs a context switch.

For what purpose?  There was once a bug which caused this to happen, and
it slowed Xen to a glacial pace.  We gave up debugging HVM domains when,
after several hours, the BIOS had still not loaded the bootloader.

>  Thus far I have implemented my code inside the __context_switch()
> function, of domain.c (xen/arch/x86/domain.c), and while my
> modifications work fine for memory allocations of up to 512KB, it
> fails whenever I increase this to 1MB.
>
> Specifically, when the memory buffer is increased to 1MB, the machine
> will force a hard reset whenever xm tries to create a new domain.
>  When I try to create a new domain (a sparse squeeze install, Dom0 is
> running Ubuntu 12.04) it gets as far as completing the
> scripts/init-bottom call before it crashes, which makes me think it is
> going down during the following init call.
>
> I have narrowed down the section of code that is failing.  The curious
> thing is that the failure threshold seems to be dependent on the
> number of iterations in the loop, rather than the specific amount of
> memory (i.e. 1MB of memory will work when 'i' is incremented by 128
> rather than 64, whereas 512KB of memory will work when 'i' is 64):
>
>   cache_mem_size = 1048576;  // Size of L2 cache
>   cache_mem = xmalloc_array(char,cache_mem_size);

You don't check the return value, so what happens when the allocation
fails?  I would say that calling *alloc() is not a sensible thing to be
doing in __context_switch() anyway, as you would be performing a long
operation inside a critical section of Xen code.
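
To illustrate both points, here is a minimal user-space sketch, not Xen
code.  It assumes calloc()/free() as stand-ins for xmalloc_array()/xfree(),
and the names cache_buf_init(), cache_buf_touch() and cache_buf_free() are
hypothetical.  The buffer is allocated once, the result is checked, and the
hot path only walks the buffer.

```c
#include <stddef.h>
#include <stdlib.h>

#define CACHE_LINE_SIZE 64u   /* assumed line size, as in the quoted loop */

static char  *cache_mem;      /* allocated once, reused on every switch */
static size_t cache_mem_size;

/* Call once at start-up, outside any critical section.
 * Returns 0 on success, -1 if the allocation failed. */
int cache_buf_init(size_t size)
{
    cache_mem = calloc(size, 1);   /* stand-in for xmalloc_array(char, size) */
    if (cache_mem == NULL)
        return -1;                 /* the caller must not use the buffer */
    cache_mem_size = size;
    return 0;
}

/* Hot path: no allocation, one write per assumed cache line.
 * Returns the number of lines touched (0 if there is no buffer). */
size_t cache_buf_touch(void)
{
    volatile char *p = cache_mem;
    size_t i, touched = 0;

    if (p == NULL)
        return 0;
    for (i = 0; i < cache_mem_size; i += CACHE_LINE_SIZE) {
        p[i] += 1;
        touched++;
    }
    return touched;
}

/* Release the buffer; later calls to cache_buf_touch() become no-ops. */
void cache_buf_free(void)
{
    free(cache_mem);
    cache_mem = NULL;
    cache_mem_size = 0;
}
```

With a 64-byte stride, a 1MB buffer costs 16384 writes per call and a 512KB
buffer costs 8192, which is the same count as 1MB with a 128-byte stride.
That matches the observation that the failure follows the iteration count
rather than the buffer size.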

>
>   for (i=0;i<cache_mem_size;i+=64)
>     cache_mem[i] += 1;
>
>   xfree(cache_mem);

Furthermore, this algorithm has no guarantee to clear the L2 cache.  In
fact it almost certainly will not.
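
Touching a cache-sized buffer only evicts other data if every cache set
happens to receive enough new lines and the replacement policy cooperates,
neither of which software controls.  For contrast, x86 does offer explicit
flushes.  The sketch below is user-space code, not Xen code, and
flush_range() is a hypothetical helper.  It uses the SSE2 intrinsic
_mm_clflush() to write back and invalidate the lines of one known buffer.
Note that this flushes only the named address range, not the whole L2;
flushing every cache line is what the privileged WBINVD instruction does,
at considerable cost.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>        /* _mm_clflush, _mm_mfence */
#endif

/* Flush every cache line overlapping [buf, buf + len).
 * line_size is an assumption supplied by the caller and must be a
 * power of two (64 on most current x86 parts).
 * Returns the number of lines covered.  On builds without SSE2 the
 * lines are only counted, not flushed. */
size_t flush_range(const void *buf, size_t len, size_t line_size)
{
    uintptr_t addr, end;
    size_t lines = 0;

    if (len == 0 || line_size == 0)
        return 0;

    /* Round the start down to a line boundary. */
    addr = (uintptr_t)buf & ~((uintptr_t)line_size - 1);
    end  = (uintptr_t)buf + len;

    for (; addr < end; addr += line_size) {
#if defined(__SSE2__)
        _mm_clflush((const void *)addr);   /* write back and invalidate */
#endif
        lines++;
    }
#if defined(__SSE2__)
    _mm_mfence();             /* order the flushes before later accesses */
#endif
    return lines;
}
```

Flushed lines are written back to memory before being invalidated, so the
buffer contents are unchanged afterwards; only the next access is slower.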

>
> If anyone has a suggestion as to what may be causing this failure, or
> insight into the runtime limitations of this section of the
> architecture, kernel, or scheduler, I would appreciate the information.
>
> Thanks,
>
> -Misiu 

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com




 

