
Re: [Xen-devel] Question about partitioning shared cache in Xen

On 14/01/15 00:41, Meng Xu wrote:
> Hi,
> [Goal]
> I want to investigate the impact of the shared cache on the
> performance of workloads in guest domains.
> I also want to partition the shared cache via a page-coloring mechanism
> so that guest domains use disjoint cache colors and therefore do not
> interfere with each other in the shared cache.
> [Motivation: Why do I want to partition the shared cache?]
> Because the shared cache is shared among all guest domains (I assume
> the machine has multicores sharing the same LLC. For example, Intel(R)
> Xeon(R) CPU E5-1650 v2 has 6 physical cores sharing a 12MB L3 cache.),
> the workload in one domU can interfere with another domU's
> memory-intensive workload on the same machine via the shared cache.
> This shared-cache interference makes the execution time of the
> workload in a domU non-deterministic and much longer. (If we assume
> the worst case, the worst-case execution time of the workload will be
> too pessimistic.) A stable execution time is very important in
> real-time computing, where a real-time program, such as the control
> software in an automobile, has to produce its result within a deadline.
> I did some quick measurements to show how the shared cache can be used
> by one domain to interfere with the execution time of another domain's
> workload. I pin the VCPUs of two domains to different physical cores
> and use one domain to pollute the shared cache. The result shows that
> the shared-cache interference can make the execution time of another
> domain's workload slow down by 4x. The whole experiment result can be
> found at 
> https://github.com/PennPanda/cis601/blob/master/project/data/boxplot_cache_v2.pdf
> . (The workload in the figure is a program reading a large array. I
> run the program 100 times and draw the latency of accessing the array
> in a box plot. The first column, named "alone-d1v1", is the boxplot of
> the latency when the program in dom1 runs alone. The fourth column,
> "d1v1d2v1-pindiffcore", is the boxplot of the latency when the program
> in dom1 runs alongside another program in dom2, with the two domains
> pinned to different cores. dom1 and dom2 each have 1 VCPU with budget
> equal to period. The scheduler is the credit scheduler.)
> [Idea of how to partition the shared cache]
> When a PV guest domain is created, it will call xc_dom_boot_mem_init()
> to allocate memory for the domain, which finally calls
> xc_domain_populate_physmap_exact() to allocate memory pages from
> domheap in Xen.
> The idea of partitioning the shared cache is as follows:
> 1) xl tool change: Add an option in the domain's configuration file
> which specifies which cache colors this domain should use. (I have
> done this; when I use xl create --dry-run, I can see the parameters
> are passed to the build information.)
> 2) hypervisor change: Add another hypercall,
> xc_domain_populate_physmap_exact_ca(), which takes one more parameter,
> i.e., the cache colors this domain should use. I also need to reserve
> a memory pool which sorts the reserved memory pages by their cache
> colors.
> When a PV domain is created, I can specify the cache colors it uses.
> Then the xl tool will call the xc_domain_populate_physmap_exact_ca()
> to only allocate the memory pages with the specified cache colors to
> this domain.
> [Quick implementation]
> I attached my quick implementation patch at the end of this email.
> [Issues and Questions]
> After I applied the patch at Xen commit
> 36174af3fbeb1b662c0eadbfa193e77f68cc955b and ran it on my machine,
> dom0 cannot boot up. :-(
> The error message from dom0 is:
> [    0.000000] Kernel panic - not syncing: Failed to get contiguous
> memory for DMA from Xen!
> [    0.000000] You either: don't have the permissions, do not have
> enough free memory under 4GB, or the hypervisor memory is too
> fragmented! (rc:-12)
> I added print messages in every function I touched in order to
> figure out where it goes wrong, but failed. :-(
> The thing I cannot understand is this: my implementation hasn't
> reserved any memory pages in the cache-aware memory pool before the
> system boots up. Basically, none of the functions I modified is called
> before the system boots up, yet the system still crashes. :-( (The
> system boots up and works perfectly before applying my patch.)
> I really appreciate it if any of you could point out the part I missed
> or misunderstood. :-)

The error message is quite clear.  I presume that your cache
partitioning algorithm has prevented dom0 from getting any
machine-contiguous pages for DMA.  This prevents dom0 from using any
hardware, such as its disks or the network.

What I don't see is how you plan to isolate different colours in a
shared cache.  I am guessing (seeing as the patch is full of debugging
and hard to follow) that you are using the low order bits in the
physical address to identify the colour, which will indeed prevent any
contiguous allocations from happening.  Is this what you are attempting
to do?


Xen-devel mailing list


