
Re: [Xen-devel] [RFC 01/16] docs: create Memory Bandwidth Allocation (MBA) feature document.



On 17-02-23 15:46:36, Meng Xu wrote:
> Hi Yi,
> 
> I have some quick comment about this document. Some minor points are
> not very clear, IMHO.
> 
Thanks for your mail!

> > +
> > +  2. `psr-mba-set [OPTIONS] domain-id throttling`:
> > +
> > +     Set memory bandwidth throttling for domain.
> > +
> > +     Options:
> > +     '-s': Specify the socket to process, otherwise all sockets are processed.
> > +
> > +     Throttling value set in register implies memory bandwidth blocked, i.e.
> > +     higher throttling value results in lower bandwidth. The max throttling
> > +     value can be got through CPUID.
> > +
> > +     The response of the throttling value could be linear mode or non-linear
> > +     mode.
> > +
> > +     Linear mode: the input precision is defined as 100-(MBA_MAX). For instance,
> > +     if the MBA_MAX value is 90, the input precision is 10%. Values not an even
> > +     multiple of the precision (e.g., 12%) will be rounded down (e.g., to 10%
> > +     delay applied) by HW automatically.
> 
> So MBA has a minimum allocation unit. What is the minimum bandwidth
> allocation unit?
> From the above example, I had the impression that the allocation unit is 10%.
> As mentioned in the document later, the throttle value is set in the
> COS register's  Thrtl bit fields as shown in [Code_CBM, Data_CBM,
> Thrtl]. I had the impression that the maximum number of bandwidth
> units we can allocate is 2^number_of_bits_in_Thrtl.
> Only one of my impressions could be true, right? ;-)
> 
MBA supports two modes by design. One is linear mode, where the throttling
value moves in fixed steps such as 10%. The other is non-linear mode, where the
values are powers of 2. So which mode applies depends on the HW info. :)
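
To make the two rounding behaviours concrete, here is a minimal Python sketch
of what the document describes. The function names are hypothetical, the
clamping to MBA_MAX is my assumption, and the real rounding is done by the HW,
not by software.

```python
def round_linear(thrtl: int, mba_max: int) -> int:
    """Round a requested delay down to a multiple of the linear step."""
    step = 100 - mba_max  # e.g. MBA_MAX = 90 gives a 10% step
    if step <= 0:
        raise ValueError("MBA_MAX must be below 100 in linear mode")
    thrtl = min(max(thrtl, 0), mba_max)  # assumption: clamp to [0, MBA_MAX]
    return (thrtl // step) * step


def round_nonlinear(thrtl: int, mba_max: int) -> int:
    """Round a requested delay down to a power of two (zero stays zero)."""
    thrtl = min(max(thrtl, 0), mba_max)  # assumption: clamp to [0, MBA_MAX]
    if thrtl == 0:
        return 0
    return 1 << (thrtl.bit_length() - 1)
```

With MBA_MAX = 90 in linear mode, a request of 12 and a request of 19 both end
up as 10, which is the effect Meng describes below.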

> In addition, since hardware will round down the partial bandwidth
> value, why shouldn't we just allow system operators to configure the
> "valid" bandwidth supported by the hardware.
> For example, if the hardware only supports the bandwidth  throttle
> value in 10% units, then we should not allow users to input the
> bandwidth throttle value as 12% or 13%. Otherwise, as a system
> operator, I would be confused at why I increased the bandwidth
> throttle value from 11% to 19%, I still see the same bandwidth
> guarantee.
> 
That is an option to implement in libxl or even in an upper layer.
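
As a rough illustration of such a check, a tool stack could enumerate the
values the HW accepts without rounding and reject anything else. This is only
a sketch in Python under the rules quoted above; `valid_throttles` and
`check_throttle` are hypothetical names, not existing libxl functions.

```python
def valid_throttles(mba_max: int, linear: bool) -> list:
    """List the throttling values the HW would accept without rounding."""
    if linear:
        step = 100 - mba_max  # input precision in linear mode
        if step <= 0:
            raise ValueError("MBA_MAX must be below 100 in linear mode")
        return list(range(0, mba_max + 1, step))
    # Non-linear mode: zero plus every power of two up to MBA_MAX.
    values = [0]
    power = 1
    while power <= mba_max:
        values.append(power)
        power *= 2
    return values


def check_throttle(thrtl: int, mba_max: int, linear: bool) -> int:
    """Return thrtl unchanged, or raise if the HW would round it."""
    allowed = valid_throttles(mba_max, linear)
    if thrtl not in allowed:
        raise ValueError(f"throttling {thrtl} not in {allowed}")
    return thrtl
```

Rejecting 12 up front, instead of silently applying 10, avoids the confusion
described above.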

> > +
> > +     Non-linear mode: input delay values are powers-of-two from zero to the
> > +     MBA_MAX value from CPUID. In this case any values not a power of two will
> > +     be rounded down to the next nearest power of two by HW automatically.
> 
> First question: Why is it the delay value instead of bandwidth value
> in the non-linear mode? Does MBA really control memory access latency?
> 
MBA controls the latency directly in order to control the bandwidth indirectly.
See the description in the SDM:
"The Memory Bandwidth Allocation (MBA) feature provides indirect and
approximate control over memory bandwidth available per-core"

> Second question: Does the hardware provide any guaranteed bandwidth in
> the non-linear mode?
Nope. As mentioned above, only "approximate control" is provided, in both
linear mode and non-linear mode.

> I saw the document patch in Linux at
> http://www.mail-archive.com/linux-kernel@xxxxxxxxxxxxxxx/msg1307176.html:
> [Quote]
> In nonlinear scale currently SDM specifies
> +throttle values in 2^n values. However the h/w does not guarantee a
> +specific curve for the amount of memory b/w that is actually throttled.
> +But for any thrtl_by value x > y, its guaranteed that x would throttle
> +more b/w than y.  The info directory specifies the max thrtl_by value
> +and thrtl_by granularity.
> [/Quote]
> 
> It seems that the non-linear mode simply provides some throttling
> relations but doesn't guarantee the actual throttle value.
> Maybe it will be good to clearly state the capability and limitations
> of the hardware.
> 
Sorry, there is no such info in the SDM. But you can use the MBM (Memory
Bandwidth Monitoring) feature to observe the actual effect of MBA.

> > +  System administrator can change PSR allocation policy at runtime by
> > +  tool stack. Since MBA shares COS with CAT/CDP, a COS corresponds to a
> > +  2-tuple, like [CBM, Thrtl] with only-CAT enabled, when CDP is enabled,
> > +  the COS corresponds to a 3-tuple, like [Code_CBM, Data_CBM, Thrtl]. If
> > +  neither CAT nor CDP is enabled, things would be easier, one COS
> > +  corresponds to one Thrtl.
> 
> How many bits in Thrtl field?
> Is it decided by the hardware type?
> 
This is defined in SDM.
"The definition for the MBA delay value MSRs is provided in Figure 17.39. The
lower 16 bits are used for MBA delay values, and values from zero to the maximum
from the CPUID MBA_MAX-1 value are supported."

Please note that the MBA value is different from a CBM: it is a plain number,
not a bitmask, so you do not need to care about the individual bits.
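
For illustration only, the field handling could look like the Python sketch
below. It assumes just what the SDM sentence above states (the delay value
sits in the lower 16 bits); the names are hypothetical, and writing the upper
bits as zero is my assumption.

```python
THRTL_MASK = 0xFFFF  # lower 16 bits hold the MBA delay value


def encode_thrtl(thrtl: int, mba_max: int) -> int:
    """Build the MSR value for a delay; upper bits are left as zero."""
    if not 0 <= thrtl <= mba_max:
        raise ValueError(f"delay {thrtl} outside 0..{mba_max}")
    return thrtl & THRTL_MASK


def decode_thrtl(msr_value: int) -> int:
    """Extract the delay value from a raw MSR value."""
    return msr_value & THRTL_MASK
```

The value is range-checked as a number; no bit of it is interpreted on its
own.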

> > +# References
> > +
> > +"INTEL® RESOURCE DIRECTOR TECHNOLOGY (INTEL® RDT) ALLOCATION FEATURES" [Intel® 64 and IA-32 Architectures Software Developer Manuals, vol3](http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html)
> > +
> 
> I checked the document. The CAT is in Chapter 17.17. However, there is
> no description about the MBA? ;-)
Have you downloaded the latest SDM? Section 17.18.7 covers MBA.

> 
> Thanks,
> 
> Meng
> 
> -----------
> Meng Xu
> PhD Student in Computer and Information Science
> University of Pennsylvania
> http://www.cis.upenn.edu/~mengxu/
