[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH 0/2] Boot time cpupools



On Fri, 19 Nov 2021, Julien Grall wrote:
> Hi Stefano,
> 
> On 18/11/2021 02:19, Stefano Stabellini wrote:
> > On Wed, 17 Nov 2021, Julien Grall wrote:
> > > > > > > On 17 Nov 2021, at 10:26, Julien Grall <julien@xxxxxxx> wrote:
> > > > > > > 
> > > > > > > Hi Luca,
> > > > > > > 
> > > > > > > On 17/11/2021 09:57, Luca Fancellu wrote:
> > > > > > > > Currently Xen creates a default cpupool0 that contains all the
> > > > > > > > cpu
> > > > > > > > brought up
> > > > > > > > during boot and it assumes that the platform has only one kind
> > > > > > > > of
> > > > > > > > CPU.
> > > > > > > > This assumption does not hold on big.LITTLE platform, but
> > > > > > > > putting
> > > > > > > > different
> > > > > > > > type of CPU in the same cpupool can result in instability and
> > > > > > > > security issues
> > > > > > > > for the domains running on the pool.
> > > > > > > 
> > > > > > > I agree that you can't move a LITTLE vCPU to a big pCPU.
> > > > > > > However...
> > > > > > > 
> > > > > > > > For this reason this serie introduces an architecture specific
> > > > > > > > way
> > > > > > > > to create
> > > > > > > > different cpupool at boot time, this is particularly useful on
> > > > > > > > ARM
> > > > > > > > big.LITTLE
> > > > > > > > platform where there might be the need to have different
> > > > > > > > cpupools
> > > > > > > > for each type
> > > > > > > > of core, but also systems using NUMA can have different cpu pool
> > > > > > > > for
> > > > > > > > each node.
> > > > > > > 
> > > > > > > ... from my understanding, all the vCPUs of a domain have to be in
> > > > > > > the
> > > > > > > same cpupool. So with this approach it is not possible:
> > > > > > >     1) to have a mix of LITTLE and big vCPUs in the domain
> > > > > > >     2) to create a domain spanning across two NUMA nodes
> > > > > > > 
> > > > > > > So I think we need to make sure that any solutions we go through
> > > > > > > will
> > > > > > > not prevent us to implement those setups.
> > > > > > The point of this patch is to make all cores available without
> > > > > > breaking
> > > > > > the current behaviour of existing system.
> > > > > 
> > > > > I might be missing some context here. By breaking current behavior, do
> > > > > you
> > > > > mean user that may want to add "hmp-unsafe" on the command line?
> > > > 
> > > > Right, with hmp-unsafe the behaviour is now the same as without, only
> > > > extra
> > > > cores are parked in other cpupools.
> > > > 
> > > > So you have a point in fact that behaviour is changed for someone who
> > > > was
> > > > using hmp-unsafe before if this is activated.
> > > > The command line argument suggested by Jurgen definitely makes sense
> > > > here.
> > > > 
> > > > We could instead do the following:
> > > > - when this is activated in the configuration, boot all cores and park
> > > > them
> > > > in different pools (depending on command line argument). Current
> > > > behaviour
> > > > not change if other pools are not used (but more cores will be on in
> > > > xen)
> > > 
> > >  From my understanding, it is possible to move a pCPU in/out a pool
> > > afterwards.
> > > So the security concern with big.LITTLE is still present, even though it
> > > would
> > > be difficult to hit it.
> > 
> > As far as I know moving a pCPU in/out of a pool is something that cannot
> > happen automatically: it requires manual intervention to the user and it
> > is uncommon. We could print a warning or simply return error to prevent
> > the action from happening. Or something like:
> > 
> > "This action might result in memory corruptions and invalid behavior. Do
> > you want to continue? [Y/N]
> > 
> > 
> > > I am also concerned that it would be more difficult to detect any
> > > misconfiguration. So I think this option would still need to be turned on
> > > only
> > > if hmp-unsafe are the new command line one are both on.
> > > 
> > > If we want to enable it without hmp-unsafe on, we would need to at least
> > > lock
> > > the pools.
> > 
> > Locking the pools is a good idea.
> > 
> > My preference is not to tie this feature to the hmp-unsafe command line,
> > more on this below.
> 
> The only reason I suggested to tie with hmp-unsafe is if you (or anyone else)
> really wanted to use a version where the pool are unlocked.
> 
> If we are going to implement the lock, then I think the hmp-unsafe would not
> be necessary for such configuration.
> 
> > 
> > 
> > > > - when hmp-unsafe is on, this feature will be turned of (if activated in
> > > > configuration) and all cores would be added in the same pool.
> > > > 
> > > > What do you think ?
> > > 
> > > I am split. On one hand, this is making easier for someone to try
> > > big.LITTLE
> > > as you don't have manually pin vCPUs. On the other hand, this is handling
> > > a
> > > single use-case and you would need to use hmp-unsafe and pinning if you
> > > want
> > > to get more exotic setup (e.g. a domain with big.LITTLE).
> > > 
> > > Another possible solution is to pin dom0 vCPUs (AFAIK they are just sticky
> > > by
> > > default) and then create the pools from dom0 userspace. My assumption is
> > > for
> > > dom0less we would want to use pinning instead.
> > > 
> > > That said I would like to hear from Xilinx and EPAM as, IIRC, they are
> > > already
> > > using hmp-unsafe in production.
> > 
> > This discussion has been very interesting, it is cool to hear new ideas
> > like this one. I have a couple of thoughts to share.
> > 
> > First I think that the ability of creating cpupools at boot time is
> > super important. It goes way beyond big.LITTLE. It would be incredibly
> > useful to separate real-time (sched=null) and non-real-time
> > (sched=credit2) workloads. I think it will only become more important
> > going forward so I'd love to see an option to configure cpupools that
> > works for dom0less. It could be based on device tree properties rather
> > than kconfig options.
> > 
> > It is true that if we had the devicetree-based cpupool configuration I
> > mentioned, then somebody could use it to create cpupools matching
> > big.LITTLE. > So "in theory" it solves the problem. However, I think that
> > for big.LITTLE it would be suboptimal. For big.LITTLE it would be best
> > if Xen configured the cpupools automatically rather than based on the
> > device tree configuration. 
> 
> This brings one question. How do Linux detect and use big.LITTLE? Is there a
> Device-Tree binding?
> 
> [...]
> 
> > So I think that it is a good idea to have a command line option (better
> > than a kconfig option) to trigger the MIDR-based cpupool creation at
> > boot time. The option could be called midr-cpupools=on/off or
> > hw-cpupools=on/off for example.
> > In terms of whether it should be the default or not, I don't feel
> > strongly about it.
> 
> On by default means this will security supported and we need to be reasonably
> confident this cannot be broken.
> 
> In this case, I am not only referring to moving a pCPU between pool but also
> Xen doing the right thing (e.g. finding the minimum cache line size).
> 
> 
> [...]
> 
> > - midr-cpupools alone
> > cpupools created at boot, warning/errors on changing cpupools >
> > - midr-cpupools + hmp-unsafe
> > cpupools created at boot, changing cpupools is allowed (we could still
> > warn but no errors)
> > 
> > - hmp-unsafe alone
> > same as today with hmp-unsafe
> 
> I like better Juergen's version. But before agreeing on the command line , I
> would like to understand how Linux is dealing with big.LITTLE today (see my
> question above).

I also like Juergen's version better :-)

The device tree binding that covers big.LITTLE is the same that covers
cpus: Documentation/devicetree/bindings/arm/cpus.yaml

So basically, you can tell it is a big.LITTLE system because the
compatible strings differ between certain cpus, e.g.:

      cpu@0 {
        device_type = "cpu";
        compatible = "arm,cortex-a15";
        reg = <0x0>;
      };

      cpu@1 {
        device_type = "cpu";
        compatible = "arm,cortex-a15";
        reg = <0x1>;
      };

      cpu@100 {
        device_type = "cpu";
        compatible = "arm,cortex-a7";
        reg = <0x100>;
      };

      cpu@101 {
        device_type = "cpu";
        compatible = "arm,cortex-a7";
        reg = <0x101>;
      };


> > For the default as I said I don't have a strong preference. I think
> > midr-cpupools could be "on" be default.
> 
> I think this should be off at the beginning until the feature is matured
> enough. We are soon opening the tree again for the next development cycle. So
> I think we have enough time to make sure everything work properly to have
> turned on by default before next release.

That's fine



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.