
Re: [Xen-devel] [PATCH 02/18] xen/arm: Implement PSCI system suspend call (virtual interface)



Hi Julien,

Thanks for your feedback. I'll need to answer in iterations.

On Mon, Nov 12, 2018 at 4:27 PM Julien Grall <julien.grall@xxxxxxx> wrote:
>
> Hi Mirela,
>
> On 11/12/18 11:30 AM, Mirela Simonovic wrote:
> > The implementation consists of:
> > -Adding PSCI system suspend call as new PSCI function
> > -Trapping PSCI system_suspend HVC
> > -Implementing PSCI system suspend call (virtual interface that allows
> >   guests to suspend themselves)
> >
> > The PSCI system suspend should be called by a guest from its boot
> > VCPU. Non-boot VCPUs of the guest should be hot-unplugged using PSCI
> > CPU_OFF call prior to issuing PSCI system suspend. Interrupts that
> > are left enabled by the guest are assumed to be its wake-up interrupts.
> > Therefore, a wake-up interrupt triggers the resume of the guest. Guest
> > should resume regardless of the state of Xen (suspended or not).
> >
> > When a guest calls PSCI system suspend the respective domain will be
> > suspended if the following conditions are met:
> > 1) Given resume entry point is not invalid
> > 2) Other (if any) VCPUs of the calling guest are hot-unplugged
> >
> > If the conditions above are met the calling domain is labeled as
> > suspended and the calling VCPU is blocked. If nothing else wouldn't
> > be done the suspended domain would resume from the place where it
> > called PSCI system suspend. This is expected if processing of the PSCI
> > system suspend call fails. However, in the case of success the calling
> > guest should resume (continue execution after the wake-up) from the entry
> > point which is given as the first argument of the PSCI system suspend
> > call. In addition to the entry point, the guest expects to start within
> > the environment whose state matches the state after reset. This means
> > that the guest should find reset register values, MMU disabled, etc.
> > Thereby, the context of VCPU should be 'reset' (as if the system is
> > comming out of reset), the program counter should contain entry point,
> > which is 1st argument, and r0/x0 should contain context ID which is 2nd
> > argument of PSCI system suspend call. The context of VCPU is set
> > accordingly when the PSCI system suspend is processed, so that nothing
> > needs to be done on resume/wake-up path. However, in order to ensure that
> > this context doesn't get overwritten by the scheduler when scheduling out
> > this VCPU (would normally happen after the calling CPU is blocked), we need
> > to check whether to return early from ctxt_switch_from().
> >
> > There are variables in domain structure to keep track of domain shutdown.
> > One of existing shutdown reason is 'suspend' which this patch is using to
> > track the suspend state of a domain. Those variables are used to determine
> > whether to early return from ctxt_switch_from() or not.
> >
> > A suspended domain will resume after the Xen receives an interrupt which is
> > targeted to the domain, unblocks the domain's VCPU, and schedules it in.
> > When the VCPU is scheduled in, the VCPU context is already reset, and
> > contains the right resume entry point in program counter that will be
> > restored in ctxt_switch_to(). The only thing that needs to be done at this
> > point is to clear the variables that marked the domain state as suspended.
> >
> > Signed-off-by: Mirela Simonovic <mirela.simonovic@xxxxxxxxxx>
> > Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xxxxxxxxxx>
> >
> > ---
> > Changes in v2:
> >
> > -Fix print to compile for arm32 and to align with Xen coding style
> > ---
> >   xen/arch/arm/Makefile            |   1 +
> >   xen/arch/arm/domain.c            |  13 +++
> >   xen/arch/arm/suspend.c           | 166 +++++++++++++++++++++++++++++++++++++++
> >   xen/arch/arm/vpsci.c             |  19 +++++
> >   xen/include/asm-arm/perfc_defn.h |   1 +
> >   xen/include/asm-arm/psci.h       |   2 +
> >   xen/include/asm-arm/suspend.h    |  16 ++++
> >   xen/include/xen/sched.h          |   1 +
> >   8 files changed, 219 insertions(+)
> >   create mode 100644 xen/arch/arm/suspend.c
> >   create mode 100644 xen/include/asm-arm/suspend.h
> >
> > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> > index 23c5d9adbc..744b1a4dc8 100644
> > --- a/xen/arch/arm/Makefile
> > +++ b/xen/arch/arm/Makefile
> > @@ -43,6 +43,7 @@ obj-y += setup.o
> >   obj-y += shutdown.o
> >   obj-y += smp.o
> >   obj-y += smpboot.o
> > +obj-y += suspend.o
> >   obj-y += sysctl.o
> >   obj-y += time.o
> >   obj-y += traps.o
> > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> > index e594b48d81..7f8105465c 100644
> > --- a/xen/arch/arm/domain.c
> > +++ b/xen/arch/arm/domain.c
> > @@ -97,6 +97,11 @@ static void ctxt_switch_from(struct vcpu *p)
> >       if ( is_idle_vcpu(p) )
> >           return;
> >
> > +    /* VCPU's context should not be saved if its domain is suspended */
> > +    if ( p->domain->is_shut_down &&
> > +        (p->domain->shutdown_code == SHUTDOWN_suspend) )
> > +        return;
> SHUTDOWN_suspend is used in Xen for other purpose (see
> SCHEDOP_shutdown). The other user of that code relies on all the state
> to be saved on suspend.
>

We just need a flag to mark a domain as suspended, and I believe
SHUTDOWN_suspend is not used anywhere else.
Let's come back to this.

> However, what is the issue with saving all the registers here?
>

The guest passes two arguments with the PSCI system suspend call: the
entry point, which has to end up in the program counter, and the context
ID, which has to end up in x0/r0. Those arguments are not available in
ctxt_switch_from(). The context switch happens only after the PSCI call
has been processed, so saving the registers there would be too late and
would overwrite the context that was already prepared for resume.
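
To make the ordering problem concrete, here is a minimal toy model. It is
not Xen code: struct toy_vcpu and both functions are hypothetical
stand-ins, and the "live" fields stand for whatever the guest registers
held when it issued the call.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; none of these names exist in Xen. */
struct toy_vcpu {
    uint64_t saved_pc, saved_x0;  /* context restored when scheduled in */
    uint64_t live_pc, live_x0;    /* guest registers at the time of the call */
    bool domain_suspended;
};

/* Models the PSCI handler: the resume context is prepared up front. */
static void toy_system_suspend(struct toy_vcpu *v, uint64_t epoint,
                               uint64_t cid)
{
    v->saved_pc = epoint;
    v->saved_x0 = cid;
    v->domain_suspended = true;
}

/*
 * Models ctxt_switch_from(). With honour_suspend set, the save is skipped
 * for a suspended domain; without it, the prepared context is overwritten.
 */
static void toy_ctxt_switch_from(struct toy_vcpu *v, bool honour_suspend)
{
    if ( honour_suspend && v->domain_suspended )
        return;

    v->saved_pc = v->live_pc;
    v->saved_x0 = v->live_x0;
}
```

Calling toy_system_suspend() and then toy_ctxt_switch_from() with
honour_suspend set keeps the entry point and context ID; with it cleared,
both are replaced by the values from the time of the call.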

> > +
> >       p2m_save_state(p);
> >
> >       /* CP 15 */
> > @@ -181,6 +186,14 @@ static void ctxt_switch_to(struct vcpu *n)
> >       if ( is_idle_vcpu(n) )
> >           return;
> >
> > +    /* If the domain was suspended, it is resuming now */
> > +    if ( n->domain->is_shut_down &&
> > +        (n->domain->shutdown_code == SHUTDOWN_suspend) )
> > +    {
> > +        n->domain->is_shut_down = 0;
> > +        n->domain->shutdown_code = SHUTDOWN_CODE_INVALID;
> > +    }
>
> This looks like a hack. Why not calling domain_resume when receiving the
> interrupt?
>

Good point, I need to double check and come back to this.

> > +
> >       p2m_restore_state(n);
> >
> >       vpidr = READ_SYSREG32(MIDR_EL1);
> > diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
> > new file mode 100644
> > index 0000000000..9eea9214e1
> > --- /dev/null
> > +++ b/xen/arch/arm/suspend.c
>
> I would prefer if we don't mix guest and host suspend in the same file.

Sure, we can move the guest suspend code into another file, e.g.
xen/arch/arm/vsuspend.c

>
> > @@ -0,0 +1,166 @@
>
> Missing copyright headers here.

Thanks

>
> > +#include <xen/sched.h>
> > +#include <asm/cpufeature.h>
> > +#include <asm/event.h>
> > +#include <asm/psci.h>
> > +
> > +/* Reset values of VCPU architecture specific registers */
>
> Technically this is not required as most of the registers are unknown. I
> understand this helps for debugging an OS.
>
> I would introduce it in a separate patch and directly in
> arch_set_info_guest as I would like the behavior to be the same
> everywhere we need to reset a vCPU.

I agree. Please keep in mind that a vCPU context is reset in two
scenarios: when a vCPU has just been created, and when the vCPU already
exists but its context has to be cleared. Could you please provide some
guidance on how to do this? We struggled with it for a while and didn't
find a nice way.

>
> > +static void vcpu_arch_reset(struct vcpu *v)
> > +{
> > +    v->arch.ttbr0 = 0;
> > +    v->arch.ttbr1 = 0;
> > +    v->arch.ttbcr = 0;
> > +
> > +    v->arch.csselr = 0;
> > +    v->arch.cpacr = 0;
> > +    v->arch.contextidr = 0;
> > +    v->arch.tpidr_el0 = 0;
> > +    v->arch.tpidrro_el0 = 0;
> > +    v->arch.tpidr_el1 = 0;
> > +    v->arch.vbar = 0;
> > +    if ( is_32bit_domain(v->domain) )
>
> This is not necessary
>
> > +        v->arch.dacr = 0;
> > +    v->arch.par = 0;
> > +#if defined(CONFIG_ARM_32)
> > +    v->arch.mair0 = 0;
> > +    v->arch.mair1 = 0;
> > +    v->arch.amair0 = 0;
> > +    v->arch.amair1 = 0;
> > +#else
> > +    v->arch.mair = 0;
> > +    v->arch.amair = 0;
> > +#endif
> > +    /* Fault Status */
> > +#if defined(CONFIG_ARM_32)
> > +    v->arch.dfar = 0;
> > +    v->arch.ifar = 0;
> > +    v->arch.dfsr = 0;
> > +#elif defined(CONFIG_ARM_64)
> > +    v->arch.far = 0;
> > +    v->arch.esr = 0;
> > +#endif
> > +
> > +    if ( is_32bit_domain(v->domain) )
>
> Same here.
>
> > +        v->arch.ifsr  = 0;
> > +    v->arch.afsr0 = 0;
> > +    v->arch.afsr1 = 0;
> > +
> > +#ifdef CONFIG_ARM_32
> > +    v->arch.joscr = 0;
> > +    v->arch.jmcr = 0;
> > +#endif
> > +
> > +    if ( is_32bit_domain(v->domain) && cpu_has_thumbee )
>
> Same here.
>
> > +    {
> > +        v->arch.teecr = 0;
> > +        v->arch.teehbr = 0;
> > +    }
> > +}
> > +
> > +/*
> > + * This function sets the context of current VCPU to the state which is expected
> > + * by the guest on resume. The expected VCPU state is:
> > + * 1) pc to contain resume entry point (1st argument of PSCI SYSTEM_SUSPEND)
> > + * 2) r0/x0 to contain context ID (2nd argument of PSCI SYSTEM_SUSPEND)
> > + * 3) All other general purpose and system registers should have reset values
> > + *
> > + * Note: this function has to return void because it has to always succeed. In
> > + * other words, this function is called from virtual PSCI SYSTEM_SUSPEND
> > + * implementation, which can return only a limited number of possible errors,
> > + * none of which could represent the fact that an error occurred when preparing
> > + * the domain for suspend.
> > + * Consequently, dynamic memory allocation cannot be done within this function,
> > + * because if malloc fails the error has nowhere to propagate.
>
> You could crash the domain if you are not able to resume it. In the
> current context...
>
> > + */
> > +static void vcpu_suspend(register_t epoint, register_t cid)
> > +{
> > +    /* Static allocation because dynamic would need a non-void return */
> > +    static struct vcpu_guest_context ctxt;
>
> ... this is not right. This function can be called concurrently, so a
> lot of funny things can happen (i.e corruption).
>
> The vCPU context does not look too big. So I would just allocate it on
> the stack directly.
>

Agreed, 'static' should be removed to address all these issues.

> > +    struct vcpu *v = current;
> > +
> > +    /* Make sure that VCPU guest regs are zeroied */
>
> s/zeroied/zeroed/

Thanks

>
> > +    memset(&ctxt, 0, sizeof(ctxt));
> > +
> > +    /* Set non-zero values to the registers prior to copying */
> > +    ctxt.user_regs.pc64 = (u64)epoint;
> > +
> > +    if ( is_32bit_domain(current->domain) )
> > +    {
> > +        ctxt.user_regs.r0_usr = cid;
> > +        ctxt.user_regs.cpsr = PSR_GUEST32_INIT;
> > +
> > +        /* Thumb set is allowed only for 32-bit domain */
> > +        if ( epoint & 1 )
> > +        {
> > +            ctxt.user_regs.cpsr |= PSR_THUMB;
> > +            ctxt.user_regs.pc64 &= ~(u64)1;
> > +        }
> > +    }
> > +#ifdef CONFIG_ARM_64
> > +    else
> > +    {
> > +        ctxt.user_regs.x0 = cid;
> > +        ctxt.user_regs.cpsr = PSR_GUEST64_INIT;
> > +    }
> > +#endif
> > +    ctxt.sctlr = SCTLR_GUEST_INIT;
> > +    ctxt.flags = VGCF_online;
> > +
> > +    /* Reset architecture specific registers */
> > +    vcpu_arch_reset(v);
> > +
> > +    /* Initialize VCPU registers */
> > +    _arch_set_info_guest(v, &ctxt);
>
> AFAICT, this is expected to be called with the domain lock taken as this
> can be called by various path.
>
> Also, most of the function is the same as CPU_on. So I would like to see
> the code factored in the separate function and used in both place.
>

I agree, but the two scenarios (VCPU allocation versus clearing the VCPU
context) made it a bit difficult to share. Please let me know if you
have any additional hints on exactly how to structure the code.
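
As a starting point for that discussion, here is one possible shape for a
helper shared by CPU_ON and SYSTEM_SUSPEND. This is only a sketch under
assumptions: struct toy_ctxt, toy_prepare_entry() and the TOY_* init
values are hypothetical placeholders, not the Xen definitions. The caller
keeps the context on its own stack, so no static buffer is needed.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; the init values are placeholders. */
#define TOY_PSR_THUMB         (1u << 5)   /* T bit of the AArch32 CPSR */
#define TOY_PSR_GUEST32_INIT  0x1d3u
#define TOY_PSR_GUEST64_INIT  0x1c5u

struct toy_ctxt {
    uint64_t pc;
    uint64_t arg0;   /* r0 for a 32-bit domain, x0 for a 64-bit one */
    uint32_t cpsr;
};

/*
 * Fill a caller-provided context so that the vCPU starts at 'epoint' with
 * 'cid' in r0/x0. Returns false if the entry point is not acceptable
 * (Thumb bit set for a 64-bit domain).
 */
static bool toy_prepare_entry(struct toy_ctxt *ctxt, bool is_32bit,
                              uint64_t epoint, uint64_t cid)
{
    bool is_thumb = epoint & 1;

    if ( !is_32bit && is_thumb )
        return false;

    memset(ctxt, 0, sizeof(*ctxt));
    ctxt->pc = epoint;
    ctxt->arg0 = cid;
    ctxt->cpsr = is_32bit ? TOY_PSR_GUEST32_INIT : TOY_PSR_GUEST64_INIT;

    if ( is_thumb )
    {
        /* Only reachable for a 32-bit domain. */
        ctxt->cpsr |= TOY_PSR_THUMB;
        ctxt->pc &= ~(uint64_t)1;
    }

    return true;
}
```

Both callers would then declare a struct toy_ctxt locally, call the helper,
and hand the result to whatever installs the context, which is the part
that differs between a new VCPU and an existing one.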

> > +}
> > +
> > +int32_t domain_suspend(register_t epoint, register_t cid)
> > +{
> > +    struct vcpu *v;
> > +    struct domain *d = current->domain;
> > +    bool is_thumb = epoint & 1;
> > +
> > +    dprintk(XENLOG_DEBUG,
> > +            "Dom%d suspend: epoint=0x%"PRIregister", cid=0x%"PRIregister"\n",
> > +            d->domain_id, epoint, cid);
> > +
> > +    /* THUMB set is not allowed with 64-bit domain */
> > +    if ( is_64bit_domain(d) && is_thumb )
> > +        return PSCI_INVALID_ADDRESS;
> > +
> > +    /* Ensure that all CPUs other than the calling one are offline */
> > +    for_each_vcpu ( d, v )
> > +    {
> > +        if ( v != current && is_vcpu_online(v) )
> > +            return PSCI_DENIED;
> > +    }
>
> What prevents a vCPU from coming online while the loop is running?

As you suggest, probably nothing, if there is a bug in the guest, and
that is a case we want to check for. Is domain_lock the right thing to
use here?
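
Whichever lock turns out to be the right one, the shape I have in mind is
to do the online check and the state change in the same critical section.
A minimal sketch, with the lock modelled as a plain flag and every name
(struct toy_domain, toy_try_suspend(), ...) hypothetical; it also assumes
that the path bringing a VCPU online takes the same lock:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_VCPUS 4

/* Toy model only; the flag stands in for a real lock. */
struct toy_domain {
    bool lock_held;
    bool vcpu_online[TOY_MAX_VCPUS];
    bool suspended;
};

static void toy_lock(struct toy_domain *d)
{
    assert(!d->lock_held);
    d->lock_held = true;
}

static void toy_unlock(struct toy_domain *d)
{
    assert(d->lock_held);
    d->lock_held = false;
}

/* Returns 0 on success, -1 if a VCPU other than the caller is online. */
static int toy_try_suspend(struct toy_domain *d, size_t caller)
{
    int rc = 0;
    size_t i;

    toy_lock(d);

    for ( i = 0; i < TOY_MAX_VCPUS; i++ )
    {
        if ( i != caller && d->vcpu_online[i] )
        {
            rc = -1;
            break;
        }
    }

    /* Mark the domain suspended before anyone else can change the state. */
    if ( rc == 0 )
        d->suspended = true;

    toy_unlock(d);

    return rc;
}
```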

>
> > +
> > +    /*
> > +     * Prepare the calling VCPU for suspend (reset its context, save entry point
> > +     * into pc and context ID into r0/x0 as specified by PSCI SYSTEM_SUSPEND)
> > +     */
> > +    vcpu_suspend(epoint, cid);
> > +
> > +    /*
> > +     * Set the domain state to suspended (will be cleared when the domain
> > +     * resumes, i.e. VCPU of this domain gets scheduled in).
> > +     */
> > +    d->is_shut_down = 1;
> > +    d->shutdown_code = SHUTDOWN_suspend;
>
> If you look at the other usage, you will notice that they are protected
> with a lock. Why is it not necessary here?
>

I think it is necessary here too.

> I am also not entirely sure why we could not re-use code that already
> exist in common code. Surely suspend/resume should work in a similar way?
>

Could you please be more specific (which common code)?

> > +
> > +    /*
> > +     * The calling domain is suspended by blocking its last running VCPU. If an
> > +     * event is pending the domain will resume right away (VCPU will not block,
> > +     * but when scheduled in it will resume from the given entry point).
> > +     */
> > +    vcpu_block_unless_event_pending(current);
> > +
> > +    return PSCI_SUCCESS;
> > +}
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
> > diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
> > index 9f4e5b8844..f7922be0c5 100644
> > --- a/xen/arch/arm/vpsci.c
> > +++ b/xen/arch/arm/vpsci.c
> > @@ -18,6 +18,7 @@
> >   #include <asm/vgic.h>
> >   #include <asm/vpsci.h>
> >   #include <asm/event.h>
> > +#include <asm/suspend.h>
> >
> >   #include <public/sched.h>
> >
> > @@ -210,6 +211,11 @@ static void do_psci_0_2_system_reset(void)
> >       domain_shutdown(d,SHUTDOWN_reboot);
> >   }
> >
> > +static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
> > +{
> > +    return domain_suspend(epoint, cid);
> > +}
> > +
> >   static int32_t do_psci_1_0_features(uint32_t psci_func_id)
> >   {
> >       /* /!\ Ordered by function ID and not name */
> > @@ -227,6 +233,8 @@ static int32_t do_psci_1_0_features(uint32_t psci_func_id)
> >       case PSCI_0_2_FN32_SYSTEM_OFF:
> >       case PSCI_0_2_FN32_SYSTEM_RESET:
> >       case PSCI_1_0_FN32_PSCI_FEATURES:
> > +    case PSCI_1_0_FN32_SYSTEM_SUSPEND:
> > +    case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> >       case ARM_SMCCC_VERSION_FID:
> >           return 0;
> >       default:
> > @@ -357,6 +365,17 @@ bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid)
> >           return true;
> >       }
> >
> > +    case PSCI_1_0_FN32_SYSTEM_SUSPEND:
> > +    case PSCI_1_0_FN64_SYSTEM_SUSPEND:
> > +    {
> > +        register_t epoint = PSCI_ARG(regs,1);
> > +        register_t cid = PSCI_ARG(regs,2);
> > +
> > +        perfc_incr(vpsci_system_suspend);
> > +        PSCI_SET_RESULT(regs, do_psci_1_0_system_suspend(epoint, cid));
> > +        return true;
> > +    }
> > +
> >       default:
> >           return false;
> >       }
> > diff --git a/xen/include/asm-arm/perfc_defn.h b/xen/include/asm-arm/perfc_defn.h
> > index 8922e9525a..a02d0adea8 100644
> > --- a/xen/include/asm-arm/perfc_defn.h
> > +++ b/xen/include/asm-arm/perfc_defn.h
> > @@ -32,6 +32,7 @@ PERFCOUNTER(vpsci_system_reset,        "vpsci: system_reset")
> >   PERFCOUNTER(vpsci_cpu_suspend,         "vpsci: cpu_suspend")
> >   PERFCOUNTER(vpsci_cpu_affinity_info,   "vpsci: cpu_affinity_info")
> >   PERFCOUNTER(vpsci_features,            "vpsci: features")
> > +PERFCOUNTER(vpsci_system_suspend,      "vpsci: system_suspend")
> >
> >   PERFCOUNTER(vcpu_kick,                 "vcpu: notify other vcpu")
> >
> > diff --git a/xen/include/asm-arm/psci.h b/xen/include/asm-arm/psci.h
> > index 832f77afff..26462d0c47 100644
> > --- a/xen/include/asm-arm/psci.h
> > +++ b/xen/include/asm-arm/psci.h
> > @@ -43,10 +43,12 @@ void call_psci_system_reset(void);
> >   #define PSCI_0_2_FN32_SYSTEM_OFF          PSCI_0_2_FN32(8)
> >   #define PSCI_0_2_FN32_SYSTEM_RESET        PSCI_0_2_FN32(9)
> >   #define PSCI_1_0_FN32_PSCI_FEATURES       PSCI_0_2_FN32(10)
> > +#define PSCI_1_0_FN32_SYSTEM_SUSPEND      PSCI_0_2_FN32(14)
> >
> >   #define PSCI_0_2_FN64_CPU_SUSPEND         PSCI_0_2_FN64(1)
> >   #define PSCI_0_2_FN64_CPU_ON              PSCI_0_2_FN64(3)
> >   #define PSCI_0_2_FN64_AFFINITY_INFO       PSCI_0_2_FN64(4)
> > +#define PSCI_1_0_FN64_SYSTEM_SUSPEND      PSCI_0_2_FN64(14)
> >
> >   /* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
> >   #define PSCI_0_2_AFFINITY_LEVEL_ON      0
> > diff --git a/xen/include/asm-arm/suspend.h b/xen/include/asm-arm/suspend.h
> > new file mode 100644
> > index 0000000000..de787d296a
> > --- /dev/null
> > +++ b/xen/include/asm-arm/suspend.h
> > @@ -0,0 +1,16 @@
> > +#ifndef __ASM_ARM_SUSPEND_H__
> > +#define __ASM_ARM_SUSPEND_H__
> > +
> > +int32_t domain_suspend(register_t epoint, register_t cid);
> > +
> > +#endif
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * tab-width: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
> > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> > index 3171eabfd6..1f4e86524f 100644
> > --- a/xen/include/xen/sched.h
> > +++ b/xen/include/xen/sched.h
> > @@ -24,6 +24,7 @@
> >   #include <xen/wait.h>
> >   #include <public/xen.h>
> >   #include <public/domctl.h>
> > +#include <public/sched.h>
> >   #include <public/sysctl.h>
> >   #include <public/vcpu.h>
> >   #include <public/vm_event.h>
> >
>
> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

