Re: [PATCH v7 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests



Hi Jan,

On Tue, Jul 29, 2025 at 12:11 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 29.07.2025 10:52, Mykola Kvach wrote:
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -1334,16 +1334,15 @@ int domain_shutdown(struct domain *d, u8 reason)
> >      return 0;
> >  }
> >
> > -void domain_resume(struct domain *d)
> > +#ifdef CONFIG_ARM
> > +void domain_resume_nopause(struct domain *d)
> > +#else
> > +static void domain_resume_nopause(struct domain *d)
> > +#endif
>
> #ifndef CONFIG_ARM
> static
> #endif
> void domain_resume_nopause(struct domain *d)
>
> to have as little redundancy as possible.

Okay, I’ll change it.

>
> >  {
> >      struct vcpu *v;
> >
> > -    /*
> > -     * Some code paths assume that shutdown status does not get reset under
> > -     * their feet (e.g., some assertions make this assumption).
> > -     */
> > -    domain_pause(d);
> > -
> > +    domain_lock(d);
>
> This addition of locking affects domain_resume() as well. Neither need nor
> correctness are discussed in the description. If the locking was really
> needed for domain_resume() as well, I suppose adding that would better be
> a separate change anyway.

Thanks for the review.

The locking was added to avoid potential races involving _VPF_suspended and
the suspend/resume logic.

Consider the case where domain_lock() is not taken in domain_resume():

Domain 1 initiates a suspend via PSCI SYSTEM_SUSPEND while, at the same
time, Domain 0 runs "xl resume" for Domain 1, which ends up in
domain_resume().

On the suspend path, Domain 1 takes domain_lock() and then the shutdown
lock in domain_shutdown(), so Domain 0 blocks waiting for the shutdown
lock. After Domain 1 releases the shutdown lock, it sets _VPF_suspended
and starts updating the VCPU context; Domain 0 then proceeds and clears
_VPF_suspended.

At this point, ctxt_switch_from() might run with _VPF_suspended already
cleared and the VCPU context only partially updated: for example, the
guest PC already points at the resume entry point, while registers such
as TTBR or SCTLR_EL1 are still saved from the current hardware context
by ctxt_switch_from().

However, after reviewing the flow again, I think this kind of race can
still happen even with the lock in place: Domain 1 sets the flag via
SYSTEM_SUSPEND, and Domain 0 clears it by resuming the domain before
the first context switch. That can again leave a partially updated,
inconsistent context.

So it might be better to update the VCPU context at the point of resume
instead of during suspend; a sketch follows below. I'll look into that
further and also check for other possible races if the update is moved.
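
To make that concrete, here is a rough, untested sketch of what moving
the context update to resume time could look like, based on the current
domain_resume() body and leaving the domain_lock() question aside for
now. arch_vcpu_resume() is a hypothetical arch hook that would apply
the saved resume state (entry point, MMU registers, ...) to the VCPU:

void domain_resume_nopause(struct domain *d)
{
    struct vcpu *v;

    spin_lock(&d->shutdown_lock);

    d->is_shutting_down = d->is_shut_down = 0;
    d->shutdown_code = SHUTDOWN_CODE_INVALID;

    for_each_vcpu ( d, v )
    {
        /*
         * Apply the resume state here, under the shutdown lock, so a
         * racing "xl resume" can never observe a half-updated context.
         */
        if ( test_and_clear_bit(_VPF_suspended, &v->pause_flags) )
            arch_vcpu_resume(v); /* hypothetical arch hook */

        if ( v->paused_for_shutdown )
            vcpu_unpause(v);
        v->paused_for_shutdown = 0;
    }

    spin_unlock(&d->shutdown_lock);
}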

>
> The addition of this locking is particularly interesting considering that
> ...
>
> >      spin_lock(&d->shutdown_lock);
>
> ... is what follows right after.
>
> > --- a/xen/include/xen/sched.h
> > +++ b/xen/include/xen/sched.h
> > @@ -814,6 +814,9 @@ void domain_destroy(struct domain *d);
> >  int domain_kill(struct domain *d);
> >  int domain_shutdown(struct domain *d, u8 reason);
> >  void domain_resume(struct domain *d);
> > +#ifdef CONFIG_ARM
> > +void domain_resume_nopause(struct domain *d);
> > +#endif
> >
> >  int domain_soft_reset(struct domain *d, bool resuming);
> >
> > @@ -1010,6 +1013,10 @@ static inline struct domain *next_domain_in_cpupool(
> >  /* VCPU is parked. */
> >  #define _VPF_parked          8
> >  #define VPF_parked           (1UL<<_VPF_parked)
> > +/* VCPU is suspended. */
> > +#define _VPF_suspended       9
> > +#define VPF_suspended        (1UL<<_VPF_suspended)
>
> Irrespective of the style violations in pre-existing code, can you please
> not add further violations, by inserting the missing blanks?

Okay, will do.
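
With the blanks added, the definitions would read:

/* VCPU is suspended. */
#define _VPF_suspended       9
#define VPF_suspended        (1UL << _VPF_suspended)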

>
> > +
> >
>
> Please also don't introduce double blank lines.

I'll remove it.

>
> Jan

Best regards,
Mykola