
Re: Memory ordering question in the shutdown deferral code



Hi Jan,

On 21/09/2020 14:11, Jan Beulich wrote:
On 21.09.2020 13:40, Julien Grall wrote:
(+ Xen-devel)

Sorry I forgot to CC xen-devel.

On 21/09/2020 12:38, Julien Grall wrote:
Hi all,

I have started to look at the deferral code (see
vcpu_start_shutdown_deferral()) because we need it for LiveUpdate and
Arm will soon use it.

The current implementation uses an smp_mb() to order a write followed
by a read. The code looks roughly like this (I have slightly adapted it
to make my question more obvious):

domain_shutdown()
      d->is_shutting_down = 1;
      smp_mb();
      if ( !vcpu0->defer_shutdown )
      {
        vcpu_pause_nosync(v);
        v->paused_for_shutdown = 1;
      }

vcpu_start_shutdown_deferral()
      vcpu0->defer_shutdown = 1;
      smp_mb();
      if ( unlikely(d->is_shutting_down) )
        vcpu_check_shutdown(v);

      return vcpu0->defer_shutdown;

smp_mb() should only guarantee ordering (it may be stronger on some
architectures), so I think there is a race between the two functions.

It would be possible to pause the vCPU in domain_shutdown() because
vcpu0->defer_shutdown wasn't yet seen.

Equally, vcpu_start_shutdown_deferral() may not see d->is_shutting_down
and therefore Xen may continue to send the I/O. Yet the vCPU will be
paused so the I/O will never complete.

Individually for each of these I agree. But isn't the goal merely
to prevent both from entering their if() bodies at the same time?
And isn't the combined effect of the two barriers to prevent just
this?
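
For reference, the two snippets above have the shape of the classic "store buffering" pattern: each side writes its own flag, issues a full barrier, then reads the other side's flag. Below is a minimal sketch in C11, assuming atomic_thread_fence(memory_order_seq_cst) as a stand-in for smp_mb(). The globals and helper names are hypothetical and only model the two flags; they are not Xen code. Under the C11 model, with a full fence on both sides, at least one of the two loads observes the other side's store, so the outcome "the vCPU gets paused and the deferral side never notices the shutdown" is excluded. Both sides seeing each other's store remains possible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for d->is_shutting_down and vcpu0->defer_shutdown. */
static atomic_int is_shutting_down;
static atomic_int defer_shutdown;

/* Reset both flags so the model can be run again. */
static void reset_flags(void)
{
    atomic_store(&is_shutting_down, 0);
    atomic_store(&defer_shutdown, 0);
}

/*
 * Models the domain_shutdown() side.
 * Returns true if it would go on to pause the vCPU.
 */
static bool shutdown_side(void)
{
    atomic_store_explicit(&is_shutting_down, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst); /* stand-in for smp_mb() */
    return atomic_load_explicit(&defer_shutdown, memory_order_relaxed) == 0;
}

/*
 * Models the vcpu_start_shutdown_deferral() side.
 * Returns true if it would call vcpu_check_shutdown().
 */
static bool deferral_side(void)
{
    atomic_store_explicit(&defer_shutdown, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst); /* stand-in for smp_mb() */
    return atomic_load_explicit(&is_shutting_down, memory_order_relaxed) != 0;
}

/*
 * The outcome the pair of fences rules out: the shutdown side pauses the
 * vCPU while the deferral side never sees the shutdown.
 */
static void assert_not_both_missed(bool paused, bool checked)
{
    assert(!(paused && !checked));
}
```

When shutdown_side() and deferral_side() run on different threads, feeding their two return values to assert_not_both_missed() should never fire. Whether the remaining outcome (pause and check both happen) is handled correctly depends on what vcpu_check_shutdown() does, which this sketch does not model.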

The code should already be able to deal with that, as vcpu_check_shutdown() will take d->shutdown_lock and then check v->paused_for_shutdown.

So I am not sure why the barriers would matter here.


I am not fully familiar with the IOREQ code, but it sounds to me like
this is not the behavior that was intended. Can someone more familiar
with the code confirm it?

As to original intentions, I'm afraid among the people still
listed as maintainers for any part of Xen it may only be Tim to
possibly have been involved in the original installation of
this model, and hence who may know of the precise intentions
and considerations back at the time.

It would be useful to know the original intentions, so I have CCed Tim.

However, I think it is more important to agree on what we want to achieve so we can decide whether the existing code is suitable.

Do you agree that we only want to shut down a domain (or pause it at an architecturally restartable boundary) when it has no I/O in flight?


As far as I'm concerned, to be honest I don't think I've ever
managed to fully convince myself of the correctness of the
model in the general case. But since it did look good enough
for x86 ...

Right, the memory model on x86 is quite simple compared to Arm :). I am pretty sure we need some sort of ordering, but I am not convinced we have the correct one in place if we want to cater for architectures with a more relaxed memory model.

Cheers,

--
Julien Grall



 

