Re: [Xen-devel] [PATCH v8 03/10] xen/arm: inflight irqs during migration



On Thu, 24 Jul 2014, Ian Campbell wrote:
> On Thu, 2014-07-24 at 15:48 +0100, Stefano Stabellini wrote:
> > On Wed, 23 Jul 2014, Ian Campbell wrote:
> > > On Wed, 2014-07-23 at 15:45 +0100, Stefano Stabellini wrote:
> > > > On Thu, 17 Jul 2014, Ian Campbell wrote:
> > > > > On Thu, 2014-07-10 at 19:13 +0100, Stefano Stabellini wrote:
> > > > > > We need to take special care when migrating irqs that are already
> > > > > > inflight from one vcpu to another. See "The effect of changes to an
> > > > > > GICD_ITARGETSR", part of chapter 4.3.12 of the ARM Generic Interrupt
> > > > > > Controller Architecture Specification.
> > > > > > 
> > > > > > The main issue from the Xen point of view is that the lr_pending
> > > > > > and inflight lists are per-vcpu. The lock we take to protect them
> > > > > > is also per-vcpu.
> > > > > > 
> > > > > > In order to avoid issues, if the irq is still lr_pending, we can
> > > > > > immediately move it to the new vcpu for injection.
> > > > > > 
> > > > > > Otherwise if it is in a GICH_LR register, set a new flag
> > > > > > GIC_IRQ_GUEST_MIGRATING, so that we can recognize when we receive
> > > > > > an irq while the previous one is still inflight (given that we are
> > > > > > only dealing with hardware interrupts here, it just means that its
> > > > > > LR hasn't been cleared yet on the old vcpu).  If
> > > > > > GIC_IRQ_GUEST_MIGRATING is set, we only set GIC_IRQ_GUEST_QUEUED
> > > > > > and interrupt the old vcpu. To know which one is the old vcpu, we
> > > > > > introduce a new field to pending_irq, called vcpu_migrate_from.
> > > > > > When clearing the LR on the old vcpu, we take special care of
> > > > > > injecting the interrupt into the new vcpu. To do that we need to
> > > > > > release the old vcpu lock before taking the new vcpu lock.
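
Spelled out, the scheme described above amounts to something like the
following. This is only a simplified, self-contained sketch reusing the
names from the commit message; the structs and helpers are stand-ins,
not the patch itself, and locking and the list handling are elided.

/*
 * Simplified model of the migration scheme.  Flag values and helper
 * names are illustrative only.
 */
#include <stdbool.h>
#include <stdio.h>

#define GIC_IRQ_GUEST_QUEUED     (1u << 0)   /* flag values illustrative */
#define GIC_IRQ_GUEST_MIGRATING  (1u << 1)

struct vcpu { int id; };

struct pending_irq {
    unsigned int status;             /* GIC_IRQ_GUEST_* flags */
    bool lr_pending;                 /* queued but not yet in a GICH_LR */
    struct vcpu *vcpu_migrate_from;  /* old vcpu, valid while migrating */
};

/* Stand-in for moving the irq to the new vcpu's lr_pending list and
 * kicking it so the irq gets injected there. */
void requeue_on_new_vcpu(struct pending_irq *p, struct vcpu *new)
{
    printf("irq re-queued on vcpu%d\n", new->id);
}

/* Stand-in for interrupting a vcpu so that it exits to Xen and its
 * LRs get cleared on hypervisor entry. */
void kick_vcpu(struct vcpu *v)
{
    printf("kicking vcpu%d\n", v->id);
}

/* The guest retargets, via GICD_ITARGETSR, an irq that is currently
 * inflight on 'old' towards 'new'. */
void migrate_inflight_irq(struct pending_irq *p,
                          struct vcpu *old, struct vcpu *new)
{
    if ( p->lr_pending )
    {
        /* Not in a GICH_LR yet: safe to move it to the new vcpu now. */
        requeue_on_new_vcpu(p, new);
        return;
    }

    /* Already in a GICH_LR on the old vcpu: remember where it came
     * from and finish the move when that LR is cleared. */
    p->status |= GIC_IRQ_GUEST_MIGRATING;
    p->vcpu_migrate_from = old;
}

/* The same hardware irq fires again before the old LR has been
 * cleared: only mark it QUEUED and interrupt the old vcpu. */
void inject_while_migrating(struct pending_irq *p)
{
    p->status |= GIC_IRQ_GUEST_QUEUED;
    kick_vcpu(p->vcpu_migrate_from);
}
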
> > > > > 
> > > > > I still think this is an awful lot of complexity and scaffolding
> > > > > for something which is rare on the scale of things and which could
> > > > > be almost trivially handled by requesting a maintenance interrupt
> > > > > for one EOI and completing the move at that point.
> > > > 
> > > > Requesting a maintenance interrupt is not as simple as it looks:
> > > > - ATM we don't know how to edit a live GICH_LR register; we would have
> > > > to add a function for that;
> > > 
> > > That doesn't sound like a great hardship. Perhaps you can reuse the
> > > setter function anyhow.
> > > 
> > > > - if we request a maintenance interrupt then we also need to EOI the
> > > > physical IRQ, which is something we don't do anymore (unless
> > > > PLATFORM_QUIRK_GUEST_PIRQ_NEED_EOI but that is another matter). We would
> > > > need to understand that some physical irqs need to be EOI'ed by Xen and
> > > > some don't.
> > > 
> > > I was thinking the maintenance interrupt handler would take care of
> > > this.
> > 
> > In that case we would have to resurrect the code to loop over the
> > GICH_EISR* registers from maintenance_interrupt.
> > Anything can be done, I am just pointing out that this alternative
> > approach is not as cheap as it might sound.
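
Roughly, resurrecting that loop would look something like this (the
GICH accessors and nr_lrs below are illustrative stand-ins, not Xen's
actual interface):

#include <stdint.h>

extern uint32_t gich_read_eisr0(void);   /* GICH_EISR0: status of LRs  0..31 */
extern uint32_t gich_read_eisr1(void);   /* GICH_EISR1: status of LRs 32..63 */
extern unsigned int nr_lrs;

void maintenance_interrupt_sketch(void)
{
    uint64_t eisr = gich_read_eisr0() |
                    ((uint64_t)gich_read_eisr1() << 32);
    unsigned int lr;

    for ( lr = 0; lr < nr_lrs; lr++ )
    {
        if ( !(eisr & (1ULL << lr)) )
            continue;

        /* The guest has EOIed the virtual irq held in this LR: this is
         * where Xen would clear the LR, EOI the physical irq when Xen
         * still owns that duty, and complete a pending migration. */
    }
}
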
> 
> It's simple though, that's the benefit.
> 
> > > > Also requesting a maintenance interrupt would only guarantee that the
> > > > vcpu is interrupted as soon as possible, but it won't save us from
> > > > having to introduce GIC_IRQ_GUEST_MIGRATING.
> > > 
> > > I didn't expect GIC_IRQ_GUEST_MIGRATING to go away. If nothing else you
> > > would need it to flag to the maintenance IRQ that it needs to EOI
> > > +complete the migration.
> > > 
> > > >  It would only let us skip
> > > > adding vcpu_migrate_from and the 5 lines of code in
> > > > vgic_vcpu_inject_irq.
> > > 
> > > And the code in gic_update_one_lr I think, and most of
> > > vgic_vcpu_inject_irq.
> > > And more than the raw lines of code the
> > > *complexity* would be much lower.
> > 
> > I don't know about the complexity. One thing is to completely get rid of
> > maintenance interrupts. Another is to get rid of them in most cases but
> > not all. Having to deal both with not having them and with having them
> > increases complexity, at least in my view. It is simpler to think that you
> > have them all the time or never.
> 
> The way I view it is that the maintenance interrupt path is the dumb and
> obvious one which is always there and can always be used and doesn't
> need thinking about. Then the other optimisations are finding ways to
> avoid actually using it, but can always fall back to the dumb way if
> something too complex to deal with occurs.
> 
> > In any case replying to this email made me realize that there is indeed
> > a lot of unneeded code in this patch, especially given that writing to
> > the physical ITARGETSR is guaranteed to affect pending (non active)
> > irqs.  From the ARM ARM:
> > 
> > "Software can write to an GICD_ITARGETSR at any time. Any change to a CPU
> > targets field value:
> > 
> > [...]
> > 
> > Has an effect on any pending interrupts. This means:
> >  • adding a CPU interface to the target list of a pending interrupt makes
> >    that interrupt pending on that CPU interface
> >  • removing a CPU interface from the target list of a pending interrupt
> >    removes the pending state of that interrupt on that CPU interface."
> > 
> > 
> > I think we can rely on this behaviour. Thanks to patch #5 we know that
> > we'll be receiving the second physical irq on the old cpu and from then
> > on the next ones always on the new cpu. So we won't need
> > vcpu_migrate_from, the complex ordering of MIGRATING and QUEUED, or the
> > maintenance_interrupt.
> 
> That would be good to avoid all that for sure and would certainly impact
> my opinion of the complexity cost of this stuff.
> 
> Are you sure about the second physical IRQ always hitting on the source
> pCPU though? I'm unclear about where the physical ITARGETSR gets written
> in the scheme you are proposing.

It gets written right away if there are no inflight irqs. Otherwise it
gets written when clearing the LRs. That's why we are sure it is going
to hit the old cpu. If the vcpu gets descheduled after EOIing the irq,
that is also fine because Xen is going to clear the LRs on hypervisor
entry.
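
To make that concrete, the flow I have in mind is roughly the
following. It is a deliberately simplified sketch with invented helper
names and no locking, only meant to show when the physical
GICD_ITARGETSR gets written:

#include <stdbool.h>

struct vcpu { int id; };

struct pending_irq {
    unsigned int irq;
    bool inflight;           /* currently sitting in a GICH_LR */
    bool migrating;          /* GIC_IRQ_GUEST_MIGRATING in the real code */
    struct vcpu *new_target; /* vcpu the guest retargeted the irq to */
};

/* Stand-in for writing the physical GICD_ITARGETSR so that further
 * interrupts are delivered to the pcpu running 'v'. */
void irq_set_affinity_phys(unsigned int irq, struct vcpu *v)
{
}

/* Guest write to the virtual GICD_ITARGETSR. */
void vgic_retarget(struct pending_irq *p, struct vcpu *new)
{
    if ( !p->inflight )
    {
        /* No inflight irq: retarget the physical irq right away. */
        irq_set_affinity_phys(p->irq, new);
        return;
    }

    /* Inflight on the old vcpu: defer the physical write until its LR
     * is cleared, so the second physical irq still hits the old pcpu. */
    p->migrating = true;
    p->new_target = new;
}

/* Called while clearing a GICH_LR, either when a new irq needs the LR
 * or simply on hypervisor entry; that is why a vcpu being descheduled
 * right after EOIing the irq is not a problem. */
void gic_clear_one_lr(struct pending_irq *p)
{
    p->inflight = false;

    if ( p->migrating )
    {
        /* Complete the migration: from now on the physical irq hits
         * the new pcpu, and the GIC itself moves any still-pending
         * (non active) irq over, as per the spec text quoted above. */
        p->migrating = false;
        irq_set_affinity_phys(p->irq, p->new_target);
    }
}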
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

