[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] guest being crashed from viridian_start_apic_assist()


  • To: Jan Beulich <JBeulich@xxxxxxxx>
  • From: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
  • Date: Fri, 3 Jun 2016 13:31:04 +0000
  • Accept-language: en-GB, en-US
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 03 Jun 2016 13:32:38 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>
  • Thread-index: AQHRvXbr6jLXJV0/1kSoOfD8WEkQxZ/Xdc2w///q9wCAADJgoIAAAbiAgAAlYBA=
  • Thread-topic: guest being crashed from viridian_start_apic_assist()

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: 03 June 2016 14:06
> To: Paul Durrant
> Cc: xen-devel
> Subject: RE: guest being crashed from viridian_start_apic_assist()
> 
> >>> On 03.06.16 at 14:11, <Paul.Durrant@xxxxxxxxxx> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> Sent: 03 June 2016 10:59
> >> To: Paul Durrant
> >> Cc: xen-devel
> >> Subject: RE: guest being crashed from viridian_start_apic_assist()
> >>
> >> >>> On 03.06.16 at 11:39, <Paul.Durrant@xxxxxxxxxx> wrote:
> >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> >> Sent: 03 June 2016 10:04
> >> >>
> >> >> to test a guest ACPI change I went through all Windows versions that
> >> >> I have installed anywhere, to find 32-bit Win7 dying (the other
> >> >> variants I tried worked fine), albeit retrying a couple of times I then
> >> >> didn't see this a second time. Is this something you've observed
> >> >> elsewhere? The APIC assist logic isn't that complicated, and I can't
> >> >> really spot any race in there...
> >> >
> >> > How did it die?
> >>
> >> From the single domain_crash() in that function. No other information
> >> was available (apart from the register dump, which doesn't look like
> >> it would be very useful for analysis here).
> >>
> >
> > There is an inconsistency between the test to abort a previous assist, which
> > is based on:
> >
> >     irr = vlapic_find_highest_irr(vlapic);
> >
> >     isr = vlapic_find_highest_isr(vlapic);
> >     if ( (isr & 0xf0) >= (irr & 0xf0) )
> >
> > and the test to start a new one:
> >
> >     isr = vlapic_find_lowest_vector(&vlapic->regs->data[APIC_ISR]);
> >     if ( isr >= 0 && isr < vector )
> >
> > I suspect that may be the problem you hit.
> 
> I don't think so. Of the second if(), the right side could actually be
> dropped, as the logic in vlapic_has_pending_irq() guarantees
> (vector & 0xf0) > (highest_isr & 0xf0), i.e. vector > highest_isr
> and hence necessarily vector > lowest_isr. That is, if
> vlapic_has_pending_irq() finds _some_ ISR bit set, it must be one
> below vector. Which means viridian_start_apic_assist() gets called
> solely when there's _no_ other ISR bit set.
> 

Yes, that test does appear to be unnecessary.

> What I find more problematic looking at those functions (but
> unrelated to the problem here afaict) is the
> vlapic_virtual_intr_delivery_enabled() related logic and its
> interaction with the Viridian APIC assist (leaving aside nested
> mode for the moment): vlapic_has_pending_irq() won't do
> anything with the APIC assist in that case, but if force_ack is
> true in vlapic_ack_pending_irq() there will be an interaction.
> 

...and the interaction would indeed be the crash you saw, I think, because you 
could start an APIC assist when vlapic_virtual_intr_delivery_enabled() is true, 
but never complete it because vlapic_has_pending_irq() bails before doing so. 
Thus, attempting the next APIC assist would lead to a domain_crash().

  Paul

> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.