Re: [Xen-devel] [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
> From: Xuquan (Euler) [mailto:xuquan8@xxxxxxxxxx]
> Sent: Friday, September 09, 2016 11:02 AM
>
> On August 30, 2016 1:58 PM, Tian Kevin < kevin.tian@xxxxxxxxx > wrote:
> >> From: Xuquan (Euler) [mailto:xuquan8@xxxxxxxxxx]
> >> Sent: Friday, August 19, 2016 8:59 PM
> >>
> >> When Xen apicv is enabled, wall clock time runs fast on a Windows7-32
> >> guest under high load (with 2 vCPUs; captured from xentrace: under high
> >> load, the count of IPI interrupts between these vCPUs increases rapidly).
> >>
> >> If an IPI interrupt (vector 0xe1) and a periodic timer interrupt (vector
> >> 0xd1) are both pending (their bits set in VIRR), unfortunately the IPI
> >> interrupt has higher priority than the periodic timer interrupt. Xen
> >> updates the IPI interrupt bit set in VIRR into the guest interrupt status
> >> (RVI) as the highest priority, and apicv (virtual-interrupt delivery)
> >> delivers the IPI interrupt within VMX non-root operation without a VM
> >> exit. Still within VMX non-root operation, once the periodic timer
> >> interrupt bit in VIRR becomes the highest set bit, apicv delivers the
> >> periodic timer interrupt without a VM exit as well.
> >>
> >> But in the current code, if Xen doesn't update the periodic timer
> >> interrupt bit set in VIRR into the guest interrupt status (RVI) directly,
> >> Xen is not aware of this case and does not decrease the count
> >> (pending_intr_nr) of pending periodic timer interrupts, so Xen will
> >> deliver the periodic timer interrupt again. The guest receives extra
> >> periodic timer interrupts.
> >>
> >> If the periodic timer interrupt is delivered but is not the highest
> >> priority, make Xen aware of this case so that it decreases the count of
> >> pending periodic timer interrupts.
> >>
> >> Signed-off-by: Yifei Jiang <jiangyifei@xxxxxxxxxx>
> >> Signed-off-by: Rongguang He <herongguang.he@xxxxxxxxxx>
> >> Signed-off-by: Quan Xu <xuquan8@xxxxxxxxxx>
> >> ---
> >> Why RFC:
> >> 1. I am not quite sure about other cases, such as the nested case.
> >> 2. Within VMX non-root operation, an Asynchronous Enclave Exit (including
> >>    external interrupts, non-maskable interrupts, system-management
> >>    interrupts, exceptions and VM exits) may occur before delivery of a
> >>    periodic timer interrupt; that periodic timer interrupt may then be
> >>    lost when the next periodic timer interrupt is delivered. Actually,
> >>    the current code behaves this way as well.
> >> ---
> >>  xen/arch/x86/hvm/vmx/intr.c | 16 +++++++++++++++-
> >>  1 file changed, 15 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
> >> index 8fca08c..d3a034e 100644
> >> --- a/xen/arch/x86/hvm/vmx/intr.c
> >> +++ b/xen/arch/x86/hvm/vmx/intr.c
> >> @@ -334,7 +334,21 @@ void vmx_intr_assist(void)
> >>              __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm_vmx.eoi_exit_bitmap[i]);
> >>          }
> >>
> >> -        pt_intr_post(v, intack);
> >> +        /*
> >> +         * If the periodic timer interrupt is delivered but is not the
> >> +         * highest priority, make Xen aware of this case so that it
> >> +         * decreases the count of pending periodic timer interrupts.
> >> +         */
> >> +        if ( pt_vector != -1 && intack.vector > pt_vector )
> >> +        {
> >> +            struct hvm_intack pt_intack;
> >> +
> >> +            pt_intack.vector = pt_vector;
> >> +            pt_intack.source = hvm_intsrc_lapic;
> >> +            pt_intr_post(v, pt_intack);
> >> +        }
> >> +        else
> >> +            pt_intr_post(v, intack);
> >>      }
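To make the accounting problem described in the quoted patch concrete, here is
a minimal, self-contained model in plain C. It is not the real Xen code:
virr[], hw_deliver_all() and the bare pending_intr_nr counter below are
invented stand-ins for VIRR, virtual-interrupt delivery and the per-timer
pending count; only the vectors 0xd1/0xe1 and the "post only when the timer
vector is the one being injected" behaviour come from the thread.

    /* Minimal model of the pre-patch accounting bug -- not the real Xen code. */
    #include <stdbool.h>
    #include <stdio.h>

    #define PT_VECTOR  0xd1          /* periodic timer, lower priority   */
    #define IPI_VECTOR 0xe1          /* IPI, higher vector wins          */

    static bool virr[256];           /* modelled virtual IRR             */
    static int  pending_intr_nr = 1; /* one timer tick awaiting "post"   */

    /* Virtual-interrupt delivery: hardware keeps taking the highest set
     * VIRR bit inside non-root operation, without any VM exit, so Xen
     * never observes these deliveries. */
    static void hw_deliver_all(void)
    {
        for ( int v = 255; v >= 0; v-- )
            if ( virr[v] )
            {
                virr[v] = false;
                printf("guest receives vector 0x%x (no VM exit)\n", v);
            }
    }

    int main(void)
    {
        virr[PT_VECTOR] = virr[IPI_VECTOR] = true;

        /* Pre-patch behaviour: the pending count is only decremented when
         * the timer vector is the one Xen is about to inject, i.e. the
         * highest pending vector -- here it is not. */
        int intack_vector = IPI_VECTOR;
        if ( intack_vector == PT_VECTOR )
            pending_intr_nr--;       /* the pt_intr_post() step          */

        hw_deliver_all();            /* guest got BOTH 0xe1 and 0xd1     */

        /* Still 1, so the timer tick will be injected again later and the
         * guest sees an extra tick -> wall clock runs fast. */
        printf("pending_intr_nr = %d (should be 0)\n", pending_intr_nr);
        return 0;
    }

The quoted patch closes exactly this gap by also posting the timer vector when
a higher-priority vector is the one actually being injected.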
> >
> > Here is my thought. W/o the above patch, one pending pt interrupt may
> > result in multiple injections of the guest timer interrupt, as you
> > identified, when pt_vector happens not to be the highest pending vector.
> > Then w/ your patch, multiple pending pt interrupt instances may be
> > combined into one injection of the guest timer interrupt. Especially,
> > intack.vector > pt_vector doesn't mean pt_intr is eligible for injection;
> > it might be blocked for other reasons. As Jan pointed out, this path is
> > very fragile, so better to have a fix that sustains the original
> > behavior: one pending pt instance causes one actual injection.
> >
> > What about doing pt_intr_post in the EOI-induced VM exit instead? The EOI
> > exit bit for pt_intr is already set to 1, which means every injection
> > would incur an EOI-induced VM exit:
> >
> >     /*
> >      * Set eoi_exit_bitmap for periodic timer interrupt to cause
> >      * EOI-induced VM exit, then pending periodic time interrupts have
> >      * the chance to be injected for compensation
> >      */
> >     if ( pt_vector != -1 )
> >         vmx_set_eoi_exit_bitmap(v, pt_vector);
> >
> > I don't think delaying the post makes much difference here. Even if we
> > post earlier, as in your patch, further pending instances have to wait
> > until the current instance is completed. So as long as the post happens
> > before the EOI is completed, it should be fine.
>
> Kevin, I verified your suggestion with the modification below. Wall clock
> time is __still__ faster. I hope this modification correctly reflects your
> suggestion.
>
> I __guess__ that VM entries are more frequent than PT interrupts.
> Specifically, if there is a PT interrupt pending, the count
> (pending_intr_nr) is 1.
>
> Before the next PT interrupt arrives, each PT interrupt injection may not
> directly incur an EOI-induced VM exit; multiple PT interrupts are injected
> for one pending PT interrupt, then multiple EOI-induced VM exits occur,
> each calling pt_intr_post() to decrease the count (pending_intr_nr). But
> the count (pending_intr_nr) is still 1 before the next PT interrupt
> arrives, so only one pt_intr_post() is effective.

I don't quite understand your description here, but just for your patch, see
below...

> Thanks for your time!!
> Quan
>
> ======= modification ========
>
> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> index 1d5d287..cc247c3 100644
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -433,6 +433,11 @@ void vlapic_EOI_set(struct vlapic *vlapic)
>  void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
>  {
>      struct domain *d = vlapic_domain(vlapic);
> +    struct hvm_intack pt_intack;
> +
> +    pt_intack.vector = vector;
> +    pt_intack.source = hvm_intsrc_lapic;
> +    pt_intr_post(vlapic_vcpu(vlapic), pt_intack);
>
>      if ( vlapic_test_and_clear_vector(vector, &vlapic->regs->data[APIC_TMR]) )
>          vioapic_update_EOI(d, vector);
> diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
> index 8fca08c..29d9bbf 100644
> --- a/xen/arch/x86/hvm/vmx/intr.c
> +++ b/xen/arch/x86/hvm/vmx/intr.c
> @@ -333,8 +333,6 @@ void vmx_intr_assist(void)
>              clear_bit(i, &v->arch.hvm_vmx.eoi_exitmap_changed);
>              __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm_vmx.eoi_exit_bitmap[i]);
>          }
> -
> -        pt_intr_post(v, intack);
>      }
>      else
>      {

Because we update the pt irq on every VM entry, there is a chance that an
already-injected instance (before the EOI-induced exit happens) will cause
another pending IRR bit to be set, if a VM exit happens between the hardware
virtual interrupt injection (vIRR->0, vISR->1) and the EOI-induced exit
(vISR->0), since pt_intr_post hasn't been invoked yet. I guess this is the
reason why you still see a faster wallclock. I think you need to mark this
pending_intr_post situation explicitly.
Then pt_update_irq should skip such a pt timer when pending_intr_post of that
timer is true (otherwise the update is meaningless, since the previous one
hasn't been posted yet). Then, with your change to post in the EOI-induced
exit handler, it should work correctly to meet the goal - one virtual
interrupt delivery for one pending pt intr...

Thanks
Kevin
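A rough sketch of the pending_intr_post idea suggested above, again as a
simplified, standalone C model rather than the real Xen vpt code; struct
model_pt and the model_* functions below are hypothetical stand-ins for
struct periodic_time, pt_update_irq() and pt_intr_post().

    /* Simplified model of the suggested per-timer flag -- not actual Xen code. */
    #include <stdbool.h>
    #include <stdio.h>

    struct model_pt {
        int  vector;            /* guest vector of the periodic timer          */
        int  pending_intr_nr;   /* ticks generated but not yet accounted for   */
        bool pending_intr_post; /* injected, waiting for the EOI-induced exit  */
    };

    /* Called on every VM entry: decide whether to (re)raise the timer vector. */
    static int model_pt_update_irq(struct model_pt *pt)
    {
        /* Skip the timer while a previous injection is still in flight;
         * re-raising it here is what turned one tick into several. */
        if ( pt->pending_intr_post || pt->pending_intr_nr == 0 )
            return -1;

        pt->pending_intr_post = true;   /* mark: one instance now in flight */
        return pt->vector;              /* caller sets this bit in VIRR     */
    }

    /* Called from the EOI-induced VM exit handler for pt->vector. */
    static void model_pt_intr_post(struct model_pt *pt)
    {
        pt->pending_intr_nr--;          /* exactly one tick accounted for   */
        pt->pending_intr_post = false;  /* the next tick may be injected    */
    }

    int main(void)
    {
        struct model_pt pt = { .vector = 0xd1, .pending_intr_nr = 1 };

        /* Several VM entries can happen before the guest EOIs the vector;
         * only the first one injects, the others are skipped. */
        printf("entry 1: vector %d\n", model_pt_update_irq(&pt)); /* 209 (0xd1)  */
        printf("entry 2: vector %d\n", model_pt_update_irq(&pt)); /* -1, skipped */
        printf("entry 3: vector %d\n", model_pt_update_irq(&pt)); /* -1, skipped */

        model_pt_intr_post(&pt);        /* guest EOI -> EOI-induced VM exit */
        printf("pending_intr_nr = %d\n", pt.pending_intr_nr);     /* 0      */
        return 0;
    }

With such a flag, however many VM entries occur while an instance is in
flight, each pending tick is injected once and posted once, which matches the
one-delivery-per-pending-pt-instance goal stated above.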