[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Single step in HVM domU on Intel machine may see wrong DB6


  • To: "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx>
  • From: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
  • Date: Tue, 04 Mar 2014 16:06:27 +0100
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>, "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>
  • Delivery-date: Tue, 04 Mar 2014 15:06:37 +0000
  • Domainkey-signature: s=s1536a; d=ts.fujitsu.com; c=nofws; q=dns; h=X-SBRSScore:X-IronPort-AV:Received:X-IronPort-AV: Received:Message-ID:Date:From:Organization:User-Agent: MIME-Version:To:CC:Subject:References:In-Reply-To: Content-Type:Content-Transfer-Encoding; b=Yn6hJE3JHZj5UNZSn2DaSJ75PCd6z0y/LC6iVNW4RtEFohjFZNfld5Ah VkhzS0dtjdT/4RAifljewtWRLhUSf8vhYf+etEzHHzaHSVi6BFOiBEbnC 41nhkX3VLlOUFLWMWsnHe0HMmpHiskaxWG5XyFe1gRcfNgCsjMAIRK+B2 dASiIcnalTXOShxBHL29hUCQtOszSUTyfR9G1UV91HJ55EjN1L6v3QXdP Tx+eWpAcvPGv+qTnkTerQfFkHNyoG;
  • List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 26.02.2014 06:15, Zhang, Yang Z wrote:
Juergen Gross wrote on 2014-02-21:
On 21.02.2014 02:26, Zhang, Yang Z wrote:
Juergen Gross wrote on 2014-02-20:
Hi,

Hi, Juergen


I think I've found a bug in debug trap handling in the Xen hypervisor
in case of a HVM domu using single stepping:

Debug registers are restored on vcpu switch only if db7 has any debug
events activated or if the debug registers are marked to be used by
the domU. This leads to problems if the domU uses single stepping and
vcpu switch occurs between the single step trap and reading of db6 in
the guest. db6 contents (single step indicator) are lost in this case.

Jan suggested to intercept the debug trap in the hypervisor and mark
the debug registers to be used by the domU to enable saving and
restoring the debug registers in case of a context switch. I used the
attached patch (applies to Xen 4.2.3) to verify this solution and it
worked (without the patch a test was able to reproduce the bug once in
about 3 hours, with the patch the test ran for more than 12 hours
without problem).

Obviously the patch isn't the final one, as I deactivated the "monitor
trap flag" feature to avoid any strange dependencies. Jan wanted
someone from the VMX folks to put together a proper fix to avoid
overlooking some corner case.


Thanks for reporting this issue.
Actually, I don't know the scenario that you saw this issue. Are you using
single step inside guest? Or running gdb to debug VM remotely?

Single step inside guest:

1. Guest sets TF flag in status loaded by IRET and does IRET
2. Debug trap in guest occurs, physical DB6 holds single step indicator
3. vcpu scheduling event occurs, debug registers are NOT saved as not marked
     being dirty and DB7 has no debug events configured
4. when guest vcpu is scheduled again, DB6 has lost single step indicator

How about the following patch. It is not tested because I don't have the 
environment.
After setting trap_debug in guest exception bitmap, the vmexit for trap_debug 
is not only used by gdb, but also will used by guest itself. In case of such 
vmexit, we need to restore the debug register and inject a trap exception into 
guest.

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index b128e81..113a313 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1171,8 +1171,6 @@ void vmx_update_debug_state(struct vcpu *v)
      unsigned long mask;

      mask = 1u << TRAP_int3;
-    if ( !cpu_has_monitor_trap_flag )
-        mask |= 1u << TRAP_debug;

      if ( v->arch.hvm_vcpu.debug_state_latch )
          v->arch.hvm_vmx.exception_bitmap |= mask;
@@ -2690,9 +2688,13 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
              __vmread(EXIT_QUALIFICATION, &exit_qualification);
              HVMTRACE_1D(TRAP_DEBUG, exit_qualification);
              write_debugreg(6, exit_qualification | 0xffff0ff0);
-            if ( !v->domain->debugger_attached || cpu_has_monitor_trap_flag )
-                goto exit_and_crash;
-            domain_pause_for_debugger();
+            if ( v->domain->debugger_attached )
+                domain_pause_for_debugger();
+            else
+            {
+                __restore_debug_registers(v);
+                hvm_inject_hw_exception(TRAP_debug, HVM_DELIVER_NO_ERROR_CODE);
+            }
              break;
          case TRAP_int3:
          {
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index dcc3483..0d76d8c 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -378,7 +378,8 @@ static inline int hvm_event_pending(struct vcpu *v)
          (cpu_has_xsave ? X86_CR4_OSXSAVE : 0))))

  /* These exceptions must always be intercepted. */
-#define HVM_TRAP_MASK ((1U << TRAP_machine_check) | (1U << TRAP_invalid_op))
+#define HVM_TRAP_MASK ((1U << TRAP_machine_check) | (1U << TRAP_invalid_op) |\
+                       (1U << TRAP_debug))

  /*
   * x86 event types. This enumeration is valid for:


BTW: I also think we should clear the CPU_BASED_MOV_DR_EXITING bit in 
__restore_debug_registers(). After restore the debug register, we should not 
trap any DR access unless the VCPU is scheduled out again. Not sure whether I 
am wrong.

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index b128e81..56a3140 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -394,6 +394,8 @@ static void __restore_debug_registers(struct vcpu *v)
      write_debugreg(3, v->arch.debugreg[3]);
      write_debugreg(6, v->arch.debugreg[6]);
      /* DR7 is loaded from the VMCS. */
+    v->arch.hvm_vmx.exec_control &= ~CPU_BASED_MOV_DR_EXITING;
+    vmx_update_cpu_exec_control(v);
  }

  /*


Okay, finally I've got a test machine and could test your patch. The problem
was no longer observed, neither with nor without removing the
CPU_BASED_MOV_DR_EXITING bit in __restore_debug_registers().

I did no test to verify any other functionality regarding debug registers.


Juergen

--
Juergen Gross                 Principal Developer Operating Systems
PBG PDG ES&S SWE OS6                   Telephone: +49 (0) 89 62060 2932
Fujitsu                                   e-mail: juergen.gross@xxxxxxxxxxxxxx
Mies-van-der-Rohe-Str. 8                Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.