
[Xen-devel] xen/arm: Software Step ARMv8 - PC stuck on instruction



Hello all,

I'm trying to implement single-step functionality for Xen on ARMv8 using Software Step exceptions. My problem is that after the exception is taken, the PC stays on the same instruction.

By adding a "singlestep_enabled" flag to "struct arch_domain" (based on the single-step mechanism for x86), I'm able to set the needed register bits (namely MDSCR_EL1.SS, SPSR_EL2.SS and MDCR_EL2.TDE) for each vcpu used by a given domain (referenced by its domain_id).
In "arch/arm/traps.c:leave_hypervisor_tail()", which is called when exiting the hypervisor (according to /arch/arm/arm64/entry.S), I check the singlestep_enabled flag and set the registers, so that each register is set on every VM entry. I also check that the registers are set for the correct domain and vcpu (by examining current->domain).

According to the software step state machine in the ARM Architecture Reference Manual (ARM DDI 0487B.a, page 1957), the instruction to be single-stepped is executed after an "ERET setting PSTATE.SS to 1".
For this to happen, specific conditions must be met. Table D2-22 on page 1959 defines which table applies to a given condition set (in my case: {MDSCR_EL1.SS=1, Lock=False, NS=1, TDE=1}).
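
For reference, the condition set quoted above can be written as a small predicate over raw register values, which makes it easy to assert in a debug print. This is only a sketch: the macro and function names are made up for illustration (they are not Xen identifiers), the bit positions assumed are MDSCR_EL1.SS = bit 0 and MDCR_EL2.TDE = bit 8, and the OS Double Lock and security state are passed in as plain booleans instead of being read from hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, not Xen's; bit positions as assumed in the text. */
#define STEP_MDSCR_EL1_SS  (UINT64_C(1) << 0)   /* MDSCR_EL1.SS */
#define STEP_MDCR_EL2_TDE  (UINT64_C(1) << 8)   /* MDCR_EL2.TDE */

/*
 * Returns true when {MDSCR_EL1.SS=1, Lock=False, NS=1, TDE=1} holds
 * for the given register values and state flags.
 */
bool step_condition_met(uint64_t mdscr_el1, uint64_t mdcr_el2,
                        bool os_double_lock, bool non_secure)
{
    return (mdscr_el1 & STEP_MDSCR_EL1_SS) != 0 &&
           (mdcr_el2 & STEP_MDCR_EL2_TDE) != 0 &&
           !os_double_lock && non_secure;
}
```

Called with the values printed by the debug output, for example step_condition_met(mdscr, mdcr, false, true), it reports whether both enable bits were actually set at that point.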

Because I'm routing the exception from EL1 to EL2 (in the manual's naming convention "From EL = EL2" and "Target EL = EL1", according to page 1959) with KDE=1 and PSTATE.D=1 (verified by printing MDSCR_EL1 and cpu_user_regs.cpsr), the system should copy SPSR_EL2.SS to PSTATE.SS when executing ERET.

The state machine dictates that when PSTATE.SS=1 the system is in the "active-not-pending" state and then executes the single-stepped instruction, which should advance the PC.
But because my PC stays constant, the system must instead be in the "active-pending" state.
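
The expected behaviour can be illustrated with a toy model of the three software step states. This is a simplified sketch, not a model of real hardware: the type and function names are invented for illustration, "enabled" stands in for the whole enable condition, and every instruction is assumed to be 4 bytes long. It shows that an ERET that leaves PSTATE.SS at 0 goes straight to active-pending, so the exception is taken again without the PC moving.

```c
#include <stdbool.h>
#include <stdint.h>

enum step_state { STEP_INACTIVE, STEP_ACTIVE_NOT_PENDING, STEP_ACTIVE_PENDING };

struct step_model {
    bool     enabled;    /* software step enabled (simplified to one flag) */
    bool     pstate_ss;  /* models PSTATE.SS */
    uint64_t pc;         /* models the guest PC */
};

enum step_state step_state_of(const struct step_model *m)
{
    if (!m->enabled)
        return STEP_INACTIVE;
    return m->pstate_ss ? STEP_ACTIVE_NOT_PENDING : STEP_ACTIVE_PENDING;
}

/* ERET from the debugger: PSTATE.SS takes the value of SPSR_ELx.SS. */
void step_eret(struct step_model *m, bool spsr_ss)
{
    m->pstate_ss = m->enabled && spsr_ss;
}

/* Try to run one instruction; returns true if a step exception is taken. */
bool step_run_one(struct step_model *m)
{
    switch (step_state_of(m)) {
    case STEP_ACTIVE_PENDING:
        return true;            /* exception first; PC does not advance */
    case STEP_ACTIVE_NOT_PENDING:
        m->pc += 4;             /* the stepped instruction executes */
        m->pstate_ss = false;   /* state becomes active-pending */
        return false;
    default:
        m->pc += 4;             /* stepping inactive: run normally */
        return false;
    }
}
```

In this model a correct step is step_eret(&m, true) followed by one step_run_one() that advances the PC and a second one that reports the exception. Calling step_eret(&m, false) instead reproduces the symptom described above: the exception fires on every attempt and the PC stays constant.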

By printing the PC value in the exception handler (xen/arch/arm/traps.c:do_trap_guest_sync()), I can see that the exceptions are generated (otherwise there would be no prints) and that the PC stays at the same value, which leaves the VM unable to make progress.

The following is the code that I use to set up single stepping:

--- original_xen/xen/xen/arch/arm/traps.c    2017-07-04 13:58:09.526280389 +0200
+++ xen/xen/arch/arm/traps.c    2017-07-04 13:48:48.146066332 +0200
@@ -1247,6 +1247,7 @@
 
 asmlinkage void do_trap_guest_sync(struct cpu_user_regs *regs)
 {
     const union hsr hsr = { .bits = regs->hsr };

     enter_hypervisor_head(regs);

     switch (hsr.ec) {
     case HSR_EC_WFI_WFE:
         /*
@@ -2917,6 +2931,7 @@
 #endif
+    case HSR_EC_SOFTSTEP_LOWER_EL:
+        do_trap_software_step(regs);    
+        break;
     default:
         gprintk(XENLOG_WARNING,
                 "Unknown Guest Trap. HSR=0x%x EC=0x%x IL=%x Syndrome=0x%"PRIx32"\n",

I extended the switch statement in do_trap_guest_sync() to support single stepping on ARMv8, and defined "HSR_EC_SOFTSTEP_LOWER_EL" = 0x32 in /xen/include/asm/processor.h.
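
To make the dispatch value concrete, here is a standalone sketch of how the exception class is extracted from a raw syndrome value. It assumes the usual ESR/HSR layout with EC in bits [31:26]; the EX_-prefixed names are hypothetical and chosen so as not to be confused with Xen's own definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; EC is assumed to occupy bits [31:26] of the syndrome. */
#define EX_HSR_EC_SHIFT          26
#define EX_HSR_EC_MASK           0x3fu
#define EX_EC_SOFTSTEP_LOWER_EL  0x32u

/* Extract the exception class field from a raw syndrome value. */
unsigned int ex_hsr_ec(uint32_t hsr)
{
    return (hsr >> EX_HSR_EC_SHIFT) & EX_HSR_EC_MASK;
}

/* True if the syndrome reports a Software Step exception from a lower EL. */
bool ex_is_softstep_lower_el(uint32_t hsr)
{
    return ex_hsr_ec(hsr) == EX_EC_SOFTSTEP_LOWER_EL;
}
```

Printing ex_hsr_ec(regs->hsr) next to the PC in the handler is one way to confirm that each trap really carries EC 0x32 rather than some other exception class.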

 
 
+asmlinkage void do_trap_software_step(struct cpu_user_regs *regs)
+{
+    /*inform dom0*/
+    //PC to next instruction
+    gprintk(XENLOG_ERR, "SPSR_EL2 = 0x%lx  Regs.SPSR = 0x%x\n", READ_SYSREG(SPSR_EL2) ,regs->cpsr);
+}
+
This is the handler function that is called when a Software Step exception is caught by the hypervisor (currently it just prints some information). It is also the function that lets me check whether or not the PC has advanced.
 
 asmlinkage void leave_hypervisor_tail(void)
 {
+    /* This function is called after the 'guest_entry' macro in /arch/arm64/entry.S has set the guest registers.
+    Check the singlestep_enabled flag in the domain struct here and set the needed registers.
+
+    */
+   
+    struct vcpu *v = current;
+
+    if ( unlikely(v->domain->arch.monitor.singlestep_enabled ) )
+    {
+      
+        WRITE_SYSREG(READ_SYSREG(MDCR_EL2)  | HDCR_TDE, MDCR_EL2);
+        WRITE_SYSREG(READ_SYSREG(SPSR_EL2)  | 0x200000, SPSR_EL2 );
+        WRITE_SYSREG(READ_SYSREG(MDSCR_EL1) | 0x1, MDSCR_EL1);
+
+        if (!(v->arch.single_step ))
+        {
+            gprintk(XENLOG_ERR, "Setting vcpu=%d for domain=%d\n",v->vcpu_id,v->domain->domain_id);
+           
+            gprintk(XENLOG_ERR, "[Set_singlestep] MDSCR_EL1     0x%lx\n", READ_SYSREG(MDSCR_EL1));
+            gprintk(XENLOG_ERR, "[Set_singlestep] SPSR_EL2      0x%lx\n", READ_SYSREG(SPSR_EL2));
+            gprintk(XENLOG_ERR, "[Set_singlestep] MDCR_EL2      0x%lx\n", READ_SYSREG(MDCR_EL2));
+            v->arch.single_step = 1;
+
+            return;
+        }else
+        {
+            //gprintk(XENLOG_ERR, "Register for vcpu=%d for domain=%d already set\n",v->vcpu_id,v->domain->domain_id);
+        }    
+    }

As mentioned, this function sets the needed registers. "monitor.singlestep_enabled" is the domain single-step flag, which determines whether the registers should be set. "arch.single_step" is the vcpu flag that records whether the registers have already been set once (not really in use for now). "HDCR_TDE" has the same value that "MDCR_EL2_TDE" would have, but the latter is not implemented yet, which is why I'm using HDCR_TDE. "SPSR_EL2 | 0x200000" sets the SS bit (bit 21) in SPSR_EL2 (because the exception will be taken to the hypervisor). "MDSCR_EL1 | 0x1" sets the SS enable bit.
Because I check the domain flag in this function, every vcpu of the domain that gets used is set up with the values above. This way I can ensure that each vcpu triggers these exceptions.

--- original_xen/xen/xen/arch/arm/monitor.c    2017-07-04 13:58:09.522280302 +0200
+++ xen/xen/arch/arm/monitor.c    2017-07-04 10:37:09.553642139 +0200
@@ -28,6 +28,10 @@
  
int arch_monitor_domctl_event(struct domain *d,
                             struct xen_domctl_monitor_op *mop)
 {
     struct arch_domain *ad = &d->arch;
     bool_t requested_status = (XEN_DOMCTL_MONITOR_OP_ENABLE == mop->op);
 
     switch ( mop->event )
@@ -45,6 +49,168 @@
         break;
     }
 
+    case XEN_DOMCTL_MONITOR_EVENT_SINGLESTEP:
+    {
+        /* Adapted from x86 single-stepping */
+       
+        bool_t old_status = ad->monitor.singlestep_enabled;
+
+        if ( unlikely(old_status == requested_status) )
+            return -EEXIST;
+        gprintk(XENLOG_ERR, "Setting singlestep enabled to %x\n", requested_status);
+        gprintk(XENLOG_ERR, "Number of VCPUs=%d in Domain %d\n", d->max_vcpus, d->domain_id);
+        gprintk(XENLOG_ERR, "Setting singlestep Flag for Domain=%x\n", d->domain_id);
+
+        domain_pause(d);
+        ad->monitor.singlestep_enabled = requested_status;
+        domain_unpause(d);
 
This function is called through the /tools/tests/xen-access test tool and sets the domain flag to enable single stepping.

My guess is that (with respect to the software step state machine) my implementation is missing something needed for the ERET instruction to copy the correct value to PSTATE.SS, even though Table D2-24 (page 1961) indicates that the SPSR_EL2.SS bit will be written.

I would be thankful if somebody who is familiar with the ARM debug architecture could help me find the information needed to resolve this problem.

Greetings Florian