
Re: [Xen-devel] xen/arm: Software Step ARMv8 - PC stuck on instruction



Hello,

I was just testing the single-step implementation and realized that the previously mentioned solution does not fully work. I'm still trying to enable software step (SS) for a VM on Xen.
To test my implementation I wrote a small kernel module and started it in the DomU. The module only contains a loop that increments a counter and prints its value.
Right after loading the module, I start the single-step mechanism in Dom0 for the VM (again with xen-access).
As soon as I start SS, the VM stops working.

In the SS handler I print the program counter ("cpu_user_regs->pc"). From the output below I can see that each instruction address is reported twice:

(XEN) d1v0 do_trap_software_step PC =  0xffff000008081a80
(XEN) d1v0 do_trap_software_step PC =  0xffff000008081a80
(XEN) d1v0 do_trap_software_step PC =  0xffff000008082700
(XEN) d1v0 do_trap_software_step PC =  0xffff000008082700
(XEN) d1v0 do_trap_software_step PC =  0xffff000008082704
(XEN) d1v0 do_trap_software_step PC =  0xffff000008082704
(XEN) d1v0 do_trap_software_step PC =  0xffff000008082708
(XEN) printk: 119614 messages suppressed.
(XEN) d1v0 do_trap_software_step PC =  0xffff0000088cbd6c
(XEN) printk: 120131 messages suppressed.
(XEN) d1v0 do_trap_software_step PC =  0xffff0000088cbd64
(XEN) printk: 120255 messages suppressed.
(XEN) d1v0 do_trap_software_step PC =  0xffff0000088cbd64

The single step handler "do_trap_software_step" is called from (file is /arch/arm/arm64/entry.S): hyp_traps_vector (VBAR_EL2)->guest_sync->do_trap_guest_sync->do_trap_software_step

The ARM ARM (D2-1956 - ARM DDI 0487B.a ID033117) states that, in order to enable software step:

The debugger sets MDSCR_EL1.SS = 1
Executes an ERET

The PE executes the instruction to be single-stepped
Takes a software step exception on the next instruction
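
To make the steps above easier to reason about, here is a small stand-alone C model of the architectural software step states (inactive, active-not-pending, active-pending). It is only a sketch: the struct, enum and function names are made up for illustration and are not Xen code, and it assumes a fixed 4-byte, non-branching instruction. It shows that an ERET with MDSCR_EL1.SS = 1 but SPSR.SS = 0 takes the step exception before the instruction executes, so the same PC would be reported again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of one vCPU. All names here are illustrative, not Xen's. */
struct step_model {
    bool     mdscr_ss;   /* models MDSCR_EL1.SS */
    bool     pstate_ss;  /* models PSTATE.SS, loaded from SPSR.SS on ERET */
    uint64_t pc;         /* models the guest program counter */
};

enum step_result {
    RAN_WITHOUT_STEP,    /* stepping inactive: instruction ran, no exception */
    STEPPED_ONE,         /* one instruction ran, then a step exception */
    TRAPPED_BEFORE_EXEC  /* step exception taken, nothing executed */
};

/*
 * Model an ERET into the guest with the given SPSR.SS value and report
 * what happens next. Assumes a 4-byte instruction that does not branch.
 */
static enum step_result eret_to_guest(struct step_model *m, bool spsr_ss)
{
    assert(m != NULL);
    m->pstate_ss = spsr_ss;

    if (!m->mdscr_ss) {
        /* Inactive state: the guest simply runs. */
        m->pc += 4;
        return RAN_WITHOUT_STEP;
    }

    if (m->pstate_ss) {
        /* Active-not-pending: execute one instruction, clear PSTATE.SS. */
        m->pc += 4;
        m->pstate_ss = false;
        return STEPPED_ONE;
    }

    /* Active-pending: the exception is taken before the instruction runs. */
    return TRAPPED_BEFORE_EXEC;
}
```

In this model, calling eret_to_guest(&m, true) advances the PC by one instruction, while a following eret_to_guest(&m, false) leaves the PC unchanged and traps again. Whether this is what actually happens in the trace above is not established here.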

 
As mentioned, I set the needed registers (including MDSCR_EL1) every time the "leave_hypervisor_tail" function is called. This function is called from within the "exit" macro in "/arch/arm/arm64/entry.S", which runs on every exception return, including the return from the "guest_sync" exception.

Right after "leave_hypervisor_tail", the ERET instruction is executed within the "return_from_trap" macro.

Because of the prints in the single step handler I can confirm that the software step exceptions are generated and correctly routed to the hypervisor.
Yet I can't figure out why the PC has the same value twice and why the VM stops working.

My guess is that by setting the needed SS registers every time we return to the guest, the configuration does not allow the guest to execute the instruction that is to be single-stepped.
Before executing the (first) instruction the VM generates the SS exception (as desired). In the hypervisor we set the SS registers again, which could prevent the VM from executing the instruction (which we want it to do, because we already took an SS exception for this instruction) and instead generate a second SS exception for it. This would lead to the second PC print in the single step handler.

But I'm not able to find any proof for this.

If I use the software step exception for only one instruction and disable it right afterwards (from within xen-access with a VM_EVENT), the VM works without problems.

Any help in finding the missing step needed to enable VM single stepping would be appreciated.

Greetings Florian


2017-07-05 16:03 GMT+02:00 Florian Jakobsmeier <florian.jakobsmeier@xxxxxxxxxxxxxx>:

2017-07-04 20:37 GMT+02:00 Julien Grall <julien.grall@xxxxxxx>:

On 07/04/2017 01:30 PM, Florian Jakobsmeier wrote:
Hello all,

Hi Florian,


      asmlinkage void leave_hypervisor_tail(void)
      {
    +    /*
    +     * This method is called after the 'guest_entry' macro in
    +     * /arch/arm64/entry.S has set the guest registers.
    +     * Check the single_step_enabled flag in the domain struct here
    +     * and set the needed registers.
    +     */
    +
    +    struct vcpu *v = current;
    +
    +    if ( unlikely(v->domain->arch.monitor.singlestep_enabled ) )
    +    {
    +
    +        WRITE_SYSREG(READ_SYSREG(MDCR_EL2)  | HDCR_TDE, MDCR_EL2);
    +        WRITE_SYSREG(READ_SYSREG(SPSR_EL2)  | 0x200000, SPSR_EL2 );
    +        WRITE_SYSREG(READ_SYSREG(MDSCR_EL1) | 0x1, MDSCR_EL1);
    +
    +        if (!(v->arch.single_step ))
    +        {
    +            gprintk(XENLOG_ERR, "Setting vcpu=%d for domain=%d\n",v->vcpu_id,v->domain->domain_id);
    +
    +            gprintk(XENLOG_ERR, "[Set_singlestep] MDSCR_EL1        0x%lx\n", READ_SYSREG(MDSCR_EL1));
    +            gprintk(XENLOG_ERR, "[Set_singlestep] SPSR_EL2         0x%lx\n", READ_SYSREG(SPSR_EL2));
    +            gprintk(XENLOG_ERR, "[Set_singlestep] MDCR_EL2         0x%lx\n", READ_SYSREG(MDCR_EL2));
    +            v->arch.single_step = 1;
    +
    +            return;
    +        }else
    +        {
    +            //gprintk(XENLOG_ERR, "Register for vcpu=%d for domain=%d already set\n",v->vcpu_id,v->domain->domain_id);
    +        }
    +    }


As mentioned, this function sets the needed registers. "monitor.singlestep_enabled" is the domain SS flag used to determine whether the registers should be set. "arch.single_step" is the vcpu flag that records whether the registers were already set once (not really in use for now). "HDCR_TDE" has the same value as "MDCR_EL2_TDE" would have, but the latter is not implemented yet, which is why I'm using HDCR_TDE. "SPSR_EL2 | 0x200000" sets the SS bit in SPSR_EL2 (because our exception will be taken to the hypervisor). "MDSCR_EL1 | 0x1" sets the SS enable bit.
Because I check the domain flag in this function, every vcpu that is used gets the values above. This way I can ensure that each vcpu triggers these exceptions.
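
As a reading aid, the three magic values used in the patch can be written as named bit definitions. The following is a stand-alone sketch, not Xen code: the SKETCH_* macro names, the struct and the helper are hypothetical, and the function only computes register values on plain integers instead of touching system registers. The bit positions (MDSCR_EL1.SS = bit 0, SPSR.SS = bit 21, MDCR_EL2.TDE = bit 8) follow the ARMv8-A register descriptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the bits used in the patch above. */
#define SKETCH_MDSCR_EL1_SS  (UINT64_C(1) << 0)   /* software step enable */
#define SKETCH_SPSR_SS       (UINT64_C(1) << 21)  /* step bit restored on ERET */
#define SKETCH_MDCR_EL2_TDE  (UINT64_C(1) << 8)   /* route debug exceptions to EL2 */

/* The literal in the patch and the named bit are the same value. */
static_assert(SKETCH_SPSR_SS == 0x200000, "SPSR.SS is bit 21");

/* Plain-integer stand-ins for the three registers involved. */
struct step_regs {
    uint64_t mdcr_el2;
    uint64_t spsr;
    uint64_t mdscr_el1;
};

/* Return a copy of r with the three software step bits set. */
static struct step_regs with_step_enabled(struct step_regs r)
{
    r.mdcr_el2  |= SKETCH_MDCR_EL2_TDE;
    r.spsr      |= SKETCH_SPSR_SS;
    r.mdscr_el1 |= SKETCH_MDSCR_EL1_SS;
    return r;
}
```

Starting from all-zero values, with_step_enabled() yields 0x100, 0x200000 and 0x1 for the three fields, which matches the masks used in the patch.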

SPSR_EL2 is saved/restored on entry and exit of a trap to the hypervisor (see arch/arm/arm*/entry.S). So the value you wrote in the register is overridden afterwards.

If you want to set the SS bit, you need to do it in the saved cpsr register. You can access it using:

guest_cpu_user_regs()->cpsr |= 0x200000;
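
The difference between writing the system register directly and writing the saved copy can be illustrated with a small stand-alone C model. This is a sketch under assumptions: mock_cpu, mock_frame and mock_exit_to_guest are invented names that only imitate the "restore SPSR_EL2 from the saved frame on exit" behaviour described above; they are not the real Xen structures or entry code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PSR_SS (UINT64_C(1) << 21)  /* same value as 0x200000 */

/* Hypothetical stand-in for the saved guest registers on the stack. */
struct mock_frame {
    uint64_t cpsr;
};

/* Hypothetical stand-in for one CPU: live SPSR_EL2 plus the saved frame. */
struct mock_cpu {
    uint64_t          spsr_el2;
    struct mock_frame frame;
};

/* Imitates the exit path: the live register is reloaded from the frame. */
static void mock_exit_to_guest(struct mock_cpu *c)
{
    assert(c != NULL);
    c->spsr_el2 = c->frame.cpsr;
}

/* Set SS in the live register only; the exit path then overwrites it. */
static uint64_t ss_via_sysreg(struct mock_cpu c)
{
    c.spsr_el2 |= SKETCH_PSR_SS;
    mock_exit_to_guest(&c);
    return c.spsr_el2;
}

/* Set SS in the saved frame; the exit path carries it into the register. */
static uint64_t ss_via_frame(struct mock_cpu c)
{
    c.frame.cpsr |= SKETCH_PSR_SS;
    mock_exit_to_guest(&c);
    return c.spsr_el2;
}
```

In this model the value returned by ss_via_sysreg() has the SS bit clear, while the value returned by ss_via_frame() has it set, which mirrors why the write has to go through the saved cpsr.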

This solved the problem. Thank you
 
Cheers,

--
Julien Grall

Greetings
Florian

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

