
[Xen-devel] X86 emulation under HAP/EPT



For my research, I need to run an SMP HVM guest in log-dirty mode.
After the first log-dirty fault, instead of making the page r/w, I need
to log the next 128 reads and writes by the vcpus. After logging that
many accesses, I set the page to r/w, as usual log-dirty mode does.
Basically, the page access changes to p2m_access_n after the first
log-dirty fault and is then reverted to p2m->default_access after 128
accesses.
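
The per-page bookkeeping described above can be sketched in isolation.
This is only a minimal model, not Xen code: struct tracked_page,
on_tracked_fault(), and the ACCESS_* values are hypothetical stand-ins
for the page structure, the fault path, and p2m_access_n /
p2m->default_access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EMUL_LIMIT 128  /* accesses to log before restoring default access */

/* Hypothetical stand-ins for p2m_access_n and p2m->default_access. */
enum page_access { ACCESS_DEFAULT, ACCESS_NONE };

struct tracked_page {
    enum page_access access;
    unsigned int emul_count;  /* accesses logged so far */
};

/*
 * Called on each fault against a tracked page.
 * Returns true if the access should be emulated and logged (the page
 * stays inaccessible so the next access traps again), or false once the
 * budget is spent and the page is handed back with default access.
 */
static bool on_tracked_fault(struct tracked_page *pg)
{
    assert(pg != NULL);

    if (pg->emul_count >= EMUL_LIMIT) {
        pg->access = ACCESS_DEFAULT;
        return false;
    }

    pg->access = ACCESS_NONE;
    pg->emul_count++;
    return true;
}
```

In this model the counter lives with the page, so faults from different
vcpus draw from the same budget of 128.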

Log-dirty allows me to log only one write access. To log multiple
read/write accesses, I resorted to *emulating* the instructions that
cause the page fault. (I guess I could also play around with the trap
flag and single-step the guest, but that's a last resort.)

My initial attempts to do this with shadow paging proved too painful
and cumbersome, so I switched to HAP and am using an Intel Xeon machine
with EPT support. I have a 32-bit Debian guest with a 2.6.24 kernel, a
64-bit 2.6.32 pvops dom0, and xen-unstable.

I can see that vmx_vmexit_handler does some emulation for select
operations (e.g., MSR accesses). So I assume that when the guest exits
with EXIT_REASON_EPT_VIOLATION and jumps into
hvm.c:hvm_hap_nested_page_fault(), it is due to MMIO, PoD, or log-dirty.

Is it right to assume that when the EPT_VIOLATION fault occurs, the
instruction in question intends to do only simple reads/writes to the
page? That is, no MSR accesses, rdtsc, cr3 switches, etc., since those
are caught and emulated in the vmexit handler?
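
One way to check that assumption per fault is to decode the exit
qualification that accompanies an EPT violation. The sketch below
follows the bit layout in the Intel SDM; the macro and function names
are my own (hypothetical), not Xen's. Note that bit 8 distinguishes an
access to the translated address from an access made during the guest
page-table walk, so not every violation has to be a plain data access
by the instruction itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Exit-qualification bits for an EPT violation (Intel SDM layout). */
#define EPT_QUAL_READ       (1u << 0)  /* violation was a data read */
#define EPT_QUAL_WRITE      (1u << 1)  /* violation was a data write */
#define EPT_QUAL_FETCH      (1u << 2)  /* violation was an instruction fetch */
#define EPT_QUAL_GLA_VALID  (1u << 7)  /* guest linear address field is valid */
#define EPT_QUAL_GLA_FINAL  (1u << 8)  /* 1: access to translated address,
                                          0: access to a guest paging structure */

struct ept_fault {
    bool read;
    bool write;
    bool fetch;
    bool during_page_walk;  /* fault hit a guest page-table entry */
};

static struct ept_fault decode_ept_qual(uint64_t q)
{
    struct ept_fault f;

    f.read  = (q & EPT_QUAL_READ) != 0;
    f.write = (q & EPT_QUAL_WRITE) != 0;
    f.fetch = (q & EPT_QUAL_FETCH) != 0;
    /* Bit 8 is only meaningful when bit 7 is set. */
    f.during_page_walk = (q & EPT_QUAL_GLA_VALID) != 0 &&
                         (q & EPT_QUAL_GLA_FINAL) == 0;
    return f;
}
```

Logging the decoded fields next to each emulation failure would show
whether the failing cases are ordinary reads/writes or something else.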

I added the following code in
xen/arch/x86/hvm/hvm.c:hvm_hap_nested_page_fault(),
where the majority of log-dirty bits get set:

/* Spurious fault? PoD and log-dirty also take this path. */
if ( p2m_is_ram(p2mt) )
{
    /* p2m types are enum values, not bit flags, so compare with ==. */
    if ( (p2ma != p2m_access_rx2rw) && (p2mt == p2m_ram_logdirty)
         && access_valid && (mfn_x(mfn) != INVALID_MFN) && !access_x )
    {
        if ( pg->emul_count > 127 )
        {
          emul_end:
            /* Set page as r/w in the EPT. Give rwx access to the page,
             * since the earlier access was no-access (hack). */
            p2m_change_type(v->domain, gfn, p2m_ram_logdirty, p2m_ram_rw);
            paging_mark_dirty(v->domain, mfn_x(mfn));
        }
        else
        {
            struct hvm_emulate_ctxt ctxt;
            struct cpu_user_regs *regs = guest_cpu_user_regs();
            int rc;

            /* Emulate the faulting instruction. */
            hvm_emulate_prepare(&ctxt, regs);
            rc = hvm_emulate_one(&ctxt);
            hvm_emulate_writeback(&ctxt);

            /* If emulation failed, give the page read/write access
             * and don't tinker with it again. */
            if ( rc != X86EMUL_OKAY )
                goto emul_end;

            /* Revoke all access to the page, so that we trap on the next
             * access. The function below is exactly the same as
             * p2m_change_type(), except that it also takes the access type
             * as a parameter instead of using p2m->default_access. */
            p2m_change_type_access(v->domain, gfn, p2m_ram_logdirty,
                                   p2m_ram_logdirty, p2m_access_n);

            /* Set bit in the vcpu's log-dirty bitmap. */
            vcpu_mark_dirty(v, mfn);
            pg->emul_count++;

            /* Seems to make no difference with or without this call. */
            ept_sync_domain(v->domain);
        }
        return 1;
    }
    else
    {
        paging_mark_dirty(v->domain, mfn_x(mfn));
        p2m_change_type(v->domain, gfn, p2m_ram_logdirty, p2m_ram_rw);
        return 1;
    }
}

When I enable log-dirty mode, I see a bunch of emulation failures with
return code X86EMUL_UNHANDLEABLE and then a VM-entry failure reporting
invalid guest state:

(XEN) Failed vm entry (exit reason 0x80000021) caused by invalid guest state (0).
(XEN) ************* VMCS Area **************

(XEN) *** Guest State ***
(XEN) CR0: actual=0x000000008005003b, shadow=0x000000008005003b, gh_mask=ffffffffffffffff
(XEN) CR4: actual=0x00000000000026d0, shadow=0x0000000000000690, gh_mask=ffffffffffffffff
(XEN) CR3: actual=0x0000000037c90000, target_count=0
(XEN)      target0=0000000000000000, target1=0000000000000000
(XEN)      target2=0000000000000000, target3=0000000000000000
(XEN) RSP = 0x00000000f7c7ff84 (0x00000000f7c7ff84)  RIP = 0x00000000c03123e3 (0x00000000c03123e3)
(XEN) RFLAGS=0x0000000000000086 (0x0000000000000086)  DR7 = 0x0000000000000400
(XEN) Sysenter RSP=00000000c1fb1300 CS:RIP=0060:00000000c0104330
(XEN) CS: sel=0x0060, attr=0x0c09b, limit=0xffffffff, base=0x0000000000000000
(XEN) DS: sel=0x007b, attr=0x0c0f3, limit=0xffffffff, base=0x0000000000000000
(XEN) SS: sel=0x0068, attr=0x0c093, limit=0xffffffff, base=0x0000000000000000
(XEN) ES: sel=0x007b, attr=0x0c0f3, limit=0xffffffff, base=0x0000000000000000
(XEN) FS: sel=0x00d8, attr=0x08093, limit=0xffffffff, base=0x0000000001b63000
(XEN) GS: sel=0x0000, attr=0x1c000, limit=0xffffffff, base=0x0000000000000000
(XEN) GDTR:                           limit=0x000000ff, base=0x00000000c1fac000
(XEN) LDTR: sel=0x0000, attr=0x1c000, limit=0xffffffff, base=0x0000000000000000
(XEN) IDTR:                           limit=0x000007ff, base=0x00000000c03f8000
(XEN) TR: sel=0x0080, attr=0x0008b, limit=0x00002073, base=0x00000000c1faf100
(XEN) Guest PAT = 0x0007040600070406
(XEN) TSC Offset = fffffe96f2a9e5c0
(XEN) DebugCtl=0000000000000000 DebugExceptions=0000000000000000
(XEN) Interruptibility=0000 ActivityState=0000

(XEN) *** Host State ***
(XEN) RSP = 0xffff83082636ff90  RIP = 0xffff82c4801d0d40
(XEN) CS=e008 DS=0000 ES=0000 FS=0000 GS=0000 SS=0000 TR=e040
(XEN) FSBase=0000000000000000 GSBase=0000000000000000 TRBase=ffff8308263f5b00
(XEN) GDTBase=ffff830826359000 IDTBase=ffff830826365000
(XEN) CR0=000000008005003b CR3=000000083f7f0000 CR4=00000000000026f0
(XEN) Sysenter RSP=ffff83082636ffc0 CS:RIP=e008:ffff82c480218670
(XEN) Host PAT = 0x0000050100070406

(XEN) *** Control State ***
(XEN) PinBased=0000003f CPUBased=b6a065fa SecondaryExec=0000006b
(XEN) EntryControls=000051ff ExitControls=000fefff
(XEN) ExceptionBitmap=00040040
(XEN) VMEntry: intr_info=800000ef errcode=00000000 ilen=00000000
(XEN) VMExit: intr_info=00000000 errcode=00000000 ilen=00000000
(XEN)         reason=80000021 qualification=00000000
(XEN) IDTVectoring: info=800000ef errcode=00000000
(XEN) TPR Threshold = 0x00
(XEN) EPT pointer = 0x000000083f7fe01e
(XEN) Virtual processor ID = 0x0042
(XEN) **************************************

(XEN) domain_crash called from vmx.c:2161
(XEN) Domain 1 (vcpu#0) crashed on cpu#1:
(XEN) ----[ Xen-4.2-unstable-crew  x86_64  debug=y  Not tainted ]----
(XEN) CPU:    1
(XEN) RIP:    0060:[<00000000c03123e3>]
(XEN) RFLAGS: 0000000000000086   CONTEXT: hvm guest
(XEN) rax: 00000000f7c0de80   rbx: 00000000f7c0de8c   rcx: 00000000f7cba700
(XEN) rdx: 00000000f7c0de80   rsi: 00000000f7c0de80   rdi: 00000000f7c0de80
(XEN) rbp: 00000000f7c0de84   rsp: 00000000f7c7ff84   r8:  0000000000000000
(XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000
(XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 0000000000000690
(XEN) cr3: 0000000037c90000   cr2: 00000000b7f69dcc
(XEN) ds: 007b   es: 007b   fs: 00d8   gs: 0000   ss: 0068   cs: 0060
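
For reading the dump above, two fields can be decoded by hand: the
VMEntry intr_info value and the guest RFLAGS. The sketch below uses the
field layout from the Intel SDM (vector in bits 7:0, type in bits 10:8,
deliver-error-code in bit 11, valid in bit 31); the struct and function
names are hypothetical. The SDM lists a VM-entry check that RFLAGS.IF
must be 1 when a valid external interrupt is being injected. Whether
that check is what fails here is only an assumption on my part.

```c
#include <stdbool.h>
#include <stdint.h>

#define INTR_TYPE_EXT_INTR 0u          /* interruption type: external interrupt */
#define RFLAGS_IF          (1ull << 9) /* interrupt enable flag */

struct intr_info {
    bool valid;               /* bit 31 */
    bool deliver_error_code;  /* bit 11 */
    unsigned int type;        /* bits 10:8 */
    unsigned int vector;      /* bits 7:0 */
};

static struct intr_info decode_intr_info(uint32_t v)
{
    struct intr_info i;

    i.valid = (v >> 31) & 1;
    i.deliver_error_code = (v >> 11) & 1;
    i.type = (v >> 8) & 0x7;
    i.vector = v & 0xff;
    return i;
}

/* True if injecting this event with these RFLAGS would trip the SDM's
 * "IF must be 1 for a valid external interrupt" guest-state check. */
static bool violates_if_check(uint32_t intr, uint64_t rflags)
{
    struct intr_info i = decode_intr_info(intr);

    return i.valid && i.type == INTR_TYPE_EXT_INTR &&
           (rflags & RFLAGS_IF) == 0;
}
```

Feeding it the values from the dump (intr_info=800000ef,
RFLAGS=0x86) is a quick way to see whether a pending injection and the
guest's flags are consistent after the emulation path has run.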


Any pointers on how to resolve this issue?



Shriram
