RE: [Xen-devel] vmx & efer
>>> "Nakajima, Jun" <jun.nakajima@xxxxxxxxx> 04.05.07 18:44 >>>
>Jan Beulich wrote:
>> Am I blind in that I cannot find the place where the guest intended
>> EFER value gets loaded into the CPU register? The VMCS has no field
>> for this (unlike AMD's VMCB), and the guest_msr_state->flags bit
>> for this register doesn't get set anywhere. I'm inferring that the
>> guest thus always runs with all features enabled that were enabled by
>> the hypervisor (slight security issue, as EFER.SCE set implies LSTAR
>> was initialized, which may not be true).
>The LMA and LME bits are automatically loaded by the hardware. Please
>look at the spec (Volume 3B).
Hmm, that cannot be fully true. The description for bit 9 of the VM-Entry
control field says: "Its value is loaded into IA32_EFER.LMA and IA32_EFER.LME
as part of VM entry." However, even if the bit is clear the processor must
remain in 4-level paging mode, and unless I'm missing something there's no
separation between a bit controlling long-mode semantics (the effect of the
L-bit of a code descriptor, and e.g. the requirement for 64-bit gates in
descriptor tables) and a bit controlling the paging mode. So in reality there
must be two bits (and hence neither of them can be considered EFER.LME or
EFER.LMA).
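To make the quoted sentence concrete, here is a minimal sketch of what it
describes, taken literally: both EFER.LME and EFER.LMA follow bit 9 of the
VM-entry controls, and all other EFER bits are left alone. The function name
is hypothetical and this is not Xen code; only the bit positions (LME = bit 8,
LMA = bit 10) are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFER_LME (1ULL << 8)   /* long mode enable */
#define EFER_LMA (1ULL << 10)  /* long mode active */

/*
 * Hypothetical model of the quoted SDM sentence: on VM entry, LME and LMA
 * are both overwritten with the value of VM-entry control bit 9; every
 * other EFER bit keeps whatever value it had before the entry.
 */
uint64_t efer_after_vmentry(uint64_t efer, bool ia32e_guest)
{
    efer &= ~(EFER_LME | EFER_LMA);
    if (ia32e_guest)
        efer |= EFER_LME | EFER_LMA;
    return efer;
}
```

Note that nothing in this model touches SCE or NX, which is what the rest of
this mail is about.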
>For SCE, we discussed a while ago, but can you please elaborate on the
Consider a 64-bit guest that doesn't itself set EFER.SCE: Since the real
bit is still set, an application in the guest can execute syscall. However,
since LSTAR and STAR were likely never written by the OS either, control
will transfer to the address the hypervisor set for its own purpose, but
in the context of the guest. Hence arbitrary code will be executed in the
guest in kernel mode.
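As a rough illustration of the problem (a sketch under my own assumptions,
not the actual Xen logic): suppose only the bits in some mask are taken from
the guest's intended EFER and everything else stays as the hypervisor left
it. The names efer_seen_by_guest, guest_syscall_enabled and switch_mask are
hypothetical; the EFER bit positions are architectural.

```c
#include <stdint.h>

#define EFER_SCE (1ULL << 0)   /* syscall enable */
#define EFER_LME (1ULL << 8)   /* long mode enable */
#define EFER_LMA (1ULL << 10)  /* long mode active */
#define EFER_NX  (1ULL << 11)  /* no-execute enable */

/*
 * Hypothetical merge: bits covered by switch_mask come from the guest's
 * intended value, all remaining bits are inherited from the hypervisor.
 */
uint64_t efer_seen_by_guest(uint64_t host_efer, uint64_t guest_efer,
                            uint64_t switch_mask)
{
    return (host_efer & ~switch_mask) | (guest_efer & switch_mask);
}

/* Non-zero if syscall is usable with the given effective EFER value. */
int guest_syscall_enabled(uint64_t effective_efer)
{
    return (effective_efer & EFER_SCE) != 0;
}
```

With a host value of SCE|LME|LMA|NX, a guest value of just LME|LMA, and a
mask covering only LME|LMA, the guest ends up with SCE (and NX) set although
it never asked for either. Only when SCE is part of the mask does the guest's
own setting take effect.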
Likewise, I don't think it is a good idea to enforce EFER.NX in the guest.
While this will not allow improper code execution, it still modifies CPU
behavior (i.e. the page fault error code will differ when bit 63 of a page
table entry is set).
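To spell out the difference I mean, here is a simplified model, assuming a
supervisor-mode access through a single page table entry, no SMEP, and no
other reserved bits set. With NX disabled, bit 63 is a reserved bit and any
access through the entry faults with the RSVD flag; with NX enabled, only an
instruction fetch faults, and it reports the I/D flag instead. The function
name and the -1 "no fault" convention are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT (1ULL << 0)
#define PTE_BIT63   (1ULL << 63)  /* NX if EFER.NX is set, else reserved */

#define PFEC_P      (1u << 0)     /* fault on a present entry */
#define PFEC_RSVD   (1u << 3)     /* reserved bit set in a paging entry */
#define PFEC_ID     (1u << 4)     /* fault was an instruction fetch */

/*
 * Simplified model (single entry, supervisor access, no SMEP).
 * Returns the page fault error code, or -1 if the access does not fault.
 */
int pf_error_code(uint64_t pte, bool nxe, bool is_fetch)
{
    /* I/D is only reported for fetches while NX is enabled. */
    unsigned int ec = (is_fetch && nxe) ? PFEC_ID : 0;

    if (!(pte & PTE_PRESENT))
        return (int)ec;                    /* not-present fault, P clear */

    if (pte & PTE_BIT63) {
        if (!nxe)
            return (int)(PFEC_P | PFEC_RSVD);  /* bit 63 is reserved */
        if (is_fetch)
            return (int)(PFEC_P | PFEC_ID);    /* NX violation */
    }
    return -1;                             /* access succeeds */
}
```

So a guest that left EFER.NX clear and expects a reserved-bit fault would
instead see either no fault at all (data access) or a different error code
(instruction fetch) if NX is forced on underneath it.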
>> Further I am quite confused about the saving and restoring of CSTAR -
>> all parts of the SDM state or imply that this register doesn't exist
>> (as syscall is supposedly invalid in compatibility mode), so it
>> wouldn't need saving/restoring at all; there's one exception though:
>> section 18.104.22.168 says "SYSCALL/SYSRET invocations can occur from
>> either 32-bit compatibility mode application code or from 64-bit
>> application code."
>I agree that it's slightly confusing, but the previous sentence says
>"They are available only in 64-bit mode and only when the SCE bit of the
>IA32_EFER MSR is set."
Could you get your doc folks to fix the sentence then?
>The reason we save/restore CSTAR is that x86-64 Linux (still) writes to
>it because it did exist before. But I think we can stop doing that.
Yes, this should match real hardware - while I didn't check it, I suppose it
must be one of writes-ignored-reads-return-zero or reads-return-last-