[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments



On 03/04/17 16:07, Jan Beulich wrote:
>>>> On 03.04.17 at 16:27, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 03/04/17 10:13, Jan Beulich wrote:
>>>>>> On 31.03.17 at 21:50, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>> --- a/xen/arch/x86/hvm/hvm.c
>>>> +++ b/xen/arch/x86/hvm/hvm.c
>>>> @@ -2374,13 +2374,27 @@ int hvm_set_cr4(unsigned long value, bool_t 
>> may_defer)
>>>>      return X86EMUL_OKAY;
>>>>  }
>>>>  
>>>> +enum hvm_segmentation_mode hvm_seg_mode(
>>>> +    const struct vcpu *v, enum x86_segment seg,
>>>> +    const struct segment_register *cs)
>>> The inputs here are at least somewhat counterintuitive (for example,
>>> from an abstract pov it is unexpected that the result depends on seg
>>> and cs). At the very least I think the naming should make clear that
>>> cs is not just some code segment, but the CS v has currently in use
>>> (e.g. active_cs). Going further the question is whether having this
>>> helper is really useful (and not perhaps inviting mis-use), and hence
>>> whether the inputs passed here wouldn't better be passed directly
>>> to hvm_virtual_to_linear_addr(), the more that the "seg" argument
>>> is required to match up between the two calls.
>> I purposefully wanted to avoid people opencoding the logic and getting
>> it wrong (looks like even I got it wrong).
>>
>> I'm not convinced that passing the parameters individually is better.
>>
>>>> +{
>>>> +    if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
>>>> +        return hvm_seg_mode_real;
>>> What about VM86 mode?
>> Very good point.
>>
>>>> +    if ( hvm_long_mode_active(v) &&
>>>> +         (is_x86_system_segment(seg) || cs->attr.fields.l) )
>>>> +        return hvm_seg_mode_long;
>>>> +
>>>> +    return hvm_seg_mode_prot;
>>> Did you verify this actually matching real hardware behavior? There
>>> being obvious anomalies when compat ring-0 code executes
>>> LGDT/LIDT/SGDT/SIDT (in long mode these ought to have 10-byte
>>> operands, yet 32-bit/compat code would expect 6-byte ones, so
>>> one of the two views is necessarily wrong, and whichever it is, it
>>> introduces an inconsistency), I wouldn't take it for given that _all_
>>> descriptor table accessing insns behave like they would from a
>>> 64-bit code segment (I nevertheless assume they do, but as I
>>> can't see it written down anywhere, we shouldn't assume so,
>>> considering how many other oddities there are in x86).
>>>
>>> This question is also being supported by the SDM using the same
>>> standard "Same exceptions as in protected mode" in the
>>> respective insns' "Compatibility Mode Exceptions" sections, yet
>>> the behavior above implies that #GP(0) might also result for
>>> compat mode descriptor table accesses if the descriptor address
>>> ends up being non-canonical. Interestingly enough the PM
>>> doesn't separate exception specifications for 32-bit protected,
>>> compat, and 64-bit modes.
>> You are mistaken.
>>
>> {L,S}{I,G}DT are {read,write}_segment_register() operations, using a
>> plain memory operand.
>>
>> When we come to the segmentation check, it will be by default
>> %ds-relative, with size as calculated by op_bytes in the instruction
>> emulator.
> I think I didn't make myself clear then: I'm not at all talking about how
> the memory access of these insns get carried out, I solely talk about
> the size of their operands:

I still fail to see what the size of the operands have to do with the
choice of segmentation mode.

> In long mode to load IDTR or GDTR you'd expect a 64-bit base and a 16-bit 
> limit.

Why?  I'd expect nothing of the sort, because 32bit compat segments are
purposefully designed to be no functional difference from regular 32bit
protected mode segments.  That includes not changing the behaviour of
instructions like this.

> Hence if _all_ system segment
> register related insns worked consistently in long mode, the four
> named insns would have to have 10-byte operands.

This isn't a valid expectation to draw.

>  I'm pretty sure
> they don't though, so there is _one_ anomaly already.

Indeed they don't.  In a compatibility mode segment, they have take a
6-byte operand, identically to 32bit mode.

> With that I don't think we can rule out there being other anomalies, with this
> not being talked about explicitly anywhere in the doc.

I don't think any of this is relevant to the correctness of this patch.

Without this fix, implicit accesses to system segments in a
compatibility mode segment will truncate the resulting linear address as
part of performing the segmentation calculations, which is obviously not
how real hardware behaves.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.