[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 04/17] x86emul: support {, V}{, U}COMIS{S, D}



>>> On 01.03.17 at 15:16, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 28/02/17 12:51, Jan Beulich wrote:
>> @@ -5468,6 +5468,55 @@ x86_emulate(
>>          state->simd_size = simd_none;
>>          break;
>>  
>> +    CASE_SIMD_PACKED_FP(, 0x0f, 0x2e):     /* ucomis{s,d} xmm/mem,xmm */
>> +    CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x2e): /* vucomis{s,d} xmm/mem,xmm */
>> +    CASE_SIMD_PACKED_FP(, 0x0f, 0x2f):     /* comis{s,d} xmm/mem,xmm */
>> +    CASE_SIMD_PACKED_FP(_VEX, 0x0f, 0x2f): /* vcomis{s,d} xmm/mem,xmm */
>> +        if ( vex.opcx == vex_none )
>> +        {
>> +            if ( vex.pfx )
>> +                vcpu_must_have(sse2);
>> +            else
>> +                vcpu_must_have(sse);
>> +            get_fpu(X86EMUL_FPU_xmm, &fic);
>> +        }
>> +        else
>> +        {
>> +            host_and_vcpu_must_have(avx);
>> +            get_fpu(X86EMUL_FPU_ymm, &fic);
>> +        }
> 
> This is starting to become a common sequence.  Is there any sensible way
> to factor it out in a non-macro way, to avoid the compiler instantiating
> it at the top of many basic blocks?

While I would have wanted to, I couldn't think of one which wouldn't
be almost as ugly as the redundancy. The main problem being - as
you likely understand yourself - that we'd need parameters for all
the used features as well as all the involved X86EMUL_FPU_* values.

>> +        opc = init_prefixes(stub);
>> +        opc[0] = b;
>> +        opc[1] = modrm;
>> +        if ( ea.type == OP_MEM )
>> +        {
>> +            rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, vex.pfx ? 8 : 4,
>> +                           ctxt);
>> +            if ( rc != X86EMUL_OKAY )
>> +                goto done;
>> +
>> +            /* Convert memory operand to (%rAX). */
>> +            rex_prefix &= ~REX_B;
>> +            vex.b = 1;
>> +            opc[1] &= 0x38;
>> +        }
>> +        fic.insn_bytes = PFX_BYTES + 2;
>> +        opc[2] = 0xc3;
>> +
>> +        copy_REX_VEX(opc, rex_prefix, vex);
>> +        invoke_stub(_PRE_EFLAGS("[eflags]", "[mask]", "[tmp]"),
>> +                    _POST_EFLAGS("[eflags]", "[mask]", "[tmp]"),
>> +                    [eflags] "+g" (_regs._eflags),
>> +                    [tmp] "=&r" (cr4 /* dummy */), "+m" (*mmvalp),
> 
> This is latently dangerous.  It would be better to have an explicit
> "unsigned long dummy;", which the compiler will perfectly easily elide
> during register scheduling.

The thing I want to avoid as much as possible are these ugly,
improperly indented extra scopes following case labels. If
putting a dummy variable into the whole switch() scope is okay
with you, I could live with that. But then again I don't see the
danger here - there's no imaginable use for cr4 in this piece of
code.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.