[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v9 05/23] x86emul: support AVX512F gather insns



On 04.07.2019 16:10, Andrew Cooper wrote:
> On 01/07/2019 12:18, Jan Beulich wrote:
>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
>> @@ -9100,6 +9100,133 @@ x86_emulate(
>>            put_stub(stub);
>>    
>>            if ( rc != X86EMUL_OKAY )
>> +            goto done;
>> +
>> +        state->simd_size = simd_none;
>> +        break;
>> +    }
>> +
>> +    case X86EMUL_OPC_EVEX_66(0x0f38, 0x90): /* vpgatherd{d,q} 
>> mem,[xyz]mm{k} */
>> +    case X86EMUL_OPC_EVEX_66(0x0f38, 0x91): /* vpgatherq{d,q} 
>> mem,[xyz]mm{k} */
>> +    case X86EMUL_OPC_EVEX_66(0x0f38, 0x92): /* vgatherdp{s,d} 
>> mem,[xyz]mm{k} */
>> +    case X86EMUL_OPC_EVEX_66(0x0f38, 0x93): /* vgatherqp{s,d} 
>> mem,[xyz]mm{k} */
>> +    {
>> +        typeof(evex) *pevex;
>> +        union {
>> +            int32_t dw[16];
>> +            int64_t qw[8];
>> +        } index;
>> +        bool done = false;
>> +
>> +        ASSERT(ea.type == OP_MEM);
>> +        generate_exception_if((!evex.opmsk || evex.brs || evex.z ||
>> +                               evex.reg != 0xf ||
>> +                               modrm_reg == state->sib_index),
>> +                              EXC_UD);
>> +        avx512_vlen_check(false);
>> +        host_and_vcpu_must_have(avx512f);
>> +        get_fpu(X86EMUL_FPU_zmm);
>> +
>> +        /* Read destination and index registers. */
>> +        opc = init_evex(stub);
>> +        pevex = copy_EVEX(opc, evex);
>> +        pevex->opcx = vex_0f;
>> +        opc[0] = 0x7f; /* vmovdqa{32,64} */
>> +        /*
>> +         * The register writeback below has to retain masked-off elements, 
>> but
>> +         * needs to clear upper portions in the index-wider-than-data cases.
>> +         * Therefore read (and write below) the full register. The 
>> alternative
>> +         * would have been to fiddle with the mask register used.
>> +         */
>> +        pevex->opmsk = 0;
>> +        /* Use (%rax) as destination and modrm_reg as source. */
>> +        pevex->b = 1;
>> +        opc[1] = (modrm_reg & 7) << 3;
>> +        pevex->RX = 1;
>> +        opc[2] = 0xc3;
>> +
>> +        invoke_stub("", "", "=m" (*mmvalp) : "a" (mmvalp));
>> +
>> +        pevex->pfx = vex_f3; /* vmovdqu{32,64} */
>> +        pevex->w = b & 1;
>> +        /* Switch to sib_index as source. */
>> +        pevex->r = !mode_64bit() || !(state->sib_index & 0x08);
>> +        pevex->R = !mode_64bit() || !(state->sib_index & 0x10);
>> +        opc[1] = (state->sib_index & 7) << 3;
>> +
>> +        invoke_stub("", "", "=m" (index) : "a" (&index));
>> +        put_stub(stub);
>> +
>> +        /* Clear untouched parts of the destination and mask values. */
>> +        n = 1 << (2 + evex.lr - ((b & 1) | evex.w));
>> +        op_bytes = 4 << evex.w;
>> +        memset((void *)mmvalp + n * op_bytes, 0, 64 - n * op_bytes);
>> +        op_mask &= (1 << n) - 1;
>> +
>> +        for ( i = 0; op_mask; ++i )
>> +        {
>> +            signed long idx = b & 1 ? index.qw[i] : index.dw[i];
> 
> No signed.

Hmm - would you mind this remaining consistent with the AVX
counterpart code? (As an aside I continue to think it is a bad
thing to not have explicit "signed" when we actually mean signed
quantities, seeing the still large amount of plain short/int/long
uses that actually should be unsigned.)

>  However, surely this needs to be int64_t anyway, to function
> correctly in a 32bit build of the test harness?

No, only 32 bits (or less, when the scale factor is larger than 1)
will be used for address calculation. And again this is no
different from pre-existing AVX code.

> The SDM says VPGATHERQD is encodable in 32bit with quadword indices.

Sure, truncating to just 32-bit values due to 32-bit addressing.

Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.