
Re: [Xen-devel] [PATCH v7 27/32] xen/x86: allow HVM guests to use hypercalls to bring up vCPUs



On 05/10/15 at 12:28, Andrew Cooper wrote:
> On 02/10/15 16:48, Roger Pau Monne wrote:
>> Allow the usage of the VCPUOP_initialise, VCPUOP_up, VCPUOP_down and
>> VCPUOP_is_up hypercalls from HVM guests.
>>
>> This patch introduces a new structure (vcpu_hvm_context) that should be used
>> in conjunction with the VCPUOP_initialise hypercall in order to initialize
>> vCPUs for HVM guests.
>>
>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> Cc: Jan Beulich <jbeulich@xxxxxxxx>
>> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> Cc: Ian Campbell <ian.campbell@xxxxxxxxxx>
>> Cc: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>
>> ---
>> Changes since v6:
>>  - Add comments to clarify some initializations.
>>  - Introduce a generic default_initialize_vcpu that's used to initialize an
>>    ARM vCPU or an x86 PV vCPU.
>>  - Move the undef of the SEG macro.
>>  - Fix the size of the eflags register, it should be 32bits.
>>  - Add a comment regarding the value of the 12-15 bits of the _ar fields.
>>  - Remove the 16bit structure, the 32bit one can be used to start the cpu in
>>    real mode.
>>  - Add some sanity checks to the values passed in.
>>  - Add paddings to vcpu_hvm_context so the layout on 32/64bits is the same.
>>  - Add support for the compat version of VCPUOP_initialise.
>>
>> Changes since v5:
>>  - Fix a coding style issue.
>>  - Merge the code from wip-dmlite-v5-refactor by Andrew in order to reduce
>>    bloat.
>>  - Print the offending %cr3 in case of error when using shadow.
>>  - Reduce the scope of local variables in arch_initialize_vcpu.
>>  - s/current->domain/v->domain/g in arch_initialize_vcpu.
>>  - Expand the comment in public/vcpu.h to document the usage of
>>    vcpu_hvm_context for HVM guests.
>>  - Add myself as the copyright holder for the public hvm_vcpu.h header.
>>
>> Changes since v4:
>>  - Don't assume mode is 64B, add an explicit check.
>>  - Don't set TF_kernel_mode, it is only needed for PV guests.
>>  - Don't set CR0_ET unconditionally.
>> ---
>>  xen/arch/x86/domain.c             | 185 ++++++++++++++++++++++++++++++++++++++
>>  xen/arch/x86/hvm/hvm.c            |   8 ++
>>  xen/common/compat/domain.c        |  71 +++++++++++----
>>  xen/common/domain.c               |  56 +++++++++---
>>  xen/include/Makefile              |   1 +
>>  xen/include/asm-x86/domain.h      |   3 +
>>  xen/include/public/hvm/hvm_vcpu.h | 144 +++++++++++++++++++++++++++++
>>  xen/include/public/vcpu.h         |   6 +-
>>  xen/include/xlat.lst              |   3 +
>>  9 files changed, 448 insertions(+), 29 deletions(-)
>>  create mode 100644 xen/include/public/hvm/hvm_vcpu.h
>>
>> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
>> index a3b1c9b..af5feea 100644
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -37,6 +37,7 @@
>>  #include <xen/wait.h>
>>  #include <xen/guest_access.h>
>>  #include <public/sysctl.h>
>> +#include <public/hvm/hvm_vcpu.h>
>>  #include <asm/regs.h>
>>  #include <asm/mc146818rtc.h>
>>  #include <asm/system.h>
>> @@ -1176,6 +1177,190 @@ int arch_set_info_guest(
>>  #undef c
>>  }
>>  
>> +/* Called by VCPUOP_initialise for HVM guests. */
>> +int arch_set_info_hvm_guest(struct vcpu *v, vcpu_hvm_context_t *ctx)
>> +{
>> +    struct cpu_user_regs *uregs = &v->arch.user_regs;
>> +    struct segment_register cs, ds, ss, es, tr;
>> +
>> +    switch ( ctx->mode )
>> +    {
>> +    default:
>> +        return -EINVAL;
>> +
>> +    case VCPU_HVM_MODE_32B:
>> +    {
>> +        const struct vcpu_hvm_x86_32 *regs = &ctx->cpu_regs.x86_32;
>> +        uint32_t limit;
>> +
>> +#define SEG(s, r)                                                       \
>> +    (struct segment_register){ .sel = 0, .base = (r)->s ## _base,       \
>> +            .limit = (r)->s ## _limit, .attr.bytes = (r)->s ## _ar }
>> +        cs = SEG(cs, regs);
>> +        ds = SEG(ds, regs);
>> +        ss = SEG(ss, regs);
>> +        es = SEG(es, regs);
>> +        tr = SEG(tr, regs);
>> +#undef SEG
>> +
>> +        /* Basic sanity checks. */
>> +        if ( cs.attr.fields.pad != 0 || ds.attr.fields.pad != 0 ||
>> +             ss.attr.fields.pad != 0 || es.attr.fields.pad != 0 ||
>> +             tr.attr.fields.pad != 0 )
>> +        {
>> +            gprintk(XENLOG_ERR, "Attribute bits 12-15 of the segments are not null\n");
> 
> I would use 'zero' as opposed to 'null' here.  There is nothing to do
> with pointers here.

Done.

>> +            return -EINVAL;
>> +        }
>> +
>> +        limit = cs.limit * (cs.attr.fields.g ? PAGE_SIZE : 1);
> 
> This will overflow in the common case.  Calculation of the limit is a
> little awkward.  I believe this should do:
> 
> limit = cs.limit
> if ( cs.attr.fields.g )
>     limit = (limit << 12) | 0xfff;
> 
> In the case that g is set and cs is a flat segment, limit should have
> the value ~0U, rather than 0 which is what your calculation will achieve.

Fixed.
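
For reference, a minimal standalone sketch of the limit expansion suggested above. The helper name effective_limit is hypothetical and not part of the patch; it only assumes that the raw limit and the granularity bit are passed in as plain integers.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): expand a raw segment limit
 * into a byte-granular limit. When the granularity (G) bit is set the raw
 * limit counts 4KiB pages, so shift it by 12 and fill the low 12 bits.
 */
static uint32_t effective_limit(uint32_t raw_limit, unsigned int g)
{
    uint32_t limit = raw_limit;

    if ( g )
        limit = (limit << 12) | 0xfff;

    return limit;
}
```

With a raw limit of 0xfffff and G set this yields 0xffffffff, i.e. a flat segment, while a byte-granular limit is returned unchanged.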

>> +        if ( regs->eip > limit )
>> +        {
>> +            gprintk(XENLOG_ERR, "EIP address is outside of the CS limit\n");
> 
> In all cases, please print out the values, to make the error message
> more helpful.
> 
> e.g. "EIP (%08x) outside CS limit (%08x)"
> 
>> +            return -EINVAL;
>> +        }
>> +
>> +        if ( ds.attr.fields.dpl > cs.attr.fields.dpl )
>> +        {
>> +            gprintk(XENLOG_ERR, "DPL of DS is greater than DPL of CS\n");
>> +            return -EINVAL;
>> +        }
>> +
>> +        if ( ss.attr.fields.dpl != cs.attr.fields.dpl )
>> +        {
>> +            gprintk(XENLOG_ERR, "DPL of SS is different than DPL of CS\n");
>> +            return -EINVAL;
>> +        }
>> +
>> +        if ( es.attr.fields.dpl > cs.attr.fields.dpl )
>> +        {
>> +            gprintk(XENLOG_ERR, "DPL of ES is greater than DPL of CS\n");
>> +            return -EINVAL;
>> +        }
>> +
>> +        if ( ((regs->efer & EFER_LMA) && !(regs->efer & EFER_LME)) ||
>> +             ((regs->efer & EFER_LME) && !(regs->efer & EFER_LMA)) )
> 
> This simplifies to ( (!!(regs->efer & EFER_LMA)) ^ (!!(regs->efer &
> EFER_LME)) )
> 
>> +        {
>> +            gprintk(XENLOG_ERR, "EFER.LMA and EFER.LME must be both set\n");
> 
> And this should say "both the same", rather than both set.

I've fixed all the error messages to be more descriptive.
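
As a quick illustration of why the XOR form suggested above is equivalent to the original condition, here is a small standalone sketch. The function names are hypothetical; the EFER bit positions (LME is bit 8, LMA is bit 10) are the architectural ones, redefined locally so the snippet builds outside of Xen.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural EFER bits, defined locally for this sketch. */
#define EFER_LME (1u << 8)
#define EFER_LMA (1u << 10)

/* The condition as written in the patch: exactly one of LMA/LME is set. */
static bool efer_mismatch_long(uint64_t efer)
{
    return ((efer & EFER_LMA) && !(efer & EFER_LME)) ||
           ((efer & EFER_LME) && !(efer & EFER_LMA));
}

/* The simplified form from the review: normalise both to 0/1 and XOR. */
static bool efer_mismatch_xor(uint64_t efer)
{
    return (!!(efer & EFER_LMA)) ^ (!!(efer & EFER_LME));
}
```

Both functions return true only when exactly one of the two bits is set, for all four combinations of LME and LMA.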

> Having said this, I still don't think it is sensible to require that LMA
> is set, seeing as it is strictly a read-only bit in EFER.  I would
> suggest keying on LME alone, and automatically ORing in LMA, which
> matches the behaviour of hardware more closely.

I've done as suggested and made LMA optional, Xen will set it by default
when LME is set by the user.
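
Roughly, the behaviour described above could be sketched as follows. This is only an illustration, not the code from the next revision: efer_normalise is a hypothetical name, the bit definitions are local copies of the architectural values, and what to do when LMA is passed in without LME is deliberately left out because it is not settled in this thread.

```c
#include <stdint.h>

/* Architectural EFER bits, defined locally for this sketch. */
#define EFER_LME (1u << 8)
#define EFER_LMA (1u << 10)

/*
 * Hypothetical helper: key on LME alone and OR in LMA automatically, so
 * the caller does not have to set the (read-only on hardware) LMA bit.
 * All other bits are passed through untouched.
 */
static uint64_t efer_normalise(uint64_t efer)
{
    if ( efer & EFER_LME )
        efer |= EFER_LMA;

    return efer;
}
```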

>> +            return -EINVAL;
>> +        }
>> +
>> +        uregs->rax    = regs->eax;
>> +        uregs->rcx    = regs->ecx;
>> +        uregs->rdx    = regs->edx;
>> +        uregs->rbx    = regs->ebx;
>> +        uregs->rsp    = regs->esp;
>> +        uregs->rbp    = regs->ebp;
>> +        uregs->rsi    = regs->esi;
>> +        uregs->rdi    = regs->edi;
>> +        uregs->rip    = regs->eip;
>> +        uregs->rflags = regs->eflags;
>> +
>> +        v->arch.hvm_vcpu.guest_cr[0] = regs->cr0;
>> +        v->arch.hvm_vcpu.guest_cr[3] = regs->cr3;
>> +        v->arch.hvm_vcpu.guest_cr[4] = regs->cr4;
>> +        v->arch.hvm_vcpu.guest_efer  = regs->efer;
>> +    }
>> +    break;
>> +
>> +    case VCPU_HVM_MODE_64B:
>> +    {
>> +        const struct vcpu_hvm_x86_64 *regs = &ctx->cpu_regs.x86_64;
>> +
>> +        /* Basic sanity checks. */
>> +        if ( !is_canonical_address(regs->rip) )
>> +        {
>> +            gprintk(XENLOG_ERR, "RIP contains a non-canonical address\n");
>> +            return -EINVAL;
>> +        }
>> +
>> +        if ( !(regs->cr0 & X86_CR0_PG) )
>> +        {
>> +            gprintk(XENLOG_ERR, "CR0 doesn't have paging enabled\n");
>> +            return -EINVAL;
>> +        }
>> +
>> +        if ( !(regs->cr4 & X86_CR4_PAE) )
>> +        {
>> +            gprintk(XENLOG_ERR, "CR4 doesn't have PAE enabled\n");
>> +            return -EINVAL;
>> +        }
>> +
>> +        if ( (regs->efer & (EFER_LME | EFER_LMA)) != (EFER_LME | EFER_LMA) )
>> +        {
>> +            gprintk(XENLOG_ERR, "EFER doesn't have LME or LMA enabled\n");
>> +            return -EINVAL;
>> +        }
>> +
>> +        uregs->rax    = regs->rax;
>> +        uregs->rcx    = regs->rcx;
>> +        uregs->rdx    = regs->rdx;
>> +        uregs->rbx    = regs->rbx;
>> +        uregs->rsp    = regs->rsp;
>> +        uregs->rbp    = regs->rbp;
>> +        uregs->rsi    = regs->rsi;
>> +        uregs->rdi    = regs->rdi;
>> +        uregs->rip    = regs->rip;
>> +        uregs->rflags = regs->rflags;
>> +
>> +        v->arch.hvm_vcpu.guest_cr[0] = regs->cr0;
>> +        v->arch.hvm_vcpu.guest_cr[3] = regs->cr3;
>> +        v->arch.hvm_vcpu.guest_cr[4] = regs->cr4;
>> +        v->arch.hvm_vcpu.guest_efer  = regs->efer;
>> +
>> +#define SEG(b, l, a)                                                    \
>> +    (struct segment_register){ .sel = 0, .base = (b), .limit = (l),     \
>> +                               .attr.bytes = (a) }
>> +        cs = SEG(0, ~0u, 0xa9b); /* 64bit code segment. */
>> +        ds = ss = es = SEG(0, ~0u, 0xc93);
>> +        tr = SEG(0, 0x67, 0x8b); /* 64bit TSS (busy). */
>> +#undef SEG
> 
> I would be tempted to get rid of this macro entirely.  The other macro
> was to hide all the regs-> references, but this is entirely from constants.

IMHO it makes the code easier to understand, but I'm not going to argue
about it. Does anyone else have a preference whether to remove the macro
or not?
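
To make the comparison concrete, here are the two styles side by side in a standalone sketch. struct seg_sketch is a simplified, hypothetical stand-in for Xen's struct segment_register (the real one wraps the attributes in a union), and the function names are made up for illustration; the constants are the ones used for the 64bit code segment in the patch.

```c
#include <stdint.h>

/* Simplified stand-in for struct segment_register, for illustration only. */
struct seg_sketch {
    uint16_t sel;
    uint16_t attr;
    uint32_t limit;
    uint64_t base;
};

/* With the helper macro, as in the patch. */
static struct seg_sketch cs_with_macro(void)
{
#define SEG(b, l, a)                                                    \
    (struct seg_sketch){ .sel = 0, .base = (b), .limit = (l), .attr = (a) }
    return SEG(0, ~0u, 0xa9b); /* 64bit code segment. */
#undef SEG
}

/* Without the macro: the same compound literal written out in full. */
static struct seg_sketch cs_without_macro(void)
{
    return (struct seg_sketch){ .sel = 0, .base = 0, .limit = ~0u,
                                .attr = 0xa9b };
}
```

Both produce the same values, so the choice is purely about readability: the macro keeps each assignment on one short line, while the open-coded form names every field at the point of use.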

Roger.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

