[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v10 05/10] xen: Add vmware_port support



On 15/05/15 00:34, Don Slutz wrote:
> This includes adding is_vmware_port_enabled
>
> This is a new xen_arch_domainconfig flag,
> XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK.
>
> This enables limited support of VMware's hyper-call.
>
> This is both a more complete support then in currently provided by
> QEMU and/or KVM and less.  The missing part requires QEMU changes
> and has been left out until the QEMU patches are accepted upstream.
>
> VMware's hyper-call is also known as VMware Backdoor I/O Port.
>
> Note: this support does not depend on vmware_hw being non-zero.
>
> Summary is that VMware treats "in (%dx),%eax" (or "out %eax,(%dx)")
> to port 0x5658 specially.  Note: since many operations return data
> in EAX, "in (%dx),%eax" is the one to use.  The other lengths like
> "in (%dx),%al" will still do things, only AL part of EAX will be
> changed.  For "out %eax,(%dx)" of all lengths, EAX will remain
> unchanged.
>
> An open source example of using this is:
>
> http://open-vm-tools.sourceforge.net/
>
> Which only uses "inl (%dx)".  Also
>
> http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458
>
> Some of the best info is at:
>
> https://sites.google.com/site/chitchatvmback/backdoor
>
> Signed-off-by: Don Slutz <dslutz@xxxxxxxxxxx>
> ---
> v10:
>     Probably better as EOPNOTSUPP, as it is a configuration problem.
>     This function looks as if it should be static.
>     I would suggest putting vmport_register declaration in hvm.h ...
>     As indicated before, I don't think this is a good use case for a
>     domain creation flag.
>       Switch to the new config way.
>     struct domain *d => struct domain *currd
>     Are you sure you don't want to zero the high halves of 64-bit ...
>       Comment added.
>    Then just have this handled into the default case.
>       Reworked new_eax handling.
>    is_hvm_domain(currd)
>    And - why here rather than before the switch() or even right at the
>    start of the function?
>       Moved to start.
>    With that, is it really correct that OUT updates the other registers
>    just like IN? If so, this deserves a comment, so that readers won't
>    think this is in error.
>      All done in comment at start.
>
>
> v9:
>   Switch to x86_emulator to handle #GP code moved to next patch.
>     Can you explain why a HVM param isn't suitable here?
>       Issue with changing QEMU on the fly.
>       Andrew Cooper: My recommendation is still to use a creation flag
>         So no change.
>     Please move SVM's identical definition into ...
>       Did this as #1.  No longer needed, but since the patch was ready
>       I have included it.
>     --Lots of questions about code that no long is part of this patch. --
>     With this, is handling other than 32-bit in/out really
>     meaningful/correct?
>       Added comment about this.
>     Since you can't get here for PV, I can't see what you need this.
>       Changed to an ASSERT.
>     Why version 4?
>       Added comment about this.
>     -- Several questions about register changes.
>       Re-coded to use new_eax and set *val to this.
>       Change to generealy use reg->_e..
>     These ei1/ei2 checks belong in the callers imo -
>       Moved.
>     the "port" function parameter isn't even checked
>       Add check for exact match.
>     If dropping the code is safe without also forbidding the
>     combination of nested and VMware emulation.
>       Added the forbidding the combination of nested and VMware.
>       Mostly do to the cases of the nested virtual code is the one
>       to handle VMware stuff if needed, not the root one.  Also I am
>       having issues testing xen nested in xen and using hvm.
>
> v7:
>       More on AMD in the commit message.
>       Switch to only change 32bit part of registers, what VMware
>         does.
>     Too much logging and tracing.
>       Dropped a lot of it.  This includes vmport_debug=
>
> v6:
>       Dropped the attempt to use svm_nextrip_insn_length via
>       __get_instruction_length (added in v2).  Just always look
>       at upto 15 bytes on AMD.
>
> v5:
>       we should make sure that svm_vmexit_gp_intercept is not executed for
>       any other guest.
>         Added an ASSERT on is_vmware_port_enabled.
>       magic integers?
>         Added #define for them.
>       I am fairly certain that you need some brackets here.
>         Added brackets.
>
>  xen/arch/x86/domain.c             |   4 ++
>  xen/arch/x86/hvm/hvm.c            |   9 +++
>  xen/arch/x86/hvm/vmware/Makefile  |   1 +
>  xen/arch/x86/hvm/vmware/vmport.c  | 143 
> ++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/hvm/domain.h  |   3 +
>  xen/include/asm-x86/hvm/hvm.h     |   2 +
>  xen/include/public/arch-x86/xen.h |   4 ++
>  7 files changed, 166 insertions(+)
>  create mode 100644 xen/arch/x86/hvm/vmware/vmport.c
>
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index bc3d3a5..153048a 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -519,7 +519,11 @@ int arch_domain_create(struct domain *d, unsigned int 
> domcr_flags,
>          (domcr_flags & DOMCRF_hap);
>      d->arch.hvm_domain.mem_sharing_enabled = 0;
>      if ( config )
> +    {
>          d->arch.hvm_domain.vmware_hwver = config->vmware_hwver;
> +        d->arch.hvm_domain.is_vmware_port_enabled =
> +            !!(config->arch_flags & XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK);
> +    }

This hunk is also subject to my rebase request from patch 2.

>  
>      d->arch.s3_integrity = !!(domcr_flags & DOMCRF_s3_integrity);
>  
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 05c80e9..a179123 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -1462,6 +1462,9 @@ int hvm_domain_initialise(struct domain *d)
>          goto fail1;
>      d->arch.hvm_domain.io_handler->num_slot = 0;
>  
> +    if ( d->arch.hvm_domain.is_vmware_port_enabled )
> +        vmport_register(d);
> +
>      if ( is_pvh_domain(d) )
>      {
>          register_portio_handler(d, 0, 0x10003, handle_pvh_io);
> @@ -5785,6 +5788,12 @@ static int hvmop_set_param(
>              break;
>          if ( a.value > 1 )
>              rc = -EINVAL;
> +        /* Prevent nestedhvm with vmport */
> +        if ( d->arch.hvm_domain.is_vmware_port_enabled )
> +        {
> +            rc = -EOPNOTSUPP;
> +            break;
> +        }
>          /*
>           * Remove the check below once we have
>           * shadow-on-shadow.
> diff --git a/xen/arch/x86/hvm/vmware/Makefile 
> b/xen/arch/x86/hvm/vmware/Makefile
> index 3fb2e0b..cd8815b 100644
> --- a/xen/arch/x86/hvm/vmware/Makefile
> +++ b/xen/arch/x86/hvm/vmware/Makefile
> @@ -1 +1,2 @@
>  obj-y += cpuid.o
> +obj-y += vmport.o
> diff --git a/xen/arch/x86/hvm/vmware/vmport.c 
> b/xen/arch/x86/hvm/vmware/vmport.c
> new file mode 100644
> index 0000000..08ddef9
> --- /dev/null
> +++ b/xen/arch/x86/hvm/vmware/vmport.c
> @@ -0,0 +1,143 @@
> +/*
> + * HVM VMPORT emulation
> + *
> + * Copyright (C) 2012 Verizon Corporation
> + *
> + * This file is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License Version 2 (GPLv2)
> + * as published by the Free Software Foundation.
> + *
> + * This file is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details. <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/lib.h>
> +#include <asm/hvm/hvm.h>
> +#include <asm/hvm/support.h>
> +
> +#include "backdoor_def.h"
> +
> +static int vmport_ioport(int dir, uint32_t port, uint32_t bytes, uint32_t 
> *val)
> +{
> +    struct cpu_user_regs *regs = guest_cpu_user_regs();
> +
> +    /*
> +     * While VMware expects only 32-bit in, they do support using
> +     * other sizes and out.  However they do require only the 1 port
> +     * and the correct value in eax.  Since some of the data
> +     * returned in eax is smaller the 32 bits and/or you only need
> +     * the other registers the dir and bytes do not need any
> +     * checking.  The caller will handle the bytes, and dir is
> +     * handled below for eax.
> +     */
> +    if ( port == BDOOR_PORT && regs->_eax == BDOOR_MAGIC )
> +    {
> +        uint32_t new_eax = ~0u;
> +        uint64_t value;
> +        struct vcpu *curr = current;
> +        struct domain *currd = curr->domain;
> +
> +        ASSERT(is_hvm_domain(currd));

You will not be getting here for a non HVM domain, but there is nothing
anti PVH here.  I would drop the assert.

> +        /*
> +         * VMware changes the other (non eax) registers ignoring dir
> +         * (IN vs OUT).  It also changes only the 32-bit part
> +         * leaving the high 32-bits unchanged, unlike what one would
> +         * expect to happen.
> +         */
> +        switch ( regs->_ecx & 0xffff )
> +        {
> +        case BDOOR_CMD_GETMHZ:
> +            new_eax = currd->arch.tsc_khz / 1000;
> +            break;

Newlines after break statements please.

> +        case BDOOR_CMD_GETVERSION:
> +            /* MAGIC */
> +            regs->_ebx = BDOOR_MAGIC;
> +            /* VERSION_MAGIC */
> +            new_eax = 6;
> +            /* Claim we are an ESX. VMX_TYPE_SCALABLE_SERVER */
> +            regs->_ecx = 2;
> +            break;
> +        case BDOOR_CMD_GETHWVERSION:
> +            /* vmware_hw */
> +            new_eax = currd->arch.hvm_domain.vmware_hwver;
> +            /*
> +             * Returning zero is not the best.  VMware was not at
> +             * all consistent in the handling of this command until
> +             * VMware hardware version 4.  So it is better to claim
> +             * 4 then 0.  This should only happen in strange configs.
> +             */
> +            if ( !new_eax )
> +                new_eax = 4;
> +            break;
> +        case BDOOR_CMD_GETHZ:
> +        {
> +            struct segment_register sreg;
> +
> +            hvm_get_segment_register(curr, x86_seg_ss, &sreg);
> +            if ( sreg.attr.fields.dpl == 0 )
> +            {
> +                value = currd->arch.tsc_khz * 1000;
> +                /* apic-frequency (bus speed) */
> +                regs->_ecx = 1000000000ULL / APIC_BUS_CYCLE_NS;
> +                /* High part of tsc-frequency */
> +                regs->_ebx = value >> 32;
> +                /* Low part of tsc-frequency */
> +                new_eax = value;
> +            }
> +            break;
> +        }
> +        case BDOOR_CMD_GETTIME:
> +            value = get_localtime_us(currd) -
> +                currd->time_offset_seconds * 1000000ULL;
> +            /* hostUsecs */
> +            regs->_ebx = value % 1000000UL;
> +            /* hostSecs */
> +            new_eax = value / 1000000ULL;
> +            /* maxTimeLag */
> +            regs->_ecx = 1000000;
> +            /* offset to GMT in minutes */
> +            regs->_edx = currd->time_offset_seconds / 60;
> +            break;
> +        case BDOOR_CMD_GETTIMEFULL:
> +            /* BDOOR_MAGIC */
> +            new_eax = BDOOR_MAGIC;
> +            value = get_localtime_us(currd) -
> +                currd->time_offset_seconds * 1000000ULL;
> +            /* hostUsecs */
> +            regs->_ebx = value % 1000000UL;
> +            /* hostSecs low 32 bits */
> +            regs->_edx = value / 1000000ULL;
> +            /* hostSecs high 32 bits */
> +            regs->_esi = (value / 1000000ULL) >> 32;
> +            /* maxTimeLag */
> +            regs->_ecx = 1000000;
> +            break;
> +        default:
> +            /* Let backing DM handle */
> +            return X86EMUL_UNHANDLEABLE;
> +        }
> +        if ( dir == IOREQ_READ )
> +            *val = new_eax;
> +    }
> +    else if ( dir == IOREQ_READ )
> +        *val = ~0u;
> +
> +    return X86EMUL_OKAY;
> +}
> +
> +void vmport_register(struct domain *d)
> +{
> +    register_portio_handler(d, BDOOR_PORT, 4, vmport_ioport);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-set-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/include/asm-x86/hvm/domain.h 
> b/xen/include/asm-x86/hvm/domain.h
> index 1cb0311..d3166e4 100644
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -124,6 +124,9 @@ struct hvm_domain {
>      spinlock_t             uc_lock;
>      bool_t                 is_in_uc_mode;
>  
> +    /* VMware backdoor port available */
> +    bool_t                 is_vmware_port_enabled;
> +
>      /* Pass-through */
>      struct hvm_iommu       hvm_iommu;
>  
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> index 2965fbb..e76f612 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -522,6 +522,8 @@ extern bool_t opt_hvm_fep;
>  #define opt_hvm_fep 0
>  #endif
>  
> +void vmport_register(struct domain *d);
> +
>  #endif /* __ASM_X86_HVM_HVM_H__ */
>  
>  /*
> diff --git a/xen/include/public/arch-x86/xen.h 
> b/xen/include/public/arch-x86/xen.h
> index 5a5bad6..ccc5f1f 100644
> --- a/xen/include/public/arch-x86/xen.h
> +++ b/xen/include/public/arch-x86/xen.h
> @@ -262,8 +262,12 @@ struct arch_shared_info {
>  };
>  typedef struct arch_shared_info arch_shared_info_t;
>  
> +/* Enable use of vmware backdoor port. */
> +#define XEN_DOMCTL_CONFIG_VMWARE_PORT_BIT   0
> +#define XEN_DOMCTL_CONFIG_VMWARE_PORT_MASK  (1U << 
> XEN_DOMCTL_CONFIG_VMWARE_PORT_BIT)
>  struct xen_arch_domainconfig {
>      uint64_t vmware_hwver;
> +    uint64_t arch_flags;
>  };
>  
>  #endif /* !__ASSEMBLY__ */

Otherwise, Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>  I
have not paid any attention to the implementation of the backdoor port
handler, but everything else seems ok.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.