[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 03/11] nEPT: Implement guest ept's walker



At 01:57 +0800 on 11 Dec (1355191035), xiantao.zhang@xxxxxxxxx wrote:
> From: Zhang Xiantao <xiantao.zhang@xxxxxxxxx>
> 
> Implment guest EPT PT walker, some logic is based on shadow's
> ia32e PT walker. During the PT walking, if the target pages are
> not in memory, use RETRY mechanism and get a chance to let the
> target page back.
> 
> Signed-off-by: Zhang Xiantao <xiantao.zhang@xxxxxxxxx>

The design looks pretty good.  A few comments below on code details --
I think the only big one is that the ept walker shouldn't force eptes
into 'normal' pte types just so it can reuse the walk_t struct.

> @@ -88,10 +88,11 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, 
> int set_dirty)
>  
>  /* If the map is non-NULL, we leave this function having 
>   * acquired an extra ref on mfn_to_page(*mfn) */
> -static inline void *map_domain_gfn(struct p2m_domain *p2m,
> +void *map_domain_gfn(struct p2m_domain *p2m,
>                                     gfn_t gfn, 
>                                     mfn_t *mfn,
>                                     p2m_type_t *p2mt,
> +                                   p2m_query_t *q,

I think this should just be a plain p2m_query_t and not a pointer to
one; the code below only dereferences the pointer to read it.  That will
save you having a variable just to hold 'P2M_ALLOC | P2M_UNSHARE' in a
few places below.

> --- /dev/null
> +++ b/xen/arch/x86/mm/hap/nested_ept.c
> +/* For EPT's walker reserved bits and EMT check  */
> +#define EPT_MUST_RSV_BITS (((1ull << PADDR_BITS) -1) & \
> +                     ~((1ull << paddr_bits) - 1))
> +
> +
> +#define EPT_EMT_WB  6
> +#define EPT_EMT_UC  0

These two definitions should be in vmx.h along with the other
architectural constants for EPTEs.

> +
> +#define NEPT_VPID_CAP_BITS 0
> +
> +#define NEPT_1G_ENTRY_FLAG (1 << 11)
> +#define NEPT_2M_ENTRY_FLAG (1 << 10)
> +#define NEPT_4K_ENTRY_FLAG (1 << 9)
> +
> +/* Always expose 1G and 2M capability to guest, 
> + so don't need additional check */
> +bool_t nept_sp_entry(uint64_t entry)
> +{
> +    return !!(entry & EPTE_SUPER_PAGE_MASK);
> +}
> +
> +static bool_t nept_rsv_bits_check(uint64_t entry, uint32_t level)
> +{
> +    uint64_t rsv_bits = EPT_MUST_RSV_BITS;
> +
> +    switch ( level ){
> +        case 1:
> +            break;
> +        case 2 ... 3:
> +            if (nept_sp_entry(entry))
> +                rsv_bits |=  ((1ull << (9 * (level -1 ))) -1) << PAGE_SHIFT;
> +            else
> +                rsv_bits |= 0xfull << 3;

Please use EPTE_EMT_MASK rather than open-coding it.

> +            break;
> +        case 4:
> +        rsv_bits |= 0xf8;

Again, please use EPTE_EMT_MASK | EPTE_IGMT_MASK | EPTE_SUPER_PAGE_MASK.

> +        break;
> +        default:
> +            printk("Unsupported EPT paging level: %d\n", level);
> +    }
> +    if ( ((entry & rsv_bits) ^ rsv_bits) == rsv_bits )
> +        return 0;

This XOR is useful in the normal walker because we care about _which_
bits are wrong.  Here, you can just return !(entry & rsv_bits) for the
same result.

> +    return 1;
> +}
> +
> +/* EMT checking*/
> +static bool_t nept_emt_bits_check(uint64_t entry, uint32_t level)
> +{
> +    ept_entry_t e;
> +    e.epte = entry;
> +    if ( e.sp || level == 1 ) {
> +        if ( e.emt == 2 || e.emt == 3 || e.emt == 7 )
> +            return 1;

Please define more of the EPT_EMT_* constants for these values and use
them.

> +    }
> +    return 0;
> +}
> +
> +static bool_t nept_rwx_bits_check(uint64_t entry) {
> +    /*write only or write/execute only*/
> +    uint8_t rwx_bits = entry & 0x7;
> +
> +    if ( rwx_bits == 2 || rwx_bits == 6)
> +        return 1;
> +    if ( rwx_bits == 4 && !(NEPT_VPID_CAP_BITS &
> +                        VMX_EPT_EXEC_ONLY_SUPPORTED))

Please pass the entry as an ept_entry_t and check the named r, w and x
fields rather than using magic numbers. 

> +        return 1;
> +    return 0;
> +}
> +
> +/* nept's misconfiguration check */
> +static bool_t nept_misconfiguration_check(uint64_t entry, uint32_t level)
> +{
> +    return (nept_rsv_bits_check(entry, level) ||
> +                nept_emt_bits_check(entry, level) ||
> +                nept_rwx_bits_check(entry));
> +}
> +
> +static bool_t nept_present_check(uint64_t entry)
> +{
> +    if (entry & 0x7)

Again, please pass an ept_entry_t and check the r/w/x fields. 

> +        return 1;
> +    return 0;
> +}
> +
> +uint64_t nept_get_ept_vpid_cap(void)
> +{
> +    /*TODO: exposed ept and vpid features*/

This TODO comment doesn't get removed later in the series.  Is returning
0 here always OK?

> +    return NEPT_VPID_CAP_BITS;
> +}
> +
> +static uint32_t
> +nept_walk_tables(struct vcpu *v, unsigned long l2ga, walk_t *gw)
> +{
> +    p2m_type_t p2mt;
> +    uint32_t rc = 0, ret = 0, gflags;
> +    struct domain *d = v->domain;
> +    struct p2m_domain *p2m = d->arch.p2m;
> +    gfn_t base_gfn = _gfn(nhvm_vcpu_p2m_base(v) >> PAGE_SHIFT);
> +    p2m_query_t qt = P2M_ALLOC;
> +
> +    guest_l1e_t *l1p = NULL;
> +    guest_l2e_t *l2p = NULL;
> +    guest_l3e_t *l3p = NULL;
> +    guest_l4e_t *l4p = NULL;

These aren't realy guest_l*es, so I think you should use ept_entry_t *
to point to them.  While you're at it, why not define an equivalent
ept_walk_t struct that uses the ept-specific types instead of putting
EPT entries in a normal walk_t?

Also, unlike the normal guest walker, you don't need to hold these maps
open for writing A/D bits, so you could just use a single pointer and
unmap as you go.

> +    sp = nept_sp_entry(gw->l3e.l3);
> +    /* Super 1G entry */
> +    if ( sp )
> +    {
> +        /* Generate a fake l1 table entry so callers don't all 
> +         * have to understand superpages. */

You only have one caller for this function, and it does understand
superpages -- it explicitly check for them.  So I think you can avoid
this part altogether (likewise for 2M superpages) and just DTRT in the
caller.

Cheers,

Tim.

> +        gfn_t start = guest_l3e_get_gfn(gw->l3e);
> +
> +        /* Increment the pfn by the right number of 4k pages. */
> +        start = _gfn((gfn_x(start) & ~GUEST_L3_GFN_MASK) +
> +                     ((l2ga >> PAGE_SHIFT) & GUEST_L3_GFN_MASK));
> +        gflags = (gw->l3e.l3 & 0x7f) | NEPT_1G_ENTRY_FLAG;
> +        gw->l1e = guest_l1e_from_gfn(start, gflags);
> +        gw->l2mfn = gw->l1mfn = _mfn(INVALID_MFN);
> +        goto done;
> +    }
> +

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.