[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/nested-hap: Fix handling of L0_ERROR



On 18.11.2019 19:15, Andrew Cooper wrote:
> When nestedhvm_hap_nested_page_fault() returns L0_ERROR,
> hvm_hap_nested_page_fault() operates on the adjusted gpa.  However, it
> operates with the original npfec, which is no longer be correct.

Nit: Perhaps "may" instead of "is"?

> In particular, it is possible to get a nested fault where the translation is
> not present in L12 (and therefore L02), while it is present in L01.

I'm afraid I don't see the connection to the issue at hand, where
we have a page present in both L01 and L12, just not in L02. And
there's also no L0_ERROR here - both the initial (propagation) and
the subsequent (live-locking) exits report DONE according to what
I thought was the outcome of yesterday's discussion on irc.

I take it you imply that L0_ERROR would need raising (as per the
auxiliary code fragment adding the "(access_x && *page_order)"
check), but I wonder whether that would really be correct. This
depends on what L0_ERROR really is supposed to mean: An error
because of actual L0 settings (x=0 in our case), or an error
because of intended L0 settings (x=1 in our case). After all a
violation of just the p2m_access (which also affects r/w/x)
doesn't get reported by nestedhap_walk_L0_p2m() as L0_ERROR
either (and hence would, as it seems to me, lead to a similar
live lock).

Therefore I wonder whether your initial idea of doing the
shattering right here wouldn't be the better course of action.
nestedhap_fix_p2m() could either install the large page and then
shatter it right away, or it could install just the individual
small page. Together with the different npfec adjustment model
suggested below (leading to npfec.present to also get updated in
the DONE case) a similar "insn-fetch && present" conditional (to
that introduced for XSA-304) could then be used there.

Even better - by making the violation checking around the
original XSA-304 addition a function (together with the 304
addition), such a function might then be reusable here. This
might then address the p2m_access related live lock as well.

> When handling an L0_ERROR, adjust npfec as well as gpa.

The gpa adjustment referred to here is not in nestedhap_walk_L0_p2m()
but in nestedhvm_hap_nested_page_fault(), if I'm not mistaken?

> @@ -181,6 +180,18 @@ nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t 
> L1_gpa, paddr_t *L0_gpa,
>      *L0_gpa = (mfn_x(mfn) << PAGE_SHIFT) + (L1_gpa & ~PAGE_MASK);
>  out:
>      __put_gfn(p2m, L1_gpa >> PAGE_SHIFT);
> +
> +    /*
> +     * When reporting L0_ERROR, rewrite nfpec to match what would have 
> occured
> +     * if hardware had walked the L0, rather than the combined L02.
> +     */
> +    if ( rc == NESTEDHVM_PAGEFAULT_L0_ERROR )
> +    {
> +        npfec->present = !mfn_eq(mfn, INVALID_MFN);

To be in line with the conditional a few lines up from here,
wouldn't this better be !mfn_valid(mfn)?

Should there ever be a case to clear the flag when it was set? If
a mapping has gone away between the time the exit condition was
detected and the time we re-evaluate things here, I think it
should still report "present" back to the caller. Taking both
remarks together I'm thinking of

        if ( mfn_valid(mfn) )
            npfec->present = 1;

> +        npfec->gla_valid = 0;

For this, one the question is whose linear address is meant here.
If it's L2's, then it shouldn't be cleared. If it's L1's, then
it would seem to me that it should have been avoided to set the
field, or at least it should have been cleared the moment we're
past L12 handling.

And then there is the question of overall flow here. On the basis
of npfec not being of any interest anymore to the caller's caller
if reporting back DONE (but as per far above it might help our
immediate caller) I wonder whether

static int
nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t L1_gpa, paddr_t *L0_gpa,
                      p2m_type_t *p2mt, p2m_access_t *p2ma,
                      unsigned int *page_order, struct npfec *npfec)
{
    mfn_t mfn;
    int rc;

    /* walk L0 P2M table */
    mfn = get_gfn_type_access(p2m, L1_gpa >> PAGE_SHIFT, p2mt, p2ma,
                              0, page_order);

    rc = NESTEDHVM_PAGEFAULT_DIRECT_MMIO;
    if ( *p2mt == p2m_mmio_direct )
        goto direct_mmio_out;
    rc = NESTEDHVM_PAGEFAULT_MMIO;
    if ( *p2mt == p2m_mmio_dm )
        goto out;

    rc = NESTEDHVM_PAGEFAULT_L0_ERROR;
    /*
     * When reporting L0_ERROR, rewrite nfpec to match what would have occurred
     * if hardware had walked the L0, rather than the combined L02.
     */
    npfec->gla_valid = 0;
    npfec->kind = npfec_kind_unknown;

    if ( !mfn_valid(mfn) )
        goto out;

    npfec->present = 1;

    if ( npfec->write_access && p2m_is_readonly(*p2mt) )
        goto out;

    if ( p2m_is_paging(*p2mt) || p2m_is_shared(*p2mt) || !p2m_is_ram(*p2mt) )
        goto out;

    rc = NESTEDHVM_PAGEFAULT_DONE;
 direct_mmio_out:
    *L0_gpa = (mfn_x(mfn) << PAGE_SHIFT) + (L1_gpa & ~PAGE_MASK);
 out:
    __put_gfn(p2m, L1_gpa >> PAGE_SHIFT);
    return rc;
}

wouldn't be preferable.

Plus I notice that neither your nor my variant take care of the
NESTEDHVM_PAGEFAULT_DIRECT_MMIO case (where "present" would also
want to become set afaict, and I guess the other two npfec
adjustments would also be applicable).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.