Xen project Mailing List

Re: [Xen-devel] [PATCH] x86/nested-hap: Fix handling of L0_ERROR

To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Tue, 19 Nov 2019 12:13:56 +0100

Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Juergen Gross <jgross@xxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>

Delivery-date: Tue, 19 Nov 2019 11:14:20 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 18.11.2019 19:15, Andrew Cooper wrote: > When nestedhvm_hap_nested_page_fault() returns L0_ERROR, > hvm_hap_nested_page_fault() operates on the adjusted gpa. However, it > operates with the original npfec, which is no longer be correct. Nit: Perhaps "may" instead of "is"? > In particular, it is possible to get a nested fault where the translation is > not present in L12 (and therefore L02), while it is present in L01. I'm afraid I don't see the connection to the issue at hand, where we have a page present in both L01 and L12, just not in L02. And there's also no L0_ERROR here - both the initial (propagation) and the subsequent (live-locking) exits report DONE according to what I thought was the outcome of yesterday's discussion on irc. I take it you imply that L0_ERROR would need raising (as per the auxiliary code fragment adding the "(access_x && *page_order)" check), but I wonder whether that would really be correct. This depends on what L0_ERROR really is supposed to mean: An error because of actual L0 settings (x=0 in our case), or an error because of intended L0 settings (x=1 in our case). After all a violation of just the p2m_access (which also affects r/w/x) doesn't get reported by nestedhap_walk_L0_p2m() as L0_ERROR either (and hence would, as it seems to me, lead to a similar live lock). Therefore I wonder whether your initial idea of doing the shattering right here wouldn't be the better course of action. nestedhap_fix_p2m() could either install the large page and then shatter it right away, or it could install just the individual small page. Together with the different npfec adjustment model suggested below (leading to npfec.present to also get updated in the DONE case) a similar "insn-fetch && present" conditional (to that introduced for XSA-304) could then be used there. Even better - by making the violation checking around the original XSA-304 addition a function (together with the 304 addition), such a function might then be reusable here. This might then address the p2m_access related live lock as well. > When handling an L0_ERROR, adjust npfec as well as gpa. The gpa adjustment referred to here is not in nestedhap_walk_L0_p2m() but in nestedhvm_hap_nested_page_fault(), if I'm not mistaken? > @@ -181,6 +180,18 @@ nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t > L1_gpa, paddr_t *L0_gpa, > *L0_gpa = (mfn_x(mfn) << PAGE_SHIFT) + (L1_gpa & ~PAGE_MASK); > out: > __put_gfn(p2m, L1_gpa >> PAGE_SHIFT); > + > + /* > + * When reporting L0_ERROR, rewrite nfpec to match what would have > occured > + * if hardware had walked the L0, rather than the combined L02. > + */ > + if ( rc == NESTEDHVM_PAGEFAULT_L0_ERROR ) > + { > + npfec->present = !mfn_eq(mfn, INVALID_MFN); To be in line with the conditional a few lines up from here, wouldn't this better be !mfn_valid(mfn)? Should there ever be a case to clear the flag when it was set? If a mapping has gone away between the time the exit condition was detected and the time we re-evaluate things here, I think it should still report "present" back to the caller. Taking both remarks together I'm thinking of if ( mfn_valid(mfn) ) npfec->present = 1; > + npfec->gla_valid = 0; For this, one the question is whose linear address is meant here. If it's L2's, then it shouldn't be cleared. If it's L1's, then it would seem to me that it should have been avoided to set the field, or at least it should have been cleared the moment we're past L12 handling. And then there is the question of overall flow here. On the basis of npfec not being of any interest anymore to the caller's caller if reporting back DONE (but as per far above it might help our immediate caller) I wonder whether static int nestedhap_walk_L0_p2m(struct p2m_domain *p2m, paddr_t L1_gpa, paddr_t *L0_gpa, p2m_type_t *p2mt, p2m_access_t *p2ma, unsigned int *page_order, struct npfec *npfec) { mfn_t mfn; int rc; /* walk L0 P2M table */ mfn = get_gfn_type_access(p2m, L1_gpa >> PAGE_SHIFT, p2mt, p2ma, 0, page_order); rc = NESTEDHVM_PAGEFAULT_DIRECT_MMIO; if ( *p2mt == p2m_mmio_direct ) goto direct_mmio_out; rc = NESTEDHVM_PAGEFAULT_MMIO; if ( *p2mt == p2m_mmio_dm ) goto out; rc = NESTEDHVM_PAGEFAULT_L0_ERROR; /* * When reporting L0_ERROR, rewrite nfpec to match what would have occurred * if hardware had walked the L0, rather than the combined L02. */ npfec->gla_valid = 0; npfec->kind = npfec_kind_unknown; if ( !mfn_valid(mfn) ) goto out; npfec->present = 1; if ( npfec->write_access && p2m_is_readonly(*p2mt) ) goto out; if ( p2m_is_paging(*p2mt) || p2m_is_shared(*p2mt) || !p2m_is_ram(*p2mt) ) goto out; rc = NESTEDHVM_PAGEFAULT_DONE; direct_mmio_out: *L0_gpa = (mfn_x(mfn) << PAGE_SHIFT) + (L1_gpa & ~PAGE_MASK); out: __put_gfn(p2m, L1_gpa >> PAGE_SHIFT); return rc; } wouldn't be preferable. Plus I notice that neither your nor my variant take care of the NESTEDHVM_PAGEFAULT_DIRECT_MMIO case (where "present" would also want to become set afaict, and I guess the other two npfec adjustments would also be applicable). Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.