[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 2/2] x86/dom0: attempt to fixup p2m page-faults for PVH dom0
On 14.02.2025 13:38, Roger Pau Monné wrote: > On Fri, Feb 14, 2025 at 12:53:01PM +0100, Jan Beulich wrote: >> On 14.02.2025 10:29, Roger Pau Monne wrote: >>> --- a/xen/arch/x86/hvm/emulate.c >>> +++ b/xen/arch/x86/hvm/emulate.c >>> @@ -338,8 +338,38 @@ static int hvmemul_do_io( >>> if ( !s ) >>> { >>> if ( is_mmio && is_hardware_domain(currd) ) >>> - gdprintk(XENLOG_DEBUG, "unhandled memory %s to %#lx size >>> %u\n", >>> - dir ? "read" : "write", addr, size); >>> + { >>> + /* >>> + * PVH dom0 is likely missing MMIO mappings on the p2m, >>> due to >>> + * the incomplete information Xen has about the memory >>> layout. >>> + * >>> + * Either print a message to note dom0 attempted to access >>> an >>> + * unpopulated GPA, or try to fixup the p2m by creating an >>> + * identity mapping for the faulting GPA. >>> + */ >>> + if ( opt_dom0_pf_fixup ) >>> + { >>> + int inner_rc = hvm_hwdom_fixup_p2m(addr); >> >> Why not use rc, as we do elsewhere in the function? > > hvm_hwdom_fixup_p2m() returns an errno, while rc in this context > contains X86EMUL_ values. I could indeed re-use rc, it just felt > wrong to mix different error address spaces on the same variable. Hmm, yes, I see. >>> --- a/xen/arch/x86/hvm/hvm.c >>> +++ b/xen/arch/x86/hvm/hvm.c >>> @@ -13,6 +13,7 @@ >>> #include <xen/lib.h> >>> #include <xen/trace.h> >>> #include <xen/sched.h> >>> +#include <xen/iocap.h> >>> #include <xen/irq.h> >>> #include <xen/softirq.h> >>> #include <xen/domain.h> >>> @@ -5458,6 +5459,36 @@ int hvm_copy_context_and_params(struct domain *dst, >>> struct domain *src) >>> return rc; >>> } >>> >>> +bool __ro_after_init opt_dom0_pf_fixup; >>> +int hvm_hwdom_fixup_p2m(paddr_t addr) >> >> The placement here looks odd to me. Why not as static function in emulate.c? >> Or alternatively why not as p2m_hwdom_fixup() in mm/p2m.c? > > I don't have a strong opinion, if you are fine with it a static > function in emulate.c might be the best then. I'd be fine with either of the suggested options. mm/p2m.c is perhaps the more logical home for such a function, yet the option of having it static is quite appealing, too. Hence why I came to think of that one first. >>> +{ >>> + unsigned long gfn = paddr_to_pfn(addr); >>> + struct domain *currd = current->domain; >>> + p2m_type_t type; >>> + mfn_t mfn; >>> + int rc; >>> + >>> + ASSERT(is_hardware_domain(currd)); >>> + ASSERT(!altp2m_active(currd)); >>> + >>> + /* >>> + * Fixups are only applied for MMIO holes, and rely on the hardware >>> domain >>> + * having identity mappings for non RAM regions (gfn == mfn). >>> + */ >>> + if ( !iomem_access_permitted(currd, gfn, gfn) || >>> + !is_memory_hole(_mfn(gfn), _mfn(gfn)) ) >>> + return -EPERM; >>> + >>> + mfn = get_gfn(currd, gfn, &type); >>> + if ( !mfn_eq(mfn, INVALID_MFN) || !p2m_is_hole(type) ) >>> + rc = mfn_eq(mfn, _mfn(gfn)) ? 0 : -EEXIST; >> >> I understand this is to cover the case where two vCPU-s access the same GFN >> at about the same time. However, the "success" log message at the call site >> being debug-only means we may be silently hiding bugs in release builds, if >> e.g. we get here despite the GFN having had an identity mapping already for >> ages. > > Possibly, but what would be your suggestion to fix this? I will think > about it, but I can't immediately see a solution that's not simply to > make the message printed by the caller to be gprintk() instead of > gdprintk() so catch such bugs. Would you agree to that? My thinking was that it might be best to propagate a distinguishable error code (perhaps -EEXIST, with its present use then replaced) out of the function, and make the choice of gprintk() vs gdprintk() depend on that. Accompanied by a comment explaining things a little. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |