[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 2/2] x86/dom0: attempt to fixup p2m page-faults for PVH dom0
On 14.02.2025 10:29, Roger Pau Monne wrote: > --- a/docs/misc/xen-command-line.pandoc > +++ b/docs/misc/xen-command-line.pandoc > @@ -822,7 +822,8 @@ Specify the bit width of the DMA heap. > > ### dom0 > = List of [ pv | pvh, shadow=<bool>, verbose=<bool>, > - cpuid-faulting=<bool>, msr-relaxed=<bool> ] (x86) > + cpuid-faulting=<bool>, msr-relaxed=<bool>, > + pf-fixup=<bool> ] (x86) > > = List of [ sve=<integer> ] (Arm64) > > @@ -883,6 +884,19 @@ Controls for how dom0 is constructed on x86 systems. > > If using this option is necessary to fix an issue, please report a bug. > > +* The `pf-fixup` boolean is only applicable when using a PVH dom0 and > + defaults to false. > + > + When running dom0 in PVH mode the dom0 kernel has no way to map MMIO > + regions into the p2m, such mode relies on Xen dom0 builder populating > + the p2m with all MMIO regions that dom0 should access. However Xen > + doesn't have a complete picture of the host memory map, due to not > + being able to process ACPI dynamic tables. > + > + The `pf-fixup` option allows Xen to attempt to add missing MMIO regions > + to the p2m in response to page-faults generated by dom0 trying to access > + unpopulated entries in the p2m. I wonder if this is to implementation focused for a command line option doc. In particular the multiple uses of "p2m" are standing out in this regard. > --- a/xen/arch/x86/dom0_build.c > +++ b/xen/arch/x86/dom0_build.c > @@ -286,6 +286,10 @@ int __init parse_arch_dom0_param(const char *s, const > char *e) > opt_dom0_cpuid_faulting = val; > else if ( (val = parse_boolean("msr-relaxed", s, e)) >= 0 ) > opt_dom0_msr_relaxed = val; > +#ifdef CONFIG_HVM > + else if ( (val = parse_boolean("pf-fixup", s, e)) >= 0 ) > + opt_dom0_pf_fixup = val; > +#endif > else > return -EINVAL; I fear the scope of these sub-options is getting increasingly confusing. opt_dom0_msr_relaxed is what its name says - specific to Dom0. opt_dom0_cpuid_faulting, otoh, is a control domain option (i.e. also applicable to a [hypothetical?] late ctrldom). Now you add an option that's applicable to the hardware domain, i.e. also coverting late-hwdom. > --- a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -338,8 +338,38 @@ static int hvmemul_do_io( > if ( !s ) > { > if ( is_mmio && is_hardware_domain(currd) ) > - gdprintk(XENLOG_DEBUG, "unhandled memory %s to %#lx size > %u\n", > - dir ? "read" : "write", addr, size); > + { > + /* > + * PVH dom0 is likely missing MMIO mappings on the p2m, due > to > + * the incomplete information Xen has about the memory > layout. > + * > + * Either print a message to note dom0 attempted to access an > + * unpopulated GPA, or try to fixup the p2m by creating an > + * identity mapping for the faulting GPA. > + */ > + if ( opt_dom0_pf_fixup ) > + { > + int inner_rc = hvm_hwdom_fixup_p2m(addr); Why not use rc, as we do elsewhere in the function? > --- a/xen/arch/x86/hvm/hvm.c > +++ b/xen/arch/x86/hvm/hvm.c > @@ -13,6 +13,7 @@ > #include <xen/lib.h> > #include <xen/trace.h> > #include <xen/sched.h> > +#include <xen/iocap.h> > #include <xen/irq.h> > #include <xen/softirq.h> > #include <xen/domain.h> > @@ -5458,6 +5459,36 @@ int hvm_copy_context_and_params(struct domain *dst, > struct domain *src) > return rc; > } > > +bool __ro_after_init opt_dom0_pf_fixup; > +int hvm_hwdom_fixup_p2m(paddr_t addr) The placement here looks odd to me. Why not as static function in emulate.c? Or alternatively why not as p2m_hwdom_fixup() in mm/p2m.c? > +{ > + unsigned long gfn = paddr_to_pfn(addr); > + struct domain *currd = current->domain; > + p2m_type_t type; > + mfn_t mfn; > + int rc; > + > + ASSERT(is_hardware_domain(currd)); > + ASSERT(!altp2m_active(currd)); > + > + /* > + * Fixups are only applied for MMIO holes, and rely on the hardware > domain > + * having identity mappings for non RAM regions (gfn == mfn). > + */ > + if ( !iomem_access_permitted(currd, gfn, gfn) || > + !is_memory_hole(_mfn(gfn), _mfn(gfn)) ) > + return -EPERM; > + > + mfn = get_gfn(currd, gfn, &type); > + if ( !mfn_eq(mfn, INVALID_MFN) || !p2m_is_hole(type) ) > + rc = mfn_eq(mfn, _mfn(gfn)) ? 0 : -EEXIST; I understand this is to cover the case where two vCPU-s access the same GFN at about the same time. However, the "success" log message at the call site being debug-only means we may be silently hiding bugs in release builds, if e.g. we get here despite the GFN having had an identity mapping already for ages. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |