Re: [PATCH v3 1/2] IOMMU/x86: disallow device assignment to PoD guests
On Thu, Feb 03, 2022 at 10:21:54AM +0100, Jan Beulich wrote:
> On 03.02.2022 10:04, Roger Pau Monné wrote:
> > On Thu, Feb 03, 2022 at 09:31:03AM +0100, Jan Beulich wrote:
> >> On 02.02.2022 17:13, Roger Pau Monné wrote:
> >>> On Tue, Jan 04, 2022 at 10:41:32AM +0100, Jan Beulich wrote:
> >>>> @@ -359,7 +360,10 @@ p2m_pod_set_mem_target(struct domain *d,
> >>>>
> >>>> ASSERT( pod_target >= p2m->pod.count );
> >>>>
> >>>> - ret = p2m_pod_set_cache_target(p2m, pod_target, 1/*preemptible*/);
> >>>> + if ( has_arch_pdevs(d) || cache_flush_permitted(d) )
> >>>
> >>> Is it possible to have cache flush allowed without any PCI device
> >>> assigned? AFAICT the iomem/ioport_caps would only get set up when there
> >>> are devices passed through?
> >>
> >> One can assign MMIO or ports to a guest the raw way. That's not
> >> secure, but functionally explicitly permitted.
> >>
> >>> TBH I would be fine if we just say that PoD cannot be used in
> >>> conjunction with an IOMMU, and just check for is_iommu_enabled(d) here.
> >>>
> >>> I understand it's technically possible for PoD to be used together
> >>> with a domain that will later get a device passed through once PoD is
> >>> no longer in use, but I doubt there's much value in supporting that
> >>> use case, and I fear we might be introducing corner cases that could
> >>> create issues in the future. Overall I think it would be safer to just
> >>> disable PoD in conjunction with an IOMMU.
> >>
> >> I consider it wrong to put in place such a restriction, but I could
> >> perhaps accept you and Andrew thinking this way if this was the only
> >> aspect playing into here. However, this would then want an equivalent
> >> tools side check, and while hunting down where to make the change as
> >> done here, I wasn't able to figure out where that alternative
> >> adjustment would need doing. Hence I would possibly(!) buy into this
> >> only if someone else took care of doing so properly in the tool stack
> >> (including the emission of a sensible error message).
> >
> > What about the (completely untested) chunk below:
> >
> > diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
> > index d7a40d7550..e585ef4c5c 100644
> > --- a/tools/libs/light/libxl_create.c
> > +++ b/tools/libs/light/libxl_create.c
> > @@ -1160,17 +1160,16 @@ int libxl__domain_config_setdefault(libxl__gc *gc,
> > pod_enabled = (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV) &&
> > (d_config->b_info.target_memkb < d_config->b_info.max_memkb);
> >
> > - /* We cannot have PoD and PCI device assignment at the same time
> > + /* We cannot have PoD and an active IOMMU at the same time
> > * for HVM guest. It was reported that IOMMU cannot work with PoD
> > * enabled because it needs to populated entire page table for
> > - * guest. To stay on the safe side, we disable PCI device
> > - * assignment when PoD is enabled.
> > + * guest.
> > */
> > if (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV &&
> > - d_config->num_pcidevs && pod_enabled) {
> > + d_config->c_info.passthrough != LIBXL_PASSTHROUGH_DISABLED &&
> > + pod_enabled) {
> > ret = ERROR_INVAL;
> > -        LOGD(ERROR, domid,
> > -             "PCI device assignment for HVM guest failed due to PoD enabled");
> > + LOGD(ERROR, domid, "IOMMU cannot be enabled together with PoD");
> > goto error_out;
> > }
>
> Perhaps. Seeing this I actually recall coming across this check during
> my investigation. Not changing it along the lines of what you do was
> then really more because of me not being convinced of the extra
> restriction; I clearly misremembered when writing the earlier reply.
> If we were to do what you suggest, I'd like to ask that the comment be
> changed differently, though: "We cannot ..." then isn't really true
> anymore. We choose not to permit this mode; "cannot" only applies to
> actual device assignment (and of course only as long as there aren't
> restartable IOMMU faults).
I'm fine with an adjusted wording here. This was mostly a placement
suggestion, but I didn't give much thought to the error message.
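
Just to make the suggestion concrete, the check could end up looking
something like the below (still completely untested, and the comment and
message wording is only a first attempt along the lines of your remark
about "cannot" vs "choose not to"):

    /*
     * We choose not to allow PoD together with an enabled IOMMU for HVM
     * guests: actual device assignment cannot work with PoD (as long as
     * there are no restartable IOMMU faults), since the whole p2m would
     * need populating up front.
     */
    if (d_config->c_info.type != LIBXL_DOMAIN_TYPE_PV &&
        d_config->c_info.passthrough != LIBXL_PASSTHROUGH_DISABLED &&
        pod_enabled) {
        ret = ERROR_INVAL;
        LOGD(ERROR, domid,
             "IOMMU not supported together with PoD for HVM guests");
        goto error_out;
    }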
> >> Finally this still leaves out the "raw MMIO / ports" case mentioned
> >> above.
> >
> > But the raw MMIO 'mode' doesn't care much about PoD, because if
> > there's no PCI device assigned there's no IOMMU setup, and thus such
> > raw MMIO regions (could?) belong to a device that's not constrained by
> > the guest p2m anyway?
>
> Hmm, yes, true.
>
> >>>> --- a/xen/common/vm_event.c
> >>>> +++ b/xen/common/vm_event.c
> >>>> @@ -639,7 +639,7 @@ int vm_event_domctl(struct domain *d, st
> >>>>
> >>>> rc = -EXDEV;
> >>>> /* Disallow paging in a PoD guest */
> >>>> - if ( p2m_pod_entry_count(p2m_get_hostp2m(d)) )
> >>>> + if ( p2m_pod_active(d) )
> >>>
> >>> Isn't it fine to just check for entry_count like you suggest in the
> >>> change to libxl?
> >>
> >> I didn't think it would be, but I'm not entirely sure: If paging was
> >> enabled before a guest actually starts, it wouldn't have any entries
> >> but still be a PoD guest if it has a non-empty cache. The VM event
> >> folks may be able to clarify this either way. But ...
> >>
> >>> This is what p2m_pod_entry_count actually does (rather than entry_count |
> >>> count).
> >>
> >> ... you really mean "did" here, as I'm removing p2m_pod_entry_count()
> >> in this patch. Of course locking could be added to it instead (or
> >> p2m_pod_get_mem_target() be used in its place), but I'd prefer if we
> >> could go with just the check which precisely matches what the comment
> >> says (IOW otherwise I'd need to additionally know what exactly the
> >> comment is to say).
> >
> > Could you briefly mention this in the commit message? Ie: VM event
> > code is also adjusted to make sure PoD is not in use and cannot be
> > used during the guest lifetime?
>
> I've added
>
> "Nor was the use of that function in line with the immediately preceding
> comment: A PoD guest isn't just one with a non-zero entry count, but
> also one with a non-empty cache (e.g. prior to actually launching the
> guest)."
>
> to the already existing paragraph about the removal of that function.
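FWIW, from that description my understanding of the new helper is roughly
the below, just a sketch to confirm my reading (field and lock names taken
from the existing PoD code; the actual implementation is the one in the
patch):

    static inline bool p2m_pod_active(const struct domain *d)
    {
        struct p2m_domain *p2m = p2m_get_hostp2m(d);
        bool res;

        /*
         * A PoD guest is one with outstanding PoD entries or a non-empty
         * cache (e.g. prior to actually launching the guest).
         */
        pod_lock(p2m);
        res = p2m->pod.entry_count || p2m->pod.count;
        pod_unlock(p2m);

        return res;
    }
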
Thanks, Roger.