
Re: [PATCH v3] x86/PoD: tie together P2M update and increment of entry count



On Wed, Mar 13, 2024 at 12:19 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
>
> On 13.03.2024 11:58, George Dunlap wrote:
> > On Tue, Mar 12, 2024 at 3:22 PM Jan Beulich <jbeulich@xxxxxxxx> wrote:
> >>
> >> When the PoD lock is not held across the entire region covering both the
> >> P2M update and the stats update, the entry count - if it is to be
> >> incorrect at all - should err towards being too large rather than too
> >> small, to avoid functions bailing early when they find the count is zero.
> >> However, instead of moving the increment ahead (and adjusting it back upon
> >> failure), extend the PoD-locked region.
> >>
> >> Fixes: 99af3cd40b6e ("x86/mm: Rework locking in the PoD layer")
> >> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> >> ---
> >> The locked region could be shrunk again by using multiple unlock calls.
> >> But I think both ioreq_request_mapcache_invalidate() and domain_crash()
> >> are reasonable to call with the lock still held?
> >> ---
> >> v3: Extend locked region instead. Add Fixes: tag.
> >> v2: Add comments.
> >>
> >> --- a/xen/arch/x86/mm/p2m-pod.c
> >> +++ b/xen/arch/x86/mm/p2m-pod.c
> >> @@ -1348,16 +1348,22 @@ mark_populate_on_demand(struct domain *d
> >>          }
> >>      }
> >>
> >> +    /*
> >> +     * P2M update and stats increment need to collectively be under PoD lock,
> >> +     * to prevent code elsewhere observing PoD entry count being zero despite
> >> +     * there actually still being PoD entries (created by the p2m_set_entry()
> >> +     * invocation below).
> >> +     */
> >> +    pod_lock(p2m);
> >> +
> >>      /* Now, actually do the two-way mapping */
> >>      rc = p2m_set_entry(p2m, gfn, INVALID_MFN, order,
> >>                         p2m_populate_on_demand, p2m->default_access);
> >>      if ( rc == 0 )
> >>      {
> >> -        pod_lock(p2m);
> >>          p2m->pod.entry_count += 1UL << order;
> >>          p2m->pod.entry_count -= pod_count;
> >>          BUG_ON(p2m->pod.entry_count < 0);
> >> -        pod_unlock(p2m);
> >>
> >>          ioreq_request_mapcache_invalidate(d);
> >>      }
> >> @@ -1373,6 +1379,8 @@ mark_populate_on_demand(struct domain *d
> >>          domain_crash(d);
> >>      }
> >>
> >> +    pod_unlock(p2m);
> >
> > We're confident that neither domain_crash() nor
> > ioreq_request_mapcache_invalidate() will grab any of the p2m locks?
>
> There's no doubt about ioreq_request_mapcache_invalidate(). domain_crash(),
> otoh, invokes show_execution_state(), which in principle we would like to
> have dump the guest stack among other things. My patch doing so was
> reverted, so right now there's no issue there. Plus any attempt to do so
> would need to be careful about locks anyway. But as you can see it is not a
> clear-cut no, so ...
>
> > If so,
> >
> > Reviewed-by: George Dunlap <george.dunlap@xxxxxxxxx>
>
> ... rather than taking this (thanks), maybe I should indeed follow the
> alternative outlined in the post-commit-message remark?

I keep missing your post-commit-message remarks due to the way I'm
applying your series.  Yes, that had occurred to me as well -- I don't
think this is a hot path, and I do think it would be good to avoid
laying a trap for future people wanting to change domain_crash(); in
particular as taking a p2m lock there would change a domain crash into
either a host crash or a potential deadlock.

I think I would go with multiple if statements rather than multiple
unlock calls, though.
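
Roughly the shape I have in mind (just a sketch, untested; identifiers as
in the quoted hunk, with the original error-path details elided):

    pod_lock(p2m);

    /* Now, actually do the two-way mapping */
    rc = p2m_set_entry(p2m, gfn, INVALID_MFN, order,
                       p2m_populate_on_demand, p2m->default_access);
    if ( rc == 0 )
    {
        p2m->pod.entry_count += 1UL << order;
        p2m->pod.entry_count -= pod_count;
        BUG_ON(p2m->pod.entry_count < 0);
    }

    /* Drop the lock before calling out of the PoD code. */
    pod_unlock(p2m);

    if ( rc == 0 )
        ioreq_request_mapcache_invalidate(d);
    else
    {
        /* Original error handling here (elided), ending in: */
        domain_crash(d);
    }

That keeps the P2M update and the stats adjustment together under the PoD
lock, while neither ioreq_request_mapcache_invalidate() nor domain_crash()
gets called with it still held.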

 -George
