Re: [PATCH v3 2/3] pirq_cleanup_check() leaks



On Mon, Jul 01, 2024 at 11:47:34AM +0200, Jan Beulich wrote:
> On 01.07.2024 10:55, Roger Pau Monné wrote:
> > On Thu, Jul 27, 2023 at 09:38:29AM +0200, Jan Beulich wrote:
> >> Its original introduction had two issues: For one the "common" part of
> >> the checks (carried out in the macro) was inverted.
> > 
> > Is the current logic in evtchn_close() really malfunctioning?
> 
> First: I'm getting the impression that this entire comment doesn't relate
> to the part of the description above, but to the 2nd paragraph further
> down. Otherwise I'm afraid I may not properly understand your question,
> and hence my response below may not make any sense at all.
> 
> > pirq->evtchn = 0;
> > pirq_cleanup_check(pirq, d1); <- cleanup for PV domains
> > if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) > 0 )
> >     unmap_domain_pirq_emuirq(d1, pirq->pirq); <- cleanup for HVM domains
> > 
> > It would seem to me the pirq_cleanup_check() call just after setting
> > evtchn = 0 was done to account for PV domains, while the second
> > (hidden) pirq_cleanup_check() call in unmap_domain_pirq_emuirq() would
> > do the cleanup for HVM domains.
> > 
> > Maybe there's something that I'm missing, I have to admit the PIRQ
> > logic is awfully complicated, even more when we mix the HVM PIRQ
> > stuff.
> 
> If you look at pirq_cleanup_check() you'll notice that it takes care
> of one HVM case as well (the not emuirq one, i.e. particularly PVH,
> but note also how physdev_hvm_map_pirq() calls map_domain_emuirq_pirq()
> only conditionally). Plus the crucial aspect of the 2nd paragraph of
> the description is that past calling pirq_cleanup_check() it is not
> really valid anymore to (blindly) de-reference the struct pirq pointer
> we hold in hands. The is_hvm_domain() qualification wasn't enough,
> since - as said - it's only one of the possibilities that would allow
> the pirq to remain legal to use past the call, when having taken the
> function's
> 
>         if ( pirq->arch.hvm.emuirq != IRQ_UNBOUND )
>             return;
> 
> path. A 2nd would be taking the
> 
>         if ( !pt_pirq_cleanup_check(&pirq->arch.hvm.dpci) )
>             return;
> 
> path (i.e. a still in use pass-through IRQ), but the 3rd would still
> end in the struct pirq being purged even for HVM.

Right, I was missing that if the pirq is actually freed, any further
use of it after pirq_cleanup_check() would be a use-after-free.
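
For readers following along, here is a rough sketch of the function shape
Jan describes, reconstructed from the snippets quoted above.  This is not
the verbatim Xen source: the cleanup tail (radix tree removal plus
freeing) is assumed from the patch description, and the tree name is a
guess.

    /* Sketch only: the three ways out of pirq_cleanup_check() that Jan
     * lists, with field and helper names taken from the quoted snippets. */
    void pirq_cleanup_check(struct pirq *pirq, struct domain *d)
    {
        if ( is_hvm_domain(d) )
        {
            /* 1st path: still bound to an emulated IRQ -> keep the pirq. */
            if ( pirq->arch.hvm.emuirq != IRQ_UNBOUND )
                return;

            /* 2nd path: pass-through IRQ still in use -> keep the pirq. */
            if ( !pt_pirq_cleanup_check(&pirq->arch.hvm.dpci) )
                return;
        }

        /* 3rd path: the struct pirq gets purged, even for HVM.  Past this
         * point the pointer must not be (blindly) dereferenced any more. */
        radix_tree_delete(&d->pirq_tree, pirq->pirq); /* tree name assumed */
        free_pirq_struct(pirq); /* the freeing step the patch adds */
    }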

> >> And then after
> >> removal from the radix tree the structure wasn't scheduled for freeing.
> >> (All structures still left in the radix tree would be freed upon domain
> >> destruction, though.)
> > 
> > So if my understanding is correct, we didn't have a leak due to the
> > missing free_pirq_struct() because the inverted check in
> > pirq_cleanup_check() macro prevented the removal from the radix tree,
> > and so stale entries would be left there and freed at domain
> > destruction?
> 
> That's the understanding I had come to, yes. What I wasn't entirely
> sure about (see the 2nd post-commit-message remark) is why the entry
> being left in the radix tree never caused any problems. Presumably
> that's a result of pirq_get_info() first checking whether an entry is
> already there, allocating a new one only for previously empty slots.

Yes, I came to the same conclusion: the missing free wasn't an issue
because Xen would re-use the old entry.  Hopefully such a stale entry is
clean enough not to cause issues when re-used.
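
That re-use boils down to a lookup-before-allocate pattern; a minimal
sketch of what pirq_get_info() presumably does (helper and field names
other than pirq_get_info() and free_pirq_struct() are assumptions, not
the verbatim Xen source):

    /* Sketch only: allocate a struct pirq for a slot in the domain's
     * radix tree only if that slot is currently empty. */
    struct pirq *pirq_get_info(struct domain *d, int pirq)
    {
        /* First see whether an entry is already in the radix tree ... */
        struct pirq *info = radix_tree_lookup(&d->pirq_tree, pirq);

        /* ... and allocate only for a previously empty slot. */
        if ( !info && (info = alloc_pirq_struct(d)) != NULL )
        {
            info->pirq = pirq;
            if ( radix_tree_insert(&d->pirq_tree, pirq, info) )
            {
                /* Insertion failed; don't leak the fresh allocation. */
                free_pirq_struct(info);
                info = NULL;
            }
        }

        /* A stale entry left behind by the old leak is handed back as-is,
         * which is presumably why the leak never caused visible problems. */
        return info;
    }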

> >> --- a/xen/common/event_channel.c
> >> +++ b/xen/common/event_channel.c
> >> @@ -711,9 +711,10 @@ int evtchn_close(struct domain *d1, int
> >>              if ( !is_hvm_domain(d1) )
> >>                  pirq_guest_unbind(d1, pirq);
> >>              pirq->evtchn = 0;
> >> -            pirq_cleanup_check(pirq, d1);
> >> -            if ( is_hvm_domain(d1) && domain_pirq_to_irq(d1, pirq->pirq) > 0 )
> >> -                unmap_domain_pirq_emuirq(d1, pirq->pirq);
> >> +            if ( !is_hvm_domain(d1) ||
> >> +                 domain_pirq_to_irq(d1, pirq->pirq) <= 0 ||
> >> +                 unmap_domain_pirq_emuirq(d1, pirq->pirq) < 0 )
> > 
> > unmap_domain_pirq_emuirq() already calls pirq_cleanup_check() itself.  Could
> > you please add a comment to note that unmap_domain_pirq_emuirq()
> > succeeding implies the call to pirq_cleanup_check() has already been
> > done?
> > 
> > Otherwise the logic here looks unbalanced by skipping the
> > pirq_cleanup_check() when unmap_domain_pirq_emuirq() succeeds.
> 
> Sure, added:
> 
>                 /*
>                  * The successful path of unmap_domain_pirq_emuirq() will have
>                  * called pirq_cleanup_check() already.
>                  */

With that added:

Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
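
For clarity, with the comment added the restructured code presumably
reads as below; the guarded pirq_cleanup_check() call is inferred from
the hunk and the discussion above:

            pirq->evtchn = 0;
            /*
             * The successful path of unmap_domain_pirq_emuirq() will have
             * called pirq_cleanup_check() already.
             */
            if ( !is_hvm_domain(d1) ||
                 domain_pirq_to_irq(d1, pirq->pirq) <= 0 ||
                 unmap_domain_pirq_emuirq(d1, pirq->pirq) < 0 )
                pirq_cleanup_check(pirq, d1);

Short-circuit evaluation keeps this balanced: PV domains and HVM pIRQs
without an IRQ mapping fall through to pirq_cleanup_check(), while a
successful unmap relies on the cleanup already done inside
unmap_domain_pirq_emuirq().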

> >> --- a/xen/include/xen/irq.h
> >> +++ b/xen/include/xen/irq.h
> >> @@ -158,7 +158,7 @@ extern struct pirq *pirq_get_info(struct
> >>  void pirq_cleanup_check(struct pirq *, struct domain *);
> >>  
> >>  #define pirq_cleanup_check(pirq, d) \
> >> -    ((pirq)->evtchn ? pirq_cleanup_check(pirq, d) : (void)0)
> >> +    (!(pirq)->evtchn ? pirq_cleanup_check(pirq, d) : (void)0)
> > 
> > Not that you need to fix it here, but why not place this check in
> > pirq_cleanup_check() itself?
> 
> See the first of the post-commit-message remarks: The goal was to not
> require every arch to replicate that check. At the time it wasn't
> clear (to me at least) that the entire concept of pIRQ would likely
> remain an x86 special thing anyway.

In any case, such a change would better be done in a separate commit
(see the sketch below).
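
A minimal sketch of that alternative, for illustration only (the actual
cleanup body is elided):

    /* The guard currently lives in the wrapping macro quoted above; the
     * alternative would fold it into the function and drop the macro. */
    void pirq_cleanup_check(struct pirq *pirq, struct domain *d)
    {
        /* Moved in from the macro: still bound -> nothing to clean up. */
        if ( pirq->evtchn )
            return;

        /* ... actual cleanup ... */
    }

The trade-off Jan mentions: with the macro, each arch's out-of-line
implementation needn't replicate the check; with the check inside the
function, every arch providing its own implementation would have to.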

Thanks, Roger.