
Re: [PATCH] x86/irq: Skip unmap_domain_pirq XSM during destruction


  • To: Jan Beulich <jbeulich@xxxxxxxx>, Jason Andryuk <jandryuk@xxxxxxxxx>
  • From: "Daniel P. Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>
  • Date: Tue, 5 Apr 2022 09:13:37 -0400
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 05 Apr 2022 13:14:28 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 4/5/22 04:18, Jan Beulich wrote:
> On 30.03.2022 20:17, Jason Andryuk wrote:
>> xsm_unmap_domain_irq was seen denying unmap_domain_pirq when called from
>> complete_domain_destroy as an RCU callback.  The source context was an
>> unexpected, random domain.  Since this is a xen-internal operation,
>> we don't want the XSM hook denying the operation.
>>
>> Check d->is_dying and skip the check when the domain is dead.  The RCU
>> callback runs when a domain is in that state.
> 
> One question which has always been puzzling me (perhaps to Daniel): While
> I can see why mapping of an IRQ needs to be subject to an XSM check, it's
> not really clear to me why unmapping would need to be, at least as long
> as it's the domain itself which requests the unmap (and which I would
> view to extend to the domain being cleaned up). But maybe that's why it's
> XSM_HOOK ...

One such situation is a Flask-based system with one or more management
domains (v-platform-mgr), each responsible for a subset of domains,
including hotplugging devices in and out of them. There, each
v-platform-mgr is granted the privilege to call
PHYSDEVOP_map_pirq/PHYSDEVOP_unmap_pirq only for the domains it
manages, so the unmap has to be checked just like the map.
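
To make the delegation scenario concrete, here is a minimal sketch in
plain C. It is not Xen or Flask code: struct toy_domain, the manager_id
field, TOY_EPERM and toy_xsm_unmap_domain_irq() are all hypothetical
names invented for illustration, and the rule that a domain may always
unmap its own pirqs is an assumption of the toy policy.

```c
#include <stddef.h>

#define TOY_EPERM 1   /* stand-in error value, not Xen's errno definitions */

/* Hypothetical, simplified domain record; not Xen's struct domain. */
struct toy_domain {
    int domain_id;
    int manager_id;   /* domid of the v-platform-mgr in charge, or -1 (assumption) */
};

/*
 * Toy policy: the source domain may unmap a pirq of the target domain
 * only if it is the target itself (assumption) or the manager
 * responsible for the target.
 */
static int toy_xsm_unmap_domain_irq(const struct toy_domain *src,
                                    const struct toy_domain *target)
{
    if ( src->domain_id == target->domain_id )
        return 0;
    if ( target->manager_id == src->domain_id )
        return 0;
    return -TOY_EPERM;   /* a manager touching a domain it does not manage */
}
```

With two managers, a request from manager 1 against a domain owned by
manager 2 is refused, which is the case an unchecked unmap would let
through.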

>> ---
>> Dan wants to change current to point at DOMID_IDLE when the RCU callback
>> runs.  I think Juergen's commit 53594c7bd197 "rcu: don't use
>> stop_machine_run() for rcu_barrier()" may have changed this since it
>> mentions stop_machine_run scheduled the idle vcpus to run the callbacks
>> for the old code.
>>
>> Would that be as easy as changing rcu_do_batch() to do:
>>
>> +        /* Run as "Xen" not a random domain's vcpu. */
>> +        vcpu = get_current();
>> +        set_current(idle_vcpu[smp_processor_id()]);
>>          list->func(list);
>> +        set_current(vcpu);
>>
>> or is using set_current() only acceptable as part of context_switch?
> 
> Indeed I would question any uses outside of context_switch() (and
> system bringup).

I am not familiar with the details of the scheduler, but from a
higher-level, conceptual perspective I do not understand why an idle
domain task is executed without an explicit context switch to the idle
domain, which would ensure that the current world view is consistent
with the task's execution scope. It seems to me that this creates a
situation where things have the potential to go wrong.
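
The concern can be illustrated with a toy model in plain C. It only
shows the conceptual difference between a callback that inherits
whatever context happens to be current and one that runs after an
explicit switch. It says nothing about how Xen's scheduler or RCU code
should do this; toy_vcpu, toy_current, TOY_DOMID_IDLE and the two
run_callback_*() helpers are hypothetical names.

```c
#include <stddef.h>

#define TOY_DOMID_IDLE 0x7FFF   /* illustrative value for this sketch */

/* Hypothetical stand-in for a vcpu; only records its owning domain. */
struct toy_vcpu {
    int domain_id;
};

/* Stand-in for the per-CPU notion of "current". */
static struct toy_vcpu *toy_current;

/* Records which domain a context-sensitive check would have seen. */
static int observed_domid;

static void toy_callback(void)
{
    observed_domid = toy_current->domain_id;
}

/* The callback inherits whatever context was interrupted. */
static void run_callback_inherited(void (*fn)(void))
{
    fn();
}

/* The callback runs only after an explicit switch to the idle context. */
static void run_callback_as_idle(struct toy_vcpu *idle, void (*fn)(void))
{
    struct toy_vcpu *prev = toy_current;

    toy_current = idle;
    fn();
    toy_current = prev;   /* restore the interrupted context */
}
```

In the first helper, a check made inside the callback is evaluated
against an unrelated domain. In the second, it always sees the idle
context.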

v/r,
dps
