
Re: [Xen-devel] [PATCH v3 2/2] xen: merge temporary vcpu pinning scenarios


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Juergen Gross <JGross@xxxxxxxx>
  • From: Jan Beulich <JBeulich@xxxxxxxx>
  • Date: Fri, 26 Jul 2019 11:35:23 +0000
  • Accept-language: en-US
  • Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, WeiLiu <wl@xxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>, Julien Grall <julien.grall@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 26 Jul 2019 11:39:05 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHVQhK9XkC67/eVmky5KVr20rRuZKbcqfttgAAeRIA=
  • Thread-topic: [PATCH v3 2/2] xen: merge temporary vcpu pinning scenarios

On 26.07.2019 11:46, Andrew Cooper wrote:
> On 24/07/2019 12:26, Juergen Gross wrote:
>> @@ -182,30 +178,24 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
>>   static void __finish_wait(struct waitqueue_vcpu *wqv)
>>   {
>>       wqv->esp = NULL;
>> -    (void)vcpu_set_hard_affinity(current, &wqv->saved_affinity);
>> +    vcpu_temporary_affinity(current, NR_CPUS, VCPU_AFFINITY_WAIT);
>>   }
>>   
>>   void check_wakeup_from_wait(void)
>>   {
>> -    struct waitqueue_vcpu *wqv = current->waitqueue_vcpu;
>> +    struct vcpu *curr = current;
>> +    struct waitqueue_vcpu *wqv = curr->waitqueue_vcpu;
>>   
>>       ASSERT(list_empty(&wqv->list));
>>   
>>       if ( likely(wqv->esp == NULL) )
>>           return;
>>   
>> -    /* Check if we woke up on the wrong CPU. */
>> -    if ( unlikely(smp_processor_id() != wqv->wakeup_cpu) )
>> +    /* Check if we are still pinned. */
>> +    if ( unlikely(!(curr->affinity_broken & VCPU_AFFINITY_WAIT)) )
>>       {
>> -        /* Re-set VCPU affinity and re-enter the scheduler. */
>> -        struct vcpu *curr = current;
>> -        cpumask_copy(&wqv->saved_affinity, curr->cpu_hard_affinity);
>> -        if ( vcpu_set_hard_affinity(curr, cpumask_of(wqv->wakeup_cpu)) )
>> -        {
>> -            gdprintk(XENLOG_ERR, "Unable to set vcpu affinity\n");
>> -            domain_crash(current->domain);
>> -        }
>> -        wait(); /* takes us back into the scheduler */
>> +        gdprintk(XENLOG_ERR, "vcpu affinity lost\n");
>> +        domain_crash(curr->domain);
>>       }
> 
> I'm sorry to retract my R-by after the fact, but I've only just noticed
> (while rebasing some of my pending work over this) that it is buggy.
> 
> The reason wait() was called is that it is not safe to leave that
> if() clause.
> 
> With this change in place, we'll arrange for the VM to be crashed, then
> longjump back into the stack of the waiting vCPU, on the wrong
> CPU.  Any caller with smp_processor_id() or thread-local variables
> cached by pointer on the stack will then cause memory corruption.
> 
> It's not immediately obvious how to fix this, but bear in mind that as
> soon as the vm-event interface is done, I plan to delete this whole
> waitqueue infrastructure anyway.

In which case, should we revert the commit until this is resolved?

Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

