
Re: [PATCH 2/2] Limit retry in ControllerGetResponse


  • To: Tu Dinh <ngoc-tu.dinh@xxxxxxxxxx>, "win-pv-devel@xxxxxxxxxxxxxxxxxxxx" <win-pv-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Owen Smith <owen.smith@xxxxxxxxxx>
  • Date: Tue, 21 Oct 2025 13:06:57 +0000
  • Delivery-date: Tue, 21 Oct 2025 13:07:06 +0000
  • List-id: Developer list for the Windows PV Drivers subproject <win-pv-devel.lists.xenproject.org>
  • Thread-topic: [PATCH 2/2] Limit retry in ControllerGetResponse

Is there a way of detecting that the backend has disappeared during
ControllerGetResponse?
Currently there is a 100ms delay between attempts to trigger the backend, and
it's possible the backend could take longer than that to respond in normal
operation.

Owen

________________________________________
From: win-pv-devel <win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx> on behalf of Tu 
Dinh <ngoc-tu.dinh@xxxxxxxxxx>
Sent: 16 October 2025 11:16 AM
To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Cc: Tu Dinh
Subject: [PATCH 2/2] Limit retry in ControllerGetResponse

During unplug, NDIS may issue a hash reconfiguration. However, the
backend may disappear before the request has had a chance to complete.
This is observable by e.g. adding a delay at the beginning of
VifReceiverSetHashAlgorithm and unplugging. If this happens, Xenvif will
get stuck waiting forever in ControllerGetResponse, at DISPATCH_LEVEL to
boot.

Limit the EvtchnWait retry to 10 attempts (10ms each to keep the
previous 100ms limit) and return an error code to the caller if the wait
has failed.

Signed-off-by: Tu Dinh <ngoc-tu.dinh@xxxxxxxxxx>
---
 src/xenvif/controller.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)
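
Illustration only, not part of the patch: a minimal user-mode sketch of the
bounded-retry idea described in the commit message. The struct, the
fake_wait_once() helper and the 0 / -1 return values are hypothetical
stand-ins; they are not Xenvif or XENBUS APIs. The loop gives up after a fixed
number of short waits, and the outcome is decided by comparing the response id
with the request id, as the patch does after its loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the controller state shared with the backend. */
struct fake_controller {
    uint32_t request_id;      /* id of the outstanding request (non-zero) */
    uint32_t response_id;     /* stays 0 until the backend answers */
    bool     backend_present; /* false simulates an unplugged backend */
    unsigned reply_after;     /* simulated latency: reply lands on this poll */
    unsigned polls;           /* waits performed so far */
    unsigned resends;         /* requests re-sent after a timed-out wait */
};

/* Hypothetical stand-in for one short, timed wait on the event channel.
 * Returns true once the simulated backend has produced a response. */
static bool fake_wait_once(struct fake_controller *c)
{
    c->polls++;
    if (c->backend_present && c->polls >= c->reply_after) {
        c->response_id = c->request_id;
        return true;
    }
    return false;
}

/* Returns 0 if a matching response arrived, -1 if the retry budget ran out. */
static int get_response_bounded(struct fake_controller *c,
                                unsigned max_attempts)
{
    unsigned attempt;

    assert(max_attempts > 0);

    for (attempt = 0; attempt < max_attempts; attempt++) {
        if (fake_wait_once(c))
            break;
        c->resends++;   /* the wait timed out: kick the backend again */
    }

    /* Decide by id comparison, not by the loop counter. */
    return (c->response_id == c->request_id) ? 0 : -1;
}
```

With a budget of 10 attempts, a simulated backend that answers on the third
wait succeeds, an absent backend fails after 10 waits, and so does a backend
that is present but needs 11 waits. The last case is the one a fixed budget
cannot tell apart from a backend that has gone away.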

diff --git a/src/xenvif/controller.c b/src/xenvif/controller.c
index e032972..109c030 100644
--- a/src/xenvif/controller.c
+++ b/src/xenvif/controller.c
@@ -248,7 +248,7 @@ fail1:
 #define TIME_MS(_ms)        (TIME_US((_ms) * 1000))
 #define TIME_RELATIVE(_t)   (-(_t))

-#define XENVIF_CONTROLLER_POLL_PERIOD 100 // ms
+#define XENVIF_CONTROLLER_POLL_PERIOD 10 // ms

 _IRQL_requires_(DISPATCH_LEVEL)
 static NTSTATUS
@@ -258,11 +258,12 @@ ControllerGetResponse(
     )
 {
     LARGE_INTEGER                   Timeout;
+    ULONG                           Attempt;
     NTSTATUS                        status;

     Timeout.QuadPart = TIME_RELATIVE(TIME_MS(XENVIF_CONTROLLER_POLL_PERIOD));

-    for (;;) {
+    for (Attempt = 0; Attempt < 10; Attempt++) {
         ULONG   Count;

         Count = XENBUS_EVTCHN(GetCount,
@@ -284,6 +285,13 @@ ControllerGetResponse(
             __ControllerSend(Controller);
     }

+    // Use STATUS_TRANSACTION_TIMED_OUT as an error code since STATUS_TIMEOUT is
+    // a success code.
+    if (Controller->Response.id != Controller->Request.id) {
+        status = STATUS_TRANSACTION_TIMED_OUT;
+        goto done;
+    }
+
     ASSERT3U(Controller->Response.type, ==, Controller->Request.type);

     switch (Controller->Response.status) {
@@ -311,6 +319,7 @@ ControllerGetResponse(
     if (NT_SUCCESS(status) && Data != NULL)
         *Data = Controller->Response.data;

+done:
     RtlZeroMemory(&Controller->Request,
                   sizeof (struct xen_netif_ctrl_request));
     RtlZeroMemory(&Controller->Response,
--
2.51.0.windows.2
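
Illustration only, not part of the patch: a small portable sketch of why the
code comment prefers STATUS_TRANSACTION_TIMED_OUT over STATUS_TIMEOUT. The
SKETCH_ names and helper functions are hypothetical; the numeric values are the
ones I understand ntstatus.h to define and should be checked against the WDK
headers. In the WDK, NT_SUCCESS tests whether the signed status is >= 0, which
is the same as the top bit being clear; the sketch uses the unsigned form so it
builds outside the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, mirroring ntstatus.h; verify against the WDK headers. */
#define SKETCH_STATUS_SUCCESS               UINT32_C(0x00000000)
#define SKETCH_STATUS_TIMEOUT               UINT32_C(0x00000102)
#define SKETCH_STATUS_TRANSACTION_TIMED_OUT UINT32_C(0xC0000210)

/* Equivalent of NT_SUCCESS(): success and informational codes have the
 * top bit clear, which matches "(NTSTATUS)status >= 0" on a signed type. */
static bool sketch_nt_success(uint32_t status)
{
    return (status >> 31) == 0;
}

/* The top two bits hold the severity:
 * 0 = success, 1 = informational, 2 = warning, 3 = error. */
static unsigned sketch_severity(uint32_t status)
{
    return (unsigned)(status >> 30);
}
```

Under these assumptions sketch_nt_success(SKETCH_STATUS_TIMEOUT) is true, so a
caller that only checks NT_SUCCESS would treat a timeout as success, while
SKETCH_STATUS_TRANSACTION_TIMED_OUT has severity 3 and fails the check.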



--
Ngoc Tu Dinh | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech