
Re: IRQ latency measurements in hypervisor


  • To: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Sat, 16 Jan 2021 12:59:48 +0000
  • Cc: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Julien Grall <jgrall@xxxxxxxxxx>, Dario Faggioli <dario.faggioli@xxxxxxxx>, "Bertrand.Marquis@xxxxxxx" <Bertrand.Marquis@xxxxxxx>
  • Delivery-date: Sat, 16 Jan 2021 13:00:09 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 15/01/2021 23:41, Stefano Stabellini wrote:
>>>>> This is very interesting too. Did you get any spikes with the
>>>>> period set to 100us? It would be fantastic if there were none.
>>>>>
>>>>>> 3. Huge latency spike during domain creation. I conducted some
>>>>>>      additional tests, including use of PV drivers, but this didn't
>>>>>>      affect the latency in my "real time" domain. But an attempt to
>>>>>>      create another domain with a relatively large memory size of 2GB
>>>>>>      led to a huge spike in latency. Debugging led to this call path:
>>>>>>
>>>>>>      XENMEM_populate_physmap -> populate_physmap() ->
>>>>>>      alloc_domheap_pages() -> alloc_heap_pages() -> huge
>>>>>>      "for ( i = 0; i < (1 << order); i++ )" loop.
>>>> There are two for loops in alloc_heap_pages() using this syntax. Which
>>>> one are you referring to?
>>> I did some tracing with Lauterbach. It pointed to the first loop, and
>>> especially to the flush_page_to_ram() call, if I remember correctly.
>> Thanks, I am not entirely surprised because we are cleaning and invalidating
>> the region line by line and across all the CPUs.
>>
>> Assuming a 128-byte cacheline, we will need to issue 32 cache maintenance
>> instructions per page. This is going to involve quite a bit of traffic on
>> the system.
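To put rough numbers on that (a sketch only; the helper names below stand in
for the architecture's cache maintenance primitives and are not the exact Xen
functions):

    /* Illustrative sketch of a per-page clean+invalidate by cacheline.
     * dcache_clean_and_inv_line()/dsb() are stand-ins for the real
     * primitives (e.g. DC CIVAC plus a barrier on arm64). */
    #define PAGE_BYTES      4096UL
    #define CACHELINE_BYTES  128UL

    extern void dcache_clean_and_inv_line(const void *va);
    extern void dsb(void);

    static void flush_one_page(const char *va)
    {
        /* 4096 / 128 = 32 maintenance instructions per page. */
        for ( unsigned long off = 0; off < PAGE_BYTES; off += CACHELINE_BYTES )
            dcache_clean_and_inv_line(va + off);
        dsb();
    }

For a 2GB allocation (~512k 4KB pages) that works out to over 16 million
cache maintenance operations issued from inside the allocation path, which
would fit the spike described above.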
> I think Julien is most likely right. It would be good to verify this
> with an experiment. For instance, you could remove the flush_page_to_ram()
> call for one test and see whether the latency spike still occurs.
>
>
>> One possibility would be to defer the cache flush during domain creation and
>> instead use the XEN_DOMCTL_cacheflush hypercall to issue the flush afterwards.
>>
>> Note that XEN_DOMCTL_cacheflush would need some modification to be
>> preemptible. But at least it would work on GFNs, which are easier to track.
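For illustration, a preemptible version could follow the usual continuation
pattern: do a bounded amount of work, then check whether to bail out and let
the caller re-issue the hypercall from the updated GFN. A minimal sketch,
assuming a hypothetical per-GFN helper flush_guest_gfn() (the real code would
walk the p2m for the range):

    /* Sketch of a preemptible GFN-range flush using the -ERESTART pattern.
     * flush_guest_gfn() is hypothetical; start/nr would live in the domctl
     * so the caller (or a continuation) can restart where it left off. */
    static int cacheflush_range(struct domain *d, gfn_t *start, unsigned long *nr)
    {
        while ( *nr )
        {
            flush_guest_gfn(d, *start);      /* clean+invalidate one guest page */
            *start = gfn_add(*start, 1);
            (*nr)--;

            /* Every 256 pages, consider yielding. */
            if ( *nr && !(*nr & 0xff) && hypercall_preempt_check() )
                return -ERESTART;
        }
        return 0;
    }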
>
> This looks like a solid suggestion. XEN_DOMCTL_cacheflush is already
> used by the toolstack in a few places. 
>
> I am also wondering if we can get away with fewer flush_page_to_ram()
> calls from alloc_heap_pages() for memory allocations done at boot time
> soon after global boot memory scrubbing.
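If that turns out to be safe, the change itself would be small: something
along the lines of gating the flush on the system state. The sketch below is
only meant to show the shape; whether skipping the flush for boot-time
allocations is actually correct on all platforms is precisely what would need
verifying:

    /* Sketch only: skip the per-page flush for allocations made before the
     * system is fully up, on the assumption that the global boot scrub has
     * already pushed the contents out to RAM.  The assumption, not the
     * code shape, is the open question. */
    static inline void maybe_flush_page_to_ram(unsigned long mfn, bool sync_icache)
    {
        if ( system_state >= SYS_STATE_active )
            flush_page_to_ram(mfn, sync_icache);
    }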

I'm pretty sure there is room to improve Xen's behaviour in general, by
not scrubbing pages already known to be zero.

As far as I'm aware, some improvements never got completed when lazy
scrubbing was added, and I think that is giving us a performance hit even on
x86, where we don't have to do any cache maintenance on the side.
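As a rough illustration of the idea (names modelled on the lazy-scrubbing
support in page_alloc.c, but treat the details as a sketch rather than a
description of the current logic): only scrub pages that are actually marked
dirty, so that pages freed already known to be zero skip the memset entirely.

    /* Sketch: scrub on allocation only when the page actually needs it.
     * PGC_need_scrub / scrub_one_page() follow the existing lazy-scrubbing
     * code; the exact flag handling here is illustrative. */
    for ( i = 0; i < (1 << order); i++ )
    {
        if ( pg[i].count_info & PGC_need_scrub )
        {
            scrub_one_page(&pg[i]);
            pg[i].count_info &= ~PGC_need_scrub;
        }
    }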

~Andrew



 

