
Re: [PATCH v5 03/15] IOMMU/x86: support freeing of pagetables


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 1 Jun 2022 17:25:16 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>
  • Delivery-date: Wed, 01 Jun 2022 15:25:25 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 01.06.2022 11:24, Roger Pau Monné wrote:
> On Wed, Jun 01, 2022 at 09:32:44AM +0200, Jan Beulich wrote:
>> On 31.05.2022 18:25, Roger Pau Monné wrote:
>>> On Fri, May 27, 2022 at 01:13:09PM +0200, Jan Beulich wrote:
>>>> @@ -566,6 +567,98 @@ struct page_info *iommu_alloc_pgtable(st
>>>>      return pg;
>>>>  }
>>>>  
>>>> +/*
>>>> + * Intermediate page tables which get replaced by large pages may only be
>>>> + * freed after a suitable IOTLB flush. Hence such pages get queued on a
>>>> + * per-CPU list, with a per-CPU tasklet processing the list on the assumption
>>>> + * that the necessary IOTLB flush will have occurred by the time tasklets get
>>>> + * to run. (List and tasklet being per-CPU has the benefit of accesses not
>>>> + * requiring any locking.)
>>>> + */
>>>> +static DEFINE_PER_CPU(struct page_list_head, free_pgt_list);
>>>> +static DEFINE_PER_CPU(struct tasklet, free_pgt_tasklet);
>>>> +
>>>> +static void free_queued_pgtables(void *arg)
>>>> +{
>>>> +    struct page_list_head *list = arg;
>>>> +    struct page_info *pg;
>>>> +    unsigned int done = 0;
>>>> +
>>>> +    while ( (pg = page_list_remove_head(list)) )
>>>> +    {
>>>> +        free_domheap_page(pg);
>>>> +
>>>> +        /* Granularity of checking somewhat arbitrary. */
>>>> +        if ( !(++done & 0x1ff) )
>>>> +             process_pending_softirqs();
>>>
>>> Hm, I'm wondering whether we really want to process pending softirqs
>>> here.
>>>
>>> Such processing will prevent the watchdog from triggering, which we
>>> likely want in production builds.  OTOH in debug builds we should make
>>> sure that free_queued_pgtables() doesn't take longer than a watchdog
>>> window, or else it's likely to cause issues to guests scheduled on
>>> this same pCPU (and calling process_pending_softirqs() will just mask
>>> it).
>>
>> Doesn't this consideration apply to about every use of the function we
>> already have in the code base?
> 
> Not really, at least when used by init code or by the debug key
> handlers.  This use is IMO different than what I would expect, as it's
> a guest triggered path that we believe does require such processing.
> Normally we would use continuations for such long going guest
> triggered operations.

So what do you suggest I do? Putting the call inside #ifndef CONFIG_DEBUG
is not a good option imo. Re-scheduling the tasklet wouldn't help, aiui
(it would still run again right away). Moving the work to another CPU so
this one can do other things isn't very good either - what if other CPUs
are similarly busy? What remains is making things more complicated here by
involving a timer, the handler of which would re-schedule the tasklet. I
have to admit I don't like that very much either, all the more so since the
use of process_pending_softirqs() is "just in case" here anyway - if lots
of page tables were to be queued, I'd expect the queuing entity to be
preempted before a rather large pile could accumulate.

Maybe I could make iommu_queue_free_pgtable() return non-void, to instruct
the caller to bubble up a preemption notification once a certain number
of pages have been queued for freeing. This might end up intrusive ...
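
A rough standalone model of that idea, purely for illustration: the types,
the name queue_free_pgtable() and the threshold of 512 are all made up here
and are not taken from the patch or from Xen. The queueing function reports
back once a batch has accumulated, so the caller can propagate a preemption
request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold; a real value would need tuning. */
#define QUEUE_PREEMPT_THRESHOLD 512u

/* Minimal stand-ins for struct page_info / struct page_list_head. */
struct page {
    struct page *next;
};

struct page_list {
    struct page *head;
    unsigned int count; /* pages queued since preemption was last requested */
};

/*
 * Model of a non-void queueing function: put the page on the list and
 * return true once enough pages have piled up that the caller should
 * bubble up a preemption notification.
 */
static bool queue_free_pgtable(struct page_list *list, struct page *pg)
{
    assert(list != NULL && pg != NULL);

    pg->next = list->head;
    list->head = pg;

    if ( ++list->count < QUEUE_PREEMPT_THRESHOLD )
        return false;

    list->count = 0; /* start a fresh batch after signalling */
    return true;
}
```

A caller would check the return value after each queued page and, on true,
unwind to a point where it can be preempted. Every caller up the chain has
to pass that indication along, which is the intrusive part.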

Jan




 

