[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen-blkback: fix compatibility bug with single page rings



On 27.01.2021 12:33, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@xxxxxxxx>
>> Sent: 27 January 2021 11:21
>> To: paul@xxxxxxx
>> Cc: 'Paul Durrant' <pdurrant@xxxxxxxxxx>; 'Konrad Rzeszutek Wilk' 
>> <konrad.wilk@xxxxxxxxxx>; 'Roger Pau
>> Monné' <roger.pau@xxxxxxxxxx>; 'Jens Axboe' <axboe@xxxxxxxxx>; 'Dongli Zhang'
>> <dongli.zhang@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; 
>> linux-block@xxxxxxxxxxxxxxx; xen-
>> devel@xxxxxxxxxxxxxxxxxxxx
>> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page 
>> rings
>>
>> On 27.01.2021 12:09, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@xxxxxxxx>
>>>> Sent: 27 January 2021 10:57
>>>> To: Paul Durrant <paul@xxxxxxx>
>>>> Cc: Paul Durrant <pdurrant@xxxxxxxxxx>; Konrad Rzeszutek Wilk 
>>>> <konrad.wilk@xxxxxxxxxx>; Roger Pau
>>>> Monné <roger.pau@xxxxxxxxxx>; Jens Axboe <axboe@xxxxxxxxx>; Dongli Zhang 
>>>> <dongli.zhang@xxxxxxxxxx>;
>>>> linux-kernel@xxxxxxxxxxxxxxx; linux-block@xxxxxxxxxxxxxxx; 
>>>> xen-devel@xxxxxxxxxxxxxxxxxxxx
>>>> Subject: Re: [PATCH] xen-blkback: fix compatibility bug with single page 
>>>> rings
>>>>
>>>> On 27.01.2021 11:30, Paul Durrant wrote:
>>>>> From: Paul Durrant <pdurrant@xxxxxxxxxx>
>>>>>
>>>>> Prior to commit 4a8c31a1c6f5 ("xen/blkback: rework connect_ring() to avoid
>>>>> inconsistent xenstore 'ring-page-order' set by malicious blkfront"), the
>>>>> behaviour of xen-blkback when connecting to a frontend was:
>>>>>
>>>>> - read 'ring-page-order'
>>>>> - if not present then expect a single page ring specified by 'ring-ref'
>>>>> - else expect a ring specified by 'ring-refX' where X is between 0 and
>>>>>   1 << ring-page-order
>>>>>
>>>>> This was correct behaviour, but was broken by the afforementioned commit 
>>>>> to
>>>>> become:
>>>>>
>>>>> - read 'ring-page-order'
>>>>> - if not present then expect a single page ring
>>>>> - expect a ring specified by 'ring-refX' where X is between 0 and
>>>>>   1 << ring-page-order
>>>>> - if that didn't work then see if there's a single page ring specified by
>>>>>   'ring-ref'
>>>>>
>>>>> This incorrect behaviour works most of the time but fails when a frontend
>>>>> that sets 'ring-page-order' is unloaded and replaced by one that does not
>>>>> because, instead of reading 'ring-ref', xen-blkback will read the stale
>>>>> 'ring-ref0' left around by the previous frontend will try to map the wrong
>>>>> grant reference.
>>>>>
>>>>> This patch restores the original behaviour.
>>>>
>>>> Isn't this only the 2nd of a pair of fixes that's needed, the
>>>> first being the drivers, upon being unloaded, to fully clean up
>>>> after itself? Any stale key left may lead to confusion upon
>>>> re-use of the containing directory.
>>>
>>> In a backend we shouldn't be relying on, nor really expect IMO, a frontend 
>>> to clean up after itself.
>> Any backend should know *exactly* what xenstore nodes it’s looking for from 
>> a frontend.
>>
>> But the backend can't know whether a node exists because the present
>> frontend has written it, or because an earlier instance forgot to
>> delete it. It can only honor what's there. (In fact the other day I
>> was wondering whether some of the writes of boolean "false" nodes
>> wouldn't better be xenbus_rm() instead.)
> 
> In the particular case this patch is fixing for me, the frontends are the 
> Windows XENVBD driver and the Windows crash version of the same driver 
> (actually built from different code). The 'normal' instance is multi-page 
> aware and the crash instance is not quite, i.e. it uses the old ring-ref but 
> knows to clean up 'ring-page-order'.
> Clearly, in a crash situation, we cannot rely on frontend to clean up

Ah, I see (and agree).

> so what you say does highlight that there indeed needs to be a second patch 
> to xen-blkback to make sure it removes 'ring-page-order' itself as 'state' 
> cycles through Closed and back to InitWait.

And not just this one node then, I suppose?

> I think this patch does still stand on its own though.

Perhaps, yes.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.