[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v8 06/16] xen/memory: Fix mapping grant tables with XENMEM_acquire_resource


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Mon, 1 Feb 2021 11:11:37 +0000
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=cW0y75xdrGwUEFSo42TsmqJIrR3/6nV9yYU4GpM0/Wg=; b=EeRC2CtcgdwdTQB8ANFMcavkhIUMqptABmPnZ/QvUws4KccYY5mEbZojz63CgdwdAXTQqE5sqRKSBjMb3NsiD4d4WDEfyzwkoLBck8Nze9kmxdZGMeBt7ivZLF617YwiH95b+DImUOR63NmrswCM9CK7+KqYOX6X7UGN/PwYkF+MxUSirQDjzX1q5a/BqllMBfpl8zncn3Om18BF9gVc9jSPdhkzhviyKgqejZUrXc46Sayhzu98+WACCf61w65LvRWVB15NL1USMLLCIOQ9nBJLQE1P5Y1In5MpJyS4vMk1NwAD7G6Nyt7+JXVyGxk4M/G6E8cJwouz1JZy1UO1Mw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=behsLTtOxHweBNicK6o6u4YOweeG4n5p8UCnLG1hCnJCRwmGFxp3B3HpVLMNe7JR95bnWR07cU8erAGzkpN+7c0OZsAW67XfIyxF0gOZNY+0NGdH85+e+9OYjrcuK/LWbvUUImO4evV66FzAyNlIeCQ+p0DbhTWdNRriIeU27j0Guu4i+1w2hBFaLANk7oEhMgGWRSysqid6VpaP5cM7TGmlV3O+/NR3SSjd+ejflyHfxXUBCm1CR8pNHUWa8/Ngbl0eatc6uDz3i/+PwQYrielONniFsRa/HjTXdpGJ8lw/83xwxY7EJzjSNR+lu9Ut8lf0GmIzXwQyzA5wOeNIfQ==
  • Authentication-results: esa2.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Paul Durrant <paul@xxxxxxx>, Michał Leszczyński <michal.leszczynski@xxxxxxx>, Hubert Jasudowicz <hubert.jasudowicz@xxxxxxx>, Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
  • Delivery-date: Mon, 01 Feb 2021 11:12:07 +0000
  • Ironport-sdr: O9xCrGXcLhKM4N2MpDmmqdgv8fA8hSyUncQjLbJR6L64/hOyN90fDqQSLKfaF3UDnqk/bk+3YB wVxfUpBdYF+VV36Q4ZiISWEjDuIioMpCzF5yNe07gK8E/4aTiNjDReDAHToSpul8oHyOkk632n tqGcU01+vf6bzeHgUQSValJrk95FV3/oTDdQWl0v3l44vqx/1Hcof7z6F7NyVtcBTOliAauleW I5VRblt+sF/0mY7grY+iP+Dc2r66OvMBmCfgN257Ge/2uCmgNUEp3PQ1kDWPpL05XPMgCw72Oa TNI=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 01/02/2021 10:10, Roger Pau Monné wrote:
> On Sat, Jan 30, 2021 at 02:58:42AM +0000, Andrew Cooper wrote:
>> A guest's default number of grant frames is 64, and XENMEM_acquire_resource
>> will reject an attempt to map more than 32 frames.  This limit is caused by
>> the size of mfn_list[] on the stack.
>>
>> Fix mapping of arbitrary size requests by looping over batches of 32 in
>> acquire_resource(), and using hypercall continuations when necessary.
>>
>> To start with, break _acquire_resource() out of acquire_resource() to cope
>> with type-specific dispatching, and update the return semantics to indicate
>> the number of mfns returned.  Update gnttab_acquire_resource() and x86's
>> arch_acquire_resource() to match these new semantics.
>>
>> Have do_memory_op() pass start_extent into acquire_resource() so it can pick
>> up where it left off after a continuation, and loop over batches of 32 until
>> all the work is done, or a continuation needs to occur.
>>
>> compat_memory_op() is a bit more complicated, because it also has to marshal
>> frame_list in the XLAT buffer.  Have it account for continuation information
>> itself and hide details from the upper layer, so it can marshal the buffer in
>> chunks if necessary.
>>
>> With these fixes in place, it is now possible to map the whole grant table 
>> for
>> a guest.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Just one comment/question regarding a continuation below.
>
> I have to admit I had a hard time reviewing this, all this compat code
> plus the continuation stuff is quite hard to follow.
>
>> ---
>> CC: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
>> CC: Ian Jackson <iwj@xxxxxxxxxxxxxx>
>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>> CC: Stefano Stabellini <sstabellini@xxxxxxxxxx>
>> CC: Wei Liu <wl@xxxxxxx>
>> CC: Julien Grall <julien@xxxxxxx>
>> CC: Paul Durrant <paul@xxxxxxx>
>> CC: Michał Leszczyński <michal.leszczynski@xxxxxxx>
>> CC: Hubert Jasudowicz <hubert.jasudowicz@xxxxxxx>
>> CC: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
>>
>> v8:
>>  * nat => cmp change in the start_extent check.
>>  * Rebase over 'frame' and ARM/IOREQ series.
>>
>> v3:
>>  * Spelling fixes
>> ---
>>  xen/common/compat/memory.c |  94 +++++++++++++++++++++++++++-------
>>  xen/common/grant_table.c   |   3 ++
>>  xen/common/memory.c        | 124 
>> +++++++++++++++++++++++++++++++++------------
>>  3 files changed, 169 insertions(+), 52 deletions(-)
>>
>> diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
>> index 834c5e19d1..4c9cd9c05a 100644
>> --- a/xen/common/compat/memory.c
>> +++ b/xen/common/compat/memory.c
>> @@ -402,23 +402,10 @@ int compat_memory_op(unsigned int cmd, 
>> XEN_GUEST_HANDLE_PARAM(void) compat)
>>          case XENMEM_acquire_resource:
>>          {
>>              xen_pfn_t *xen_frame_list = NULL;
>> -            unsigned int max_nr_frames;
>>  
>>              if ( copy_from_guest(&cmp.mar, compat, 1) )
>>                  return -EFAULT;
>>  
>> -            /*
>> -             * The number of frames handled is currently limited to a
>> -             * small number by the underlying implementation, so the
>> -             * scratch space should be sufficient for bouncing the
>> -             * frame addresses.
>> -             */
>> -            max_nr_frames = (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.mar)) /
>> -                sizeof(*xen_frame_list);
>> -
>> -            if ( cmp.mar.nr_frames > max_nr_frames )
>> -                return -E2BIG;
>> -
>>              /* Marshal the frame list in the remainder of the xlat space. */
>>              if ( !compat_handle_is_null(cmp.mar.frame_list) )
>>                  xen_frame_list = (xen_pfn_t *)(nat.mar + 1);
>> @@ -432,6 +419,28 @@ int compat_memory_op(unsigned int cmd, 
>> XEN_GUEST_HANDLE_PARAM(void) compat)
>>  
>>              if ( xen_frame_list && cmp.mar.nr_frames )
>>              {
>> +                unsigned int xlat_max_frames =
> Could be made const static I think?

It is a compile time constant, but the compiler can already figure that
out.  static is definitely out of the question.

>
>> +                    (COMPAT_ARG_XLAT_SIZE - sizeof(*nat.mar)) /
>> +                    sizeof(*xen_frame_list);
>> +
>> +                if ( start_extent >= cmp.mar.nr_frames )
>> +                    return -EINVAL;
>> +
>> +                /*
>> +                 * Adjust nat to account for work done on previous
>> +                 * continuations, leaving cmp pristine.  Hide the 
>> continaution
>> +                 * from the native code to prevent double accounting.
>> +                 */
>> +                nat.mar->nr_frames -= start_extent;
>> +                nat.mar->frame += start_extent;
>> +                cmd &= MEMOP_CMD_MASK;
>> +
>> +                /*
>> +                 * If there are two many frames to fit within the xlat 
>> buffer,
>> +                 * we'll need to loop to marshal them all.
>> +                 */
>> +                nat.mar->nr_frames = min(nat.mar->nr_frames, 
>> xlat_max_frames);
>> +
>>                  /*
>>                   * frame_list is an input for translated guests, and an 
>> output
>>                   * for untranslated guests.  Only copy in for translated 
>> guests.
>> @@ -444,14 +453,14 @@ int compat_memory_op(unsigned int cmd, 
>> XEN_GUEST_HANDLE_PARAM(void) compat)
>>                                               cmp.mar.nr_frames) ||
>>                           __copy_from_compat_offset(
>>                               compat_frame_list, cmp.mar.frame_list,
>> -                             0, cmp.mar.nr_frames) )
>> +                             start_extent, nat.mar->nr_frames) )
>>                          return -EFAULT;
>>  
>>                      /*
>>                       * Iterate backwards over compat_frame_list[] expanding
>>                       * compat_pfn_t to xen_pfn_t in place.
>>                       */
>> -                    for ( int x = cmp.mar.nr_frames - 1; x >= 0; --x )
>> +                    for ( int x = nat.mar->nr_frames - 1; x >= 0; --x )
>>                          xen_frame_list[x] = compat_frame_list[x];
> Unrelated question, but I don't really see the point of iterating
> backwards, wouldn't it be easy to use use the existing 'i' loop
> counter and for a for ( i = 0; i <  nat.mar->nr_frames; i++ )?
>
> (Not that you need to fix it here, just curious about why we use that
> construct instead).

Iterating backwards is totally critical.

xen_frame_list and compat_frame_list are the same numerical pointer. 
We've just filled it 50% full with compat_pfn_t's, and need to turn
these into xen_pfn_t's which are double the size.

Iterating forwards would clobber every entry but the first.

>
>>                  }
>>              }
>> @@ -600,9 +609,11 @@ int compat_memory_op(unsigned int cmd, 
>> XEN_GUEST_HANDLE_PARAM(void) compat)
>>          case XENMEM_acquire_resource:
>>          {
>>              DEFINE_XEN_GUEST_HANDLE(compat_mem_acquire_resource_t);
>> +            unsigned int done;
>>  
>>              if ( compat_handle_is_null(cmp.mar.frame_list) )
>>              {
>> +                ASSERT(split == 0 && rc == 0);
>>                  if ( __copy_field_to_guest(
>>                           guest_handle_cast(compat,
>>                                             compat_mem_acquire_resource_t),
>> @@ -611,6 +622,21 @@ int compat_memory_op(unsigned int cmd, 
>> XEN_GUEST_HANDLE_PARAM(void) compat)
>>                  break;
>>              }
>>  
>> +            if ( split < 0 )
>> +            {
>> +                /* Continuation occurred. */
>> +                ASSERT(rc != XENMEM_acquire_resource);
>> +                done = cmd >> MEMOP_EXTENT_SHIFT;
>> +            }
>> +            else
>> +            {
>> +                /* No continuation. */
>> +                ASSERT(rc == 0);
>> +                done = nat.mar->nr_frames;
>> +            }
>> +
>> +            ASSERT(done <= nat.mar->nr_frames);
>> +
>>              /*
>>               * frame_list is an input for translated guests, and an output 
>> for
>>               * untranslated guests.  Only copy out for untranslated guests.
>> @@ -626,7 +652,7 @@ int compat_memory_op(unsigned int cmd, 
>> XEN_GUEST_HANDLE_PARAM(void) compat)
>>                   */
>>                  BUILD_BUG_ON(sizeof(compat_pfn_t) > sizeof(xen_pfn_t));
>>  
>> -                for ( i = 0; i < cmp.mar.nr_frames; i++ )
>> +                for ( i = 0; i < done; i++ )
>>                  {
>>                      compat_pfn_t frame = xen_frame_list[i];
>>  
>> @@ -636,15 +662,45 @@ int compat_memory_op(unsigned int cmd, 
>> XEN_GUEST_HANDLE_PARAM(void) compat)
>>                      compat_frame_list[i] = frame;
>>                  }
>>  
>> -                if ( __copy_to_compat_offset(cmp.mar.frame_list, 0,
>> +                if ( __copy_to_compat_offset(cmp.mar.frame_list, 
>> start_extent,
>>                                               compat_frame_list,
>> -                                             cmp.mar.nr_frames) )
>> +                                             done) )
>>                      return -EFAULT;
> Is it fine to return with a possibly pending continuation already
> encoded here?
>
> Other places seem to crash the domain when that happens.

Hmm.  This is all a total mess.  (Elsewhere the error handling is also
broken - a caller who receives an error can't figure out how to recover)

But yes - I think you're right - the only thing we can do here is `goto
crash;` and woe betide any 32bit kernel which passes a pointer to a
read-only buffer.

~Andrew



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.