
Re: [Xen-devel] [PATCH 1/2] IOMMU/MMU: Adjust top level functions for VT-d Device-TLB flush error.

To: "Quan Xu" <quan.xu@xxxxxxxxx>
From: "Jan Beulich" <JBeulich@xxxxxxxx>
Date: Wed, 30 Mar 2016 02:05:56 -0600
Cc: Kevin Tian <kevin.tian@xxxxxxxxx>, Feng Wu <feng.wu@xxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxxxxx>, Liu Jinsong <jinsong.liu@xxxxxxxxxxxxxxx>, Dario Faggioli <dario.faggioli@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Keir Fraser <keir@xxxxxxx>
Delivery-date: Wed, 30 Mar 2016 08:06:11 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

>>> On 30.03.16 at 04:28, <quan.xu@xxxxxxxxx> wrote:
> On March 29, 2016 3:21pm, <JBeulich@xxxxxxxx> wrote:
>> >>> On 28.03.16 at 05:33, <quan.xu@xxxxxxxxx> wrote:
>> > On March 18, 2016 1:15am, <JBeulich@xxxxxxxx> wrote:
>> >> >>> On 17.03.16 at 07:54, <quan.xu@xxxxxxxxx> wrote:
>> >> > --- a/xen/common/grant_table.c
>> >> > +++ b/xen/common/grant_table.c
>> >> > @@ -932,8 +932,9 @@ __gnttab_map_grant_ref(
>> >> >              {
>> >> >                  nr_gets++;
>> >> >                  (void)get_page(pg, rd);
>> >> > -                if ( !(op->flags & GNTMAP_readonly) )
>> >> > -                    get_page_type(pg, PGT_writable_page);
>> >> > +                if ( !(op->flags & GNTMAP_readonly) &&
>> >> > +                     !get_page_type(pg, PGT_writable_page) )
>> >> > +                        goto could_not_pin;
>> >>
>> >> This needs explanation, as it doesn't look related to what your
>> >> actual goal is: If an error was possible here, I think this would be
>> >> a security issue. However, as also kind of documented by the
>> >> explicitly ignored return value from get_page(), it is my understanding
>> >> that here we only obtain an _extra_ reference.
>> >>
>> >
>> > For this point, I inferred from:
>> > map_vcpu_info()
>> > {
>> > ...
>> >     if ( !get_page_type(page, PGT_writable_page) )
>> >     {
>> >         put_page(page);
>> >         return -EINVAL;
>> >     }
>> > ...
>> > }
>> > , then for get_page_type(), I think the return value:
>> >      0 -- error,
>> >      1-- right.
>> >
>> > So if get_page_type() is failed, we should goto could_not_pin.
>> 
>> Did you read my reply at all? The explanation I'm expecting here is why
>> error checking is all of the sudden needed _at all_.
>> 
> 
> Sorry for my stupid reply.
> As in this version, before the open discussion, I try to return the 
> iommu_{,un}map_page() error in this call tree:
>            iommu_{,un}map_page() -- __get_page_type() -- get_page_type()---
> then, in this point, I try to deal with this iommu_{,un}map_page() error.

I still don't get it: We're talking about a get_page_type() invocation
that previously was known to never fail (or at least so we hope,
based on the existing code). What I'm expecting as an explanation
is why this "cannot fail" state is not true any longer. And while
sorting this out, please pay particular attention to the limited set of
cases where __get_page_type() calls iommu_{,un}map_page() in
the first place.

>> > btw, there is another issue in the call path:
>> >     iommu_{,un}map_page() -- __get_page_type() -- get_page_type()---
>> >
>> >
>> > I tried to return iommu_{,un}map_page() error code in
>> > __get_page_type(), is it right?
>> 
>> If the operation got fully rolled back - yes. Whether fully rolling back is
>> feasible there though is - see the respective discussion - an open question.
>> 
> 
> For the open question, does it refer to as below:

Partly.

> """
> As said, we first need
> to settle on an abstract model. Do we want IOMMU mapping
> failures to be fatal to the domain (perhaps with the exception
> of the hardware one)? I think we do, and for the hardware domain
> we'd do things on a best effort basis (always erring on the side
> of unmapping). Which would probably mean crashing the domain
> could be centralized in iommu_{,un}map_page(). How much roll
> back would then still be needed in callers of these functions
> for the hardware domain's sake would need to be seen.
> """
> 
> I hope it is yes.

It is not clear to me what part of the above this is meant to refer to.
Perhaps this is meant to answer the question in the 2nd sentence,
but I think this really ought to take a little more than "yes".
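
To make the model in the quoted paragraph concrete, here is a minimal sketch of what "centralizing the crash" in the mapping function could look like. It is not Xen code: struct domain, hw_map_page(), domain_crash() and iommu_map_page_sketch() are simplified, hypothetical stand-ins, and the sketch assumes the hardware domain is the only one exempt from being crashed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for Xen's struct domain. */
struct domain {
    bool is_hardware_domain; /* assumed flag: true for the hardware domain */
    bool crashed;            /* set by the stubbed domain_crash() below */
    int hw_map_result;       /* result the stubbed hardware operation reports */
};

/* Stub for the vendor specific mapping operation (assumption). */
static int hw_map_page(struct domain *d, unsigned long gfn, unsigned long mfn)
{
    (void)gfn;
    (void)mfn;
    return d->hw_map_result;
}

/* Stub that only records the crash (assumption, not Xen's domain_crash()). */
static void domain_crash(struct domain *d)
{
    d->crashed = true;
}

/*
 * Failure is fatal to an ordinary domain and is handled in one place.
 * The hardware domain is not crashed; the error is still returned so
 * that callers can do best effort clean-up for it.
 */
static int iommu_map_page_sketch(struct domain *d, unsigned long gfn,
                                 unsigned long mfn)
{
    int rc;

    assert(d != NULL);

    rc = hw_map_page(d, gfn, mfn);
    if ( rc && !d->is_hardware_domain )
        domain_crash(d);

    return rc;
}
```

With the crash centralized like this, callers would still see the error code, and how much roll back they need for the hardware domain remains the open part of the question.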

>> >> > --- a/xen/drivers/passthrough/x86/iommu.c
>> >> > +++ b/xen/drivers/passthrough/x86/iommu.c
>> >> > @@ -104,7 +104,11 @@ int arch_iommu_populate_page_table(struct domain *d)
>> >> >      this_cpu(iommu_dont_flush_iotlb) = 0;
>> >> >
>> >> >      if ( !rc )
>> >> > -        iommu_iotlb_flush_all(d);
>> >> > +    {
>> >> > +        rc = iommu_iotlb_flush_all(d);
>> >> > +        if ( rc )
>> >> > +            iommu_teardown(d);
>> >> > +    }
>> >> >      else if ( rc != -ERESTART )
>> >> >          iommu_teardown(d);
>> >>
>> >> Why can't you just use the existing call to iommu_teardown(), by simply
>> >> deleting the "else"?
>> >>
>> >
>> > Just check it, could I modify it as below:
>> > --- a/xen/drivers/passthrough/x86/iommu.c
>> > +++ b/xen/drivers/passthrough/x86/iommu.c
>> > @@ -105,7 +105,8 @@ int arch_iommu_populate_page_table(struct domain *d)
>> >
>> >      if ( !rc )
>> >          iommu_iotlb_flush_all(d);
>> > -    else if ( rc != -ERESTART )
>> > +
>> > +    if ( rc != -ERESTART )
>> >          iommu_teardown(d);
>> 
>> Clearly not - not only are you losing the return value of
>> iommu_iotlb_flush_all() now, you would then also call
>> iommu_teardown() in the "success" case. My comment was related to code
>> structure, yet you seem to have taken it literally.
>> 
> 
> Then, what about this one:
> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -104,8 +104,9 @@ int arch_iommu_populate_page_table(struct domain *d)
>      this_cpu(iommu_dont_flush_iotlb) = 0;
> 
>      if ( !rc )
> -        iommu_iotlb_flush_all(d);
> -    else if ( rc != -ERESTART )
> +        rc = iommu_iotlb_flush_all(d);
> +
> +    if ( !rc && rc != -ERESTART )
>          iommu_teardown(d);
> 
> 
> IMO, my original modification is correct and redundant with 2 
> 'iommu_teardown()'..
> If this is still the correct one, could you help me send out the correct 
> one?

The above looks right to me.

Jan
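
The control flow discussed for the tail of arch_iommu_populate_page_table() can be sketched in isolation as follows: flush only if population succeeded, keep the flush result, and tear down on any failure other than -ERESTART, using a single iommu_teardown() call. This is a stand-alone illustration, not the Xen source: struct domain, the two stubbed functions and finish_populate() are hypothetical, and it assumes iommu_iotlb_flush_all() returns an int as proposed in the patch. Note that the condition is written here as "rc && rc != -ERESTART"; with "!rc" in its place, the teardown would run only when rc is zero, i.e. in the success case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef ERESTART
#define ERESTART 85 /* not provided by every libc; value is illustrative */
#endif

/* Hypothetical stand-in for Xen's struct domain. */
struct domain {
    int flush_result;   /* value the stubbed flush reports */
    int teardown_calls; /* how often the stubbed teardown ran */
};

/* Stub: assumed to return 0 on success or a negative errno value. */
static int iommu_iotlb_flush_all(struct domain *d)
{
    return d->flush_result;
}

/* Stub: only counts invocations. */
static void iommu_teardown(struct domain *d)
{
    d->teardown_calls++;
}

/* rc is the result of the (omitted) page table population loop. */
static int finish_populate(struct domain *d, int rc)
{
    assert(d != NULL);

    /* Flush only when population succeeded, and keep the flush result. */
    if ( !rc )
        rc = iommu_iotlb_flush_all(d);

    /* One teardown call covers population and flush failures alike. */
    if ( rc && rc != -ERESTART )
        iommu_teardown(d);

    return rc;
}
```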
