Xen project Mailing List

Re: [Xen-devel] [PATCH v3 0/2] VT-d flush issue

To: Jan Beulich <JBeulich@xxxxxxxx>, "Wu, Feng" <feng.wu@xxxxxxxxx>

Date: Tue, 22 Dec 2015 08:10:42 +0000

Accept-language: en-US

Cc: "Tian, Kevin" <kevin.tian@xxxxxxxxx>, "'keir@xxxxxxx'" <keir@xxxxxxx>, "'george.dunlap@xxxxxxxxxxxxx'" <george.dunlap@xxxxxxxxxxxxx>, "'andrew.cooper3@xxxxxxxxxx'" <andrew.cooper3@xxxxxxxxxx>, "'tim@xxxxxxx'" <tim@xxxxxxx>, "'xen-devel@xxxxxxxxxxxxx'" <xen-devel@xxxxxxxxxxxxx>, "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>

Delivery-date: Tue, 22 Dec 2015 08:10:50 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

Thread-topic: [Xen-devel] [PATCH v3 0/2] VT-d flush issue

> On 22.12.2015 at 4:01pm <JBeulich@xxxxxxxx> wrote:
> >>> On 22.12.15 at 08:40, <feng.wu@xxxxxxxxx> wrote:
> > Maybe, there are still some misunderstanding about your expectation.
> > Let me summarize it here.
> >
> > After Quan's patch-set, there are two types of error code:
> > - -EOPNOTSUPP
> > Now we only support and use software way to synchronize the
> > invalidation, if someone calls queue_invalidate_wait() and passes sw
> > with 0, then -EOPNOTSUPP is returned (Though this cannot happen in
> > real world, since queue_invalidate_wait() is called only in one place
> > and 1 is passed in to 'sw').
> > So I am not sure what should we do for this return value, if we really
> > get that return value, it means the flush is not actually executed, so
> > the iommu state is incorrect, the data is inconsistent. Do you think
> > what should we do for this case?
>
> Since seeing this error would be a software bug, BUG() or ASSERT() are fine to
> handle this specific case, if need be.
>
> > - -ETIMEDOUT
> > For this case, Quan has elaborate a lot, IIUIC, the main gap between
> > you and Quan is you think the error code should be propagated to the
> > up caller, while in Quan's implementation, he deals with this error in
> > invalidate_timeout() and device_tlb_invalidate_timeout(), hence no need
> > to propagated it to up called, since the handling policy will crash the
> > domain, so we don't think propagated error code is needed. Even we
> > propagate it, the up caller still doesn't need to do anything for it.
>
> "Handling" an error by e.g. domain_crash() doesn't mean you don't need to also
> modify (or at the very least inspect) callers: They may continue doing things
> _assuming_ success. Of course you don't need to domain_crash() at all layers.
> But errors from lower layers should, at least in most ordinary cases, lead to
> higher layers bailing instead of continuing.
>

For a Device-TLB flush error, I think we need to propagate the error code.
For an IEC/iotlb/context flush error, if a panic is acceptable, we can ignore
the propagated error code.

BTW, it is very challenging / tricky to handle all of the errors, and some
errors are unrecoverable. As mentioned, it looks like rewriting the Xen
hypervisor.

For -EOPNOTSUPP, we can print a warning message. If the interrupt method is
supported, we can return 0 in queue_invalidate_wait().

Feng, thanks for your update!

-Quan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
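To make the error-handling policy discussed in this thread concrete, here is a minimal sketch in C. It is not code from the patch series or from Xen: every `sketch_*` name, the `struct sketch_domain` type, and the `hw_times_out` flag are hypothetical stand-ins invented for illustration. It only shows the shape of the idea: -EOPNOTSUPP is treated as a software bug (an assertion), while a Device-TLB flush timeout both crashes the domain and is returned to the caller, so that the higher layer bails out instead of continuing as if the flush had succeeded.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a domain; this is NOT Xen's struct domain. */
struct sketch_domain {
    bool crashed;      /* set once the domain has been crashed */
    int pages_mapped;  /* work done by the higher layer after a flush */
};

/*
 * Hypothetical lowest layer: wait for an invalidation to complete.
 * As in the thread, only the software wait (sw != 0) is supported here;
 * hw_times_out is an illustrative knob that simulates a hardware timeout.
 */
int sketch_invalidate_wait(int sw, bool hw_times_out)
{
    if (!sw)
        return -EOPNOTSUPP;
    return hw_times_out ? -ETIMEDOUT : 0;
}

/*
 * Hypothetical middle layer: a Device-TLB flush. On timeout it crashes
 * the domain, but it still returns the error instead of swallowing it.
 */
int sketch_flush_device_tlb(struct sketch_domain *d, bool hw_times_out)
{
    int rc = sketch_invalidate_wait(1, hw_times_out);

    /* -EOPNOTSUPP could only result from a software bug, so assert. */
    assert(rc != -EOPNOTSUPP);

    if (rc == -ETIMEDOUT)
        d->crashed = true;  /* stand-in for calling domain_crash() */

    return rc;              /* propagate to the caller */
}

/*
 * Hypothetical higher layer: it must not continue as if the flush had
 * succeeded, so it bails out on any error from the lower layer.
 */
int sketch_update_mapping(struct sketch_domain *d, bool hw_times_out)
{
    int rc = sketch_flush_device_tlb(d, hw_times_out);

    if (rc)
        return rc;          /* bail instead of assuming success */

    d->pages_mapped++;
    return 0;
}
```

In this sketch, calling `sketch_update_mapping()` with a simulated timeout leaves `pages_mapped` untouched and marks the domain as crashed, which is the "higher layers bail" behaviour Jan asks for above. Whether the other flush types (IEC/iotlb/context) should propagate errors the same way or simply panic is the open question in this thread, and the sketch does not try to settle it.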
