
Re: [Xen-devel] vmx: VT-d posted-interrupt core logic handling



On 10/03/16 08:07, Jan Beulich wrote:
>>>> On 10.03.16 at 06:09, <kevin.tian@xxxxxxxxx> wrote:
>> It's always good to have a clear definition of the extent to which a
>> performance issue becomes a security risk. I saw 200us/500us used as
>> examples in this thread, but no one can give an accurate criterion. In
>> that case, how can we call it a problem even when Feng collected some
>> data? Based on the mindset of all maintainers?
> 
> I think I've already made clear in previous comments that such
> measurements won't lead anywhere. What we need is a
> guarantee (by way of enforcement in source code) that the
> lists can't grow overly large, compared to the total load placed
> on the system.
> 
>> I think a good way of looking at this is based on which capability is
>> impacted. In this specific case the directly impacted metric is the
>> interrupt delivery latency. However, today Xen is not RT-capable. Xen
>> doesn't commit to delivering a worst-case 10us interrupt latency. The
>> whole interrupt delivery path (from Xen into the guest) has not been
>> optimized yet, so there could be other reasons impacting latency too,
>> besides the concern about this specific list walk. There is no baseline
>> worst-case data w/o PI. There is no final goal to hit. There is no test
>> case to measure.
>>
>> Then why block this feature due to this unmeasurable concern? Why not
>> enable it now and improve it later, when it becomes a measurable
>> concern, i.e. once Xen commits to a clear interrupt latency goal (at
>> that time people working on that effort will have to identify all
>> kinds of problems impacting interrupt latency and can then optimize
>> them together)? People should understand that interrupt latency may be
>> bad in extreme cases like those discussed in this thread (w/ or w/o
>> PI), since Xen doesn't commit to anything here.
> 
> I've never made any reference to this being an interrupt latency
> issue; I think it was George who somehow implied this from earlier
> comments. Interrupt latency, at least generally, isn't a security
> concern (generally because of course latency can get so high that
> it might become a concern). All my previous remarks regarding the
> issue are solely from the common perspective of long running
> operations (which we've been dealing with outside of interrupt
> context in a variety of cases, as you may recall). Hence the purely
> theoretical basis for some sort of measurement would be to
> determine how long a worst case list traversal would take. With
> "worst case" being derived from the theoretical limits the
> hypervisor implementation so far implies: 128 vCPU-s per domain
> (a limit which we sooner or later will need to lift, i.e. taking into
> consideration a larger value - like the 8k for PV guests - wouldn't
> hurt) by 32k domains per host, totaling 4M possible list entries.
> Yes, it is obvious that this limit won't be reachable in practice, but
> no, any lower limit can't be guaranteed to be good enough.
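
To put the quoted worst case in numbers, the sketch below simply multiplies
out the limits cited above. The constant and function names are hypothetical
(they are not Xen identifiers), "32k" is assumed to mean 32 * 1024, and the
per-entry cost passed to traversal_time_ms() is an arbitrary input, not a
measurement.

```python
# Illustrative constants taken from the figures quoted above; the names
# are made up for this sketch and are not Xen identifiers.
MAX_VCPUS_PER_DOMAIN = 128        # per-domain vCPU limit cited above
MAX_VCPUS_PV_GUEST = 8 * 1024     # the "8k for PV guests" figure
MAX_DOMAINS_PER_HOST = 32 * 1024  # "32k domains per host", assumed 32 * 1024


def worst_case_entries(vcpus_per_domain, domains):
    """Upper bound on list entries if every vCPU of every domain is queued."""
    return vcpus_per_domain * domains


def traversal_time_ms(entries, ns_per_entry):
    """Time to walk `entries` list nodes at an assumed fixed cost per node."""
    return entries * ns_per_entry / 1_000_000


if __name__ == "__main__":
    current = worst_case_entries(MAX_VCPUS_PER_DOMAIN, MAX_DOMAINS_PER_HOST)
    lifted = worst_case_entries(MAX_VCPUS_PV_GUEST, MAX_DOMAINS_PER_HOST)
    print(current)  # 4194304, i.e. 4M
    print(lifted)   # 268435456, i.e. 256M
```

The point of the arithmetic is only that the bound grows with both limits:
lifting the per-domain vCPU limit to the 8k PV figure multiplies the
theoretical list length by 64.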

Can I suggest we suspend the discussion of what would or would not be
reasonable and come back to it next week?  I definitely feel myself
digging my heels in here, so it might be good to go away and come back
to the discussion with a bit of distance.

(Potential technical solutions are still fair game, I think.)

 -George

