
Re: [Xen-devel] vmx: VT-d posted-interrupt core logic handling



>>> On 10.03.16 at 11:05, <kevin.tian@xxxxxxxxx> wrote:
>>  From: Tian, Kevin
>> Sent: Thursday, March 10, 2016 5:20 PM
>> 
>> > From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> > Sent: Thursday, March 10, 2016 5:06 PM
>> >
>> >
>> > > There are many linked-list usages in the Xen hypervisor today,
>> > > each with a different theoretical maximum length. The closest
>> > > one to PI might be the usage in tmem (pool->share_list), which
>> > > is page based and so could grow 'overly large'. Other examples
>> > > are orders of magnitude lower, e.g. s->ioreq_vcpu_list in the
>> > > ioreq server (which could be 8K in the above example), and
>> > > d->arch.hvm_domain.msixtbl_list in MSI-X virtualization (which
>> > > could be 2^11 per the spec). Do we also want to create some
>> > > artificial scenarios to examine them, since in actual operation
>> > > entries in the thousands may also become a problem?
>> > >
>> > > I just want to figure out how best we can handle all the related
>> > > linked-list usages in the current hypervisor.
>> >
>> > As you say, those are (perhaps with the exception of tmem, which
>> > isn't supported anyway due to XSA-15, and which therefore also
>> > isn't on by default) on the order of a few thousand list elements.
>> > And as mentioned above, different bounds apply to lists traversed
>> > in interrupt context vs. those traversed only in "normal" context.
>> >
>> 
>> That's a good point. Interrupt context should be subject to
>> stricter limits.
> 
> Hi, Jan,
> 
> I'm thinking about your earlier idea of an evenly distributed list:
> 
> --
> Ah, right, I think that limitation was mentioned before, yet I've
> forgotten about it again. But that only slightly alters the
> suggestion: distributing vCPU-s evenly would then require changing
> their placement on the pCPU in the course of entering the blocked
> state.
> --
> 
> Actually, after more thinking, there is no hard requirement that
> the vcpu must block on the pcpu configured in the 'NDST' field of
> that vcpu's PI descriptor. What really matters is that the vcpu is
> added to the linked list of that same pcpu, so that when a PI
> notification arrives we can always find the vcpu struct in that
> pcpu's linked list. Of course, one drawback of such a placement is
> the additional IPI incurred in the wake-up path.
> 
> Then one possible optimized policy within vmx_vcpu_block could be
> (say VCPU1 is currently blocking on PCPU1):
>
> - As long as the number of vcpus in PCPU1's linked list is below a
> threshold (say 16), add VCPU1 to that list and set NDST to PCPU1.
> Upon a PI notification on PCPU1, the local linked list is searched
> to find VCPU1, which is then unblocked on PCPU1;
>
> - Otherwise, add VCPU1 to the list of some PCPU2 chosen by a simple
> distribution algorithm (based on vcpu_id/vm_id). VCPU1 still blocks
> on PCPU1, but NDST is set to PCPU2. Upon a notification on PCPU2,
> the local linked list is searched to find VCPU1, and an IPI is then
> sent to PCPU1 to unblock VCPU1 (a rough sketch of this selection
> step follows below);
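To make the placement step above concrete, here is a minimal
stand-alone sketch of the selection logic. Every name in it is
invented for illustration (none are actual Xen symbols), locking is
omitted entirely, and the threshold and hash are just the examples
from the description above:

/*
 * Toy model of the proposal: a blocking vCPU normally goes onto the
 * wakeup list of the pCPU it blocks on; once that list is "full" it
 * spills over to a pCPU picked by a simple hash, and the PI
 * descriptor's NDST follows the list rather than the blocking pCPU.
 */
#include <stdio.h>

#define NR_PCPUS        8
#define LIST_THRESHOLD 16                  /* the "16" from the proposal */

static unsigned int list_len[NR_PCPUS];    /* blocked vcpus per pCPU list */

/* Return the pCPU whose list (and NDST) the blocking vCPU should use. */
static unsigned int choose_ndst(unsigned int block_pcpu,
                                unsigned int vm_id, unsigned int vcpu_id)
{
    if ( list_len[block_pcpu] < LIST_THRESHOLD )
        return block_pcpu;                 /* common case: local list */

    /* Spill over via a trivial vm_id/vcpu_id based distribution. */
    return (vm_id * 31 + vcpu_id) % NR_PCPUS;
}

int main(void)
{
    unsigned int block_pcpu = 3, vm_id = 7, vcpu_id = 42;
    unsigned int ndst = choose_ndst(block_pcpu, vm_id, vcpu_id);

    list_len[ndst]++;                      /* vCPU added to ndst's list */
    printf("blocks on pCPU%u, list/NDST on pCPU%u%s\n",
           block_pcpu, ndst,
           ndst == block_pcpu ? "" : " (unblock needs an IPI back)");
    return 0;
}

When a notification arrives on the NDST pCPU, that pCPU's list is
searched for the vCPU; if the vCPU found there actually blocks
elsewhere, the extra IPI mentioned above is what performs the unblock
on the blocking pCPU.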

Sounds possible, if the lock handling can be got right. But of
course there can't be any hard limit like 16, at least not alone
(on a system with extremely many mostly idle vCPU-s we'd need to
allow larger counts - see my earlier explanations in this
regard).
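
Purely to illustrate "not a hard limit, at least not alone" (the
thread doesn't specify any formula), such a cap could scale with how
many vCPU-s are blocked system-wide, keeping the fixed value only as
a floor. The function below is a guess at what that might look like
and uses made-up names:

/* Per-pCPU list cap: an even spread of all blocked vCPU-s, but never
 * below the small fixed threshold discussed above. */
static unsigned int pi_list_cap(unsigned int total_blocked_vcpus,
                                unsigned int nr_online_pcpus)
{
    unsigned int fair_share = (total_blocked_vcpus + nr_online_pcpus - 1) /
                              nr_online_pcpus;

    return fair_share > 16 ? fair_share : 16;
}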

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

