
Re: [Xen-devel] Discussion about virtual iommu support for Xen guest





On 6/3/2016 7:17 PM, Tian, Kevin wrote:
From: Andrew Cooper [mailto:andrew.cooper3@xxxxxxxxxx]
Sent: Friday, June 03, 2016 2:59 AM

On 02/06/16 16:03, Lan, Tianyu wrote:
On 5/27/2016 4:19 PM, Lan Tianyu wrote:
On 2016年05月26日 19:35, Andrew Cooper wrote:
On 26/05/16 09:29, Lan Tianyu wrote:

To be viable going forwards, any solution must work with PVH/HVMLite as
much as HVM.  This alone negates qemu as a viable option.

From a design point of view, having Xen needing to delegate to qemu to
inject an interrupt into a guest seems backwards.


Sorry, I am not familiar with HVMlite. HVMlite doesn't use Qemu, so the
qemu virtual iommu can't work for it. We would have to rewrite the
virtual iommu in Xen, right?


A whole lot of this would be easier to reason about if/when we get a
basic root port implementation in Xen, which is necessary for HVMLite,
and which will make the interaction with qemu rather more clean.  It is
probably worth coordinating work in this area.

The virtual iommu should also sit under the basic root port in Xen,
right?


As for the individual issue of 288vcpu support, there are already
issues
with 64vcpu guests at the moment. While it is certainly fine to remove
the hard limit at 255 vcpus, there is a lot of other work required to
even get 128vcpu guests stable.


Could you give some pointers to these issues? We are enabling support
for more vcpus, and a guest can basically boot with 255 vcpus without IR
support. It would be very helpful to learn about the known issues.

We will also add 128-vcpu cases to our regular tests to find related
bugs. Increasing the max vcpu count to 255 should be a good start.

Hi Andrew:
Could you give more input about the issues with 64 vcpus and what needs
to be done to make 128-vcpu guests stable? We hope to do something to
improve them.

What's the progress of the PCI host bridge work in Xen? In your opinion,
we should do that first, right? Thanks.

Very sorry for the delay.

There are multiple interacting issues here.  On the one side, it would
be useful if we could have a central point of coordination on
PVH/HVMLite work.  Roger - as the person who last did HVMLite work,
would you mind organising that?

For the qemu/xen interaction, the current state is woeful and a tangled
mess.  I wish to ensure that we don't make any development decisions
which makes the situation worse.

In your case, the two motivations are quite different, so I would
recommend dealing with them independently.

IIRC, the issue with more than 255 cpus and interrupt remapping is that
you can only use x2apic mode with more than 255 cpus, and IOAPIC RTEs
can't be programmed to generate x2apic interrupts?  In principle, if you
don't have an IOAPIC, are there any other issues to be considered?  What
happens if you configure the LAPICs in x2apic mode, but have the IOAPIC
deliver xapic interrupts?

The key is the APIC ID. The introduction of x2apic did not modify the
existing PCI MSI and IOAPIC formats. PCI MSI/IOAPIC can only send
interrupt messages containing an 8-bit APIC ID, which cannot address
>255 cpus. Interrupt remapping supports 32-bit APIC IDs, so it is
required in order to enable >255 cpus with x2apic mode.

If the LAPIC is in x2apic mode while interrupt remapping is disabled,
the IOAPIC cannot deliver interrupts to all cpus in the system if
#cpu > 255.
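
To make the addressing limit concrete, here is a rough sketch of the two
MSI address layouts (compatibility format vs. VT-d remappable format).
The field widths follow the Intel SDM / VT-d spec, but the structs are
only an illustration for this thread, not code taken from Xen, and the
bitfield layout assumes an LSB-first compiler such as gcc on x86:

    #include <stdint.h>

    /* Compatibility-format MSI address, as used by PCI MSI and IOAPIC
     * RTEs without interrupt remapping.  The destination lives in
     * bits 19:12, i.e. an 8-bit APIC ID, so at most 256 addressable
     * cpus. */
    struct msi_addr_compat {
        uint32_t reserved0:2;
        uint32_t dest_mode:1;    /* 0 = physical, 1 = logical */
        uint32_t redir_hint:1;
        uint32_t reserved1:8;
        uint32_t dest_id:8;      /* 8-bit APIC ID - the limiting field */
        uint32_t fee:12;         /* fixed 0xFEE */
    };

    /* Remappable-format MSI address with VT-d interrupt remapping.
     * The address now carries a 16-bit handle into the Interrupt
     * Remapping Table; the IRTE holds a full 32-bit destination ID,
     * which is what makes >255 cpus reachable. */
    struct msi_addr_remap {
        uint32_t ignored:2;
        uint32_t handle_15:1;    /* bit 15 of the handle */
        uint32_t shv:1;          /* subhandle valid */
        uint32_t format:1;       /* 1 = remappable */
        uint32_t handle_0_14:15; /* bits 14:0 of the handle */
        uint32_t fee:12;         /* fixed 0xFEE */
    };

The IOAPIC RTE likewise has only an 8-bit destination field in its high
dword, which is why the same argument applies to IOAPIC-routed
interrupts.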

Another key factor: the Linux kernel disables x2apic mode when the max
APIC ID is > 255 and there is no interrupt remapping capability. The
reason for this is what Kevin said above. So booting up >255 cpus relies
on interrupt remapping.
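
For reference, that rule boils down to something like the following
check (an illustrative sketch only, with made-up helper names, not the
actual code under arch/x86/kernel/apic/):

    #include <stdbool.h>

    /* Illustration of the policy described above; not a real kernel
     * API. */
    static bool can_use_x2apic(unsigned int max_apic_id,
                               bool intr_remapping_enabled)
    {
        /*
         * Without interrupt remapping, MSI/IOAPIC messages carry only
         * an 8-bit destination, so APIC IDs above 255 would be
         * unreachable; in that case x2apic mode has to be refused.
         */
        if (max_apic_id > 255 && !intr_remapping_enabled)
            return false;

        return true;
    }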

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel