Re: [Xen-devel] swiotlb=force in Konrad's xen-pcifront-0.8.2 pvops domU
To: Dante Cinco <dantecinco@xxxxxxxxx>, andrew.thomas@xxxxxxxxxx, mukesh.rathor@xxxxxxxxxx, keir.fraser@xxxxxxxxxxxxx, mathieu.desnoyers@xxxxxxxxxx, chris.mason@xxxxxxxxxx, Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Subject: Re: [Xen-devel] swiotlb=force in Konrad's xen-pcifront-0.8.2 pvops domU kernel with PCI passthrough
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Thu, 18 Nov 2010 12:19:36 -0500
Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Keir, Dan, Mathieu, Chris, Mukesh,
This fellow is passing in a PCI device to his Xen PV guest and trying
to get high IOPS. The kernel he is using is a 2.6.36 with tglx's
sparse_irq rework.
> I wanted to confirm that bounce buffering was indeed occurring so I
> modified swiotlb.c in the kernel and added printks in the following
> functions:
> swiotlb_bounce
> swiotlb_tbl_map_single
> swiotlb_tbl_unmap_single
> Sure enough we were calling all 3 five times per I/O. We took your
> suggestion and replaced pci_map_single with pci_pool_alloc. The
> swiotlb calls were gone but the I/O performance only improved 6% (29k
> IOPS to 31k IOPS) which is still abysmal.
Hey! 6% is nothing to sneeze at.
>
> Any suggestions on where to look next? I have one question about the
Since you are talking IOPS, I figure you must be using fio to get those
numbers. And since you mentioned HVM at some point, you are not running
this PV domain as a back-end for another PV guest. You are probably going
to run some form of iSCSI target and push the I/Os down to the PCI device.
A couple of things pop into my head, but let's first address your question.
> P2M array: Does the P2M lookup occur every DMA or just during the
> allocation? What I'm getting at is this: Is the Xen-SWIOTLB a central
It only occurs during allocation. Also, since you are bypassing the
bounce buffer, those calls are done without any spinlock. The P2M lookup
is just bit-shifting and division, a constant amount of work, so it is O(1).
> resource that could be a bottleneck?
Doubt it. Your best bet to figure this out is to play with ftrace or
perf trace. But I don't know how well they work with Xen nowadays. Jeremy
and Mathieu Desnoyers poked at it a bit, and I think I overheard that Mathieu
got it working?
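To illustrate why a P2M lookup of this kind is constant-time, here is a minimal user-space sketch of a three-level PFN-to-MFN table. It is not the kernel's code: the names (p2m_tree, p2m_lookup, p2m_set) and the table sizes are assumptions made up for the example. The point is only that each lookup is a fixed number of divisions (which reduce to shifts for power-of-two sizes) and array indexings, with no locking or searching.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, scaled-down sizes chosen for illustration only. */
#define LEAF_ENTRIES 512UL
#define MID_ENTRIES  512UL
#define TOP_ENTRIES  4UL
#define INVALID_MFN  (~0UL)

/* top -> mid -> leaf; unpopulated levels are NULL. */
typedef struct {
    unsigned long **top[TOP_ENTRIES];
} p2m_tree;

/* O(1): two divisions, two modulo operations, three array indexings. */
unsigned long p2m_lookup(const p2m_tree *t, unsigned long pfn)
{
    unsigned long topidx = pfn / (MID_ENTRIES * LEAF_ENTRIES);
    unsigned long mididx = (pfn / LEAF_ENTRIES) % MID_ENTRIES;
    unsigned long idx    = pfn % LEAF_ENTRIES;

    assert(t != NULL);
    if (topidx >= TOP_ENTRIES)
        return INVALID_MFN;
    if (!t->top[topidx] || !t->top[topidx][mididx])
        return INVALID_MFN;
    return t->top[topidx][mididx][idx];
}

/* Populate an entry, allocating missing levels. Returns 0 on success, -1 on failure. */
int p2m_set(p2m_tree *t, unsigned long pfn, unsigned long mfn)
{
    unsigned long topidx = pfn / (MID_ENTRIES * LEAF_ENTRIES);
    unsigned long mididx = (pfn / LEAF_ENTRIES) % MID_ENTRIES;
    unsigned long idx    = pfn % LEAF_ENTRIES;

    assert(t != NULL);
    if (topidx >= TOP_ENTRIES)
        return -1;
    if (!t->top[topidx]) {
        /* calloc zeroes the mid level, so all leaf pointers start as NULL */
        t->top[topidx] = calloc(MID_ENTRIES, sizeof(unsigned long *));
        if (!t->top[topidx])
            return -1;
    }
    if (!t->top[topidx][mididx]) {
        unsigned long *leaf = malloc(LEAF_ENTRIES * sizeof(*leaf));
        if (!leaf)
            return -1;
        /* mark every slot of a fresh leaf as unmapped */
        for (unsigned long i = 0; i < LEAF_ENTRIES; i++)
            leaf[i] = INVALID_MFN;
        t->top[topidx][mididx] = leaf;
    }
    t->top[topidx][mididx][idx] = mfn;
    return 0;
}
```

The cost of p2m_lookup does not depend on how many entries are populated, which is why the lookup itself is an unlikely bottleneck.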
So the next couple of possibilities are:
1). You are hitting the spinlock issues on 'struct request' or on any of
the paths in the I/O stack. Oracle did a lot of work on those, and one
way to find this out is to look at tracing and see where the contention is.
I don't know where or if those patches have been posted upstream, but as I
said, if you are seeing high spinlock usage, that might be it.
1b). Spinlocks: make sure you have CONFIG_PARAVIRT_SPINLOCKS enabled. Otherwise
you are going to hit dreadful conditions.
2). You are hitting the 64-bit syscall wall. Basically your user-mode
application (fio) is doing a write(), which used to go through int 0x80 but
now uses the syscall instruction. The syscall gets trapped in the hypervisor,
which has to call into your PV kernel. You get hit with two context switches
for each write() call. The solution is to use a 32-bit DomU, as there the
guest user application and guest kernel run in different rings.
3). Xen CPU pools. You didn't say where the application that sends the I/Os
is located. But if it is in a separate domain then you will want to use
Xen CPU pools. Basically this way you can get gang-scheduling, where the
guest that submits the I/O and the guest that picks up the I/O run
right after each other. I don't know many more details, but this is what
I understand it does.
4). CPU/MSI-X affinity. I think you already did this, but make sure you pin
your guest to specific CPUs and also pin the MSI-X vectors to the proper
destination. You can use 'xm debug-keys i' to see the MSI-X affinity. It
is a mask, so basically check whether it overlays the CPUs your guest is
running on. Not sure how to actually set the MSI-X affinity, now that I
think about it. Keir or some of the Intel folks might know better.
5). Andrew, Mukesh, Keir, Dan, any other ideas?
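
For the affinity check in item 4, the comparison itself is plain bitmask arithmetic. The sketch below assumes you have already read the affinity mask off the debug-key output by hand and know which CPUs the guest is pinned to. The helper names are hypothetical, nothing here parses Xen output, and it assumes at most 64 CPUs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Build a bitmask with one bit set per CPU number in cpus[].
 * CPU numbers >= 64 are ignored (limit of this sketch). */
uint64_t cpu_list_to_mask(const unsigned *cpus, size_t n)
{
    uint64_t mask = 0;

    assert(cpus != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (cpus[i] < 64)
            mask |= UINT64_C(1) << cpus[i];
    }
    return mask;
}

/* True if the interrupt may be delivered to at least one guest CPU. */
bool affinity_overlaps(uint64_t irq_affinity, uint64_t guest_cpus)
{
    return (irq_affinity & guest_cpus) != 0;
}

/* True if the interrupt can only be delivered to guest CPUs. */
bool affinity_contained(uint64_t irq_affinity, uint64_t guest_cpus)
{
    return irq_affinity != 0 && (irq_affinity & ~guest_cpus) == 0;
}
```

With the guest pinned to CPUs 2 and 3, cpu_list_to_mask gives 0xc. A vector whose affinity mask is 0x1 does not overlap it at all, so that interrupt would be landing on a CPU the guest never runs on.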