
Re: [Xen-devel] New MPI benchmark performance results (update)


Thanks for the response.

In the graphs presented on the web page, we take the native Linux results as the reference and normalize the other three scenarios to it. We observe a general pattern: dom0 usually performs better than domU with SMP, which in turn performs better than domU without SMP (where better performance means lower latency and higher throughput). However, we also notice a very large performance gap between domU (with or without SMP) and native Linux (or dom0, since dom0 generally performs very similarly to native Linux). Some distinct examples are: 8-node SendRecv latency (max domU/Linux ratio ~18), 8-node Allgather latency (max domU/Linux ratio ~17), and 8-node Alltoall latency (max domU/Linux ratio > 60). The difference in the last example is huge, and we cannot think of a reasonable explanation for why transferring 512B messages behaves so differently from other message sizes. We would appreciate any insight you can provide into such a large performance problem in these benchmarks.
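To make the normalization concrete, here is a minimal Python sketch; the latency values are invented placeholders for illustration, not our measured data:

```python
# Sketch: normalizing benchmark latencies against native Linux.
# The numbers below are made-up placeholders, not measured values.
native = {64: 10.0, 256: 12.0, 512: 15.0}   # message size (B) -> latency (us)
domu   = {64: 25.0, 256: 40.0, 512: 900.0}

def normalize(scores, reference):
    """Return each score divided by the reference (native Linux) result."""
    return {size: scores[size] / reference[size] for size in scores}

ratios = normalize(domu, native)
worst = max(ratios, key=ratios.get)
print(worst, ratios[worst])  # the 512B point stands out with ratio 60.0
```

A ratio of 1.0 means identical performance to native Linux; the anomalous 512B Alltoall point is the one that exceeds 60.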

I still don't quite understand your experimental setup. What version of
Xen are you using? How many CPUs does each node have? How many domU's do
you run on a single node?

The Xen version is 2.0, and each node has 2 CPUs. By "domU with SMP" in the previous email, I mean Xen is booted with SMP support (no "nosmp" option), dom0 is pinned to the first CPU, and domU is pinned to the second CPU. By "domU with no SMP", I mean Xen is booted without SMP support (with the "nosmp" option), so dom0 and domU share the same single CPU. Only one domU runs on each node in every experiment.
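For concreteness, a sketch of the setup; the `xm pincpu` syntax is my recollection of the Xen 2.0 tools and the guest name is hypothetical, so treat both as assumptions:

```shell
# Scenario "domU with SMP": boot Xen with SMP enabled (no "nosmp"
# on the Xen command line), then pin each domain to its own CPU.
# Assumed syntax: xm pincpu <domain> <vcpu> <cpu>
xm pincpu 0 0 0          # pin dom0 (domain 0), VCPU 0, to physical CPU 0
xm pincpu mpi-domu 0 1   # pin the guest (hypothetical name) to CPU 1

# Scenario "domU with no SMP": boot Xen with "nosmp" on its command
# line; dom0 and domU then time-share the single remaining CPU, so
# no pinning is needed.
```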

As regards the anomalous result for 512B AlltoAll performance, the best
way to track this down would be to use xen-oprofile.

I am not very familiar with xen-oprofile. I notice there have been some discussions about it on the mailing list; I wonder if there are any other documents I can refer to. Thanks.

Is it reliably repeatable?

Yes, the anomaly is reliably repeatable. Each data point reported in the graph is the average of 10 runs of the same experiment performed at different times.
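As a sketch of how we aggregate the runs (with invented latency values, not our measured data), we take the mean; the standard deviation indicates whether the anomaly is stable across runs or a one-off outlier:

```python
# Sketch: aggregating 10 repeated runs. Latencies (us) are made-up
# placeholders chosen only to illustrate the computation.
import statistics

runs = [905.0, 890.0, 910.0, 898.0, 902.0,
        895.0, 908.0, 900.0, 903.0, 899.0]

mean = statistics.mean(runs)
stdev = statistics.stdev(runs)
print(f"mean={mean:.1f}us stdev={stdev:.1f}us")
# A stdev that is small relative to the mean suggests the slow 512B
# result is consistent across runs, not measurement noise.
```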

Really bad results are usually due to packets being dropped
somewhere -- there hasn't been a whole lot of effort put into UDP
performance because so few applications use it.

To clarify: do you mean that a benchmark like Alltoall might use UDP rather than TCP as the transport protocol?

Thanks again for the help.


Xen-devel mailing list


