
Re: [Xen-devel] [PATCH] x86: add SSE-based copy_page()



On 12/01/2009 23:29, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:

> I finally got around to measuring this.  On my two machines,
> an Intel "Weybridge" box and an Intel TBD quadcore box,
> the new SSE2 code was at best nearly the same for cold cache
> and much worse for warm cache.
> 
> I can't explain the sampling variation as I have interrupts off,
> a lock held, and pre-warmed TLB... I suppose maybe another
> processor could be causing rare TLB misses?  But in any case
> the min number is probably best for comparison.
> 
> I'm guessing the gcc optimizer for the memcpy code was tuned
> for an Intel pipeline... Jan, were you measuring on an
> AMD processor?
> 
> I've included the raw data and measurement code below.

Seems like unless we dynamically choose the copy routine, we're better off
without the SSE2 alternative. Shall I revert it then?
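
For reference, a minimal sketch of what "dynamically choose the copy routine"
could look like. This is not the patch under discussion and not Xen code: the
names copy_page_fn, copy_page_memcpy, copy_page_sse2, select_copy_page and
COPY_PAGE_SIZE are hypothetical, the SSE2 variant assumes non-temporal stores
and 16-byte-aligned buffers, and the timing is done in user space with
timespec_get() rather than with interrupts off. It times both routines at
startup, compares the minimum sample (as Dan suggests above), and keeps the
faster one behind a function pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

#define COPY_PAGE_SIZE 4096  /* assumed page size for this sketch */

typedef void (*copy_page_fn)(void *dst, const void *src);

/* Baseline: let the compiler/libc pick the memcpy implementation. */
static void copy_page_memcpy(void *dst, const void *src)
{
    memcpy(dst, src, COPY_PAGE_SIZE);
}

#if defined(__SSE2__)
/* Hypothetical SSE2 variant using non-temporal stores.
 * Both pointers must be 16-byte aligned. */
static void copy_page_sse2(void *dst, const void *src)
{
    __m128i *d = dst;
    const __m128i *s = src;
    for (size_t i = 0; i < COPY_PAGE_SIZE / sizeof(__m128i); i++)
        _mm_stream_si128(&d[i], _mm_load_si128(&s[i]));
    _mm_sfence();  /* make the streaming stores globally visible */
}
#endif

static uint64_t now_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);  /* C11; wall clock, adequate for a sketch */
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Return the minimum time of `runs` copies, since the minimum is the
 * sample least disturbed by unrelated noise. */
static uint64_t min_copy_time(copy_page_fn fn, void *dst, const void *src,
                              int runs)
{
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < runs; i++) {
        uint64_t t0 = now_ns();
        fn(dst, src);
        uint64_t dt = now_ns() - t0;
        if (dt < best)
            best = dt;
    }
    return best;
}

/* Pick the faster routine on this machine; fall back to memcpy. */
static copy_page_fn select_copy_page(void)
{
    static _Alignas(16) unsigned char src[COPY_PAGE_SIZE];
    static _Alignas(16) unsigned char dst[COPY_PAGE_SIZE];
    copy_page_fn chosen = copy_page_memcpy;
#if defined(__SSE2__)
    if (min_copy_time(copy_page_sse2, dst, src, 1000) <
        min_copy_time(copy_page_memcpy, dst, src, 1000))
        chosen = copy_page_sse2;
#endif
    return chosen;
}
```

A caller would run select_copy_page() once at boot and route all page copies
through the returned pointer. Note that this loop only measures the warm-cache
case; a real selection would also have to weigh cold-cache behaviour, which is
where the two routines were reported to be closest.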

 -- Keir



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

