On Fri, 2006-08-25 at 17:48 -0400, poff@xxxxxxxxxxxxxx wrote:
> +/* assumes destination page, *dp, is cacheable */
> +static __inline__ void copy_page_cacheable(void *dp, void *sp)
> +{
> + unsigned long dwords, dword_size;
> +
> + dword_size = 8;
> + dwords = (PAGE_SIZE / dword_size) - 1;
> +
> + clear_page_cacheable(dp);
> +
> + __asm__ __volatile__(
> + "mtctr %2 # copy_page\n\
> + ld %2,0(%1)\n\
> + std %2,0(%0)\n\
> +1: ldu %2,8(%1)\n\
> + stdu %2,8(%0)\n\
> + bdnz 1b"
> + : /* no result */
> + : "r" (dp), "r" (sp), "r" (dwords)
> + : "%ctr", "memory");
> +}

I would expect to see dcbtst in here, no?

Both functions (copy and clear) could stand a little loop unrolling.
I can understand if you're not *really* trying to optimize these, but in
that case why do you want to add dcbz? Is there a noticeable performance
improvement?
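
To make the dcbtst and unrolling suggestion concrete, here is a rough
sketch in portable C rather than PowerPC asm. It is not the code from the
patch: copy_page_sketch(), SKETCH_PAGE_SIZE and SKETCH_LINE_SIZE are
hypothetical names, and the 4 KiB page and 128-byte cache line are
assumptions. It relies on GCC's __builtin_prefetch() with the "write"
hint, which GCC generally lowers to a store-prefetch instruction such as
dcbtst on PowerPC targets; whether any of this actually helps would need
measuring.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */
#define SKETCH_LINE_SIZE 128u   /* assumption: 128-byte cache lines */

/* Hypothetical helper for illustration; not the patch's function. */
static void copy_page_sketch(void *dp, const void *sp)
{
    uint64_t *d = dp;
    const uint64_t *s = sp;
    const size_t lines = SKETCH_PAGE_SIZE / SKETCH_LINE_SIZE;
    const size_t per_line = SKETCH_LINE_SIZE / sizeof(uint64_t);

    for (size_t i = 0; i < lines; i++) {
        /* Hint that the next destination line is about to be written. */
        if (i + 1 < lines)
            __builtin_prefetch(d + per_line, 1);

        /* Copy one cache line, unrolled four doublewords at a time. */
        for (size_t j = 0; j < per_line; j += 4) {
            d[j]     = s[j];
            d[j + 1] = s[j + 1];
            d[j + 2] = s[j + 2];
            d[j + 3] = s[j + 3];
        }

        d += per_line;
        s += per_line;
    }
}
```

Working a cache line at a time keeps the prefetch hint and the unrolled
body in step; the same structure should carry over to an asm version if
the compiler's output turns out not to be good enough.
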

Also, it looks like you've removed support for mambo_memcpy(). I don't
use Mambo *ahem* systemsim myself, but that seems worth keeping. I guess
you could rename the function while you're in there. :)

--
Hollis Blanchard
IBM Linux Technology Center
_______________________________________________
Xen-ppc-devel mailing list
Xen-ppc-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ppc-devel