On Fri, 2006-08-25 at 17:48 -0400, poff@xxxxxxxxxxxxxx wrote:
> +/* assumes destination page, *dp, is cacheable */
> +static __inline__ void copy_page_cacheable(void *dp, void *sp)
> +{
> +       unsigned long dwords, dword_size;
> +
> +       dword_size = 8;
> +       dwords = (PAGE_SIZE / dword_size) - 1;
> +
> +       clear_page_cacheable(dp);
> +
> +       __asm__ __volatile__(
> +       "mtctr  %2      # copy_page\n\
> +       ld      %2,0(%1)\n\
> +       std     %2,0(%0)\n\
> +1:     ldu     %2,8(%1)\n\
> +       stdu    %2,8(%0)\n\
> +       bdnz    1b"
> +       : /* no result */
> +       : "r" (dp), "r" (sp), "r" (dwords)
> +       : "%ctr", "memory");
> +} 
I would expect to see dcbtst in here, no?
Both functions (copy and clear) could stand a little loop unrolling.
I can understand if you're not *really* trying to optimize these, but in
that case why do you want to add dcbz? Is there a noticeable performance
improvement?
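To make the unrolling and prefetch suggestion concrete, here is a rough portable C sketch, not code from the patch. The name copy_page_unrolled, the 4096-byte page size and the 128-byte cache line size are assumptions for illustration only, and __builtin_prefetch (a GCC/Clang builtin) stands in for a hand-written dcbtst; whether the compiler emits dcbtst for it depends on the target.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */
#define SKETCH_LINE_SIZE 128u   /* assumption: 128-byte cache lines */

/*
 * Hypothetical helper: copy one page 64 bits at a time, with the inner
 * loop unrolled 4x and a write-prefetch hint issued one destination
 * cache line ahead. Both pointers must be 8-byte aligned.
 */
static void copy_page_unrolled(void *dp, const void *sp)
{
    uint64_t *d = dp;
    const uint64_t *s = sp;
    const size_t words_per_line = SKETCH_LINE_SIZE / sizeof(uint64_t);
    const size_t lines = SKETCH_PAGE_SIZE / SKETCH_LINE_SIZE;

    for (size_t line = 0; line < lines; line++) {
        /* Hint that the next destination line is about to be written. */
        if (line + 1 < lines)
            __builtin_prefetch(d + words_per_line, 1);

        /* Four doublewords per iteration instead of one. */
        for (size_t i = 0; i < words_per_line; i += 4) {
            d[i]     = s[i];
            d[i + 1] = s[i + 1];
            d[i + 2] = s[i + 2];
            d[i + 3] = s[i + 3];
        }

        d += words_per_line;
        s += words_per_line;
    }
}
```

Whether this beats the simple ldu/stdu loop would have to be measured on the real hardware; the sketch only shows the shape of the change.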
Also, it looks like you've removed support for mambo_memcpy(). I don't
use Mambo *ahem* systemsim myself, but that seems worth keeping. I guess
you could rename the function while you're in there. :)
-- 
Hollis Blanchard
IBM Linux Technology Center
_______________________________________________
Xen-ppc-devel mailing list
Xen-ppc-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ppc-devel