
[XENVIF PATCH RESEND] Faster checksum for x64 with carry flag renaming



Modern CPUs are capable of renaming CF independently of other arithmetic
flags. We can better exploit ILP by maintaining two carry chains in the
hot checksum loop. This gives a ~20% speed boost for packets larger than
64 bytes (as tested on Zen 3).

Suggested-by: Frediano Ziglio <frediano.ziglio@xxxxxxxxxx>
Signed-off-by: Tu Dinh <ngoc-tu.dinh@xxxxxxxxxx>
---
 src/xenvif/amd64/checksum_amd64.asm | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)
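
Note for reviewers (not part of the commit): a minimal C sketch of why
splitting the block into two carry chains leaves the result unchanged.
It models each `adc ..., 0` as an add with end-around carry. The
function names (add_eac, sum_one_chain, sum_two_chains, fold16) are
hypothetical and do not exist in xenvif, and it assumes the input is a
whole number of 64-byte blocks. It illustrates the arithmetic only: a C
compiler is not guaranteed to emit two adc chains from it. The extra
`adc rax, 0` in the patch is there because `add r9, ...` overwrites CF,
so the first chain's carry has to be folded into rax beforehand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add with end-around carry: models `add`/`adc` followed by `adc reg, 0`. */
static uint64_t add_eac(uint64_t a, uint64_t b)
{
    uint64_t s = a + b;
    return s + (s < a);     /* s < a exactly when the addition wrapped */
}

/* One dependency chain over each 64-byte block (pre-patch shape). */
static uint64_t sum_one_chain(uint64_t acc, const uint64_t *p, size_t nblocks)
{
    assert(p != NULL || nblocks == 0);
    for (size_t i = 0; i < nblocks * 8; i++)
        acc = add_eac(acc, p[i]);
    return acc;
}

/* Two independent chains per block, merged at the end (post-patch shape). */
static uint64_t sum_two_chains(uint64_t acc, const uint64_t *p, size_t nblocks)
{
    assert(p != NULL || nblocks == 0);
    for (size_t b = 0; b < nblocks; b++, p += 8) {
        uint64_t lo = acc;      /* plays the role of rax */
        uint64_t hi = p[4];     /* plays the role of r9  */
        for (size_t i = 0; i < 4; i++)
            lo = add_eac(lo, p[i]);
        for (size_t i = 5; i < 8; i++)
            hi = add_eac(hi, p[i]);
        acc = add_eac(lo, hi);  /* adc rax, r9 ; adc rax, 0 */
    }
    return acc;
}

/* Fold a 64-bit ones' complement sum down to 16 bits. */
static uint16_t fold16(uint64_t x)
{
    x = (x & 0xffffffffu) + (x >> 32);
    x = (x & 0xffffu) + (x >> 16);
    x = (x & 0xffffu) + (x >> 16);
    x = (x & 0xffffu) + (x >> 16);
    return (uint16_t)x;
}
```

Both orderings agree because addition with end-around carry is
commutative and associative, so the 64-bit words of a block may be
grouped into any number of partial sums before the final merge.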

diff --git a/src/xenvif/amd64/checksum_amd64.asm b/src/xenvif/amd64/checksum_amd64.asm
index 8fbc241..37bbb7e 100644
--- a/src/xenvif/amd64/checksum_amd64.asm
+++ b/src/xenvif/amd64/checksum_amd64.asm
@@ -26,10 +26,12 @@ l64:
     adc rax, [rdx + 8]
     adc rax, [rdx + 16]
     adc rax, [rdx + 24]
-    adc rax, [rdx + 32]
-    adc rax, [rdx + 40]
-    adc rax, [rdx + 48]
-    adc rax, [rdx + 56]
+    adc rax, 0
+    mov r9, [rdx + 32]
+    add r9, [rdx + 40]
+    adc r9, [rdx + 48]
+    adc r9, [rdx + 56]
+    adc rax, r9
     adc rax, 0
 
     sub r8, 64
-- 
2.51.0.windows.2



--
Ngoc Tu Dinh | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech




 

