glibc aarch64 memcpy fix for cortex-a72
From 6cf513dc8c5ab758072d894f10a58fbb4a146bd6 Mon Sep 17 00:00:00 2001
From: Jon Nettleton <jon@solid-run.com>
Date: Wed, 6 Oct 2021 09:16:35 -0400
Subject: [PATCH] Aarch64: Make memcpy more compatible with device memory

For normal non-cacheable memory ACE supports 4x128 bit r/w WRAP
transfers or 1x128 bit r/w INCR transfers. By re-ordering the
stp instructions in memcpy / memmove we can accommodate this better
without changing the behaviour of the existing code.

This fixes an issue seen on multiple Cortex-A72 SoCs when writing
directly to a PCIe memory-mapped frame-buffer, which resulted in
corruption.

Signed-off-by: Jon Nettleton <jon@solid-run.com>
---
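Note: the re-ordering only changes the sequence in which the stores
are issued, so the copied bytes should be identical before and after
the patch. The following is a minimal sanity-check sketch for that
claim, not part of the patch and not the author's test. It assumes
ordinary cacheable memory, so it cannot reproduce the PCIe frame-buffer
corruption itself. The names count_copy_mismatches, move_matches,
byte_copy and fill_pattern are hypothetical helpers invented for this
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { PAD = 16, CAP = 512, SIZE = CAP + 2 * PAD };

/* Byte-wise reference copy; the volatile store keeps the compiler from
   turning this loop back into a memcpy call. */
static void byte_copy(unsigned char *dst, const unsigned char *src, size_t len)
{
    volatile unsigned char *d = dst;
    for (size_t i = 0; i < len; i++)
        d[i] = src[i];
}

/* Fill a buffer with a position-dependent pattern. */
static void fill_pattern(unsigned char *buf, size_t len, unsigned char seed)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)(seed + i * 7u);
}

/* Check one memmove of len bytes inside a single buffer, from offset
   `from` to offset `to`; returns 1 when it matches the reference. */
static int move_matches(size_t len, size_t from, size_t to)
{
    unsigned char buf[SIZE], ref[SIZE], tmp[CAP];

    fill_pattern(buf, SIZE, 3);
    byte_copy(ref, buf, SIZE);
    byte_copy(tmp, buf + from, len);    /* snapshot the source first */
    byte_copy(ref + to, tmp, len);
    memmove(buf + to, buf + from, len);
    return memcmp(buf, ref, SIZE) == 0;
}

/* Count the cases, over lengths 0..max_len, in which memcpy or an
   overlapping memmove disagrees with the byte-wise reference. */
size_t count_copy_mismatches(size_t max_len)
{
    unsigned char src[SIZE], dst[SIZE], ref[SIZE];
    size_t bad = 0;

    assert(max_len <= CAP);
    for (size_t len = 0; len <= max_len; len++) {
        fill_pattern(src, SIZE, 1);
        memset(dst, 0xAA, SIZE);
        memset(ref, 0xAA, SIZE);
        byte_copy(ref + PAD, src + PAD, len);
        memcpy(dst + PAD, src + PAD, len);
        if (memcmp(dst, ref, SIZE) != 0)   /* also catches stray writes */
            bad++;
        if (!move_matches(len, PAD, PAD + 8))   /* dst above src */
            bad++;
        if (!move_matches(len, PAD + 8, PAD))   /* dst below src */
            bad++;
    }
    return bad;
}
```

Calling count_copy_mismatches(256) from a small test program linked
against the patched glibc exercises the lengths handled by the
L(copy96) / L(copy128) paths and the overlapping memmove paths touched
below. A non-zero result would mean the copy results themselves
changed.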
sysdeps/aarch64/memcpy.S | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/sysdeps/aarch64/memcpy.S b/sysdeps/aarch64/memcpy.S
index 0adc5246..b4f3b3aa 100644
--- a/sysdeps/aarch64/memcpy.S
+++ b/sysdeps/aarch64/memcpy.S
@@ -152,12 +152,12 @@ L(copy128):
stp G_l, G_h, [dstend, -64]
stp H_l, H_h, [dstend, -48]
L(copy96):
+ stp C_l, C_h, [dstend, -32]
+ stp D_l, D_h, [dstend, -16]
stp A_l, A_h, [dstin]
stp B_l, B_h, [dstin, 16]
stp E_l, E_h, [dstin, 32]
stp F_l, F_h, [dstin, 48]
- stp C_l, C_h, [dstend, -32]
- stp D_l, D_h, [dstend, -16]
ret
.p2align 4
@@ -274,10 +274,10 @@ L(copy64_from_start):
stp C_l, C_h, [dstend, -48]
ldp C_l, C_h, [src]
stp D_l, D_h, [dstend, -64]
- stp G_l, G_h, [dstin, 48]
- stp A_l, A_h, [dstin, 32]
- stp B_l, B_h, [dstin, 16]
stp C_l, C_h, [dstin]
+ stp B_l, B_h, [dstin, 16]
+ stp A_l, A_h, [dstin, 32]
+ stp G_l, G_h, [dstin, 48]
ret
END (MEMMOVE)
--
2.32.0