RISC-V: Implement (Zkne or Zknd) intrinsics correctly #146798
Because some AES key scheduling instructions require *either* the Zkne or the Zknd extension, we must have a target feature to represent `(Zkne || Zknd)`. This commit adds a (perma-unstable) target feature to the RISC-V architecture for this purpose: `zkne_or_zknd`. Helped-by: sayantn <[email protected]>
…insics Using the inline assembly and `zkne_or_zknd` target feature could avoid current issues regarding intrinsics available when either Zkne or Zknd is available. Before this commit, intrinsics `aes64ks1i` and `aes64ks2` required both Zkne and Zknd extensions, not either Zkne or Zknd. Closes: rust-lang/stdarch#1765
cc @Amanieu, @folkertdev, @sayantn
Example (AES-256 shared encryption/decryption)
#![no_std]
#![feature(riscv_ext_intrinsics)]
#![allow(clippy::identity_op)]
use core::arch::riscv64::{aes64ks1i, aes64ks2};
#[target_feature(enable = "zkne")]
// #[target_feature(enable = "zknd")]
pub fn aes256_key_schedule(key: &[u64; 4], scheduled_key: &mut [u64; 30]) {
    let mut rk0 = key[0];
    let mut rk1 = key[1];
    let mut rk2 = key[2];
    let mut rk3 = key[3];
    scheduled_key[0] = rk0;
    scheduled_key[1] = rk1;
    scheduled_key[2] = rk2;
    scheduled_key[3] = rk3;
    macro_rules! double_round {
        ($i: expr) => {
            let tmp = aes64ks1i::<$i>(rk3);
            rk0 = aes64ks2(tmp, rk0);
            rk1 = aes64ks2(rk0, rk1);
            let tmp = aes64ks1i::<10>(rk1);
            rk2 = aes64ks2(tmp, rk2);
            rk3 = aes64ks2(rk2, rk3);
            scheduled_key[4 * ($i + 1) + 0] = rk0;
            scheduled_key[4 * ($i + 1) + 1] = rk1;
            scheduled_key[4 * ($i + 1) + 2] = rk2;
            scheduled_key[4 * ($i + 1) + 3] = rk3;
        };
    }
    double_round!(0);
    double_round!(1);
    double_round!(2);
    double_round!(3);
    double_round!(4);
    double_round!(5);
    // Process tail
    let tmp = aes64ks1i::<6>(rk3);
    rk0 = aes64ks2(tmp, rk0);
    rk1 = aes64ks2(rk0, rk1);
    scheduled_key[4 * 7 + 0] = rk0;
    scheduled_key[4 * 7 + 1] = rk1;
}
Note that we can use
By compiling this AES-256 key scheduling code with optimization enabled, we'll get for example:
Example (AES-256 decryption only)
Normally, we'll perform AES key scheduling and then conversion for decryption. Let's see what will happen when those two operations are folded together. Note that inverse MixColumns operation (
#![no_std]
#![feature(riscv_ext_intrinsics)]
#![allow(clippy::identity_op)]
use core::arch::riscv64::{aes64im, aes64ks1i, aes64ks2};
#[target_feature(enable = "zknd")]
pub fn aes256_key_schedule_on_decryption(key: &[u64; 4], scheduled_key: &mut [u64; 30]) {
    let mut rk0 = key[0];
    let mut rk1 = key[1];
    let mut rk2 = key[2];
    let mut rk3 = key[3];
    scheduled_key[0] = rk0;
    scheduled_key[1] = rk1;
    scheduled_key[2] = aes64im(rk2);
    scheduled_key[3] = aes64im(rk3);
    macro_rules! double_round {
        ($i: expr) => {
            let tmp = aes64ks1i::<$i>(rk3);
            rk0 = aes64ks2(tmp, rk0);
            rk1 = aes64ks2(rk0, rk1);
            let tmp = aes64ks1i::<10>(rk1);
            rk2 = aes64ks2(tmp, rk2);
            rk3 = aes64ks2(rk2, rk3);
            scheduled_key[4 * ($i + 1) + 0] = aes64im(rk0);
            scheduled_key[4 * ($i + 1) + 1] = aes64im(rk1);
            scheduled_key[4 * ($i + 1) + 2] = aes64im(rk2);
            scheduled_key[4 * ($i + 1) + 3] = aes64im(rk3);
        };
    }
    double_round!(0);
    double_round!(1);
    double_round!(2);
    double_round!(3);
    double_round!(4);
    double_round!(5);
    // Process tail
    let tmp = aes64ks1i::<6>(rk3);
    rk0 = aes64ks2(tmp, rk0);
    rk1 = aes64ks2(rk0, rk1);
    scheduled_key[4 * 7 + 0] = rk0;
    scheduled_key[4 * 7 + 1] = rk1;
}
Since the inline assembly implementation is
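The key scheduling examples above can be sanity-checked without RISC-V hardware by using a software model of `aes64ks1i`. The following is only a sketch: `aes64ks1i_model` and its helpers are hypothetical names that are not part of `core::arch::riscv64`, and the behavior is written from my reading of the instruction's definition in the RISC-V scalar cryptography specification. The S-box is computed rather than tabulated to keep the block short.

```rust
/// Multiply two elements of GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1.
fn gf_mul(mut a: u8, mut b: u8) -> u8 {
    let mut p = 0u8;
    for _ in 0..8 {
        if b & 1 != 0 {
            p ^= a;
        }
        let carry = a & 0x80;
        a <<= 1;
        if carry != 0 {
            a ^= 0x1b;
        }
        b >>= 1;
    }
    p
}

/// Multiplicative inverse in GF(2^8), computed as x^254 (maps 0 to 0).
fn gf_inv(x: u8) -> u8 {
    let mut result = 1u8;
    let mut base = x;
    let mut e = 254u8;
    while e != 0 {
        if e & 1 != 0 {
            result = gf_mul(result, base);
        }
        base = gf_mul(base, base);
        e >>= 1;
    }
    result
}

/// AES forward S-box: field inverse followed by the affine transform.
pub fn sbox(x: u8) -> u8 {
    let b = gf_inv(x);
    b ^ b.rotate_left(1) ^ b.rotate_left(2) ^ b.rotate_left(3) ^ b.rotate_left(4) ^ 0x63
}

/// Round constants selected by `RNUM`; entry 10 is zero.
const RCON: [u32; 11] = [
    0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36, 0x00,
];

/// Hypothetical software model of `aes64ks1i::<RNUM>(rs1)`.
/// For `RNUM == 10`, the rotation is skipped and the round constant is zero.
pub fn aes64ks1i_model<const RNUM: u8>(rs1: u64) -> u64 {
    assert!(RNUM <= 10, "rnum must be in 0..=10");
    let mut t = (rs1 >> 32) as u32;
    if RNUM != 10 {
        t = t.rotate_right(8);
    }
    // Apply the S-box to every byte of the word, then add the round constant.
    let t = u32::from_le_bytes(t.to_le_bytes().map(sbox)) ^ RCON[RNUM as usize];
    (u64::from(t) << 32) | u64::from(t)
}
```

With the last 8 key bytes of the FIPS-197 AES-256 example key (`2d 98 10 a3 09 14 df f4`) loaded as a little-endian `u64`, `aes64ks1i_model::<0>` yields `0x01bf9efb_01bf9efb`, which is the value that feeds the first `aes64ks2` call of `double_round!(0)`.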
In rust-lang/stdarch#1765, it was pointed out that two RISC-V (64-bit only) intrinsics for AES key scheduling have the wrong target feature.
The `aes64ks1i` and `aes64ks2` instructions require either the Zkne (scalar cryptography: AES encryption) or the Zknd (scalar cryptography: AES decryption) extension (or both), but the corresponding Rust intrinsics (in `core::arch::riscv64`) required both the Zkne and Zknd extensions.
An excerpt from the original intrinsics:
#[target_feature(enable = "zkne", enable = "zknd")]
To fix that, we need to:
1. Have a target feature that represents `(Zkne || Zknd)`.
2. Account for the `llvm.riscv.aes64ks1i` / `llvm.riscv.aes64ks2` LLVM intrinsics requiring either the Zkne or the Zknd extension.
This PR attempts to resolve them by:
1. Adding a perma-unstable target feature `zkne_or_zknd` (implied from both `zkne` and `zknd`), and
2. Using inline assembly (because `zkne_or_zknd` alone implies neither Zkne nor Zknd, we cannot use LLVM intrinsics).
The author confirmed that we can construct an AES key scheduling function with decent performance using the fixed `aes64ks1i` and `aes64ks2` intrinsics (with optimization enabled).
Big thanks to @sayantn for the fundamental idea.
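The difference between the old and new gating can be illustrated with a small model of feature implication. This is a hypothetical sketch, not how rustc stores target features: `expand`, `old_gate` and `new_gate` are made-up names, and the only facts taken from this PR are that `zkne_or_zknd` is implied from both `zkne` and `zknd`, and that the old intrinsics required both extensions.

```rust
use std::collections::BTreeSet;

/// Expand explicitly enabled features with the implication added by this PR:
/// enabling `zkne` or `zknd` also enables `zkne_or_zknd`.
/// (Illustrative model only; all other implications are ignored.)
pub fn expand(explicit: &[&'static str]) -> BTreeSet<&'static str> {
    let mut set: BTreeSet<&'static str> = explicit.iter().copied().collect();
    if set.contains("zkne") || set.contains("zknd") {
        set.insert("zkne_or_zknd");
    }
    set
}

/// Old gate, modelling `#[target_feature(enable = "zkne", enable = "zknd")]`:
/// both extensions must be present.
pub fn old_gate(features: &BTreeSet<&'static str>) -> bool {
    features.contains("zkne") && features.contains("zknd")
}

/// New gate, modelling a requirement on `zkne_or_zknd` only.
pub fn new_gate(features: &BTreeSet<&'static str>) -> bool {
    features.contains("zkne_or_zknd")
}
```

In this model, a caller that enables only `zkne` (as in the shared encryption/decryption example) fails `old_gate` but passes `new_gate`, while a caller that enables neither extension fails both.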
In this implementation, the author (I) used `.option push`, `.option arch` and `.option pop`. They can be used to temporarily change the architecture in a specific region of the code, and almost all architecture changes are temporary (except ELF flags permanently set by using compressed instruction extensions and/or the Ztso extension).
We can use `.option arch, +zkne` or `.option arch, +zknd`, and I arbitrarily chose the Zkne extension.
r? @Amanieu
@rustbot label +O-riscv +A-target-feature
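As an illustration of the `.option push` / `.option arch` / `.option pop` approach described above, here is a minimal sketch of what an inline assembly wrapper for `aes64ks2` might look like. It is not the code from this PR: `aes64ks2_asm` and `aes64ks2_model` are hypothetical names, the operand constraints are my own choice, and the real intrinsic is additionally gated on the `zkne_or_zknd` target feature, which is omitted here. The software model is included so the block also compiles and can be exercised on other architectures.

```rust
/// Portable software model of `aes64ks2`, written from the instruction's
/// definition in the RISC-V scalar cryptography specification.
pub fn aes64ks2_model(rs1: u64, rs2: u64) -> u64 {
    let w0 = ((rs1 >> 32) as u32) ^ (rs2 as u32);
    let w1 = w0 ^ ((rs2 >> 32) as u32);
    (u64::from(w1) << 32) | u64::from(w0)
}

/// Sketch of an inline assembly wrapper (RISC-V 64-bit only).
///
/// # Safety
/// The caller must ensure that the hart implements Zkne or Zknd;
/// otherwise the instruction is illegal.
#[cfg(target_arch = "riscv64")]
pub unsafe fn aes64ks2_asm(rs1: u64, rs2: u64) -> u64 {
    let rd: u64;
    unsafe {
        core::arch::asm!(
            // Save the current architecture settings of the assembler.
            ".option push",
            // Temporarily enable Zkne so that the mnemonic is accepted.
            ".option arch, +zkne",
            "aes64ks2 {rd}, {rs1}, {rs2}",
            // Restore the saved architecture settings.
            ".option pop",
            rd = lateout(reg) rd,
            rs1 = in(reg) rs1,
            rs2 = in(reg) rs2,
            options(pure, nomem, nostack),
        );
    }
    rd
}
```

Because `.option arch, +zkne` only affects how the assembler encodes this region, the surrounding function does not need Zkne enabled at the compiler level; either extension at run time is enough for the instruction to execute.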