Floating-point fused multiply-add long to accumulator (by element)
This instruction multiplies the half-precision vector elements in the first source SIMD&FP register by the specified half-precision value in the second source SIMD&FP register, and accumulates the intermediate product, without rounding it, into the corresponding single-precision vector element of the destination SIMD&FP register.
This instruction can generate a floating-point exception. Depending on the settings in FPCR, the exception results in either a flag being set in FPSR or a synchronous exception being generated. For more information, see Floating-point exceptions and exception traps.
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state and Exception level, an attempt to execute the instruction might be trapped.
In Armv8.2 and Armv8.3, this is an OPTIONAL instruction. From Armv8.4, it is mandatory for all implementations to support it.
ID_AA64ISAR0_EL1.FHM indicates whether this instruction is supported.
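As a rough illustration of how software might test this field, the sketch below extracts FHM from a raw ID_AA64ISAR0_EL1 value. The field position (bits [51:48]) and the rule that any nonzero value means the instruction is supported are assumptions taken from the register description, not from this page, and the function names are hypothetical. Reading the register itself needs privileged code or operating system support and is not shown.

```python
# Assumption: ID_AA64ISAR0_EL1.FHM occupies bits [51:48].
FHM_SHIFT = 48
FHM_MASK = 0xF


def fhm_field(id_aa64isar0_el1: int) -> int:
    """Return the 4-bit FHM field from a raw ID_AA64ISAR0_EL1 value."""
    return (id_aa64isar0_el1 >> FHM_SHIFT) & FHM_MASK


def supports_fmlal(id_aa64isar0_el1: int) -> bool:
    """Assumption: a nonzero FHM field means FEAT_FHM is implemented."""
    return fhm_field(id_aa64isar0_el1) != 0
```

For example, `supports_fmlal(1 << 48)` is true under these assumptions, while a value with bits [51:48] clear reports the instruction as unsupported.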
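It has encodings from two classes: FMLAL, which takes its half-precision inputs from the lower half of the first source vector (part = 0), and FMLAL2, which takes them from the upper half (part = 1).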
FMLAL

| Bits | Field | Value |
|---|---|---|
| 31 | | 0 |
| 30 | Q | |
| 29 | U | 0 |
| 28:24 | | 01111 |
| 23 | | 1 |
| 22 | sz | 0 |
| 21 | L | |
| 20 | M | |
| 19:16 | Rm | |
| 15 | | 0 |
| 14 | S | 0 |
| 13:12 | | 00 |
| 11 | H | |
| 10 | | 0 |
| 9:5 | Rn | |
| 4:0 | Rd | |

    if !IsFeatureImplemented(FEAT_FHM) then EndOfDecode(Decode_UNDEF); end;
    if sz == '1' then EndOfDecode(Decode_UNDEF); end;
    let d : integer = UInt(Rd);
    let n : integer = UInt(Rn);
    let m : integer = UInt('0'::Rm);   // Vm can only be in bottom 16 registers.
    let index : integer = UInt(H::L::M);
    let esize : integer{} = 32;
    let datasize : integer{} = 64 << UInt(Q);
    let elements : integer = datasize DIV esize;
    let part : integer = 0;

FMLAL2

| Bits | Field | Value |
|---|---|---|
| 31 | | 0 |
| 30 | Q | |
| 29 | U | 1 |
| 28:24 | | 01111 |
| 23 | | 1 |
| 22 | sz | 0 |
| 21 | L | |
| 20 | M | |
| 19:16 | Rm | |
| 15 | | 1 |
| 14 | S | 0 |
| 13:12 | | 00 |
| 11 | H | |
| 10 | | 0 |
| 9:5 | Rn | |
| 4:0 | Rd | |

    if !IsFeatureImplemented(FEAT_FHM) then EndOfDecode(Decode_UNDEF); end;
    if sz == '1' then EndOfDecode(Decode_UNDEF); end;
    let d : integer = UInt(Rd);
    let n : integer = UInt(Rn);
    let m : integer = UInt('0'::Rm);   // Vm can only be in bottom 16 registers.
    let index : integer = UInt(H::L::M);
    let esize : integer{} = 32;
    let datasize : integer{} = 64 << UInt(Q);
    let elements : integer = datasize DIV esize;
    let part : integer = 1;

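The decode steps for both classes can be sketched in Python as follows. This is an illustrative model, not the architectural definition: it assumes FEAT_FHM is implemented, it assumes the U, sz and S labels sit at bits 29, 22 and 14, and the names `FmlalElem` and `decode_fmlal_by_element` are hypothetical. Words that do not match either class, including those with sz set to 1, return `None` rather than modelling an Undefined Instruction exception.

```python
from typing import NamedTuple, Optional


class FmlalElem(NamedTuple):
    mnemonic: str   # "FMLAL" or "FMLAL2"
    d: int          # destination register number
    n: int          # first source register number
    m: int          # second source register number (0..15)
    index: int      # element index formed from H:L:M (0..7)
    datasize: int   # 64 or 128
    elements: int   # number of 32-bit result elements
    part: int       # 0 for FMLAL, 1 for FMLAL2


# Fixed bits common to both classes: bit 31, bits 28:22, bits 14:12, bit 10.
FIXED_MASK = 0x9FC07400
FIXED_VALUE = 0x0F800000


def decode_fmlal_by_element(word: int) -> Optional[FmlalElem]:
    """Decode a 32-bit word, or return None if it is not FMLAL/FMLAL2 (by element)."""
    if (word & FIXED_MASK) != FIXED_VALUE:
        return None
    u = (word >> 29) & 1
    bit15 = (word >> 15) & 1
    if u != bit15:
        # Only U=0/bit15=0 (FMLAL) and U=1/bit15=1 (FMLAL2) belong here.
        return None
    q = (word >> 30) & 1
    h = (word >> 11) & 1
    l = (word >> 21) & 1
    m_bit = (word >> 20) & 1
    datasize = 64 << q
    return FmlalElem(
        mnemonic="FMLAL2" if u else "FMLAL",
        d=word & 0x1F,
        n=(word >> 5) & 0x1F,
        m=(word >> 16) & 0xF,          # '0'::Rm, so only V0..V15
        index=(h << 2) | (l << 1) | m_bit,
        datasize=datasize,
        elements=datasize // 32,
        part=u,
    )
```

A single function handles both classes because they differ only in bits 29 and 15, which together select `part`.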
| Operand | Description |
|---|---|
| <Vd> | Is the name of the SIMD&FP destination register, encoded in the "Rd" field. |
| <Ta> | Is an arrangement specifier, encoded in the "Q" field: 2S when Q = 0, 4S when Q = 1. |
| <Vn> | Is the name of the first SIMD&FP source register, encoded in the "Rn" field. |
| <Tb> | Is an arrangement specifier, encoded in the "Q" field: 2H when Q = 0, 4H when Q = 1. |
| <Vm> | Is the name of the second SIMD&FP source register, in the range V0 to V15, encoded in the "Rm" field. |
| <index> | Is the element index, in the range 0 to 7, encoded in the "H:L:M" fields. |

    AArch64_CheckFPAdvSIMDEnabled();
    let operand1 : bits(datasize DIV 2) = Vpart{}(n, part);
    let operand2 : bits(128) = V{}(m);
    let operand3 : bits(datasize) = V{}(d);
    var result : bits(datasize);
    var element1 : bits(esize DIV 2);
    let element2 : bits(esize DIV 2) = operand2[index*:(esize DIV 2)];
    for e = 0 to elements-1 do
        element1 = operand1[e*:(esize DIV 2)];
        result[e*:esize] = FPMulAddH(operand3[e*:esize], element1, element2, FPCR());
    end;
    V{datasize}(d) = result;

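The loop above can be modelled numerically with the Python sketch below. It is a simplified illustration under stated assumptions: finite inputs only, round-to-nearest-even, no flush-to-zero or default-NaN handling, no exception flags, and zero results are always returned as +0.0. It is not a substitute for FPMulAddH or for the FPCR controls. The helper names are hypothetical. Registers are represented as Python lists, with `vn` holding the `datasize`-bit portion of the first source as half-precision values. Exact rational arithmetic is used so that the product is added without intermediate rounding and the sum is rounded once to single precision.

```python
import math
import struct
from fractions import Fraction


def to_half(x: float) -> float:
    """Round a finite Python float to the nearest half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_single(x: float) -> float:
    """Round a finite Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_to_single(q: Fraction) -> float:
    """Round an exact rational to single precision, ties to even."""
    if q == 0:
        return 0.0  # simplification: sign of zero is not modelled
    sign = -1.0 if q < 0 else 1.0
    mag = abs(q)
    # Find e such that 2**e <= mag < 2**(e + 1).
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    # Weight of the last significand bit: 24-bit significand for normal
    # numbers, fixed at 2**-149 in the subnormal range.
    ulp_exp = max(e - 23, -149)
    n = round(mag / Fraction(2) ** ulp_exp)  # Fraction rounds half to even
    value = math.ldexp(n, ulp_exp)
    if value >= 2.0 ** 128:
        return sign * math.inf
    return sign * value


def fmlal_by_element(vd, vn, vm, index, part=0):
    """Model FMLAL (part=0) or FMLAL2 (part=1) on Python lists.

    vd: 2 or 4 single-precision accumulators.
    vn: twice as many half-precision values as vd.
    vm: 8 half-precision values; index selects one of them.
    """
    elements = len(vd)
    half = vn[part * elements:(part + 1) * elements]  # lower or upper half
    element2 = Fraction(to_half(vm[index]))
    result = []
    for e in range(elements):
        element1 = Fraction(to_half(half[e]))
        exact = Fraction(to_single(vd[e])) + element1 * element2
        result.append(round_to_single(exact))
    return result
```

With `a = 1.0 + 2.0**-10`, accumulating `a * a` into `-1.0` leaves a low-order term of `2.0**-20` in the result. That term would be lost if the product were first rounded to half precision, which is the practical effect of accumulating the unrounded product.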
2026-03_rel 2026-03-26 20:48:11
Copyright © 2010-2026 Arm Limited or its affiliates. All rights reserved. This document is Non-Confidential.