BFMLA (vectors) -- A64

This instruction multiplies the active BFloat16 elements of the first source vector by the corresponding BFloat16 elements of the second source vector. Each product is added to the corresponding element of the third source (addend) vector without intermediate rounding, and the result is written destructively back to that same register, which serves as both addend and destination. Inactive elements in the destination vector register remain unmodified.
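The phrase "without intermediate rounding" can be illustrated with a small Python sketch that compares a fused multiply-add (one rounding at the end) with an unfused one (the product is rounded first). This is not the architecture's BFMulAdd function: the helper names are hypothetical, round-to-nearest-even is assumed, and NaNs, infinities as inputs, and all FPCR controls are ignored.

```python
import struct
from fractions import Fraction


def bf16_to_fraction(bits: int) -> Fraction:
    """Exact value of a finite BFloat16 bit pattern (the top half of a float32)."""
    as_f32 = struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]
    return Fraction(as_f32)


def round_to_bf16(x: Fraction) -> int:
    """Round an exact rational to BFloat16 bits (assumption: nearest-even)."""
    if x == 0:
        return 0x0000
    sign = 0x8000 if x < 0 else 0x0000
    mag = abs(x)
    # Find e such that 2**e <= mag < 2**(e + 1).
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if mag < Fraction(2) ** e:
        e -= 1
    # BFloat16 keeps 8 significant bits; below 2**-126 the spacing stays fixed.
    quantum = Fraction(2) ** (max(e, -126) - 7)
    rounded = round(mag / quantum) * quantum  # round() on a Fraction is half-to-even
    if rounded >= Fraction(2) ** 128:
        return sign | 0x7F80  # overflow to infinity
    f32_bits = struct.unpack("<I", struct.pack("<f", float(rounded)))[0]
    return sign | (f32_bits >> 16)


def fused_mul_add(addend: int, a: int, b: int) -> int:
    """addend + a * b, computed exactly and rounded once."""
    exact = bf16_to_fraction(addend) + bf16_to_fraction(a) * bf16_to_fraction(b)
    return round_to_bf16(exact)


def unfused_mul_add(addend: int, a: int, b: int) -> int:
    """Same computation, but the product is rounded before the addition."""
    product = round_to_bf16(bf16_to_fraction(a) * bf16_to_fraction(b))
    return round_to_bf16(bf16_to_fraction(addend) + bf16_to_fraction(product))


# a = b = 1 + 2**-7 (0x3F81), addend = -(1 + 2**-6) (0xBF82)
print(hex(fused_mul_add(0xBF82, 0x3F81, 0x3F81)))    # 0x3880, i.e. 2**-14
print(hex(unfused_mul_add(0xBF82, 0x3F81, 0x3F81)))  # 0x0
```

The exact product is 1 + 2^-6 + 2^-14. Rounding it to BFloat16 first discards the 2^-14 term, so the unfused path cancels to zero, while the fused path keeps it.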

SVE2
(FEAT_SVE_B16B16)

Decode for this encoding

if !IsFeatureImplemented(FEAT_SVE_B16B16) then
    EndOfDecode(Decode_UNDEF);
end;
let g : integer = UInt(Pg);
let n : integer = UInt(Zn);
let m : integer = UInt(Zm);
let da : integer = UInt(Zda);
let op1_neg : boolean = FALSE;
let op3_neg : boolean = FALSE;

Assembler Symbols

<Zda>	Is the name of the third source and destination scalable vector register, encoded in the "Zda" field.

<Pg>	Is the name of the governing scalable predicate register P0-P7, encoded in the "Pg" field.

<Zn>	Is the name of the first source scalable vector register, encoded in the "Zn" field.

<Zm>	Is the name of the second source scalable vector register, encoded in the "Zm" field.

Operation

if IsFeatureImplemented(FEAT_SME2) then
    CheckSVEEnabled();
else
    CheckNonStreamingSVEEnabled();
end;
let VL : integer{} = CurrentVL();
let PL : integer{} = VL DIV 8;
let elements : integer = VL DIV 16;
let mask : bits(PL) = P{}(g);
let op1 : bits(VL) = if AnyActiveElement{PL}(mask, 16) then Z{VL}(n) else Zeros{VL};
let op2 : bits(VL) = if AnyActiveElement{PL}(mask, 16) then Z{VL}(m) else Zeros{VL};
let op3 : bits(VL) = Z{}(da);
var result : bits(VL);
for e = 0 to elements-1 do
    if ActivePredicateElement{PL}(mask, e, 16) then
        let elem1 : bits(16) = if op1_neg then BFNeg(op1[e*:16]) else op1[e*:16];
        let elem2 : bits(16) = op2[e*:16];
        let elem3 : bits(16) = if op3_neg then BFNeg(op3[e*:16]) else op3[e*:16];
        result[e*:16] = BFMulAdd(elem3, elem1, elem2, FPCR());
    else
        result[e*:16] = op3[e*:16];
    end;
end;
Z{VL}(da) = result;
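The control flow of the pseudocode above (predicate lookup and merging of inactive elements) can be sketched in Python as follows. The function and parameter names are hypothetical. The sketch assumes the predicate holds one bit per vector byte, so that 16-bit element e is governed by predicate bit 2*e, and it takes the arithmetic as a caller-supplied `mul_add` callable rather than modelling BFMulAdd or FPCR.

```python
from typing import Callable, List


def predicated_mla(
    vl_bits: int,
    mask: int,
    zda: List[int],
    zn: List[int],
    zm: List[int],
    mul_add: Callable[[int, int, int], int],
) -> List[int]:
    """Model the per-element loop: active lanes compute, inactive lanes keep Zda.

    vl_bits -- vector length in bits (VL)
    mask    -- predicate as an integer of VL/8 bits, one bit per vector byte
    zda/zn/zm -- lists of 16-bit element bit patterns, lowest element first
    mul_add -- stand-in for BFMulAdd(addend, op1, op2) (assumption)
    """
    elements = vl_bits // 16
    result = []
    for e in range(elements):
        # For 16-bit elements only the lowest predicate bit of each pair is tested.
        active = (mask >> (e * 2)) & 1
        if active:
            result.append(mul_add(zda[e], zn[e], zm[e]))
        else:
            result.append(zda[e])  # inactive: destination element is unchanged
    return result


if __name__ == "__main__":
    # Integer stand-in for the arithmetic, used only to show the lane selection.
    toy = lambda addend, a, b: (addend + a * b) & 0xFFFF
    out = predicated_mla(64, 0b0110, [1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], toy)
    print(out)  # [1, 7, 1, 1]
```

In the usage line, predicate bit 1 is set but bit 0 is clear, so element 0 stays inactive; only element 1 (predicate bit 2) is updated.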

Operational information

This instruction might be immediately preceded in program order by a MOVPRFX instruction. The MOVPRFX must conform to all of the following requirements, otherwise the behavior of the MOVPRFX and this instruction is CONSTRAINED UNPREDICTABLE:

The MOVPRFX can be predicated or unpredicated.
A predicated MOVPRFX must use the same governing predicate register as this instruction.
A predicated MOVPRFX must use the larger of the destination element size and first source element size in the preferred disassembly of this instruction.
The MOVPRFX must specify the same destination register as this instruction.
The destination register must not refer to architectural register state referenced by any other source operand register of this instruction.
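The requirements above can be written as a small checker. This is a hedged Python sketch, not a tool from the architecture: the `Movprfx` and `Bfmla` records and the function name are hypothetical, and it assumes the destination and first source elements of this instruction are both 16 bits wide, matching the element width used in the Operation pseudocode.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Movprfx:
    zd: int                      # destination Z register number
    pg: Optional[int] = None     # governing predicate, None if unpredicated
    esize: Optional[int] = None  # element size in bits, None if unpredicated


@dataclass
class Bfmla:
    zda: int
    pg: int
    zn: int
    zm: int


def movprfx_pairing_ok(prefix: Movprfx, insn: Bfmla) -> bool:
    """Return True if the MOVPRFX meets every listed requirement."""
    # Same destination register as the prefixed instruction.
    if prefix.zd != insn.zda:
        return False
    # The destination must not also be used as another source operand.
    if insn.zda in (insn.zn, insn.zm):
        return False
    # Extra rules apply only to a predicated MOVPRFX.
    if prefix.pg is not None:
        if prefix.pg != insn.pg:
            return False
        if prefix.esize != 16:  # assumed element size for this instruction
            return False
    return True


if __name__ == "__main__":
    insn = Bfmla(zda=0, pg=1, zn=2, zm=3)
    print(movprfx_pairing_ok(Movprfx(zd=0), insn))
    print(movprfx_pairing_ok(Movprfx(zd=0, pg=2, esize=16), insn))
```

A `False` result corresponds to the CONSTRAINED UNPREDICTABLE case described above; the sketch only classifies the pair and does not model what an implementation then does.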

2026-03_rel 2026-03-26 20:48:11

Bits	Field	Value
31:24	-	0 1 1 0 0 1 0 1
23:22	size	0 0
21	-	1
20:16	Zm	-
15:14	-	0 0
13	op	0
12:10	Pg	-
9:5	Zn	-
4:0	Zda	-
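The field layout above can be exercised with a short Python sketch that packs and unpacks the 32-bit instruction word. The function names and constants are hypothetical. The sketch only checks the fixed bits shown in the encoding and does not model the FEAT_SVE_B16B16 check from the decode pseudocode.

```python
from typing import Dict, Optional

# Fixed bits: 31:24 = 01100101, size (23:22) = 00, bit 21 = 1, bits 15:13 = 000.
FIXED_MASK = 0xFFE0E000
FIXED_BITS = 0x65200000


def encode_bfmla(zda: int, pg: int, zn: int, zm: int) -> int:
    """Pack register numbers into the instruction word."""
    if not 0 <= pg <= 7:
        raise ValueError("Pg must be P0-P7 (3-bit field)")
    for reg in (zda, zn, zm):
        if not 0 <= reg <= 31:
            raise ValueError("Z register numbers must be 0-31 (5-bit fields)")
    return FIXED_BITS | (zm << 16) | (pg << 10) | (zn << 5) | zda


def decode_bfmla(word: int) -> Optional[Dict[str, int]]:
    """Return the operand fields, or None if the fixed bits do not match."""
    if (word & FIXED_MASK) != FIXED_BITS:
        return None
    return {
        "Zm": (word >> 16) & 0x1F,
        "Pg": (word >> 10) & 0x7,
        "Zn": (word >> 5) & 0x1F,
        "Zda": word & 0x1F,
    }


if __name__ == "__main__":
    word = encode_bfmla(zda=1, pg=2, zn=3, zm=4)
    print(hex(word))           # 0x65240861
    print(decode_bfmla(word))  # {'Zm': 4, 'Pg': 2, 'Zn': 3, 'Zda': 1}
```

The 3-bit Pg field is why the governing predicate is restricted to P0-P7, while the 5-bit Z fields can name any of the 32 vector registers.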
