FDOT (2-way, indexed, FP16 to FP32) -- A64

This instruction calculates the fused sum-of-products of a pair of half-precision values held in each 32-bit element of the first source vector and a pair of half-precision values in an indexed 32-bit element of the second source vector, without intermediate rounding, and then destructively adds the single-precision sum-of-products to the corresponding single-precision element of the destination vector.

The half-precision pairs within the second source vector are specified using an immediate index that selects the same pair position within each 128-bit vector segment. The index range is from 0 to 3.
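
The segment-relative selection described above can be illustrated with a short Python sketch. It is not part of the architecture definition; the function name `selected_pair` is hypothetical, and the arithmetic simply mirrors the `segmentbase + index` calculation used in the Operation pseudocode.

```python
def selected_pair(e: int, index: int) -> int:
    """Return the number of the 32-bit element of Zm whose half-precision
    pair is combined with 32-bit element e of Zn (hypothetical helper)."""
    if not 0 <= index <= 3:
        raise ValueError("index must be in the range 0 to 3")
    elts_per_segment = 128 // 32               # four 32-bit elements per segment
    segment_base = e - (e % elts_per_segment)  # first element of e's segment
    return segment_base + index


# For a 256-bit vector (eight 32-bit elements) and index 2:
print([selected_pair(e, 2) for e in range(8)])  # [2, 2, 2, 2, 6, 6, 6, 6]
```

Every element in a given 128-bit segment therefore reads the same pair from that segment of the second source vector.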

SVE2
(FEAT_SME2 || FEAT_SVE2p1)

Decode for this encoding

if !IsFeatureImplemented(FEAT_SME2) && !IsFeatureImplemented(FEAT_SVE2p1) then
    EndOfDecode(Decode_UNDEF);
end;
let n : integer = UInt(Zn);
let m : integer = UInt(Zm);
let da : integer = UInt(Zda);
let index : integer = UInt(i2);

Assembler Symbols

<Zda>	Is the name of the third source and destination scalable vector register, encoded in the "Zda" field.

<Zn>	Is the name of the first source scalable vector register, encoded in the "Zn" field.

<Zm>	Is the name of the second source scalable vector register Z0-Z7, encoded in the "Zm" field.

<imm>	Is the immediate index of a pair of 16-bit elements within each 128-bit vector segment, in the range 0 to 3, encoded in the "i2" field.

Operation

CheckSVEEnabled();
let VL : integer{} = CurrentVL();
let elements : integer = VL DIV 32;
let eltspersegment : integer = 128 DIV 32;
let operand1 : bits(VL) = Z{}(n);
let operand2 : bits(VL) = Z{}(m);
let operand3 : bits(VL) = Z{}(da);
var result : bits(VL);

for e = 0 to elements-1 do
    let segmentbase : integer = e - (e MOD eltspersegment);
    let s : integer = segmentbase + index;
    let elt1_a : bits(16) = operand1[(2 * e + 0)*:16];
    let elt1_b : bits(16) = operand1[(2 * e + 1)*:16];
    let elt2_a : bits(16) = operand2[(2 * s + 0)*:16];
    let elt2_b : bits(16) = operand2[(2 * s + 1)*:16];
    var sum : bits(32) = operand3[e*:32];
    sum = FPDotAdd(sum, elt1_a, elt1_b, elt2_a, elt2_b, FPCR());
    result[e*:32] = sum;
end;

Z{VL}(da) = result;
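
As an informal aid to reading the pseudocode, the following Python sketch models the loop for finite inputs. It is an approximation under stated assumptions, not the architected behavior. The names `half_to_fraction`, `to_f32`, and `fdot_indexed` are hypothetical. The two-step rounding (sum-of-products rounded to single precision, then added to the accumulator) follows the wording of the instruction description rather than the definition of `FPDotAdd`, which is not shown here. `FPCR` controls, exceptions, infinities, and NaNs are ignored, and rounding passes through a Python double, so results might differ from hardware in rare double-rounding cases.

```python
import struct
from fractions import Fraction


def half_to_fraction(bits16: int) -> Fraction:
    """Decode a finite IEEE half-precision bit pattern to an exact rational."""
    value = struct.unpack("<e", struct.pack("<H", bits16))[0]
    return Fraction(value)  # raises for infinities and NaNs (not modelled)


def to_f32(x) -> float:
    """Round a value to single precision (via double; see the caveat above)."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]


def fdot_indexed(zda, zn, zm, index):
    """Model of the element loop (assumed representation).

    zda:    list of single-precision accumulators, one per 32-bit element
    zn, zm: lists of 16-bit half-precision bit patterns, two per 32-bit element
    index:  immediate index, 0 to 3
    """
    result = []
    for e in range(len(zda)):
        s = e - (e % 4) + index  # segmentbase + index
        a0, a1 = half_to_fraction(zn[2 * e]), half_to_fraction(zn[2 * e + 1])
        b0, b1 = half_to_fraction(zm[2 * s]), half_to_fraction(zm[2 * s + 1])
        dot = to_f32(a0 * b0 + a1 * b1)  # exact sum-of-products, rounded once
        result.append(to_f32(Fraction(zda[e]) + Fraction(dot)))
    return result


# One 128-bit segment. Half-precision encodings used below:
# 0x3800 = 0.5, 0x3C00 = 1.0, 0x4000 = 2.0, 0x4200 = 3.0
zn = [0x3C00, 0x4000, 0x4200, 0x3800, 0x3C00, 0x3C00, 0x4000, 0x4000]
zm = [0x0000, 0x0000, 0x4000, 0x4200, 0x0000, 0x0000, 0x0000, 0x0000]
zda = [10.0, 0.0, 1.0, -1.0]
print(fdot_indexed(zda, zn, zm, index=1))  # [18.0, 7.5, 6.0, 9.0]
```

With index 1, every element multiplies its own pair from `zn` by the pair (2.0, 3.0) held in element 1 of `zm`, then adds the result to its accumulator.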

Operational information

This instruction might be immediately preceded in program order by a MOVPRFX instruction. The MOVPRFX must conform to all of the following requirements, otherwise the behavior of the MOVPRFX and this instruction is CONSTRAINED UNPREDICTABLE:

The MOVPRFX must be unpredicated.
The MOVPRFX must specify the same destination register as this instruction.
The destination register must not refer to architectural register state referenced by any other source operand register of this instruction.
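
The three requirements above can be expressed as a simple predicate. This is a minimal sketch for illustration only: the function name `movprfx_pairing_ok` and its parameters are hypothetical, registers are represented by their numbers, and the check covers only the conditions listed here.

```python
def movprfx_pairing_ok(prefix_predicated: bool, prefix_zd: int,
                       zda: int, zn: int, zm: int) -> bool:
    """Return True if a preceding MOVPRFX meets the listed requirements."""
    return (
        not prefix_predicated      # the MOVPRFX must be unpredicated
        and prefix_zd == zda       # same destination register
        and zda not in (zn, zm)    # destination is not also a source
    )


print(movprfx_pairing_ok(False, 0, 0, 1, 2))  # True
print(movprfx_pairing_ok(False, 0, 0, 0, 2))  # False: Zda is also Zn
```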

2026-03_rel 2026-03-26 20:48:11

31	30	29	28	27	26	25	24	23	22	21	20	19	18	17	16	15	14	13	12	11	10	9	8	7	6	5	4	3	2	1	0
0	1	1	0	0	1	0	0	0	0	1	i2		Zm			0	1	0	0	0	0	Zn					Zda
									op											opc2
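
Reading the fixed bits from the encoding diagram above (bits 31 to 21 are `01100100001` and bits 15 to 10 are `010000`), an instruction word can be assembled as in the sketch below. The function name `encode_fdot_indexed` is hypothetical, and the constant is derived from the diagram as transcribed here, so it should be checked against an assembler before being relied on.

```python
def encode_fdot_indexed(zda: int, zn: int, zm: int, imm: int) -> int:
    """Assemble the 32-bit word for FDOT (2-way, indexed, FP16 to FP32)."""
    if not 0 <= zda <= 31 or not 0 <= zn <= 31:
        raise ValueError("Zda and Zn must be in the range 0 to 31")
    if not 0 <= zm <= 7:
        raise ValueError("Zm must be Z0-Z7")
    if not 0 <= imm <= 3:
        raise ValueError("index must be in the range 0 to 3")
    word = 0x64204000   # fixed bits: 31-21 = 01100100001, 15-10 = 010000
    word |= imm << 19   # i2,  bits 20-19
    word |= zm << 16    # Zm,  bits 18-16
    word |= zn << 5     # Zn,  bits 9-5
    word |= zda         # Zda, bits 4-0
    return word


# Zda = Z1, Zn = Z2, Zm = Z3, index 1
print(hex(encode_fdot_indexed(1, 2, 3, 1)))  # 0x642b4041
```

The 3-bit width of the "Zm" field is what restricts the second source register to Z0-Z7.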