STNT1D (scalar plus scalar, consecutive registers) -- A64

Contiguous store non-temporal of doublewords from multiple consecutive vectors (scalar index)

This instruction performs a contiguous non-temporal store of doublewords from elements of two or four consecutive vector registers to the memory address generated by a 64-bit scalar base and a 64-bit scalar index that is multiplied by 8 and added to the base address. After each element access the index value is incremented, but the index register is not updated.
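The address generation described above can be sketched in Python. This is an illustrative model only: the function name `element_addresses` and its parameters are hypothetical, and the wrap at 64 bits is an assumption standing in for the architectural `AddressAdd` and `AddressIncrement` helpers.

```python
DOUBLEWORD_BYTES = 8  # esize is 64 bits, so each element occupies 8 bytes


def element_addresses(base, index, vl_bits, nreg):
    """Return the byte address of every doubleword element, in store order.

    base    : value of the base register (Xn or SP)
    index   : value of the index register (Xm), treated as unsigned
    vl_bits : current vector length in bits
    nreg    : number of consecutive vector registers (2 or 4)
    """
    elements = vl_bits // 64  # doublewords per vector register
    # The index is scaled by the element size before being added to the base.
    start = (base + index * DOUBLEWORD_BYTES) % (1 << 64)
    # The address advances by one doubleword per element; Xm itself is unchanged.
    return [
        (start + i * DOUBLEWORD_BYTES) % (1 << 64)
        for i in range(nreg * elements)
    ]
```

For a 128-bit vector length and two registers, the model yields four consecutive doubleword addresses starting at base plus 8 times the index.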

A non-temporal store is a hint to the system that this data is unlikely to be referenced again soon.

Two registers
(FEAT_SME2 || FEAT_SVE2p1)

Decode for this encoding

if !IsFeatureImplemented(FEAT_SME2) && !IsFeatureImplemented(FEAT_SVE2p1) then
    EndOfDecode(Decode_UNDEF);
end;
let n : integer = UInt(Rn);
let m : integer = UInt(Rm);
let g : integer = UInt('1'::PNg);
let nreg : integer{} = 2;
let t : integer = UInt(Zt::'0');
let esize : integer{} = 64;

Four registers
(FEAT_SME2 || FEAT_SVE2p1)

Decode for this encoding

if !IsFeatureImplemented(FEAT_SME2) && !IsFeatureImplemented(FEAT_SVE2p1) then
    EndOfDecode(Decode_UNDEF);
end;
let n : integer = UInt(Rn);
let m : integer = UInt(Rm);
let g : integer = UInt('1'::PNg);
let nreg : integer{} = 4;
let t : integer = UInt(Zt::'00');
let esize : integer{} = 64;
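As a rough illustration of the two decode blocks, the following Python sketch extracts the same values from a 32-bit instruction word. The field positions are taken from the encoding diagrams on this page. The function name `decode_stnt1d_fields` is hypothetical, and the sketch does not check the fixed opcode bits or the feature requirements.

```python
def decode_stnt1d_fields(word):
    """Extract n, m, g, nreg, t and esize from a 32-bit instruction word.

    Assumes the word is already known to be an STNT1D (scalar plus scalar,
    consecutive registers) encoding; fixed bits are not validated here.
    """
    rm = (word >> 16) & 0x1F   # Rm, bits 20:16
    png = (word >> 10) & 0x7   # PNg, bits 12:10
    rn = (word >> 5) & 0x1F    # Rn, bits 9:5
    four_regs = (word >> 15) & 1  # bit 15 is 0 for two registers, 1 for four

    if four_regs:
        zt = (word >> 2) & 0x7  # Zt, bits 4:2
        nreg = 4
        t = zt << 2             # UInt(Zt::'00')
    else:
        zt = (word >> 1) & 0xF  # Zt, bits 4:1
        nreg = 2
        t = zt << 1             # UInt(Zt::'0')

    return {
        "n": rn,
        "m": rm,
        "g": 0b1000 | png,      # UInt('1'::PNg), so PN8 to PN15
        "nreg": nreg,
        "t": t,
        "esize": 64,
    }
```

Because the low bits of `t` are forced to zero, the first register number is always a multiple of `nreg`.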

Assembler Symbols

<Zt1>

For the "Two registers" variant: is the name of the first scalable vector register to be transferred, encoded as "Zt" times 2.
For the "Four registers" variant: is the name of the first scalable vector register to be transferred, encoded as "Zt" times 4.

<Zt2>

Is the name of the second scalable vector register to be transferred, encoded as "Zt" times 2 plus 1.

<PNg>

Is the name of the governing scalable predicate register PN8-PN15, with predicate-as-counter encoding, encoded in the "PNg" field.

<Xn|SP>

Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.

<Xm>

Is the 64-bit name of the general-purpose offset register, encoded in the "Rm" field.

<Zt4>

Is the name of the fourth scalable vector register to be transferred, encoded as "Zt" times 4 plus 3.
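The register-numbering rules above can be illustrated with a short Python sketch. The helper names `register_list` and `predicate_name` are hypothetical; they simply apply the "Zt" times 2 or times 4 rule and the PN8-PN15 mapping stated in the symbol descriptions.

```python
def register_list(zt_field, nreg):
    """Return the vector register names selected by the "Zt" field.

    zt_field : value of the "Zt" field (4 bits for two registers, 3 for four)
    nreg     : 2 or 4
    """
    if nreg not in (2, 4):
        raise ValueError("nreg must be 2 or 4")
    first = zt_field * nreg  # <Zt1> is "Zt" times 2 or "Zt" times 4
    # The remaining registers follow consecutively, so <Zt2> is first + 1
    # and <Zt4> is first + 3.
    return [f"Z{first + r}" for r in range(nreg)]


def predicate_name(png_field):
    """Return the predicate-as-counter register name for a 3-bit "PNg" field."""
    return f"PN{8 + png_field}"
```

For example, a "Zt" field of 3 in the two-register variant selects Z6 and Z7, and a "Zt" field of 2 in the four-register variant selects Z8 to Z11.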

Operation

if IsFeatureImplemented(FEAT_SVE2p1) then
    CheckSVEEnabled();
else
    CheckStreamingSVEEnabled();
end;
let VL : integer{} = CurrentVL();
let PL : integer{} = VL DIV 8;
let elements : integer = VL DIV esize;
let mbytes : integer{} = esize DIV 8;
var offset : bits(64);
var base : bits(64);
var addr : bits(64);
var src : bits(VL);
let pred : bits(PL) = P{}(g);
let mask : bits(PL * nreg) = CounterToPredicate{}(pred[15:0]);
let contiguous : boolean = TRUE;
let nontemporal : boolean = TRUE;
let transfer : integer = t;
let tagchecked : boolean = TRUE;
let accdesc : AccessDescriptor = CreateAccDescSVE(MemOp_STORE, nontemporal, contiguous, tagchecked);

if !AnyActiveElement{PL*nreg}(mask, esize) then
    if n == 31 && ConstrainUnpredictableBool(Unpredictable_CHECKSPNONEACTIVE) then
        CheckSPAlignment();
    end;
else
    if n == 31 then
        CheckSPAlignment();
    end;
end;

base = if n == 31 then SP{64}() else X{64}(n);
offset = X{64}(m);
addr = AddressAdd(base, UInt(offset) * mbytes, accdesc);
for r = 0 to nreg-1 do
    src = Z{VL}(transfer+r);
    for e = 0 to elements-1 do
        if ActivePredicateElement{PL*nreg}(mask, r * elements + e, esize) then
            Mem{esize}(addr, accdesc) = src[e*:esize];
        end;
        addr = AddressIncrement(addr, mbytes, accdesc);
    end;
end;
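The store loop can be modelled in Python as a minimal sketch under several simplifying assumptions. The function `stnt1d_model` and its parameters are hypothetical. Memory is a dictionary keyed by byte address, each vector register is a list of doubleword values, and the predicate is supplied as an already-expanded list of booleans rather than being derived through `CounterToPredicate`. Enable checks, SP alignment checks, tag checking, faults and the non-temporal hint are not modelled.

```python
MASK64 = (1 << 64) - 1


def stnt1d_model(memory, zregs, t, nreg, active, base, index, vl_bits):
    """Model the element loop of the Operation pseudocode.

    memory  : dict mapping byte address -> stored 64-bit value (updated in place)
    zregs   : dict mapping register number -> list of doubleword elements
    t       : number of the first vector register
    nreg    : 2 or 4
    active  : list of nreg * elements booleans, one per element (assumed
              to be the expanded predicate-as-counter mask)
    base    : base register value
    index   : index register value, treated as unsigned
    vl_bits : vector length in bits
    """
    elements = vl_bits // 64
    mbytes = 8
    addr = (base + index * mbytes) & MASK64
    for r in range(nreg):
        src = zregs[t + r]
        for e in range(elements):
            if active[r * elements + e]:
                memory[addr] = src[e] & MASK64
            # The address advances for inactive elements as well.
            addr = (addr + mbytes) & MASK64
    return memory
```

Inactive elements leave their memory locations untouched, while active elements that follow them are still stored at their contiguous positions.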

Operational information

This instruction is a data-independent-time instruction as described in About PSTATE.DIT.

2026-03_rel 2026-03-26 20:48:11

Encoding for the "Two registers" variant

Bits	31-21	20-16	15	14-13	12-10	9-5	4-1	0
Field	-	Rm	-	msz	PNg	Rn	Zt	N
Value	10100000001	-	0	11	-	-	-	1

Encoding for the "Four registers" variant

Bits	31-21	20-16	15	14-13	12-10	9-5	4-2	1	0
Field	-	Rm	-	msz	PNg	Rn	Zt	-	N
Value	10100000001	-	0	11	-	-	-	0	1
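To illustrate how the fields in the encoding diagrams combine into an instruction word, the following Python sketch assembles one from register numbers. The function `encode_stnt1d` is hypothetical and is a reading of the diagrams on this page, not a verified assembler.

```python
def encode_stnt1d(zt1, png, rn, rm, nreg):
    """Build the 32-bit instruction word from register numbers.

    zt1  : number of the first vector register (a multiple of nreg)
    png  : predicate-as-counter register number, 8 to 15
    rn   : base register number (31 selects SP)
    rm   : index register number
    nreg : 2 or 4
    """
    if nreg not in (2, 4):
        raise ValueError("nreg must be 2 or 4")
    if zt1 % nreg != 0 or not 0 <= zt1 <= 32 - nreg:
        raise ValueError("first vector register must be a multiple of nreg")
    if not 8 <= png <= 15:
        raise ValueError("predicate register must be PN8 to PN15")
    if not (0 <= rn <= 31 and 0 <= rm <= 31):
        raise ValueError("general-purpose register number out of range")

    word = 0b10100000001 << 21   # fixed bits 31:21
    word |= rm << 16             # Rm, bits 20:16
    word |= 0b11 << 13           # msz, bits 14:13
    word |= (png - 8) << 10      # PNg, bits 12:10
    word |= rn << 5              # Rn, bits 9:5
    word |= 1                    # N, bit 0
    if nreg == 4:
        word |= 1 << 15              # bit 15 selects the four-register form
        word |= (zt1 // 4) << 2      # Zt, bits 4:2; bit 1 stays 0
    else:
        word |= (zt1 // 2) << 1      # Zt, bits 4:1
    return word
```

With this model, storing Z6 and Z7 under PN8 with base X1 and index X2 gives the word 0xA0226027.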