LUTI4 (four registers, 16-bit and 32-bit)

Lookup table read with 4-bit indexes (four registers)

This instruction copies 16-bit or 32-bit elements from ZT0 to four destination vectors, using packed 4-bit indexes taken from a segment of the source vector register. A segment is the portion of the source vector that is consumed to fill the four destination vectors. The segment is selected by the vector segment index modulo the total number of segments.
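The segment arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `segment_info` and the parameter `vl_bits` are hypothetical, and the formulas are taken from the Operation pseudocode for this instruction.

```python
ISIZE = 4  # bits per packed index
NREG = 4   # destination vectors written by this form


def segment_info(vl_bits: int, esize: int, imm: int):
    """Return (segments, segment, bits_consumed) for one execution.

    vl_bits is an assumed vector length in bits; esize is 16 or 32.
    """
    if esize not in (16, 32):
        raise ValueError("esize must be 16 or 32 for the four-register form")
    elements = vl_bits // esize              # elements per destination vector
    segments = esize // (ISIZE * NREG)       # 1 for 16-bit, 2 for 32-bit
    segment = imm % segments                 # segment index wraps modulo segments
    bits_consumed = elements * NREG * ISIZE  # index bits read from the source
    return segments, segment, bits_consumed
```

Under these formulas, the 16-bit form has a single segment that covers the whole source vector, so an index of 1 selects segment 0. The 32-bit form has two segments, each covering half of the source vector.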

This instruction is unpredicated.

It has encodings from two classes: Consecutive and Strided.

Consecutive
(FEAT_SME2)

Bits   Field  Value
31:17  -      110000001000101
16     i1
15:14  -      10
13:12  size
11:10  -      00
9:5    Zn
4:2    Zd
1:0    -      00
opc2

Encoding

LUTI4 { <Zd1>.<T>-<Zd4>.<T> }, ZT0, <Zn>[<index>]

Decode for this encoding

if !IsFeatureImplemented(FEAT_SME2) then
    EndOfDecode(Decode_UNDEF);
end;
if size == '00' || size == '11' then
    EndOfDecode(Decode_UNDEF);
end;
let esize : integer{} = 8 << UInt(size);
let isize : integer{} = 4;
let n : integer = UInt(Zn);
let dstride : integer = 1;
let d : integer = UInt(Zd::'00');
let imm : integer = UInt(i1);
let nreg : integer{} = 4;
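As a rough illustration of the Consecutive decode, the following Python sketch extracts the fields from a 32-bit instruction word and derives the same values as the pseudocode. The function name is hypothetical. The bit positions are an assumption read from the encoding diagram above (i1 at bit 16, size at bits 13:12, Zn at bits 9:5, Zd at bits 4:2). The sketch does not check the fixed opcode bits or whether FEAT_SME2 is implemented.

```python
def decode_luti4_x4_consecutive(insn: int) -> dict:
    """Model of the Consecutive decode; field positions are assumed."""
    i1 = (insn >> 16) & 0x1    # vector segment index
    size = (insn >> 12) & 0x3  # element size specifier
    zn = (insn >> 5) & 0x1F    # source vector register number
    zd = (insn >> 2) & 0x7     # first destination register number, divided by 4

    # size '00' and '11' are reserved for this encoding.
    if size in (0b00, 0b11):
        raise ValueError("UNDEFINED: reserved size value")

    return {
        "esize": 8 << size,  # 16 or 32
        "isize": 4,
        "n": zn,
        "dstride": 1,
        "d": zd << 2,        # UInt(Zd::'00')
        "imm": i1,
        "nreg": 4,
    }
```

Because d is formed as Zd::'00', the first destination register number is always a multiple of 4.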

Strided
(FEAT_SME2p1)

Bits   Field  Value
31:17  -      110000001001101
16     i1
15:14  -      10
13:12  size
11:10  -      00
9:5    Zn
4      D
3:2    -      00
1:0    Zd
opc2

Encoding

LUTI4 { <Zd1>.H, <Zd2>.H, <Zd3>.H, <Zd4>.H }, ZT0, <Zn>[<index>]

Decode for this encoding

if !IsFeatureImplemented(FEAT_SME2p1) then
    EndOfDecode(Decode_UNDEF);
end;
if size != '01' then
    EndOfDecode(Decode_UNDEF);
end;
let esize : integer{} = 8 << UInt(size);
let isize : integer{} = 4;
let n : integer = UInt(Zn);
let dstride : integer = 4;
let d : integer = UInt(D::'00'::Zd);
let imm : integer = UInt(i1);
let nreg : integer{} = 4;
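The Strided decode can be sketched in the same way. The function name is hypothetical, and the bit positions are an assumption read from the encoding diagram above (i1 at bit 16, size at bits 13:12, Zn at bits 9:5, D at bit 4, Zd at bits 1:0). The fixed opcode bits and the FEAT_SME2p1 check are not modelled.

```python
def decode_luti4_x4_strided(insn: int) -> dict:
    """Model of the Strided decode; field positions are assumed."""
    i1 = (insn >> 16) & 0x1    # vector segment index
    size = (insn >> 12) & 0x3  # element size specifier
    zn = (insn >> 5) & 0x1F    # source vector register number
    d_bit = (insn >> 4) & 0x1  # selects Z0-Z3 (0) or Z16-Z19 (1)
    zd = insn & 0x3            # low two bits of the first register number

    # Only the 16-bit element size is allocated in this encoding.
    if size != 0b01:
        raise ValueError("UNDEFINED: size must be '01'")

    return {
        "esize": 8 << size,       # always 16 here
        "isize": 4,
        "n": zn,
        "dstride": 4,
        "d": (d_bit << 4) | zd,   # UInt(D::'00'::Zd)
        "imm": i1,
        "nreg": 4,
    }
```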

Assembler Symbols

<Zd1>

For the "Consecutive" variant: is the name of the first scalable vector register of the destination multi-vector group, encoded as "Zd" times 4.

For the "Strided" variant: is the name of the first scalable vector register Z0-Z3 or Z16-Z19 of the destination multi-vector group, encoded as "D:'00':Zd".

<T>

Is the size specifier, encoded in size:

size  <T>
00    RESERVED
01    H
10    S
11    RESERVED

<Zd4>

For the "Consecutive" variant: is the name of the fourth scalable vector register of the destination multi-vector group, encoded as "Zd" times 4 plus 3.

For the "Strided" variant: is the name of the fourth scalable vector register Z12-Z15 or Z28-Z31 of the destination multi-vector group, encoded as "D:'11':Zd".

<Zn>

Is the name of the source scalable vector register, encoded in the "Zn" field.

<index>

Is the vector segment index, in the range 0 to 1, encoded in the "i1" field.

<Zd2>

For the "Strided" variant: is the name of the second scalable vector register Z4-Z7 or Z20-Z23 of the destination multi-vector group, encoded as "D:'01':Zd".

<Zd3>

For the "Strided" variant: is the name of the third scalable vector register Z8-Z11 or Z24-Z27 of the destination multi-vector group, encoded as "D:'10':Zd".
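To make the register numbering concrete, the sketch below lists the four destination registers for a given first register number and stride. The helper name `dest_registers` is hypothetical. The values of d and dstride are those produced by the decode pseudocode (dstride is 1 for Consecutive and 4 for Strided).

```python
def dest_registers(d: int, dstride: int, nreg: int = 4) -> list:
    """Return the Z register names written, starting at d and stepping by dstride."""
    return [f"Z{d + r * dstride}" for r in range(nreg)]


def strided_first_register(d_bit: int, zd: int) -> int:
    """First register number for the Strided variant: UInt(D::'00'::Zd)."""
    return (d_bit << 4) | (zd & 0x3)
```

For example, a Strided encoding with D = 1 and Zd = 2 gives a first register of Z18, so the group is Z18, Z22, Z26, Z30, which falls within the ranges listed for <Zd1> to <Zd4>.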

Operation

CheckStreamingSVEEnabled();
CheckSMEZT0Enabled();
let VL : integer{} = CurrentVL();
let elements : integer = VL DIV esize;
let segments : integer = (esize DIV (isize * nreg)) as integer{1, 2, 4, 8, 16};
let segment : integer = imm MOD segments;
let indexes : bits(VL) = Z{}(n);
var dst : integer = d;
let table : bits(512) = ZT0{}();
for r = 0 to nreg-1 do
    let base : integer = (segment * nreg + r) * elements;
    var result : bits(VL);
    for e = 0 to elements-1 do
        let index : integer = UInt(indexes[(base+e)*:isize]);
        result[e*:esize] = table[index*:32][esize-1:0];
    end;
    Z{VL}(dst) = result;
    dst = dst + dstride;
end;
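The data movement in the Operation pseudocode can be modelled in Python, with plain integers standing in for bit vectors. This is a minimal sketch and not a definitive implementation: the function name and parameters are hypothetical, the streaming-mode and ZT0 enable checks are omitted, and the four results are returned as a list instead of being written to registers.

```python
def luti4_x4(zt0: int, zn_value: int, vl: int, esize: int, imm: int) -> list:
    """Return the four destination vectors as VL-bit integers.

    zt0 is the 512-bit table, zn_value the VL-bit source vector,
    vl the vector length in bits, esize 16 or 32, imm the segment index.
    """
    isize, nreg = 4, 4
    elements = vl // esize
    segments = esize // (isize * nreg)
    segment = imm % segments

    results = []
    for r in range(nreg):
        # Position, in 4-bit index units, of this vector's first index.
        base = (segment * nreg + r) * elements
        result = 0
        for e in range(elements):
            index = (zn_value >> ((base + e) * isize)) & 0xF
            entry = (zt0 >> (index * 32)) & 0xFFFFFFFF       # 32-bit table slot
            element = entry & ((1 << esize) - 1)             # keep low esize bits
            result |= element << (e * esize)
        results.append(result)
    return results
```

Each table slot is 32 bits wide regardless of esize, so the 16-bit form reads only the low half of each slot.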

Operational information

This instruction is a data-independent-time instruction as described in About PSTATE.DIT.


2026-03_rel 2026-03-26 20:48:11

Copyright © 2010-2026 Arm Limited or its affiliates. All rights reserved. This document is Non-Confidential.