# Lightweight Fault Isolation: Practical, Efficient, and Secure Software Sandboxing

Zachary Yedidia

Stanford University

Today's systems increasingly run untrusted code.

- Cloud machines and serverless (VMs, containers, WebAssembly).
- Kernels (eBPF).
- Web browsers (JavaScript, WebAssembly).
- Smart contracts (WebAssembly, EVM).

Applications need lightweight solutions with fast context switch times (single process).

**Goal**: enforce that untrusted programs

- cannot read/write outside sandbox.
- cannot directly perform system calls.










Problem: trusting a language verifier and compiler can be dangerous.



Problem: tradeoff between performance and security.

Example CVEs:

- **CVE-2021-32629** (Assigner: GitHub; published 2021-05-24): Memory access due to code generation flaw in Cranelift module. Cranelift is an open-source code generator maintained by Bytecode Alliance. It translates a target-independent intermediate representation into executable machine code. A bug in 0.73 of the Cranelift x64 backend could result in a potential sandbox escape in a Wasm program. The bug was introduced in the new backend on 2020-09-08 and first included in a release on 2020-09-30.
- **CVE-2023-26489** (Assigner: GitHub; published 2023-03-08): Guest-controlled out-of-bounds read/write on x86\_64 in wasmtime. Wasmtime is a fast and secure runtime for WebAssembly. In affected versions, wasmtime's code generator, Cranelift, has a bug on x86\_64 targets where address-mode computation mistakenly would calculate a 35-bit effective address instead of WebAssembly's defined 33-bit effective address. A wasm-controlled load/store could then read/write memory up to ~34G away from the base of linear memory.
- **CVE-2023-41880**: Miscompilation of wasm 'i64x2.shr\_s' instruction.
- **CVE-2021-20320** (Assigner: Redhat; published 2022-02-18): A flaw was found in s390 eBPF JIT in bpf\_jit\_insn in arch/s390/net/bpf\_jit\_comp.c in the Linux kernel. A local attacker with special user privilege can circumvent the verifier, which may lead to a confidentiality problem.
- **CVE-2021-21220** (Assigner: Chrome; published 2021-04-26): An issue with untrusted input in V8 in Google Chrome prior to 89.0.4389.128 that was exploitable by a remote attacker.
- **CVE-2020-9802**.

# Conclusion: we need better security!

 $\rightarrow$  and we can get better performance too.

SFI (Wahbe et al., SOSP '93): Verify machine code.









Presenting Lightweight Fault Isolation: low overhead, secure, scalable, simple.






Already using WebAssembly? You can run WebAssembly inside LFI!



# Principles:

- Use 4GiB sandboxes combined with instructions to operate on 32-bit values.
- Works without modification to existing compilers.
  - $\rightarrow$  just reserve a few registers (x18 and x21) when compiling.
  - $\rightarrow$  can work with any compiler that produces GNU assembly.
- Every address in the sandbox is a valid branch target (no aligned bundles).
  → helps simplicity, code size, running time, and Spectre-safety.

- Fixed-width encoding: misalignment traps.
  - $\rightarrow$  Consistent disassembly without aligned bundles.
- 32 64-bit registers (x0-x30, sp).
- Stack pointer register (sp).
- Dedicated return address register (x30).
  → Easy to reserve registers.
- 32-bit register subsets (w0-w30, wsp).
- A 32-bit addressing mode.
  - $\rightarrow$  Fast operations for 4GiB sandboxes.


## **Basic Implementation: Overview**

LFI (ARM64)



| Instruction      | Verifier result |
|------------------|-----------------|
| add x0, x1, x2   | safe            |
| b foo            | safe            |
| svc #0 (syscall) | unsafe          |
| ldr x1, [x0]     | unsafe          |
| mov x18, x0      | unsafe          |

Unsafe instruction  $\rightarrow$  rejected by verifier.


Special/reserved register (same idea from the '93 SFI project):

• x18: always contains a valid sandbox address.


ldr x1, [x18] // safe

How to safely modify x18?

• x21: sandbox base address (aligned to 4GiB).


add x18, x21, w0, uxtw // safe




If x0 contains a valid sandbox address, the add acts as a mov.

Otherwise, the sandbox has non-escaping undefined behavior (pointer overflow).
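The effect of the guard can be checked with plain integer arithmetic. The following is a minimal sketch that models `add x18, x21, w0, uxtw` in Python; the base address and the test addresses are made-up values chosen only for illustration.

```python
MASK32 = 0xFFFF_FFFF              # selects the low 32 bits (the "w" view of a register)
MASK64 = 0xFFFF_FFFF_FFFF_FFFF    # registers are 64 bits wide
SANDBOX_SIZE = 1 << 32            # 4GiB

def guard(base: int, x0: int) -> int:
    """Model of `add x18, x21, w0, uxtw`: x18 = x21 + zero_extend(w0)."""
    assert base % SANDBOX_SIZE == 0, "sandbox base must be 4GiB-aligned"
    return (base + (x0 & MASK32)) & MASK64

base = 0x0000_7000_0000_0000      # hypothetical 4GiB-aligned base held in x21
inside = base + 0x1234            # a valid sandbox address
outside = 0xFFFF_8000_DEAD_BEEF   # an address far outside the sandbox

# Valid address: the guard returns it unchanged, so the add acts like a mov.
assert guard(base, inside) == inside

# Invalid address: the result is some other address, but still inside the sandbox.
wrapped = guard(base, outside)
assert base <= wrapped < base + SANDBOX_SIZE
```

Because the base is 4GiB-aligned and only the low 32 bits of x0 are added, the result cannot leave the 4GiB region regardless of what x0 holds.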

| Original     | Sandboxed                                 |
|--------------|-------------------------------------------|
| ldr x1, [x0] | add x18, x21, w0, uxtw<br>ldr x1, [x18]   |
| br x0        | add x18, x21, w0, uxtw<br>br x18          |

Instrumenter performs transformations; verifier is convinced of their safety.
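To make the split between instrumenter and verifier concrete, here is a toy verifier over textual instructions. It is only a sketch under simplifying assumptions: the actual verifier checks machine code, whereas this one matches strings, knows only a handful of mnemonics, assumes direct branch targets lie inside the sandbox, and ignores sp and x30. The function name `verify` and the rule set are illustrative, not taken from LFI.

```python
import re

GUARD = re.compile(r"add x18, x21, w\d+, uxtw")   # the only allowed write to x18
RESERVED = {"x18", "x21"}

def xname(reg: str) -> str:
    """Map w18 to x18 and so on, since writing wN also changes xN."""
    return "x" + reg[1:] if re.fullmatch(r"[wx]\d+", reg) else reg

def verify(instr: str) -> bool:
    """Return True if this toy verifier accepts the instruction."""
    instr = instr.strip()
    mnemonic, _, rest = instr.partition(" ")
    operands = [op.strip() for op in rest.split(",")] if rest else []
    if mnemonic == "svc":
        return False                               # no direct system calls
    if mnemonic in ("ldr", "str"):
        if len(operands) != 2 or operands[1] != "[x18]":
            return False                           # address must come from x18
        return mnemonic == "str" or xname(operands[0]) not in RESERVED
    if mnemonic in ("br", "blr"):
        return operands == ["x18"]                 # indirect branch only via x18
    if mnemonic == "b":
        return True                                # assumed: label is inside the sandbox
    if mnemonic in ("add", "mov"):
        if operands and xname(operands[0]) in RESERVED:
            return GUARD.fullmatch(instr) is not None
        return True
    return False                                   # unknown instruction: reject

original = ["ldr x1, [x0]", "br x0"]
sandboxed = ["add x18, x21, w0, uxtw", "ldr x1, [x18]",
             "add x18, x21, w0, uxtw", "br x18"]
assert not any(verify(i) for i in original)        # unguarded forms are rejected
assert all(verify(i) for i in sandboxed)           # instrumented forms are accepted
```

The verifier never needs to trust the instrumenter: it only checks that each instruction, taken on its own, preserves the invariant that x18 holds a sandbox address.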


Same invariant for sp and x30.

ret x30 // safe

ldr x1, [sp, #8] // safe

| Original      | Sandboxed                                   |
|---------------|---------------------------------------------|
| ldr x30, [sp] | ldr x30, [sp]<br>add x30, x21, w30, uxtw    |

Key Optimization: we can perform the guard inside a load/store addressing mode.

| Original code     | Sandboxed equivalent                            | Cycles of overhead |
|-------------------|-------------------------------------------------|--------------------|
| ldr rt, [xN]      | ldr rt, [x21, wN, uxtw]                         | 0                  |
| ldr rt, [xN, #i]  | add w22, wN, #i<br>ldr rt, [x21, w22, uxtw]     | 1                  |
| ldr rt, [xN, #i]! | add xN, xN, #i<br>ldr rt, [x21, wN, uxtw]       | 1                  |
| ldr rt, [xN], #i  | ldr rt, [x21, wN, uxtw]<br>add xN, xN, #i       | 1                  |

(other addressing modes omitted for brevity)
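The four rewrites in the table can be sketched as a small text-to-text transformation. This is an illustrative model only: the function name `sandbox_load` is made up, it handles just `ldr` with a numbered x register as base and a non-negative immediate, and it does not treat sp specially.

```python
import re

REG = r"x(\d+)"      # base register xN
IMM = r"#(\d+)"      # non-negative immediate #i

def sandbox_load(instr: str) -> list[str]:
    """Rewrite one `ldr` according to the four table rows above."""
    m = re.fullmatch(rf"ldr (\w+), \[{REG}\]", instr)
    if m:                                    # ldr rt, [xN]
        rt, n = m.groups()
        return [f"ldr {rt}, [x21, w{n}, uxtw]"]
    m = re.fullmatch(rf"ldr (\w+), \[{REG}, {IMM}\]!", instr)
    if m:                                    # ldr rt, [xN, #i]!  (pre-index)
        rt, n, i = m.groups()
        return [f"add x{n}, x{n}, #{i}", f"ldr {rt}, [x21, w{n}, uxtw]"]
    m = re.fullmatch(rf"ldr (\w+), \[{REG}, {IMM}\]", instr)
    if m:                                    # ldr rt, [xN, #i]   (immediate offset)
        rt, n, i = m.groups()
        return [f"add w22, w{n}, #{i}", f"ldr {rt}, [x21, w22, uxtw]"]
    m = re.fullmatch(rf"ldr (\w+), \[{REG}\], {IMM}", instr)
    if m:                                    # ldr rt, [xN], #i   (post-index)
        rt, n, i = m.groups()
        return [f"ldr {rt}, [x21, w{n}, uxtw]", f"add x{n}, x{n}, #{i}"]
    raise ValueError(f"addressing mode not covered by this sketch: {instr}")

for original in ["ldr x1, [x0]", "ldr x1, [x0, #8]",
                 "ldr x1, [x0, #8]!", "ldr x1, [x0], #8"]:
    print(f"{original:20} -> {'; '.join(sandbox_load(original))}")
```

Only the first form is free; the other three each add one instruction because the `[x21, wN, uxtw]` addressing mode has no room for an immediate offset or writeback.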

Primary metric: CPU overhead introduced by additional instructions.

Measured on SPEC 2017 benchmarks that compile with our toolchain.

- $\rightarrow$ C or C++ and compatible with Musl libc.
- $\rightarrow$ Apple M1 running Asahi Linux.

1. Comparing the effects of LFI optimizations.
2. Comparing LFI vs. AOT WebAssembly compilers that use LLVM and Cranelift.
   - $\rightarrow$ WAMR: LLVM.
   - $\rightarrow$ Wasm2c: LLVM.
   - $\rightarrow$ Wasmtime: Cranelift.

# **Evaluation: LFI Overhead**



## Evaluation: LFI vs. WebAssembly



Table 1: GCP T2A VM, 2.8 GHz

| Platform | Syscall (ns) | Ctxsw (ns) |
|----------|--------------|------------|
| LFI      | 26           | 46         |
| Linux    | 162          | 2,494      |
| gVisor   | 12,019       | 22,899     |

Table 2: Apple M1, 3.2 GHz

| Platform | Syscall (ns)  | Ctxsw (ns)    |
|----------|---------------|---------------|
| LFI      | 22            | 48            |
| Linux    | 129           | 1,504         |
| gVisor   | not supported | not supported |

- Linux does not provide an optimized context switch implementation<sup>1</sup>.
- gVisor incurs high overhead from the suboptimal Linux switch.
- Software protection can go beyond the limits of current hardware protection.

<sup>1</sup>seL4 does much better with a ~400 cycle switch.
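To compare the nanosecond figures in Tables 1 and 2 with the cycle count quoted for seL4, the context switch times can be converted to cycles. This is a rough sketch that assumes each machine runs at the nominal clock frequency given in its table caption, which may not match the frequency during measurement.

```python
def to_cycles(ns: float, ghz: float) -> float:
    """Convert a duration in nanoseconds to cycles at the given clock (GHz = cycles per ns)."""
    return ns * ghz

# Context switch times (ns) copied from Tables 1 and 2.
measurements = {
    "GCP T2A (2.8 GHz)": (2.8, {"LFI": 46, "Linux": 2494, "gVisor": 22899}),
    "Apple M1 (3.2 GHz)": (3.2, {"LFI": 48, "Linux": 1504}),
}

for machine, (ghz, ctxsw_ns) in measurements.items():
    for platform, ns in ctxsw_ns.items():
        cycles = to_cycles(ns, ghz)
        ratio = ns / ctxsw_ns["LFI"]
        print(f"{machine:20} {platform:7} {cycles:8.0f} cycles  {ratio:6.1f}x LFI")
```

Under this conversion the LFI switch comes to roughly 130 to 155 cycles, while the Linux switch comes to several thousand cycles on both machines.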

You can follow further development at:

https://github.com/zyedidia/lfi

See the paper for more details about optimizations, the verifier, the runtime, Spectre, proposed designs for other architectures (x86-64 and RISC-V), and more!

Questions?

## **Optimization: Guard Hoisting**

Introduce two more reserved registers:

- x22: always valid.
- x23: always valid.

| Original | Sandboxed |
|----------|-----------|
| ldr x2, [x1, #8]<br>str x2, [x0, #8]<br>ldr x2, [x1, #16]<br>str x2, [x0, #16]<br>ldr x2, [x1, #24]<br>str x2, [x0, #24] | add x22, x21, w0, uxtw<br>add x23, x21, w1, uxtw<br>ldr x2, [x23, #8]<br>str x2, [x22, #8]<br>ldr x2, [x23, #16]<br>str x2, [x22, #16]<br>ldr x2, [x23, #24]<br>str x2, [x22, #24] |
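The hoisting transformation for this example can be sketched as follows. The sketch makes several assumptions that the slide does not spell out: the block is straight-line code, every access has the form `ldr/str rt, [xN, #i]`, no base register is overwritten inside the block, and small immediate offsets from a valid address are safe to access. The function name `hoist_guards` is hypothetical.

```python
import re

MEM = re.compile(r"(ldr|str) (\w+), \[x(\d+), #(\d+)\]")

def hoist_guards(block: list[str], hoist_regs=("x22", "x23")) -> list[str]:
    """Emit one guard per base register, then rewrite accesses to use the guarded copy."""
    parsed = []
    for instr in block:
        m = MEM.fullmatch(instr)
        if m is None:
            raise ValueError(f"unsupported instruction: {instr}")
        parsed.append(m.groups())
    bases = sorted({n for _, _, n, _ in parsed}, key=int)
    if len(bases) > len(hoist_regs):
        raise ValueError("more base registers than hoisting registers")
    for op, rt, _, _ in parsed:
        # A load into a base register would make its hoisted guard stale.
        if op == "ldr" and rt[1:] in bases:
            raise ValueError(f"{rt} is overwritten inside the block")
    mapping = dict(zip(bases, hoist_regs))
    out = [f"add {mapping[n]}, x21, w{n}, uxtw" for n in bases]
    out += [f"{op} {rt}, [{mapping[n]}, #{i}]" for op, rt, n, i in parsed]
    return out

block = ["ldr x2, [x1, #8]", "str x2, [x0, #8]",
         "ldr x2, [x1, #16]", "str x2, [x0, #16]",
         "ldr x2, [x1, #24]", "str x2, [x0, #24]"]
hoisted = hoist_guards(block)
print("\n".join(hoisted))
# Two guard instructions are added in total, instead of one per access.
assert len(hoisted) - len(block) == 2
```

For this block, per-access guards would add six instructions (one `add` per offset access), while hoisting adds two.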

The sp register is assumed to always contain a valid address.

 $\rightarrow$  No guards necessary for stack accesses.

Guards are necessary when modifying sp, but not in all cases.

| Original code                                          | Sandboxed equivalent                         |
|--------------------------------------------------------|----------------------------------------------|
| add sp, sp, #n                                         | add w22, wsp, #n<br>add sp, x21, w22, uxtw   |
| add sp, sp, #n<br>(no branches)<br>ldr rt, [sp, #m]    | No change necessary                          |
| str rt, [sp, #n]!                                      | No change necessary                          |

On x86-64, an efficient implementation is probably possible with Intel CET and segment registers.

CET: shadow call stacks and indirect branch tracking<sup>2</sup>.

- $\rightarrow$  Ensures all indirect branches target instruction boundaries.
- $\rightarrow$  Verifier will have to check direct branches (slower verification).

Store sandbox base in %gs, reserve %r15, rewrite loads/stores:

| Original code | Sandboxed equivalent                  |
|---------------|---------------------------------------|
| mov %rxx, ()  | lea (), %r15d<br>mov %rxx, %gs:r15    |

<sup>2</sup>Usermode IBT is not currently provided by Linux: showstopper for avoiding alignment constraints.

Problem 1 (RISC-V): Compressed instructions, and no hardware control-flow protection (yet).

- $\rightarrow$ Require that compressed instructions only exist as pairs (otherwise decompress).
- $\rightarrow$ Require that branches target a 4-byte aligned block, possibly via an enforced and.

Problem 2: More difficult to operate on 32-bit subsets.

 $\rightarrow$ Zba provides add.uw rd, rs1, rs2 (zero-extends bottom 32 bits of rs2).

Store sandbox base in x21, reserve x18, rewrite loads/stores:

| Original code | Sandboxed equivalent                    |
|---------------|-----------------------------------------|
| ld xN, n(xM)  | add.uw x18, x21, xM<br>ld xN, n(x18)    |

LFI does not rely on any fine-grained control-flow integrity for sandbox correctness.

 $\rightarrow$  Speculative sandbox breakout attacks are mitigated.


Problem: Speculative cross-sandbox and host poisoning attacks.


Solution: ARM software context numbers.

### D13.2.121 SCXTNUM\_EL0, EL0 Read/Write Software Context Number

The SCXTNUM\_EL0 characteristics are:

### Purpose

Provides a number that can be used to separate out different context numbers with the EL0 exception level, for the purpose of protecting against side-channels using branch prediction and similar resources.

### Configurations

This register is present only when FEAT\_CSV2\_2 is implemented or FEAT\_CSV2\_1p2 is implemented. Otherwise, direct accesses to SCXTNUM\_EL0 are UNDEFINED.

#### Attributes

SCXTNUM\_EL0 is a 64-bit register.

Idea: use the first page of the sandbox to store the runtime call table (read-only).

• The address of the runtime call table is already stored in x21!

| Original | Sandboxed                           |
|----------|-------------------------------------|
| svc #0   | ldr x30, [x21, #n]<br>blr x30       |

 $\rightarrow$  Verifier must ensure blr always follows the load.
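The check described here can be sketched as a scan over the instruction stream. As before, this is a toy model on textual instructions with a made-up function name (`runtime_calls_ok`); it only shows the shape of the rule. The rule presumably matters because the load leaves a runtime address in x30, which is otherwise required to hold a sandbox address; the `blr` that follows replaces x30 with the return address, which is the address of the next instruction.

```python
import re

# Load of a runtime entry point from the call table at the sandbox base.
TABLE_LOAD = re.compile(r"ldr x30, \[x21, #\d+\]")

def runtime_calls_ok(code: list[str]) -> bool:
    """Accept only if every call-table load is immediately followed by `blr x30`."""
    for i, instr in enumerate(code):
        if TABLE_LOAD.fullmatch(instr):
            if i + 1 >= len(code) or code[i + 1] != "blr x30":
                return False
    return True

good = ["mov x0, #1", "ldr x30, [x21, #8]", "blr x30"]
bad = ["ldr x30, [x21, #8]", "add x0, x0, #1", "blr x30"]
assert runtime_calls_ok(good)
assert not runtime_calls_ok(bad)      # x30 holds a runtime address across the add
```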