Why SharedArrayBuffer Is So Powerful in Game Dev

Introduction
JavaScript is single-threaded. Your game loop, your physics, your terrain generation, your rendering. All fighting over one thread. Web Workers give you more threads. But moving data between threads means copying. SharedArrayBuffer removes that cost entirely.
What SharedArrayBuffer actually is
A normal ArrayBuffer is a fixed-length block of raw bytes in memory. You read and write to it through typed array views like Float32Array or Uint8Array.
A SharedArrayBuffer is the same thing with one critical difference. Multiple threads can access the same block of memory.
With a regular ArrayBuffer, when you postMessage it to a worker, one of two things happens. Either the browser copies the entire buffer. Or you "transfer" it, which means the original thread loses access entirely.
With SharedArrayBuffer, nothing moves. Nothing copies. Both threads see the exact same bytes in RAM.
// Main thread creates the buffer.
const buffer = new SharedArrayBuffer(4096);
const floats = new Float32Array(buffer);

// Send to worker. This does NOT copy. It sends a reference.
worker.postMessage({ buffer });

// Worker receives the same memory.
self.onmessage = (e) => {
  const floats = new Float32Array(e.data.buffer);
  floats[0] = 42.0; // Visible to the main thread immediately.
};
How this works under the hood
Every process on your computer has its own virtual address space. That means the OS gives each process the illusion that it has its own private chunk of memory, starting from address 0. But that is a lie. The actual physical RAM is shared across all processes. The OS maintains a page table that maps virtual addresses to physical addresses. A "page" is a fixed-size block of memory, usually 4 KB.
When you create a normal ArrayBuffer, the browser asks the OS to allocate some pages of physical memory. Those pages are mapped into the current thread's virtual address space. Only that thread can see them. When you postMessage this buffer to a worker, the browser allocates new physical pages for the worker, copies the bytes over, and maps those new pages into the worker's address space. Two separate physical memory regions. That is the copy.
When you create a SharedArrayBuffer, the browser does something different. It allocates physical pages, but maps those same physical pages into both the main thread's and the worker's virtual address spaces. Two virtual addresses, one physical location. The OS page table makes this possible. It is the same mechanism that powers shared memory in C (mmap with MAP_SHARED, or POSIX shm_open).
So when the worker writes floats[0] = 42.0, it writes to a physical memory location. When the main thread reads floats[0], it reads the same physical location through a different virtual address. No copy ever happens because there is only one copy of the data in physical memory.
This is also why SharedArrayBuffer was disabled after the Spectre vulnerability. Spectre exploits speculative execution and CPU cache timing to read memory that code should not be able to access. SharedArrayBuffer gave JavaScript a high-resolution timer (a worker incrementing a shared counter in a tight loop) that made Spectre attacks practical from a browser tab. Browsers re-enabled it only after requiring cross-origin isolation through COOP/COEP headers. More on that later.
Why copying is not free
When a worker finishes building a terrain chunk, it has arrays of positions, normals, colors, UVs, texture weights. For a 64x64 grid, that is 4096 vertices. Each position is 3 floats at 4 bytes each. 49,152 bytes just for positions. Add normals, colors, tangents, UVs, two weight sets. Easily 300-500 KB per chunk.
When you postMessage a regular ArrayBuffer, the browser has to:
Serialize the data. The structured clone algorithm walks the object and converts it.
Allocate new physical pages on the receiving side.
Copy every byte across.
Deserialize on the main thread.
Steps 1 and 4 cost time even for flat float arrays because the structured clone algorithm is general-purpose. It handles nested objects, cyclic references, all kinds of types.
Step 3 is raw memcpy. For 400 KB, fast on modern hardware. But you are not building one chunk. If a player moves through a procedural world and 20 chunks generate in a burst, you are copying 8 MB of geometry. At 60 FPS your budget is 16.6ms per frame. Every millisecond on copies is a millisecond stolen from rendering.
With SharedArrayBuffer, steps 1 through 4 do not happen for the geometry. postMessage sends a small handle. The main thread wraps that same memory in a Float32Array and hands it directly to WebGL. The copy cost drops to zero.
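The difference is observable without even spinning up a worker. A rough sketch: structuredClone runs the same structured clone algorithm that postMessage applies to a plain buffer.

```javascript
// A plain typed array: cloning it produces an independent copy.
const original = new Float32Array(4096 * 3); // one chunk's positions
original[0] = 1.0;

const copied = structuredClone(original); // what postMessage does to a plain buffer
copied[0] = 2.0;
console.log(original[0]); // 1 -- two separate physical buffers

// Two views over one SharedArrayBuffer alias the same bytes.
const shared = new SharedArrayBuffer(4096 * 3 * 4);
const viewA = new Float32Array(shared);
const viewB = new Float32Array(shared);
viewA[0] = 42.0;
console.log(viewB[0]); // 42 -- one physical buffer, two views
```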
The Transferable alternative
postMessage supports "transferable" objects. You can transfer an ArrayBuffer to avoid copying by moving ownership.
worker.postMessage({ buffer: myArrayBuffer }, [myArrayBuffer]);
// myArrayBuffer.byteLength is now 0. The sender lost it.
No copy. But the sender loses the buffer. If it wants to reuse that buffer for the next chunk, it cannot. Gone.
SharedArrayBuffer has no such tradeoff. Both sides keep access. The worker writes, posts "done", and immediately starts the next job using the same buffer. The main thread reads at its own pace.
The worker pool pattern
One worker is fine for simple offloading. For heavy work you want all CPU cores. That means a pool.
You have a list of workers, a queue of pending work, and two buckets: free and busy. Work comes in, grab a free worker, send the work. No free worker, queue it. Worker finishes, move it back to free, check the queue.
class WorkerPool {
  constructor(size, scriptUrl) {
    this._workers = Array.from(
      { length: size },
      () => new Worker(scriptUrl, { type: "module" }),
    );
    this._free = [...this._workers];
    this._busy = new Set();
    this._queue = [];
  }

  enqueue(data, callback) {
    this._queue.push({ data, callback });
    this._pump();
  }

  _pump() {
    while (this._free.length > 0 && this._queue.length > 0) {
      const worker = this._free.pop();
      this._busy.add(worker);
      const { data, callback } = this._queue.shift();
      worker.onmessage = (e) => {
        this._busy.delete(worker);
        this._free.push(worker);
        callback(e.data);
        this._pump(); // Drain the queue.
      };
      worker.postMessage(data);
    }
  }

  get busy() {
    return this._queue.length > 0 || this._busy.size > 0;
  }
}
The _pump call after each completion is the key. Self-draining queue. Worker finishes, immediately picks up the next job. On an 8-core machine, run 7 workers plus the main thread. All cores utilized.
Combine this with SharedArrayBuffer. Each worker fills shared buffers with generated geometry. Posts "done" with references. Main thread reads directly. No copy bottleneck no matter how many chunks are in flight.
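Sizing the pool is a judgment call. One common rule of thumb, sketched here as a hypothetical helper (poolSize is not part of any API):

```javascript
// Leave one core for the main thread; fall back to 4 workers when the
// browser does not report a core count.
function poolSize(hardwareConcurrency) {
  return Math.max(1, (hardwareConcurrency || 4) - 1);
}

// In the browser:
// const pool = new WorkerPool(poolSize(navigator.hardwareConcurrency), "chunk-worker.js");
```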
Terrain chunk example in practice
Main thread sends a message:
const msg = {
  subject: "build_chunk",
  params: {
    noiseParams: { seed: 42, octaves: 6, scale: 1024 },
    width: 500,
    offset: [1000, 0, 2000],
    resolution: 64,
    worldMatrix: camera.matrixWorld.elements, // plain array, not a Matrix4 instance
  },
};

workerPool.enqueue(msg, (result) => {
  chunk.geometry.setAttribute(
    "position",
    new THREE.Float32BufferAttribute(result.positions, 3),
  );
  chunk.geometry.setAttribute(
    "normal",
    new THREE.Float32BufferAttribute(result.normals, 3),
  );
  chunk.show();
});
Plain numbers, arrays, nested objects. No class instances. Structured clone copies data but drops methods and prototypes, so a THREE.Matrix4 or a Noise instance does not survive the trip. You decompose into raw config, and the worker reconstructs on its side.
Worker receives and computes:
self.onmessage = (msg) => {
  const { noiseParams, offset, resolution } = msg.data.params;
  const noise = new Noise(noiseParams);
  const heightGen = new HeightGenerator(noise, offset);
  // Heavy math. Does not block the main thread.
  const positions = [];
  const normals = [];
  for (let x = 0; x <= resolution; x++) {
    for (let y = 0; y <= resolution; y++) {
      // Sample noise, compute position, compute normal...
    }
  }
  // Pack into SharedArrayBuffers.
  const bytesPerFloat = 4;
  const posBuf = new Float32Array(
    new SharedArrayBuffer(bytesPerFloat * positions.length),
  );
  posBuf.set(positions);
  const normBuf = new Float32Array(
    new SharedArrayBuffer(bytesPerFloat * normals.length),
  );
  normBuf.set(normals);
  self.postMessage({
    subject: "build_chunk_result",
    positions: posBuf,
    normals: normBuf,
  });
};
posBuf.set(positions) copies from the temporary JS array into the shared buffer. That is the only copy in the entire pipeline.
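You can eliminate even that copy by writing straight into the shared view inside the loop. A sketch, with computeHeight standing in as a placeholder for the noise sampling above:

```javascript
const resolution = 64;
const computeHeight = (x, y) => 0; // placeholder for the real noise sampling
const vertexCount = (resolution + 1) * (resolution + 1);
const posBuf = new Float32Array(new SharedArrayBuffer(4 * vertexCount * 3));

let i = 0;
for (let x = 0; x <= resolution; x++) {
  for (let y = 0; y <= resolution; y++) {
    // Write each vertex directly into shared memory. No temporary array.
    posBuf[i++] = x;
    posBuf[i++] = computeHeight(x, y);
    posBuf[i++] = y;
  }
}
```

The tradeoff is that you must know the vertex count up front, which for a fixed-resolution grid you do.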
Beyond terrain
The pattern applies anywhere you have heavy computation producing large typed array output.
Physics simulation. Run collision detection for thousands of bodies in a worker. Write positions and rotations into a SharedArrayBuffer. Main thread reads them to update transforms.
Pathfinding. A* or navmesh queries for hundreds of AI agents. Distribute across workers. Write paths into shared memory. Main thread reads waypoints directly.
Procedural meshes. Marching cubes for voxels, metaballs, destructible geometry. Workers compute vertices. Results land in shared buffers. Main thread uploads to GPU.
Audio processing. Game audio DSP in AudioWorklet runs on a separate thread. SharedArrayBuffer lets you share audio buffers with game logic workers for synchronized effects.
Particle simulation. CPU-driven particle systems with forces, collisions, lifetime. Worker computes, shared buffer holds position/size/color arrays, main thread uploads to Points geometry every frame.
The gotchas
COOP/COEP headers. After Spectre, browsers require these HTTP headers or SharedArrayBuffer is undefined:
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
This also affects cross-origin resources. CDN scripts and textures need crossorigin attributes, and the CDN must send CORS headers. You can check self.crossOriginIsolated at runtime before touching SharedArrayBuffer.
Race conditions. Two threads reading and writing the same memory can produce torn or stale reads. If the worker is halfway through writing (x, y, z) when the main thread reads, you get a mix of old and new values. For geometry this is usually fine because you only read after the "done" message. For real-time shared state you need Atomics for synchronization: Atomics.store, Atomics.load, Atomics.wait, Atomics.notify.
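A minimal sketch of the flag pattern, with both sides shown inline here (in reality they live in different files):

```javascript
// Int32Array is required for Atomics.wait/notify. Index 0 is a "ready" flag.
const sab = new SharedArrayBuffer(4);
const flag = new Int32Array(sab);

// Worker side: finish writing the geometry, then publish.
Atomics.store(flag, 0, 1); // sequentially consistent: prior writes visible first
Atomics.notify(flag, 0);   // wake any thread blocked in Atomics.wait

// Main-thread side: poll without blocking the frame.
if (Atomics.load(flag, 0) === 1) {
  // Safe to read the geometry now.
}
```

Note that Atomics.wait throws on the browser main thread; poll with Atomics.load per frame, or use Atomics.waitAsync where supported.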
No GC help. A SharedArrayBuffer is only reclaimed once every thread has dropped all references to it, so a forgotten reference in any worker keeps the whole block alive. Churning out a fresh shared buffer per chunk leaks memory fast. Pre-allocate and reuse.
Fixed size. You cannot resize after creation. If output size varies, allocate for the worst case or fall back to transferable.
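The last two gotchas point the same direction: allocate worst-case buffers once and recycle them. A tiny free-list sketch, where CHUNK_BYTES is an assumed worst case:

```javascript
const CHUNK_BYTES = 512 * 1024; // assumed worst case per chunk
const freeBuffers = [];

function acquire() {
  // Reuse a returned buffer if one exists, otherwise allocate a new one.
  return freeBuffers.pop() || new SharedArrayBuffer(CHUNK_BYTES);
}

function release(buffer) {
  freeBuffers.push(buffer);
}
```

The main thread calls release once geometry is uploaded to the GPU; the next acquire hands the same physical memory to the next job.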
Mental model
Your game is two worlds.
The main thread owns the screen. Render loop, input, scene graph. Its job is to stay under 16ms per frame. No heavy computation here.
The workers own the math. Terrain, physics, pathfinding. No frame budget. They take as long as they need.
The bridge between them is messages. Copying limits the bandwidth of that bridge. SharedArrayBuffer removes that limit. The data does not cross the bridge. It sits in shared physical memory, accessible from both sides through their own virtual address mappings.
Keep the main thread light. Push heavy work to workers. Use SharedArrayBuffer so results cost nothing to bring back.