Interactive explainer · image compression · all code runs in your browser, no libraries required
How it works under the hood

Inside JPEG encoding

A 3 MB photograph becomes a 200 KB file. Where does the data go? JPEG's pipeline is eight distinct mathematical steps — each one exploiting a different property of human vision. Here they all are, in runnable JavaScript.

01 RGB → YCbCr · colour space
02 Subsample · 4:2:0
03 8×8 blocks · partition
04 DCT · frequency
05 Quantise · lossy step
06 Zigzag · reorder
07 RLE · run-length
08 Huffman · entropy
§01

Why raw pixels are huge

A pixel is three 8-bit numbers — red, green, blue — so 1 byte each. A 12-megapixel photo at 3:2 aspect is 4000×3000 pixels. That's 36 million bytes just for the raw data, before any header, metadata, or framing.

Human perception has two important asymmetries that JPEG exploits: the eye is far more sensitive to brightness changes than colour changes, and far more sensitive to low spatial frequencies (gradual changes) than high ones (fine detail). JPEG throws away exactly what you can't see.

File size at different colour depths and resolutions
raw_size.js — how much data a raw image contains
// How much data does an uncompressed image contain?

function rawSize(width, height, channels = 3, bitsPerChannel = 8) {
  const bytes = width * height * channels * (bitsPerChannel / 8);
  return { bytes, kb: bytes/1024, mb: bytes/1024/1024 };
}

const cameras = [
  { name: "4K frame",      w: 3840, h: 2160 },
  { name: "12MP phone",    w: 4000, h: 3000 },
  { name: "48MP phone",    w: 8064, h: 6048 },
  { name: "1080p video frame", w: 1920, h: 1080 },
];

for (const cam of cameras) {
  const r = rawSize(cam.w, cam.h);
  console.log(`${cam.name.padEnd(20)} ${cam.w}×${cam.h}`);
  console.log(`  Raw RGB:   ${r.mb.toFixed(1)} MB`);
  console.log(`  Typical JPEG Q=80: ~${(r.mb/15).toFixed(2)} MB  (15:1 compression)`);
  console.log(`  Typical JPEG Q=50: ~${(r.mb/25).toFixed(2)} MB  (25:1 compression)`);
}

// 30fps video without compression
const videoRaw = rawSize(1920, 1080);
console.log(`\n1080p video @ 30fps:`);
console.log(`  ${(videoRaw.mb * 30).toFixed(0)} MB/s uncompressed`);
console.log(`  ${(videoRaw.mb * 30 * 3600 / 1024).toFixed(0)} GB/hour`);
§02

Colour space: RGB → YCbCr

RGB stores three equally-weighted colour channels. But your eye has ~120 million rod cells (brightness-sensitive) and only ~6 million cone cells (colour-sensitive), and the cones are densest at the fovea. JPEG exploits this by converting to YCbCr: one luma channel Y (brightness) and two chroma channels Cb and Cr (blue-difference and red-difference colour).

This separation is the precondition for chroma subsampling in the next step — you can discard colour information you'll barely notice, while preserving every luma bit.

How the conversion works

The three output channels are computed as weighted sums of R, G, and B. The weights come from the ITU-R BT.601 standard and reflect how the human eye perceives colour.

Y (luma) is a weighted average of all three channels, with green weighted most heavily because the eye has the most green-sensitive cones:

Y  =  0.299 × R  +  0.587 × G  +  0.114 × B
Green gets 59% of the weight — the eye is most sensitive to green light.

Cb (blue-difference chroma) measures how much bluer the pixel is relative to its luma. It's centred at 128 (neutral grey maps to exactly 128):

Cb = −0.169 × R  −  0.331 × G  +  0.500 × B  +  128
Pure blue (0,0,255) → Cb=255. Pure red (255,0,0) → Cb=85. Grey (128,128,128) → Cb=128.

Cr (red-difference chroma) measures how much redder the pixel is relative to its luma, also centred at 128:

Cr =  0.500 × R  −  0.419 × G  −  0.081 × B  +  128
Pure red (255,0,0) → Cr=255. Pure blue (0,0,255) → Cr=107. Grey (128,128,128) → Cr=128.

Three things to notice: the coefficients in each row sum to zero (before the +128 offset), which means a neutral grey always maps to Cb=128, Cr=128 regardless of its brightness. The +128 offset shifts the range from [−128, 127] to [0, 255] so it fits in a byte. And the Y coefficients sum to 1.0 (0.299 + 0.587 + 0.114 = 1.0), so white (255,255,255) maps to Y=255.
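These properties are quick to verify in code. A standalone sketch (the weight arrays just restate the BT.601 coefficients; `sum` is an illustrative helper):

```javascript
// Verify the BT.601 coefficient properties described above
const Y_W  = [ 0.299,    0.587,    0.114  ];
const CB_W = [-0.16875, -0.33126,  0.5    ];
const CR_W = [ 0.5,     -0.41869, -0.08131];

const sum = w => w.reduce((a, b) => a + b, 0);

console.log("Y  weights sum:", sum(Y_W).toFixed(5));  // ~1: white maps to Y=255
console.log("Cb weights sum:", sum(CB_W).toFixed(5)); // ~0: grey maps to Cb=128
console.log("Cr weights sum:", sum(CR_W).toFixed(5)); // ~0: grey maps to Cr=128

// Any neutral grey (r = g = b) lands on Cb = Cr = 128, whatever its brightness
for (const g of [0, 64, 128, 255]) {
  const cb = CB_W[0] * g + CB_W[1] * g + CB_W[2] * g + 128;
  const cr = CR_W[0] * g + CR_W[1] * g + CR_W[2] * g + 128;
  console.log(`grey(${g}): Cb=${cb.toFixed(2)} Cr=${cr.toFixed(2)}`);
}
```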

RGB ↔ YCbCr — drag the sliders, see both representations
RGB input
YCbCr output
ycbcr.js — the colour space transform
// RGB ↔ YCbCr conversion (ITU-R BT.601 coefficients — used by JPEG)

function rgbToYCbCr(r, g, b) {
  const Y  =  0.299   * r + 0.587   * g + 0.114   * b;
  const Cb = -0.16875 * r - 0.33126 * g + 0.5     * b + 128;
  const Cr =  0.5     * r - 0.41869 * g - 0.08131 * b + 128;
  const clamp = v => Math.max(0, Math.min(255, Math.round(v)));
  return [clamp(Y), clamp(Cb), clamp(Cr)]; // pure red pushes Cr to 255.5, so clamp into byte range
}

function yCbCrToRGB(Y, Cb, Cr) {
  const r = Y                        + 1.40200 * (Cr - 128);
  const g = Y - 0.34414 * (Cb - 128) - 0.71414 * (Cr - 128);
  const b = Y + 1.77200 * (Cb - 128);
  const clamp = v => Math.max(0, Math.min(255, Math.round(v)));
  return [clamp(r), clamp(g), clamp(b)];
}

// Test roundtrip on a few colours
const colours = [
  [255, 0,   0  ], // red
  [0,   255, 0  ], // green
  [0,   0,   255], // blue
  [180, 100, 40 ], // warm orange
  [200, 200, 200], // grey
];
for (const [r,g,b] of colours) {
  const [Y,Cb,Cr] = rgbToYCbCr(r,g,b);
  const [r2,g2,b2] = yCbCrToRGB(Y,Cb,Cr);
  console.log(`RGB(${String(r).padStart(3)},${String(g).padStart(3)},${String(b).padStart(3)})`
    + ` → Y=${String(Y).padStart(3)} Cb=${String(Cb).padStart(3)} Cr=${String(Cr).padStart(3)}`
    + ` → RGB(${r2},${g2},${b2}) ✓`);
}

// Key insight: Y carries all the perceived brightness
// Two pixels can have the same Y but look very different in colour
const [Yr,Cbr,Crr] = rgbToYCbCr(255,0,0);
const [Yg,Cbg,Crg] = rgbToYCbCr(0,255,0);
const [Yb,Cbb,Crb] = rgbToYCbCr(0,0,255);
console.log(`\nLuma (Y) of pure colours:`);
console.log(`  Red:   Y=${Yr}  (low — red looks dark in B&W)`);
console.log(`  Green: Y=${Yg}  (high — green appears brightest)`);
console.log(`  Blue:  Y=${Yb}   (lowest — blue appears darkest)`);
§03

Chroma subsampling

Because the eye is far less sensitive to colour detail than to brightness detail, JPEG discards 75% of the colour data before any other compression takes place. In the most common mode 4:2:0, every 2×2 block of pixels shares a single Cb value and a single Cr value — but each pixel keeps its own Y value.

This step alone reduces data to 50% of the original (from 3 bytes/pixel to 1.5 bytes/pixel) with virtually invisible quality loss on photographic content.
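The same bytes-per-pixel arithmetic covers the other common modes too. A small sketch (`bytesPerPixel` is an illustrative helper, not part of the encoder):

```javascript
// Bytes per pixel under chroma subsampling.
// hFactor × vFactor = how many luma pixels share one Cb and one Cr sample.
function bytesPerPixel(hFactor, vFactor) {
  const luma   = 1;                        // every pixel keeps its own Y byte
  const chroma = 2 / (hFactor * vFactor);  // one Cb + one Cr byte per block
  return luma + chroma;
}

const modes = [
  { name: "4:4:4 (no subsampling)",    h: 1, v: 1 },
  { name: "4:2:2 (half horizontally)", h: 2, v: 1 },
  { name: "4:2:0 (half both ways)",    h: 2, v: 2 },
];

for (const m of modes) {
  const bpp = bytesPerPixel(m.h, m.v);
  console.log(`${m.name.padEnd(27)} ${bpp} bytes/pixel (${(bpp / 3 * 100).toFixed(0)}% of raw)`);
}
// 4:2:0 gives 1.5 bytes/pixel, 50% of raw, matching the figure above
```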

Subsampling modes — click to compare
chroma_subsample.js — 4:2:0 downsampling and upsampling
// Chroma subsampling: 4:2:0
// Discard 3 of every 4 chroma samples — nearly invisible to human vision

function subsample420(channel, width, height) {
  // Average each 2×2 block into one sample
  const outW = width  >> 1;  // half width
  const outH = height >> 1;  // half height
  const out  = new Float32Array(outW * outH);
  for (let y = 0; y < outH; y++) {
    for (let x = 0; x < outW; x++) {
      out[y*outW+x] = (
        channel[(y*2  ) * width + (x*2  )] +
        channel[(y*2  ) * width + (x*2+1)] +
        channel[(y*2+1) * width + (x*2  )] +
        channel[(y*2+1) * width + (x*2+1)]
      ) / 4;
    }
  }
  return out;
}

function upsample420(subsampled, outW, outH) {
  // Nearest-neighbour upsample (JPEG decoder does bilinear in practice)
  const inW = outW >> 1;
  const out = new Float32Array(outW * outH);
  for (let y = 0; y < outH; y++)
    for (let x = 0; x < outW; x++)
      out[y*outW+x] = subsampled[(y>>1)*inW + (x>>1)];
  return out;
}

// Simulate with a 4×4 grid of Cb values
const W = 4, H = 4;
const cbOriginal = Float32Array.from([
  128, 130, 100, 100,
  128, 130, 100, 100,
  140, 140, 110, 110,
  140, 140, 110, 110,
]);
const cbDown = subsample420(cbOriginal, W, H);
const cbUp   = upsample420(cbDown, W, H);

console.log("Original Cb (4×4):");
for (let r = 0; r < H; r++) console.log(" ", [...cbOriginal.slice(r*W,(r+1)*W)].join("  "));

console.log("\nSubsampled 4:2:0 (2×2):");
for (let r = 0; r < H/2; r++) console.log(" ", [...cbDown.slice(r*W/2,(r+1)*W/2)].map(v=>v.toFixed(1)).join("  "));

console.log("\nUpsampled back to 4×4 (reconstruction):");
for (let r = 0; r < H; r++) console.log(" ", [...cbUp.slice(r*W,(r+1)*W)].map(v=>v.toFixed(1)).join("  "));

const origBytes = W * H * 3;    // Y + Cb + Cr
const subBytes  = W * H + 2 * (W/2) * (H/2); // Y full, Cb/Cr halved each dim
console.log(`\nData: ${origBytes} bytes → ${subBytes} bytes (${(subBytes/origBytes*100).toFixed(0)}% retained)`);
§04

8×8 blocks and the DCT

Each channel is divided into non-overlapping 8×8 blocks (64 pixels). Each block is transformed by the Discrete Cosine Transform (DCT), which converts the block from the spatial domain (pixel values) into the frequency domain (how much of each spatial frequency is present).

Intuitively, the spatial domain is just the block as you would normally look at it: this pixel is bright, that pixel is darker, this corner changes quickly. The frequency domain asks a different question: how much of this block is smooth overall, how much is a gentle left-to-right fade, how much is a fine checkerboard-like texture? Instead of describing the block pixel by pixel, JPEG describes it as a mixture of simple visual patterns, from broad smooth shading to tiny rapid changes. That is useful because photographs usually contain much more low-frequency structure than high-frequency detail, so JPEG can keep the broad patterns and coarsen the tiny ones.

The output is also an 8×8 grid of 64 numbers called DCT coefficients. The top-left coefficient (DC) represents the average brightness of the block. The others (AC) represent increasingly fine detail from left-to-right and top-to-bottom.

The DCT doesn't compress anything — it just rearranges the information. The compression happens in the next step (quantisation). The point of the DCT is to make the high-frequency detail coefficients small, so quantisation can discard them aggressively.
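A toy 1D version makes the spatial/frequency split concrete before the full 8×8 code below. This 4-point orthonormal DCT (`dct1d` is an illustrative helper, not the JPEG transform) shows a smooth ramp concentrating its energy in the low coefficients, while a rapidly alternating signal concentrates it in the high ones:

```javascript
// 1D orthonormal DCT-II on a short signal: a toy illustration of
// "spatial domain vs frequency domain", not the 8×8 JPEG transform.
function dct1d(signal) {
  const N = signal.length;
  return signal.map((_, k) => {
    const c = k === 0 ? Math.sqrt(1 / N) : Math.sqrt(2 / N);
    let sum = 0;
    for (let n = 0; n < N; n++)
      sum += signal[n] * Math.cos(((2 * n + 1) * k * Math.PI) / (2 * N));
    return c * sum;
  });
}

const smoothRamp   = [10, 20, 30, 40]; // gradual change: low-frequency energy
const checkerboard = [40, 10, 40, 10]; // rapid alternation: mostly high-frequency energy

console.log("ramp:   ", dct1d(smoothRamp).map(v => v.toFixed(1)).join("  "));
console.log("checker:", dct1d(checkerboard).map(v => v.toFixed(1)).join("  "));
```

Both signals share the same DC coefficient (same average), but the ramp's detail lives almost entirely in coefficient 1 while the checkerboard's lives mostly in coefficient 3.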
DCT transform — enter an 8×8 block, see the frequency coefficients
Input block (pixel values 0–255)
DCT coefficients (warm=+, cool=−)
DC (top-left): —
Right = finer left-to-right change · Down = finer top-to-bottom change
dct.js — the 2D Discrete Cosine Transform from scratch
// 2D DCT-II — the exact transform used in JPEG
// Input: 8×8 block of pixel values (shifted: 0→255 becomes -128→127)
// Output: 8×8 block of frequency coefficients

const N = 8;
const C = u => u === 0 ? 1/Math.sqrt(2) : 1; // normalisation factor

function dct2d(block) {
  // block is a flat Float32Array of 64 values, row-major
  const out = new Float32Array(64);
  // overall factor 0.25 below is 2/N for N=8 (orthonormal scaling)
  for (let v = 0; v < N; v++) {
    for (let u = 0; u < N; u++) {
      let sum = 0;
      for (let y = 0; y < N; y++) {
        for (let x = 0; x < N; x++) {
          sum += block[y*N+x]
            * Math.cos(((2*x+1)*u*Math.PI) / (2*N))
            * Math.cos(((2*y+1)*v*Math.PI) / (2*N));
        }
      }
      out[v*N+u] = 0.25 * C(u) * C(v) * sum;
    }
  }
  return out;
}

function idct2d(coeffs) {
  const out = new Float32Array(64);
  for (let y = 0; y < N; y++) {
    for (let x = 0; x < N; x++) {
      let sum = 0;
      for (let v = 0; v < N; v++)
        for (let u = 0; u < N; u++)
          sum += C(u) * C(v) * coeffs[v*N+u]
            * Math.cos(((2*x+1)*u*Math.PI) / (2*N))
            * Math.cos(((2*y+1)*v*Math.PI) / (2*N));
      out[y*N+x] = 0.25 * sum;
    }
  }
  return out;
}

// Test: transform a simple block and check roundtrip
const testBlock = Float32Array.from([
  52, 55, 61, 66, 70, 61, 64, 73,
  63, 59, 55, 90, 109,85, 69, 72,
  62, 59, 68, 113,144,104,66, 73,
  63, 58, 71, 122,154,106,70, 69,
  67, 61, 68, 104,126,88, 68, 70,
  79, 65, 60, 70, 77, 68, 58, 75,
  85, 71, 64, 59, 55, 61, 65, 83,
  87, 79, 69, 68, 65, 76, 78, 94,
].map(v => v - 128));  // level-shift: subtract 128

const coeffs   = dct2d(testBlock);
const restored = idct2d(coeffs);

console.log("DCT coefficients (first row):");
console.log(" ", [...coeffs.slice(0,8)].map(v=>v.toFixed(1)).join("  "));
console.log(`\nDC coefficient: ${coeffs[0].toFixed(2)}`);
console.log(`Expected (8 × mean of level-shifted block): ${((testBlock.reduce((a,b)=>a+b)/64)*8).toFixed(1)}`);

// Verify roundtrip: IDCT(DCT(x)) should recover x
const maxErr = Math.max(...testBlock.map((v,i) => Math.abs(v - restored[i])));
console.log(`\nRoundtrip max error: ${maxErr.toFixed(6)} (should be ~0)`);
§05

DCT basis functions

The DCT represents each 8×8 block as a weighted sum of 64 basis patterns. Each basis pattern is a specific spatial frequency — from the DC average (pattern 0,0) to a rapidly-oscillating checkerboard (pattern 7,7). The DCT coefficient for each pattern is the weight.

Click any basis pattern below to see it and the coefficient it receives in the example block. The fact that natural images have most energy in low-frequency patterns is exactly why JPEG achieves high compression.
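The patterns themselves come straight from the cosine product in the DCT formula. A sketch (`basisPattern` is an illustrative helper name):

```javascript
// Generate the (u,v) DCT basis pattern for an 8×8 block.
// Entry (x,y) is that basis function's contribution at pixel (x,y).
const N = 8;
const C = k => (k === 0 ? 1 / Math.sqrt(2) : 1);

function basisPattern(u, v) {
  const pattern = new Float32Array(64);
  for (let y = 0; y < N; y++)
    for (let x = 0; x < N; x++)
      pattern[y * N + x] = 0.25 * C(u) * C(v)
        * Math.cos(((2 * x + 1) * u * Math.PI) / (2 * N))
        * Math.cos(((2 * y + 1) * v * Math.PI) / (2 * N));
  return pattern;
}

// Pattern (0,0) is flat: every entry is 0.125 (the DC average)
console.log("DC basis entry:", basisPattern(0, 0)[0].toFixed(3));

// Pattern (7,7) alternates sign every pixel: the finest checkerboard
const hi77 = basisPattern(7, 7);
console.log("(7,7) row 0 signs:", [...hi77.slice(0, 8)].map(v => (v > 0 ? "+" : "-")).join(""));

// The 64 patterns are orthonormal: the DCT is a change of basis, not compression
const dot = (a, b) => a.reduce((s, v, i) => s + v * b[i], 0);
console.log("distinct patterns, dot product:", dot(basisPattern(1, 0), basisPattern(2, 3)).toFixed(6));
console.log("same pattern, dot product:     ", dot(basisPattern(1, 0), basisPattern(1, 0)).toFixed(6));
```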

All 64 DCT basis patterns — click to inspect
§06

Quantisation — the lossy step

Each DCT coefficient is divided by a number from the quantisation table and rounded to the nearest integer. This is the only irreversible step in JPEG — the one where information is permanently discarded.

The quantisation table is just an 8×8 grid of divisor values, one for each position in the 8×8 DCT coefficient block. JPEG is effectively saying: “for this low-frequency coefficient, divide by a small number and keep it fairly accurately; for this high-frequency coefficient, divide by a larger number and round more aggressively.”

The quantisation table has larger values for high-frequency coefficients (bottom-right of the grid), discarding fine detail aggressively, and small values for low-frequency coefficients (top-left), preserving smooth gradients. The JPEG quality setting scales this table: at Q=100 every divisor shrinks to ~1 (nearly lossless); at Q=1 the base table is multiplied by 50, clamped at 255 (heavily lossy).

Quantisation — slide quality and see which coefficients survive
DCT coefficients
Quantisation table
After quantise
quantise.js — the quantisation step and its inverse
// Quantisation — the only lossy step in JPEG
// JPEG standard luminance quantisation table (quality factor 50)
const LUMA_Q50 = [
  16, 11, 10, 16, 24, 40, 51, 61,
  12, 12, 14, 19, 26, 58, 60, 55,
  14, 13, 16, 24, 40, 57, 69, 56,
  14, 17, 22, 29, 51, 87, 80, 62,
  18, 22, 37, 56, 68, 109,103,77,
  24, 35, 55, 64, 81, 104,113,92,
  49, 64, 78, 87, 103,121,120,101,
  72, 92, 95, 98, 112,100,103,99,
];

function scaleQTable(baseTable, quality) {
  // IJG quality scaling formula
  const scale = quality < 50
    ? 5000 / quality
    : 200 - 2 * quality;
  return baseTable.map(v => Math.max(1, Math.min(255, Math.floor((v * scale + 50) / 100))));
}

function quantise(coeffs, qTable) {
  return coeffs.map((c, i) => Math.round(c / qTable[i]));
}

function dequantise(quantised, qTable) {
  return quantised.map((c, i) => c * qTable[i]);
}

// Compare different quality settings on the example coefficients
const exampleCoeffs = Float32Array.from([
  -415, -30,  -61,  27,   56,  -20,  -2,   0,
    4,  -22,  -40,  26,   14,   -6,  -3,   1,
  -47,   13,   57, -26,  -23,    3,   5,   0,
   -5,    7,   -4,  -1,    5,   -1,  -1,   0,
    1,   -2,    0,   0,    0,    0,   0,   0,
    0,    0,    0,   0,    0,    0,   0,   0,
    0,    0,    0,   0,    0,    0,   0,   0,
    0,    0,    0,   0,    0,    0,   0,   0,
]);

for (const Q of [90, 75, 50, 25]) {
  const qTable = scaleQTable(LUMA_Q50, Q);
  const quant  = quantise(exampleCoeffs, qTable);
  const zeros  = quant.filter(v => v === 0).length;
  const dequant= dequantise(quant, qTable);
  const maxErr = Math.max(...exampleCoeffs.map((v,i) => Math.abs(v - dequant[i])));
  console.log(`Q=${Q}: ${zeros}/64 zeros, max error=${maxErr.toFixed(0)}`);
  if (Q === 75) {
    console.log("  First row quantised:", [...quant.slice(0,8)].join("  "));
  }
}
§07

Zigzag scan and run-length encoding

After quantisation, the 8×8 grid of coefficients needs to be serialised into a 1D sequence for entropy coding. JPEG uses a zigzag order — starting at the DC coefficient (top-left) and sweeping diagonally toward the bottom-right corner — so that low-frequency coefficients come first and the long run of zeros (the discarded high-frequency detail) comes last.

This sequence of integers is then run-length encoded: long runs of zeros are replaced by (count, next-nonzero) pairs. A special EOB (end-of-block) symbol marks where the zeros start and never stop.

Zigzag traversal — watch the scan order
zigzag_rle.js — zigzag serialisation and run-length encoding
// Zigzag order: traversal pattern for an 8×8 grid
// Ensures DC first, then low→high frequency AC coefficients

const ZIGZAG = [
   0,  1,  8, 16,  9,  2,  3, 10,
  17, 24, 32, 25, 18, 11,  4,  5,
  12, 19, 26, 33, 40, 48, 41, 34,
  27, 20, 13,  6,  7, 14, 21, 28,
  35, 42, 49, 56, 57, 50, 43, 36,
  29, 22, 15, 23, 30, 37, 44, 51,
  58, 59, 52, 45, 38, 31, 39, 46,
  53, 60, 61, 54, 47, 55, 62, 63,
];

function zigzagSerialise(block) {
  return ZIGZAG.map(i => block[i]);
}

// Run-length encode the AC coefficients
// DC is handled separately (delta-coded across blocks)
// Format: (zeroes_before, value) pairs, then EOB = (0,0)
function rleAC(zigzagged) {
  const result = [];
  let zeroRun = 0;
  for (let i = 1; i < zigzagged.length; i++) { // skip DC at index 0
    if (zigzagged[i] === 0) {
      zeroRun++;
    } else {
      // ZRL (zero run length > 15): encode as (15, 0) repeated
      while (zeroRun > 15) { result.push([15, 0]); zeroRun -= 16; }
      result.push([zeroRun, zigzagged[i]]);
      zeroRun = 0;
    }
  }
  result.push([0, 0]); // EOB — end of block marker
  return result;
}

// Example: quantised block with many zeros
const quantised = [
  -26, -3,  -6,  2,  2, -1,  0,  0,
    0, -2,  -4,  1,  1,  0,  0,  0,
   -3,  1,   5, -1, -1,  0,  0,  0,
   -3,  1,   2, -1,  0,  0,  0,  0,
    1,  0,  -1,  0,  0,  0,  0,  0,
    0,  0,   0,  0,  0,  0,  0,  0,
    0,  0,   0,  0,  0,  0,  0,  0,
    0,  0,   0,  0,  0,  0,  0,  0,
];

const zzSeq = zigzagSerialise(quantised);
const rle   = rleAC(zzSeq);

console.log("DC coefficient:", zzSeq[0], "(encoded separately, delta from prev block)");
console.log("AC zigzag sequence:", zzSeq.slice(1).join(" "));
console.log("\nRLE pairs (zeros_before, value):");
rle.forEach(([z,v]) => {
  if (z===0 && v===0) console.log("  EOB — end of block");
  else console.log(`  (${z}, ${v})`);
});
console.log(`\n64 values → ${rle.length} RLE pairs = ${(rle.length/64*100).toFixed(0)}% of raw size`);
§08

Huffman entropy coding

The final step encodes the RLE pairs using Huffman coding, a lossless compression technique that assigns shorter bit patterns to more frequent symbols. In a typical JPEG block, small coefficients like (0, 1) or (0, -1) appear far more often than large ones — Huffman exploits this imbalance.

JPEG precomputes Huffman tables for typical image data (the standard tables are in the JPEG spec). The encoder scans the RLE pairs and emits variable-length bit strings. The decoder reads bits one at a time, walking a Huffman tree until it reaches a leaf.

Huffman coding — build a tree from symbol frequencies
huffman.js — build and encode/decode with a Huffman tree
// Huffman coding — lossless entropy compression
// Frequent symbols get short codes; rare ones get long codes

class HuffNode {
  constructor(sym, freq, left=null, right=null) {
    this.sym=sym; this.freq=freq; this.left=left; this.right=right;
  }
}

function buildHuffmanTree(freqMap) {
  let heap = [...freqMap].map(([s,f]) => new HuffNode(s,f));
  heap.sort((a,b) => a.freq - b.freq);
  while (heap.length > 1) {
    const a = heap.shift(), b = heap.shift();
    const merged = new HuffNode(null, a.freq+b.freq, a, b);
    const ins = heap.findIndex(n => n.freq >= merged.freq);
    heap.splice(ins === -1 ? heap.length : ins, 0, merged);
  }
  return heap[0];
}

function buildCodeTable(root) {
  const table = new Map();
  function walk(node, code) {
    if (node.sym !== null) { table.set(node.sym, code); return; }
    walk(node.left,  code + "0");
    walk(node.right, code + "1");
  }
  walk(root, "");
  return table;
}

function encode(symbols, table) {
  return symbols.map(s => table.get(s) ?? "?").join("");
}

function decode(bits, root) {
  const result = [];
  let node = root;
  for (const bit of bits) {
    node = bit === "0" ? node.left : node.right;
    if (node.sym !== null) { result.push(node.sym); node = root; }
  }
  return result;
}

// JPEG-like symbol frequencies from a typical luminance block
const freqs = new Map([
  ["EOB", 50], ["(0,1)", 30], ["(0,-1)", 25],
  ["(0,2)", 15], ["(1,1)", 12], ["(0,-2)", 10],
  ["(0,3)",  6], ["(2,1)",  4], ["(0,-3)",  3],
]);

const tree  = buildHuffmanTree(freqs);
const codes = buildCodeTable(tree);

console.log("Huffman code table:");
[...codes.entries()].sort((a,b) => a[1].length - b[1].length)
  .forEach(([sym, code]) => {
    const freq = freqs.get(sym);
    console.log(`  ${sym.padEnd(10)} freq=${String(freq).padStart(3)} → ${code.padEnd(10)} (${code.length} bits)`);
  });

// Encode a sample sequence
const seq = ["(0,1)", "(0,-1)", "(0,2)", "EOB"];
const bits = encode(seq, codes);
const back = decode(bits, tree);
console.log(`\nEncoded [${seq.join(",")}]:`);
console.log(`  Bits: ${bits} (${bits.length} bits)`);
console.log(`  Fixed 8-bit encoding: ${seq.length * 8} bits`);
console.log(`  Saving: ${seq.length*8 - bits.length} bits (${((1-bits.length/(seq.length*8))*100).toFixed(0)}%)`);
console.log(`  Decoded back: [${back.join(",")}] ✓`);
§09

Full encode/decode pipeline

Putting all eight steps together: take an 8×8 block of pixel values through the complete JPEG pipeline and back. This is what happens to every block in every channel of every JPEG file ever saved.

full_pipeline.js — encode and decode a complete 8×8 block
// Complete JPEG pipeline: pixels → DCT → quantise → zigzag → RLE
// Then the reverse: RLE decode → dezigzag → dequantise → IDCT → pixels

const N = 8;
const C = u => u === 0 ? 1/Math.sqrt(2) : 1;

const dct2d = block => {
  const out = new Float32Array(64);
  for (let v=0;v<N;v++) for (let u=0;u<N;u++) {
    let s=0;
    for (let y=0;y<N;y++) for (let x=0;x<N;x++)
      s+=block[y*N+x]*Math.cos((2*x+1)*u*Math.PI/(2*N))*Math.cos((2*y+1)*v*Math.PI/(2*N));
    out[v*N+u]=0.25*C(u)*C(v)*s;
  }
  return out;
};
const idct2d = coeffs => {
  const out = new Float32Array(64);
  for (let y=0;y<N;y++) for (let x=0;x<N;x++) {
    let s=0;
    for (let v=0;v<N;v++) for (let u=0;u<N;u++)
      s+=C(u)*C(v)*coeffs[v*N+u]*Math.cos((2*x+1)*u*Math.PI/(2*N))*Math.cos((2*y+1)*v*Math.PI/(2*N));
    out[y*N+x]=0.25*s;
  }
  return out;
};

const LUMA_Q50 = [
  16,11,10,16,24,40,51,61, 12,12,14,19,26,58,60,55,
  14,13,16,24,40,57,69,56, 14,17,22,29,51,87,80,62,
  18,22,37,56,68,109,103,77, 24,35,55,64,81,104,113,92,
  49,64,78,87,103,121,120,101, 72,92,95,98,112,100,103,99,
];

const ZIGZAG = [
   0, 1, 8,16, 9, 2, 3,10, 17,24,32,25,18,11, 4, 5,
  12,19,26,33,40,48,41,34, 27,20,13, 6, 7,14,21,28,
  35,42,49,56,57,50,43,36, 29,22,15,23,30,37,44,51,
  58,59,52,45,38,31,39,46, 53,60,61,54,47,55,62,63,
];

function scaleQ(q) {
  const s = q<50 ? 5000/q : 200-2*q;
  return LUMA_Q50.map(v => Math.max(1, Math.min(255, Math.floor((v*s+50)/100))));
}

function encodeBlock(pixels, quality) {
  const qTable   = scaleQ(quality);
  const shifted  = pixels.map(v => v - 128);
  const coeffs   = dct2d(Float32Array.from(shifted));
  const quant    = coeffs.map((c,i) => Math.round(c / qTable[i]));
  const zigzag   = ZIGZAG.map(i => quant[i]);
  const zeros    = quant.filter(v => v===0).length;
  return { quant, zigzag, zeros, qTable };
}

function decodeBlock(quant, qTable) {
  const dequant  = quant.map((c,i) => c * qTable[i]);
  const spatial  = idct2d(Float32Array.from(dequant));
  return spatial.map(v => Math.max(0, Math.min(255, Math.round(v + 128))));
}

// A real 8×8 luminance block from a photograph (the classic JPEG test block)
const original = [
   52, 55, 61, 66, 70, 61, 64, 73,
   63, 59, 55, 90,109, 85, 69, 72,
   62, 59, 68,113,144,104, 66, 73,
   63, 58, 71,122,154,106, 70, 69,
   67, 61, 68,104,126, 88, 68, 70,
   79, 65, 60, 70, 77, 68, 58, 75,
   85, 71, 64, 59, 55, 61, 65, 83,
   87, 79, 69, 68, 65, 76, 78, 94,
];

for (const Q of [95, 75, 50, 25]) {
  const enc       = encodeBlock(original, Q);
  const restored  = decodeBlock(enc.quant, enc.qTable);
  const maxErr    = Math.max(...original.map((v,i) => Math.abs(v - restored[i])));
  const mse       = original.reduce((s,v,i) => s+(v-restored[i])**2,0) / 64;
  const psnr      = 10 * Math.log10(255*255 / mse);
  console.log(`Q=${String(Q).padStart(3)}: ${enc.zeros}/64 zeros, maxErr=${String(maxErr).padStart(3)}, PSNR=${psnr.toFixed(1)}dB`);
}

console.log("\nRestored block at Q=75:");
const enc75 = encodeBlock(original, 75);
const res75 = decodeBlock(enc75.quant, enc75.qTable);
for (let r=0;r<8;r++) console.log(" ", [...res75.slice(r*8,(r+1)*8)].join("  "));
§10

Quality factor and tradeoffs

The JPEG quality setting (1–100) controls a single scalar that scales the quantisation table. At Q=100 every divisor shrinks to ~1, so almost no information is discarded. At Q=1 the base table is multiplied by 50 and clamped at 255, so nearly every coefficient rounds to zero. Web images are typically saved at Q=75–85.
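The effect of that scalar is easiest to see on a single divisor. A sketch using the same IJG formula as quantise.js in §06 (`scaledDivisor` is an illustrative helper):

```javascript
// IJG quality scaling applied to the top-left divisor (16) of the
// standard luminance quantisation table
function scaledDivisor(base, quality) {
  const scale = quality < 50 ? 5000 / quality : 200 - 2 * quality;
  return Math.max(1, Math.min(255, Math.floor((base * scale + 50) / 100)));
}

for (const Q of [100, 90, 75, 50, 25, 10, 1]) {
  console.log(`Q=${String(Q).padStart(3)} → divisor ${scaledDivisor(16, Q)}`);
}
// Q=100 → 1 (keep almost everything), Q=50 → 16 (the base table),
// Q=1 → 255 (clamped: nearly every coefficient rounds to zero)
```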

Quality vs. reconstruction error — interactive
Original block
Reconstructed at Q=75
PSNR (Peak Signal-to-Noise Ratio) above 40 dB is generally indistinguishable from the original. Most JPEG files at Q=75 achieve 38–42 dB. Q=50 gets 33–37 dB — visible artefacts on close inspection. Q=10 gets 25–30 dB — visible blocking and ringing.
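PSNR itself is one line of arithmetic over the mean squared error; a minimal sketch (the `psnr` helper mirrors the calculation in full_pipeline.js):

```javascript
// PSNR: peak signal (255) against mean squared reconstruction error, in dB
function psnr(original, restored) {
  const mse = original.reduce((s, v, i) => s + (v - restored[i]) ** 2, 0) / original.length;
  if (mse === 0) return Infinity; // identical blocks
  return 10 * Math.log10((255 * 255) / mse);
}

// A block that is off by exactly 1 at every pixel has MSE = 1
const a = Array.from({ length: 64 }, (_, i) => i);
const b = a.map(v => v + 1);
console.log(`off-by-one everywhere: ${psnr(a, b).toFixed(1)} dB`); // 48.1 dB
```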