Show HN: TurboQuant-WASM – Google's vector quantization in the browser

6.8 relevance

TurboQuant-WASM brings Google's quantization to browsers; novel but niche for web ML.

2026-04-05 open/source Hacker News (100+)

TurboQuant WASM SIMD vector compression: 3 bits/dim with fast dot product. Requires relaxed SIMD (Chrome 114+, Firefox 128+, Safari 18+, Node 20+). Repository: teamchong/turboquant-wasm

Summary

TurboQuant-WASM implements Google's ICLR 2026 TurboQuant algorithm in WebAssembly, compressing float32 embeddings roughly 6x (1.5 GB to 240 MB) without any training step. Dot-product search runs directly on the compressed data through a TypeScript API that includes batch operations such as dotBatch. The library requires a runtime with relaxed SIMD support; the project lists Chrome 114+, Firefox 128+, Safari 18+, and Node.js 20+.
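
The size figures above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the helper names are hypothetical, decimal units (1 GB = 1000 MB) are assumed, and nothing here calls the library. It compares the quoted 1.5 GB to 240 MB reduction with the ideal ratio for a pure 3 bits/dim payload.

```typescript
// Storage arithmetic for quantized embeddings.
// Helper names are hypothetical; they are not part of TurboQuant-WASM.
const FLOAT32_BITS = 32;

/** Ideal compression ratio if every dimension used exactly `bitsPerDim` bits. */
function idealRatio(bitsPerDim: number): number {
  return FLOAT32_BITS / bitsPerDim;
}

/** Effective bits per dimension implied by an observed compression ratio. */
function effectiveBitsPerDim(ratio: number): number {
  return FLOAT32_BITS / ratio;
}

// Figures quoted in the summary: 1.5 GB down to 240 MB (decimal units assumed).
const observedRatio = 1500 / 240;

console.log(observedRatio);                                  // 6.25
console.log(idealRatio(3).toFixed(2));                       // "10.67"
console.log(effectiveBitsPerDim(observedRatio).toFixed(2));  // "5.12"
```

The quoted reduction works out to about 5 effective bits per dimension, not 3. The source does not say what accounts for the difference, so the two figures are best treated as separate claims to verify against the repository.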

Key Takeaway

Evaluate TurboQuant-WASM for compressing and querying embeddings in web-based ML services to cut storage and latency.
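
One way to run such an evaluation is to compare search results from the compressed index against an exact float32 baseline. The sketch below is a minimal, library-independent harness under that assumption: the function names are hypothetical, and the approximate scores are passed in as a plain array because the source names only dotBatch and does not document its signature.

```typescript
/** Indices of the k largest scores, highest first. */
function topK(scores: Float32Array, k: number): number[] {
  return Array.from({ length: scores.length }, (_, i) => i)
    .sort((a, b) => scores[b] - scores[a])
    .slice(0, k);
}

/** Exact float32 dot product of a query against each row of a flat row-major matrix. */
function exactScores(query: Float32Array, rows: Float32Array, dim: number): Float32Array {
  const n = rows.length / dim;
  const out = new Float32Array(n);
  for (let i = 0; i < n; i++) {
    let sum = 0;
    for (let j = 0; j < dim; j++) sum += query[j] * rows[i * dim + j];
    out[i] = sum;
  }
  return out;
}

/** Fraction of the exact top-k that also appears in the approximate top-k. */
function recallAtK(exact: Float32Array, approx: Float32Array, k: number): number {
  const truth = new Set(topK(exact, k));
  const hits = topK(approx, k).filter((i) => truth.has(i)).length;
  return hits / k;
}

// Three 2-dimensional vectors and one query (toy data).
const dim = 2;
const rows = new Float32Array([1, 0, 0, 1, 1, 1]);
const query = new Float32Array([1, 0.5]);
const exact = exactScores(query, rows, dim); // scores: 1, 0.5, 1.5

// Stand-in for scores returned by a compressed index (made-up values).
const approx = new Float32Array([0.9, 0.6, 1.4]);
console.log(recallAtK(exact, approx, 2)); // 1
```

Running the same comparison over a sample of real queries, alongside measured index size and query time, gives a concrete basis for deciding whether the compression is acceptable for a given workload.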

Why it matters

TurboQuant-WASM enables efficient vector search in browser and serverless AI applications, reducing memory and bandwidth costs for high-dimensional embeddings without a training step.