I’ve spent the last 3 months building Kandle because I was frustrated with the status quo of ML on the Web.
Currently, if you want to run models in the browser, you're mostly stuck with ONNX Runtime or WebLLM. They're incredible for production, but they're black boxes: you have almost zero control over the intermediate tensors, and implementing custom logic or hooks between layers is a nightmare.
I missed the "PyTorch vibe": the transparency and flexibility of eager mode. TensorFlow.js exists, but its API has always felt "off" to those of us coming from the Torch ecosystem.
So I built Kandle from scratch. It’s a Web-native ML framework designed to bring the true PyTorch experience to JavaScript/TypeScript:
* Deeply aligned with PyTorch's ATen/c10 architecture: I've implemented a complete tensor system with stride-based memory layout, broadcasting, and zero-copy view operations like transpose, permute, and slice.
* Whitebox framework: Unlike static graph engines, Kandle runs in eager mode. You can pause execution at any layer, inspect every tensor, and register forward hooks (see the sketch after this list).
* Ecosystem: It includes 200+ operators, a full nn.Module system, and partial torchaudio functionality for complex pre-processing.
* Native Safetensors support: Load weights directly, without painful conversions.
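To give a feel for what that looks like in practice, here is a minimal sketch of the eager workflow. The names (`kd.randn`, `kd.nn.Linear`, `registerForwardHook`) are my assumption of a PyTorch-mirroring surface, not the verified API; check the repo for the real one.

```ts
import * as kd from 'kandle';

// Eager mode: every operation materializes immediately, nothing is
// deferred into a static graph.
const x = kd.randn([2, 4, 8]);
const xt = x.transpose(1, 2); // zero-copy view: same buffer, new strides

// Whitebox: hook into a layer and inspect the intermediate tensor mid-forward.
const proj = new kd.nn.Linear(8, 8);
proj.registerForwardHook((_module, _input, output) => {
  console.log('proj output shape:', output.shape);
});
const y = proj.forward(x);
```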
It’s still in the early stages (Autograd is coming next!), but the forward pass is stable enough to run models like Qwen3 and Whisper.
Interactive Demo: http://kandle-demo.vercel.app
I built a "Logit Lens" and "Attention Link" visualizer in the demo to show the unique advantage of using a whitebox framework for model interpretability in the browser. You can literally see the model's "thoughts" evolve through each layer.
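Under the hood, the Logit Lens is just a forward hook per layer that decodes the intermediate hidden state through the output head. A hedged sketch, with the same caveat as above (`registerForwardHook`, the `finalNorm` and `lmHead` parameters, and the hook signature are illustrative, not the demo's actual code):

```ts
import * as kd from 'kandle';

// Collect per-layer logits: "what would the model say if it stopped here?"
function attachLogitLens(
  layers: kd.nn.Module[],   // the model's decoder blocks
  finalNorm: kd.nn.Module,  // final norm before the head
  lmHead: kd.nn.Module,     // unembedding projection to vocab logits
): kd.Tensor[] {
  const perLayerLogits: kd.Tensor[] = [];
  for (const layer of layers) {
    layer.registerForwardHook((_module, _input, hidden) => {
      // Decode an intermediate activation exactly like the final one.
      perLayerLogits.push(lmHead.forward(finalNorm.forward(hidden)));
    });
  }
  return perLayerLogits; // filled during the next forward pass
}
```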
Note for the Demo: Since quantization is not yet implemented, the demo currently only supports the original Qwen3-0.6B pre-trained weights in bf16 format (model.safetensors).
GitHub: https://github.com/final-kk/kandle
I’m really curious to hear what you think. Does the JS ecosystem need a "Torch standard" API, or is the current "Inference-only" path enough?
However, it's hard to put aside "API preference" here, because the API is the core feature. The real value of Kandle isn't just the syntax; it's the workflow compatibility.
For example, when I implemented Qwen3 and Whisper, I could practically "copy-paste" the logic from the official Hugging Face transformers Python implementation into TypeScript. You don't have to re-think the model as a static graph or adapt to a different paradigm: if it works in PyTorch, you already know how to build it in Kandle.
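To make that concrete, here is what the translation looks like for a Qwen3-style RMSNorm. The structure maps one-to-one onto the Python reference; the tensor method names (`pow`, `mean`, `add`, `sqrt`, `div`, `mul`) are again my assumption of a PyTorch-mirroring surface:

```ts
import * as kd from 'kandle';

// Qwen3-style RMSNorm, transcribed nearly line-for-line from the Hugging
// Face Python implementation (hypothetical Kandle API, PyTorch semantics).
class RMSNorm extends kd.nn.Module {
  weight: kd.Tensor;
  eps: number;

  constructor(hiddenSize: number, eps = 1e-6) {
    super();
    this.weight = kd.ones([hiddenSize]);
    this.eps = eps;
  }

  forward(x: kd.Tensor): kd.Tensor {
    const variance = x.pow(2).mean(-1, true);    // mean over last dim, keepDim
    return x.div(variance.add(this.eps).sqrt())  // i.e. x * rsqrt(var + eps)
      .mul(this.weight);
  }
}
```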
Beyond that, Kandle is aiming for a "batteries-included" ecosystem. We already have built-in support for Safetensors and torchaudio transforms, so you can handle the entire pipeline, from loading weights to audio pre-processing (like mel spectrograms), without leaving the framework.
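A sketch of that pipeline, with the usual caveat that `loadSafetensors`, `loadStateDict`, and `kd.audio.MelSpectrogram` are assumed names based on the features above:

```ts
import * as kd from 'kandle';

// Weights in, model-ready audio features out, all inside the framework.
async function prepareWhisperInput(model: kd.nn.Module, pcm: Float32Array) {
  // Parse model.safetensors directly; no Python-side conversion step.
  const weights = await kd.loadSafetensors('model.safetensors');
  model.loadStateDict(weights);

  // torchaudio-style transform: mel features for Whisper's encoder.
  const melSpec = new kd.audio.MelSpectrogram({ sampleRate: 16000, nMels: 80 });
  return melSpec.forward(kd.tensor(pcm));
}
```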
So while jax-js is great for high-performance numerical apps, Kandle is for the developer who wants to bridge the gap between Python research and Web deployment with zero cognitive overhead.