Neural Network Playground
Build, train, and visualize multi-layer perceptrons in real time. Experiment with architectures, datasets, and hyperparameters to develop intuition for how neural networks learn.
Interactive visualizations for building intuition about deep learning concepts.
A small visible-window KV-cache recall trace for seeing how FIFO scheduling leaves other requests' work queued ahead of a request's tail, and how a request-aware dispatch heuristic changes that.
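The contrast the trace visualizes can be reduced to a toy dispatch model. This is a minimal sketch (the request sizes, round-robin FIFO policy, and `foreign_ahead_of_tail` metric are illustrative assumptions, not the demo's actual internals):

```python
from collections import deque

def dispatch_fifo(requests):
    # Round-robin FIFO: interleave one token from each request per step,
    # so cache entries from different requests end up interleaved.
    queues = [deque((rid, t) for t in range(n)) for rid, n in requests]
    trace = []
    while any(queues):
        for q in queues:
            if q:
                trace.append(q.popleft())
    return trace

def dispatch_request_aware(requests):
    # Request-aware: drain each request contiguously, keeping its
    # cache entries adjacent in the window.
    return [(rid, t) for rid, n in requests for t in range(n)]

def foreign_ahead_of_tail(trace, rid):
    # Count entries from other requests sitting between rid's first
    # entry and its tail in the cache window.
    idx = [i for i, (r, _) in enumerate(trace) if r == rid]
    return sum(1 for r, _ in trace[idx[0]:idx[-1]] if r != rid)

reqs = [("A", 4), ("B", 4)]
print(foreign_ahead_of_tail(dispatch_fifo(reqs), "A"))           # 3
print(foreign_ahead_of_tail(dispatch_request_aware(reqs), "A"))  # 0
```

Under FIFO, three of B's entries land between A's head and tail; under request-aware dispatch, none do.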
A 10×10 deterministic gridworld for watching Bellman backups and greedy policy improvement solve a known MDP. Pick a route template or click any cell to choose the destination, then scrub value iteration one sweep at a time.
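One "sweep" of the scrubber corresponds to a single synchronous Bellman backup over the grid. A minimal sketch, assuming a -1 per-step reward, γ = 0.95, an absorbing goal, and wall-clamped moves (all illustrative choices, not necessarily the demo's exact parameters):

```python
import itertools

N = 10
GOAL = (9, 9)        # destination cell; the demo lets you click any cell
GAMMA = 0.95
STEP_REWARD = -1.0   # assumed per-step cost; the goal is absorbing

MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}

def step(s, a):
    # Deterministic transition, clamped at the grid edges.
    r, c = s
    dr, dc = MOVES[a]
    return (max(0, min(N - 1, r + dr)), max(0, min(N - 1, c + dc)))

def sweep(V):
    # One synchronous Bellman backup over every non-goal cell.
    newV = dict(V)
    for s in itertools.product(range(N), range(N)):
        if s != GOAL:
            newV[s] = max(STEP_REWARD + GAMMA * V[step(s, a)] for a in MOVES)
    return newV

V = {s: 0.0 for s in itertools.product(range(N), range(N))}
for _ in range(30):      # each call to sweep == one tick of the scrubber
    V = sweep(V)

greedy = max(MOVES, key=lambda a: V[step((9, 8), a)])
print(greedy)  # "R": the cell beside the goal greedily moves into it
```

Greedy policy improvement is just the `max` in the last line: pick the action whose successor has the highest value.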
Pick a preset or dial in V, d, L, T, B, and dtype to see the real tax of vocabulary on a transformer: embedding parameters, untied LM head, logits FLOPs per token, and KV-cache bytes.
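The quantities behind those dials are all back-of-envelope formulas. A sketch of the arithmetic (the `vocab_tax` name and the Llama-2-7B-ish preset numbers are assumptions for illustration):

```python
def vocab_tax(V, d, L, T, B, dtype_bytes=2):
    """V: vocab size, d: model width, L: layers, T: context length,
    B: batch size, dtype_bytes: 2 for fp16/bf16."""
    embed_params = V * d                         # input embedding table
    lm_head_params = V * d                       # untied output projection
    logits_flops = 2 * d * V                     # one (d x V) matmul per token
    kv_bytes = 2 * L * T * d * B * dtype_bytes   # K and V, per layer
    return embed_params, lm_head_params, logits_flops, kv_bytes

# Llama-2-7B-ish preset (illustrative numbers)
e, h, f, kv = vocab_tax(V=32_000, d=4096, L=32, T=4096, B=1)
print(f"{e/1e6:.0f}M embed params, {f/1e9:.2f} GFLOPs/token on logits, "
      f"{kv/2**30:.1f} GiB KV cache")
```

Note that the KV-cache term does not depend on V at all; the demo shows it alongside the vocabulary costs so the relative scale is visible.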
Pick from twenty-nine production tokenizers — tiktoken, Llama, Gemma, Mistral, Qwen, DeepSeek, Cohere, Phi, Yi — paste any text, and watch every model carve it into pieces. Click a token to walk its BPE merge tree.
Pin two to four tokenizers against the same input and watch where they carve it differently. Highlighted pills mark the spans only some encoders share; the bar at the top prints counts plus a spread so the multilingual penalty becomes a number.
Paste a paragraph, dial in a vocabulary size, and watch Karpathy's minBPE mint one merge token at a time. Each learned token gets compared byte-for-byte against GPT-4's cl100k_base so you can see where your forty-byte trainer agrees with the production tokenizer — and where it goes its own way.
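The training loop itself fits in a few lines. A minimal byte-level BPE trainer in the spirit of minBPE (a sketch, not Karpathy's actual code): repeatedly merge the most frequent adjacent pair until the vocabulary reaches the target size.

```python
from collections import Counter

def train_bpe(text, vocab_size):
    # Start from raw UTF-8 bytes; ids 0-255 are the base vocabulary.
    ids = list(text.encode("utf-8"))
    merges = {}                      # (left, right) -> new token id
    next_id = 256
    while next_id < vocab_size:
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]   # most frequent adjacent pair
        merges[(a, b)] = next_id
        # Replace every occurrence of the pair with the new token.
        out, i = [], 0
        while i < len(ids):
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == (a, b):
                out.append(next_id)
                i += 2
            else:
                out.append(ids[i])
                i += 1
        ids = out
        next_id += 1
    return merges, ids

merges, ids = train_bpe("low lower lowest low low", vocab_size=260)
print(len(merges))  # 4 learned merge tokens (ids 256-259)
```

Each entry in `merges` is one "minted" token; the demo's byte-for-byte comparison against cl100k_base checks whether the same span ends up as a single token in GPT-4's vocabulary.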