Topics
Explore by topic
attention (2), autodiff (1), backpropagation (3), batch-normalization (1), batching (1), calculus (2), computational-graphs (1), custom-gradients (1), ecosystem (1), exploding-gradients (1), float32 (1), floating-point (1), forward-propagation (1), fundamentals (6), gradient-checking (2), gradient-descent (2), ieee-754 (1), inference (1), infinity (1), initialization (1), intermediate (2), jax (1), kv-cache (1), linear-regression (1), lstms (1), machine-epsilon (1), machine-learning (2), mcp (1), memory (1), memory-optimization (1), mixed-precision (1), mlp (1), multi-layer-perceptron (1), nan (1), neural-networks (1), number-representation (1), numerical-stability (2), optimization (5), perceptron (1), production (2), pytorch (1), quantization (1), residual-connections (1), rnns (1), rounding (1), standards (1), subnormals (1), tool-calling (1), transformers (1), ulp (1), vanishing-gradients (1), vjp (1), vllm (1)