Sep 30, 2025 · 110 min read · foundation
Gradient Descent: Theory, Mathematics, and Implementation
Why does walking downhill in parameter space solve everything from linear regression to GPT? A rigorous treatment of gradient descent: convergence theory, variants and the challenges of real-world optimization.