36 min read · foundation
Tokenizers from First Principles
Tokenization looks like preprocessing and behaves like architecture. From bytes to BPE to the cracks at the frontier, this is an argument that almost everything weird about LLMs starts at the atom you chose.