Transformers, as I teach them
Strip away the math and what's left is surprisingly clean. Here's the mental model I use when I teach transformers, the architecture behind nearly every modern LLM, to students who've never written a line of machine learning code.