LLMs contain a LOT of parameters. But what’s a parameter?
When a model is trained, each word in its vocabulary is assigned a numerical value that captures the meaning of that word in relation to all the other words, based on how the word appears in countless examples across the model’s training data. Each word gets replaced by a kind of code? Yeah. But there’s…


