2024-09-02 20:08
Something that’s often missed is the difference between theory and engineering. LLMs have relatively little theory. The principles of operation are fairly simple, and a few months is plenty to get fluent. But there’s a HUGE amount of engineering involved in building a useful model: a zillion small details that make the difference between mode collapse/divergence and actual good output. And those details take much longer to become fluent in, and they demand continuous reading and research…