“The personality isn't in any single line of code. It's in the space between the data and the weights.” (Arman Hossain, creator of GuppyLM)
You know that feeling when you use ChatGPT and it feels like magic you can't explain? Most LLM tutorials either hand-wave the fundamentals or drown you in theory without code you can run. You've heard about transformers and tokenizers, but you've never watched a loss curve drop on a model you actually built, and that gap makes all of AI feel like a black box.
Training a language model is like teaching a child to finish your sentences: you show them thousands of examples, they spot patterns, and eventually they predict what comes next. GuppyLM does this with text. You feed it 60K fish conversations (synthetically generated from templates), and it learns which words tend to follow others. The architecture is a vanilla transformer (6 layers, 384 hidden dimensions, 4096-token vocabulary), small enough to train in 5 minutes. You run one Colab notebook, watch the loss drop from 10.0 to 0.384, and chat with your creation.
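To make "learns which words tend to follow others" concrete, here is a minimal sketch. It is not GuppyLM's code and it is not a transformer: it swaps in plain bigram counting, the simplest possible next-word predictor, and the toy sentences and function names are made up for illustration. The loss it computes is the same kind of number you watch fall during training: the average negative log-probability the model gives to the word that actually came next.

```python
import math
from collections import Counter, defaultdict


def train_bigram(conversations):
    """Count, for every word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    for text in conversations:
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        for current, nxt in zip(tokens, tokens[1:]):
            follows[current][nxt] += 1
    return follows


def predict_next(follows, token):
    """Return the most frequent follower of `token`, or None if never seen."""
    counts = follows.get(token.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def average_loss(follows, conversations):
    """Mean negative log-probability of each actual next word.

    Only valid for word pairs that were seen during training; an unseen
    pair would have probability zero.
    """
    total, n = 0.0, 0
    for text in conversations:
        tokens = text.lower().split()
        for current, nxt in zip(tokens, tokens[1:]):
            counts = follows[current]
            total += -math.log(counts[nxt] / sum(counts.values()))
            n += 1
    return total / n


# Hypothetical toy data standing in for the fish conversations.
toy_conversations = [
    "i like to swim in the tank",
    "i like to eat flakes",
    "the tank is my home",
]

model = train_bigram(toy_conversations)
print(predict_next(model, "like"))   # most common word after "like"
print(predict_next(model, "the"))    # most common word after "the"
print(round(average_loss(model, toy_conversations), 3))
```

A transformer replaces the lookup table with learned weights and looks at far more than one previous word, but the training signal is the same idea: push the loss down by assigning higher probability to the word that really comes next.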
If you've used GPT-4 but couldn't explain what a transformer actually does, this is your entry point. It suits developers who learn by doing, because you'll touch every layer of the stack. It is not for you if you need production-ready multi-turn chat or factual Q&A; the model is intentionally tiny and domain-locked.
Yes, it is worth your time, specifically if you want to demystify LLMs. The creator's honest documentation of failed experiments (system prompts, chain-of-thought, multi-turn) is worth more than the code itself. This is educational infrastructure, not production tooling. Expect to spend an afternoon and walk away with a hands-on understanding of tokenizers, embeddings, attention, and loss curves.