Blogs
Understanding NanoChat Model Parameters: A Complete Breakdown
The NanoChat d20 model has 560,988,160 parameters (≈561M). But how do we arrive at this number? This post breaks down the exact calculations behind the model architecture.
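As a quick preview of the breakdown, the headline figure can be reproduced with a short back-of-the-envelope Python sketch. The configuration below is an assumption (20 layers, model dim 1280, a 65,536-token vocabulary, untied embeddings, bias-free linear layers, parameter-free norms) that happens to land exactly on 560,988,160; see the full post for the precise architecture-by-architecture derivation.

```python
# Hypothetical back-of-the-envelope parameter count for a d20-style model.
# Assumed config (not a definitive restatement of NanoChat's code):
#   depth = 20, d_model = 64 * depth = 1280, vocab = 65,536,
#   untied embeddings, no biases, parameter-free norms, rotary positions.

def count_params(depth: int = 20, vocab_size: int = 65_536) -> int:
    d_model = 64 * depth                   # 1280 for d20
    embed = vocab_size * d_model           # token embedding table
    lm_head = vocab_size * d_model         # untied output projection
    attn = 4 * d_model * d_model           # Q, K, V, and output projections per layer
    mlp = 2 * d_model * (4 * d_model)      # up- and down-projection with 4x expansion
    return embed + lm_head + depth * (attn + mlp)

print(f"{count_params():,}")  # 560,988,160
```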
The Complete Guide to Training Large Language Models: A Three-Stage Pipeline
A comprehensive technical deep dive into the three-stage LLM training pipeline, with a detailed breakdown of each stage: pretraining, midtraining, and fine-tuning.