Research Papers
A place to collect paper breakdowns by theme. Each entry will eventually link to a full blog post that explains the paper from first principles.
Optimization
SGD and variants
How simple stochastic gradient descent and its momentum-based cousins actually move on a loss surface.
(Later this will link to a /blog/[slug] breakdown.)
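As a rough illustration of the idea above, here is a minimal sketch of SGD with and without heavy-ball momentum on a toy quadratic loss. The loss, the `curvature` values, the Gaussian noise standing in for minibatch sampling, and all function names are assumptions made for this example; none of it comes from a specific paper.

```python
import random

def loss(w, curvature):
    # Toy quadratic bowl: 0.5 * sum(c_i * w_i^2). Assumed for illustration.
    return 0.5 * sum(c * wi * wi for c, wi in zip(curvature, w))

def noisy_grad(w, curvature, rng, noise=0.1):
    # Exact gradient plus Gaussian noise, standing in for minibatch sampling.
    return [c * wi + rng.gauss(0.0, noise) for c, wi in zip(curvature, w)]

def sgd_momentum(w0, curvature, lr=0.05, beta=0.9, steps=200, seed=0):
    # Heavy-ball update: v <- beta * v + g, then w <- w - lr * v.
    # Setting beta=0.0 recovers plain SGD.
    rng = random.Random(seed)
    w = list(w0)
    v = [0.0] * len(w)
    for _ in range(steps):
        g = noisy_grad(w, curvature, rng)
        v = [beta * vi + gi for vi, gi in zip(v, g)]
        w = [wi - lr * vi for wi, vi in zip(w, v)]
    return w

curvature = [1.0, 10.0]   # one shallow direction, one steep direction
start = [5.0, 5.0]
plain = sgd_momentum(start, curvature, beta=0.0)
heavy = sgd_momentum(start, curvature, beta=0.9)
print("plain SGD loss:", loss(plain, curvature))
print("momentum loss: ", loss(heavy, curvature))
```

Both runs should end far below the starting loss. Changing `lr`, `beta`, or the gap between the two curvature values is an easy way to see where momentum helps and where it starts to oscillate.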
Architectures (CNNs / Transformers)
Attention Is All You Need
The original Transformer paper: sequence modeling with attention instead of recurrence.
(Later this will link to a /blog/[slug] breakdown.)
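To make "attention instead of recurrence" concrete, here is a minimal single-head sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, in plain Python. The list-of-lists representation and the toy inputs are assumptions for illustration; the paper's multi-head projections, masking, and positional encodings are left out.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # Each query scores every key, and the softmax of those scores
    # weights a sum over the value vectors.
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        outputs.append([
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ])
    return outputs, weights

# Toy inputs (hypothetical): two positions with 2-dimensional vectors.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
outputs, weights = attention(queries, keys, values)
print(weights)
print(outputs)
```

Every position attends to every other position in one step, so no hidden state has to be carried along the sequence.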
Generative Models
Diffusion Models
Turning noise into structure with a forward noising process and a learned denoiser.
(Later this will link to a /blog/[slug] breakdown.)
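The forward noising process has a closed form that is easy to sketch: x_t = sqrt(a_t) * x_0 + sqrt(1 - a_t) * eps, where a_t is the cumulative product of (1 - beta). The example below assumes a DDPM-style linear beta schedule, and the default values and function names are illustrative choices. The learned denoiser is not implemented; in this parameterization it would be a network trained to predict `eps` from `x_t` and `t`.

```python
import math
import random

def alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    # Cumulative product of (1 - beta_t) under an assumed linear schedule.
    bars, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
        bars.append(prod)
    return bars

def q_sample(x0, t, bars, rng):
    # Jump straight from x_0 to x_t; also return the noise that was used,
    # since that is what a denoiser would be trained to predict.
    a = bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps

rng = random.Random(0)
bars = alpha_bars()
x0 = [1.0, -0.5, 0.25]          # hypothetical "clean" sample
xt, eps = q_sample(x0, 500, bars, rng)
print("signal kept at t=500:", math.sqrt(bars[500]))
print("signal kept at t=999:", math.sqrt(bars[-1]))
```

As `t` grows the signal coefficient shrinks toward zero, so late steps are almost pure noise. Generation runs the process in reverse, one denoising step at a time.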
RAG & LLM Systems
Retrieval-Augmented Generation
Combining parametric knowledge in an LLM with non-parametric retrieval over external data.
(Later this will link to a /blog/[slug] breakdown.)
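The retrieval half of this pattern can be sketched without any model at all. The example below uses bag-of-words cosine similarity as a stand-in for a learned embedding model and stops at building the prompt. The documents, the prompt template, and all function names are hypothetical, and the LLM call itself is deliberately left out.

```python
import math
from collections import Counter

def embed(text):
    # Toy "embedding": lowercase word counts. A real system would use a
    # learned dense encoder here; this is an assumption for illustration.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    # Non-parametric step: rank external documents against the query.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, docs, k=2):
    # The retrieved text is placed in the prompt so the model's parametric
    # knowledge can be combined with it at generation time.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "Momentum accumulates past gradients to smooth SGD updates.",
    "Diffusion models learn to reverse a gradual noising process.",
    "Transformers replace recurrence with attention over the sequence.",
]
query = "how do diffusion models use noising"
prompt = build_prompt(query, docs, k=1)
print(prompt)
```

The resulting `prompt` string is what would be sent to the language model. Swapping `embed` for a real encoder and `docs` for an indexed corpus keeps the same overall shape.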