Blog

Technical deep dives.

Articles on transformer architectures, model quantization, inference optimization, and the systems behind production AI.

Attention

A visual walkthrough of single-head and multi-head self-attention, with the matrix dimensions and operations involved.

Jan 2024

Quick refresher guides for data-related frameworks — from TensorFlow to PyTorch, simplified for rapid learning.

Feb 2024

A comprehensive guide to understanding and implementing cross-entropy in machine learning, from information theory to PyTorch.

Feb 2024

An essential guide to the foundational components of modern neural networks — activations, cost functions, optimizers, and regularization.

Feb 2024

A deep dive into model quantization — reducing model size while preserving accuracy. Covers symmetric and asymmetric approaches with code.

Jan 2024

Distilled insights from books on technology, AI, and engineering — the golden takeaways worth remembering.

Feb 2024

An object-oriented implementation of the complete Transformer architecture in PyTorch — from input embeddings to the full encoder-decoder model.

Jan 2024

A practical guide to coding the Transformer architecture from scratch, turning the 'Attention Is All You Need' paper into working code.

Jan 2024