Blog

Technical deep dives.

Articles on transformer architectures, model quantization, inference optimization, and the systems behind production AI.

Attention

Frameworks

Fundamentals

Quantization

Quick Books

Transformers