Notes on inference systems, evaluation, deployment, and operational reliability.
Model size attracts attention, but inference infrastructure determines whether AI systems are usable, affordable, reliable, and scalable in production.