All Posts

AI infrastructure, DC fabric design, and the CCDE journey. Published every two weeks.

DC Fabric Design
Why AI Training Clusters Need Lossless Ethernet — And What That Means for Your Fabric
GPUs in a training cluster exchange data almost constantly during collective operations, and every step waits on the slowest transfer. When packets drop, training jobs stall. Here's why lossless Ethernet is non-negotiable and what it takes to actually build it.
2026-05-08 · 8 min read