DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale

Machine Learning Paper Review and Discussion


henrywu
Posts: 202
Joined: Sun Apr 17, 2022 4:57 pm


Post by henrywu »

https://arxiv.org/pdf/2201.05596.pdf

As the training of giant dense models hits the boundary on the availability and capability of the hardware resources …
