DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale

Machine Learning Paper Review and Discussion


henrywu
Posts: 202
Joined: Sun Apr 17, 2022 4:57 pm


Post by henrywu »

https://arxiv.org/pdf/2201.05596.pdf

As the training of giant dense models hits the boundary on the availability and capability of the hardware resources …
