DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale

Posted: Thu Jun 16, 2022 7:11 pm
by henrywu

https://arxiv.org/pdf/2201.05596.pdf

As the training of giant dense models hits the boundary on the availability and capability of the hardware resources today, Mixture-of-Experts (MoE) models become one of the most promising model architectures due to their significant training cost reduction compared to a quality-equivalent dense model.
