
GitHub - NVIDIA/Megatron-LM: Ongoing research training …
Megatron Core expands upon Megatron-LM's GPU-optimized techniques with cutting-edge system-level optimizations, exposed through composable and modular APIs.
Releases · NVIDIA/Megatron-LM - GitHub
Megatron-LM/megatron/core/QuickStart.md at main - GitHub
Megatron-LM/megatron/core/README.md at main - GitHub
ROCm/Stanford-Megatron-LM - GitHub
GitHub - epfLLM/Megatron-LLM: distributed trainer for LLMs
Our repository is a modification of the original Megatron-LM codebase by Nvidia. Key added features include supported architectures: Llama, Llama 2, Code Llama, Falcon, and Mistral …
GitHub - shizhengLi/megatron-learning: Distributed training framework Megatron …
This project focuses on learning and researching the Megatron-LM framework, covering a complete body of knowledge from basic concepts through advanced implementation techniques. Through systematic study and practice, it helps developers gain a deep understanding of the core prin… of large-scale language model training.
Megatron-DeepSpeed - GitHub
DeepSpeed version of NVIDIA's Megatron-LM that adds support for several features such as MoE model training, Curriculum Learning, 3D Parallelism, and others.
GitHub - Ascend/Megatron-LM
Megatron-LM overview · Preparing the training environment · Starting training · Training results · Release notes
alibaba/Pai-Megatron-Patch - GitHub
The design philosophy of Pai-Megatron-Patch is to avoid invasive modifications to the source code of Megatron-LM. In other words, it does not add new modules directly to Megatron-LM.
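The "patch, don't modify" philosophy described above can be sketched generically: instead of editing upstream source, a patch layer rebinds upstream symbols at import time. This is a minimal illustrative sketch, not Pai-Megatron-Patch's actual code; `upstream`, `build_layer`, and the `rotary` flag are all hypothetical stand-ins.

```python
import types

# Stand-in for an upstream module whose source we must not edit in place.
# (Hypothetical; real Megatron-LM modules are far more complex.)
upstream = types.SimpleNamespace()

def original_build_layer(hidden_size):
    """Pretend upstream builder returning a plain config dict."""
    return {"hidden_size": hidden_size, "rotary": False}

upstream.build_layer = original_build_layer

def patched_build_layer(hidden_size):
    """Wrapper that extends the upstream builder without touching its source."""
    layer = original_build_layer(hidden_size)
    layer["rotary"] = True  # feature added entirely from the outside
    return layer

# The patch rebinds the symbol, leaving the upstream code untouched.
upstream.build_layer = patched_build_layer

print(upstream.build_layer(4096))  # → {'hidden_size': 4096, 'rotary': True}
```

Because the upstream function object is kept and wrapped rather than edited, the patch layer can track new upstream releases with minimal merge conflicts, which is the benefit the description above claims for avoiding invasive modifications.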