Update README.md

README.md CHANGED

@@ -70,6 +70,22 @@ You can move those files to different directories if needed.
python3 $MYDIR/transformers/src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py $MYDIR/nvidia/megatron-gpt2-345m/checkpoint.zip
```

+As explained in [PR #14956](https://github.com/huggingface/transformers/pull/14956), if you're getting an exception
+when running this conversion script:
+```
+ModuleNotFoundError: No module named 'megatron.model.enums'
+```
+you need to tell Python where to find the clone of Megatron-LM, e.g.:
+```
+cd /tmp
+git clone https://github.com/NVIDIA/Megatron-LM
+PYTHONPATH=/tmp/Megatron-LM python src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py ...
+```
+Or, if you already have it cloned elsewhere, simply adjust the path accordingly.
+
+If the training was done using a Megatron-LM fork, e.g. [Megatron-DeepSpeed](https://github.com/microsoft/Megatron-DeepSpeed/),
+then you may need to have that clone in your path instead, i.e. /path/to/Megatron-DeepSpeed.
+
## Text generation
The following code shows how to use the Megatron GPT2 checkpoint and the Transformers API to generate text.
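
The generation snippet itself falls outside this hunk's context lines. For orientation only, here is a minimal sketch of that kind of usage, assuming the converted weights and the GPT-2 tokenizer files have been saved together in a local directory named megatron-gpt2-345m (an assumed layout, not one this diff shows):

```
# Minimal sketch, not the README's actual snippet: load the converted
# checkpoint with the standard Transformers GPT-2 classes and sample text.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Assumed local directory containing the converted pytorch_model.bin,
# config.json, and the tokenizer files (vocab.json, merges.txt).
model_dir = "./megatron-gpt2-345m"

tokenizer = GPT2Tokenizer.from_pretrained(model_dir)
model = GPT2LMHeadModel.from_pretrained(model_dir)
model.eval()

inputs = tokenizer("Megatron-LM is", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=40, do_sample=True, top_k=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The sampling flags (do_sample, top_k) are illustrative defaults, not values taken from the README.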
@@ -118,22 +134,6 @@ wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/megatron_
python src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py megatron_lm_345m_v0.0.zip
```

-As explained in [PR #14956](https://github.com/huggingface/transformers/pull/14956), if you're getting an exception
-when running this conversion script:
-```
-ModuleNotFoundError: No module named 'megatron.model.enums'
-```
-you need to tell Python where to find the clone of Megatron-LM, e.g.:
-```
-cd /tmp
-git clone https://github.com/NVIDIA/Megatron-LM
-PYTHONPATH=/tmp/Megatron-LM python src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py ...
-```
-Or, if you already have it cloned elsewhere, simply adjust the path accordingly.
-
-If the training was done using a Megatron-LM fork, e.g. [Megatron-DeepSpeed](https://github.com/microsoft/Megatron-DeepSpeed/),
-then you may need to have that clone in your path instead, i.e. /path/to/Megatron-DeepSpeed.
-
3. Fetch missing files
```
git clone https://huggingface.co/nvidia/megatron-gpt2-345m/
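
A side note on the PYTHONPATH workaround in the diff above: an equivalent way to check the fix from inside Python, purely as an illustrative sketch using the example's /tmp/Megatron-LM path, is to prepend the clone to sys.path and confirm the missing module now imports:

```
# Illustrative sketch only: mirrors PYTHONPATH=/tmp/Megatron-LM in-process.
import sys

sys.path.insert(0, "/tmp/Megatron-LM")  # adjust to wherever your clone lives

# If this import succeeds, the ModuleNotFoundError above is resolved.
import megatron.model.enums  # noqa: F401
```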