Update README.md

README.md CHANGED

@@ -70,6 +70,22 @@ You can move those files to different directories if needed.
python3 $MYDIR/transformers/src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py $MYDIR/nvidia/megatron-gpt2-345m/checkpoint.zip
```

+As explained in [PR #14956](https://github.com/huggingface/transformers/pull/14956), if you're getting an exception
+when running this conversion script:
+```
+ModuleNotFoundError: No module named 'megatron.model.enums'
+```
+you need to tell Python where to find the clone of Megatron-LM, e.g.:
+```
+cd /tmp
+git clone https://github.com/NVIDIA/Megatron-LM
+PYTHONPATH=/tmp/Megatron-LM python src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py ...
+```
+Or, if you already have it cloned elsewhere, simply adjust the path accordingly.
+
+If the training was done using a Megatron-LM fork, e.g. [Megatron-DeepSpeed](https://github.com/microsoft/Megatron-DeepSpeed/),
+then you may need to have that clone in your path instead, i.e. /path/to/Megatron-DeepSpeed.
+
## Text generation
The following code shows how to use the Megatron GPT2 checkpoint and the Transformers API to generate text.
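
The generation snippet itself falls outside this hunk's context lines. For orientation only, here is a minimal sketch of that kind of usage, assuming the converted weights and the GPT-2 tokenizer files have been saved together in a local directory named megatron-gpt2-345m (an assumed layout, not one this diff shows):

```
# Minimal sketch, not the README's actual snippet: load the converted
# checkpoint with the standard Transformers GPT-2 classes and sample text.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Assumed local directory containing the converted pytorch_model.bin,
# config.json, and the tokenizer files (vocab.json, merges.txt).
model_dir = "./megatron-gpt2-345m"

tokenizer = GPT2Tokenizer.from_pretrained(model_dir)
model = GPT2LMHeadModel.from_pretrained(model_dir)
model.eval()

inputs = tokenizer("Megatron-LM is", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=40, do_sample=True, top_k=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The sampling flags (do_sample, top_k) are illustrative defaults, not values taken from the README.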
@@ -118,22 +134,6 @@ wget --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/megatron_
python src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py megatron_lm_345m_v0.0.zip
```

-As explained in [PR #14956](https://github.com/huggingface/transformers/pull/14956), if you're getting an exception
-when running this conversion script:
-```
-ModuleNotFoundError: No module named 'megatron.model.enums'
-```
-you need to tell Python where to find the clone of Megatron-LM, e.g.:
-```
-cd /tmp
-git clone https://github.com/NVIDIA/Megatron-LM
-PYTHONPATH=/tmp/Megatron-LM python src/transformers/models/megatron_gpt2/convert_megatron_gpt2_checkpoint.py ...
-```
-Or, if you already have it cloned elsewhere, simply adjust the path accordingly.
-
-If the training was done using a Megatron-LM fork, e.g. [Megatron-DeepSpeed](https://github.com/microsoft/Megatron-DeepSpeed/),
-then you may need to have that clone in your path instead, i.e. /path/to/Megatron-DeepSpeed.
-
3. Fetch missing files
```
git clone https://huggingface.co/nvidia/megatron-gpt2-345m/
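
A side note on the PYTHONPATH workaround in the diff above: an equivalent way to check the fix from inside Python, purely as an illustrative sketch using the example's /tmp/Megatron-LM path, is to prepend the clone to sys.path and confirm the missing module now imports:

```
# Illustrative sketch only: mirrors PYTHONPATH=/tmp/Megatron-LM in-process.
import sys

sys.path.insert(0, "/tmp/Megatron-LM")  # adjust to wherever your clone lives

# If this import succeeds, the ModuleNotFoundError above is resolved.
import megatron.model.enums  # noqa: F401
```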