abhi-mosaic committed · Commit 3a2139e · 1 Parent(s): 48823a4

Update README.md

Files changed (1)
  1. README.md +23 -5
README.md CHANGED
@@ -80,7 +80,10 @@ This model is best used with the MosaicML [llm-foundry repository](https://githu

  ```python
  import transformers
- model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-7b', trust_remote_code=True)
+ model = transformers.AutoModelForCausalLM.from_pretrained(
+   'mosaicml/mpt-7b',
+   trust_remote_code=True
+ )
  ```
  Note: This model requires that `trust_remote_code=True` be passed to the `from_pretrained` method.
  This is because we use a custom `MPT` model architecture that is not yet part of the Hugging Face `transformers` package.
@@ -88,19 +91,34 @@ This is because we use a custom `MPT` model architecture that is not yet part of

  To use the optimized [triton implementation](https://github.com/openai/triton) of FlashAttention, you can load the model with `attn_impl='triton'` and move the model to `bfloat16`:
  ```python
- config = transformers.AutoConfig.from_pretrained('mosaicml/mpt-7b', trust_remote_code=True)
+ config = transformers.AutoConfig.from_pretrained(
+   'mosaicml/mpt-7b',
+   trust_remote_code=True
+ )
  config.attn_config['attn_impl'] = 'triton'

- model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-7b', config=config, torch_dtype=torch.bfloat16, trust_remote_code=True)
+ model = transformers.AutoModelForCausalLM.from_pretrained(
+   'mosaicml/mpt-7b',
+   config=config,
+   torch_dtype=torch.bfloat16,
+   trust_remote_code=True
+ )
  model.to(device='cuda:0')
  ```

  Although the model was trained with a sequence length of 2048, ALiBi enables users to increase the maximum sequence length during finetuning and/or inference. For example:

  ```python
- config = transformers.AutoConfig.from_pretrained('mosaicml/mpt-7b', trust_remote_code=True)
+ config = transformers.AutoConfig.from_pretrained(
+   'mosaicml/mpt-7b',
+   trust_remote_code=True
+ )
  config.update({"max_seq_len": 4096})
- model = transformers.AutoModelForCausalLM.from_pretrained('mosaicml/mpt-7b', config=config, trust_remote_code=True)
+ model = transformers.AutoModelForCausalLM.from_pretrained(
+   'mosaicml/mpt-7b',
+   config=config,
+   trust_remote_code=True
+ )
  ```

  This model was trained with the [EleutherAI/gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) tokenizer.
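
Note on the `max_seq_len` context lines in this diff: the README says ALiBi lets users raise the sequence length past the 2048 used in training. A rough illustration of why is sketched here in plain Python. It assumes the slope schedule from the ALiBi paper for a power-of-two head count; the function names `alibi_slopes` and `alibi_bias` are hypothetical, and this is not taken from the MPT source, whose details may differ. The bias depends only on the query-key distance, so no table has to be sized to the training length.

```python
def alibi_slopes(n_heads):
    """Per-head slopes 2^(-8*i/n_heads) for i = 1..n_heads (power-of-two heads only)."""
    if n_heads <= 0 or n_heads & (n_heads - 1):
        raise ValueError("this sketch only handles a power-of-two number of heads")
    return [2.0 ** (-8.0 * i / n_heads) for i in range(1, n_heads + 1)]


def alibi_bias(n_heads, seq_len):
    """Return bias[h][i][j] = -slope_h * (i - j) for keys j <= query i.

    Future positions (j > i) are left at 0.0 here; a causal mask would
    normally remove them separately.
    """
    return [
        [[-m * (i - j) if j <= i else 0.0 for j in range(seq_len)]
         for i in range(seq_len)]
        for m in alibi_slopes(n_heads)
    ]


# The bias for a longer sequence agrees with the shorter one wherever they overlap,
# which is why the length can be changed without retraining a position table.
short = alibi_bias(8, 4)
longer = alibi_bias(8, 8)
assert all(longer[h][i][:4] == short[h][i] for h in range(8) for i in range(4))
```

Small sizes are used so the check runs instantly; the same property holds for 2048 versus 4096 positions.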