Dovakiins committed
Commit d304116 · verified · 1 Parent(s): 3f1f5e3

Update README.md

Files changed (1)
  1. README.md +11 -701
README.md CHANGED
@@ -1,702 +1,12 @@
1
- # Axolotl
2
-
3
- Axolotl is a tool designed to streamline the fine-tuning of various AI models, offering support for multiple configurations and architectures.
4
-
5
- Features:
6
- - Train various Huggingface models such as llama, pythia, falcon, mpt
7
- - Supports fullfinetune, lora, qlora, relora, and gptq
8
- - Customize configurations using a simple yaml file or CLI overwrite
9
- - Load different dataset formats, use custom formats, or bring your own tokenized datasets
10
- - Integrated with xformer, flash attention, rope scaling, and multipacking
11
- - Works with single GPU or multiple GPUs via FSDP or Deepspeed
12
- - Easily run with Docker locally or on the cloud
13
- - Log results and optionally checkpoints to wandb or mlflow
14
- - And more!
15
-
16
- <a href="https://www.phorm.ai/query?projectId=e315ba4a-4e14-421f-ab05-38a1f9076f25">
17
- <img alt="phorm.ai" src="https://img.shields.io/badge/Phorm-Ask_AI-%23F2777A.svg?&logo=">
18
- </a>
19
-
20
- <table>
21
- <tr>
22
- <td>
23
-
24
- ## Table of Contents
25
- - [Introduction](#axolotl)
26
- - [Supported Features](#axolotl-supports)
27
- - [Quickstart](#quickstart-)
28
- - [Environment](#environment)
29
- - [Docker](#docker)
30
- - [Conda/Pip venv](#condapip-venv)
31
- - [Cloud GPU](#cloud-gpu) - Latitude.sh, JarvisLabs, RunPod
32
- - [Bare Metal Cloud GPU](#bare-metal-cloud-gpu)
33
- - [Windows](#windows)
34
- - [Mac](#mac)
35
- - [Google Colab](#google-colab)
36
- - [Launching on public clouds via SkyPilot](#launching-on-public-clouds-via-skypilot)
37
- - [Launching on public clouds via dstack](#launching-on-public-clouds-via-dstack)
38
- - [Dataset](#dataset)
39
- - [Config](#config)
40
- - [Train](#train)
41
- - [Inference](#inference-playground)
42
- - [Merge LORA to Base](#merge-lora-to-base)
43
- - [Special Tokens](#special-tokens)
44
- - [All Config Options](#all-config-options)
45
- - Advanced Topics
46
- - [Multipack](./docs/multipack.qmd)<svg width="24" height="24" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg"><path d="M17 13.5v6H5v-12h6m3-3h6v6m0-6-9 9" class="icon_svg-stroke" stroke="#666" stroke-width="1.5" fill="none" fill-rule="evenodd" stroke-linecap="round" stroke-linejoin="round"></path></svg>
47
- - [RLHF & DPO](./docs/rlhf.qmd)<svg width="24" height="24" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg"><path d="M17 13.5v6H5v-12h6m3-3h6v6m0-6-9 9" class="icon_svg-stroke" stroke="#666" stroke-width="1.5" fill="none" fill-rule="evenodd" stroke-linecap="round" stroke-linejoin="round"></path></svg>
48
- - [Dataset Pre-Processing](./docs/dataset_preprocessing.qmd)<svg width="24" height="24" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg"><path d="M17 13.5v6H5v-12h6m3-3h6v6m0-6-9 9" class="icon_svg-stroke" stroke="#666" stroke-width="1.5" fill="none" fill-rule="evenodd" stroke-linecap="round" stroke-linejoin="round"></path></svg>
49
- - [Common Errors](#common-errors-)
50
- - [Tokenization Mismatch b/w Training & Inference](#tokenization-mismatch-bw-inference--training)
51
- - [Debugging Axolotl](#debugging-axolotl)
52
- - [Need Help?](#need-help-)
53
- - [Badge](#badge-)
54
- - [Community Showcase](#community-showcase)
55
- - [Contributing](#contributing-)
56
- - [Sponsors](#sponsors-)
57
-
58
- </td>
59
- <td>
60
-
61
- <div align="center">
62
- <img src="image/axolotl.png" alt="axolotl" width="160">
63
- <div>
64
- <p>
65
- <b>Axolotl provides a unified repository for fine-tuning <br />a variety of AI models with ease</b>
66
- </p>
67
- <p>
68
- Go ahead and Axolotl questions!!
69
- </p>
70
- <img src="https://github.com/OpenAccess-AI-Collective/axolotl/actions/workflows/pre-commit.yml/badge.svg?branch=main" alt="pre-commit">
71
- <img alt="PyTest Status" src="https://github.com/OpenAccess-AI-Collective/axolotl/actions/workflows/tests.yml/badge.svg?branch=main">
72
- </div>
73
- </div>
74
-
75
- </td>
76
- </tr>
77
- </table>
78
-
79
- ## Axolotl supports
80
-
81
- | | fp16/fp32 | lora | qlora | gptq | gptq w/flash attn | flash attn | xformers attn |
82
- |-------------|:----------|:-----|-------|------|-------------------|------------|--------------|
83
- | llama | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
84
- | Mistral | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
85
- | Mixtral-MoE | ✅ | ✅ | ✅ | ❓ | ❓ | ❓ | ❓ |
86
- | Mixtral8X22 | ✅ | ✅ | ✅ | ❓ | ❓ | ❓ | ❓ |
87
- | Pythia | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❓ |
88
- | cerebras | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❓ |
89
- | btlm | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❓ |
90
- | mpt | ✅ | ❌ | ❓ | ❌ | ❌ | ❌ | ❓ |
91
- | falcon | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❓ |
92
- | gpt-j | ✅ | ✅ | ✅ | ❌ | ❌ | ❓ | ❓ |
93
- | XGen | ✅ | ❓ | ✅ | ❓ | ❓ | ❓ | ✅ |
94
- | phi | ✅ | ✅ | ✅ | ❓ | ❓ | ❓ | ❓ |
95
- | RWKV | ✅ | ❓ | ❓ | ❓ | ❓ | ❓ | ❓ |
96
- | Qwen | ✅ | ✅ | ✅ | ❓ | ❓ | ❓ | ❓ |
97
- | Gemma | ✅ | ✅ | ✅ | ❓ | ❓ | ✅ | ❓ |
98
-
99
- ✅: supported
100
- ❌: not supported
101
- ❓: untested
102
-
103
- ## Quickstart ⚡
104
-
105
- Get started with Axolotl in just a few steps! This quickstart guide will walk you through setting up and running a basic fine-tuning task.
106
-
107
- **Requirements**: Python >=3.10 and Pytorch >=2.1.1.
108
-
109
- ```bash
110
- git clone https://github.com/OpenAccess-AI-Collective/axolotl
111
- cd axolotl
112
-
113
- pip3 install packaging ninja
114
- pip3 install -e '.[flash-attn,deepspeed]'
115
- ```
116
-
117
- ### Usage
118
- ```bash
119
- # preprocess datasets - optional but recommended
120
- CUDA_VISIBLE_DEVICES="" python -m axolotl.cli.preprocess examples/openllama-3b/lora.yml
121
-
122
- # finetune lora
123
- accelerate launch -m axolotl.cli.train examples/openllama-3b/lora.yml
124
-
125
- # inference
126
- accelerate launch -m axolotl.cli.inference examples/openllama-3b/lora.yml \
127
- --lora_model_dir="./outputs/lora-out"
128
-
129
- # gradio
130
- accelerate launch -m axolotl.cli.inference examples/openllama-3b/lora.yml \
131
- --lora_model_dir="./outputs/lora-out" --gradio
132
-
133
- # remote yaml files - the yaml config can be hosted on a public URL
134
- # Note: the yaml config must directly link to the **raw** yaml
135
- accelerate launch -m axolotl.cli.train https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/examples/openllama-3b/lora.yml
136
- ```
137
-
138
- ## Advanced Setup
139
-
140
- ### Environment
141
-
142
- #### Docker
143
-
144
- ```bash
145
- docker run --gpus '"all"' --rm -it winglian/axolotl:main-latest
146
- ```
147
-
148
- Or run on the current files for development:
149
-
150
- ```sh
151
- docker compose up -d
152
- ```
153
-
154
- >[!Tip]
155
- > If you want to debug axolotl or prefer to use Docker as your development environment, see the [debugging guide's section on Docker](docs/debugging.qmd#debugging-with-docker).
156
-
157
- <details>
158
-
159
- <summary>Docker advanced</summary>
160
-
161
- A more powerful Docker command to run would be this:
162
-
163
- ```bash
164
- docker run --privileged --gpus '"all"' --shm-size 10g --rm -it --name axolotl --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --mount type=bind,src="${PWD}",target=/workspace/axolotl -v ${HOME}/.cache/huggingface:/root/.cache/huggingface winglian/axolotl:main-latest
165
- ```
166
-
167
- It additionally:
168
- * Prevents memory issues when running e.g. deepspeed (e.g. you could hit SIGBUS/signal 7 error) through `--ipc` and `--ulimit` args.
169
- * Persists the downloaded HF data (models etc.) and your modifications to axolotl code through `--mount`/`-v` args.
170
- * The `--name` argument simply makes it easier to refer to the container in vscode (`Dev Containers: Attach to Running Container...`) or in your terminal.
171
- * The `--privileged` flag gives all capabilities to the container.
172
- * The `--shm-size 10g` argument increases the shared memory size. Use this if you see `exitcode: -7` errors using deepspeed.
173
-
174
- [More information on nvidia website](https://docs.nvidia.com/deeplearning/frameworks/user-guide/index.html#setincshmem)
175
-
176
- </details>
177
-
178
- #### Conda/Pip venv
179
- 1. Install python >=**3.10**
180
-
181
- 2. Install pytorch stable https://pytorch.org/get-started/locally/
182
-
183
- 3. Install Axolotl along with python dependencies
184
- ```bash
185
- pip3 install packaging
186
- pip3 install -e '.[flash-attn,deepspeed]'
187
- ```
188
- 4. (Optional) Login to Huggingface to use gated models/datasets.
189
- ```bash
190
- huggingface-cli login
191
- ```
192
- Get the token at huggingface.co/settings/tokens
193
-
194
- #### Cloud GPU
195
-
196
- For cloud GPU providers that support docker images, use [`winglian/axolotl-cloud:main-latest`](https://hub.docker.com/r/winglian/axolotl-cloud/tags)
197
-
198
- - on Latitude.sh use this [direct link](https://latitude.sh/blueprint/989e0e79-3bf6-41ea-a46b-1f246e309d5c)
199
- - on JarvisLabs.ai use this [direct link](https://jarvislabs.ai/templates/axolotl)
200
- - on RunPod use this [direct link](https://runpod.io/gsc?template=v2ickqhz9s&ref=6i7fkpdz)
201
-
202
- #### Bare Metal Cloud GPU
203
-
204
- ##### LambdaLabs
205
-
206
- <details>
207
-
208
- <summary>Click to Expand</summary>
209
-
210
- 1. Install python
211
- ```bash
212
- sudo apt update
213
- sudo apt install -y python3.10
214
-
215
- sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.10 1
216
- sudo update-alternatives --config python # pick 3.10 if given option
217
- python -V # should be 3.10
218
-
219
- ```
220
-
221
- 2. Install pip
222
- ```bash
223
- wget https://bootstrap.pypa.io/get-pip.py
224
- python get-pip.py
225
- ```
226
-
227
- 3. Install Pytorch https://pytorch.org/get-started/locally/
228
-
229
- 4. Follow instructions on quickstart.
230
-
231
- 5. Run
232
- ```bash
233
- pip3 install protobuf==3.20.3
234
- pip3 install -U --ignore-installed requests Pillow psutil scipy
235
- ```
236
-
237
- 6. Set path
238
- ```bash
239
- export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
240
- ```
241
- </details>
242
-
243
- ##### GCP
244
-
245
- <details>
246
-
247
- <summary>Click to Expand</summary>
248
-
249
- Use a Deeplearning linux OS with cuda and pytorch installed. Then follow instructions on quickstart.
250
-
251
- Make sure to run the below to uninstall xla.
252
- ```bash
253
- pip uninstall -y torch_xla[tpu]
254
- ```
255
-
256
- </details>
257
-
258
- #### Windows
259
- Please use WSL or Docker!
260
-
261
- #### Mac
262
-
263
- Use the below instead of the install method in QuickStart.
264
- ```
265
- pip3 install -e '.'
266
- ```
267
- More info: [mac.md](/docs/mac.qmd)
268
-
269
- #### Google Colab
270
-
271
- Please use this example [notebook](examples/colab-notebooks/colab-axolotl-example.ipynb).
272
-
273
- #### Launching on public clouds via SkyPilot
274
- To launch on GPU instances (both on-demand and spot instances) on 7+ clouds (GCP, AWS, Azure, OCI, and more), you can use [SkyPilot](https://skypilot.readthedocs.io/en/latest/index.html):
275
-
276
- ```bash
277
- pip install "skypilot-nightly[gcp,aws,azure,oci,lambda,kubernetes,ibm,scp]" # choose your clouds
278
- sky check
279
- ```
280
-
281
- Get the [example YAMLs](https://github.com/skypilot-org/skypilot/tree/master/llm/axolotl) of using Axolotl to finetune `mistralai/Mistral-7B-v0.1`:
282
- ```
283
- git clone https://github.com/skypilot-org/skypilot.git
284
- cd skypilot/llm/axolotl
285
- ```
286
-
287
- Use one command to launch:
288
- ```bash
289
- # On-demand
290
- HF_TOKEN=xx sky launch axolotl.yaml --env HF_TOKEN
291
-
292
- # Managed spot (auto-recovery on preemption)
293
- HF_TOKEN=xx BUCKET=<unique-name> sky spot launch axolotl-spot.yaml --env HF_TOKEN --env BUCKET
294
- ```
295
-
296
- #### Launching on public clouds via dstack
297
- To launch on GPU instance (both on-demand and spot instances) on public clouds (GCP, AWS, Azure, Lambda Labs, TensorDock, Vast.ai, and CUDO), you can use [dstack](https://dstack.ai/).
298
-
299
- Write a job description in YAML as below:
300
-
301
- ```yaml
302
- # dstack.yaml
303
- type: task
304
-
305
- image: winglian/axolotl-cloud:main-20240429-py3.11-cu121-2.2.2
306
-
307
- env:
308
- - HUGGING_FACE_HUB_TOKEN
309
- - WANDB_API_KEY
310
-
311
- commands:
312
- - accelerate launch -m axolotl.cli.train config.yaml
313
-
314
- ports:
315
- - 6006
316
-
317
- resources:
318
- gpu:
319
- memory: 24GB..
320
- count: 2
321
- ```
322
-
323
- Then run the job with the `dstack run` command. Append the `--spot` option if you want a spot instance. The `dstack run` command will show you the cheapest instance across multiple cloud services:
324
-
325
- ```bash
326
- pip install dstack
327
- HUGGING_FACE_HUB_TOKEN=xxx WANDB_API_KEY=xxx dstack run . -f dstack.yaml # --spot
328
- ```
329
-
330
- For further and fine-grained use cases, please refer to the official [dstack documents](https://dstack.ai/docs/) and the detailed description of [axolotl example](https://github.com/dstackai/dstack/tree/master/examples/fine-tuning/axolotl) on the official repository.
331
-
332
- ### Dataset
333
-
334
- Axolotl supports a variety of dataset formats. It is recommended to use a JSONL. The schema of the JSONL depends upon the task and the prompt template you wish to use. Instead of a JSONL, you can also use a HuggingFace dataset with columns for each JSONL field.
335
-
336
- See [these docs](https://openaccess-ai-collective.github.io/axolotl/docs/dataset-formats/) for more information on how to use different dataset formats.
337
-
338
- ### Config
339
-
340
- See [examples](examples) for quick start. It is recommended to duplicate and modify to your needs. The most important options are:
341
-
342
- - model
343
- ```yaml
344
- base_model: ./llama-7b-hf # local or huggingface repo
345
- ```
346
- Note: The code will load the right architecture.
347
-
348
- - dataset
349
- ```yaml
350
- datasets:
351
- # huggingface repo
352
- - path: vicgalle/alpaca-gpt4
353
- type: alpaca
354
-
355
- # huggingface repo with specific configuration/subset
356
- - path: EleutherAI/pile
357
- name: enron_emails
358
- type: completion # format from earlier
359
- field: text # Optional[str] default: text, field to use for completion data
360
-
361
- # huggingface repo with multiple named configurations/subsets
362
- - path: bigcode/commitpackft
363
- name:
364
- - ruby
365
- - python
366
- - typescript
367
- type: ... # unimplemented custom format
368
-
369
- # fastchat conversation
370
- # See 'conversation' options: https://github.com/lm-sys/FastChat/blob/main/fastchat/conversation.py
371
- - path: ...
372
- type: sharegpt
373
- conversation: chatml # default: vicuna_v1.1
374
-
375
- # local
376
- - path: data.jsonl # or json
377
- ds_type: json # see other options below
378
- type: alpaca
379
-
380
- # dataset with splits, but no train split
381
- - path: knowrohit07/know_sql
382
- type: context_qa.load_v2
383
- train_on_split: validation
384
-
385
- # loading from s3 or gcs
386
- # s3 creds will be loaded from the system default and gcs only supports public access
387
- - path: s3://path_to_ds # Accepts folder with arrow/parquet or file path like above. Supports s3, gcs.
388
- ...
389
-
390
- # Loading Data From a Public URL
391
- # - The file format is `json` (which includes `jsonl`) by default. For different formats, adjust the `ds_type` option accordingly.
392
- - path: https://some.url.com/yourdata.jsonl # The URL should be a direct link to the file you wish to load. URLs must use HTTPS protocol, not HTTP.
393
- ds_type: json # this is the default, see other options below.
394
- ```
395
-
396
- - loading
397
- ```yaml
398
- load_in_4bit: true
399
- load_in_8bit: true
400
-
401
- bf16: auto # require >=ampere, auto will detect if your GPU supports this and choose automatically.
402
- fp16: # leave empty to use fp16 when bf16 is 'auto'. set to false if you want to fallback to fp32
403
- tf32: true # require >=ampere
404
-
405
- bfloat16: true # require >=ampere, use instead of bf16 when you don't want AMP (automatic mixed precision)
406
- float16: true # use instead of fp16 when you don't want AMP
407
- ```
408
- Note: Repo does not do 4-bit quantization.
409
-
410
- - lora
411
- ```yaml
412
- adapter: lora # 'qlora' or leave blank for full finetune
413
- lora_r: 8
414
- lora_alpha: 16
415
- lora_dropout: 0.05
416
- lora_target_modules:
417
- - q_proj
418
- - v_proj
419
- ```
420
-
421
- #### All Config Options
422
-
423
- See [these docs](docs/config.qmd) for all config options.
424
-
425
- ### Train
426
-
427
- Run
428
- ```bash
429
- accelerate launch -m axolotl.cli.train your_config.yml
430
- ```
431
-
432
- > [!TIP]
433
- > You can also reference a config file that is hosted on a public URL, for example `accelerate launch -m axolotl.cli.train https://yourdomain.com/your_config.yml`
434
-
435
- #### Preprocess dataset
436
-
437
- You can optionally pre-tokenize dataset with the following before finetuning.
438
- This is recommended for large datasets.
439
-
440
- - Set `dataset_prepared_path:` to a local folder for saving and loading pre-tokenized dataset.
441
- - (Optional): Set `push_dataset_to_hub: hf_user/repo` to push it to Huggingface.
442
- - (Optional): Use `--debug` to see preprocessed examples.
443
-
444
- ```bash
445
- python -m axolotl.cli.preprocess your_config.yml
446
- ```
447
-
448
- #### Multi-GPU
449
-
450
- Below are the options available in axolotl for training with multiple GPUs. Note that DeepSpeed
451
- is the recommended multi-GPU option currently because FSDP may experience
452
- [loss instability](https://github.com/huggingface/transformers/issues/26498).
453
-
454
- ##### DeepSpeed
455
-
456
- Deepspeed is an optimization suite for multi-gpu systems allowing you to train much larger models than you
457
- might typically be able to fit into your GPU's VRAM. More information about the various optimization types
458
- for deepspeed is available at https://huggingface.co/docs/accelerate/main/en/usage_guides/deepspeed#what-is-integrated
459
-
460
- We provide several default deepspeed JSON configurations for ZeRO stage 1, 2, and 3.
461
-
462
- ```yaml
463
- deepspeed: deepspeed_configs/zero1.json
464
- ```
465
-
466
- ```shell
467
- accelerate launch -m axolotl.cli.train examples/llama-2/config.yml --deepspeed deepspeed_configs/zero1.json
468
- ```
469
-
470
- ##### FSDP
471
-
472
- - llama FSDP
473
- ```yaml
474
- fsdp:
475
- - full_shard
476
- - auto_wrap
477
- fsdp_config:
478
- fsdp_offload_params: true
479
- fsdp_state_dict_type: FULL_STATE_DICT
480
- fsdp_transformer_layer_cls_to_wrap: LlamaDecoderLayer
481
- ```
482
-
483
- ##### FSDP + QLoRA
484
-
485
- Axolotl supports training with FSDP and QLoRA, see [these docs](docs/fsdp_qlora.qmd) for more information.
486
-
487
- ##### Weights & Biases Logging
488
-
489
- Make sure your `WANDB_API_KEY` environment variable is set (recommended) or you login to wandb with `wandb login`.
490
-
491
- - wandb options
492
- ```yaml
493
- wandb_mode:
494
- wandb_project:
495
- wandb_entity:
496
- wandb_watch:
497
- wandb_name:
498
- wandb_log_model:
499
- ```
500
-
501
- ##### Special Tokens
502
-
503
- It is important to have special tokens like delimiters, end-of-sequence, beginning-of-sequence in your tokenizer's vocabulary. This will help you avoid tokenization issues and help your model train better. You can do this in axolotl like this:
504
-
505
- ```yml
506
- special_tokens:
507
- bos_token: "<s>"
508
- eos_token: "</s>"
509
- unk_token: "<unk>"
510
- tokens: # these are delimiters
511
- - "<|im_start|>"
512
- - "<|im_end|>"
513
- ```
514
-
515
- When you include these tokens in your axolotl config, axolotl adds these tokens to the tokenizer's vocabulary.
516
-
517
- ### Inference Playground
518
-
519
- Axolotl allows you to load your model in an interactive terminal playground for quick experimentation.
520
- The config file is the same config file used for training.
521
-
522
- Pass the appropriate flag to the inference command, depending upon what kind of model was trained:
523
-
524
- - Pretrained LORA:
525
- ```bash
526
- python -m axolotl.cli.inference examples/your_config.yml --lora_model_dir="./lora-output-dir"
527
- ```
528
- - Full weights finetune:
529
- ```bash
530
- python -m axolotl.cli.inference examples/your_config.yml --base_model="./completed-model"
531
- ```
532
- - Full weights finetune w/ a prompt from a text file:
533
- ```bash
534
- cat /tmp/prompt.txt | python -m axolotl.cli.inference examples/your_config.yml \
535
- --base_model="./completed-model" --prompter=None --load_in_8bit=True
536
- ```
537
- -- With gradio hosting
538
- ```bash
539
- python -m axolotl.cli.inference examples/your_config.yml --gradio
540
- ```
541
-
542
- Please use `--sample_packing False` if you have it on and receive the error similar to below:
543
-
544
- > RuntimeError: stack expects each tensor to be equal size, but got [1, 32, 1, 128] at entry 0 and [1, 32, 8, 128] at entry 1
545
-
546
- ### Merge LORA to base
547
-
548
- The following command will merge your LORA adapter with your base model. You can optionally pass the argument `--lora_model_dir` to specify the directory where your LORA adapter was saved; otherwise, this will be inferred from `output_dir` in your axolotl config file. The merged model is saved in the sub-directory `{lora_model_dir}/merged`.
549
-
550
- ```bash
551
- python3 -m axolotl.cli.merge_lora your_config.yml --lora_model_dir="./completed-model"
552
- ```
553
-
554
- You may need to use the `gpu_memory_limit` and/or `lora_on_cpu` config options to avoid running out of memory. If you still run out of CUDA memory, you can try to merge in system RAM with
555
-
556
- ```bash
557
- CUDA_VISIBLE_DEVICES="" python3 -m axolotl.cli.merge_lora ...
558
- ```
559
-
560
- although this will be very slow, and using the config options above is recommended instead.
561
-
562
- ## Common Errors 🧰
563
-
564
- See also the [FAQ's](./docs/faq.qmd) and [debugging guide](docs/debugging.qmd).
565
-
566
- > If you encounter a 'Cuda out of memory' error, it means your GPU ran out of memory during the training process. Here's how to resolve it:
567
-
568
- Please reduce any below
569
- - `micro_batch_size`
570
- - `eval_batch_size`
571
- - `gradient_accumulation_steps`
572
- - `sequence_len`
573
-
574
- If it does not help, try running without deepspeed and without accelerate (replace "accelerate launch" with "python") in the command.
575
-
576
- Using adamw_bnb_8bit might also save you some memory.
577
-
578
- > `failed (exitcode: -9)`
579
-
580
- Usually means your system has run out of system memory.
581
- Similarly, you should consider reducing the same settings as when you run out of VRAM.
582
- Additionally, look into upgrading your system RAM which should be simpler than GPU upgrades.
583
-
584
- > RuntimeError: expected scalar type Float but found Half
585
-
586
- Try setting `fp16: true`
587
-
588
- > NotImplementedError: No operator found for `memory_efficient_attention_forward` ...
589
-
590
- Try to turn off xformers.
591
-
592
- > accelerate config missing
593
-
594
- It's safe to ignore it.
595
-
596
- > NCCL Timeouts during training
597
-
598
- See the [NCCL](docs/nccl.qmd) guide.
599
-
600
-
601
- ### Tokenization Mismatch b/w Inference & Training
602
-
603
- For many formats, Axolotl constructs prompts by concatenating token ids _after_ tokenizing strings. The reason for concatenating token ids rather than operating on strings is to maintain precise accounting for attention masks.
604
-
605
- If you decode a prompt constructed by axolotl, you might see spaces between tokens (or lack thereof) that you do not expect, especially around delimiters and special tokens. When you are starting out with a new format, you should always do the following:
606
-
607
- 1. Materialize some data using `python -m axolotl.cli.preprocess your_config.yml --debug`, and then decode the first few rows with your model's tokenizer.
608
- 2. During inference, right before you pass a tensor of token ids to your model, decode these tokens back into a string.
609
- 3. Make sure the inference string from #2 looks **exactly** like the data you fine tuned on from #1, including spaces and new lines. If they aren't the same, adjust your inference server accordingly.
610
- 4. As an additional troubleshooting step, you can look at the token ids between 1 and 2 to make sure they are identical.
611
-
612
- Having misalignment between your prompts during training and inference can cause models to perform very poorly, so it is worth checking this. See [this blog post](https://hamel.dev/notes/llm/finetuning/05_tokenizer_gotchas.html) for a concrete example.
613
-
614
- ## Debugging Axolotl
615
-
616
- See [this debugging guide](docs/debugging.qmd) for tips on debugging Axolotl, along with an example configuration for debugging with VSCode.
617
-
618
- ## Need help? 🙋
619
-
620
- Join our [Discord server](https://discord.gg/HhrNrHJPRb) where our community members can help you.
621
-
622
- Need dedicated support? Please contact us at [✉️[email protected]](mailto:[email protected]) for dedicated support options.
623
-
624
- ## Badge ❤🏷️
625
-
626
- Building something cool with Axolotl? Consider adding a badge to your model card.
627
-
628
- ```markdown
629
- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
630
- ```
631
-
632
- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
633
-
634
- ## Community Showcase
635
-
636
- Check out some of the projects and models that have been built using Axolotl! Have a model you'd like to add to our Community Showcase? Open a PR with your model.
637
-
638
- Open Access AI Collective
639
- - [Minotaur 13b](https://huggingface.co/openaccess-ai-collective/minotaur-13b-fixed)
640
- - [Manticore 13b](https://huggingface.co/openaccess-ai-collective/manticore-13b)
641
- - [Hippogriff 30b](https://huggingface.co/openaccess-ai-collective/hippogriff-30b-chat)
642
-
643
- PocketDoc Labs
644
- - [Dan's PersonalityEngine 13b LoRA](https://huggingface.co/PocketDoc/Dans-PersonalityEngine-13b-LoRA)
645
-
646
- ## Contributing 🤝
647
-
648
- Please read the [contributing guide](./.github/CONTRIBUTING.md)
649
-
650
- Bugs? Please check the [open issues](https://github.com/OpenAccess-AI-Collective/axolotl/issues/bug) else create a new Issue.
651
-
652
- PRs are **greatly welcome**!
653
-
654
- Please run the quickstart instructions followed by the below to setup env:
655
- ```bash
656
- pip3 install -r requirements-dev.txt -r requirements-tests.txt
657
- pre-commit install
658
-
659
- # test
660
- pytest tests/
661
-
662
- # optional: run against all files
663
- pre-commit run --all-files
664
- ```
665
-
666
- Thanks to all of our contributors to date. Help drive open source AI progress forward by contributing to Axolotl.
667
-
668
- <a href="https://github.com/openaccess-ai-collective/axolotl/graphs/contributors">
669
- <img src="https://contrib.rocks/image?repo=openaccess-ai-collective/axolotl" alt="contributor chart by https://contrib.rocks"/>
670
- </a>
671
-
672
- ## Sponsors 🤝❤
673
-
674
- OpenAccess AI Collective is run by volunteer contributors such as [winglian](https://github.com/winglian),
675
- [NanoCode012](https://github.com/NanoCode012), [tmm1](https://github.com/tmm1),
676
- [mhenrichsen](https://github.com/mhenrichsen), [casper-hansen](https://github.com/casper-hansen),
677
- [hamelsmu](https://github.com/hamelsmu) and many more who help us accelerate forward by fixing bugs, answering
678
- community questions and implementing new features. Axolotl needs donations from sponsors for the compute needed to
679
- run our unit & integration tests, troubleshooting community issues, and providing bounties. If you love axolotl,
680
- consider sponsoring the project via [GitHub Sponsors](https://github.com/sponsors/OpenAccess-AI-Collective),
681
- [Ko-fi](https://ko-fi.com/axolotl_ai) or reach out directly to
682
683
-
684
- ---
685
-
686
- #### 💎 Diamond Sponsors - [Contact directly](mailto:[email protected])
687
-
688
- ---
689
-
690
- #### 🥇 Gold Sponsors - $5000/mo
691
-
692
- ---
693
-
694
- #### 🥈 Silver Sponsors - $1000/mo
695
-
696
- ---
697
-
698
- #### 🥉 Bronze Sponsors - $500/mo
699
-
700
- - [JarvisLabs.ai](https://jarvislabs.ai)
701
-
702
  ---
  ---
+ title: Nknjl
+ emoji: 💻🐳
+ colorFrom: gray
+ colorTo: green
+ sdk: docker
+ pinned: false
+ tags:
+ - jupyterlab
+ suggested_storage: small
+ ---
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
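
The new README is essentially a Space metadata block, so it may help to see how such a block could be read programmatically. The following is a minimal sketch in Python, assuming the front matter contains only flat `key: value` pairs and `- item` lists as in this commit. `parse_front_matter` and `README_TEXT` are hypothetical names introduced for illustration; they are not part of any Hugging Face library, and the helper is not a general YAML parser.

```python
# README_TEXT reproduces the added lines of this commit (illustrative constant).
README_TEXT = """\
---
title: Nknjl
emoji: 💻🐳
colorFrom: gray
colorTo: green
sdk: docker
pinned: false
tags:
- jupyterlab
suggested_storage: small
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
"""


def parse_front_matter(text):
    """Hypothetical helper: parse flat `key: value` pairs and `- item` lists
    found between the two `---` delimiter lines. Not a general YAML parser."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    meta = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # closing delimiter reached
        if not stripped:
            continue
        # A `- item` line extends the most recent key if that key holds a list.
        if stripped.startswith("- ") and isinstance(meta.get(current_key), list):
            meta[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        current_key = key.strip()
        value = value.strip()
        if value == "":
            meta[current_key] = []  # list items are expected on the following lines
        elif value in ("true", "false"):
            meta[current_key] = value == "true"
        else:
            meta[current_key] = value
    return meta


meta = parse_front_matter(README_TEXT)
print(meta["sdk"], meta["tags"], meta["pinned"])
```

A check like this could be used to confirm that `sdk` is set to `docker` before pushing; for anything beyond this flat layout, a real YAML parser would be the safer choice.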