Andrew DalPino committed
Commit cac8fe7 · 0 Parent(s)

Initial commit

Files changed (14)
  1. .gitattributes +35 -0
  2. .gitignore +15 -0
  3. README.md +136 -0
  4. beam_search.py +104 -0
  5. data.py +254 -0
  6. dataset/.gitignore +2 -0
  7. generate.py +100 -0
  8. instruction-tune.py +197 -0
  9. model.py +499 -0
  10. model_sizing.ipynb +330 -0
  11. models/lightgpt-small.pt +3 -0
  12. out/.gitignore +2 -0
  13. pre-train.py +320 -0
  14. requirements.txt +7 -0
.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,15 @@
+ __pycache__/
+ .mypy_cache/
+ env/
+ build/
+ develop-eggs/
+ dist/
+ lib/
+ lib64/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ .venv
+ venv/
+ ENV/
README.md ADDED
@@ -0,0 +1,136 @@
+ # LightGPT
+
+ A lightweight generative pre-trained Transformer (GPT) for the people! A unique feature of LightGPT is that it lets you progressively trade compute for additional memory efficiency, allowing you to train large models on smaller consumer hardware. It also supports memory-efficient training over multiple GPUs or clusters of GPUs using PyTorch's Distributed Data Parallel (DDP) protocol with ZeRO Redundancy sharding. Unlike closed-source LLMs, LightGPT provides both the model weights *and* the code to train and fine-tune the model yourself.
+
+ ## What makes LightGPT different?
+
+ - **Parameter-efficiency**: LightGPT aims to be a more parsimonious model by only training parameters that are absolutely necessary. As such, biases and positional embeddings have been completely removed from the neural network architecture. In addition, the token embeddings and the output layer share their weights, resulting in a further reduction in trainable parameters (see the sketch after this list).
+
+ - **Training efficiency**: Compared to Adam, LightGPT's Adafactor optimizer reduces the number of training-time buffers from O(n*m) to O(n+m) for every trainable weight matrix, with little difference in runtime or minima quality. In addition, with activation checkpointing and model, gradient, and optimizer state sharding, we can reduce the number of buffers needed during training by a factor of 10 or more.
+
+ - **Fully open-source**: Want to train your own LightGPT? Go right ahead! In addition to our model weights, we also release our training and inference code so you can train one yourself. With the power of open source, we hope that others can learn from and continue improving LightGPT over time.
+
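To make the weight-tying point concrete: the token embedding matrix and the output projection have the same shape, so `model.py` simply points them at the same tensor and the vocabulary-sized matrix is stored and trained only once. A minimal, self-contained sketch of the same trick, using the model's default vocabulary and embedding sizes:

```python
import torch
from torch.nn import Embedding, Linear

vocabulary_size, embedding_dimensions = 50257, 1024

token_embeddings = Embedding(vocabulary_size, embedding_dimensions)
output_layer = Linear(embedding_dimensions, vocabulary_size, bias=False)

# Tie the weights: both modules now reference the same parameter tensor,
# exactly as GPT.__init__() does in model.py.
token_embeddings.weight = output_layer.weight

# Count each parameter tensor only once.
unique = {id(p): p for p in [*token_embeddings.parameters(), *output_layer.parameters()]}
print(sum(p.numel() for p in unique.values()))  # ~51.5M instead of ~103M untied
```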
+
+ ## Install Project Dependencies
+
+ Project dependencies are specified in the `requirements.txt` file. You can install them with [pip](https://pip.pypa.io/en/stable/) using the following commands from the project root. I recommend using a virtual environment such as venv to keep package dependencies on your system tidy.
+
+ ```
+ python -m venv ./.venv
+
+ source ./.venv/bin/activate
+
+ pip install -r requirements.txt
+ ```
+
+ ## Quick Start
+
+ If you'd just like to start training right away, the default settings should work on most single-GPU systems with 12GB of VRAM or more.
+
+ ```
+ python pre-train.py
+ ```
+
+ > Note that it will take a while to download and pre-process the dataset the first time that the training script is run.
+
+ If you have a larger system, you can increase the training load by increasing the capacity of the network and the `batch_size` at runtime.
+
+ ```
+ python pre-train.py --embedding_dimensions=1024 --num_hidden_layers=24 --batch_size=8
+ ```
+
+ To distribute the training workload over a cluster of GPUs or multiple cluster nodes, use PyTorch's [torchrun](https://pytorch.org/docs/stable/elastic/run.html) extension to launch a distributed data parallel session.
+
+ ```
+ torchrun --standalone --nnodes=1 --nproc-per-node=8 pre-train.py --batch_size=16 --gradient_accumulation_steps=32
+ ```
+
+ > Note that when training in data-parallel mode, it's important for maximum performance that the world size divides evenly into `gradient_accumulation_steps`. For example, on an 8 GPU cluster we could perform 32 gradient accumulation steps in exactly 4 passes over the network per GPU.
+
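`pre-train.py` itself is not reproduced on this page, but `instruction-tune.py` further down shows the accumulation pattern both scripts rely on: the loss is scaled and gradients from several micro-batches are summed before a single optimizer step, so the effective batch size grows without any extra activation memory. A runnable toy sketch of that pattern (the linear model and random data are stand-ins):

```python
import torch
from torch.nn import Linear
from torch.nn.functional import mse_loss
from torch.optim import SGD

# Toy stand-ins for the GPT model and the training dataloader.
model = Linear(16, 16)
optimizer = SGD(model.parameters(), lr=1e-2)
data = [(torch.randn(4, 16), torch.randn(4, 16)) for _ in range(64)]

gradient_accumulation_steps = 32

for step, (x, y) in enumerate(data, start=1):
    loss = mse_loss(model(x), y)

    # Accumulate scaled gradients; no optimizer step yet.
    (loss / gradient_accumulation_steps).backward()

    if step % gradient_accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad(set_to_none=True)
```

Following the note above, an 8-GPU run with `--gradient_accumulation_steps=32` would complete one weight update after 32 / 8 = 4 such micro-batches on each rank.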
+ After training, you can generate text from the model by running the `generate.py` script from the command line with a prompt.
+
+ ```
+ python generate.py
+ ```
+
+ ### Pre-training Arguments
+
+ | Argument | Default | Type | Description |
+ |---|---|---|---|
+ | --batch_size | 1 | int | The number of samples to pass through the network at a time. |
+ | --gradient_accumulation_steps | 128 | int | The number of batches to pass through the network before updating the weights. |
+ | --samples_per_epoch | 4096 | int | The number of training samples to pass through the network every epoch. |
+ | --learning_rate | 5e-4 | float | The global step size taken after every gradient accumulation step. |
+ | --max_gradient_norm | 1.0 | float | Clip gradients above this threshold before stepping. |
+ | --num_epochs | 2145 | int | The number of epochs to train for. |
+ | --eval_interval | 10 | int | Evaluate the model on the testing set after this many epochs. |
+ | --block_size | 1024 | int | The number of tokens within the context window for every sample. |
+ | --embedding_dimensions | 1024 | int | The dimensionality of the token embeddings. |
+ | --num_attention_heads | 16 | int | The number of attention heads within every block. |
+ | --num_hidden_layers | 24 | int | The number of attention/MLP blocks within the hidden layer of the network. |
+ | --dropout | 0.1 | float | The proportion of activations to randomly zero out during training as regularization. |
+ | --activation_checkpointing | False | bool | Should we use activation checkpointing? |
+ | --checkpoint_interval | 20 | int | Save the model parameters to disk every this many epochs. |
+ | --checkpoint_path | "./out/checkpoint.pt" | string | The path to the checkpoint file on disk. |
+ | --dataset_path | "./dataset" | string | The path to the dataset files on disk. |
+ | --num_dataset_processes | 8 | int | The number of processes (CPUs) to use to process the dataset. |
+ | --resume | False | bool | Should we resume training from the last checkpoint? |
+ | --device | "cuda" | string | The device to run the computation on. |
+ | --seed | None | int | The seed for the random number generator. |
+
+ ### Instruction-tuning Arguments
+
+ | Argument | Default | Type | Description |
+ |---|---|---|---|
+ | --base_model_path | "./out/checkpoint.pt" | string | The path to the pre-trained model. |
+ | --batch_size | 1 | int | The number of samples to pass through the network at a time. |
+ | --gradient_accumulation_steps | 128 | int | The number of batches to pass through the network before updating the weights. |
+ | --learning_rate | 5e-4 | float | The global step size taken after every gradient accumulation step. |
+ | --mask_input | False | bool | Should we mask the input part of the sample, i.e. only train on the output? |
+ | --rank | 8 | int | The rank of the LoRA decomposition matrices. |
+ | --alpha | 1.0 | float | The strength of the LoRA signal. |
+ | --dropout | 0.05 | float | The proportion of activations to randomly zero out during training as regularization. |
+ | --num_epochs | 4 | int | The number of epochs to train for. |
+ | --eval_interval | 1 | int | Evaluate the model on the testing set after this many epochs. |
+ | --checkpoint_interval | 1 | int | Save the model parameters to disk every this many epochs. |
+ | --checkpoint_path | "./out/lora_instruction.pt" | string | The path to the checkpoint file on disk. |
+ | --resume | False | bool | Should we resume training from the last checkpoint? |
+ | --device | "cuda" | string | The device to run the computation on. |
+ | --seed | None | int | The seed for the random number generator. |
+
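For reference, the `--rank` and `--alpha` arguments feed the `LoRA` parametrization defined in `model.py`: the pre-trained weight `W` stays frozen and a low-rank update `alpha * (B @ A)` is learned in its place. A simplified, runnable sketch of that mechanism for a single linear layer (dropout omitted):

```python
import torch
from torch.nn import Linear, Module, Parameter
from torch.nn.utils.parametrize import register_parametrization


class LoRA(Module):
    """Low-rank additive update: W -> W + alpha * (B @ A)."""

    def __init__(self, in_features: int, out_features: int, rank: int, alpha: float):
        super().__init__()

        self.lora_a = Parameter(torch.randn(rank, in_features) / rank**0.5)
        self.lora_b = Parameter(torch.zeros(out_features, rank))  # zero init, starts as a no-op
        self.alpha = alpha

    def forward(self, weight: torch.Tensor) -> torch.Tensor:
        return weight + self.alpha * (self.lora_b @ self.lora_a)


layer = Linear(1024, 1024, bias=False)
layer.weight.requires_grad = False  # freeze the pre-trained weight

register_parametrization(layer, "weight", LoRA(1024, 1024, rank=8, alpha=1.0))

# Only the two rank-8 matrices are trainable: 2 * 1024 * 8 = 16,384 parameters.
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))
```

At inference time the update can be folded back into the frozen weight, which is what `merge_lora_parameters()` does via `remove_parametrizations(..., leave_parametrized=True)`.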
+ ### Generation Arguments
+
+ | Argument | Default | Type | Description |
+ |---|---|---|---|
+ | --checkpoint_path | "./out/checkpoint.pt" | string | The path to the checkpoint file on disk. |
+ | --lora_path | None | string | The path to the LoRA checkpoint. |
+ | --max_tokens | 500 | int | The maximum number of tokens that the model should generate per sample. |
+ | --temperature | 1.0 | float | The amount of rescaling applied to the candidate token logits before sampling; lower values are more deterministic, higher values more random. |
+ | --top_k | 500 | int | Only sample from this many candidate tokens with the highest probabilities. |
+ | --top_p | 0.9 | float | Of the `top_k` tokens, drop all but the `top_p` portion of the cumulative probability distribution. |
+ | --device | "cuda" | string | The device to run the computation on. |
+ | --seed | None | int | The seed for the random number generator. |
+
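The sampler in `GPT.generate()` applies these three knobs in a fixed order: keep the `top_k` most likely tokens, then keep the smallest prefix of them whose cumulative probability stays within `top_p` (always at least one token), then rescale the survivors by `temperature` and sample. A self-contained sketch mirroring that logic from `model.py`:

```python
import torch
from torch.nn.functional import softmax


def sample_next_token(logits: torch.Tensor, temperature: float, top_k: int, top_p: float) -> int:
    """Filter candidate tokens the way GPT.generate() does, then sample one."""

    logits, indices = torch.topk(logits, top_k, sorted=True)  # k most likely tokens

    probabilities = softmax(logits, dim=0)
    cumulative = torch.cumsum(probabilities, dim=0)

    # Keep the smallest prefix covering top_p of the probability mass (at least one token).
    threshold = max(top_p, cumulative[0].item())
    keep = cumulative <= threshold

    logits, indices = logits[keep], indices[keep]

    probabilities = softmax(logits / temperature, dim=0)  # temperature rescales the survivors

    return indices[torch.multinomial(probabilities, num_samples=1)].item()


# Toy example over a 10-token vocabulary.
print(sample_next_token(torch.randn(10), temperature=0.8, top_k=5, top_p=0.9))
```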
+ ### Beam Search Arguments
+
+ | Argument | Default | Type | Description |
+ |---|---|---|---|
+ | --checkpoint_path | "./out/checkpoint.pt" | string | The path to the checkpoint file on disk. |
+ | --lora_path | None | string | The path to the LoRA checkpoint. |
+ | --max_tokens | 200 | int | The maximum number of tokens that the model should generate per sample. |
+ | --num_candidates | 3 | int | The number of candidate sequences to output. |
+ | --beam_width | 16 | int | The number of candidate sequences to keep track of during search. |
+ | --device | "cuda" | string | The device to run the computation on. |
+ | --seed | None | int | The seed for the random number generator. |
+
+ ## References:
+ >- A. Radford, et al. Language Models are Unsupervised Multitask Learners, OpenAI, 2019.
+ >- T. Brown, et al. Language Models are Few-Shot Learners, OpenAI, 2020.
+ >- A. Kazemnejad, et al. The Impact of Positional Encoding on Length Generalization in Transformers, 37th Conference on Neural Information Processing Systems (NeurIPS 2023).
+ >- S. Rajbhandari, et al. ZeRO: Memory Optimizations Toward Training Trillion Parameter Models, 2020.
+ >- J. R. Hermans, et al. Accumulated Gradient Normalization, JMLR: Workshop and Conference Proceedings, 2017.
+ >- T. Chen, et al. Training Deep Nets with Sublinear Memory Cost, MIT, 2019.
+ >- B. Zhang, et al. Root Mean Square Layer Normalization, 33rd Conference on Neural Information Processing Systems (NeurIPS 2019).
+
+ ## License
+ The code is licensed [MIT](LICENSE) and the tutorial is licensed [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).
beam_search.py ADDED
@@ -0,0 +1,104 @@
+ import random
+
+ from os import path
+ from argparse import ArgumentParser
+
+ import torch
+
+ from torch.cuda import is_available as cuda_is_available
+
+ from model import GPT, GPTWithLoRA
+ from data import Alpaca
+
+ import tiktoken
+
+
+ def main():
+     parser = ArgumentParser(
+         description="Generate text from the model given a prompt.",
+     )
+
+     parser.add_argument("--checkpoint_path", default="./out/checkpoint.pt", type=str)
+     parser.add_argument("--lora_path", default=None, type=str)
+     parser.add_argument("--max_tokens", default=200, type=int)
+     parser.add_argument("--num_candidates", default=3, type=int)
+     parser.add_argument("--beam_width", default=16, type=int)
+     parser.add_argument("--device", default="cuda", type=str)
+     parser.add_argument("--seed", default=None, type=int)
+
+     args = parser.parse_args()
+
+     if "cuda" in args.device and not cuda_is_available():
+         raise RuntimeError("Cuda is not available.")
+
+     torch.set_float32_matmul_precision("high")
+
+     if args.seed:
+         torch.manual_seed(args.seed)
+         random.seed(args.seed)
+
+     tokenizer = tiktoken.get_encoding(Alpaca.ENCODING)
+
+     checkpoint = torch.load(
+         args.checkpoint_path, map_location=args.device, weights_only=True
+     )
+
+     model = GPT(**checkpoint["model_args"])
+
+     model = torch.compile(model)
+
+     model.load_state_dict(checkpoint["model"])
+
+     print("Model checkpoint loaded")
+
+     if args.lora_path:
+         checkpoint = torch.load(
+             args.lora_path, map_location=args.device, weights_only=True
+         )
+
+         model = GPTWithLoRA(model, **checkpoint["lora_args"])
+
+         model = torch.compile(model)
+
+         model.load_state_dict(checkpoint["lora"], strict=False)
+
+         model.merge_lora_parameters()
+
+         print("LoRA checkpoint loaded")
+
+     model.to(args.device)
+
+     model.eval()
+
+     while True:
+         prompt = input("Enter a prompt: ")
+
+         if args.lora_path:
+             prompt = Alpaca.PROMPT_TEMPLATE.format(instruction=prompt)
+
+         prompt = tokenizer.encode_ordinary(prompt)
+
+         prompt = torch.tensor(prompt, dtype=torch.int64, device=args.device)
+
+         candidates = model.beam_search(
+             prompt,
+             args.max_tokens,
+             args.num_candidates,
+             args.beam_width,
+         )
+
+         for i, candidate in enumerate(candidates, start=1):
+             print(f"Sequence #{i}")
+
+             out = tokenizer.decode(candidate.tokens.tolist()).strip()
+
+             print(out, end="\n\n")
+
+         print("\n")
+
+         if "y" not in input("Go again? (yes|no): ").lower():
+             break
+
+
+ if __name__ == "__main__":
+     main()
data.py ADDED
@@ -0,0 +1,254 @@
+ import random
+
+ from os import path
+ from copy import deepcopy
+
+ from datasets import load_dataset
+
+ import tiktoken
+
+ import numpy as np
+
+ import torch
+
+ from torch import Tensor
+ from torch.utils.data import IterableDataset, Dataset
+ from torch.nn.utils.rnn import pad_sequence
+
+ from tqdm import tqdm
+
+
+ class Openwebtext(IterableDataset):
+     DATASET_NAME = "openwebtext"
+
+     FILE_PREFIX = DATASET_NAME
+
+     TRAIN_FILENAME = f"{FILE_PREFIX}-train.bin"
+     TEST_FILENAME = f"{FILE_PREFIX}-test.bin"
+
+     TEST_SPLIT_PROPORTION = 0.005
+     NUM_SHARDS = 1024
+
+     ENCODING = "r50k_base"
+
+     PADDING_INDEX = -100
+
+     def __init__(
+         self,
+         root_path: str,
+         train: bool = True,
+         tokens_per_sample: int = 1024,
+         samples_per_epoch: int = 4096,
+         num_processes: int = 8,
+     ):
+         super().__init__()
+
+         if tokens_per_sample < 1:
+             raise ValueError(f"Tokens per sample must be greater than 0.")
+
+         if samples_per_epoch < 1:
+             raise ValueError(f"Samples per epoch must be greater than 0.")
+
+         train_path = path.join(root_path, self.TRAIN_FILENAME)
+         test_path = path.join(root_path, self.TEST_FILENAME)
+
+         self.tokenizer = tiktoken.get_encoding(self.ENCODING)
+
+         if not path.exists(train_path) or not path.exists(test_path):
+             tokenized_splits = (
+                 load_dataset(self.DATASET_NAME, num_proc=num_processes, split="train")
+                 .train_test_split(test_size=self.TEST_SPLIT_PROPORTION, shuffle=True)
+                 .map(
+                     self.tokenize,
+                     desc="Tokenizing",
+                     remove_columns=["text"],
+                     num_proc=num_processes,
+                 )
+             )
+
+             for split, dataset in tokenized_splits.items():
+                 bin_path = path.join(root_path, f"{self.FILE_PREFIX}-{split}.bin")
+
+                 total_length = np.sum(dataset["length"], dtype=np.uint64)
+
+                 bin_out = np.memmap(
+                     bin_path, dtype=np.uint16, mode="w+", shape=total_length
+                 )
+
+                 index = 0
+
+                 for i in tqdm(range(self.NUM_SHARDS), desc="Writing"):
+                     batch = dataset.shard(
+                         num_shards=self.NUM_SHARDS, index=i, contiguous=True
+                     ).with_format("numpy")
+
+                     token_batch = np.concatenate(batch["tokens"])
+
+                     n = len(token_batch)
+
+                     bin_out[index : index + n] = token_batch
+
+                     index += n
+
+                 bin_out.flush()
+
+         bin_file_path = path.join(
+             root_path, self.TRAIN_FILENAME if train else self.TEST_FILENAME
+         )
+
+         memmap = np.memmap(bin_file_path, dtype=np.uint16, mode="r")
+
+         self.memmap = memmap
+         self.max_start = len(memmap) - (tokens_per_sample + 1)
+         self.tokens_per_sample = tokens_per_sample
+         self.samples_per_epoch = samples_per_epoch
+
+     @property
+     def vocabulary_size(self) -> int:
+         return self.tokenizer.max_token_value + 1
+
+     @property
+     def eos_index(self) -> int:
+         return self.tokenizer.eot_token
+
+     def tokenize(self, sample: dict) -> dict:
+         tokens = self.tokenizer.encode_ordinary(sample["text"])
+
+         tokens.append(self.tokenizer.eot_token)
+
+         return {
+             "tokens": tokens,
+             "length": len(tokens),
+         }
+
+     def __iter__(self):
+         for i in range(self.samples_per_epoch):
+             start = random.randint(0, self.max_start)
+             end = start + self.tokens_per_sample
+
+             x = self.memmap[start:end]
+             y = self.memmap[start + 1 : end + 1]
+
+             x = x.astype(np.int64)
+             y = y.astype(np.int64)
+
+             assert x.shape == y.shape, "Sample / label shape mismatch."
+
+             yield x, y
+
+
+ class Alpaca(Dataset):
+     DATASET_NAME = "tatsu-lab/alpaca"
+
+     ENCODING = "r50k_base"
+
+     PADDING_INDEX = -100
+
+     PROMPT_TEMPLATE = (
+         "Below is an instruction that describes a task. Write a response that "
+         "appropriately completes the request.\n\n"
+         "### Instruction:\n{instruction}\n\n"
+         "### Response:\n"
+     )
+
+     PROMPT_TEMPLATE_WITH_INPUT = (
+         "Below is an instruction that describes a task, paired with an input "
+         "that provides further context. Write a response that appropriately "
+         "completes the request.\n\n"
+         "### Input:\n{input}\n\n"
+         "### Instruction:\n{instruction}\n\n"
+         "### Response:\n"
+     )
+
+     RESPONSE_TEMPLATE = "{output}"
+
+     def __init__(self, max_tokens_per_sample: int = 1024, mask_input: bool = True):
+         super().__init__()
+
+         if max_tokens_per_sample < 1:
+             raise ValueError(
+                 f"Max tokens per sample must be greater than 0, {max_tokens_per_sample} given."
+             )
+
+         self.dataset = load_dataset(self.DATASET_NAME, split="train")
+
+         self.tokenizer = tiktoken.get_encoding(self.ENCODING)
+
+         self.max_tokens_per_sample = max_tokens_per_sample
+         self.mask_input = mask_input
+
+     @property
+     def vocabulary_size(self) -> int:
+         return self.tokenizer.max_token_value + 1
+
+     @property
+     def eos_index(self) -> int:
+         return self.tokenizer.eot_token
+
+     def collate(self, batch: list) -> tuple[Tensor, Tensor]:
+         """Custom collate function adds left padding to batched samples."""
+
+         sample, labels = [], []
+
+         for x, y in batch:
+             sample.append(x)
+             labels.append(y)
+
+         x = pad_sequence(
+             sample,
+             batch_first=True,
+             padding_value=self.PADDING_INDEX,
+             padding_side="left",
+         )
+         y = pad_sequence(
+             labels,
+             batch_first=True,
+             padding_value=self.PADDING_INDEX,
+             padding_side="left",
+         )
+
+         assert x.shape == y.shape, "Sample / label batch shape mismatch."
+
+         return x, y
+
+     def __getitem__(self, index: int):
+         row = self.dataset[index]
+
+         has_input = len(row["input"]) > 0
+
+         if has_input:
+             text = self.PROMPT_TEMPLATE_WITH_INPUT.format(
+                 input=row["input"], instruction=row["instruction"]
+             )
+         else:
+             text = self.PROMPT_TEMPLATE.format(instruction=row["instruction"])
+
+         tokens = self.tokenizer.encode_ordinary(text)
+
+         sample = deepcopy(tokens)
+
+         if self.mask_input:
+             labels = [self.PADDING_INDEX] * len(tokens)
+         else:
+             labels = deepcopy(tokens)
+
+         text = self.RESPONSE_TEMPLATE.format(output=row["output"])
+
+         tokens = self.tokenizer.encode_ordinary(text)
+
+         tokens.append(self.tokenizer.eot_token)
+
+         sample.extend(tokens)
+         labels.extend(tokens)
+
+         end = min(len(sample), self.max_tokens_per_sample + 1)
+
+         x = torch.tensor(sample[0 : end - 1], dtype=torch.int64)
+         y = torch.tensor(labels[1:end], dtype=torch.int64)
+
+         assert x.shape == y.shape, "Sample / label shape mismatch."
+
+         return x, y
+
+     def __len__(self):
+         return len(self.dataset)
dataset/.gitignore ADDED
@@ -0,0 +1,2 @@
+ *
+ !.gitignore
generate.py ADDED
@@ -0,0 +1,100 @@
+ import random
+
+ from os import path
+ from argparse import ArgumentParser
+
+ import torch
+
+ from torch.cuda import is_available as cuda_is_available
+
+ from model import GPT, GPTWithLoRA
+ from data import Alpaca
+
+ import tiktoken
+
+
+ def main():
+     parser = ArgumentParser(
+         description="Generate text from the model given a prompt.",
+     )
+
+     parser.add_argument("--checkpoint_path", default="./out/checkpoint.pt", type=str)
+     parser.add_argument("--lora_path", default=None, type=str)
+     parser.add_argument("--max_tokens", default=1000, type=int)
+     parser.add_argument("--temperature", default=1.0, type=float)
+     parser.add_argument("--top_k", default=500, type=int)
+     parser.add_argument("--top_p", default=0.9, type=float)
+     parser.add_argument("--device", default="cuda", type=str)
+     parser.add_argument("--seed", default=None, type=int)
+
+     args = parser.parse_args()
+
+     if "cuda" in args.device and not cuda_is_available():
+         raise RuntimeError("Cuda is not available.")
+
+     torch.set_float32_matmul_precision("high")
+
+     if args.seed:
+         torch.manual_seed(args.seed)
+         random.seed(args.seed)
+
+     tokenizer = tiktoken.get_encoding(Alpaca.ENCODING)
+
+     checkpoint = torch.load(
+         args.checkpoint_path, map_location=args.device, weights_only=True
+     )
+
+     model = GPT(**checkpoint["model_args"])
+
+     model = torch.compile(model)
+
+     model.load_state_dict(checkpoint["model"])
+
+     print("Model checkpoint loaded")
+
+     if args.lora_path:
+         checkpoint = torch.load(
+             args.lora_path, map_location=args.device, weights_only=True
+         )
+
+         model = GPTWithLoRA(model, **checkpoint["lora_args"])
+
+         model = torch.compile(model)
+
+         model.load_state_dict(checkpoint["lora"], strict=False)
+
+         model.merge_lora_parameters()
+
+         print("LoRA checkpoint loaded")
+
+     model.to(args.device)
+
+     model.eval()
+
+     while True:
+         prompt = input("Enter a prompt: ")
+
+         if args.lora_path:
+             prompt = Alpaca.PROMPT_TEMPLATE.format(instruction=prompt)
+
+         prompt = tokenizer.encode_ordinary(prompt)
+
+         prompt = torch.tensor(prompt, dtype=torch.int64, device=args.device)
+
+         for token in model.generate(
+             prompt, args.max_tokens, args.temperature, args.top_k, args.top_p
+         ):
+             out = tokenizer.decode_single_token_bytes(token).decode(
+                 "utf-8", errors="replace"
+             )
+
+             print(out, end="", flush=True)
+
+         print("\n")
+
+         if "y" not in input("Go again? (yes|no): ").lower():
+             break
+
+
+ if __name__ == "__main__":
+     main()
instruction-tune.py ADDED
@@ -0,0 +1,197 @@
+ import random
+
+ from argparse import ArgumentParser
+
+ import torch
+
+ from torch.utils.data import DataLoader
+ from torch.optim import Adafactor
+ from torch.amp import autocast
+ from torch.cuda import is_available as cuda_is_available, is_bf16_supported
+ from torch.utils.data import random_split
+
+ from torchmetrics.text import Perplexity
+
+ from model import GPT, GPTWithLoRA
+ from data import Alpaca
+
+ import tiktoken
+
+ from tqdm import tqdm
+
+
+ def main():
+     parser = ArgumentParser(description="Instruction-tune the foundation model.")
+
+     parser.add_argument("--base_model_path", default="./out/checkpoint.pt", type=str)
+     parser.add_argument("--batch_size", default=1, type=int)
+     parser.add_argument("--gradient_accumulation_steps", default=128, type=int)
+     parser.add_argument("--learning_rate", default=1e-2, type=float)
+     parser.add_argument("--mask_input", default=True, type=bool)
+     parser.add_argument("--rank", default=8, type=int)
+     parser.add_argument("--alpha", default=1.0, type=float)
+     parser.add_argument("--dropout", default=0.05, type=float)
+     parser.add_argument("--num_epochs", default=4, type=int)
+     parser.add_argument("--eval_interval", default=1, type=int)
+     parser.add_argument("--checkpoint_interval", default=1, type=int)
+     parser.add_argument(
+         "--checkpoint_path", default="./out/lora_instruction.pt", type=str
+     )
+     parser.add_argument("--resume", action="store_true")
+     parser.add_argument("--device", default="cuda", type=str)
+     parser.add_argument("--seed", default=None, type=int)
+
+     args = parser.parse_args()
+
+     if "cuda" in args.device and not cuda_is_available():
+         raise RuntimeError("Cuda is not available.")
+
+     torch.set_float32_matmul_precision("high")
+
+     dtype = (
+         torch.bfloat16
+         if "cuda" in args.device and is_bf16_supported()
+         else torch.float32
+     )
+
+     forward_context = autocast(device_type=args.device, dtype=dtype)
+
+     if args.seed:
+         torch.manual_seed(args.seed)
+         random.seed(args.seed)
+
+     checkpoint = torch.load(
+         args.base_model_path, map_location=args.device, weights_only=True
+     )
+
+     model_args = checkpoint["model_args"]
+
+     dataset = Alpaca(model_args["block_size"], args.mask_input)
+
+     training, testing = random_split(dataset, (0.9, 0.1))
+
+     train_loader = DataLoader(
+         training,
+         collate_fn=dataset.collate,
+         batch_size=args.batch_size,
+         pin_memory="cpu" not in args.device,
+         shuffle=True,
+     )
+     test_loader = DataLoader(
+         testing,
+         collate_fn=dataset.collate,
+         batch_size=args.batch_size,
+         pin_memory="cpu" not in args.device,
+         shuffle=False,
+     )
+
+     model = GPT(**model_args)
+
+     model = torch.compile(model)
+
+     model.load_state_dict(checkpoint["model"])
+
+     print("Model checkpoint loaded")
+
+     lora_args = {
+         "rank": args.rank,
+         "alpha": args.alpha,
+         "dropout": args.dropout,
+     }
+
+     model = GPTWithLoRA(model, **lora_args).to(args.device)
+
+     print("Compiling model")
+     model.compile()
+
+     print(f"Model has {model.num_trainable_params:,} trainable parameters")
+
+     optimizer = Adafactor(model.parameters(), lr=args.learning_rate)
+
+     perplexity_metric = Perplexity(ignore_index=dataset.PADDING_INDEX).to(args.device)
+
+     starting_epoch = 1
+
+     if args.resume:
+         checkpoint = torch.load(
+             args.checkpoint_path, map_location=args.device, weights_only=True
+         )
+
+         model.load_state_dict(checkpoint["lora"], strict=False)
+         optimizer.load_state_dict(checkpoint["optimizer"])
+         starting_epoch += checkpoint["epoch"]
+
+         print("Previous checkpoint resumed successfully")
+
+     model.train()
+
+     print("Instruction-tuning ...")
+
+     for epoch in range(starting_epoch, args.num_epochs + 1):
+         total_cross_entropy, total_batches = 0.0, 0
+
+         for step, (x, y) in enumerate(
+             tqdm(train_loader, desc=f"Epoch {epoch}", leave=False), start=1
+         ):
+             x = x.to(args.device, non_blocking=True)
+             y = y.to(args.device, non_blocking=True)
+
+             with forward_context:
+                 y_pred, loss = model(x, y)
+
+                 scaled_loss = loss / args.gradient_accumulation_steps
+
+             scaled_loss.backward()
+
+             total_cross_entropy += loss.item()
+
+             if step % args.gradient_accumulation_steps == 0:
+                 optimizer.step()
+
+                 optimizer.zero_grad(set_to_none=True)
+
+             total_batches += 1
+
+         average_cross_entropy = total_cross_entropy / total_batches
+
+         print(
+             f"Epoch {epoch}: Cross Entropy: {average_cross_entropy:.5f}",
+         )
+
+         if epoch % args.eval_interval == 0:
+             model.eval()
+
+             for x, y in tqdm(test_loader, desc="Testing", leave=False):
+                 x = x.to(args.device, non_blocking=True)
+                 y = y.to(args.device, non_blocking=True)
+
+                 with torch.no_grad():
+                     y_pred, _ = model(x)
+
+                 perplexity_metric.update(y_pred, y)
+
+             perplexity = perplexity_metric.compute()
+
+             print(f"Perplexity: {perplexity:.3f}")
+
+             perplexity_metric.reset()
+
+             model.train()
+
+         if epoch % args.checkpoint_interval == 0:
+             checkpoint = {
+                 "epoch": epoch,
+                 "lora_args": lora_args,
+                 "lora": model.state_dict(),
+                 "optimizer": optimizer.state_dict(),
+             }
+
+             torch.save(checkpoint, args.checkpoint_path)
+
+             print("Checkpoint saved")
+
+     print("Done!")
+
+
+ if __name__ == "__main__":
+     main()
model.py ADDED
@@ -0,0 +1,499 @@
+ from math import sqrt, exp
+ from dataclasses import dataclass, field
+ from functools import partial
+ from typing import Iterator, Self
+
+ import torch
+
+ from torch import Tensor
+ from torch.nn import (
+     Module,
+     ModuleList,
+     Sequential,
+     Embedding,
+     MultiheadAttention,
+     Linear,
+     RMSNorm,
+     GELU,
+     Dropout1d,
+     CrossEntropyLoss,
+     Parameter,
+     Buffer,
+ )
+
+ from torch.nn.functional import softmax, log_softmax
+ from torch.nn.utils.parametrize import register_parametrization, remove_parametrizations
+ from torch.utils.checkpoint import checkpoint
+
+
+ class GPT(Module):
+     """A generative pre-trained transformer."""
+
+     def __init__(
+         self,
+         block_size: int = 1024,
+         embedding_dimensions: int = 1024,
+         num_heads: int = 16,
+         num_layers: int = 24,
+         dropout: float = 0.1,
+         activation_checkpointing: bool = False,
+         vocabulary_size: int = 50257,
+         padding_index: int = -100,
+         eos_index: int = 50256,
+     ):
+         super().__init__()
+
+         if vocabulary_size <= 0:
+             raise ValueError(
+                 f"Vocabulary size must be greater than 0, {vocabulary_size} given."
+             )
+
+         if num_layers <= 0:
+             raise ValueError(f"Num layers must be greater than 0, {num_layers} given.")
+
+         token_embeddings = Embedding(
+             vocabulary_size, embedding_dimensions, padding_idx=padding_index
+         )
+
+         output_layer = Linear(embedding_dimensions, vocabulary_size, bias=False)
+
+         token_embeddings.weight = output_layer.weight  # Tie weights
+
+         self.token_embeddings = token_embeddings
+
+         causal_mask = torch.full((block_size, block_size), float("-inf"))
+         causal_mask = torch.triu(causal_mask, diagonal=1)
+
+         self.causal_mask = Buffer(causal_mask, persistent=False)
+
+         self.body = ModuleList(
+             [
+                 CausalSelfAttentionBlock(
+                     embedding_dimensions, block_size, num_heads, dropout
+                 )
+                 for _ in range(num_layers)
+             ]
+         )
+
+         if activation_checkpointing:
+             self.checkpoint = partial(checkpoint, use_reentrant=False)
+         else:
+             self.checkpoint = lambda layer, x, attention_mask: layer(x, attention_mask)
+
+         self.output_norm = RMSNorm(embedding_dimensions)
+         self.output_layer = output_layer
+
+         self.loss_function = CrossEntropyLoss(ignore_index=padding_index)
+
+         self.vocabulary_size = vocabulary_size
+         self.block_size = block_size
+         self.eos_index = eos_index
+
+     @property
+     def num_trainable_params(self) -> int:
+         return sum(param.numel() for param in self.parameters() if param.requires_grad)
+
+     def forward(
+         self, x: Tensor, y: Tensor | None = None
+     ) -> tuple[Tensor, Tensor | None]:
+         z = self.token_embeddings(x)
+
+         b, t = x.size()
+
+         causal_mask = self.causal_mask[:t, :t]
+
+         for layer in self.body:
+             z = self.checkpoint(layer, z, causal_mask)
+
+         z = self.output_norm(z)
+         z = self.output_layer(z)
+
+         if y is not None:
+             y_pred = z.view(-1, z.size(-1))
+             labels = y.view(-1)
+
+             loss = self.loss_function(y_pred, labels)
+         else:
+             loss = None
+
+         return z, loss
+
+     @torch.no_grad()
+     def generate(
+         self,
+         prompt: Tensor,
+         max_tokens: int = 500,
+         temperature: float = 1.0,
+         top_k: int = 500,
+         top_p: float = 0.9,
+     ) -> Iterator:
+         """
+         Given a prompt, sample the next {max_tokens} tokens from the model weighted
+         by their predicted probabilities.
+         """
+
+         if max_tokens <= 0:
+             raise ValueError(f"Max tokens must be greater than 0, {max_tokens} given.")
+
+         if temperature <= 0:
+             raise ValueError(
+                 f"Temperature must be greater than 0, {temperature} given."
+             )
+
+         if top_k <= 0 or top_k > self.vocabulary_size:
+             raise ValueError(
+                 f"Top k must be between 1 and {self.vocabulary_size}, {top_k} given."
+             )
+
+         if top_p <= 0.0 or top_p > 1.0:
+             raise ValueError(f"Top p must be between 0 and 1, {top_p} given.")
+
+         context_window = prompt
+
+         for _ in range(max_tokens):
+             context_window = context_window[-self.block_size :]
+
+             y_pred, _ = self.forward(context_window.unsqueeze(0))
+
+             logits = y_pred[0, -1, :]
+
+             logits, indices = torch.topk(logits, top_k, sorted=True)
+
+             probabilities = softmax(logits, dim=0)
+
+             cumulative_probability_mass = torch.cumsum(probabilities, dim=0)
+
+             min_probability_mass = cumulative_probability_mass[0]
+
+             threshold_p = max(top_p, min_probability_mass.item())
+
+             selected_indices = cumulative_probability_mass <= threshold_p
+
+             logits = logits[selected_indices]
+             indices = indices[selected_indices]
+
+             logits /= temperature
+
+             probabilities = softmax(logits, dim=0)
+
+             offset = torch.multinomial(probabilities, num_samples=1).squeeze(0)
+
+             next_token = indices[offset]
+
+             if next_token == self.eos_index:
+                 break
+
+             yield next_token
+
+             context_window = torch.cat((context_window, next_token.unsqueeze(0)))
+
+     @torch.no_grad()
+     def beam_search(
+         self,
+         prompt: Tensor,
+         max_tokens: int = 200,
+         num_candidates: int = 3,
+         beam_width: int = 16,
+     ) -> list:
+         """
+         Given a prompt, return the {num_candidates} highest probability sequences.
+         """
+
+         if max_tokens <= 0:
+             raise ValueError(f"Max tokens must be greater than 0, {max_tokens} given.")
+
+         if num_candidates <= 0:
+             raise ValueError(
+                 f"Num candidates must be greater than 0, {num_candidates} given."
+             )
+
+         if beam_width <= 0:
+             raise ValueError(f"Beam width must be greater than 0, {beam_width} given.")
+
+         @dataclass(order=True)
+         class Candidate:
+             log_probability: float
+             tokens: Tensor
+
+             @property
+             def priority(self) -> float:
+                 return self.log_probability
+
+         sort_candidates = partial(
+             sorted,
+             key=lambda candidate: candidate.priority,
+             reverse=True,
+         )
+
+         candidates, completed = [], []
+
+         tokens = torch.tensor([], dtype=prompt.dtype).to(prompt.device)
+
+         candidates.append(Candidate(0.0, tokens))
+
+         while len(candidates) > 0:
+             candidate = candidates.pop()
+
+             if len(completed) >= num_candidates:
+                 completed = sort_candidates(completed)
+
+                 completed = completed[:num_candidates]
+
+                 worst_candidate = completed[-1]
+
+                 if candidate.log_probability < worst_candidate.log_probability:
+                     break
+
+             if len(candidate.tokens) > 0 and candidate.tokens[-1] == self.eos_index:
+                 candidate.tokens = candidate.tokens[:-1]
+
+                 completed.append(candidate)
+
+                 continue
+
+             if len(candidate.tokens) >= max_tokens:
+                 completed.append(candidate)
+
+                 continue
+
+             context_window = torch.cat((prompt, candidate.tokens))
+
+             context_window = context_window[-self.block_size :]
+
+             y_pred, _ = self.forward(context_window.unsqueeze(0))
+
+             logits = y_pred[0, -1, :]
+
+             logits, indices = torch.topk(logits, beam_width, sorted=False)
+
+             log_probabilities = log_softmax(logits, dim=0)
+
+             for log_probability, index in zip(log_probabilities, indices):
+                 log_probability = candidate.log_probability + log_probability
+
+                 tokens = torch.cat((candidate.tokens, index.unsqueeze(0)))
+
+                 candidates.append(Candidate(log_probability, tokens))
+
+             candidates = sort_candidates(candidates)
+
+             candidates = candidates[:beam_width]
+
+         return completed
+
+
+ class GPTWithLoRA(Module):
+     """
+     A wrapper for pre-trained GPT models that applies a LoRA reparameterization
+     to the intermediate layers of the network.
+     """
+
+     def __init__(
+         self, model: GPT, rank: int = 8, alpha: float = 1.0, dropout: float = 0.05
+     ):
+         super().__init__()
+
+         if rank <= 0:
+             raise ValueError(f"Rank must be greater than 0, {rank} given.")
+
+         if alpha <= 0.0:
+             raise ValueError(f"Alpha must be greater than 0, {alpha} given.")
+
+         for param in model.parameters():
+             param.requires_grad = False
+
+         for module in model.body:
+             out_features, in_features = module.attention.in_proj_weight.shape
+
+             register_parametrization(
+                 module.attention,
+                 "in_proj_weight",
+                 LoRA(in_features, out_features, rank, alpha, dropout),
+             )
+
+             out_features, in_features = module.attention.out_proj.weight.shape
+
+             register_parametrization(
+                 module.attention.out_proj,
+                 "weight",
+                 LoRA(in_features, out_features, rank, alpha, dropout),
+             )
+
+             for layer in module.mlp.layers:
+                 if isinstance(layer, Linear):
+                     register_parametrization(
+                         layer,
+                         "weight",
+                         LoRA.from_linear(layer, rank, alpha, dropout),
+                     )
+
+         self.model = model
+
+     @property
+     def num_trainable_params(self) -> int:
+         return self.model.num_trainable_params
+
+     def state_dict(self):
+         return {
+             name: module
+             for name, module in super().state_dict().items()
+             if "lora" in name
+         }
+
+     def merge_lora_parameters(self):
+         """Merge the LoRA parameters with the original parameters."""
+
+         for module in self.model.modules():
+             if hasattr(module, "parametrizations"):
+                 lora_params = [name for name in module.parametrizations.keys()]
+
+                 for name in lora_params:
+                     remove_parametrizations(module, name, leave_parametrized=True)
+
+     def forward(
+         self, x: Tensor, y: Tensor | None = None
+     ) -> tuple[Tensor, Tensor | None]:
+         return self.model.forward(x, y)
+
+     def generate(
+         self,
+         prompt: Tensor,
+         max_tokens: int = 500,
+         temperature: float = 1.0,
+         top_k: int = 500,
+         top_p: float = 0.9,
+     ) -> Iterator:
+         return self.model.generate(prompt, max_tokens, temperature, top_k, top_p)
+
+     def beam_search(
+         self,
+         prompt: Tensor,
+         max_tokens: int = 200,
+         num_candidates: int = 3,
+         beam_width: int = 16,
+     ) -> list:
+         return self.model.beam_search(prompt, max_tokens, num_candidates, beam_width)
+
+
+ class CausalSelfAttentionBlock(Module):
+     """Causal self-attention block with residual connections."""
+
+     def __init__(
+         self, embedding_dimensions: int, block_size: int, num_heads: int, dropout: float
+     ):
+         super().__init__()
+
+         if embedding_dimensions <= 0:
+             raise ValueError(
+                 f"Embedding dimensions must be greater than 0, {embedding_dimensions} given."
+             )
+
+         if block_size <= 0:
+             raise ValueError(f"Block size must be greater than 0, {block_size} given.")
+
+         if num_heads <= 0:
+             raise ValueError(f"Num heads must be greater than 0, {num_heads} given.")
+
+         if dropout < 0 or dropout > 1:
+             raise ValueError(f"Dropout must be between 0 and 1, {dropout} given")
+
+         self.norm1 = RMSNorm(embedding_dimensions)
+         self.attention = MultiheadAttention(
+             embedding_dimensions,
+             num_heads,
+             batch_first=True,
+             dropout=dropout,
+             bias=False,
+         )
+
+         self.norm2 = RMSNorm(embedding_dimensions)
+         self.mlp = MLP(embedding_dimensions, 4 * embedding_dimensions, dropout)
+
+     def forward(self, x: Tensor, attention_mask: Tensor) -> Tensor:
+         z = self.norm1(x)
+         z, _ = self.attention(z, z, z, attn_mask=attention_mask, is_causal=True)
+
+         z = x + z  # Residual connection
+
+         x = z
+
+         z = self.norm2(x)
+         z = self.mlp(z)
+
+         z = x + z  # Residual connection
+
+         return z
+
+
+ class MLP(Module):
+     """A two-layer fully-connected network with dropout."""
+
+     def __init__(
+         self, embedding_dimensions: int, hidden_dimensions: int, dropout: float
+     ):
+         super().__init__()
+
+         if embedding_dimensions <= 0:
+             raise ValueError(
+                 f"Embedding dimensions must be greater than 0, {embedding_dimensions} given."
+             )
+
+         if hidden_dimensions <= 0:
+             raise ValueError(
+                 f"Hidden dimensions must be greater than 0, {hidden_dimensions} given."
+             )
+
+         self.layers = Sequential(
+             Linear(embedding_dimensions, hidden_dimensions, bias=False),
+             GELU(),
+             Linear(hidden_dimensions, embedding_dimensions, bias=False),
+         )
+
+         self.dropout = Dropout1d(p=dropout)
+
+     def forward(self, x: Tensor) -> Tensor:
+         return self.dropout(self.layers(x))
+
+
+ class LoRA(Module):
+     """Rank decomposition transformation."""
+
+     @classmethod
+     def from_linear(
+         cls, linear: Linear, rank: int, alpha: float, dropout: float
+     ) -> Self:
+         out_features, in_features = linear.weight.shape
+
+         return cls(in_features, out_features, rank, alpha, dropout)
+
+     def __init__(
+         self,
+         in_features: int,
+         out_features: int,
+         rank: int,
+         alpha: float,
+         dropout: float,
+     ):
+         super().__init__()
+
+         if rank <= 0:
+             raise ValueError(f"Rank must be greater than 0, {rank} given.")
+
+         if alpha <= 0.0:
+             raise ValueError(f"Alpha must be greater than 0, {alpha} given.")
+
+         std_dev = 1.0 / sqrt(rank)
+
+         self.lora_a = Parameter(torch.randn(rank, in_features) * std_dev)
+         self.lora_b = Parameter(torch.zeros(out_features, rank))
+
+         self.dropout = Dropout1d(p=dropout)
+
+         self.alpha = alpha
+
+     def forward(self, x: Tensor) -> Tensor:
+         z = self.lora_b @ self.dropout(self.lora_a)
+
+         z *= self.alpha
+
+         return x + z
model_sizing.ipynb ADDED
@@ -0,0 +1,330 @@
+ {
+  "cells": [
+   {
+    "cell_type": "code",
+    "execution_count": 63,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "block_size = 1024\n",
+     "vocabulary_size = 50257\n",
+     "embedding_dimensions = 1024\n",
+     "num_hidden_layers = 32"
+    ]
+   },
+   {
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "First, we'll estimate the total number of parameters in the network."
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": 64,
+    "metadata": {},
+    "outputs": [
+     {
+      "data": {
+       "image/png": "<base64-encoded matplotlib figure omitted>",
+ [remainder of the model_sizing.ipynb diff is truncated in this page capture]
x3S+nzt3LubOnZsCVREREVFalOZmSxERERH9F4YbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlIVhhsiIiJSFYYbIiIiUhWGGyIiIlKVVBFuFi1aBAcHB5ibm6NixYr4+++//7P9li1bULx4cZibm8PJyQl79uxJoUqJiIgotTN4uNm0aRMGDx6MCRMm4NKlS3B2doaHhwdevHiRYPvTp0+jXbt26N69Oy5fvgxPT094enri+vXrKVw5ERERpUYGDzdz5sxBz5494eXlhZIlS2Lp0qXIlCkTVqxYkWD7+fPno379+hg2bBhKlCiBKVOmoGzZsli4cGEKV05ERESpkUHDTWRkJC5evIg6deoo24yMjFCnTh2cOXMmwX3OnDmj0x4APDw8PtueiIiI0hcTQ/7wly9fIiYmBra2tjrbbW1tcevWrQT3CQoKSrB9UFBQgu0jIiIQERGhfB8SEgIACA0N/ZbSP0sb8T5ZjqtGSflvwNc98fi6G0ZSvu6OE/Yn2bHU7vokD0OXQEkk9ndIRL7Y1qDhJiV4e3tj0qRJ8bbb29sboBqKy2qeoStIn/i6GwZfd8Pg664+YWFhsLKy+s82Bg03OXLkgLGxMZ4/f66z/fnz57Czs0twHzs7O73ajxo1CoMHD1a+12q1eP36NbJnzw6NRvONzyD1Cw0Nhb29PQICAmBpaWnoctINvu6GwdfdMPi6G0Z6e91FBGFhYcidO/cX2xo03GTIkAGurq44fPgwPD09AXwKH4cPH0a/fv0S3MfNzQ2HDx/GwIEDlW0HDx6Em5tbgu3NzMxgZmams83a2jopyk9TLC0t08V//tSGr7th8HU3DL7uhpGeXvcv9djEMvhlqcGDB6NLly4oV64cKlSogHnz5iE8PBxeXl4AgM6dOyNPnjzw9vYGAAwYMAA1atTA7Nmz0bBhQ2zcuBEXLlzAsmXLDPk0iIiIKJUweLhp06YNgoODMX78eAQFBaFMmTLYt2+fMmjY398fRkb/m9RVuXJlrF+/HmPHjsXo0aNRpEgR+Pr6wtHR0VBPgYiIiFIRg4cbAOjXr99nL0MdO3Ys3rZWrVqhVatWyVyVOpiZmWHChAnxLs1R8uLrbhh83Q2Dr7th8HX/PI0kZk4VERERURph8BWKiYiIiJISww0RERGpCsMNERERqQrDDREREX1RWhqiy3BDBqXVanW+T0u/PERE6YVWq1VW9X/w4IGBq/kyhhsyGK1Wq6xhdO7cOXz8+DFd3BKDUhYDc9rGf7/UIfZcPXLkSIwbNw4vXrwwcEX/jeGGDCJusBk3bhz69euHjRs3QqvV8mRGSeb+/fuYMmUKvLy8sG7dOgQEBBi6JPqC2N//jx8/6vQWkGHEPR+fP38ee/fuRf/+/WFjY2PAqr6M69yQQY0ePRrLli3DH3/8AUdHR+TIkcPQJaUKIgKNRoOAgABkyJABRkZGyJkzp04opP925coV1KtXD8WKFUNgYCAeP36Mrl27YurUqan+xJxexf6/37VrF9asWYOQkBAMHjwYFSpUSJf3BExNfv75Zzx48AAxMTFp4nZHPEuSwVy7dg07d+6Er68v3N3dYWxsjFu3bmH27Nm4ceMGgPTbJa3RaLBt2zZUr14d1atXR6NGjXDixAkYGRnFG6dE8V2/fh2VK1fGjz/+iIMHD+Lu3bsYN24cVq1ahatXrxq6PPoMjUaDv/76C+3bt0fWrFkRHR2Ntm3bYvHixQgMDDR0eena06dP4ePjg7///jvVX5ICGG4oBf37TdnY2BjPnz/H+/fvcf36dYwZMwaenp5YtGgRKlSogGvXrqW7LunYMOfv74/evXtj2LBhGDZsGIoXL466devi0KFDDDhf8OrVK7i7u8PV1RXDhw9XlqYfNGgQbGxscP/+fQNXSP8lKCgIQ4cOxdKlS3Ho0CHl78uXL2fASSEJnV/mzZuHyZMn4+rVq1i3bh3Cw8MNUFnipYp7S1H6EHs55fr16yhQoADy5cuHunXrolu3bnjz5g28vLwwdepUtGzZEk5OTti1axecnJwMXHXK0mg0OHbsGB48eICePXuiT58+AICGDRvC3Nwc9evXx969e1G3bl1eovqM7Nmzo1WrVjh69Ch8fHzQpk0bJdS8ePEC+fLlM3SJFEfspahLly7h0aNHOHv2LOzt7ZXHR48eDQBYsmQJjI2N0aVLF+TJk8dQ5ape3PPKhQsX8OHDB2i1WtSoUQNjx45FWFgYhg8fjkyZMqFTp07IlCmTgSv+DCFKZjExMcrft2zZIoUKFZI1a9aIiEhgYKDs27dPTp48qbT78OGDVKpUSVatWmWQeg0pLCxMWrZsKRqNRpo1a6bz2LNnz+T7778Xc3Nz2bVrl4EqTN3i/l/r37+/ODg4yKpVq+TSpUuSN29e6d+/vwGro8/Ztm2bZMiQQYoXLy4ajUbc3d3l3r17Om1++uknyZgxo8ycOVOio6MNVKm6abVa5e8jR46UUqVKiYODg5QvX15q1aqlPDZq1CgxNTWVZcuWSVhYmCFK/SKGG0pWcd9s1q9fLzNmzBBjY2MpUqSIbNy4Ud6/f688/uHDB7l9+7Y0atRIypYtK1FRUYYo2eDOnz8vHTt2FHNzc7l48aKI/O+kExgYKO3atZMcOXJIeHi4zsmIPon7xvfjjz+Kvb29WFtbi5eXl7I97v9LMqyAgADp3Lmz+Pj4SGhoqMyZM0ccHR1lwIAB8QLOnDlz5M6dOwaqNP2YPXu2ZM+eXc6cOSNRUVEyZcoU0Wg0cujQIaXNyJEjRaPRiK+vrwEr/TyGG0oRY8aMkWzZsslvv/0mixcvlkqVKknBggVl3bp18uHDBxERWbVqldSvX1+qVq0qkZGRIiKq/4QWG06io6MlIiJC2X779m1p3Lix2Nraxgs4QUFB8uzZs5QvNpWL+38l7t9Hjx4tlpaWMm/ePHn9+rUhSqPP8PPzk6ZNm0qtWrXk8ePHyvb58+dLmTJlpF+/fnL//n0DVpj+REZGSufOnWX58uUiIrJ9+3axtLSUZcuWiYhIaGio0nbx4sWp9kMoww0lO39/fyXIxFW/fn2xt7eXDRs2SExMjNy6dUu2bNmivDGl1l+apBIbVvbu3Stt2rSRqlWryg8
//CDnz58XEZF79+5Js2bNxM7OTi5fvqyzD33y6NEjmTBhgnz8+FFEdHtk4gac/v37S4ECBWThwoUSHByc4nXS/8T+H75z5454e3tL2bJlxdraWq5du6bTbv78+VK+fHnp2rWrPHz40ACVpg//PqdER0dLhQoVZMWKFbJv3z6xsLCQxYsXi8inc/KsWbPinctT47maoxEp2ZmamkJEkCFDBgCfFucCgL1798LMzAwzZszAH3/8gcKFC6Nly5YwNjZGTEwMTEzUPd5do9Fg586daNq0KaytrVGlShUcPnwYAwYMwIYNG1CoUCFMnz4d1atXR4UKFXD16tV0N3vsS7Zs2YK1a9diwoQJiIyM1JlJZmxsjOjoaADA/Pnz0axZM4waNQrbtm3jbDMD0mg0+PPPP1GvXj00aNAAw4YNg4ODA0aMGIGbN28q7fr3748WLVrgwYMHyJgxowErVrfYc8ratWtx6tQpGBsbo0aNGli/fj1at26Nn3/+GT/88AMA4OXLlzh69Cjevn2rc4xUea42dLoidUloLEN0dLQ4OztL69atlW2xl2AaN24sBQoUEEdHR7ly5cpnj6EGcbtztVqtvHnzRqpUqSLTpk1Ttr9+/VqaN28ubm5uyifZixcvSufOneX27dspXnNq9fDhQzl8+LBER0fLtGnTpFy5cjJ06FDl/9XnenCmTJnCMRsGEttDEBYWJt27d5dZs2Ypj61Zs0bc3d2lRYsWcuvWLZ39eCkx+T1+/FiKFi0qixYtEhGRU6dOiY2NjVSsWFE57zx79kwaNGggbm5uaWK4AMMNJZm4byg3b96UwMBAefnypYiIHD58WCwsLOTHH3/U2adTp05y4cIFKVWqlE74UZuff/5ZJkyYoHNSeP/+vZQuXVoWLlwoIqKMM3rz5o04ODjI0KFDlbZxx+Okd0+fPpUcOXJIkSJFZPv27RITEyOTJ0/+bMCJiIiQ0aNHy9y5cw1YNYmI/PXXX1K8eHGpUqWKnDx5Uuex33//Xdzd3aVNmzZy/fp1A1WYfo0YMULy5cunhMlDhw5Jzpw5xdXVVYoVKyaVK1eWcuXKpZnxkKmwL4nSqti1EUaNGoU//vgD4eHhqF+/Pnr27IlatWph4cKF6Nu3Ly5fvoxChQrh9u3beP36NVxdXeHh4YErV64Y+Bkkn6ioKLRr1w7GxsaIjIxEhgwZEBMTAwBKV7yJiQmioqJgbW2NunXr6iw2F3tJj4A7d+7g9evXKFCgAH799VdER0djzJgxAIAdO3ZgzJgxmDZtGjJkyIAPHz5g2LBhWLp0Kfz8/AxbOKFw4cKwtLTE6dOnlVVuY9dV6dSpE4yMjDBr1izMnDkTy5cvh6mpqYErVp9/r48VFRUFU1NT9OjRA0ePHoWvry+6du2K2rVr4+jRo7h+/Tru37+PEiVKoEmTJsrl3lR5KSouQ6crStu0Wq3OgLTdu3dL3rx5Ze/evTJz5kxp2LChVKlSRU6dOiUiIteuXZP27dtL+/btpVevXsqngBYtWkjXrl0lJiZG1YNmT548KcOHD5eAgAAREdm8ebMYGxvLggULdNo1a9ZMevXqperX4lt069ZNypQpIy1atJAaNWqIr69vvB6c0NBQGTJkiGTKlEmZcUaG9/z5cylfvrwUK1ZMuUQY9//5xo0b5dGjR4YqL91YvXq1PHz4UJmtqtVqpUWLFlK7du3/3C+199jEYrihJLNjxw4ZMGCA/PLLL8q2w4cPS7NmzaRy5cpy5MiRePuEhITIkCFDJGfOnPLPP/+kZLnJKu4lurgzCWbMmCGFCxeWMWPGSGBgoIiITJ06VYyMjKRbt24yefJk6dOnj1hYWMiNGzdSvO7ULnZW1O7du6Vr166yf/9+ad68uVSpUkXnElXFihWlUKFCOmsFUcqKOyvq1KlTcuXKFeUydXBwsDg7O4uTk5My1ZtBPuXcuXNHypQpI9bW1tKzZ0/ZsmWLsr1IkSKyevVqA1f47Rhu6Ku0bdtWWQdBROTq1atSoUIFsba2lhkzZui0PXLkiDRv3lyqVasmu3fvVrbfu3dPpkyZIkWLFlWmOqtJQECAcsLesWOHzJ8/X0REJk+eLGXKlJGRI0cqJ3tfX1+pXLmyVKtWTRo0aKAMrqZPSwls27ZNZ9uLFy+kePHisnDhQnnx4oU0b95cqlatqgSc0aNHS8mSJfk6Gkjs//utW7eKnZ2dFC1aVMzMzKR+/fqyceNGEflfwClbtiwHeSezfwfH2A9fK1eulN69e0uGDBmkffv2MmXKFOnUqZMMHz5cp11axHBDenv27JksWLBAuaQUa8OGDVKhQgVxcnKKF1aOHj0qNWrUkF69einbtFqt3Lp1S+nBUJP379+Lo6OjuLu7y6ZNm0Sj0cj69euVxydOnKgEnNjnH7tac9xVm9M7f39/yZ49u2g0GmnQoIFs2rRJmb2xY8cOqVatmrx48UL++ecfad68udSsWVM2b94sWq1WCY5kGOfOnRNLS0tZtGiRPHv2TA4ePCgdOnQQV1dX2bx5s4h8CqkODg5SpUqVeOcTShpxA8rTp0/l3r17OhMUtFqtXLhwQb7//nupUaOGaDQaMTU1FT8/P0OUm2QYbkgv7969E5H/zexZvHixjBo1Snl806ZNUrNmTfH09Iz3y3Hx4kXlFy0tfyJIDK1WKzdv3pRs2bKJubm5cp+s2MsqIp8CjouLi4wZM0b8/f119qVPHj16JOXKlRM3NzcpW7as9OjRQ/Lnzy8+Pj6yadMmadSokezZs0dERG7cuCF16tSRBg0apNr73aQHseeGefPmSbVq1XQe8/Pzk9atW4unp6dyLgkODpYHDx6keJ3pQdzz7Pjx46VixYqSMWNG6dSpk3J/v1gfPnyQsLAwmTlzppQtW1YGDRqUpsdAMtxQoo0YMULs7Ozk7du3IiLy6tUrGTBggBQqVEimT5+utFu3bp3UqlVLPD09E7wsoPZgE+vx48diYmIiFhYW0rBhQ2V73E9NkydPlvz588ukSZPSzEC9lHbnzh1p3ry5eHp6yrZt2+TPP/8Ud3d38fT0FI1GIxUrVlRe01u3bimDtSnlPHr0SOccICKyZMkSKV68uDx//lxn+/bt28XU1JTrNqWg8ePHS86cOWXbtm1y9uxZcXd3FycnJ2XlYRHdsYHe3t5SpEgRZbBxWsRwQ4l24MABcXNzExcXF3nz5o2IiNy/f1/GjRsnxYoVkylTpiht161bJ3Xr1pWqVavK3bt3DVSx4d29e1f8/Pwkd+7c4uHhoWyPG3AWLFjA++d8wa1bt+S7776TevXqye3bt+Xdu3dy5swZadSokfIJNK1+wkzrYmJiZNq0aeLg4CDjx49Xtu/evVssLS1lzZo1Oh9obt26JSVKlIh3uwVKHidPnhQnJyc5ceKEiIgcP35czM3NpWrVqlK6dGn59ddflbaxvW6xlwvPnTtnkJqTAsMN6eXUqVNSq1YtcXZ2lpCQEBH5FHDGjBkTL+D8+uuv0r9//3TTUxP75nrr1i05ceKEzo0AT506Jblz55bvvvtOaTdv3rx4U8
Dp8+7cuSP16tWTevXqKUsLUOoQFBQk48ePFxcXFxk9erSyfeDAgcpl2UePHsnHjx9l2LBhUqhQIXnx4oUBK04/AgMDZe7cuRIZGSkHDhyQ7Nmzy4oVKyQoKEgKFSokxYoVizcJZNKkSZI1a9Y0PR6S4Ya+KG442bp1q4wfP140Go24ubkpPTgPHjyQMWPGSIkSJXRuJ5DQMdRs69atYmVlJQUKFBBTU1NZsGCBsuLnqVOnJG/evFKyZEnp1KmTmJiYyNWrVw1ccdpy584dqV+/vnh4eMRb4ZYMKygoSEaPHi0uLi4ycuRIZfugQYMke/bsYm9vL+XKlZOcOXPKpUuXDFhp+hMWFiZRUVHSvHlzGTNmjHIJvFmzZuLo6Cj9+/fX6flctGhRml9CgeGGEm3w4MHKGi3NmjWT3Llzi7Ozs07AGTdunGTNmlUZQJsexJ4UHj16JC4uLrJkyRJ58OCBTJkyRSwsLGTKlCnKzJ0HDx5Ihw4dpGvXrgw2X+nOnTvSqFEjqVSpkpw5c8bQ5aRLt2/flmXLlsnhw4dF5H+XMwIDA2Xs2LFSunRpnYBz9OhR2bBhg7JwHCW9L42PiYyMlLJlyyr/LhEREdKuXTvZuHGjcg5T07g/hhv6rLhJ/tKlS5I7d245dOiQsm3Xrl3i6uoqLi4uyiDjO3fuiI+Pj6p+SRLj0KFDMmfOHJ1Vl0VEZs2aJVmyZJEpU6boDKyMO2uK9Hfz5k1p2bKlzqU/ShmvXr0SCwsL0Wg0YmlpKRUqVJAuXbrI3r175fnz56LVamXMmDFSrVo1GTZsmKHLTRe2bNkibdu21Zl1GZdWq5WwsDDp0qWL1KpVSwYOHCi1a9cWFxcX1c5gTeU3hyBDaNmyJRo1aoSuXbtCRKDRaBASEoKQkBDky5dPaVevXj28e/cOnTt3RuPGjeHr64siRYqgSJEiAICYmBgYGxsb6mmkqF27dmH+/PkoVqwY3rx5AxsbGwDAkCFDoNFoMG3aNLx//x4DBgyAra0tzMzMDFxx2la8eHGsW7eO99wygGzZsmHAgAHw8fFBp06dEBERgQ8fPqB9+/awsrJC7dq1YWNjg6JFi2Lfvn0wMzPDlClTDF22qtnY2GDTpk3ImDEjpkyZgjx58ug8rtFoYGFhgQEDBmDBggXw8/NDtmzZsHfvXhgZGcW735QqGDpdUery4cMH6dWrl5iYmCgLbYl8up7u5OQk8+fP1+nRef36tTg6OoqJiYl06NBBRNLvrJWJEyeKRqORxYsXS3h4uM5jU6ZMkXz58nFhOUrT4n66Hzp0qJQoUULmzp0r0dHRcvfuXdmzZ480atRIWQxOo9FInjx5+P8+GcX2kv/111+SIUMG6dy5szx58kR5/N/n42fPnumsXxN3CriaMNxQPGFhYTJ8+HAxMjJS7jkSHh4uHTt2lOrVq8uff/6ptH3+/Lk0b95c9u/fr7puzc+JPSkEBATI3bt35fr168pjAwYMkAwZMshvv/0Wb6XhV69epWidRMkh7iXnoUOHir29vcyaNUsJMJGRkRIVFSW7du2S2bNny82bNw1VaroRe+49efJkggFH5NN4qPLly8tPP/2kbFPzB1GGG1LETfCXL1+Wtm3birGxsRJmgoKCpHbt2lKpUiXp0aOHrFixQmrUqCHu7u7KL5fax9rEngy2bdsmZcuWlQIFCkilSpWkWbNmSpshQ4ZIhgwZZOXKlTo9OGo+kVD6Evf3fPjw4ZIvXz6ZPXt2vAX7KOX8O+B06tRJCTjBwcFSo0YNKViwYLq5zQXDDcUzcuRIKV++vDRp0kSyZcsmRkZGyn2RXrx4IRMnTpSqVauKq6urNG7cWPllSS89NwcPHpSMGTPKkiVLxN/fX1asWCEajUbnTrpDhw4VjUYTb4lzorTk35dX4/p3wMmfP7/MnTuX69cks9jzbOyHpbgfmv4dcLp27SqXL1+WatWqSfHixZVztVovRcXFcEM6NmzYIJkzZ5YzZ85IeHi4/PPPP9KnTx8xMjKSDRs2iIgo12tfvXql+uu2/6bVamX48OHKdMonT55I/vz5pW/fvvHajh49Wv7555+ULpEoSdy5c0fatGkjy5Yt+2ybuAFn1KhRkiVLFlm0aFG6+aCT0uJe6v73zYljxb72p06dkkyZMolGoxFnZ+d0FWxERFQ2PJq+1ZMnT1CuXDlUqlQJmTJlQokSJTBp0iR07NgRnTp1wu7du2FkZASNRoNs2bJBo9FARGBioq6JdyKS4HaNRoNr167BzMwMwcHBqFSpEjw8PLBgwQIAwLp167B69WoAwLRp01CiRIkUq5koqVy7dg3u7u7InDkztFrtZ9sZGxsjJiYGADB9+nQMHjwY9erVU9/Mm1Rg8+bNGD9+PABg4MCBaNq0Kd6+fRuvXezspypVqmD//v2oVasWzp8/D1NTU0RHR6vuXP05/B9IOqytreHn54egoCAAn97kc+TIgWbNmiEmJgaNGzfG4cOHdfbRaDSGKDVZaTQaBAcH49WrVwAAX19fbNmyBQBQs2ZN3L59G2XLlsV3330HHx8fAMCHDx9w4sQJPHz4EJGRkQarnehb3L9/Hw0aNECXLl3g4+ODXr16JdguNvTEDTgTJ05E4cKFU6zW9CQyMhKzZ8+Gm5sbVq9ejd27d8Pa2jrBD2JGRkaIiYlB1apVcejQoXQXbACGm3Trc5/G6tSpg6JFi2LatGkICAhQgkuePHnQrVs3/Prrr6hRo0ZKlpriRAQhISEoUaIE5s+fjxUrVqB58+aIjo4GAFSrVg3Hjh1DpkyZMGTIEACfTjxTp07Fnj170KFDB66/QmnW1q1bUa5cOYwbN05Zp+rp06c4d+4cli5diosXLyIyMlKndya9rGdlCPJp+Ag6duyIevXq4e+//0bz5s1RtGhRAJ//cPnvf5P0FGwAQCOf638n1Yq7YNPy5ctx8+ZNvH37Fh4eHmjZsiV+++03rFixAgULFkS/fv1gYWGBkSNHIkuWLNi4cSMApItPAdu2bUPbtm0RExODBQsWoE+fPsqihgcPHkTLli1RtmxZxMTEIEeOHDh16hT2798PFxcXQ5dO9NU6deqEFy9eYP/+/QA+hZ1NmzbhyJEjeP/+PXLnzo1BgwahT58+quy1TU1izzexpk6dChHBhAkTMHz4cAwbNgzZs2f/4n7pkbrfnShBscFm+PDhWL16NXr06IHIyEiMHDkSp0+fxrx58xAVFYVdu3ahSpUqKFKkCDJnzgxfX18AUOUYm7iBLyIiAmZmZihdujSAT8/35cuXePnyJXLkyAERQd26dXHw4EGcPn0aly9fhqurK2bMmKGszkyUlnz8+BEmJiYwMTFBvXr1MGrUKIwbNw6vXr3CH3/8gdatW2PdunXw8PBAixYtsH79enh5eSFTpkyGLl3VYgPKypUrYWZmhlGjRsHY2Bj58uWDl5cXgE/n8WzZsgEATp48iWrVqqX7YAOAKxSnVwcPHpSCBQvKuXPnROTTui3m5uaycuVKnXbnz5+XK1euKCPw1TzS3t/fXx49eiQiIjt27JAVK1bI9
evXZcuWLaLRaGTUqFFcaZVU5/r161KnTh3lJpj+/v4yZswYKV26tLi4uMjOnTt1pnf7+PhIyZIluShlCvn48aNUr15dypUrJ7///rsy62n16tWi0Whk8ODB8tdff0mTJk2kbNmyXE/r/zHcpFNr166VSpUqicinm65lyZJFlixZIiIioaGhcuTIkXjTOdW8QF9YWJh4enpKhQoVZNGiRaLRaHRuPxF7Ihk7dqwEBweLiMjMmTNl69athiqZ6JvFxMRI5cqVRaPRSNmyZeXkyZPKYx8+fEhwnZsffvhBWrRoEW8FbkoaCYWTN2/eSNOmTcXNzU1Wr16tBJy1a9dKzpw5xdHRUcqXL59uFuhLDIabdCY2oGzcuFFatGghu3btEgsLCyXYiIhs375dBg8eLIGBgYYq0yAOHDggpUqVEhMTE5k9e7aIiERERCgnm9WrV4uJiYm0b99eOnToIBkyZPjsWhNEaYWvr6/UqVNHSpQoIXZ2dnLixAnlPBH3A05ISIiMHDlScuTIoXPLEUoeDx8+1Pn+7du30rhxY3Fzc5M1a9YoQeaff/6RixcvpovedX1wtpSKiYgyRTNW7Aj6ChUqYP/+/WjcuDF++eUX9O7dG8Cna+9LlizBq1evYGtrm+I1p5SEZosVK1YMkZGRKFCgAHbv3o2HDx8iQ4YMiI6Ohoigc+fOWL9+PcLDwxEaGorz58+jTJkyKV88URJycHAAACxdulSZVHD69GkA/xuft2TJEnTr1g0bN27EgQMHUKpUKUOVmy6sXLkSLVq0wMGDB5VtVlZWWL16NUxMTDB16lRs3rwZUVFRKFGiBMqWLatM/1bbeMivxdlSKvX69WtlkBkArF27Fvfv30e2bNlQvXp1ODs7Y8+ePWjbti06dOiAZs2aQavVYs6cOQgKCsKlS5dgYmKi6lH3t27dwpo1a/D9998jX758AIBHjx7h9u3bmDlzJmJiYrBq1SoUKFAAUVFRMDU1BQDExMQgOjoaZmZmhiyf6KtERkbGW6pgwIAB+Pvvv3H48GF06NABZ8+exebNm1GtWjUEBQVhw4YNePnyJbp164ZChQoZqPL04/Hjx2jSpAlsbW0xfPhw1KlTR3nswoULqFWrFvLly4dZs2ahfv36Bqw0FTNovxEli9GjR0ujRo3k2bNnIiIyePBgsba2lgoVKkiZMmXExMREuefR9u3bpUCBAmJvby/lypWTZs2aKd2dah5jExkZKeXLlxeNRiNFihSRwYMHK3dAFxHZv3+/VK9eXdzd3eXBgwciIjJr1ixZunQpl5anNOvKlStSpUoV8fb21hlf8+zZM6lXr56cPXtWREQ8PDwkV65cSpvIyEiJiIgwSM1q97nzyaNHj6Rs2bJSs2ZNOXDggLL90KFD0qVLFxk6dKiqz9Hfiv1XKmRlZYWQkBCMHDkSXl5euHXrFg4ePAhXV1e8evUKv/zyC7y8vGBhYQFPT09UrVoVISEhMDc3h52dHTQajerXsTE1NUWrVq3Qrl07ODo64q+//sL333+Pbdu2oXbt2ujWrRtiYmKwdOlSVK9eHbVq1cKaNWvg5+fHpeUpTdJqtejTpw9Onz4NU1NTeHt7o3fv3qhUqRKaNWuGrFmzYsmSJahYsSL27duHpk2bolatWjh+/Djc3NwMXb4qiYhyPlm7di3u3r2LrFmzonr16ihbtix8fX3RrFkzzJgxA/fu3UOtWrUwf/58uLi4YNKkSQA+9SRzEcX4eFlKRSTOJaQlS5Zgy5YtyJgxI0JCQrBnzx5YWloqbQcNGoQtW7bgzJkzsLe31zlO3DVf1OzYsWNo2rQpDh8+jHLlyiEwMBDLli2Dt7c3KlasiE6dOkGj0eDFixfw8/PD+PHjOdaA0rTg4GBUr14d2bNnR5cuXXDy5Ek8evQIVlZWcHd3x5QpU7Br1y5UrVoVANCmTRtMmTJFWQ2Xkk7c8/WwYcOwfPlyFC9eHBEREbh69Sp8fHzQvXt3BAQEoH///vDz80NUVBTy5s2LkydPwtTUVNXDBr6ZIbuNKOnF7eL85ZdfpEyZMmJlZSVPnjwRkf9dajp27Jjkzp1b/Pz8DFJnajF06FDp0KGDfPjwQURE2rRpI8WLF5eOHTtK7dq1xdTUVFasWMEueUrzYmf9PX/+XOzs7KRZs2Zy7tw5efXqlfTs2VPq1KkjGo1Gzpw5Y+BK05dLly5J48aN5e+//xYRkZcvX8rEiRPF2NhYNmzYICKfZkpdv35dTp48qZzDOSvqvzHcqMTnrtv6+PhI0aJFxdPTU1mgTkTkzp07Ym9vL8ePH0+pElOlLVu2iJubm8TExEj37t3F1tZWmeZ68+ZNmTt3Lqe9Upr18uVLuXHjhly9elVne1BQkOTKlUuqVq2qnBcCAwPT/YedlBB3HZt169ZJ1apVxc3NTd6+favTbujQoZI7d26d83YsjrX5Ml6WUoG4l5EOHToEc3NzZMyYEa6urgCAhQsXYv369TA1NcXEiRMRGRmJ+fPnIygoCOfPn0/312tr1KiBU6dOwc7ODnv27IGzs7OhSyL6ZtevX0e3bt0QHBwMEUG9evWwbNky5fHnz5/D1dUV+fPnx2+//YbixYsD4H2JktO/L/n7+Phg4cKFePLkCa5cuYJ8+fIpY2hOnjyJtm3bYvfu3Vxy4iuof2BFOhD7yzJixAi0bt0abdu2RZcuXbBgwQIAQL9+/dCpUycEBASgSZMmWLBgAUqWLIlz587B2Ng43lo46UVsrh8xYgQKFy6MRYsWwdnZGcz7lNZduXIFlSpVQvXq1bFy5Uo0atQIq1evxpIlSwB8mg5ua2uLixcv4vHjx+jTpw+uXr0K4PN3maZvc+TIEQQHBwMARo4ciSlTpqBXr14YMmQIbG1t0b9/fzx8+FD5sJk7d24YGxsjJCTEkGWnXYbsNqJvE7d78/bt21KhQgW5fPmynDx5UsaPHy9WVlYyc+ZMpY2Pj48UL15cpk2bpuzL67afuugLFy4sY8eONXQpRN/s7t27Ym5urvP/+cGDB5IhQwYZMmRIvPZBQUFiZmYmDRo04NiyZBISEiJ58+aVSpUqSY8ePcTS0lKuXLmiPL506VJxc3OTqlWrysGDB2Xv3r3SoEEDcXFx4SWor6Teub4qF7d78+PHjwgLC4OjoyOcnJxgbGyMIkWKwNTUFFOnToVGo8HQoUPx/fffw8rKCq1atYJGo1Hl3b2/hq2tLSZMmIDevXujcePGqFChgqFLIvoqWq0WK1asQJYsWZA9e3Zl+8aNGxEVFYW7d+9i3rx5yJ49u3IesLW1xePHjxEaGhpvcT9KGpaWlrh16xZsbW1x7do1+Pr6onTp0solqF69esHY2Bje3t5o0qQJ6tati9KlS2Pr1q1K73p6Hz6gL76zpVGxwWbSpEk4dOiQsppw7C+Ara0tevbsCQCYPn06wsLCMGnSJLRp0wZA+pnunVg1a9ZE+fLlkTt3bkOXQvTVjIyM0K9fP7x//x4bN26EmZkZwsLC8PPPP2PMmDEoU6YM1q1bh4CAAIwZMwZFihTBjz/+CE9PT1Xf
bsVQYs+zIoKXL1/CyMgIWbJkwZQpU1C8eHHkzZtXadOjRw8YGRnht99+g7W1NXr37g1zc3NERERwNfSvwAHFaUzcULJgwQJMnz4dXl5eePr0KdasWYPJkydj7NixSvsXL15gzpw5uHTpEvbv3w+A19Q/5+PHjzA3Nzd0GUTfLCgoCNOmTcPBgwdx//597N+/H7Vq1QIAZYHOhQsX4tKlSxg6dChKlixp4IrV7fTp06hcuTIA4OXLl6hUqRLs7OywadMm5MmTR6ft6tWrsXz5chQsWBATJkxAwYIFDVFy2mfQi2L01c6cOSMzZ84UX19fEREJCwuTBQsWiLGxsUyfPl2n7evXr5UxNnHH6RCRegUFBUn//v2ldOnSMmvWLGV73HE1HHOXvLRarfz111+i0Whk+vTp8vz5cxH5dMfvQoUKSfXq1eXhw4cSEREhrVu3ltmzZ4vIp/GRjo6O8v333/Pf6CvxslQaceHCBRQuXBjW1ta4ceOG8ilg5cqVAAALCwv06NEDGo0GAwcOhJGREUaMGAEAyJo1KwBO8SRKT2xtbTFq1ChotVps2bIF0dHRGDFihHKnexMTE465S2YajQaVK1fGpEmTMHv2bBgbG8PLywsODg44fPgw6tati6pVqyJnzpz48OEDVq9eDQD4/vvvYWpqilq1avHf6GsZOl3Rly1atEjy588vN2/eVLZt3bpVMmfOLL169VJW1xUR+fjxoyxevFg0Go38/vvvhiiXiFKRwMBA6devn1SpUkXGjx9v6HJULW7P+L9nOU2dOlWsrKxkxowZEhwcLCKfzteTJ0+WOXPmKD00Hz9+TLmCVYzhJpXz8fERY2Nj2bp1a7zH1q1bJ8bGxjJ69GjlTt4iIh8+fJCtW7eyO5OIRORTwOnatavUqVNHXr58aehyVG/GjBny66+/6nzwFBGZMmWKmJqaysyZM+Xp06fx9uO076TD/q5UbNmyZejbty82b96M5s2bK9svXrwIJycntG/fHgDQuXNnAJ9mTpmYmMDc3Fxpr/a7exPRl9nZ2eGnn34CAJ0p4vTtJM7l/ti/X7lyBZs3b0amTJnQvHlzZaLC2LFjcf36dcyZMwfv379H//79lWEDADjdOwnxXS+V2rp1K3r37o3169frBJsGDRoga9as+O233wAA7du3h0ajgZeXF0JDQzFv3jydXxAGGyICwKneyeTFixeIjIzEmzdvkDNnTuTKlQvr1q2DlZUVunfvDq1WixYtWiBjxowAgDx58iB79uw4e/Ysxo8fb+Dq1YvvfKmQiODQoUPInz8/Xr16pXwaaNmyJQICArBo0SKdKcvt2rVDWFgY1q5dy7VriIhSyPr167F06VLcu3cPQUFBKFCgADw8PLB48WIsXrwYWq0WPXv2hIigdu3ayJ07N54+fYpVq1bB1dVVWUyVEz2SHte5SaWioqIwYMAAXLp0CR07dsShQ4fw6NEj/PnnnyhQoIDOL8Tbt29hbW2tbOMvCxFR8lq5ciX69OmD2bNno3jx4jA1NcWKFSuwYcMG1KhRQ1lXrG/fvti6dSty5cqFmJgYREdH4+rVqzAxMeFiqsmI4SYVil1qOyoqCv369cOePXsQFRWFo0ePokSJEjrjaFq0aIFChQph5syZADjdm4gouV2+fBmtWrXC9OnT0bp1a2X7q1evsHnzZgwdOhRNmjTBhg0bAEBZFfrjx48YO3YsTExMeEuFZMbLUqlQ7L1ETE1NsXDhQgwZMgSnTp3Cvn37YG9vDwsLC8TExKBx48a4c+cONm7cqOzLYENElLwCAgJgYWGB6tWrKyFFRJA9e3a0a9cOz549w4IFC3DkyBHUqlULHTp00NmfwSb5sT8slYobcGbPno3y5ctjw4YN+PXXXxEeHo4WLVrgwYMHuHnzJkxNTREdHW3okomI0oXLly8jKCgIdnZ2SrCJ/WBpbW2NTp06ITw8HM+ePUtwfwab5Mdwk4r9uwenbNmy2Lx5M4oVK4abN2/i2rVrSrDhrCgiopRRokQJhIWF4cCBAwDi95gXLFgQdnZ2ePfunSHKIzDcGJRWq/1im7gBZ8GCBShcuDBKlCiB69evM9gQERlAuXLlYGpqimXLlsHf31/ZHhMTAwDw9/dHjhw5ULRoUUOVmO5xQLGBxO3G3LJlC549ewYHBwc0atQowS7L2FH1MTEx0Gg0MDIyYrAhIjKQDRs2wMvLCy1atMCQIUNQtmxZAMD79+/RunVrhIWF4ejRo5wNZSAMNwYQN9iMHj0a8+fPh6OjI86fP48ePXpg2LBhKFKkSLz94k4b5BRCIiLDiY6OxqpVq9C3b1/kzJkTzs7OsLa2hr+/P8LCwnD+/HmYmppy8LCB8N3RAGKDzc2bN3H27FkcO3YM586dw9GjR/HHH39g6tSpuHPnjtI+Nn/GDTMMNkREhmNiYoIePXrg3LlzaNq0KT58+ABTU1M0atQIFy5cUIYNMNgYBntuDMTb2xt///03zM3NsWrVKpiZmQEAjh07hhYtWqBx48YYM2ZMgj04RESUurHHxrD48d9A8uXLh+3bt+PMmTPKdEERgbu7O7Zt24a9e/diyJAhCAgIMHClRET0XxLqI2CwMSyGmxSQ0KyoDh06YNu2bfD398f8+fMRHBysXK6qUaMG1qxZg+joaOTJkyelyyUiIj1w8dTUh5elklncgb/Hjh1DcHAwjI2NUbduXWTJkgUbNmxAhw4dMGjQIIwaNQo5cuSIdwsFDh4mIiJKPM4jTmaxoWTEiBHw9fVFhgwZkD17dgwYMABnz55Fu3btYGJigjZt2sDY2BhDhw6FjY1NgscgIiKiL+O7ZgpYunQpVq5cibVr1+LatWto1aoVnj59ivPnzwMAWrVqhY0bN2LWrFnYvHmzgaslIiJK29hzkwz+fVnpn3/+wZAhQ1C+fHn4+vpi1KhR8PHxgaenJ0JDQ2FqaorWrVsje/bsqFGjhgErJyIiSvvYc5PE4gabyMhIAEBQUBAiIyOxe/dudOrUCTNmzEDPnj2h1WqxZs0a+Pj4IDo6GrVr14aJiQlvgklERPQNGG6SWGywmTZtGiZPngzg031IduzYgXbt2mHGjBn44YcfAABv3rzBnj178PHjR53bKPCWCkRERF+P4SYJTJ06FS9evADwv/UOjh8/jpIlSwIAvLy8oNFokC1bNjg5OSE0NBQPHz5Ep06d8PLlSwwdOtRgtRMREakNw803CggIwIQJE9C1a1e8evVKCTehoaHKIk45c+bE9u3bYWFhgT59+sDBwQEdOnTAmzdvcOrUKZiYmCh3kyUiIqJvw+sf38je3h7Xrl2Dh4cHOnbsiLVr1yJ79uyIjo5WFu+LjIxErly58Ndff+Hq1au4f/8+ChYsiCpVqsDY2Jh39yYiIkpCfEdNAiVLlsS+fftQr149tG3bFuvXr0fmzJmRKVMmpU1oaCisrKxQrFgxVKtWTdkeExPDYENERJSEuELxV/r3dG8AuHHjBmrVqoWCBQsiMDAQQUFBcHJywqtXrxAREQELCwtUrVoVv/32m4GqJiIiUj+Gm68Q926
vsdO2Y3tfrl+/jtatW+PFixeYPXs27O3t8eHDB0RFRcHc3Bx16tRhTw0REVEyYrjRU1hYGLJkyQIAmD17Ni5cuIA7d+6gXbt2qFGjBsqXL48bN26gXr16qFSpElauXAlLS0udY8QNR0RERJS0OFtKD2vWrMHcuXMBACNHjsT06dNRvHhxlCpVClu2bMHgwYNx5MgRlCpVCvv378f58+dRs2ZNvH37Vuc4DDZERETJh9dHEsnHxwc//PAD9uzZg7t378LX1xd//PEHatasCQA4evQoli1bBm9vb+TPnx+Ojo7YuXMnJkyYEK/nhoiIiJIPe24SYc2aNfjxxx+xa9cu1K9fH+/evcPz5891xs7UrFkTXbp0wd27d/HkyRMAgLOzM3x9fWFkZKRMCyciIqLkxXDzBatWrUKXLl3g7u6OBg0aAABMTU1hY2ODx48fA/jfqsT169eHmZkZTp48Ge84RkZ8qYmIiFIC33H/w6+//oru3buje/fuuHHjBvr37w8AcHR0RIUKFTBkyBCcPn1amRL+5s0bZMqUCfb29oYsm4iIKF3jbKnPmDdvHgYPHozdu3fju+++g4+PD8aOHYs2bdpg4cKFAIBGjRrh7Nmz6Ny5M3LmzImjR48iKCgIly5d4nRvIiIiA2G4+Yzjx48jMDAQbdu2BQCEhIRg06ZNGDNmjE7AGTVqFK5du4Y3b96gcOHCWL58OUxNTTndm4iIyEAYbr4g7krEoaGh2LhxY7yA8/79exgZGcHc3BwAeK8oIiIiA+I78BfEvcWCpaWl0pMzduxYGBsbY/78+Tr3kBIRBhsiIiID4ruwnmIDjkajQa9evVCwYEEMGDBAefzf95siIiKilMXLUl/p7du3OH78OBo1asSxNURERKkIw00S4BgbIiKi1IPhhoiIiFSFi/gRERGRqjDcEBERkaow3BAREZGqMNwQERGRqjDcEBERkaow3BAREZGqMNwQERGRqjDcEBERkaow3BAREZGqMNwQERGRqjDcEBERkaow3BAREZGq/B/rTx7zu34IxAAAAABJRU5ErkJggg==",
30
+ "text/plain": [
31
+ "<Figure size 640x480 with 1 Axes>"
32
+ ]
33
+ },
34
+ "metadata": {},
35
+ "output_type": "display_data"
36
+ },
37
+ {
38
+ "name": "stdout",
39
+ "output_type": "stream",
40
+ "text": [
41
+ "Token Embeddings 51,463,168 11.33%\n",
42
+ "Attention 134,217,728 29.55%\n",
43
+ "MLP 268,435,456 59.10%\n",
44
+ "RMS Norm 66,560 0.01%\n",
45
+ "Output Layer 0 0.00%\n",
46
+ "\n",
47
+ "\n",
48
+ "Total parameters: 454,182,912\n"
49
+ ]
50
+ }
51
+ ],
52
+ "source": [
53
+ "import matplotlib.pyplot as plt\n",
54
+ "\n",
55
+ "parameter_counts = {\n",
56
+ " \"Token Embeddings\": vocabulary_size * embedding_dimensions,\n",
57
+ " \"Attention\": (embedding_dimensions ** 2 + embedding_dimensions * 3 * embedding_dimensions) * num_hidden_layers,\n",
58
+ " \"MLP\": embedding_dimensions * 4 * embedding_dimensions * 2 * num_hidden_layers,\n",
59
+ " \"RMS Norm\": embedding_dimensions * num_hidden_layers * 2 + embedding_dimensions,\n",
60
+ " \"Output Layer\": 0, # Tied to token embeddings\n",
61
+ "}\n",
62
+ "\n",
63
+ "plt.bar(parameter_counts.keys(), parameter_counts.values())\n",
64
+ "\n",
65
+ "plt.title(\"Model Parameters\")\n",
66
+ "plt.ylabel(\"# of Parameters\")\n",
67
+ "plt.xticks(rotation=45)\n",
68
+ "\n",
69
+ "plt.show()\n",
70
+ "\n",
71
+ "total_parameter_count = sum(parameter_counts.values())\n",
72
+ "\n",
73
+ "for name, count in parameter_counts.items():\n",
74
+ " print(f\"{name:20s} {count:20,d} {count / total_parameter_count * 100:10.2f}%\")\n",
75
+ "\n",
76
+ "print(\"\\n\")\n",
77
+ "\n",
78
+ "print(f\"Total parameters: {total_parameter_count:,}\")"
79
+ ]
80
+ },
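The zero count for the output layer above comes from weight tying: the output projection reuses the token embedding matrix, so it contributes no trainable parameters of its own. Below is a minimal PyTorch sketch of the idea; the module names and shapes are illustrative (they match the values used in the cell above) and are not the actual classes in model.py.

import torch.nn as nn

vocabulary_size, embedding_dimensions = 50257, 1024

token_embeddings = nn.Embedding(vocabulary_size, embedding_dimensions)
output_layer = nn.Linear(embedding_dimensions, vocabulary_size, bias=False)

# Tie the output projection to the token embeddings - both names now point at the same tensor.
output_layer.weight = token_embeddings.weight

# Deduplicate shared parameters before counting, as a parameter tally should.
unique_parameters = {p for p in [*token_embeddings.parameters(), *output_layer.parameters()]}
print(sum(p.numel() for p in unique_parameters))  # 51,463,168 - the shared matrix is counted once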
81
+ {
82
+ "cell_type": "markdown",
83
+ "metadata": {},
84
+ "source": [
85
+ "Next, we'll estimate the size of the model in memory and on disk. Note that this does not include any intermediate variables that get memorized during training such as activations, gradients, optimizer state, and temporary buffers. Actual memory consumption will likely be much higher."
86
+ ]
87
+ },
88
+ {
89
+ "cell_type": "code",
90
+ "execution_count": 65,
91
+ "metadata": {},
92
+ "outputs": [
93
+ {
94
+ "name": "stdout",
95
+ "output_type": "stream",
96
+ "text": [
97
+ "Total gigabytes: 1.82\n"
98
+ ]
99
+ }
100
+ ],
101
+ "source": [
102
+ "bytes_per_parameter = 32 // 8 # Assuming 32-bit floating point\n",
103
+ "\n",
104
+ "total_bytes = total_parameter_count * bytes_per_parameter\n",
105
+ "\n",
106
+ "total_gigabytes = total_bytes / 1e9\n",
107
+ "\n",
108
+ "print(f\"Total gigabytes: {total_gigabytes:,.2f}\")"
109
+ ]
110
+ },
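For a rough sense of the training-time memory that the estimate above deliberately excludes, the parameters, their gradients, and the optimizer state can be tallied the same way. The sketch below is an assumption-laden back-of-the-envelope: it assumes fp32 weights, fp32 gradients, and an Adam-style optimizer holding two fp32 moment buffers per parameter. The training script actually imports Adafactor and ZeRO redundancy sharding, which generally need less per-GPU memory than plain Adam, and activation memory (which depends on batch size and block size) is still not counted.

total_parameter_count = 454_182_912
bytes_per_value = 4  # fp32

weights = total_parameter_count * bytes_per_value
gradients = total_parameter_count * bytes_per_value
optimizer_state = 2 * total_parameter_count * bytes_per_value  # Adam-style first and second moments

total_training_bytes = weights + gradients + optimizer_state

print(f"Approximate training memory excluding activations: {total_training_bytes / 1e9:,.2f} GB")
# ~7.27 GB, i.e. roughly 4x the 1.82 GB occupied by the weights alone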
111
+ {
112
+ "cell_type": "markdown",
113
+ "metadata": {},
114
+ "source": [
115
+ "Next, we'll estimate the maximum number of floating point operations (FLOPs) required to perform a full forward pass of the network on a single sample."
116
+ ]
117
+ },
118
+ {
119
+ "cell_type": "code",
120
+ "execution_count": 66,
121
+ "metadata": {},
122
+ "outputs": [
123
+ {
124
+ "data": {
125
+ "image/png": "iVBORw0KGgoAAAANSUhEUgAAAioAAAHtCAYAAAA3NVUiAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAABGRklEQVR4nO3dd3gUVd/G8XtTCCUFhEAIhCJVwNBUULp0EEVQpChVQAQUEGkWQEWKIghSH0oUpVelSglFlBYIRURagNBrSCCQQHLeP3iyb5aABp7ATpLv57r2gp05u/vbzG5yz5lzZmzGGCMAAAALcnF2AQAAAPdDUAEAAJZFUAEAAJZFUAEAAJZFUAEAAJZFUAEAAJZFUAEAAJZFUAEAAJZFUAEAAJZFUAHSKZvNpkGDBj3w444dOyabzaagoKAUryk1etifI4DkIagAThQUFCSbzSabzabffvstyXpjjAICAmSz2fTSSy85ocL/3YkTJ/TOO++oQIEC8vDwUM6cOdW4cWNt3rzZ2aUl2/LlywkjgJMQVAALyJgxo2bOnJlk+YYNG3Ty5El5eHg4oar/3ebNm/X0009r1qxZatq0qcaPH6/3339ff/75p6pUqaKxY8c6u8RkWb58uQYPHnzPdTdu3NDHH3/8mCsC0g83ZxcAQGrQoIHmzZunMWPGyM3t/7+WM2fOVPny5XXx4kUnVvdwrly5otdee02ZMmXS5s2bVahQIfu6Xr16qW7duurRo4fKly+vF1544bHWdv36dWXJkiVFnitjxowp8jwA7o0eFcACWrRooUuXLmn16tX2ZbGxsZo/f75atmx5z8dcv35dH3zwgQICAuTh4aFixYrp66+/1t0XRI+JiVHPnj3l6+srLy8vvfzyyzp58uQ9n/PUqVNq3769cuXKJQ8PD5UsWVLTpk17qPc0adIknT17Vl999ZVDSJGkTJky6fvvv5fNZtNnn31mX55wKGzjxo3q3LmzsmfPLm9vb7Vu3VpXrlxJ8horVqxQlSpVlCVLFnl5ealhw4b6888/Hdq0bdtWnp6eOnLkiBo0aCAvLy+1atVKkrRp0ya9/vrrypcvnzw8PBQQEKCePXvqxo0bDo8fN26cJNkP09lsNvv6e41R2bVrl+rXry9vb295enqqZs2a2rJli0ObhPe6efNm9erVS76+vsqSJYteffVVXbhwwaHtjh07VLduXeXIkUOZMmVSwYIF1b59+3/bBECaQI8KYAEFChTQ888/r1mzZql+/fqS7vwRvnr1qpo3b64xY8Y4tDfG6OWXX1ZwcLA6dOigMmXKaNWqVfrwww916tQpjRo1yt727bff1o8//qiWLVvqhRde0Lp169SwYcMkNZw7d04VK1aUzWZTt27d5OvrqxUrVqhDhw6KjIxUjx49Hug9/fLLL8qYMaOaNWt2z/UFCxZU5cqVtW7dOt24cUOZMmWyr+vWrZuyZs2qQYMG6e+//9aECRN0/PhxrV+/3h4SZsyYoTZt2qhu3boaPny4oqOjNWHCBFWuXFm7du1SgQIF7M93+/Zt1a1bV5UrV9bXX3+tzJkzS5LmzZun6OhodenSRdmzZ9e2bds0duxYnTx5UvPmzZMkde7cWadPn9bq1as1Y8aMf33fCYe1vL291adPH7m7u2vSpEmqXr26NmzYoAoVKji07969u7Jly6aBAwfq2LFjGj16tLp166Y5c+ZIks6fP686derI19dX/fr1U9asWXXs2DEtXLgw+RsDSM0MAKeZPn26kWS2b99uvvvuO+Pl5WWio6ONMca8/vrrpkaNGsYYY/Lnz28aNmxof9zixYuNJPPFF184PN9rr71mbDabOXz4sDHGmNDQUCPJvPvuuw7tWrZsaSSZgQMH2pd16NDB5M6d21y8eNGhbfPmzY2Pj4+9rrCwMCPJTJ8+/R/fW9asWU3p0qX/sc17771nJJk9e/Y4/DzKly9vYmNj7e1GjBhhJJklS5YYY4yJiooyWbNmNR07dnR4vrNnzxofHx+H5W3atDGSTL9+/ZK8fsJ7Smzo0KHGZrOZ48eP25d17drV3O/X5d0/x8aNG5sMGTKYI0eO2JedPn3aeHl5mapVq9qXJbzXWrVqmfj4ePvynj17GldXVxMREWGMMWbRokX2zwiQHnHoB7CIZs2a6caNG1q6dKmioqK0dOnS+x72Wb58uVxdXfXee+85LP/ggw9kjNGKFSvs7SQlaXd374gxRgsWLFCjRo1kjNHFixftt7p16+rq1avauXPnA72fqKgoeXl5/WObhPWRkZEOyzt16iR3d3f7/S5dusjNzc3+flavXq2IiAi1aNHCoVZXV1dVqFBBwcHBSV6rS5cuSZYl7sW5fv26Ll68qBdeeEHGGO3atSv5b/a/4uLi9Ouvv6px48Z68skn7ctz586tli1b6rfffrvne018KKlKlSqKi4vT8ePHJUlZs2aVJC1dulS3bt164JqA1C7NBJWNGzeqUaNG8vf3l81m0+LFix/o8Tdv3lTbtm319NNPy83NTY0bN07S5syZM2rZsqWKFi0qFxeXB+4KB/6Jr6+vatWqpZkzZ2rhwoWKi4vTa6+9ds+2x48fl7+/f5Ig8NRTT9nXJ/zr4uKSZIxIsWLFHO5fuHBBERERmjx5snx9fR1u7dq1k3TnEMSD8PLyUlRU1D+2SVh/9/soUqSIw31PT0/lzp1bx44dkyQdOnRIkvTiiy8mqffXX39NUqubm5vy5s2b5PVPnDihtm3b6oknnpCnp6d8fX1VrVo1SdLVq1eT/2b/68KFC4qOjk7y85XubJv4+HiFh4c7LM+XL5/D/WzZskmSfUxOtWrV1LRpUw0ePFg5cuTQK6+8ounTpysmJuaB6wNSozQzRuX69esqXbq02rdvryZNmjzw4+Pi4pQpUya99957WrBgwT3bxMTEyNfXVx9//LHDGAAgpbRs2VIdO3bU2bNnVb9+ffve9KMWHx8vSXrzzTfVpk2be7YJDAx8oOd86qmntGvXLsXExNx3evWePXvk7u6eJJgkt94ZM2bIz88vyfrEM6ckycPDQy4ujvtlcXFxql27ti5fvqy+ffuqePHiypIli06dOqW2bdvaX+NRc3V1vedy899B0TabTfPnz9eWLVv0yy+/aNWqVWrfvr1GjhypLVu2yNPT87HUCThLmgkq9evXtw9CvJeYmBh99NFHmjVrliIiIlSqVCkNHz5c1atXlyRlyZJFEyZMkHTn3A8RERFJnqNAgQL69ttvJemhZ0IA/+TVV19V586dtWXLFvtgynvJnz+/1qxZk+TwyoEDB+zrE/6Nj4/XkSNHHPby//77b4fnS5gRFBcXp1q1aqXIe3nppZf0xx9/aN68eXrzzTeTrD927Jg2bdqkWrVqORyCke70mNSoUcN+/9q1azpz5owaNGggSfYeopw5cz50vXv37tXBgwf1/fffq3Xr1vbliWdeJUh8aOaf+Pr6KnPmzEl+vtKdbePi4qKAgICHqrdixYqqWLGihgwZopkzZ6pVq1aaPXu23n777Yd6PiC1SDO
Hfv5Nt27d9Mcff2j27Nnas2ePXn/9ddWrV8/ehQxYgaenpyZMmKBBgwapUaNG923XoEEDxcXF6bvvvnNYPmrUKNlsNntoT/j37llDo0ePdrjv6uqqpk2basGCBdq3b1+S17t7umxydO7cWTlz5tSHH36oo0ePOqy7efOm2rVrJ2OMPv300ySPnTx5ssN4jAkTJuj27dv291O3bl15e3vryy+/vOe4jeTUm9CTYRJN5zbG2HdGEks458q9dmDufs46depoyZIl9sNU0p0ZVTNnzlTlypXl7e39r7UlduXKlSRTzsuUKSNJHP5BupBmelT+yYkTJzR9+nSdOHFC/v7+kqTevXtr5cqVmj59ur788ksnVwj8v/sdekmsUaNGqlGjhj766CMdO3ZMpUuX1q+//qolS5aoR48e9h6HMmXKqEWLFho/fryuXr2qF154QWvXrtXhw4eTPOewYcMUHBysChUqqGPHjipRooQuX76snTt3as2aNbp8+fIDvY/s2bNr/vz5atiwocqVK6e3335bJUqU0NmzZxUUFKTDhw/r22+/vefJ3mJjY1WzZk01a9ZMf//9t8aPH6/KlSvr5ZdfliR5e3trwoQJeuutt1SuXDk1b95cvr6+OnHihJYtW6ZKlSolCXF3K168uAoVKqTevXvr1KlT8vb21oIFC+55vpby5ctLujMouW7dunJ1dVXz5s3v+bxffPGFVq9ercqVK+vdd9+Vm5ubJk2apJiYGI0YMeKBfoaS9P3332v8+PF69dVXVahQIUVFRek///mPvL297T1MQJrmtPlGj5Aks2jRIvv9pUuXGkkmS5YsDjc3NzfTrFmzJI9v06aNeeWVV/7xNapVq2bef//9lC0c6U7i6cn/5O7pycbcmaLbs2dP4+/vb9zd3U2RIkXMV1995TDV1Rhjbty4Yd577z2TPXt2kyVLFtOoUSMTHh6eZFqtMcacO3fOdO3a1QQEBBh3d3fj5+dnatasaSZPnmxvk9zpyYnbd+zY0eTLl8+4u7ubHDlymJdfftls2rTpvj+PDRs2mE6dOpls2bIZT09P06pVK3Pp0qUk7YODg03dunWNj4+PyZgxoylUqJBp27at2bFjh71NmzZtTJYsWe5Z2/79+02tWrWMp6enyZEjh+nYsaPZvXt3kvd3+/Zt0717d+Pr62tsNpvDVOV7/Rx37txp6tatazw9PU3mzJlNjRo1zO+//37P93r3tg8ODjaSTHBwsP25WrRoYfLly2c8PDxMzpw5zUsvveTwHoG0zGbMXX2KaYDNZtOiRYvsM3fmzJmjVq1a6c8//0wycM3T0zPJYLy2bdsqIiLiH2cOVa9eXWXKlEnShQ7g4QUFBaldu3bavn27nnnmGWeXA8AC0sWhn7JlyyouLk7nz59XlSpVnF0OAABIpjQTVK5du+Zw3D0sLEyhoaF64oknVLRoUbVq1UqtW7fWyJEjVbZsWV24cEFr165VYGCg/XTi+/fvV2xsrC5fvqyoqCiFhoZK+v+Ba5Lsy65du6YLFy4oNDRUGTJkUIkSJR7XWwUAIN1IM4d+1q9f7zCdMUGbNm0UFBSkW7du6YsvvtAPP/ygU6dOKUeOHKpYsaIGDx6sp59+WtKd6ccJJ8pKLPGP6F7TFPPnz+8wwh/Aw+HQD4C7pZmgAgAA0p50cx4VAACQ+hBUAACAZaXqwbTx8fE6ffq0vLy8kn2KawAA4FzGGEVFRcnf3z/JdbjulqqDyunTpx/6uhkAAMC5wsPD73ll88RSdVBJuBhbeHj4A18/AwAAOEdkZKQCAgIcLqp6P6k6qCQc7vH29iaoAACQyiRn2AaDaQEAgGURVAAAgGURVAAAgGURVAAAgGURVAAAgGURVAAAgGURVAAAgGURVAAAgGURVAAAgGURVAAAgGURVAAAgGURVAAAgGURVAAAgGURVAAAgGW5ObsA4HEr0G+Zs0tIt44Na+jsEgCkMvSoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAyyKoAAAAy3JqUBk0aJBsNpvDrXjx4s4sCQAAWIibswsoWbKk1qxZY7/v5ub0kgAAgEU4PRW4ubnJz8/P2WUAAAALcvoYlUOHDsnf319PPvmkWrVqpRMnTty3bUxMjCIjIx1uAAAg7XJqUKlQoYKCgoK0cuVKTZgwQWFhYapSpYqioqLu2X7o0KHy8fGx3wICAh5zxQAA4HGyGWOMs4tIEBERofz58+ubb75Rhw4dkqyPiYlRTEyM/X5kZKQCAgJ09epVeXt7P85SkYoV6LfM2SWkW8eGNXR2CQAsIDIyUj4+Psn6++30MSqJZc2aVUWLFtXhw4fvud7Dw0MeHh6PuSoAAOAsTh+jkti1a9d05MgR5c6d29mlAAAAC3BqUOndu7c2bNigY8eO6ffff9err74qV1dXtWjRwpllAQAAi3DqoZ+TJ0+qRYsWunTpknx9fVW5cmVt2bJFvr6+ziwLAABYhFODyuzZs5358gAAwOIsNUYFAAAgMYIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLDdnF2BlBfotc3YJ6daxYQ2dXQIAwALoUQEAAJZFUAEAAJZFUAEAAJZlmaAybNgw2Ww29ejRw9mlAAAAi7BEUNm+fbsmTZqkwMBAZ5cCAAAsxOlB5dq1a2rVqpX+85//KFu2bM4uBwAAWIjTg0rXrl3VsGFD1apV61/bxsTEKDIy0uEGAADSLqeeR2X27NnauXOntm/fnqz2Q4cO1eDBgx9xVQAAwCqc1qMSHh6u999/Xz/99JMyZsyYrMf0799fV69etd/Cw8MfcZUAAMCZnNajEhISovPnz6tcuXL2ZXFxcdq4caO+++47xcTEyNXV1eExHh4e8vDweNylAgAAJ3FaUKlZs6b27t3rsKxdu3YqXry4+vbtmySkAACA9MdpQcXLy0ulSpVyWJYlSxZlz549yXIAAJA+OX3WDwAAwP1Y6urJ69evd3YJAADAQuhRAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQ
AAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlvXAQWXnzp3au3ev/f6SJUvUuHFjDRgwQLGxsSlaHAAASN8eOKh07txZBw8elCQdPXpUzZs3V+bMmTVv3jz16dMnxQsEAADp1wMHlYMHD6pMmTKSpHnz5qlq1aqaOXOmgoKCtGDBgpSuDwAApGMPHFSMMYqPj5ckrVmzRg0aNJAkBQQE6OLFiylbHQAASNceOKg888wz+uKLLzRjxgxt2LBBDRs2lCSFhYUpV65cKV4gAABIvx44qIwaNUohISHq1q2bPvroIxUuXFiSNH/+fL3wwgspXiAAAEi/3B70AaVLl9a+ffuSLP/qq6/k6uqaIkUBAABID9Cjcv36dXXp0kV58uSRr6+vmjdvrgsXLtjXZ8yYUe7u7o+kSAAAkD4lO6h88sknmjFjhl566SW1bNlS69atU6dOnR5lbQAAIJ1L9qGfRYsWafr06Xr99dclSa1bt1bFihV1+/Ztubk98BEkAACAf5XsHpWTJ0+qUqVK9vvly5eXu7u7Tp8+/UgKAwAASHZQiY+PTzIGxc3NTXFxcSleFAAAgPQAh36MMapZs6bDYZ7o6Gg1atRIGTJksC/buXNnylYIAADSrWQHlYEDByZZ9sorr6RoMQAAAIn9T0EFAADgUXqo6Tp79uyxX0G5aNGiCgwMTNGiAAAApAcMKtu2bVOHDh20f/9+GWMkSTabTSVLltTUqVP17LPPPpIiAQBA+pTsWT/79+9XzZo1lSlTJv3444/auXOndu7cqRkzZsjDw0M1a9bU/v37H2WtAAAgnUl2j8qgQYNUu3ZtLViwQDabzb68TJkyatGihZo0aaJBgwZp7ty5j6RQAACQ/iQ7qAQHB2vFihUOISWBzWbTgAED1KBBgxQtDgAApG/JPvQTFRWlXLly3Xe9n5+foqKiUqQoAAAA6QGCSv78+bVt27b7rt+6davy58+fIkUBAABIDxBUmjdvrl69emnfvn1J1u3du1e9e/fWG2+8kaLFAQCA9C3ZY1T69++vNWvWqEyZMqpdu7aeeuopGWP0119/ac2aNXruuec0YMCAR1krAABIZ5Ldo5IxY0YFBwdryJAhOnPmjCZOnKhJkybp7Nmz+uKLLxQcHKyMGTM+0ItPmDBBgYGB8vb2lre3t55//nmtWLHigd8EAABIm5IdVCQpQ4YM6tu3r0JDQxUdHa3o6GiFhoaqX79+unDhgjp16vRAL543b14NGzZMISEh2rFjh1588UW98sor+vPPPx/oeQAAQNr0QEHln1y6dElTp059oMc0atRIDRo0UJEiRVS0aFENGTJEnp6e2rJlS0qVBQAAUrGHutbPoxAXF6d58+bp+vXrev755+/ZJiYmRjExMfb7kZGRj6s8AADgBCnWo/Kw9u7dK09PT3l4eOidd97RokWLVKJEiXu2HTp0qHx8fOy3gICAx1wtAAB4nJweVIoVK6bQ0FBt3bpVXbp0UZs2be57zaD+/fvr6tWr9lt4ePhjrhYAADxOyT7006RJk39cHxER8VAFZMiQQYULF5YklS9fXtu3b9e3336rSZMmJWnr4eEhDw+Ph3odAACQ+iQ7qPj4+Pzr+tatW//PBcXHxzuMQwEAAOlXsoPK9OnTU/zF+/fvr/r16ytfvnyKiorSzJkztX79eq1atSrFXwsAAKQ+yQ4qR48eVcGCBe959eSHdf78ebVu3VpnzpyRj4+PAgMDtWrVKtWuXTvFXgMAAKReyQ4qRYoU0ZkzZ5QzZ05J0htvvKExY8b84xWV/82DnncFAACkL8me9WOMcbi/fPlyXb9+PcULAgAASOD06ckAAAD3k+ygYrPZkoxPScnxKgAAAHdL9hgVY4zatm1rP4/JzZs39c477yhLliwO7RYuXJiyFQIAgHQr2UGlTZs2DvfffPPNFC8GAAAgMaeeRwUAAOCfMJgWAABYFkEFAABYFkEFAABYFkEFAABYVrKCSrly5XTlyhVJ0meffabo6OhHWhQAAICUzKDy119/2U+XP3jwYF27du2RFgUAACAlc3pymTJl1K5dO1WuXFnGGH399dfy9PS8Z9tPP/00RQsEAADpV7KCSlBQkAYOHKilS5fKZrNpxYoVcnNL+lCbzUZQAQAAKSZZQaVYsWKaPXu2JMnFxUVr165Vzpw5H2lhAAAAyT4zbYL4+PhHUQcAAEASDxxUJOnIkSMaPXq0/vrrL0lSiRIl9P7776tQoUIpWhwAAEjfHvg8KqtWrVKJEiW0bds2BQYGKjAwUFu3blXJkiW1evXqR1EjAABIpx64R6Vfv37q2bOnhg0blmR53759Vbt27RQrDgAApG8P3KPy119/qUOHDkmWt2/fXvv370+RogAAAKSHCCq+vr4KDQ1Nsjw0NJSZQAAAIEU98KGfjh07qlOnTjp69KheeOEFSdLmzZs1fPhw9erVK8ULBAAA6dcDB5VPPvlEXl5eGjlypPr37y9J8vf316BBg/Tee++leIEAACD9euCgYrPZ1LNnT/Xs2VNRUVGSJC8vrxQvDAAA4KHOo5KAgAIAAB6lBx5MCwAA8LgQVAAAgGURVAAAgGURVAAAgGU9VFDp1q2bLl++nNK1AAAAOEh2UDl58qT9/zNnztS1a9ckSU8//bTCw8NTvjIAAJDuJXt6cvHixZU9e3ZVqlRJN2/eVHh4uPLly6djx47p1q1bj7JGAACQTiW7RyUiIkLz5s1T+fLlFR8frwYNGqho0aKKiYnRqlWrdO7cuUdZJwAASIeSHVRu3bql5557Th988IEyZcqkXbt2afr06XJ1ddW0adNUsGBBFStW7FHWCgAA0plkH/rJmjWrypQpo0qVKik2NlY3btxQpUqV5Obmpjlz5ihPnjzavn37o6wVAACkM8nuUTl16pQ+/vhjeXh46Pbt2ypfvryqVKmi2NhY7dy5UzabTZUrV36UtQIAgHQm2UElR44catSokYYOHarMmTNr+/bt6t69u2w2m3r37i0fHx9Vq1btUdYKAADSmYc+4ZuPj4+aNWsmd3d3rVu3TmFhYXr33XdTsjYAAJDOPdTVk/fs2aM8efJIkvLnzy93d3f5+fnpjTfeSNHiAABA+vZQQSUgIMD+/3379qVYMQAAAIlxrR8AAGBZBBUAAGBZBBUAAGBZBBUAAGBZBBUAAGBZBBUAAGBZBBUAAGBZBBUAAGBZBBUAAGBZBBUAAGBZBBUAAGBZBBUAAGBZBBUAAGBZTg0qQ4cO1bPPPisvLy/lzJlTjRs31t9//+3MkgAAgIU4Nahs2LBBXbt21ZYtW7R69WrdunVLderU0fXr151ZFgAAsAg3Z774ypUrHe4HBQUpZ86cCgkJUdWqVZ1UFQAAsAqnBpW7Xb16VZL0xBNP3HN9TEyMYmJi7
PcjIyMfS10AAMA5LDOYNj4+Xj169FClSpVUqlSpe7YZOnSofHx87LeAgIDHXCUAAHicLBNUunbtqn379mn27Nn3bdO/f39dvXrVfgsPD3+MFQIAgMfNEod+unXrpqVLl2rjxo3Kmzfvfdt5eHjIw8PjMVYGAACcyalBxRij7t27a9GiRVq/fr0KFizozHIAAIDFODWodO3aVTNnztSSJUvk5eWls2fPSpJ8fHyUKVMmZ5YGAAAswKljVCZMmKCrV6+qevXqyp07t/02Z84cZ5YFAAAswumHfgAAAO7HMrN+AAAA7kZQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAlkVQAQAAluXUoLJx40Y1atRI/v7+stlsWrx4sTPLAQAAFuPmzBe/fv26Spcurfbt26tJkybOLAUAYGEF+i1zdgnp1rFhDZ36+k4NKvXr11f9+vWdWQIAALAwpwaVBxUTE6OYmBj7/cjISCdWAwAAHrVUNZh26NCh8vHxsd8CAgKcXRIAAHiEUlVQ6d+/v65evWq/hYeHO7skAADwCKWqQz8eHh7y8PBwdhkAAOAxSVU9KgAAIH1xao/KtWvXdPjwYfv9sLAwhYaG6oknnlC+fPmcWBkAALACpwaVHTt2qEaNGvb7vXr1kiS1adNGQUFBTqoKAABYhVODSvXq1WWMcWYJAADAwhijAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALIugAgAALMsSQWXcuHEqUKCAMmbMqAoVKmjbtm3OLgkAAFiA04PKnDlz1KtXLw0cOFA7d+5U6dKlVbduXZ0/f97ZpQEAACdzelD55ptv1LFjR7Vr104lSpTQxIkTlTlzZk2bNs3ZpQEAACdzc+aLx8bGKiQkRP3797cvc3FxUa1atfTHH38kaR8TE6OYmBj7/atXr0qSIiMjH0l98THRj+R58e8e1TaV2K7O9Ci3K9I2vrfO8yi+twnPaYz517ZODSoXL15UXFyccuXK5bA8V65cOnDgQJL2Q4cO1eDBg5MsDwgIeGQ1wjl8Rju7AjwKbFcg9XmU39uoqCj5+Pj8YxunBpUH1b9/f/Xq1ct+Pz4+XpcvX1b27Nlls9mcWJm1REZGKiAgQOHh4fL29nZ2OUhBbNu0ie2adrFt780Yo6ioKPn7+/9rW6cGlRw5csjV1VXnzp1zWH7u3Dn5+fklae/h4SEPDw+HZVmzZn2UJaZq3t7efDHSKLZt2sR2TbvYtkn9W09KAqcOps2QIYPKly+vtWvX2pfFx8dr7dq1ev75551YGQAAsAKnH/rp1auX2rRpo2eeeUbPPfecRo8erevXr6tdu3bOLg0AADiZ04PKG2+8oQsXLujTTz/V2bNnVaZMGa1cuTLJAFskn4eHhwYOHJjkMBlSP7Zt2sR2TbvYtv87m0nO3CAAAAAncPoJ3wAAAO6HoAIAACyLoAIAACyLoAIAACyLoAIAACyLoAIAACyLoAIAQCoXHx/v7BIeGYIKkkg4tQ6n2Ek92FZA+ubicufPea9evTR27FgnV5OyCCpwYIyRzWbTmjVr1K9fP3Xt2lVHjhzR7du3nV0aErl774mrh+N+0vKeNhx3UlatWqUFCxYoMDDQiRWlPM5MiyRWrlypl19+WfXq1dOePXt08+ZNffvtt2rUqJEyZ87s7PLSvYQwKUkTJkzQgQMHFBUVpbZt2+rZZ59VpkyZnFwhnG3v3r0yxqhUqVL2PW2kbUuWLNHy5cuVL18+ffTRRw6/J1I7PsGQ9P+pPCIiQitWrNC4ceP0888/69ixY6pTp44++OADLVq0SNHR0U6uNH2Lj4+3//Lp27evPvroIx0/flxhYWGqWbOmhg8frtOnTzu5SjhTv379VKNGDTVs2FBly5bVoUOHnF0SHrG///5bI0aM0Jw5c3T9+nVJd3pZ00o/BEEFku58qLdt26ann35aW7ZsUYECBezrfvjhB7344ovq06ePlixZYv8i4PEyxtj3jk+fPq0rV65o1apVWrx4sYKDgzV69GiNHTtWM2fOlESXf3q0ceNG/fzzz5o5c6amTZumXLlyqXr16tq+fbuzS0MKujuAFCtWTH369FGpUqU0a9Ysbd68WVIaOiRsgERq1qxpbDabmThxorl9+7bDunbt2pmMGTOauXPnmvj4eCdVmP4sW7bM4f6MGTNM5syZTbFixcyBAwcctsXXX39tMmfObMLCwh5zlbCCkJAQM3z4cPv969evmwYNGhh/f3+zfft2J1aGlBIXF2f//5UrV8zp06ft91evXm1q1Khh6tevb7Zs2eKM8h4JelTgYM2aNapZs6YGDx6sDRs2KC4uzr5u2rRpatu2rcqUKZN2krrFLVmyRC+99JK+++47SXf2pPz9/VWtWjWdOHFCMTExstlsunHjhiSpbdu2ypYtm3bt2uXMsvGYDR8+XG+++aZef/117du3Tzdv3pQkZc6cWfPnz1fZsmXVpEkT/f77706uFP8Lk6hXdciQIWrYsKEqVaqkF198UWvWrFGtWrXUq1cvxcXF6fPPP9e2bducXHEKcXZSgnMk7IXv3bvXLF682Kxevdr8+eef9vXVqlUzefPmNWvXrk3Ss4LHJzo62nz11VfG1dXVfPvtt8aYO3tUv/32m6lQoYLJnz+/OX/+vL39yZMnTd68ec3PP//srJLxmI0aNcp4eXmZTp06mQoVKhhPT0+zePFiExMTY28THR1tKlSoYF5++WUnVoqUMnDgQJMrVy7z448/mhMnTpgCBQqYcuXKmRMnThhjjFmyZImpV6+eqVChgsPv9dSKoJKOzZ8/
32TPnt0EBgaabNmymbJly5pRo0bZ11erVs0ULFjQrFy5krDiRDdu3DAjRowwNpvNjB492hhzJ2hu3rzZPPfccyZPnjxm6tSp5qeffjINGzY0pUuXZnulE3v37jUdO3Y0a9eutS9r3Lix8fX1NUuXLjWxsbH25TExMQ6HDZA6nTx50jzzzDNmyZIlxhhjgoODjaenp5k0aZJDu9mzZ5sePXqkiW1OUElHEv/xCg0NNdmyZTPjxo0zUVFRJiQkxPTt29fkyZPHvudujDHlypUzJUuWNNevX3dGyenW3b9c4uPjzdChQ43NZrOHyYSwUqVKFWOz2cybb75pxo4da99WhJW07ZdffjFPPPGECQgIMOvXr3dY98orrxhfX1+zfPlyh54VY5J+tmBtd2+vQ4cOmaJFixpj7oxf8/T0NBMmTDDGGBMVFWWmTp1qbty48Y/PkdoQVNKBKVOmJFn2008/mXLlyjl8oMPDw80HH3xgKlasaI4ePWpffuzYscdSJ+5I/Etl+fLlZs6cOebvv/828fHxZuTIkUnCysaNG029evVM8eLFzblz54wxd7r6kXaNHj3aHDt2zHTu3Nl4eHiYwYMHm6tXrzq0adq0qbHZbOb33393UpVISVu3bjXG3OkZCwwMNG+99Zbx9vY2kydPtrf566+/TOXKlc2qVaucVeYjQVBJ43bv3m1KlizpEDyMMWbRokXGz8/P/PXXXw7Lf/vtN5MlSxbzxx9/PM4ycQ/9+vUzWbJkMYULFzZubm5m3Lhx5uzZs+abb75xOAwUFxdnNm3aZKpUqWICAwMdZgEg7QkKCjI2m83+3W3fvr0pXLiwmTp1qomKinJo279/f3rW0oCtW7cam81m7zkbNmyY8fX1NS1atLC3uXHjhmnYsKGpV69equ9BuZubswfz4tEqVqyYNm/eLB8fH+3cuVPlypWTJOXJk0deXl5auHChOnfurOzZs0uSChcurIIFCyo2NtaZZadL5r9nkjTG6Pjx4/rtt9+0evVqFStWTNOmTVO3bt0UFRWlNm3ayGaz6cMPP1RUVJQ+/vhjVa5cWSNGjFCnTp3UpEkTbd68WTabjdlZacyyZct048YN/fTTTypevLgkaerUqWrTpo2GDx8uSWrWrJk8PT0lSV9++aUk6fbt23Jz49d9auXv76/q1atrz549qlatml599VX9/fff2rBhg1q2bClfX1/t3r1bly5d0s6dO+Xi4qL4+Pi0c1ZiZyclPDqJU/W5c+eMv7+/qV27tn3ZkCFDjI+Pj/nss89MSEiIuXTpkunTp4/Jmzcve+WPWeJtdenSJXPw4EHTr18/h73h0aNHG5vNZoYPH27OnDljPvvsM1O5cmV7m/j4eLN161YO1aVRe/bsMV5eXsZms5np06cbY4zDods2bdqYp556yowZM4ZDf6nY/XpDPv74Y5MzZ077Ib4jR46YoKAgU61aNdOqVSvTr18/c+vWLWOMsf+bVhBU0piED3niX2AJXcRz5swxRYsWNY0aNbKv+/LLL03JkiVN1qxZTWBgoPH39zc7d+58vEXDbsCAAebZZ581Pj4+JjAw0Bw4cMBh/ejRo42bm5v5+OOPzaVLl+zTzOneT/uuXLlipk+fbgICAsyrr75qX37z5k37/19++WXTvHlzTsiYBhw5csRERETY71+9etU8++yzZvjw4f94aCct/i4gqKRBR44cMe3btzdhYWFm7ty5xmazmQMHDpjo6GizYMECU7BgQfPSSy/Z2+/du9esXbvWLF++3Jw8edKJlac/iX/hzJo1y+TOnduMGTPG9OjRw2TOnNn07t07SQ/JF198YSpVqmT/Y8QfpbQvYRtfu3bNBAUFGR8fH9O+fXv7+sRhJeEzxeci9Vq4cKHJnDmzadKkiVmwYIG9h+S9994ztWrVsre7fft2utjOBJU0aMuWLSZHjhymcuXKxsPDwwQFBdnX3bhxwx5WEveswLnWr19v3n33XfP999/bl40bN87kzZvX9O3bN0lYIaSkD5MmTTI9evQwr776qvnll1/MpUuXjDF3BtTmypXLvP322/a2iachp7XBlGndvb7HQUFB5oMPPjAeHh6mcePGZsKECebgwYPG29vbzJgxwwlVOg9BJY1J+MB/8803xsXFxTz33HNJZvbcvHnTLFiwwBQtWtTUqFHDGWUikTNnzphChQoZT09P+0yeBN99953JmzevGTBggDly5IjDOkJK2ta7d2+TI0cO06pVK1OnTh2TLVs20717d3PkyBETFxdnvv/+e5MnTx7TtGlTZ5eK/0HiUBkbG5vkvDchISFmwIABpnDhwqZUqVImR44cpkWLFiY2NjbdBFKCShqTcHxy/PjxZsiQIaZw4cKmWbNmSS5IFh0dbWbPnm1KlSplwsPDnVEqEtm9e7cpWrSoqV27ttmzZ4/DuvHjxxtXV1f7SZ2Q9m3YsMHkzZvX4Xs7depUExgYaPr27Wvi4+NNRESEmTBhgmnUqFG6+YOV1iTebmPHjjWNGzc2jRs3Np9++mmSdrdu3TJDhgwxjRo1Mh4eHiYkJORxl+s0BJU0ImHvOjIy0mH5hg0bzJNPPmlef/11hw92wsmD7j7vApwnNDTUlC1b1nTs2NHs27fPYd2CBQvS5CA53Nuvv/5q8ufPb+89STB+/HiTJUsWc+jQIWPMnR2OhO8+YSV1Sdwj2rdvX5M7d24zYMAAM2zYMJMlSxbzzjvv2NcnvhRCRESEee2110zbtm1NbGxsuuhZTSOTrGGz2bR8+XI1a9ZMDRo00OjRo3Xx4kVVrVpV06dP186dOzVixAgtXrxYn332mSpWrKhz587Zz7cA5ytdurSmTp2qkJAQffvtt9q/f799XZMmTeTq6upwNWukXbdu3dKVK1cUGxsrFxcX+9Wx3377bfn4+Gjr1q2SpEyZMtnPvZNmzpmRxl29elWS7Oc4mj9/vhYtWqSFCxdqyJAhKl68uOLi4jR58mQ1a9ZMkuTu7q7bt29Lknx8fFS2bFmdOHFC7u7u6eJcSXyy04gtW7bo1Vdf1dNPPy1jjObOnavu3bvr3Llzqlq1qoKCgnT48GENGjRIQUFB2r59u3LlyuXssnGXsmXLasqUKQoNDdXAgQMVFhbmsN7V1dVJleFRi4mJsf+/QYMGeuaZZ/TKK6/o2rVrypQpkyTpwoUL8vT01BNPPOHw2PTwxyot6NSpk0aMGKHz589LunOSx6ioKHXo0EEVK1bUsmXL1K5dO40cOVLz58/X/Pnz1aVLF0mSm5ubjDGSpOjoaJ08eVKRkZFOey+Pk80kvHOkWgcPHtQvv/wiY4x69+4tSZo8ebJmzJih3Llza+zYscqVK5eOHz+ua9euKXv27PLz83Ny1fgn27Zt08SJEzVlyhT2lNOBUaNGae3atQoICFCtWrXUtGlTHTx4UC1bttS5c+c0dOhQ2Ww2zZw5U2fPntW2bdsIralQz549tWjRInXp0kVt2rSRn5+fYmNjderUKfn4+KhOnTp67bXX1K9fPx09elTVqlXTqVOn1LdvXw0dOlSSFB4ern79+ql
3794qW7ask9/RY+LEw05IAYcOHTJVq1Y1efLkcRhsGRcXZyZNmmQqVapkmjdvbs6cOePEKvEwGHuQPowYMcJkzZrVvP/+++aZZ54xFSpUMF9//bUx5s6MsJYtW5pChQqZ0qVLm0aNGtnHKzBmKfVIPI5k4MCBJiAgwAwdOtThDOChoaGmUKFC9pM8njhxwrRt29b8/vvvSbZ1eruaPRd/SOXy5s2rKlWqKCwsTEuXLlX79u2VIUMGubi4qGPHjnJ1ddWoUaM0YMAA9s5TGcYepH1bt27VuXPnNH/+fNWsWVMnT57UqFGjNGPGDMXFxalPnz766aefFB4eLi8vL/n4+Mhms3HtnlTGZrPZr70zaNAgGWM0fvx4SVK7du2UK1cuZc+eXWfPntX48ePVpk0b9e/fXy4uLqpYsaJsNpvi4uLk4uIim82mzJkzO/kdPV580lMZ898L10lSXFycMmbMqE8++UReXl6aNWuW+vbtqyFDhihz5syy2Wxq37693NzcVK1aNf7gpUKMPUi7fvnlFw0YMECxsbHq1KmTpDs7Hj169JDNZtPs2bMVFxen/v37KyAgwP64+Ph4QkoqlPhCgYMHD5Yke1hp06aN8ubNqzFjxqhXr15asWKFsmfPro0bN9p3WNLzoT4+7alIQkhZt26dli5dqrCwMNWoUUOtW7fWBx98oFu3bmnp0qUaMGCAvvzyS3tYadOmjbNLB3CXfPnyqVSpUlq2bJlWrFihokWLSpICAgLUo0cPubi4aNy4cQoICNCbb75pfxw7HKnXP4WVzp07q3379mrQoIHOnj2rwMBAubi40HsmBtOmOosWLVLbtm3VtGlTFSxYUEOGDFGDBg00bdo0ZcqUScOHD9evv/6qp556SmPGjLHPFgBgPYcOHdLgwYN16NAhdenSRW3btrWvO378uJYuXap33nknXe9Np0UJYUWSBg4cqOnTp6tbt25q1aqV8uTJc8926RlBJRUJDw9X/fr19e677+rdd9+VMUbZsmVTp06dNGzYMLm4uCgmJkYDBw5USEiIfvzxR6YgAxa3f/9+DR06VEePHlXHjh0dwkqCuLg4wkoakziEDB48WEOGDNG0adMces9wB0HFwhKPR5Hu7GG9/vrr2rx5s06cOKGqVauqYcOGmjx5sqQ7U1qfe+45xcTEKCoqSjly5HBW6QAewP79+zVs2DAdO3ZMzZs317vvvuvskpACoqOj/3Hga+KwMnXqVLVt25ZAeg/0KVlMfHy8JCk2NtYeUk6ePKmYmBjdvHlTp0+f1qpVq1S3bl01bNhQEyZMkCTt2bNHn3/+uUJCQuTh4UFIAVKREiVKqF+/fvL29taePXvE/mPqtHr1avtZhAcNGqTJkyfbf6ffi4uLi/1s0x06dJCrq6tu3br1WGpNTQgqFuPi4qITJ06oa9euunHjhhYvXqyKFSvq9OnTKlasmOrVq6cmTZro6aef1uTJk+3pe86cObpw4YL8/f2d/A4A3C05waNEiRIaO3asxo8fb5/pgdTj3Llz6t27typUqKBu3bpp2LBhqlWr1r+OMUm8PioqSu7u7o+61FQnfQ8ltqj169drz549qlevnrZs2aLp06erYMGCkqTmzZsrLCxMZ86c0fLlyxUfH69169Zp6tSp2rRpk3Lnzu3k6gEkmD9/vsqVK6cnn3zyX9vGx8fbv+cMokx9cubMqaCgINWqVUvTpk1TcHCwSpUqpdjYWGXIkOGej0l8eH/06NEaN26cdu/ene7Ok/Jv+CZYSMIeVOvWrfXiiy9q06ZNKleunOrUqWNvU6tWLfXs2VNFihTRa6+9pgEDBig0NFSbNm1SYGCgs0oHcJf+/furR48eWrJkif1wwP0kPrHfli1bdPHixcdRIlJAwqEdm80mFxcXZc+eXfny5dP777+vyMhIZciQwX5BwbsflxBSJk2apC+++EKDBw8mpNwDg2mdLGHP6datW/Yuvx07dmjlypW6cuWKQkNDlTNnTn355Zf2va0EJ06cULZs2SRJXl5ej712APf2xRdf6Ntvv9Xy5ctVokQJZcmS5b5tE+9Vjx8/Xj179tSOHTv09NNPP65ykQJ27NihZ555RleuXNHBgwfVtWtXGWMUHBwsb29ve7uoqCiH39eTJk1Snz59NG3aNDVt2tQZpVvfYzxdP+7j6NGjpnbt2ub27dtmzpw5xt/f32zevNkYY8zkyZNNlSpVzBtvvGHCwsLsj9m3b5+5ceOGkyoGcD9XrlwxderUMT/88IMxxpjw8HATHBxsWrVqZSZOnGgOHz5sb5v4GjATJ0402bJlM3Pnzn3sNeN/s3XrVmOz2cw333xjjLlzHabg4GDzzDPPmGeffdZcvXrVGGNM+/btzaRJk+yPmzRpkvHx8THz5893St2pBUHFAsLDw01AQIApWbKksdlsJigoyGH9f/7zH1OtWjXTrFkzs3v3bjNo0CBToEABc+XKFecUDOC+IiIiTL58+UzPnj1NcHCwee2110zFihVN5cqVTfbs2c3w4cONMcbcunXL/piJEycab29v/mClYl999ZXJkCGDGTVqlDHm/8PKs88+a3LkyGEqV65s8uXLZ9/uM2fONDabzSxYsMCJVacOBBWLmDBhgrHZbKZIkSL29J34qrlBQUGmatWqxt/f3+TLl89s3brVWaUC+Bc//PCDyZ49u/Hx8TF9+/Y1a9asMcYY07FjR9OiRQuHthMmTDDZsmUjpKQSiXvB7r4/cuRIY7PZ7GElLi7OHD582Hz++efmk08+sYeUmJgYs2bNGrNy5crHVndqxhgVi9i0aZN27NihKVOmKEuWLFqwYIECAgIczkh54sQJHT58WEWKFHG4SBkA55o2bZr9u1m7dm3lzZtXJ0+e1M2bN1W4cGFJd8aj1a1bV+XLl9ewYcNkjNHWrVv1wgsvaN68eYxPSGWGDRumggUL6o033nAYZzRy5Eh9+OGH+u677+554r6Ea/eYu07oifsjqDhJwof0wIEDunr1qtzc3FS+fHmdOHFCDRo0UKZMmbR48WL7dR9WrVql6tWry8PDw8mVA0jsk08+0cSJE1W4cGFFRkYqT548GjlypH0wbFRUlHbu3Kmvv/5ax48f186dOx0uMrd3714GzqYyMTEx6t69u6ZMmaJFixbplVdesf9Oj4mJUYsWLfTzzz9r2LBh6t27t7PLTfWYnuwECR/oxYsXq379+mrbtq2qVKmidu3ayd3dXStWrNDNmzf1yiuvaP369erfv7/eeustnT9/3tmlA0jk1q1bOn78uFatWqU//vhDQ4cOlZubmzp27Kg///xT0p3ZIN98841u376tkJAQubm5KS4uzn5GUkKK9d19dlkPDw999dVX6tGjh5o2barFixfbe0c8PDxUsGBBlS9fXkuWLOHEfSnBSYec0r1Vq1aZrFmzmkmTJpmYmBizfPlyY7PZzBtvvGHCw8PN2bNnTbly5UyhQoVMgQIFTEhIiLNLBpDI3r17zV9//WVefPFFc+jQIfvylStXmvr165uKFSuaAwcOGGOM2bVrl33MWeJBtLC+xG
MF//77bxMSEmJiY2PtY1O6d+9u3NzczIIFC8ytW7fMrVu3zOuvv25WrVplf9zd41rwYDj04wSRkZH68MMPlSdPHn366acKCwtT7dq1VbZsWa1evVpVq1bV5MmT5efnpz179sjPz085c+Z0dtkA/qtfv36aPHmysmfPrkuXLmndunUqU6aMff2qVas0duxYHThwQGvWrFGBAgUkccbZ1KxPnz6aO3euzp07p1KlSqlmzZr69NNPlTlzZn344YcaOXKkXnjhBV2+fFkZMmTQjh07GIuSQggqThAbG6slS5aoXLlyypYtm2rVqqVy5cppypQpmjVrllq1aqXatWtr8uTJyp8/v7PLBZDIb7/9pjfffFNTpkzR0aNHNXPmTB08eFDBwcEqVqyYvd2SJUu0adMmDR8+nCvipkKJQ+WsWbPUv39/fffdd/Lz89PcuXO1adMmFSlSRJMmTVKmTJm0YMECbd26VZkzZ9bHH39sP8THtv/fEVSc5ObNm8qYMaN+/PFHjR8/XnPnzlXevHk1e/ZsTZo0SWFhYdq4caPy5cvn7FIB/NfEiRN18+ZN3bhxQ/3795ck7dq1S5988on27Nmj1atXO4SVBPzBSr0WLVqkvXv3KmPGjOrTp4+kOzub06dP1+TJk9W5c2d16tQpyeMSZvfgf0cfpJNkzJhRkhQWFqaoqCj7KbZ3796tpk2b6tChQ4QUwEJ69+6td999V7169VJYWJh9edmyZfX555+rdOnSqlevnn0QbWKElNTBGGMf5CxJ169fV/PmzTVo0CAdPXrUvjxDhgzq3Lmz/Pz8tGzZsns+FyEl5RBUnOyll17SoUOH1KhRI9WqVUvjx49X1apVudQ3YCFz587VnDlzNHPmTL311luaM2eOdu3aZV+fEFb8/Pz0ySefOLFS/C/Cw8PtoXL+/PlycXHR8ePHVahQIa1bt04hISEOM4CqVq2qiIgIRUdHO6vkdIGg4mRly5ZVcHCwChYsqOLFi+v333/nKsiAhWzYsEHr1q1Tv3791Lx5cw0ZMkQvvviiatWqpb1799rblSlTRj/88IPmz5/vxGrxsLZt26bq1atr1apV+vDDD9WxY0edPXtWfn5+2rBhg6KiotSzZ09t2rRJ0dHRioiI0M8//yw/Pz+uePyIMUbFIhIu+c3ocMA6zp49q8qVK+v8+fP66KOP1LdvX0nSqVOn1LVrV23evFnBwcEqVaqUw+OY3ZP67N69WxMmTNDChQt1+/Zt7d69WwEBAYqJiZGHh4dOnTql5557ThERESpTpoz8/Px05swZrV+/XhkyZGB2zyPEN8kiXFxc+JADFuPn56eFCxcqV65c+vnnn+2He/LkyaNx48apatWqCgwMdBi/IImQkookHMopXbq0AgICdPHiRWXNmlV79uyRdOcEbrGxscqTJ4927Nghf39/HTlyRB06dNCmTZuUIUMGxcbG8vv7EeLbBAD/IDAwUAsWLFB0dLTGjx+vffv2SboTVkaNGqW+fftyGoFULHGobNKkiZYsWaJ69eqpd+/e9sN47u7uiouLU+7cufXbb79JunNNn7CwMMXFxSlDhgxOqT294NAPACTDrl279Pbbb6t8+fJ6//33VbJkSYf1TEdNvcaMGaPZs2fr999/lyRt375dkyZN0ubNmzVkyBA1adJEkjRu3Di9/fbbunDhgp5//nnlypVLs2bNUpEiRZxZfppHjwoAJEPZsmU1ZcoUhYaGJpmuKjEdNbUyxqhIkSI6cuSI6tSpI0l69tln9c4776hKlSr68MMP9eWXX6phw4YaM2aMJClv3rzavHmzoqKi6E15DOhRAYAHsG3bNk2cOFFTpkxhLEoqdK+BznFxcdqwYYPeeustFS9eXGvXrpV0Z4Dt7NmztWzZMhUsWFDz58+Xu7u7fYAtJ/J7PAgqAPCAEmZ4MLsn9Vq6dKleeukl+/24uDitX79erVu3VokSJbR69WpJd66QHRMToyxZsshmszkc4mOmz+PBNwwAHpDNZpMxhpCSSu3bt09NmjTRW2+9ZV/m6uqqqlWr6rvvvtPatWvVvHlzSXcG0np6etqDaeJDfISUx4NvGQA8BP5IpR6JzyYrScWKFdOUKVP022+/qXXr1vbl7u7ueu6551SkSBHNnTtX3bt3d3gcwdQ5OPQDAEizEh+eW7Jkic6cOaNMmTKpTJky2rNnjwYPHqznn39eM2bMkCRdvnxZH3zwgd5++21VrFiRMSgWQFABAKR5vXv31vfff69ixYopNDRU5cuXV61atVSgQAF9+umnevLJJ9WiRQvNnDlTbm5uWrlypVxcXBgwawH0YwEA0rT58+dr5syZWrlypTZt2qSTJ0+qaNGi2rRpky5fvqygoCBFREToP//5jzw8PLRs2TK5uLgoPj6ekGIB9KgAANK0ESNGaOHChdq0aZNcXV3l4uKic+fOqUuXLoqOjtbKlSslSRcvXlT27NmTzO6Bc9GjAgBIkxL2w93c3HTz5k3FxsbKxcVFt2/fVq5cuTRgwAD9+uuvCgkJkSTlyJHDPqOLkGIdBBUAQJqUMDOrXr162rdvn77++mtJ/38W4bi4OJUqVUrZsmW75+NgDURGAECaVqJECU2dOlUdO3ZUZGSkmjZtqmzZsmnw4MHKmjWrChQo4OwS8Q8YowIASBcWLlyobt26yWazKXPmzMqZM6fWr18vd3d3zjJsYQQVAEC6cfbsWZ07d06xsbEqX768fcwKY1Ksi6ACAEi36EmxPoIKAACwLGIkAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwLIIKAACwrP8DP8H9nuMh23IAAAAASUVORK5CYII=",
126
+ "text/plain": [
127
+ "<Figure size 640x480 with 1 Axes>"
128
+ ]
129
+ },
130
+ "metadata": {},
131
+ "output_type": "display_data"
132
+ },
133
+ {
134
+ "name": "stdout",
135
+ "output_type": "stream",
136
+ "text": [
137
+ "attention 412,316,860,416 38.63%\n",
138
+ "mlp 549,756,993,536 51.50%\n",
139
+ "rms_norm 236,544 0.00%\n",
140
+ "output_layer 105,396,568,064 9.87%\n",
141
+ "\n",
142
+ "\n",
143
+ "Total forward FLOPs: 1,067,470,658,560\n"
144
+ ]
145
+ }
146
+ ],
147
+ "source": [
148
+ "ops_per_matmul = 2 # Multiply + accumulate (MAC)\n",
149
+ "ops_per_activation = 9 # Assuming GELU\n",
150
+ "ops_per_rms_norm = 7 # y = (x / sqrt(rms[x] + epsilon)) * gamma\n",
151
+ "\n",
152
+ "# K, Q, V projections\n",
153
+ "attention = ops_per_matmul * block_size * embedding_dimensions * 3 * embedding_dimensions\n",
154
+ "\n",
155
+ "# Attention logits\n",
156
+ "attention += 2 * ops_per_matmul * block_size ** 2 * embedding_dimensions\n",
157
+ "\n",
158
+ "# Output projection\n",
159
+ "attention += ops_per_matmul * block_size * embedding_dimensions ** 2\n",
160
+ "\n",
161
+ "attention *= num_hidden_layers\n",
162
+ "\n",
163
+ "# Linear transformations\n",
164
+ "mlp = 2 * ops_per_matmul * block_size * embedding_dimensions * 4 * embedding_dimensions\n",
165
+ "\n",
166
+ "# Non-linear activations\n",
167
+ "mlp += ops_per_activation * 4 * embedding_dimensions\n",
168
+ "\n",
169
+ "mlp *= num_hidden_layers\n",
170
+ "\n",
171
+ "rms_norm = ops_per_rms_norm * embedding_dimensions * (num_hidden_layers + 1)\n",
172
+ "\n",
173
+ "output_layer = ops_per_matmul * block_size * embedding_dimensions * vocabulary_size\n",
174
+ "\n",
175
+ "flops = {\n",
176
+ " \"attention\": attention,\n",
177
+ " \"mlp\": mlp,\n",
178
+ " \"rms_norm\": rms_norm,\n",
179
+ " \"output_layer\": output_layer,\n",
180
+ "}\n",
181
+ "\n",
182
+ "plt.bar(flops.keys(), flops.values())\n",
183
+ "\n",
184
+ "plt.title(\"Model Operations\")\n",
185
+ "plt.ylabel(\"# of FLOPs\")\n",
186
+ "plt.xticks(rotation=45)\n",
187
+ "\n",
188
+ "plt.show()\n",
189
+ "\n",
190
+ "total_forward_flops = sum(flops.values())\n",
191
+ "\n",
192
+ "for name, count in flops.items():\n",
193
+ " print(f\"{name:20s} {count:20,d} {count / total_forward_flops * 100:10.2f}%\")\n",
194
+ "\n",
195
+ "print(\"\\n\")\n",
196
+ "\n",
197
+ "print(f\"Total forward FLOPs: {total_forward_flops:,}\")"
198
+ ]
199
+ },
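As a sanity check on the figure above, the matrix-multiply FLOPs of a forward pass are often approximated as 2 x (matmul parameters) x (tokens), plus an attention-score term that grows with the square of the sequence length. The sketch below reproduces the total from that shortcut using the same configuration values; it is an independent estimate for comparison, not part of the notebook.

embedding_dimensions = 1024
num_hidden_layers = 32
vocabulary_size = 50257
block_size = 1024

# Parameters that participate in matrix multiplications:
# 4*d^2 per layer for attention, 8*d^2 per layer for the MLP, plus the (tied) output projection.
matmul_parameters = (4 + 8) * embedding_dimensions**2 * num_hidden_layers + vocabulary_size * embedding_dimensions

approximate_forward_flops = 2 * block_size * matmul_parameters

# Attention scores (QK^T) and the weighted sum over values scale with block_size squared.
approximate_forward_flops += 4 * block_size**2 * embedding_dimensions * num_hidden_layers

print(f"{approximate_forward_flops:,}")
# ~1.067e12; the tiny gap vs. the total above is the RMS norm and activation terms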
200
+ {
201
+ "cell_type": "markdown",
202
+ "metadata": {},
203
+ "source": [
204
+ "Next, we'll estimate the number of FLOPs for the backward pass. For this we use a simple heuristic of 2X the forward pass."
205
+ ]
206
+ },
207
+ {
208
+ "cell_type": "code",
209
+ "execution_count": 67,
210
+ "metadata": {},
211
+ "outputs": [
212
+ {
213
+ "name": "stdout",
214
+ "output_type": "stream",
215
+ "text": [
216
+ "Total backward FLOPs: 2,134,941,317,120\n"
217
+ ]
218
+ }
219
+ ],
220
+ "source": [
221
+ "total_backward_flops = 2 * total_forward_flops\n",
222
+ "\n",
223
+ "print(f\"Total backward FLOPs: {total_backward_flops:,}\")"
224
+ ]
225
+ },
226
+ {
227
+ "cell_type": "markdown",
228
+ "metadata": {},
229
+ "source": [
230
+ "We'll do the same for the total FLOPs per roundtrip."
231
+ ]
232
+ },
233
+ {
234
+ "cell_type": "code",
235
+ "execution_count": 68,
236
+ "metadata": {},
237
+ "outputs": [
238
+ {
239
+ "name": "stdout",
240
+ "output_type": "stream",
241
+ "text": [
242
+ "Total roundtrip FLOPs: 3,202,411,975,680\n"
243
+ ]
244
+ }
245
+ ],
246
+ "source": [
247
+ "total_roundtrip_flops = total_forward_flops + total_backward_flops\n",
248
+ "\n",
249
+ "print(f\"Total roundtrip FLOPs: {total_roundtrip_flops:,}\")"
250
+ ]
251
+ },
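The round-trip number can also be compared against the familiar "C = 6ND" rule of thumb, which prices one training token at roughly 6 FLOPs per matmul parameter (2 forward plus 4 backward). The sketch below applies it to one 1,024-token sample; it lands about 13% below the notebook's figure because the rule ignores the attention-score FLOPs, which are not negligible when the block size is comparable to the embedding width. This is a cross-check, not a correction.

# Attention 134,217,728 + MLP 268,435,456 + tied output projection 51,463,168, from the breakdown above.
matmul_parameters = 454_116_352
block_size = 1024

rule_of_thumb_roundtrip_flops = 6 * matmul_parameters * block_size

print(f"{rule_of_thumb_roundtrip_flops:,}")  # 2,790,090,866,688 vs. the 3,202,411,975,680 computed above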
252
+ {
253
+ "cell_type": "markdown",
254
+ "metadata": {},
255
+ "source": [
256
+ "Now, let's estimate how long it would take to train over every sample in the Openwebtext training set at least once in expectation using a few well-known Nvidia GPUs as benchmarks. Note that these results shown here are a best-case scenario and neglect to factor in overhead such as moving data to and from VRAM."
257
+ ]
258
+ },
259
+ {
260
+ "cell_type": "code",
261
+ "execution_count": 69,
262
+ "metadata": {},
263
+ "outputs": [
264
+ {
265
+ "name": "stdout",
266
+ "output_type": "stream",
267
+ "text": [
268
+ "Total tokens: 8,994,885,755\n",
269
+ "Epochs required: 2,145\n",
270
+ "\n",
271
+ "RTX A2000: 513.19 seconds/epoch, 12.74 days required\n",
272
+ "A100 SXM: 52.55 seconds/epoch, 1.30 days required\n",
273
+ "HGX B100: 1.17 seconds/epoch, 0.03 days required\n"
274
+ ]
275
+ }
276
+ ],
277
+ "source": [
278
+ "RTX_A2000_BF16_FLOPS_PER_SECOND = 63.9e12\n",
279
+ "A100_SXM_BF16_FLOPS_PER_SECOND = 624.0e12\n",
280
+ "HGX_B100_BF16_FLOPS_PER_SECOND = 28000e12\n",
281
+ "\n",
282
+ "ESTIMATED_FLOPS_UTILIZATION = 0.4\n",
283
+ "\n",
284
+ "num_training_tokens = 8994885755\n",
285
+ "samples_per_epoch = 4096\n",
286
+ "\n",
287
+ "num_epochs_required = round(num_training_tokens / (samples_per_epoch * block_size))\n",
288
+ "\n",
289
+ "print(f\"Total tokens: {num_training_tokens:,}\")\n",
290
+ "print(f\"Epochs required: {num_epochs_required:,}\", end=\"\\n\\n\")\n",
291
+ "\n",
292
+ "gpus = {\n",
293
+ " \"RTX A2000\": RTX_A2000_BF16_FLOPS_PER_SECOND,\n",
294
+ " \"A100 SXM\": A100_SXM_BF16_FLOPS_PER_SECOND,\n",
295
+ " \"HGX B100\": HGX_B100_BF16_FLOPS_PER_SECOND,\n",
296
+ "}\n",
297
+ "\n",
298
+ "for name, flops_per_second in gpus.items():\n",
299
+ " flops_per_second *= ESTIMATED_FLOPS_UTILIZATION\n",
300
+ "\n",
301
+ " seconds_per_epoch = samples_per_epoch * total_roundtrip_flops / flops_per_second\n",
302
+ "\n",
303
+ " days_required = num_epochs_required * seconds_per_epoch / 60 / 60 / 24\n",
304
+ "\n",
305
+ " print(f\"{name}: {seconds_per_epoch:.2f} seconds/epoch, {days_required:,.2f} days required\")"
306
+ ]
307
+ }
308
+ ],
309
+ "metadata": {
310
+ "kernelspec": {
311
+ "display_name": ".venv",
312
+ "language": "python",
313
+ "name": "python3"
314
+ },
315
+ "language_info": {
316
+ "codemirror_mode": {
317
+ "name": "ipython",
318
+ "version": 3
319
+ },
320
+ "file_extension": ".py",
321
+ "mimetype": "text/x-python",
322
+ "name": "python",
323
+ "nbconvert_exporter": "python",
324
+ "pygments_lexer": "ipython3",
325
+ "version": "3.12.3"
326
+ }
327
+ },
328
+ "nbformat": 4,
329
+ "nbformat_minor": 2
330
+ }
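The 40% utilization baked into the last cell (ESTIMATED_FLOPS_UTILIZATION) corresponds to model FLOPs utilization (MFU). During an actual run it can be measured by comparing achieved throughput against the device's peak. A minimal sketch follows, assuming you log tokens per second from the training loop; the helper name, arguments, and the example throughput are illustrative and not part of the repository.

def model_flops_utilization(tokens_per_second: float, flops_per_token: float, peak_flops_per_second: float) -> float:
    """Fraction of the GPU's peak throughput that the training loop actually achieves."""
    return tokens_per_second * flops_per_token / peak_flops_per_second

# Round-trip FLOPs per token from the notebook: the total divided by the 1,024-token block size.
flops_per_token = 3_202_411_975_680 / 1024

# Example: a measured 20,000 tokens/sec on an A100 SXM with a 624 TFLOPS BF16 peak.
print(f"{model_flops_utilization(20_000, flops_per_token, 624.0e12):.2%}")  # ~10%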
models/lightgpt-small.pt ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c9dfe5de2f2668272d38a7b2d29bc904229a359e9e970585eb7457ee4cb1ef8c
3
+ size 1819529541
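As a quick consistency check, the size recorded in the LFS pointer above is in line with the notebook's fp32 estimate; the few extra megabytes are checkpoint metadata. The one-liner below just converts the recorded byte count.

print(f"{1_819_529_541 / 1e9:.2f} GB")  # 1.82 GB, matching the model_sizing.ipynb estimate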
out/.gitignore ADDED
@@ -0,0 +1,2 @@
 
 
 
1
+ *
2
+ !.gitignore
pre-train.py ADDED
@@ -0,0 +1,320 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import sys
2
+ import random
3
+ import signal
4
+ import warnings
5
+
6
+ from os import path, environ
7
+ from argparse import ArgumentParser
8
+ from contextlib import nullcontext
9
+
10
+ import torch
11
+
12
+ from torch.utils.data import DataLoader
13
+ from torch.optim import Adafactor
14
+ from torch.amp import autocast
15
+ from torch.cuda import set_device, is_available as cuda_is_available, is_bf16_supported
16
+ from torch.nn.utils import clip_grad_norm_
17
+ from torch.distributed import init_process_group, destroy_process_group
18
+ from torch.distributed.optim import ZeroRedundancyOptimizer
19
+ from torch.nn.parallel import DistributedDataParallel
20
+
21
+ from torchmetrics.text import Perplexity
22
+
23
+ from model import GPT
24
+ from data import Openwebtext
25
+
26
+ from tqdm import tqdm
27
+
28
+ RANK = int(environ.get("RANK", -1))
29
+ LOCAL_RANK = int(environ.get("LOCAL_RANK", -1))
30
+ WORLD_SIZE = int(environ.get("WORLD_SIZE", -1))
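+ # RANK, LOCAL_RANK, and WORLD_SIZE are injected by the torchrun / torch.distributed
+ # launcher; they fall back to -1 for ordinary single-process (non-DDP) runs.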
31
+
32
+ IS_DDP = WORLD_SIZE > 1
33
+
34
+ IS_MASTER = RANK == 0 or not IS_DDP
35
+
36
+ DDP_BACKEND = "nccl" # nccl, gloo, etc.
37
+
38
+
39
+ def main():
40
+ parser = ArgumentParser(description="Pre-train the GPT.")
41
+
42
+ parser.add_argument("--batch_size", default=1, type=int)
43
+ parser.add_argument("--gradient_accumulation_steps", default=128, type=int)
44
+ parser.add_argument("--samples_per_epoch", default=4096, type=int)
45
+ parser.add_argument("--learning_rate", default=1e-2, type=float)
46
+ parser.add_argument("--max_gradient_norm", default=1.0, type=float)
47
+ parser.add_argument("--dropout", default=0.1, type=float)
48
+ parser.add_argument("--num_epochs", default=2140, type=int)
49
+ parser.add_argument("--block_size", default=1024, type=int)
50
+ parser.add_argument("--embedding_dimensions", default=1024, type=int)
51
+ parser.add_argument("--num_attention_heads", default=16, type=int)
52
+ parser.add_argument("--num_hidden_layers", default=32, type=int)
53
+ parser.add_argument("--activation_checkpointing", action="store_true")
54
+ parser.add_argument("--eval_interval", default=10, type=int)
55
+ parser.add_argument("--checkpoint_interval", default=20, type=int)
56
+ parser.add_argument("--checkpoint_path", default="./out/checkpoint.pt", type=str)
57
+ parser.add_argument("--checkpoint_history", action="store_true")
58
+ parser.add_argument("--resume", action="store_true")
59
+ parser.add_argument("--dataset_path", default="./dataset", type=str)
60
+ parser.add_argument("--num_dataset_processes", default=8, type=int)
61
+ parser.add_argument("--device", default="cuda", type=str)
62
+ parser.add_argument("--seed", default=None, type=int)
63
+
64
+ args = parser.parse_args()
65
+
66
+ if args.batch_size < 1:
67
+ raise ValueError(f"Batch size must be greater than 0, {args.batch_size} given.")
68
+
69
+ if args.gradient_accumulation_steps < 1:
70
+ raise ValueError(
71
+ f"Gradient accumulation steps must be greater than 0, {args.gradient_accumulation_steps} given."
72
+ )
73
+
74
+ if args.learning_rate <= 0:
75
+ raise ValueError(
76
+ f"Learning rate must be a positive value, {args.learning_rate} given."
77
+ )
78
+
79
+ if args.num_epochs < 1:
80
+ raise ValueError(f"Must train for at least 1 epoch, {args.num_epochs} given.")
81
+
82
+ if args.eval_interval < 1:
83
+ raise ValueError(
84
+ f"Eval interval must be greater than 0, {args.eval_interval} given."
85
+ )
86
+
87
+ if args.checkpoint_interval < 1:
88
+ raise ValueError(
89
+ f"Checkpoint interval must be greater than 0, {args.checkpoint_interval} given."
90
+ )
91
+
92
+ if IS_DDP:
93
+ init_process_group(backend=DDP_BACKEND, world_size=WORLD_SIZE)
94
+
95
+ args.device = f"cuda:{LOCAL_RANK}"
96
+
97
+ set_device(args.device)
98
+
99
+ if args.seed:
100
+ args.seed += RANK
101
+
102
+ if args.gradient_accumulation_steps % WORLD_SIZE != 0:
103
+ warnings.warn(
104
+ "Number of gradient accumulation steps is not "
105
+ "evenly divisible by the world size."
106
+ )
107
+
108
+ args.gradient_accumulation_steps //= WORLD_SIZE
109
+
110
+ assert (
111
+ args.gradient_accumulation_steps > 0
112
+ ), "World size is larger than the number of gradient accumulation steps."
113
+
114
+ if args.samples_per_epoch % WORLD_SIZE != 0:
115
+ warnings.warn(
116
+ "Number of samples per epoch is not "
117
+ "evenly divisible by the world size."
118
+ )
119
+
120
+ args.samples_per_epoch //= WORLD_SIZE
121
+
122
+ assert (
123
+ args.samples_per_epoch > 0
124
+ ), "World size is larger than the number of samples per epoch."
125
+
126
+ torch.set_float32_matmul_precision("high")
127
+
128
+ if "cuda" in args.device and not cuda_is_available():
129
+ raise RuntimeError("CUDA is not available.")
130
+
131
+ dtype = (
132
+ torch.bfloat16
133
+ if "cuda" in args.device and is_bf16_supported()
134
+ else torch.float32
135
+ )
136
+
137
+ forward_context = autocast(device_type=args.device, dtype=dtype)
138
+
139
+ if args.seed:
140
+ torch.manual_seed(args.seed)
141
+ random.seed(args.seed)
142
+
143
+ training = Openwebtext(
144
+ root_path=args.dataset_path,
145
+ train=True,
146
+ tokens_per_sample=args.block_size,
147
+ samples_per_epoch=args.samples_per_epoch,
148
+ num_processes=args.num_dataset_processes,
149
+ )
150
+ testing = Openwebtext(
151
+ root_path=args.dataset_path,
152
+ train=False,
153
+ tokens_per_sample=args.block_size,
154
+ samples_per_epoch=args.samples_per_epoch,
155
+ num_processes=args.num_dataset_processes,
156
+ )
157
+
158
+ train_loader = DataLoader(
159
+ training, batch_size=args.batch_size, pin_memory="cpu" not in args.device
160
+ )
161
+ test_loader = DataLoader(
162
+ testing, batch_size=args.batch_size, pin_memory="cpu" not in args.device
163
+ )
164
+
165
+ model_args = {
166
+ "block_size": args.block_size,
167
+ "embedding_dimensions": args.embedding_dimensions,
168
+ "num_heads": args.num_attention_heads,
169
+ "num_layers": args.num_hidden_layers,
170
+ "dropout": args.dropout,
171
+ "vocabulary_size": training.vocabulary_size,
172
+ "padding_index": training.PADDING_INDEX,
173
+ "eos_index": training.eos_index,
174
+ }
175
+
176
+ model = GPT(**model_args, activation_checkpointing=args.activation_checkpointing).to(args.device)  # Move to the target device before DDP wrapping.
177
+
178
+ if IS_DDP:
179
+ model = DistributedDataParallel(model, device_ids=[LOCAL_RANK])
180
+
181
+ print("Compiling model")
182
+ model = torch.compile(model)
183
+
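+ # ZeroRedundancyOptimizer shards the Adafactor optimizer state across DDP ranks
+ # (ZeRO stage 1 style) so each GPU only holds a slice of it.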
184
+ if IS_DDP:
185
+ optimizer = ZeroRedundancyOptimizer(
186
+ model.parameters(),
187
+ optimizer_class=Adafactor,
188
+ lr=args.learning_rate,
189
+ )
190
+ else:
191
+ optimizer = Adafactor(model.parameters(), lr=args.learning_rate)
192
+
193
+ starting_epoch = 1
194
+
195
+ if args.resume:
196
+ checkpoint = torch.load(
197
+ args.checkpoint_path, map_location="cpu", weights_only=True
198
+ ) # Always load into CPU RAM first to prevent CUDA out-of-memory errors.
199
+
200
+ model.load_state_dict(checkpoint["model"])
201
+ optimizer.load_state_dict(checkpoint["optimizer"])
202
+ starting_epoch += checkpoint["epoch"]
203
+
204
+ model = model.to(args.device)
205
+
206
+ print("Previous checkpoint resumed successfully")
207
+
208
+ model.train()
209
+
210
+ print(f"Model has {model.num_trainable_params:,} trainable parameters")
211
+
212
+ perplexity_metric = Perplexity(ignore_index=training.PADDING_INDEX).to(args.device)
213
+
214
+ signal.signal(signal.SIGTERM, on_sigterm)
215
+
216
+ print("Pre-training ...")
217
+
218
+ for epoch in range(starting_epoch, args.num_epochs + 1):
219
+ total_cross_entropy, total_gradient_norm = 0.0, 0.0
220
+ total_batches, total_steps = 0, 0
221
+
222
+ for step, (x, y) in enumerate(
223
+ tqdm(train_loader, desc=f"Epoch {epoch}", leave=False), start=1
224
+ ):
225
+ x = x.to(args.device, non_blocking=True)
226
+ y = y.to(args.device, non_blocking=True)
227
+
228
+ with forward_context:
229
+ y_pred, loss = model(x, y)
230
+
231
+ scaled_loss = loss / args.gradient_accumulation_steps
232
+
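+ # Gradient accumulation: gradients from `gradient_accumulation_steps` micro-batches are
+ # summed before a single optimizer step. Under DDP, no_sync() skips the gradient
+ # all-reduce on intermediate micro-batches so communication only happens when stepping.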
233
+ sync_and_step = step % args.gradient_accumulation_steps == 0
234
+
235
+ backward_context = (
236
+ model.no_sync() if IS_DDP and not sync_and_step else nullcontext()
237
+ )
238
+
239
+ with backward_context:
240
+ scaled_loss.backward()
241
+
242
+ total_cross_entropy += loss.item()
243
+
244
+ if sync_and_step:
245
+ norm = clip_grad_norm_(model.parameters(), args.max_gradient_norm)
246
+
247
+ optimizer.step()
248
+
249
+ optimizer.zero_grad(set_to_none=True)
250
+
251
+ total_gradient_norm += norm.item()
252
+ total_steps += 1
253
+
254
+ total_batches += 1
255
+
256
+ average_cross_entropy = total_cross_entropy / total_batches
257
+ average_gradient_norm = total_gradient_norm / total_steps
258
+
259
+ print(
260
+ f"Epoch {epoch}:",
261
+ f"Cross Entropy: {average_cross_entropy:.5f},",
262
+ f"Gradient Norm: {average_gradient_norm:.4f}",
263
+ )
264
+
265
+ if epoch % args.eval_interval == 0 and IS_MASTER:
266
+ model.eval()
267
+
268
+ for x, y in tqdm(test_loader, desc="Testing", leave=False):
269
+ x = x.to(args.device, non_blocking=True)
270
+ y = y.to(args.device, non_blocking=True)
271
+
272
+ with torch.no_grad():
273
+ y_pred, _ = model(x)
274
+
275
+ perplexity_metric.update(y_pred, y)
276
+
277
+ perplexity = perplexity_metric.compute()
278
+
279
+ print(f"Perplexity: {perplexity:.3f}")
280
+
281
+ perplexity_metric.reset()
282
+
283
+ model.train()
284
+
285
+ if epoch % args.checkpoint_interval == 0 and IS_MASTER:
286
+ checkpoint = {
287
+ "epoch": epoch,
288
+ "model_args": model_args,
289
+ "model": model.state_dict(),
290
+ "optimizer": optimizer.state_dict(),
291
+ }
292
+
293
+ if args.checkpoint_history:
294
+ root, ext = path.splitext(args.checkpoint_path)
295
+
296
+ checkpoint_path = f"{root}-{epoch}{ext}"
297
+ else:
298
+ checkpoint_path = args.checkpoint_path
299
+
300
+ torch.save(checkpoint, checkpoint_path)
301
+
302
+ print("Checkpoint saved")
303
+
304
+ if IS_DDP:
305
+ destroy_process_group()
306
+
307
+ print("Done!")
308
+
309
+
310
+ def on_sigterm(signum, frame):
311
+ print("Hold on, attempting to exit gracefully.")
312
+
313
+ if IS_DDP:
314
+ destroy_process_group()
315
+
316
+ sys.exit(0)
317
+
318
+
319
+ if __name__ == "__main__":
320
+ main()
requirements.txt ADDED
@@ -0,0 +1,7 @@
1
+ datasets==3.0.2
2
+ numpy==1.26.4
3
+ torch==2.5.1
4
+ torchmetrics==1.5.1
5
+ tiktoken==0.8.0
6
+ tqdm==4.66.6
7
+ matplotlib==3.9.2