|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- Fishfishfishfishfish/Synthetic_text.txt |
|
language: |
|
- en |
|
--- |
|
The only files needed for inference are `inference.py`, `word2idk.pkl`, and `lstm_Hxxx.safetensors`.
|
|
|
Input tokens must be space-separated, since the prompt is not run through the same tokenization as the training data.
|
>python inference.py --temp 0.5 --top_k 64 --model_file lstm_H256.safetensors --start_sequence "User : what is the capital of France ? Bot : " --max_length 32 |
|
|
|
This usually results in something like:
|
|
|
>The capital of the world of the world of the world of the world of the |
|
|
|
It's not very accurate yet; it was trained on only 1.2 MB of text.
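The space-separated input requirement above can be sketched as a simple split-and-lookup. In the real pipeline the vocabulary would come from `word2idk.pkl` (assumed here to be a pickled `{token: index}` dict); a toy vocabulary and a hypothetical `<unk>` fallback stand in so the sketch runs on its own.

```python
# Toy stand-in for the real vocabulary; in practice it would be loaded with:
#   with open("word2idk.pkl", "rb") as f:
#       word2idx = pickle.load(f)
word2idx = {"<unk>": 0, "User": 1, ":": 2, "what": 3, "is": 4, "the": 5,
            "capital": 6, "of": 7, "France": 8, "?": 9, "Bot": 10}

prompt = "User : what is the capital of France ? Bot :"
unk_id = word2idx["<unk>"]  # unknown-token handling is an assumption
token_ids = [word2idx.get(tok, unk_id) for tok in prompt.split()]
print(token_ids)
```

Note that `prompt.split()` is the whole tokenizer here, which is why punctuation in the start sequence has to be padded with spaces.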
|
|
|
Each `.safetensors` file represents a different hidden-dim value, each trained for 1 epoch.
|
|
|
The hidden-dim value in `inference.py` must be edited to match whichever `.safetensors` file you load.
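Since the checkpoints encode their hidden dim in the filename (`lstm_H256.safetensors`), one way to avoid hand-editing is to parse it back out. This helper is hypothetical and not part of `inference.py`; it only assumes the `lstm_H<number>.safetensors` naming used in this repo.

```python
import re

def hidden_dim_from_name(path):
    """Read the hidden dim out of a checkpoint name like 'lstm_H256.safetensors'."""
    match = re.search(r"lstm_H(\d+)\.safetensors$", path)
    if match is None:
        raise ValueError(f"cannot infer hidden dim from {path!r}")
    return int(match.group(1))

print(hidden_dim_from_name("lstm_H256.safetensors"))  # 256
```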
|
|
|
>sequence_length = 64 |
|
> |
|
>batch_size = 16 |
|
> |
|
>learning_rate = 0.0001 |
|
> |
|
>embedding_dim = 256 |
|
> |
|
>num_layers = 4 |
|
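For a rough sense of model size, the hyperparameters above can be plugged into the standard LSTM parameter formula (4 gates, input and hidden weight matrices plus two bias vectors per layer, as in `torch.nn.LSTM`). The vocabulary size is a placeholder assumption; the real value comes from `word2idk.pkl`.

```python
embedding_dim = 256
hidden_dim = 256      # 256 for lstm_H256.safetensors; change per checkpoint
num_layers = 4
vocab_size = 10_000   # placeholder assumption, not the repo's actual vocab

def lstm_params(input_size, hidden, layers):
    """Parameter count of a stacked LSTM with torch.nn.LSTM's layout."""
    total = 0
    for layer in range(layers):
        in_size = input_size if layer == 0 else hidden
        # weight_ih (4H x in) + weight_hh (4H x H) + bias_ih + bias_hh (4H each)
        total += 4 * hidden * in_size + 4 * hidden * hidden + 8 * hidden
    return total

embedding = vocab_size * embedding_dim
lstm = lstm_params(embedding_dim, hidden_dim, num_layers)
head = hidden_dim * vocab_size + vocab_size  # linear projection back to vocab
total_params = embedding + lstm + head
print(total_params)
```

The LSTM stack itself contributes about 2.1 M parameters at H=256; with a larger hidden dim the count grows roughly quadratically, which is why each checkpoint needs its matching value.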