Valerie v0.1 Model Card
Overview
Valerie v0.1 is a custom language model created using llama.cpp (commit: 532c173) with a context length of 256 tokens, an embedding length of 256, 8 heads, and 16 layers. The model was pretrained on a dataset consisting of female V's dialog from Cyberpunk 2077, extracted using the Voice Over Subtitle Map mod.
Model Information
Full sampling
Model name | Adam iteration | Model filename | Vocabulary size |
---|---|---|---|
Valerie v0.1 Checkpoint | 1750 | chk-valerie-v0.1-256x32-1750.gguf | 32,000 |
Valerie v0.1 Model | 1750 | ggml-valerie-v0.1-256x32-f32-1750.gguf | 32,000 |
The ggml-valerie-v0.1-256x32-f32-1750.gguf release represents a single epoch over all 51,443 samples, completing over 1,700 iterations, and took approximately 3 hours to train.
Repeat sampling
Model name | Adam iteration | Model filename | Vocabulary size |
---|---|---|---|
Valerie v0.1 Checkpoint | 3600 | chk-valerie-v0.1-256x32-LATEST.gguf | 32,000 |
Valerie v0.1 Model | 3600 | ggml-valerie-v0.1-256x32-f32-LATEST.gguf | 32,000 |
The ggml-valerie-v0.1-256x32-f32-LATEST.gguf release represents two epochs over all 51,443 samples, completing 3,600 iterations, and took approximately 6 hours to train.
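The relationship between iteration counts and epochs can be checked with a little arithmetic. The sketch below assumes that each Adam iteration consumes exactly one batch of 32 samples (the batch size listed under Settings) with no gradient accumulation; that assumption is not stated in this card, so treat the results as a rough cross-check rather than a description of the actual training run.

```python
import math

N_SAMPLES = 51_443  # dataset size stated in this card
BATCH_SIZE = 32     # batch size from the Settings section


def iterations_per_epoch(n_samples: int, batch_size: int) -> int:
    """Iterations needed to see every sample once (assumes one batch per iteration)."""
    return math.ceil(n_samples / batch_size)


def epochs_covered(iterations: int, n_samples: int, batch_size: int) -> float:
    """Number of passes over the dataset after the given number of iterations."""
    return iterations * batch_size / n_samples


print(iterations_per_epoch(N_SAMPLES, BATCH_SIZE))            # 1608
print(round(epochs_covered(1750, N_SAMPLES, BATCH_SIZE), 2))  # 1.09
print(round(epochs_covered(3600, N_SAMPLES, BATCH_SIZE), 2))  # 2.24
```

Under that assumption, iteration 1750 lands slightly past one full pass and iteration 3600 slightly past two, which is in line with the one-epoch and two-epoch descriptions above.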
Files and versions
- ggml-vocab-mistral.gguf: Extracted Mistral 7B model vocabulary.
- ggml-valerie-v0.1-256x32-f32-1750.gguf: The pretrained model at iteration 1750.
- ggml-valerie-v0.1-256x32-f32-LATEST.gguf: The latest pretrained model, currently at iteration 3600.
Settings
- Vocabulary size: 32,000
- Context length: 256 tokens
- Embedding length: 256
- Heads: 8
- Layers: 16
- Batch size: 32
- Seed: 1
- Saved checkpoint every 50 iterations
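As a cross-check, the settings above can be turned into an approximate parameter count. The sketch below assumes a standard LLaMA-style layout (untied input and output embeddings, no biases, three feed-forward projections per layer) and a feed-forward length of 768. The feed-forward length is an assumption that does not appear in this card; it was picked because it reproduces the 30.02 M parameters and 114.53 MiB reported in the Benchmarks section.

```python
def llama_param_count(n_vocab: int, n_embd: int, n_layer: int, n_ff: int) -> int:
    """Approximate weight count for a LLaMA-style decoder (no biases, untied embeddings)."""
    embeddings = n_vocab * n_embd     # token embedding matrix
    output_head = n_vocab * n_embd    # output projection (assumed untied)
    final_norm = n_embd               # final RMSNorm weights
    attention = 4 * n_embd * n_embd   # query, key, value, and output projections
    feed_forward = 3 * n_embd * n_ff  # gate, up, and down projections
    layer_norms = 2 * n_embd          # attention norm + feed-forward norm
    per_layer = attention + feed_forward + layer_norms
    return embeddings + output_head + final_norm + n_layer * per_layer


# n_ff=768 is an ASSUMPTION; the other values come from the Settings list.
params = llama_param_count(n_vocab=32_000, n_embd=256, n_layer=16, n_ff=768)
size_mib = params * 4 / 2**20  # F32 stores 4 bytes per weight

print(f"{params:,} parameters, {size_mib:.2f} MiB")
# 30,023,936 parameters, 114.53 MiB
```

The head count does not enter the calculation: splitting the 256-wide embedding across 8 heads changes how the attention matrices are used, not how large they are.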
Usage
To use Valerie v0.1, follow these steps:
- Clone the llama.cpp library.
git clone https://github.com/ggerganov/llama.cpp
Reference the llama.cpp README.md for more information about building. You can build for CPU only or with OpenBLAS. CUDA, ROCm, Vulkan, and other backends are also available.
Arch Linux Example:
# CPU build using BLAS backend on Arch Linux
sudo pacman -S openblas openblas64
make LLAMA_OPENBLAS=1
- Download the latest model.
wget "https://huggingface.co/teleprint-me/cyberpunk-valerie-v0.1/resolve/main/ggml-valerie-v0.1-256x32-f32-LATEST.gguf?download=true" -O ggml-valerie-v0.1-256x32-f32-LATEST.gguf
This will download the latest available base model.
- Perform inference with the latest model checkpoint using the provided command:
./main -m models/valerie/v0.1/ggml-valerie-v0.1-256x32-f32-LATEST.gguf --color -e -s 1 -c 4096
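As an optional sanity check on the download step, the file header can be inspected before running inference. The sketch below assumes the GGUF version 2 or later header layout (a 4-byte `GGUF` magic, a little-endian uint32 version, then uint64 tensor and metadata counts) and that the file sits in the current directory under the name used in the wget command. The helper name `read_gguf_header` is hypothetical and is not part of llama.cpp.

```python
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"


def read_gguf_header(path):
    """Return the version and counts stored in the first 24 bytes of a GGUF file."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # Little-endian: uint32 version, uint64 tensor count, uint64 metadata key/value count
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:])
    return {"version": version, "tensors": tensor_count, "metadata_kv": kv_count}


# Assumed location: the current directory, as written by the wget command above.
model_path = Path("ggml-valerie-v0.1-256x32-f32-LATEST.gguf")
if model_path.exists():
    print(read_gguf_header(model_path))
else:
    print(f"{model_path} not found; run the download step first")
```

A `ValueError` here usually means the download saved an HTML error page or was truncated, which is cheaper to find out before launching `./main`.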
Benchmarks
Performance metrics for evaluating v0.1 iteration 3600 on CPU, BLAS, and Vulkan backends.
llama-bench
model | size | params | backend | threads | test | t/s |
---|---|---|---|---|---|---|
llama ?B all F32 | 114.53 MiB | 30.02 M | CPU | 8 | pp 512 | 12781.37 ± 2258.61 |
llama ?B all F32 | 114.53 MiB | 30.02 M | CPU | 8 | tg 128 | 410.74 ± 6.13 |
llama ?B all F32 | 114.53 MiB | 30.02 M | BLAS | 8 | pp 512 | 233.53 ± 1.56 |
llama ?B all F32 | 114.53 MiB | 30.02 M | BLAS | 8 | tg 128 | 391.63 ± 14.02 |
llama ?B all F32 | 114.53 MiB | 30.02 M | Vulkan | 99 | pp 512 | 18779.40 ± 111.01 |
llama ?B all F32 | 114.53 MiB | 30.02 M | Vulkan | 99 | tg 128 | 96.25 ± 0.46 |
build: ab0dee5 (2686)
batched-bench - CPU
PP | TG | B | N_KV | T_PP s | S_PP t/s | T_TG s | S_TG t/s | T s | S t/s |
---|---|---|---|---|---|---|---|---|---|
128 | 128 | 1 | 256 | 0.009 | 14365.88 | 0.345 | 370.86 | 0.354 | 723.06 |
128 | 128 | 2 | 512 | 0.022 | 11514.42 | 0.377 | 679.29 | 0.399 | 1282.90 |
128 | 128 | 4 | 1024 | 0.052 | 9811.44 | 0.438 | 1168.69 | 0.490 | 2088.60 |
128 | 128 | 8 | 2048 | 0.093 | 11067.40 | 0.745 | 1373.82 | 0.838 | 2444.24 |
128 | 256 | 1 | 384 | 0.011 | 11861.74 | 0.705 | 363.37 | 0.715 | 536.83 |
128 | 256 | 2 | 768 | 0.022 | 11649.60 | 0.768 | 666.97 | 0.790 | 972.62 |
128 | 256 | 4 | 1536 | 0.050 | 10252.10 | 0.912 | 1122.94 | 0.962 | 1596.95 |
256 | 128 | 1 | 384 | 0.021 | 12028.94 | 0.345 | 370.85 | 0.366 | 1047.94 |
256 | 128 | 2 | 768 | 0.049 | 10351.80 | 0.404 | 633.82 | 0.453 | 1694.02 |
256 | 128 | 4 | 1536 | 0.118 | 8688.72 | 0.484 | 1058.15 | 0.602 | 2552.70 |
256 | 256 | 1 | 512 | 0.022 | 11477.76 | 0.715 | 357.83 | 0.738 | 694.02 |
256 | 256 | 2 | 1024 | 0.050 | 10263.61 | 0.822 | 622.72 | 0.872 | 1174.20 |
256 | 256 | 4 | 2048 | 0.092 | 11089.45 | 0.990 | 1033.97 | 1.083 | 1891.58 |
512 | 128 | 1 | 640 | 0.050 | 10235.70 | 0.372 | 344.35 | 0.422 | 1517.52 |
512 | 128 | 2 | 1280 | 0.093 | 10987.83 | 0.445 | 575.12 | 0.538 | 2377.77 |
512 | 256 | 1 | 768 | 0.050 | 10208.56 | 0.783 | 326.97 | 0.833 | 921.85 |
512 | 256 | 2 | 1536 | 0.091 | 11216.51 | 0.925 | 553.26 | 1.017 | 1510.73 |
main: n_kv_max = 2048, n_batch = 2048, n_ubatch = 512, is_pp_shared = 0, n_gpu_layers = 999, n_threads = 8, n_threads_batch = 8
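The throughput columns in the batched-bench table can be re-derived from the timing columns, which helps when reading the rows. The sketch below is one reading of the table, assuming `is_pp_shared = 0` (each of the B sequences processes its own prompt), as reported in the settings line above. The function name is hypothetical, and because the printed timings are rounded to three decimals the recomputed speeds only match the table to within about one percent.

```python
def batched_bench_row(pp: int, tg: int, b: int, t_pp: float, t_tg: float) -> dict:
    """Recompute the derived columns of one batched-bench row from its inputs."""
    n_kv = b * (pp + tg)  # total KV cache entries when prompts are not shared
    return {
        "N_KV": n_kv,
        "S_PP": b * pp / t_pp,      # prompt tokens per second across all sequences
        "S_TG": b * tg / t_tg,      # generated tokens per second across all sequences
        "S": n_kv / (t_pp + t_tg),  # overall tokens per second
    }


# Row from the table above: PP=128, TG=128, B=8, T_PP=0.093 s, T_TG=0.745 s
row = batched_bench_row(pp=128, tg=128, b=8, t_pp=0.093, t_tg=0.745)
for name, value in row.items():
    print(f"{name}: {value:.2f}")
```

Read this way, the table shows generation throughput (S_TG) growing with batch size B while per-sequence speed drops, which is the usual trade-off in batched decoding.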
Citations
When using Valerie v0.1 in your research, please remember to cite the following:
- aberrio. (2024). Valerie v0.1: A custom language model for female V's dialog from Cyberpunk 2077. https://huggingface.co/teleprint-me/cyberpunk-valerie-v0.1
- GGML team. (2023). llama.cpp version 532c173. Georgi Gerganov Machine Learning Library. https://github.com/ggerganov/llama.cpp
- MistralAI (2023). Extracted sentencepiece model vocabulary: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2
- julieisdead (2021). Voice Over Subtitle Map: Files that contain the IDs and content for Voice Over files. https://www.nexusmods.com/cyberpunk2077/mods/2045
- CD Projekt RED (2020). Cyberpunk 2077: GTA is a close second. https://cyberpunk.net
Contributors
Austin (teleprint-me) - Created and trained Valerie v0.1 using llama.cpp and the referenced dataset.
Community
Join the community of fellow language model enthusiasts and researchers by sharing your knowledge, asking questions, and collaborating on projects related to creating custom models using llama.cpp.
License
Valerie v0.1 is released under the CC-BY-NC-SA-3.0 license. You are free to use, modify, and redistribute this model for non-commercial purposes, but you must provide attribution to the original authors and release any derived works under the same license.