obahamonde/chaski-7B

Commit bb57b44 · verified · 1 Parent(s): fb54277
Committed by obahamonde
README.md CHANGED

```diff
@@ -8,23 +8,32 @@ model-index:
   results: []
 ---
 
-# Chasky-7B
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
 
-![Chasky](https://aws-call-4-speakers.s3.us-east-1.amazonaws.com/chasky.png)
+# chaski-7B
+
+This model is a fine-tuned version of [cognitivecomputations/dolphin-2.1-mistral-7b](https://huggingface.co/cognitivecomputations/dolphin-2.1-mistral-7b) on an unknown dataset.
 
 ## Model description
 
-A chasqui (also spelled chaski) was a messenger of the Inca empire.
-Agile, highly trained and physically fit, they were in charge of carrying messages in the form of quipus or oral information and small packets.
-Along the Inca road system there were relay stations called chaskiwasi (house of chasqui), placed at about 2.5 kilometres (1.6 mi) from each other,
-where the chasqui switched, exchanging their message(s) with the fresh messenger.
-The chasqui system could be able to deliver a message or a gift along a distance of up to 300 kilometres (190 mi) per day
+More information needed
+
+## Intended uses & limitations
+
+More information needed
+
+## Training and evaluation data
+
+More information needed
+
+## Training procedure
 
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
-- train_batch_size: 4
+- train_batch_size: 1
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
@@ -32,6 +41,10 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_ratio: 0.03
 - num_epochs: 5
 
+### Training results
+
+
+
 ### Framework versions
 
 - Transformers 4.36.2
```
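The hyperparameters listed in the updated card map directly onto `transformers.TrainingArguments`. A minimal sketch of that mapping follows; `output_dir` is a placeholder, and the scheduler type is omitted because the hunk does not show it:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters in the updated model card. output_dir is a
# placeholder, since the actual training script is not part of this commit.
training_args = TrainingArguments(
    output_dir="chaski-7B-out",      # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=1,   # changed from 4 in this commit
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                  # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    warmup_ratio=0.03,
    num_train_epochs=5,
)
```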
adapter_config.json CHANGED

```diff
@@ -19,13 +19,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "k_proj",
     "down_proj",
     "gate_proj",
-    "up_proj",
+    "o_proj",
     "v_proj",
+    "k_proj",
     "q_proj",
-    "o_proj"
+    "up_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_rslora": false
```
runs/Mar21_16-32-14_lab/events.out.tfevents.1711038735.lab.153745.0 ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:77e986ef95d0303fedbc70dadedc8a47a17a7025e721720a442af5cbabbe7ab8
+size 5576
```
tokenizer.json CHANGED

```diff
@@ -1,6 +1,11 @@
 {
   "version": "1.0",
-  "truncation": null,
+  "truncation": {
+    "direction": "Right",
+    "max_length": 512,
+    "strategy": "LongestFirst",
+    "stride": 0
+  },
   "padding": {
     "strategy": "BatchLongest",
     "direction": "Right",
```
training_args.bin CHANGED

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ad6ea7e984b94d60cd1ce82807509a4605d551c369dae971949ade7690f31b5b
-size 4664
+oid sha256:b589a293338e419f8ac50fb3d91d904e3f21101acea7f7150aa9a9082819050d
+size 4728
```
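Since the repo carries a PEFT adapter config over dolphin-2.1-mistral-7b rather than merged weights, inference would plausibly load the base model first and then attach the adapter. A sketch, assuming the adapter weights are published under `obahamonde/chaski-7B` as in the commit header:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "cognitivecomputations/dolphin-2.1-mistral-7b"
adapter_id = "obahamonde/chaski-7B"  # repo from the commit header

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the LoRA adapter

inputs = tokenizer("Hello, chaski!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```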