---
library_name: peft
tags:
- tiiuae-falcon-180B
- code
- instruct
- databricks-dolly-15k
- falcon-180B
datasets:
- databricks/databricks-dolly-15k
base_model: tiiuae/falcon-180B
license: apache-2.0
---

### Finetuning Overview:

**Model Used:** tiiuae/falcon-180B  
**Dataset:** databricks/databricks-dolly-15k  

#### Dataset Insights:

The databricks-dolly-15k dataset is a collection of over 15,000 records created collectively by Databricks employees. It is designed to:

- Enable large language models to exhibit ChatGPT-like interactivity.
- Provide prompt/response pairs across eight instruction categories: the seven categories from the InstructGPT paper plus an open-ended free-form category.
- Ensure authenticity: contributors were restricted from sourcing content online (except Wikipedia, for some categories) and from using generative AI to craft prompts or responses.

During the dataset's creation, contributors answered questions posed by fellow contributors, rephrasing the original questions rather than copying them and focusing on accurate, helpful responses. Some subsets also include Wikipedia passages as reference context, which may contain bracketed citation numbers such as [42].
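For reference, the dataset can be pulled directly from the Hugging Face Hub. The snippet below is a minimal example using the `datasets` library; the field names (`instruction`, `context`, `response`, `category`) are those published with the dataset.

```python
from datasets import load_dataset

# Load databricks-dolly-15k from the Hugging Face Hub.
dolly = load_dataset("databricks/databricks-dolly-15k", split="train")

print(len(dolly))        # 15,000+ records
print(dolly[0].keys())   # dict_keys(['instruction', 'context', 'response', 'category'])
```
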
#### Finetuning Details:

Finetuning was carried out with [MonsterAPI](https://monsterapi.ai)'s no-code [LLM finetuner](https://docs.monsterapi.ai/fine-tune-a-large-language-model-llm):

- **Duration:** The session spanned 41.7 hours.
- **Cost:** The entire process cost `$184.314`.
- **Hardware Utilized:** 2x A100 80GB GPUs.

#### Hyperparameters & Additional Details:

- **Model Path:** tiiuae/falcon-180B
- **Learning Rate:** 0.0002
- **Epochs:** 1
- **Data Split:** Training 90% / Validation 10%
- **Gradient Accumulation Steps:** 1

---

### Prompt Used: 

```
### INSTRUCTION:
[instruction]

[context]

### RESPONSE:
[response]
```
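
The bracketed fields are placeholders filled from each dataset record. The helper below is a hypothetical sketch (not part of the finetuner) showing how a databricks-dolly-15k record could be rendered into this template; records without a context simply omit that block.

```python
def format_prompt(example: dict, include_response: bool = True) -> str:
    """Render a databricks-dolly-15k record into the prompt template above."""
    context = example.get("context", "").strip()
    prompt = f"### INSTRUCTION:\n{example['instruction']}\n\n"
    if context:
        prompt += f"{context}\n\n"
    prompt += "### RESPONSE:\n"
    if include_response:
        prompt += example["response"]
    return prompt
```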

#### Loss Metrics:

Training loss:

![training loss](train-loss.png "Training loss")


