medmekk (HF staff) committed on
Commit 05f37de · verified · 1 Parent(s): d159216

Update README.md

Files changed (1):
  1. README.md +5 -28

README.md CHANGED
@@ -15,27 +15,13 @@ For a deeper dive into the methods and results, check out our [blog post](https:
 
 ## Model Details
 
-### Model Description
 
-<!-- Provide a longer summary of what this model is. -->
-
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-
-### Model Sources [optional]
+### Model Sources
 
 <!-- Provide the basic links for the model. -->
 
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
+- **Repository:** [Model](https://huggingface.co/HF1BitLLM/Llama3-8B-1.58-100B-tokens)
+- **Paper:** [The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits](https://arxiv.org/abs/2402.17764)
 
 ## Uses
 
@@ -81,20 +67,11 @@ Use the code below to get started with the model.
 
 ### Training Data
 
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+The model was trained on a subset of [FineWeb-edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu)
 
 [More Information Needed]
 
-### Training Procedure
-
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
-#### Preprocessing [optional]
-
-[More Information Needed]
-
-
-#### Training Hyperparameters
+### Training Hyperparameters
 
 - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 
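The paper linked in the new Model Sources section trains weights constrained to {-1, 0, 1}. As background for the change, here is a minimal sketch of the absmean ternarization that paper describes (an illustrative numpy snippet; `ternarize` is a hypothetical helper, not code from this repository or from the model's actual quantization pipeline):

```python
import numpy as np

def ternarize(w, eps=1e-8):
    # Absmean scaling: gamma is the mean absolute value of the weight matrix
    gamma = np.abs(w).mean()
    # Scale by gamma, round to the nearest integer, clip into {-1, 0, 1}
    q = np.clip(np.round(w / (gamma + eps)), -1, 1)
    return q, gamma

w = np.array([[0.4, -1.2, 0.05],
              [0.9, -0.3, 1.5]])
q, gamma = ternarize(w)
# q contains only -1, 0, or 1; gamma * q approximates w
```

Storing only the ternary `q` plus one scale per matrix is what yields the ~1.58 bits per weight (log2 of 3 states) referenced in the model name.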