avemio-digital
committed on
Update README.md
README.md CHANGED
@@ -106,7 +106,7 @@ The summarization task teaches models to distill complex information into clear,
 ### Architecture
 
 
-| Parameter | GRAG-
+| Parameter | GRAG-PHI-CPT |
 |-----------------------|-----------------------------------------------------------------------------------------------|
 | **d_model** | 4096 |
 | **num heads** | 32 |
@@ -124,7 +124,7 @@ The summarization task teaches models to distill complex information into clear,
 ### Hyperparameters
 
 
-| Parameter | GRAG-
+| Parameter | GRAG-PHI-CPT |
 |---------------------------|--------------------|
 | **warmup steps** | 50 |
 | **peak LR** | 5.0E-07 |
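
For readers who want to sanity-check the values in the two tables touched by this diff, here is a minimal Python sketch. It only reuses the four numbers visible in the diff (`d_model`, `num heads`, `warmup steps`, `peak LR`). Everything else is an assumption for illustration: the function names `head_dim` and `warmup_lr` are hypothetical, the even split of `d_model` across heads is the usual multi-head attention convention rather than something the README states, and the linear warmup shape is assumed, since the diff gives only the step count and the peak value. No decay after warmup is modeled because the visible lines do not describe one.

```python
# Values copied from the Architecture and Hyperparameters tables in the diff.
D_MODEL = 4096
NUM_HEADS = 32
WARMUP_STEPS = 50
PEAK_LR = 5.0e-07


def head_dim(d_model: int, num_heads: int) -> int:
    """Per-head width, ASSUMING d_model is split evenly across heads."""
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    return d_model // num_heads


def warmup_lr(step: int,
              warmup_steps: int = WARMUP_STEPS,
              peak_lr: float = PEAK_LR) -> float:
    """Learning rate at `step`, ASSUMING a linear ramp from 0 to peak_lr.

    After warmup the peak value is held constant; the README lines shown
    in the diff do not say what schedule follows, so none is modeled.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    if step >= warmup_steps:
        return peak_lr
    return peak_lr * step / warmup_steps


if __name__ == "__main__":
    print("head dim:", head_dim(D_MODEL, NUM_HEADS))
    for s in (0, 10, 25, 50, 100):
        print(f"step {s:>3}: lr = {warmup_lr(s):.2e}")
```

Under these assumptions the per-head width is 4096 / 32 = 128, and the learning rate reaches 5.0E-07 at step 50.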