amanrangapur committed
Update README.md
README.md CHANGED
@@ -107,7 +107,7 @@ For more documentation, see the [GitHub readme](https://github.com/allenai/OLMo?
 <!-- TODO -->
 ## Evaluation
 
-Core model results for OLMo2 7B models are found below
+Core model results for OLMo2 7B models are found below:
 
 | Task | Llama-7b | Llama2-7b | Falcon-7b | Mpt-7b | OLMo-7B | Llama2-13b | OLMo 7B April 2024 | **OLMo2 7B** |
 |-------------------|----------|-----------|-----------|--------|---------|------------|--------------------|-----------------------|
@@ -157,9 +157,9 @@ In contrast to OLMo 1.0, we trained OLMo 7B July with a two-stage curriculum:
 Both stages contribute equally to the final performance of the OLMo model. After the first stage, OLMo 1.7 already outperforms OLMo 1.0. The second stage consistently adds 2 to 3 points of performance on top.
 
 
-### Architecture
+<!-- ### Architecture
 
-
+OLMo2 7B architecture with peer models for comparison.
 
 | | **OLMo2 7B** | [OLMo2 13B](https://huggingface.co/allenai/OLMo2-13B-1124) | [Llama 2 7B](https://huggingface.co/meta-llama/Llama-2-7b) | [OpenLM 7B](https://laion.ai/blog/open-lm/) | [Falcon 7B](https://huggingface.co/tiiuae/falcon-7b) | PaLM 8B |
 |------------------------|-------------------|-------------------|---------------------|--------------------|--------------------|------------------|
@@ -203,7 +203,7 @@ Optimizer settings comparison with peer models.
 | gradient clipping | global 1.0 | global 1.0 | global 1.0 | global 1.0 | global 1.0 |
 | gradient reduce dtype | FP32 | FP32 | FP32 | FP32 | BF16 |
 | optimizer state dtype | FP32 | FP32 | most likely FP32 | FP32 | FP32 |
-
+-->
 
 
 ## Bias, Risks, and Limitations
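The optimizer rows in the table above (global gradient clipping at 1.0 and FP32 optimizer state) correspond to a fairly standard PyTorch training step. The sketch below is illustrative only, assuming a generic AdamW setup; it is not the OLMo training code, and the placeholder module, learning rate, and loss are assumptions. Only the clipping norm and the FP32 defaults reflect the table.

```python
# Minimal sketch, not the OLMo training code: it only illustrates the optimizer
# settings listed in the table above (global gradient clipping at 1.0 and FP32
# optimizer state). The module, learning rate, and loss below are placeholders.
import torch
from torch import nn

model = nn.Linear(4096, 4096)  # stand-in module; the real model is a transformer
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)  # lr is illustrative

def training_step(x, y):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    # "gradient clipping: global 1.0" -- clip the global norm over all parameters
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    # With FP32 parameters, AdamW keeps its moment buffers ("optimizer state dtype") in FP32
    optimizer.step()
    return loss.item()
```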