zijianhu committed
Commit d13ba2d · verified · 1 parent: 3c2bd69

Update README.md


Add technical report link

Files changed (1):
1. README.md (+2 −2)
README.md CHANGED
@@ -113,8 +113,8 @@ by [TensorOpera AI](https://tensoropera.ai/). The model was trained with a 3-sta
  tokens of text and code data in 8K sequence length. Fox-1 uses Grouped Query Attention (GQA) with 4 key-value heads and
  16 attention heads for faster inference.

- For the full details of this model please read
- our [release blog post](https://blog.tensoropera.ai/tensoropera-unveils-fox-foundation-model-a-pioneering-open-source-slm-leading-the-way-against-tech-giants).
+ For the full details of this model please read [Fox-1 technical report](https://arxiv.org/abs/2411.05281)
+ and [release blog post](https://blog.tensoropera.ai/tensoropera-unveils-fox-foundation-model-a-pioneering-open-source-slm-leading-the-way-against-tech-giants).

  ## Benchmarks
120