---
license: apache-2.0
datasets:
- PleIAs/common_corpus
language:
- en
- fr
- es
- de
- it
- la
- nl
- pl
---

**Pleias-360m-Preview** is an early preview of a 360 million parameter base model trained by Pleias on Common Corpus.

Like all base and specialized models from Pleias, Pleias-360m-Preview has only been trained on open data that is either out of copyright (public domain) or under a permissive license.

## Description

Pleias-360m-Preview is a transformer base model, pretrained entirely from scratch, using an architecture similar to Llama/GPT-NeoX for easier deployment and inference.

It includes the following features, which apply to any responsibly trained variant:

* Trained exclusively on open data under a permissive license, in compliance with the European AI Act. By design, all Pleias models are unable to output copyrighted content.
* Extensive multilingual support for the main European languages.
* A new tokenizer designed for enhanced document processing tasks and better multilingual support.
* An extremely low level of toxicity and problematic content.

Pleias-360m-Preview has demonstrated unusual multilingual generation abilities for its size range. Fully supported languages include English, French, Spanish, German, Italian, Dutch, Latin and Portuguese.

Given its size, Pleias-360m-Preview can run on CPU without lossy compression. We provide a first GGUF variant as part of our release.

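To illustrate why a model of this size fits comfortably in CPU memory, here is a back-of-the-envelope footprint calculation (weights only; activations and runtime overhead are ignored, and the parameter count is taken from the model name):

```python
# Rough weights-only memory footprint for a 360M-parameter model
# at common precisions. Runtime overhead is not included.
PARAMS = 360_000_000


def footprint_gib(bytes_per_param: int) -> float:
    """Weights-only size in GiB for a given parameter precision."""
    return PARAMS * bytes_per_param / 1024**3


fp32 = footprint_gib(4)  # full precision
fp16 = footprint_gib(2)  # half precision
print(f"fp32: {fp32:.2f} GiB, fp16: {fp16:.2f} GiB")
# prints: fp32: 1.34 GiB, fp16: 0.67 GiB
```

Even in full fp32 precision the weights stay well under 2 GiB, which is why the model runs on an ordinary CPU without any quantization.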
## Recommended use

As a base model, Pleias-360m-Preview can only run continuation prompts: it completes a given text rather than following instructions.

Text generation currently supports a range of creative writing tasks in multiple European languages. For more consistent results, we recommend a low or zero temperature together with a slight repetition penalty (1.1-1.2).

Pleias-360m-Preview has been successfully adapted through continuous pretraining and full fine-tuning for document processing tasks such as RAG, translation, or OCR correction. Given the small size of the model, we do not recommend fine-tuning methods based on LoRA.

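As a sketch, the sampling settings recommended above can be applied with the `transformers` library. The checkpoint id `PleIAs/Pleias-360m-Preview` is assumed from this repository's name:

```python
# Continuation prompt with the recommended settings: zero temperature
# (greedy decoding) plus a slight repetition penalty in the 1.1-1.2 range.
# The model id below is an assumption based on this repository's name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PleIAs/Pleias-360m-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Saturn is fallen, am I too to fall?"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=False,          # zero temperature: deterministic greedy decoding
    repetition_penalty=1.15,  # slight penalty, within the recommended range
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because this is a base model, the prompt is continued verbatim; there is no chat template or instruction format to apply.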
## Examples

With only 360 million parameters, Pleias-360m-Preview is able to generate literary texts in multiple languages:

English poetry:

<blockquote>
<b>Saturn is fallen, am I too to fall?<br/>
Am I to leave this haven of my rest,<br/>
This cradle of my glory, this soft clime,<br/>
This calm luxuriance of blissful light,<br/></b>
And the sweetness that was not in me?<br/>
I have been a slave and an enemy.<br/>
The world has made me its prey ; it's wrong;<br/>
It's cruelty! It's madness! — but you know<br/>
That all things are good for one man or another :<br/>
You see how he can be happy without them:<br/>
He cannot be miserable unless they come<br/>
To his heart with their own poisonous breaths.<br/>
But if there were no other way left us,<br/>
We should never find our happiness again.<br/>
If we could only live on earth alone,<br/>
With nothing else about us than ourselves,<br/>
Our very being would become more real.<br/>
There must be some other place where we may<br/>
Live happily as before, and enjoy life<br/>
As well as now. But what shall we do here?<br/>
What will happen when we die? What will happen then?<br/>
When death comes, let him who knows the end<br/>
</blockquote>

A French classical play in the 17th-century writing style (including the long s):

<blockquote>
<b>Helas! pour mon malheur ie l’ay trop écouté.<br/>
Ie n’ay point du ſilence affecté le myſtère.<br/>
Ie croyois ſans péril pouuoir eſtre ſincere.<br/>
Et ſans armer mes yeux d’vn moment de rigueur,<br/></b>
Je ne voy pas que ce foit vn crime à la nature:<br/>
Mais je vous prie, qu'on me donne quelque recompenſe :<br/>
Pourquoy donc auec tant de gloire &c en vne heure?<br/>
Cela fait voir bien clairement que ma deſpenſe<br/>
Eſt plus grande que celle des autres hommes ;<br/>
Car il eſt certain que les Princes font moins heureux;<br/>
Que ceux qui ont eu leur vie par leurs mains confacrée.<br/>
Il y a deux fortes de perſonnes dans le monde :<br/>
Les uns fe nomment gens de lettres ou de proueſleurz,<br/>
Dont on dit que c'eſt eux-mesme qui nous donnent tousjours<br/>
Leur efprit & leur cœur aux affaires publiques ;<br/>
Ainſi ils s'appellent gens de cour, comme ſi cela<br/>
N'eſtoit autre choſe que ces grands Seigneurs qui gouuernenc<br/>
L'Etat,&qui ſe conſeruent toujours avec honneur.<br/>
Ils ſont fort peu de choſes ſans ambition ;<br/>
</blockquote>

## Training

Pleias-360m-Preview was fully pretrained at Jean Zay on 64 H100s for 46 hours with Nanotron, the pretraining library from Hugging Face. We provide the complete settings as a YAML file as part of our release.

The training schedule comprises 518,000 steps (batch size 1,024) on a filtered and enhanced version of Common Corpus (1,086,324,736,000 tokens).

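The quoted token count can be cross-checked against the schedule. Assuming a sequence length of 2,048 tokens (an assumption; it is not stated in this card), the numbers reproduce the total exactly:

```python
# Sanity check on the training-schedule arithmetic.
steps = 518_000
batch_size = 1_024
seq_len = 2_048  # assumed sequence length; not stated in this card

tokens = steps * batch_size * seq_len
print(f"{tokens:,}")  # prints: 1,086,324,736,000
```

The product matches the quoted 1,086,324,736,000 tokens, i.e. roughly 1.09 trillion tokens seen during pretraining.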
## Update

Pleias-360m-Preview is currently released as an early preview.

The model will undergo several more rounds of post-training to enhance its reasoning capacities and fine-tunability, as well as in anticipation of a generalist instruct version.