---
license: apache-2.0
datasets:
- PleIAs/common_corpus
language:
- en
- fr
- es
- de
- it
- la
- nl
- pl
---
**Pleias-350m-Preview** is an early preview of a 350 million parameter base model trained by Pleias on Common Corpus.
Like all the base and specialized models from Pleias, Pleias-350m-Preview has only been trained on open data that is either out of copyright (public domain) or under a permissive license.
## Description
Pleias-350m-Preview is a transformer base model, entirely pretrained from scratch, using an architecture similar to Llama/GPT-NeoX for easier deployment and inference.
It includes the following features, which apply to any responsibly trained variant:
* Only trained on open data under a permissive license and in compliance with the European AI Act. By design, all Pleias models are unable to output copyrighted content.
* Extensive multilingual support for the main European languages.
* A new tokenizer designed for enhanced document processing tasks and better multilingual support.
* Extremely low level of toxicity and problematic content.
Pleias-350m-Preview has demonstrated unusually strong multilingual generation for its size range. Fully supported languages include English, French, Spanish, German, Italian, Dutch, Latin, and Polish.
Given its size, Pleias-350m-Preview can run on CPU without any lossy compression. We provide a first GGUF variant as part of our release.
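A minimal loading sketch with Hugging Face transformers is shown below. The repository id `PleIAs/Pleias-350m-Preview` is assumed from this model card; the snippet runs entirely on CPU.

```python
# Minimal sketch, assuming the repository id below matches this model card.
# At ~350M parameters the model fits comfortably in CPU memory.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PleIAs/Pleias-350m-Preview"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```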
## Recommended use
As a base model, Pleias-350m-Preview only supports continuation prompts; it is not instruction-tuned.
Text generation currently supports a range of creative writing tasks in multiple European languages. For more consistent results we recommend using a low or zero temperature with a slight repetition penalty (1.1-1.2), as in the sketch below.
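As a sketch of these settings, building on the model and tokenizer loaded above (the prompt is illustrative, taken from the English example further down):

```python
# Continuation prompt with the settings recommended above: greedy decoding
# (the "zero temperature" case) plus a slight repetition penalty.
prompt = "Saturn is fallen, am I too to fall?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=False,          # greedy decoding, i.e. temperature effectively zero
    repetition_penalty=1.15,  # within the recommended 1.1-1.2 range
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```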
Pleias-350m-Preview has been successfully adapted for continuous pretraining and full fine-tuning on document processing tasks such as RAG, translation, or OCR correction. Given the small size of the model, we do not recommend fine-tuning methods based on LoRA.
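One possible shape of such a full fine-tuning run, sketched with the transformers `Trainer`; the dataset and hyperparameters here are placeholders, not the settings used by Pleias:

```python
# Full fine-tuning sketch (no LoRA, per the recommendation above).
# `train_dataset` is a hypothetical pre-tokenized causal-LM dataset with an
# `input_ids` column; hyperparameters are illustrative only.
from transformers import DataCollatorForLanguageModeling, Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="pleias-350m-finetuned",
    per_device_train_batch_size=8,
    learning_rate=2e-5,
    num_train_epochs=1,
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # hypothetical dataset, defined elsewhere
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```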
## Examples
With only 350 million parameters, Pleias-350m-Preview is able to generate literary texts in multiple languages:
English poetry:
<blockquote>
<b>Saturn is fallen, am I too to fall?<br/>
Am I to leave this haven of my rest,<br/>
This cradle of my glory, this soft clime,<br/>
This calm luxuriance of blissful light,<br/></b>
And the sweetness that was not in me?<br/>
I have been a slave and an enemy.<br/>
The world has made me its prey ; it's wrong;<br/>
It's cruelty! It's madness! — but you know<br/>
That all things are good for one man or another :<br/>
You see how he can be happy without them:<br/>
He cannot be miserable unless they come<br/>
To his heart with their own poisonous breaths.<br/>
But if there were no other way left us,<br/>
We should never find our happiness again.<br/>
If we could only live on earth alone,<br/>
With nothing else about us than ourselves,<br/>
Our very being would become more real.<br/>
There must be some other place where we may<br/>
Live happily as before, and enjoy life<br/>
As well as now. But what shall we do here?<br/>
What will happen when we die? What will hap-<br/>
[pen then?<br/>
When death comes, let him who knows the end<br/>
</blockquote>
A French classical play in the 17th-century writing style (including the long s, ſ):
<blockquote>
<b>Helas! pour mon malheur ie l’ay trop écouté.<br/>
Ie n’ay point du ſilence affecté le myſtère.<br/>
Ie croyois ſans péril pouuoir eſtre ſincere.<br/>
Et ſans armer mes yeux d’vn moment de rigueur,<br/></b>
Je ne voy pas que ce foit vn crime à la nature:<br/>
Mais je vous prie, qu'on me donne quelque recompenſe :<br/>
Pourquoy donc auec tant de gloire &c en vne heure?<br/>
Cela fait voir bien clairement que ma deſpenſe<br/>
Eſt plus grande que celle des autres hommes ;<br/>
Car il eſt certain que les Princes font moins heureux;<br/>
Que ceux qui ont eu leur vie par leurs mains confacrée.<br/>
Il y a deux fortes de perſonnes dans le monde :<br/>
Les uns fe nomment gens de lettres ou de proueſleurz,<br/>
Dont on dit que c'eſt eux-mesme qui nous donnent tousjours<br/>
Leur efprit & leur cœur aux affaires publiques ;<br/>
Ainſi ils s'appellent gens de cour, comme ſi cela<br/>
N'eſtoit autre choſe que ces grands Seigneurs qui gouuernenc<br/>
L'Etat,&qui ſe conſeruent toujours avec honneur.<br/>
Ils ſont fort peu de choſes ſans ambition ;<br/>
</blockquote>
## Training
Pleias-350m-Preview was fully pretrained at Jean Zay on 64 H100 GPUs for 46 hours with Nanotron, the pretraining library from Hugging Face. We provide the complete settings as a YAML file as part of our release.
The training schedule includes 518,000 steps (batch size 1,024) on a filtered and enhanced version of Common Corpus (1,086,324,736,000 tokens).
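For reference, these figures are consistent with a sequence length of 2,048 tokens, which is not stated explicitly here but follows from the arithmetic: 518,000 steps × 1,024 sequences × 2,048 tokens = 1,086,324,736,000 tokens.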
## Update
Pleias-350m-Preview is currently released as an early preview.
The model will undergo several more rounds of post-training to enhance its reasoning capacities and fine-tunability, in anticipation of a generalist instruct version.