---
inference: false
license: openrail
language:
- it
datasets:
- teelinsan/camoscio
---
# ExtremITA Camoscio: 7 billion parameters
This is the base model trained on Italian instructions, a sibling of Alpaca.
It is built from the [teelinsan/camoscio-7b-llama](https://huggingface.co/teelinsan/camoscio-7b-llama) adapters and the original LLaMA model, and it adds nothing new to [teelinsan/camoscio-7b-llama](https://huggingface.co/teelinsan/camoscio-7b-llama). Our version merges the adapters into the base model to obtain a more stable model that can be fine-tuned further, which we did for the [EVALITA 2023](https://www.evalita.it/campaigns/evalita-2023/) challenge.
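To make "merging the adapters" concrete, here is a toy sketch, assuming the adapters are LoRA-style low-rank updates (the usual setup for Alpaca-like models, but an assumption here). It shows that folding the scaled product `B @ A` into the base weight gives the same output as keeping the adapter as a separate path. The matrices, `alpha`, `rank`, and the helper names are all made up for illustration; this is not the script used to build this model.

```python
# Toy illustration of merging a LoRA-style adapter into a base weight matrix.
# All sizes and values are hypothetical; plain Python lists stand in for tensors.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[x * s for x in row] for row in a]

def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * B @ A, i.e. the merged weight."""
    return add(weight, scale(matmul(lora_b, lora_a), alpha / rank))

# Base weight W (2x3) and rank-1 adapter factors B (2x1) and A (1x3).
W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]
A = [[0.5, -0.5, 1.0]]
B = [[2.0], [4.0]]
alpha, rank = 2, 1

x = [[1.0], [2.0], [3.0]]  # one input column vector

# Adapter kept separate: base path plus scaled low-rank path.
separate = add(matmul(W, x), scale(matmul(B, matmul(A, x)), alpha / rank))

# Adapter merged: a single ordinary matrix, no extra path at inference time.
W_merged = merge_lora(W, A, B, alpha, rank)
merged = matmul(W_merged, x)

print(merged == separate)
```

After merging, the checkpoint is an ordinary set of weights, which is why it can be loaded directly and fine-tuned again. For real models, the PEFT library offers `merge_and_unload()` on a loaded adapter model for this step.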
# Usage
Check out the GitHub repository for more details and code: https://github.com/crux82/ExtremITA
```python
from transformers import LlamaTokenizer, LlamaForCausalLM, GenerationConfig
import torch

tokenizer = LlamaTokenizer.from_pretrained("yahma/llama-7b-hf")
# Batched generation needs a pad token and left padding for decoder-only models.
# Using id 0 as padding is an assumption borrowed from common Alpaca-style setups.
tokenizer.pad_token_id = 0
tokenizer.padding_side = "left"

model = LlamaForCausalLM.from_pretrained(
    "sag-uniroma2/extremITA-Camoscio-7b",
    load_in_8bit=True,  # requires the bitsandbytes package
    device_map="auto",
)

# Note: temperature, top_p and top_k only take effect when do_sample=True.
generation_config = GenerationConfig(
    temperature=0.2,
    top_p=0.75,
    top_k=40,
    num_beams=4,
)

prompts = [
    # "Summarize the story of Pinocchio"
    "Riassumi la storia di Pinocchio",
    # The FizzBuzz task, stated in Italian
    "Scrivi un programma che stampa i numeri da 1 a 100. Ma per i multipli "
    "di tre stampa 'Fizz' al posto del numero e per i multipli di cinque "
    "stampa 'Buzz'. Per i numeri che sono multipli sia di tre che di cinque "
    "stampa 'FizzBuzz'.",
]

inputs = tokenizer(
    prompts, return_tensors="pt", padding=True, truncation=True
).to(model.device)

with torch.no_grad():
    gen_outputs = model.generate(
        **inputs,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
    )

# With return_dict_in_generate=True, the generated token ids are in `.sequences`.
for sequence in gen_outputs.sequences:
    print(tokenizer.decode(sequence, skip_special_tokens=True))
```
# Citation
```
@inproceedings{hromei2023extremita,
  author    = {Claudiu Daniel Hromei and
               Danilo Croce and
               Valerio Basile and
               Roberto Basili},
  title     = {ExtremITA at EVALITA 2023: Multi-Task Sustainable Scaling to Large Language Models at its Extreme},
  booktitle = {Proceedings of the Eighth Evaluation Campaign of Natural Language
               Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023)},
  publisher = {CEUR.org},
  year      = {2023},
  month     = {September},
  address   = {Parma, Italy}
}
```