Epigr_2_Llama-3.1-8B-Instruct_text

This is a fine-tuned version of Llama-3.1-8B-Instruct specialized in reconstructing spans of 1–20 missing characters in ancient Greek inscriptions. On spans of 1–10 missing characters, it achieved a character error rate (CER) of 20.5%, a top-1 accuracy of 63.7%, and a top-20 accuracy of 83.0% on a test set of 7,811 unseen editions of inscriptions. See https://arxiv.org/abs/2409.13870.
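
The metrics can be illustrated with a small sketch. It assumes that CER is the Levenshtein edit distance between the top suggestion and the true span, divided by the length of the true span, and that top-k accuracy counts an example as correct when the true span appears among the first k suggestions. The linked paper is the authority on the exact definitions; the function names here are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(predictions, references):
    """Total edit distance divided by total reference length."""
    total_edits = sum(levenshtein(p, r) for p, r in zip(predictions, references))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


def top_k_accuracy(suggestion_lists, references, k):
    """Share of examples whose reference is among the first k suggestions."""
    hits = sum(
        ref in suggestions[:k]
        for suggestions, ref in zip(suggestion_lists, references)
    )
    return hits / len(references)


# Toy data: one masked span with two ranked suggestions.
references = ["ησθαι δε"]
suggestion_lists = [["εσθαι δε", "ησθαι δε"]]
top_suggestions = [suggestions[0] for suggestions in suggestion_lists]

print(character_error_rate(top_suggestions, references))  # 0.125 (1 edit / 8 characters)
print(top_k_accuracy(suggestion_lists, references, k=1))  # 0.0
print(top_k_accuracy(suggestion_lists, references, k=2))  # 1.0
```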

Usage

To run the model on a GPU with large memory capacity, follow these steps:

1. Download and load the model

import warnings

import torch
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from transformers import AutoTokenizer, LlamaForCausalLM, pipeline

# Suppress the warning raised while the checkpoint is dispatched.
warnings.filterwarnings("ignore", message=".*copying from a non-meta parameter in the checkpoint.*")

model_id = "Ericu950/Epigr_2_Llama-3.1-8B-Instruct_text"

with init_empty_weights():
    model = LlamaForCausalLM.from_pretrained(model_id)

model = load_checkpoint_and_dispatch(
    model,
    model_id,
    device_map="auto",
    offload_folder="offload",
    offload_state_dict=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)

generation_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device_map="auto",
)

2. Run inference on an inscription of your choice

# this is https://inscriptions.packhum.org/text/359280?bookid=879&location=1678, Cos and Calymna IG XII,4 5:4043 
inscription_edition = "----εκτηι ισταμενου· ευμολποσ μολπου επεστατει· πρυτανεων γνωμη μεσσηνεωσ του διονοσ κατασταθεντοσ υπο --—ου του μυγαλου ερμωνοσ του μυιστρου κατασταθεντοσ υπο --—ρατου του προμαχου μολπου του μολπου λεοντοσ του -—ιππου κατασταθεντοσ υπο αριστοφανου του νουμηνιου του στησιοχου ηρακλειτου του αρτεμιδωρου δημοφωντοσ του πρυτανιοσ δαμωνοσ του ονφαλιωνοσ· επειδη οι δικασται οι αποσταλεντεσ εισ καλυμναν κομιζουσιν ψηφισμα παρα του δημου του καλυμνιων εν ωι γεγραπται οτι ο δημοσ ο καλυμνιων στεφανοι τον δημον χρυσωι στεφανωι αρετησ ενεκεν και ευνοιασ τησ εισ αυτον στεφανοι δε και τουσ δικαστασ τουσ αποσταλεντασ χρυσωι στεφανωι καλοκαγαθιασ ενεκεν κλεανδρον διοδωρου λεοντα ευβουλου κεφαλον δρακοντοσ θεοδωρον νουμηνιου λεοντα δρακοντιδου και περι τουτων οιεται δειν επιμελειαν ποιησασθαι τον δημον οπωσ ο τησ πολεωσ στεφανοσ αναγορευθηι και ο των δικαστων εν τωι θεατρωι διονυσιοισ δεδοχθαι τωι δημωι· τον μεν αγωνοθετην αναγγειλαι τον τησ πολεωσ στεφανον και τον των δικαστων κυκλιων τηι πρωτηι· επηιν[7 missing letters] και τουσ δικαστασ τουσ αποσταλεντασ επειδη αξιοι γενομενοι του δημου τιμασ περιεποιησαν τηι πολει·"
system_prompt = "Fill in the missing letters in this inscription!"
input_messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": inscription_edition},
]
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]
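# Beam search: the 10 highest-scoring beams are returned as suggestions.
outputs = generation_pipeline(
    input_messages,
    max_new_tokens=10,
    num_beams=30, # Set this as high as your memory will allow!
    num_return_sequences=10,
    early_stopping=True,
    eos_token_id=terminators, # Stop at the end-of-turn token
)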
beam_contents = []
for output in outputs:
    generated_text = output.get('generated_text', [])
    for item in generated_text:
        if item.get('role') == 'assistant':
            beam_contents.append(item.get('content'))
real_response = "ησθαι δε"
print(f"The masked sequence: {real_response}")
for i, content in enumerate(beam_contents, start=1):
    print(f"Suggestion {i}: {content}")

Expected Output:

The masked sequence: ησθαι δε
Suggestion 1: ησθαι δε
Suggestion 2: εσθαι δε
Suggestion 3: εισθαι δ
Suggestion 4: εσαι δε ο
Suggestion 5: εκεν δε ο
Suggestion 6: εισθαι ο
Suggestion 7: εσ ο δημος
Suggestion 8: εσεν δε ο
Suggestion 9: εισθαι δε
Suggestion 10: εσ δε και
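
To try the model on other passages, a known span can be hidden with the same `[N missing letters]` marker that appears in the example input. The sketch below is an illustration only: the helper name `mask_span` is hypothetical, and counting letters without spaces is an assumption inferred from the single example above, where the hidden sequence "ησθαι δε" is marked as 7 missing letters.

```python
def mask_span(text: str, start: int, end: int):
    """Replace text[start:end] with a '[N missing letters]' marker.

    Returns the masked text and the hidden sequence.
    """
    hidden = text[start:end]
    # Assumption: only letters are counted, not the spaces between words.
    n_letters = sum(ch.isalpha() for ch in hidden)
    masked = f"{text[:start]}[{n_letters} missing letters]{text[end:]}"
    return masked, hidden


# A fragment of the example inscription with the gap filled in.
fragment = "επηινησθαι δε και τουσ δικαστασ"
masked, hidden = mask_span(fragment, start=5, end=13)

print(masked)  # επηιν[7 missing letters] και τουσ δικαστασ
print(hidden)  # ησθαι δε
```

The masked string can then be passed as `inscription_edition`, and the hidden sequence used as `real_response` for comparison.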

Usage on free tier in Google Colab

If you don’t have access to a larger GPU but want to try the model out, you can run it in a quantized format in Google Colab. The quality of the responses will deteriorate significantly! Follow these steps:

Step 1: Connect to free GPU

  1. Click the drop-down arrow next to Connect near the top right of the notebook.
  2. Select Change runtime type.
  3. In the modal window, select T4 GPU as your hardware accelerator.
  4. Click Save.
  5. Click the Connect button to connect to your runtime. After some time, the button will present a green checkmark, along with RAM and disk usage graphs. This indicates that a server has successfully been created with your required hardware.

Step 2: Install Dependencies

!pip install -U bitsandbytes
# Kill the current process so that Colab restarts the runtime and picks up the new package.
import os
os._exit(0)

Step 3: Download and quantize the model

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline
import torch

model_id = "Ericu950/Epigr_2_Llama-3.1-8B-Instruct_text"

# Load the weights in 4-bit precision and compute in bfloat16.
quant_config = BitsAndBytesConfig(
   load_in_4bit=True,
   bnb_4bit_compute_dtype=torch.bfloat16
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=quant_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
generation_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device_map="auto",
)
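
A rough, weights-only estimate suggests why quantization is needed on a 16 GB T4. The sketch assumes about 8 billion parameters (the "8B" in the model name) and ignores activations, the beam-search cache, and quantization overhead, so real usage will be higher. The function name is hypothetical.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Memory needed for the weights alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 8e9  # assumption: roughly 8 billion parameters

for label, bits in [("bf16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(N_PARAMS, bits):.0f} GB")
# bf16: ~16 GB
# 8-bit: ~8 GB
# 4-bit: ~4 GB
```

Under these assumptions the unquantized weights alone would fill the card, while the 8-bit and 4-bit variants leave room for beam search.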

Step 4: Run inference on an inscription of your choice

inscription_edition = "----εκτηι ισταμενου· ευμολποσ μολπου επεστατει· πρυτανεων γνωμη μεσσηνεωσ του διονοσ κατασταθεντοσ υπο --—ου του μυγαλου ερμωνοσ του μυιστρου κατασταθεντοσ υπο --—ρατου του προμαχου μολπου του μολπου λεοντοσ του -—ιππου κατασταθεντοσ υπο αριστοφανου του νουμηνιου του στησιοχου ηρακλειτου του αρτεμιδωρου δημοφωντοσ του πρυτανιοσ δαμωνοσ του ονφαλιωνοσ· επειδη οι δικασται οι αποσταλεντεσ εισ καλυμναν κομιζουσιν ψηφισμα παρα του δημου του καλυμνιων εν ωι γεγραπται οτι ο δημοσ ο καλυμνιων στεφανοι τον δημον χρυσωι στεφανωι αρετησ ενεκεν και ευνοιασ τησ εισ αυτον στεφανοι δε και τουσ δικαστασ τουσ αποσταλεντασ χρυσωι στεφανωι καλοκαγαθιασ ενεκεν κλεανδρον διοδωρου λεοντα ευβουλου κεφαλον δρακοντοσ θεοδωρον νουμηνιου λεοντα δρακοντιδου και περι τουτων οιεται δειν επιμελειαν ποιησασθαι τον δημον οπωσ ο τησ πολεωσ στεφανοσ αναγορευθηι και ο των δικαστων εν τωι θεατρωι διονυσιοισ δεδοχθαι τωι δημωι· τον μεν αγωνοθετην αναγγειλαι τον τησ πολεωσ στεφανον και τον των δικαστων κυκλιων τηι πρωτηι· επηιν[7 missing letters] και τουσ δικαστασ τουσ αποσταλεντασ επειδη αξιοι γενομενοι του δημου τιμασ περιεποιησαν τηι πολει·"
system_prompt = "Fill in the missing letters in this inscription!"
input_messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": inscription_edition},
]
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]
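# Beam search: the 10 highest-scoring beams are returned as suggestions.
outputs = generation_pipeline(
    input_messages,
    max_new_tokens=10,
    num_beams=25, # Set this as high as your memory will allow!
    num_return_sequences=10,
    early_stopping=True,
    eos_token_id=terminators, # Stop at the end-of-turn token
)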
beam_contents = []
for output in outputs:
    generated_text = output.get('generated_text', [])
    for item in generated_text:
        if item.get('role') == 'assistant':
            beam_contents.append(item.get('content'))
real_response = "ησθαι δε"
print(f"The masked sequence: {real_response}")
for i, content in enumerate(beam_contents, start=1):
    print(f"Suggestion {i}: {content}")

Expected Output:

The masked sequence: ησθαι δε
Suggestion 1: ησαμενοσ·
Suggestion 2: ησμενοσ·
Suggestion 3: ησασθαι·
Suggestion 4: ημενουν 0·
Suggestion 5: ησται δε 0
Suggestion 6: ησθαι δε 
Suggestion 7: ησαμεθα·
Suggestion 8: ημεν δε 00·
Suggestion 9: ησθαι δε·
Suggestion 10: ησατω δε 0

Observe that performance declines! If we change

   load_in_4bit=True,
   bnb_4bit_compute_dtype=torch.bfloat16

in the quantization config (Step 3) to

   load_in_8bit=True,

we get

The masked sequence: ησθαι δε
Suggestion 1: ησθαι δε
Suggestion 2: εσθαι δε
Suggestion 3: εσαι δε ο
Suggestion 4: εισθαι δ
Suggestion 5: εσ ο δημος
Suggestion 6: εσεν δε ο
Suggestion 7: εσ ο δημο
Suggestion 8: εκεν δε ο
Suggestion 9: εσαι δε σ
Suggestion 10: εισθαι ο
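
The difference between the two quantized runs can be summarized by the rank at which the true sequence appears. The sketch below copies the two suggestion lists from the outputs above; the helper `rank_of_truth` is hypothetical, and comparing after stripping surrounding whitespace is an assumption about what should count as a match.

```python
def rank_of_truth(suggestions, truth):
    """Return the 1-based rank of the first suggestion equal to truth, or None."""
    for rank, suggestion in enumerate(suggestions, start=1):
        if suggestion.strip() == truth:
            return rank
    return None


truth = "ησθαι δε"

four_bit = [
    "ησαμενοσ·", "ησμενοσ·", "ησασθαι·", "ημενουν 0·", "ησται δε 0",
    "ησθαι δε ", "ησαμεθα·", "ημεν δε 00·", "ησθαι δε·", "ησατω δε 0",
]
eight_bit = [
    "ησθαι δε", "εσθαι δε", "εσαι δε ο", "εισθαι δ", "εσ ο δημος",
    "εσεν δε ο", "εσ ο δημο", "εκεν δε ο", "εσαι δε σ", "εισθαι ο",
]

print(rank_of_truth(four_bit, truth))   # 6
print(rank_of_truth(eight_bit, truth))  # 1
```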

Information about configuration for merging

The fine-tuned model was re-merged with Llama-3.1-8B-Instruct using the TIES merge method. This did not affect CER or top-1 accuracy, but it improved top-20 accuracy. The following YAML configuration was used:

models:
  - model: original # Llama 3.1
  - model: DDbDP_reconstructer_5 # A model fine-tuned on 95% of the DDbDP for 11 epochs
    parameters:
      density: 0.5
      weight: 1
merge_method: ties
base_model: original # Llama 3.1
parameters:
  normalize: true
dtype: bfloat16
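
The TIES procedure can be illustrated on toy parameter lists. The sketch below is a simplified reading of the method (trim each task vector to its largest-magnitude entries, elect a sign per parameter, average the entries that agree with that sign). It is not the code of the merging tool that consumed the YAML above, and the function names and toy numbers are hypothetical. With a single fine-tuned model, as in the configuration above, this simplified version reduces to adding the trimmed difference back onto the base weights.

```python
def trim(task_vector, density):
    """Keep the largest-magnitude fraction `density` of entries; zero the rest."""
    keep = max(1, round(len(task_vector) * density))
    threshold = sorted((abs(v) for v in task_vector), reverse=True)[keep - 1]
    return [v if abs(v) >= threshold else 0.0 for v in task_vector]


def ties_merge(base, finetuned_models, density=0.5, weights=None):
    """Merge fine-tuned parameter lists into `base` with a simplified TIES."""
    weights = weights or [1.0] * len(finetuned_models)

    # 1. Task vectors (fine-tuned minus base), trimmed and weighted.
    trimmed = []
    for model, w in zip(finetuned_models, weights):
        task_vector = [m - b for m, b in zip(model, base)]
        trimmed.append([w * v for v in trim(task_vector, density)])

    merged = []
    for i, b in enumerate(base):
        column = [t[i] for t in trimmed]
        # 2. Elect the sign with the larger total mass for this parameter.
        sign = 1 if sum(column) >= 0 else -1
        # 3. Average only the non-zero entries that agree with the elected sign
        #    (dividing by the agreeing weights corresponds to normalize: true).
        agreeing = [
            (v, w) for v, w in zip(column, weights)
            if v != 0 and (v > 0) == (sign > 0)
        ]
        if agreeing:
            delta = sum(v for v, _ in agreeing) / sum(w for _, w in agreeing)
        else:
            delta = 0.0
        merged.append(b + delta)
    return merged


base = [0.0, 1.0, -1.0, 2.0]
finetuned = [0.5, 1.1, -2.0, 2.0]

# density=0.5 keeps the two largest changes (+0.5 and -1.0) and drops the small +0.1.
print(ties_merge(base, [finetuned], density=0.5))  # [0.5, 1.0, -2.0, 2.0]
```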