Collective Action Participation Detection Model - Fine-Tuned LLama3

Note: this is the second step of a layered approach, see this model for the first step.

This model detects expressions of levels of participation in collective action from text. First, the binary presence of participation expression should be detected with this model for the first step. Second, for the messages expressing participation, participation levels can be detected. For details on the framework and useful code snippets, see the paper "Extracting Participation in Collective Action from Social Media", Pera and Aiello (2025).

Usage Example

To use the model, follow the example below:

from transformers import (AutoModelForCausalLM, 
                          AutoTokenizer, 
                          BitsAndBytesConfig, 
                          pipeline)

model_dir = "ariannap22/collectiveaction_sft_annotated_only_v6_prompt_v6_p100_synthetic_balanced_more_layered"

# Define the text you want to predict
texts = [
    "We need to stand together for our rights!",
    "I volunteer at the local food bank."
]

# Define levels of participation in collective action¨
dim_def = {'Problem-Solution': "The comment highlights an issue and possibly suggests a way to fix it, often naming those responsible.",
            'Call-to-Action': "The comment asks readers to take part in a specific activity, effort, or movement.",
            'Intention': "The commenter shares their own desire to do something or be involved in solving a particular issue.",
            'Execution': "The commenter is describing their personal experience taking direct actions towards a common goal."}

# Define the prompt
def generate_test_prompt6(data_point):
    return f"""
            You have the following knowledge about levels of participation in collective action that can be expressed in social media comments: {dim_def}. 
            
            ### Definitions and Criteria:
            **Collective Action Problem:** A present issue caused by human actions or decisions that affects a group and can be addressed through individual or collective efforts.

            **Participation in collective action**: A comment must clearly reference a collective action problem, social movement, or activism by meeting at least one of the levels in the list {dim_def.keys()}.

            Classify the following social media comment into one of the levels within the list {list(dim_def.keys())}. 

            ### Example of correct output format:
            text: xyz
            label: None
            
            Return the answer as the corresponding participation in collective action level label.

            text: {data_point}
            label: """.strip()

texts_prompts = [generate_test_prompt6(text) for text in texts]

# Prepare datasets and load model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=False,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16",
)

model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    device_map="auto",
    torch_dtype="float16",
    quantization_config=bnb_config, 
)

model.config.use_cache = False
model.config.pretraining_tp = 1

tokenizer = AutoTokenizer.from_pretrained(model_dir)

tokenizer.pad_token_id = tokenizer.eos_token_id

# Define prediction 
def predict(texts, model, tokenizer):
    y_pred = []
    answers = []
    categories = list(dim_def.keys())

    for i in range(len(texts)):
        prompt = texts[i]
        pipe = pipeline(task="text-generation", 
                        model=model, 
                        tokenizer=tokenizer, 
                        max_new_tokens=20, 
                        temperature=0.1)
        
        result = pipe(prompt)
        answer = result[0]['generated_text'].split("label:")[-1].strip()
        answers.append(answer)
        
        # Determine the predicted category
        for category in categories:
            if category.lower() in answer.lower():
                y_pred.append(category)
                break
        else:
            y_pred.append("error")
    
    return y_pred, answers

y_pred, answer = predict(texts_prompts, model, tokenizer)

# Print results
for text, pred in zip(texts, y_pred):
    print(f"Text: {text}")
    print(f"Predicted Class: {pred}")
    print("---")
Downloads last month
30
Safetensors
Model size
8.03B params
Tensor type
BF16
·
Inference Providers NEW
This model is not currently available via any of the supported third-party Inference Providers, and HF Inference API was unable to determine this model's library.

Model tree for ariannap22/collectiveaction_sft_annotated_only_v6_prompt_v6_p100_synthetic_balanced_more_layered

Finetuned
(781)
this model