metadata
library_name: peft
license: apache-2.0
datasets:
  - venkycs/llm4security
language:
  - en
metrics:
  - accuracy
tags:
  - security
  - infosec
  - cybersec
  - hack

Training procedure

This repository contains security-domain data collected using Stanford Alpaca. We are grateful to the Stanford Alpaca team for their contributions. The dataset is intended to improve understanding of security-related language and to support the development of safer AI applications.

Acknowledgments: Special thanks to the Stanford Alpaca team for providing the tools and resources used to gather the security-domain data.

For more information about Stanford Alpaca, visit their website.

Framework versions

  • PEFT 0.4.0 (thanks to the Hugging Face team)

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 42.99 |
| ARC (25-shot)       | 49.83 |
| HellaSwag (10-shot) | 77.33 |
| MMLU (5-shot)       | 44.41 |
| TruthfulQA (0-shot) | 47.96 |
| Winogrande (5-shot) | 71.74 |
| GSM8K (5-shot)      | 3.87  |
| DROP (3-shot)       | 5.81  |
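
The "Avg." row appears to be the unweighted mean of the seven benchmark scores listed above. The short check below is a sketch under that assumption. The dictionary and the helper name `mean_score` are illustrative and not part of any leaderboard tooling.

```python
# Scores copied from the results table above.
scores = {
    "ARC (25-shot)": 49.83,
    "HellaSwag (10-shot)": 77.33,
    "MMLU (5-shot)": 44.41,
    "TruthfulQA (0-shot)": 47.96,
    "Winogrande (5-shot)": 71.74,
    "GSM8K (5-shot)": 3.87,
    "DROP (3-shot)": 5.81,
}


def mean_score(values):
    """Unweighted arithmetic mean, rounded to two decimals (assumed aggregation)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(f"Avg. {average:.2f}")
```

Under this assumption the computed mean matches the reported 42.99. The very low GSM8K and DROP scores pull the average well below the five other benchmarks.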