Canstralian
committed on
Update README.md
README.md
CHANGED
@@ -1,143 +1,21 @@
----
-license: mit
-datasets:
-- Canstralian/Wordlists
-- Canstralian/CyberExploitDB
-- Canstralian/pentesting_dataset
-- Canstralian/ShellCommands
-language:
-- en
-metrics:
-- accuracy
-- code_eval
-- bertscore
-base_model:
-- replit/replit-code-v1_5-3b
-- WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-8B
-- WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B
-library_name: transformers
-tags:
-- code
-- text-generation-inference
----
-Here's the completed version of the RabbitRedux model card, filled out from the perspective of **Canstralian**:
-
 ---
-
-
-
-
-
-
-
-
-
-
-
-
--
--
--
-
-
-
-
-
-- **Penetration Testing Support:** Assists with reconnaissance, enumeration, and task automation in cybersecurity.
-- **Ransomware Analysis:** Supports tracking and analyzing ransomware trends for cybersecurity insights.
-- **Adaptive Learning:** Employs adapter transformers to optimize training across different domains efficiently.
-
-## Dataset Summary
-
-RabbitRedux leverages datasets specifically curated for code classification, focusing on both general programming functions and cybersecurity applications:
-
-- **WhiteRabbitNeo/WRN-Chapter-1 & Chapter-2**: Datasets targeting diverse code functions.
-- **Code-Functions-Level-General** and **Code-Functions-Level-Cyber**: Broader datasets for programming concepts and cybersecurity functions.
-- **Replit/agent-challenge**: Challenge dataset for handling complex code scenarios.
-- **Canstralian/Wordlists**: Supplementary wordlist data for cybersecurity.
-
-## Model Usage
-
-To use RabbitRedux, initialize and load the adapter with the following code:
-
-```python
-from adapters import AutoAdapterModel
-model = AutoAdapterModel.from_pretrained("replit/replit-code-v1_5-3b")
-model.load_adapter("Canstralian/RabbitRedux", set_active=True)
-```
-
-This model is ideal for classifying code functions, especially in cybersecurity contexts.
-
-## Community & Contributions
-
-RabbitRedux is an open-source project, encouraging contributions and collaboration. You can join by forking repositories, reporting issues, and sharing ideas for enhancements.
-
-- **GitHub:** [Canstralian](https://github.com/canstralian)
-- **Replit:** [Canstralian](https://replit.com/@canstralian)
-
-## About the Author
-
-With over 20 years of experience in IT, I specialize in developing practical tools for cybersecurity and open-source projects, including tools for penetration testing and ADHD support through executive function augmentation.
-
-## Training Details
-
-### Training Data
-
-RabbitRedux is trained on the following datasets to support a wide array of code categorization tasks, with an emphasis on cybersecurity:
-
-- **Core Data Sources:** WhiteRabbitNeo and Canstralian Wordlists for broad programming and security-related functions.
-- **Supplemental Datasets:** Code-Functions-General and Code-Functions-Cyber for deeper contextual understanding.
-
-### Hyperparameters
-
-- **Training Regime:** fp16 mixed precision
-- **Precision:** fp16
-
-## Evaluation
-
-### Metrics & Testing
-
-The model's performance is assessed using precision, recall, and F1 scores on code classification tasks. Further evaluation data is available upon request.
-
-### Results
-
-- **Precision:** 0.95
-- **Recall:** 0.92
-- **F1 Score:** 0.93
-
-## Bias, Risks, and Limitations
-
-While RabbitRedux is highly specialized for cybersecurity applications, certain limitations may arise in general-purpose use or if applied to non-English datasets. Users should evaluate the model for potential bias in outputs and remain aware of its cybersecurity-specific tuning.
-
-### Recommendations
-
-Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model, especially in contexts that are outside its trained domain.
-
-## Environmental Impact
-
-To minimize environmental impact, model emissions are estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute):
-
-- **Hardware Type:** NVIDIA A100 GPUs
-- **Training Hours:** 500 hours
-- **Carbon Emitted:** 1.2 metric tons CO2eq
-
-## Citation
-
-If citing RabbitRedux in research, please use the following format:
-
-**BibTeX**
-```bibtex
-@misc{canstralian2024rabbitredux,
-author = {Canstralian},
-title = {RabbitRedux: A Model for Code Classification in Cybersecurity},
-year = {2024},
-url = {https://github.com/canstralian/RabbitRedux},
-}
-```
-
-**APA**
-Canstralian. (2024). *RabbitRedux: A Model for Code Classification in Cybersecurity*. Retrieved from https://github.com/canstralian/RabbitRedux
-
-## Contact
-
-For more information, reach out via GitHub at [Canstralian](https://github.com/canstralian).
+license: mit
+datasets:
+- Canstralian/Wordlists
+- Canstralian/CyberExploitDB
+- Canstralian/pentesting_dataset
+- Canstralian/ShellCommands
+language:
+- en
+metrics:
+- accuracy
+- code_eval
+base_model:
+- replit/replit-code-v1_5-3b
+- WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-8B
+- WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B
+library_name: transformers
+tags:
+- code
+- text-generation-inference
+---
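The Results section removed by this commit reports precision 0.95, recall 0.92, and F1 0.93. As a quick sanity check, and assuming F1 there means the usual harmonic mean of precision and recall (the README does not state the formula), the three figures can be cross-checked with a small sketch. The helper name `f1_score` below is illustrative only and does not come from the repository.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the removed "Results" section of the README.
reported_precision = 0.95
reported_recall = 0.92
reported_f1 = 0.93

computed_f1 = f1_score(reported_precision, reported_recall)
print(f"computed F1 = {computed_f1:.4f}")

# The README rounds to two decimals, so allow half a unit in the last place.
print("consistent with reported F1:", abs(computed_f1 - reported_f1) < 0.005)
```

Under that assumption the reported numbers agree with each other to two decimal places. This only checks internal consistency; it says nothing about how the underlying scores were measured.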