Canstralian committed on
Commit bff2939 · verified · 1 Parent(s): d2193d0

Update README.md

Files changed (1)
  1. README.md +20 -142
README.md CHANGED
@@ -1,143 +1,21 @@
- ---
- license: mit
- datasets:
- - Canstralian/Wordlists
- - Canstralian/CyberExploitDB
- - Canstralian/pentesting_dataset
- - Canstralian/ShellCommands
- language:
- - en
- metrics:
- - accuracy
- - code_eval
- - bertscore
- base_model:
- - replit/replit-code-v1_5-3b
- - WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-8B
- - WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B
- library_name: transformers
- tags:
- - code
- - text-generation-inference
- ---
- Here's the completed version of the RabbitRedux model card, filled out from the perspective of **Canstralian**:
-
  ---
-
- # Model Card for RabbitRedux
-
- RabbitRedux is a code classification model tailored for cybersecurity applications, based on the `replit/replit-code-v1_5-3b` model. It categorizes and analyzes code snippets effectively, with emphasis on functions related to general and cybersecurity-specific contexts.
-
- ## Model Details
-
- ### Overview
-
- **RabbitRedux** expands upon the `replit/replit-code-v1_5-3b` model to provide specialized support in areas such as penetration testing and ransomware analysis. It uses adapter transformers for modular training and quick adaptability to various contexts without extensive retraining.
-
- - **Developer:** [Canstralian](https://github.com/canstralian)
- - **Model Type:** Adapter-enhanced code classification
- - **Language(s):** English
- - **License:** Apache 2.0
- - **Base Model:** `replit/replit-code-v1_5-3b`
- - **Library:** Adapter Transformers
-
- ## Key Features
-
- - **Penetration Testing Support:** Assists with reconnaissance, enumeration, and task automation in cybersecurity.
- - **Ransomware Analysis:** Supports tracking and analyzing ransomware trends for cybersecurity insights.
- - **Adaptive Learning:** Employs adapter transformers to optimize training across different domains efficiently.
-
- ## Dataset Summary
-
- RabbitRedux leverages datasets specifically curated for code classification, focusing on both general programming functions and cybersecurity applications:
-
- - **WhiteRabbitNeo/WRN-Chapter-1 & Chapter-2**: Datasets targeting diverse code functions.
- - **Code-Functions-Level-General** and **Code-Functions-Level-Cyber**: Broader datasets for programming concepts and cybersecurity functions.
- - **Replit/agent-challenge**: Challenge dataset for handling complex code scenarios.
- - **Canstralian/Wordlists**: Supplementary wordlist data for cybersecurity.
-
- ## Model Usage
-
- To use RabbitRedux, initialize and load the adapter with the following code:
-
- ```python
- from adapters import AutoAdapterModel
- model = AutoAdapterModel.from_pretrained("replit/replit-code-v1_5-3b")
- model.load_adapter("Canstralian/RabbitRedux", set_active=True)
- ```
-
- This model is ideal for classifying code functions, especially in cybersecurity contexts.
-
- ## Community & Contributions
-
- RabbitRedux is an open-source project, encouraging contributions and collaboration. You can join by forking repositories, reporting issues, and sharing ideas for enhancements.
-
- - **GitHub:** [Canstralian](https://github.com/canstralian)
- - **Replit:** [Canstralian](https://replit.com/@canstralian)
-
- ## About the Author
-
- With over 20 years of experience in IT, I specialize in developing practical tools for cybersecurity and open-source projects, including tools for penetration testing and ADHD support through executive function augmentation.
-
- ## Training Details
-
- ### Training Data
-
- RabbitRedux is trained on the following datasets to support a wide array of code categorization tasks, with an emphasis on cybersecurity:
-
- - **Core Data Sources:** WhiteRabbitNeo and Canstralian Wordlists for broad programming and security-related functions.
- - **Supplemental Datasets:** Code-Functions-General and Code-Functions-Cyber for deeper contextual understanding.
-
- ### Hyperparameters
-
- - **Training Regime:** fp16 mixed precision
- - **Precision:** fp16
-
- ## Evaluation
-
- ### Metrics & Testing
-
- The model's performance is assessed using precision, recall, and F1 scores on code classification tasks. Further evaluation data is available upon request.
-
- ### Results
-
- - **Precision:** 0.95
- - **Recall:** 0.92
- - **F1 Score:** 0.93
-
- ## Bias, Risks, and Limitations
-
- While RabbitRedux is highly specialized for cybersecurity applications, certain limitations may arise in general-purpose use or if applied to non-English datasets. Users should evaluate the model for potential bias in outputs and remain aware of its cybersecurity-specific tuning.
-
- ### Recommendations
-
- Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model, especially in contexts that are outside its trained domain.
-
- ## Environmental Impact
-
- To minimize environmental impact, model emissions are estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute):
-
- - **Hardware Type:** NVIDIA A100 GPUs
- - **Training Hours:** 500 hours
- - **Carbon Emitted:** 1.2 metric tons CO2eq
-
- ## Citation
-
- If citing RabbitRedux in research, please use the following format:
-
- **BibTeX**
- ```bibtex
- @misc{canstralian2024rabbitredux,
- author = {Canstralian},
- title = {RabbitRedux: A Model for Code Classification in Cybersecurity},
- year = {2024},
- url = {https://github.com/canstralian/RabbitRedux},
- }
- ```
-
- **APA**
- Canstralian. (2024). *RabbitRedux: A Model for Code Classification in Cybersecurity*. Retrieved from https://github.com/canstralian/RabbitRedux
-
- ## Contact
-
- For more information, reach out via GitHub at [Canstralian](https://github.com/canstralian).
  ---
+ license: mit
+ datasets:
+ - Canstralian/Wordlists
+ - Canstralian/CyberExploitDB
+ - Canstralian/pentesting_dataset
+ - Canstralian/ShellCommands
+ language:
+ - en
+ metrics:
+ - accuracy
+ - code_eval
+ base_model:
+ - replit/replit-code-v1_5-3b
+ - WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-8B
+ - WhiteRabbitNeo/Llama-3.1-WhiteRabbitNeo-2-70B
+ library_name: transformers
+ tags:
+ - code
+ - text-generation-inference
+ ---