## AI Safety Efforts

The Llama-2-7B-DMC-8x model underwent AI safety evaluation including adversarial testing via three distinct methods:
* [Garak](https://github.com/leondz/garak): an automated LLM vulnerability scanner that probes for common weaknesses, including prompt injection and data leakage.
* [AEGIS](https://huggingface.co/datasets/nvidia/Aegis-AI-Content-Safety-Dataset-1.0): a content safety evaluation dataset and an LLM-based content safety classifier that adheres to a broad taxonomy of 13 categories of critical risks in human-LLM interactions.
* Human Content Red Teaming, leveraging human interaction with and evaluation of the model's responses.
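For illustration, an automated scan of the kind Garak performs can be launched from its command line. The invocation below is a sketch, not a documented command for this model: it assumes garak is installed and that the checkpoint is loadable as a Hugging Face model, and the probe names shown (`promptinject`, `leakreplay`) are standard garak probes chosen here to match the prompt-injection and data-leakage checks mentioned above.

```shell
# Illustrative sketch only: assumes `pip install garak` and a locally
# accessible Hugging Face checkpoint; adjust --model_name to your path.
# --probes takes a comma-separated list of garak probe modules.
python -m garak --model_type huggingface \
                --model_name nvidia/Llama-2-7B-DMC-8x \
                --probes promptinject,leakreplay
```

Garak writes a report of probe hits per attempt, which can then be triaged alongside the human red-teaming findings.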

## Inference