ZeroXClem committed · Commit f83c2e2 · verified · 1 Parent(s): a17ad8b

Update README.md

Files changed (1)
  1. README.md +121 -6
README.md CHANGED

Removed from the previous revision:

````diff
@@ -6,17 +6,30 @@ tags:
 - lazymergekit
 - hydra-project/ChatHercules-2.5-Mistral-7B
 - Nitral-Archive/Prima-Pastacles-7b
 ---
 
-# ZeroXClem/Mistral-2.5-Prima-Hercules-Fusion-7B
-
-ZeroXClem/Mistral-2.5-Prima-Hercules-Fusion-7B is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):
-* [hydra-project/ChatHercules-2.5-Mistral-7B](https://huggingface.co/hydra-project/ChatHercules-2.5-Mistral-7B)
-* [Nitral-Archive/Prima-Pastacles-7b](https://huggingface.co/Nitral-Archive/Prima-Pastacles-7b)
-
-## 🧩 Configuration
-
 ```yaml
 slices:
   - sources:
       - model: hydra-project/ChatHercules-2.5-Mistral-7B
@@ -33,5 +46,107 @@ parameters:
         value: [1, 0.5, 0.7, 0.3, 0]
       - value: 0.5
 dtype: bfloat16
-
-```
````
6
The updated README.md:

- lazymergekit
- hydra-project/ChatHercules-2.5-Mistral-7B
- Nitral-Archive/Prima-Pastacles-7b
language:
- en
base_model:
- hydra-project/ChatHercules-2.5-Mistral-7B
- Nitral-Archive/Prima-Pastacles-7b
library_name: transformers
---

# Mistral-2.5-Prima-Hercules-Fusion-7B

**Mistral-2.5-Prima-Hercules-Fusion-7B** is a language model created by merging **hydra-project/ChatHercules-2.5-Mistral-7B** with **Nitral-Archive/Prima-Pastacles-7b** using **spherical linear interpolation (SLERP)**. The fusion combines the conversational depth of Hercules with the contextual adaptability of Prima, yielding a model suited to dynamic assistant applications and multi-turn conversations.

## 🚀 Merged Models

This merge incorporates the following models:

- [**hydra-project/ChatHercules-2.5-Mistral-7B**](https://huggingface.co/hydra-project/ChatHercules-2.5-Mistral-7B): Serves as the primary model, known for strong conversational ability and robust language comprehension.
- [**Nitral-Archive/Prima-Pastacles-7b**](https://huggingface.co/Nitral-Archive/Prima-Pastacles-7b): Contributes contextual adaptability and task-switching capabilities, providing intuitive context management for diverse applications.

## 🧩 Merge Configuration

The configuration below shows how the models are merged with **spherical linear interpolation (SLERP)**. Rather than averaging weights along a straight line, SLERP interpolates corresponding parameters along an arc, which blends the two source models' layers smoothly.

```yaml
# Mistral-2.5-Prima-Hercules-Fusion-7B Merge Configuration
slices:
  - sources:
      - model: hydra-project/ChatHercules-2.5-Mistral-7B
# … (intermediate configuration lines are not included in this diff excerpt)
        value: [1, 0.5, 0.7, 0.3, 0]
      - value: 0.5
dtype: bfloat16
```

### Key Parameters

- **Self-Attention Filtering** (`self_attn`): Sets a separate interpolation schedule for the self-attention layers, so attention weights from the two source models are blended differently across layer depth.
- **MLP Filtering** (`mlp`): Sets the interpolation schedule for the MLP (feed-forward) blocks, controlling how much each model contributes to those layers.
- **Global Weight** (`t.value`): Applies a single interpolation factor (0.5) to every tensor not matched by a filter, keeping an even blend between the models.
- **Data Type** (`dtype`): Stores the merged weights in `bfloat16`, keeping memory and compute costs down while preserving adequate precision.
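
For intuition about what the `t` schedule does, here is a minimal, self-contained sketch of SLERP between two weight tensors. It is illustrative only and is not mergekit's actual implementation; the tensor names and shapes below are placeholders.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors (illustrative only)."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    # Angle between the two parameter vectors, treated as directions on a hypersphere.
    a_unit = a_flat / (a_flat.norm() + eps)
    b_unit = b_flat / (b_flat.norm() + eps)
    omega = torch.arccos(torch.clamp(a_unit @ b_unit, -1.0, 1.0))
    so = torch.sin(omega)
    if so.abs() < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        blended = (1.0 - t) * a_flat + t * b_flat
    else:
        blended = (torch.sin((1.0 - t) * omega) / so) * a_flat + (torch.sin(t * omega) / so) * b_flat
    return blended.reshape(a.shape).to(a.dtype)

# Placeholder tensors standing in for one weight matrix from each source model.
w_first = torch.randn(4096, 4096)
w_second = torch.randn(4096, 4096)
merged = slerp(0.5, w_first, w_second)  # an even blend, like the global t of 0.5
```

At `t = 0` the first tensor is returned unchanged and at `t = 1` the second, so a schedule such as `[1, 0.5, 0.7, 0.3, 0]` shifts the balance between the two source models as you move through the network's layers.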

## 🏆 Performance Highlights

- **Enhanced Multi-Turn Conversation Handling**: Improved context retention supports more coherent and contextually aware multi-turn interactions.
- **Dynamic Assistant Applications**: Excels in role-play and scenario-based interactions, providing nuanced and adaptable responses.
- **Balanced Integration**: Combines the conversational depth of Hercules with the contextual adaptability of Prima for versatile performance across tasks.

## 🎯 Use Cases & Applications

**Mistral-2.5-Prima-Hercules-Fusion-7B** is designed for environments that demand both conversational fluency and specialized task execution. Ideal applications include:

- **Advanced Conversational Agents**: Powering chatbots and virtual assistants with nuanced understanding and responsive generation.
- **Educational Tools**: Assisting tutoring systems, providing explanations, and supporting interactive learning experiences.
- **Content Generation**: Creating contextually relevant content for blogs, articles, and marketing materials.
- **Technical Support**: Offering precise and efficient support in specialized domains such as IT, healthcare, and finance.
- **Role-Playing Scenarios**: Enhancing interactive storytelling and simulation-based training with dynamic, context-aware responses.

## 📝 Usage

To use **Mistral-2.5-Prima-Hercules-Fusion-7B**, follow the steps below.

### Installation

First, install the necessary libraries:

```bash
pip install -qU transformers accelerate
```

### Inference

Below is an example of how to load the model and generate text:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

# Define the model name
model_name = "ZeroXClem/Mistral-2.5-Prima-Hercules-Fusion-7B"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the model in bfloat16 and let accelerate place it on the available devices
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Initialize the text-generation pipeline (dtype and device placement
# are already handled by the loaded model above)
text_generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)

# Define the input prompt
prompt = "Explain the significance of artificial intelligence in modern healthcare."

# Generate the output
outputs = text_generator(
    prompt,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95
)

# Print the generated text
print(outputs[0]["generated_text"])
```
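
For multi-turn chat, you can format the conversation with the tokenizer's chat template before generating. The sketch below reuses the `tokenizer` and `text_generator` objects from the example above and assumes the merged tokenizer ships a chat template; the messages are placeholders.

```python
# Build a short multi-turn conversation as role/content messages.
messages = [
    {"role": "user", "content": "What does merging two fine-tuned models accomplish?"},
    {"role": "assistant", "content": "It combines the strengths of both fine-tunes in a single model."},
    {"role": "user", "content": "When is SLERP preferable to a plain weighted average?"},
]

# Render the conversation into a single prompt string in the format the model expects.
chat_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

chat_outputs = text_generator(
    chat_prompt,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7
)
print(chat_outputs[0]["generated_text"])
```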

### Notes

- **Fine-Tuning**: This merged model requires fine-tuning for optimal performance in specific applications; one possible approach is sketched after this list.
- **Resource Requirements**: Ensure that your environment has sufficient computational resources; GPU-enabled hardware is recommended for reasonably fast inference.
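
As one illustrative fine-tuning route (not an official recipe for this model), the sketch below attaches parameter-efficient LoRA adapters with the `peft` library to the `model` loaded in the inference example; all hyperparameters are placeholders.

```python
from peft import LoraConfig, get_peft_model

# Placeholder LoRA settings; tune these for your task and hardware.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Wrap the merged model so that only the small adapter matrices are trained.
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()

# Train peft_model with your preferred trainer (e.g. transformers.Trainer or TRL's SFTTrainer).
```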

## 📜 License

This model is open-sourced under the **Apache-2.0 License**.

## 💡 Tags

- `merge`
- `mergekit`
- `slerp`
- `Mistral`
- `hydra-project/ChatHercules-2.5-Mistral-7B`
- `Nitral-Archive/Prima-Pastacles-7b`

---