---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- NousResearch/Hermes-2-Pro-Llama-3-8B
- shenzhi-wang/Llama3-8B-Chinese-Chat
---

# Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge

Hermes-2-Pro-Llama-3-8B-Llama3-8B-Chinese-Chat-slerp-merge is a merge of the following models using [mergekit](https://github.com/cg123/mergekit):
* [NousResearch/Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B)
* [shenzhi-wang/Llama3-8B-Chinese-Chat](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat)

## 🧩 Merge Configuration

```yaml
slices:
  - sources:
      - model: NousResearch/Hermes-2-Pro-Llama-3-8B
        layer_range: [0, 31]
      - model: shenzhi-wang/Llama3-8B-Chinese-Chat
        layer_range: [0, 31]
merge_method: slerp
base_model: NousResearch/Hermes-2-Pro-Llama-3-8B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: float16
```

## Model Details

Hermes-2-Pro is an upgraded version of the Nous Hermes model, designed for general task and conversation capabilities, with a focus on function calling and structured outputs. It has been fine-tuned on a cleaned version of the OpenHermes 2.5 dataset, achieving high scores in function calling evaluations.

Llama3-8B-Chinese-Chat is an instruction-tuned model specifically for Chinese and English users, excelling in roleplaying and tool-using tasks.

## Description

The merged model combines the advanced generative capabilities of Hermes-2-Pro with the specialized tuning of Llama3-8B-Chinese-Chat. This results in a versatile model that excels in both English and Chinese text generation, providing enhanced context understanding and nuanced responses across various NLP tasks.

## Use Cases

- **Conversational AI**: Engage users in natural dialogue in both English and Chinese.
- **Function Calling**: Execute predefined functions based on user queries, enhancing interactivity.
- **Roleplaying**: Simulate characters or scenarios in a conversational context.
- **Text Generation**: Generate creative content, including stories, poems, and structured outputs.

## Model Features

- **Bilingual Capabilities**: Supports both English and Chinese, making it suitable for diverse user bases.
- **Function Calling**: Can map user input to predefined function calls, as inherited from Hermes-2-Pro.
- **Structured Outputs**: Capable of generating outputs in specific formats, such as JSON, for easier integration into applications.

## Evaluation Results

The results below are those reported for the parent models:

- **Hermes-2-Pro**: Achieved a 90% score on function calling evaluations and an 84% score on structured JSON output evaluations.
- **Llama3-8B-Chinese-Chat**: Demonstrated superior performance in Chinese language tasks, surpassing previous models in roleplay and function calling capabilities.

## Limitations

While the merged model inherits the strengths of both parent models, it may also carry over some of their limitations, including:

- **Biases**: Biases present in the training data of either parent model may affect the outputs.
- **Contextual Understanding**: The model may still struggle with highly nuanced or context-specific queries.
- **Performance Variability**: Performance may vary with the complexity of the task and the language used.

This merge aims to combine the function calling and structured output strengths of Hermes-2-Pro with the Chinese and English chat tuning of Llama3-8B-Chinese-Chat in a single bilingual model.
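
As a rough illustration of the `slerp` merge method and the per-filter `t` lists in the merge configuration, the sketch below interpolates two weight tensors on a hypersphere and spreads a list of `t` anchor values across layer depth. It is a minimal NumPy sketch, not mergekit's actual code: the helper names `slerp` and `layer_t`, the near-colinear fallback threshold, and the piecewise-linear spreading of the anchors over layers are assumptions made for illustration. Under this reading, `t = 0` keeps the `base_model` tensor (Hermes-2-Pro) and `t = 1` takes the other model's tensor.

```python
import numpy as np


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two same-shaped weight tensors.

    Illustrative helper (hypothetical name): t=0 returns v0, t=1 returns v1.
    """
    a = np.asarray(v0, dtype=np.float64)
    b = np.asarray(v1, dtype=np.float64)

    # Measure the angle between the tensors using normalized, flattened copies.
    a_unit = a.ravel() / (np.linalg.norm(a.ravel()) + eps)
    b_unit = b.ravel() / (np.linalg.norm(b.ravel()) + eps)
    dot = float(np.clip(np.dot(a_unit, b_unit), -1.0, 1.0))

    # Nearly colinear tensors: fall back to plain linear interpolation
    # (the 1e-6 threshold is an assumption, chosen for numerical safety).
    if abs(dot) > 1.0 - 1e-6:
        return (1.0 - t) * a + t * b

    theta = np.arccos(dot)
    sin_theta = np.sin(theta)
    w0 = np.sin((1.0 - t) * theta) / sin_theta
    w1 = np.sin(t * theta) / sin_theta
    return w0 * a + w1 * b


def layer_t(anchors, layer_idx, num_layers):
    """Spread a list of t anchors across layer depth (assumed piecewise-linear)."""
    if num_layers == 1:
        return float(anchors[0])
    pos = layer_idx / (num_layers - 1)            # relative depth in [0, 1]
    xs = np.linspace(0.0, 1.0, len(anchors))      # evenly spaced anchor positions
    return float(np.interp(pos, xs, anchors))


if __name__ == "__main__":
    self_attn_t = [0, 0.5, 0.3, 0.7, 1]  # from the `self_attn` filter in the config
    num_layers = 31                      # layers selected by layer_range [0, 31]
    rng = np.random.default_rng(0)
    w_base = rng.normal(size=(4, 4))     # stand-in for a Hermes-2-Pro tensor
    w_other = rng.normal(size=(4, 4))    # stand-in for a Chinese-Chat tensor
    for idx in (0, 15, 30):
        t = layer_t(self_attn_t, idx, num_layers)
        merged = slerp(t, w_base, w_other)
        print(idx, round(t, 3), merged.shape)
```

Read this way, the two filters mirror each other: self-attention weights start at the base model in the first layers and end at Llama3-8B-Chinese-Chat in the last, the MLP weights do the opposite, and all remaining tensors use a fixed `t` of 0.5.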