	---
	base_model: Spestly/AwA-1.5B
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- qwen2
	- trl
	- llama-cpp
	- gguf-my-repo
	license: apache-2.0
	language:
	- en
	library_name: transformers
	---

	# Triangle104/AwA-1.5B-Q4_K_S-GGUF
	This model was converted to GGUF format from [`Spestly/AwA-1.5B`](https://huggingface.co/Spestly/AwA-1.5B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
	Refer to the [original model card](https://huggingface.co/Spestly/AwA-1.5B) for more details on the model.

	---
	Model details:
	-
	AwA (Answers with Athena) is my portfolio project, showcasing a cutting-edge Chain-of-Thought (CoT) reasoning model. I created AwA to excel in providing detailed, step-by-step answers to complex questions across diverse domains. This model represents my dedication to advancing AI’s capability for enhanced comprehension, problem-solving, and knowledge synthesis.

	### Key Features

	Chain-of-Thought Reasoning: AwA delivers step-by-step breakdowns of solutions, mimicking logical human thought processes.

	Domain Versatility: Performs exceptionally across a wide range of domains, including mathematics, science, literature, and more.

	Adaptive Responses: Adjusts answer depth and complexity based on input queries, catering to both novices and experts.

	Interactive Design: Designed for educational tools, research assistants, and decision-making systems.

	### Intended Use Cases

	Educational Applications: Supports learning by breaking down complex problems into manageable steps.

	Research Assistance: Generates structured insights and explanations in academic or professional research.

	Decision Support: Enhances understanding in business, engineering, and scientific contexts.

	General Inquiry: Provides coherent, in-depth answers to everyday questions.

	Type: Chain-of-Thought (CoT) Reasoning Model

	Base Architecture: Adapted from Qwen2

	Parameters: 1.54B

	Fine-tuning: Specialized fine-tuning on Chain-of-Thought reasoning datasets to enhance step-by-step explanatory capabilities.

	### Ethical Considerations

	Bias Mitigation: I have taken steps to minimise biases in the training data. However, users are encouraged to cross-verify outputs in sensitive contexts.

	Limitations: May not provide exhaustive answers for niche topics or domains outside its training scope.

	User Responsibility: Designed as an assistive tool, not a replacement for expert human judgment.

	### Usage

	#### Option A: Local

	Using locally with the Transformers library

	# Use a pipeline as a high-level helper
	from transformers import pipeline

	messages = [
	    {"role": "user", "content": "Who are you?"},
	]
	pipe = pipeline("text-generation", model="Spestly/AwA-1.5B")
	pipe(messages)
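
The pipeline call above returns a nested structure rather than a plain string. The following is a minimal sketch of pulling the reply out of it, assuming the pipeline is called with a list of chat messages, in which case the conversation comes back with the assistant's message appended last. The helper names (`build_messages`, `extract_reply`, `run_example`) and the `max_new_tokens=256` cap are illustrative choices, not part of the original model card.

```python
from typing import Any


def build_messages(question: str) -> list[dict[str, str]]:
    """Wrap a single user question in the chat-message format used above."""
    return [{"role": "user", "content": question}]


def extract_reply(outputs: list[dict[str, Any]]) -> str:
    """Return the assistant's text from a chat-style text-generation result.

    Assumes outputs[0]["generated_text"] is the full conversation, with the
    newly generated assistant message as its last element.
    """
    conversation = outputs[0]["generated_text"]
    return conversation[-1]["content"]


def run_example() -> None:
    """Download the model, ask one question, and print the reply."""
    # Imported here so the helpers above work without loading transformers.
    from transformers import pipeline

    pipe = pipeline("text-generation", model="Spestly/AwA-1.5B")
    # max_new_tokens=256 is an arbitrary illustrative cap, not a tuned setting.
    outputs = pipe(build_messages("Who are you?"), max_new_tokens=256)
    print(extract_reply(outputs))
```

Calling `run_example()` downloads the full-precision `Spestly/AwA-1.5B` weights, not the GGUF file from this repository.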

	#### Option B: API & Space

	You can use the AwA Hugging Face Space or the AwA API (coming soon).

	### Roadmap

	- More AwA model sizes, e.g. 7B and 14B
	- Create an AwA API via the spestly package

	---
	## Use with llama.cpp
	Install llama.cpp through brew (works on Mac and Linux)

	```bash
	brew install llama.cpp
	```
	Invoke the llama.cpp server or the CLI.

	### CLI:
	```bash
	llama-cli --hf-repo Triangle104/AwA-1.5B-Q4_K_S-GGUF --hf-file awa-1.5b-q4_k_s.gguf -p "The meaning to life and the universe is"
	```

	### Server:
	```bash
	llama-server --hf-repo Triangle104/AwA-1.5B-Q4_K_S-GGUF --hf-file awa-1.5b-q4_k_s.gguf -c 2048
	```
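
Once the server is running, it can be queried over HTTP. The sketch below uses only the Python standard library and assumes the server is listening on its default address (`127.0.0.1:8080`) and exposes the OpenAI-compatible `/v1/chat/completions` route; adjust `SERVER_URL` if you started it with a different host or port. The function names and the `max_tokens` default are illustrative.

```python
import json
import urllib.request

# Assumption: llama-server was started with its default host and port.
SERVER_URL = "http://127.0.0.1:8080/v1/chat/completions"


def build_request(question: str, max_tokens: int = 256) -> urllib.request.Request:
    """Build a POST request carrying a single-turn chat completion payload."""
    payload = {
        "messages": [{"role": "user", "content": question}],
        "max_tokens": max_tokens,  # illustrative cap on the reply length
    }
    return urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_reply(body: bytes) -> str:
    """Extract the assistant's text from an OpenAI-style response body."""
    data = json.loads(body)
    return data["choices"][0]["message"]["content"]


def ask(question: str) -> str:
    """Send one question to the running server and return its answer."""
    with urllib.request.urlopen(build_request(question)) as response:
        return parse_reply(response.read())
```

With the server from the command above running, `print(ask("Who are you?"))` sends one question and prints the reply.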

	Note: You can also use this checkpoint directly by following the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.

	Step 1: Clone llama.cpp from GitHub.
	```bash
	git clone https://github.com/ggerganov/llama.cpp
	```

	Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (for example, `LLAMA_CUDA=1` for Nvidia GPUs on Linux).
	```bash
	cd llama.cpp && LLAMA_CURL=1 make
	```

	Step 3: Run inference through the main binary.
	```bash
	./llama-cli --hf-repo Triangle104/AwA-1.5B-Q4_K_S-GGUF --hf-file awa-1.5b-q4_k_s.gguf -p "The meaning to life and the universe is"
	```
	or
	```bash
	./llama-server --hf-repo Triangle104/AwA-1.5B-Q4_K_S-GGUF --hf-file awa-1.5b-q4_k_s.gguf -c 2048
	```