Evolution Learning Network (ELN) with QLoRA and Genetic Algorithms for LLMs

Overview

This project implements an Evolution Learning Network (ELN) to fine-tune transformer-based models like LLaMA using a combination of Quantized Low-Rank Adaptation (QLoRA) and Genetic Algorithms (GA). The primary objective is to evolve a population of models across multiple generations to optimize for performance (fitness) and specialization, while maintaining diversity.
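
At a high level, the process is a standard generational loop: train each candidate with QLoRA, score it, select the fittest, and refill the population with mutated copies. The sketch below illustrates this flow under stated assumptions; train_qlora, evaluate_fitness, and the candidate objects are hypothetical placeholders for the project's actual routines, and mutate_lora_weights is sketched later in the Mutations section.

import copy
import random

def evolve(population, generations=10):
    """Evolve a population of QLoRA-adapted candidates over several generations."""
    for _ in range(generations):
        # Fine-tune each candidate with QLoRA and score it on the evaluation split.
        for candidate in population:
            train_qlora(candidate)                      # hypothetical helper
            candidate.fitness = evaluate_fitness(candidate)  # hypothetical helper

        # Fitness-based selection: keep the top half as parents.
        parents = sorted(population, key=lambda c: c.fitness, reverse=True)
        parents = parents[: len(population) // 2]

        # Refill the population with mutated copies of the parents.
        children = []
        while len(parents) + len(children) < len(population):
            child = copy.deepcopy(random.choice(parents))
            mutate_lora_weights(child, rate=0.1, scale=0.02)
            children.append(child)
        population = parents + children
    return population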

Key Features

  • Efficient model fine-tuning using QLoRA.
  • Evolutionary strategies, including random mutations and fitness-based selection.
  • Hardware-efficient training with 4-bit quantization.
  • Comprehensive experiment tracking with WandB.
  • Diversity maintenance through LoRA weight fingerprinting.
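
One plausible way to implement the fingerprinting in the last bullet is shown below: the fingerprint is simply the flattened LoRA adapter weights, and diversity is the mean pairwise cosine distance between fingerprints. This is a minimal sketch; the project's actual distance measure and parameter-matching rule may differ.

import torch
import torch.nn.functional as F

def lora_fingerprint(model):
    """Flatten all LoRA adapter weights into a single fingerprint vector."""
    chunks = [p.detach().float().flatten()
              for name, p in model.named_parameters() if "lora_" in name]
    return torch.cat(chunks)

def population_diversity(models):
    """Average pairwise cosine distance between LoRA fingerprints."""
    fps = [lora_fingerprint(m) for m in models]
    dists = [1 - F.cosine_similarity(a, b, dim=0).item()
             for i, a in enumerate(fps) for b in fps[i + 1:]]
    return sum(dists) / max(len(dists), 1)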

Model Details

Base Model

  • Name: meta-llama/Llama-3.2-1B (can be replaced with any Hugging Face model).
  • Architecture: Transformer-based causal language model.

Quantization Configuration

  • Quantization Type: 4-bit using bitsandbytes (bnb_4bit).
  • Parameters:
    • Compute Type: torch.float16
    • Quantization Type: "nf4" (4-bit NormalFloat).
    • Double (Nested) Quantization: Enabled; the quantization constants are themselves quantized to save additional memory.
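
These settings map onto a bitsandbytes configuration roughly as follows; the device_map value is an assumption for illustration.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization with double/nested quantization, matching the settings above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NormalFloat4 data type
    bnb_4bit_compute_dtype=torch.float16,  # fp16 compute
    bnb_4bit_use_double_quant=True,        # quantize the quantization constants
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",
    quantization_config=bnb_config,
    device_map="auto",  # assumption: automatic single-node placement
)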

LoRA (Low-Rank Adaptation)

  • Dimensions (r): 8
  • Alpha (Scaling): 16
  • Target Modules: Query and Value projections (q_proj, v_proj).
  • Dropout: 0.05
  • Task Type: Causal Language Modeling (CAUSAL_LM).
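
In PEFT terms, these values correspond roughly to the LoraConfig below; bias="none" is an assumption, since the bias setting is not listed above.

from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,                                  # LoRA rank
    lora_alpha=16,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention query/value projections
    lora_dropout=0.05,
    bias="none",                          # assumption: biases left untouched
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)  # wraps the quantized base model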

Training Strategy

  • Optimizer: paged_adamw_8bit for memory-efficient updates.
  • Precision: Mixed precision (fp16) for faster training.

Hyperparameters

General Parameters

  • Generations: 10
  • Population Size: 4
  • Dataset Size: 2000 samples per split (adjustable for larger datasets).

Training

  • Batch Size: 8
  • Gradient Accumulation: 16 steps.
  • Learning Rate: 2e-4
  • Epochs per Model: 2
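
Combined with the optimizer and precision choices under Training Strategy, these hyperparameters translate roughly into the TrainingArguments below; output_dir, logging_steps, and report_to are assumptions for illustration.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="eln-candidate",        # assumption: per-candidate output directory
    per_device_train_batch_size=8,
    gradient_accumulation_steps=16,    # effective batch size of 128 per device
    learning_rate=2e-4,
    num_train_epochs=2,
    optim="paged_adamw_8bit",          # memory-efficient paged AdamW
    fp16=True,                         # mixed-precision training
    logging_steps=10,                  # assumption
    report_to="wandb",                 # experiment tracking
)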

Mutations

  • Mutation Rate: 10% (probability per parameter).
  • Mutation Scale: Noise added with a standard deviation of 0.02.
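
A sketch of how such a mutation could be applied is shown below. It assumes mutations are applied element-wise and in place, and that adapter parameters can be identified by the "lora_" substring in their names.

import torch

def mutate_lora_weights(model, rate=0.1, scale=0.02):
    """Add Gaussian noise to a random subset of LoRA parameters, in place."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if "lora_" in name:
                # Each element is mutated with probability `rate`.
                mask = (torch.rand_like(param) < rate).to(param.dtype)
                noise = torch.randn_like(param) * scale
                param.add_(mask * noise)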

Dataset Details

Source

  • Name: WikiText (wikitext-2-raw-v1; a larger configuration can be substituted).
  • Splits:
    • train → Model training.
    • validation → General task evaluation.
    • test → Specific task evaluation.

Tokenization

  • Tokenizer: Hugging Face AutoTokenizer.
  • Max Token Length: 128 tokens.
  • Padding: Fixed to "max_length".
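
A minimal loading and tokenization pipeline consistent with these settings might look like this; reusing the EOS token as the pad token is an assumption, since LLaMA tokenizers ship without a pad token.

from datasets import load_dataset
from transformers import AutoTokenizer

dataset = load_dataset("wikitext", "wikitext-2-raw-v1")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
tokenizer.pad_token = tokenizer.eos_token  # assumption: reuse EOS as the pad token

def tokenize(batch):
    return tokenizer(
        batch["text"],
        truncation=True,
        padding="max_length",  # fixed-length padding as described above
        max_length=128,
    )

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])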

Results

Summary

  • Total Generations: 10
  • Best Fitness Achieved: 0.4772
  • Final Population Diversity: 0.0011

Evolution History (Highlights)

Generation   Best Fitness   Avg Fitness   Diversity   Best Specialization
1            0.4096         0.4023        0.00097     0.9967
5            0.4727         0.4722        0.00099     0.9968
10           0.4772         0.4768        0.00106     0.9972

Hardware & Framework

Hardware

  • Multi-GPU support with torch.nn.parallel.DistributedDataParallel or Accelerator.
  • Logs GPU/CPU usage with psutil and torch.cuda.
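
A small helper along these lines can produce the usage numbers that get logged; the metric names are illustrative, not the project's actual keys.

import psutil
import torch

def log_resource_usage():
    """Collect CPU/RAM/GPU usage as a dict suitable for wandb.log()."""
    stats = {
        "cpu_percent": psutil.cpu_percent(),
        "ram_percent": psutil.virtual_memory().percent,
    }
    if torch.cuda.is_available():
        stats["gpu_mem_allocated_gb"] = torch.cuda.memory_allocated() / 1e9
        stats["gpu_mem_reserved_gb"] = torch.cuda.memory_reserved() / 1e9
    return stats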

Frameworks & Libraries

  • Transformers: Hugging Face model and tokenizer handling.
  • Datasets: Data loading and processing.
  • WandB: Experiment tracking and visualization.
  • BitsAndBytes: 4-bit quantization.
  • PEFT: LoRA-based fine-tuning.

Future Work

  • Explore larger population sizes and more generations for enhanced diversity.
  • Experiment with other datasets to generalize findings.
  • Integrate additional mutation strategies for broader exploration.

Citation

To be added.


Code to run locally

from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the evolved LoRA adapter from the Hub.
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
model = PeftModel.from_pretrained(base_model, "diabolic6045/ELN-AOC-CAIN")

Framework versions

  • PEFT 0.14.0