# VampNet
This repository contains recipes for training generative music models on top of the Lyrebird Audio Codec.
# Setting up
Install AudioTools:
```bash
git clone https://github.com/hugofloresgarcia/audiotools.git
pip install -e ./audiotools
```
Install the LAC library:
```bash
git clone https://github.com/hugofloresgarcia/lac.git
pip install -e ./lac
```
Install VampNet:
```bash
git clone https://github.com/hugofloresgarcia/vampnet2.git
pip install -e ./vampnet2
```
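As a quick sanity check, make sure all three packages import (module names are assumed to match the repo names; adjust if they differ):
```bash
python -c "import audiotools, lac, vampnet; print('imports ok')"
```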
## A note on argbind
This repository relies on [argbind](https://github.com/pseeth/argbind) to manage CLIs and config files.
Config files are stored in the `conf/` folder.
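The general pattern is that `--args.load <yaml>` loads a config, and any bound `Scope.argument` key can be set either in the YAML or overridden directly on the command line. For illustration (the override key below is hypothetical; see the `conf/` files for the real keys):
```bash
# load defaults from a config file...
python scripts/exp/train.py --args.load conf/vampnet.yml

# ...then override any bound argument inline (key name is hypothetical)
python scripts/exp/train.py --args.load conf/vampnet.yml --AudioDataset.duration 5.0
```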
## Getting the Pretrained Models
Download the pretrained models from [this link](https://drive.google.com/file/d/1ZIBMJMt8QRE8MYYGjg4lH7v7BLbZneq2/view?usp=sharing). Then, extract the models to the `models/` folder.
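If the download arrives as a zip archive, extraction might look like this (archive name and format assumed):
```bash
unzip models.zip -d models/
```
The expected layout is shown in the directory tree below.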
# How the code is structured
This code was written quickly to meet a publication deadline, so it can be messy and redundant in places. We're working on cleaning it up.
```
├── conf                    <- (conf files for training, finetuning, etc)
├── demo.py                 <- (gradio UI for playing with vampnet)
├── env                     <- (environment variables)
│   └── env.sh
├── models                  <- (extract pretrained models here)
│   ├── spotdl
│   │   ├── c2f.pth         <- (coarse2fine checkpoint)
│   │   ├── coarse.pth      <- (coarse checkpoint)
│   │   └── codec.pth       <- (codec checkpoint)
│   └── wavebeat.pth
├── README.md
├── scripts
│   ├── exp
│   │   ├── eval.py         <- (eval script)
│   │   └── train.py        <- (training/finetuning script)
│   └── utils
└── vampnet
    ├── beats.py            <- (beat tracking logic)
    ├── __init__.py
    ├── interface.py        <- (high-level programmatic interface; see sketch below)
    ├── mask.py
    ├── modules
    │   ├── activations.py
    │   ├── __init__.py
    │   ├── layers.py
    │   └── transformer.py  <- (architecture + sampling code)
    ├── scheduler.py
    └── util.py
```
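For programmatic use (as opposed to the Gradio demo), `vampnet/interface.py` is the entry point. The sketch below shows the general shape of wiring the pretrained checkpoints into it; the constructor arguments and usage here are assumptions based on the layout above, so check `interface.py` for the real signatures:
```python
# a minimal sketch -- argument names are assumptions, see vampnet/interface.py
import audiotools as at
from vampnet.interface import Interface

interface = Interface(
    coarse_ckpt="models/spotdl/coarse.pth",    # coarse checkpoint
    coarse2fine_ckpt="models/spotdl/c2f.pth",  # coarse2fine checkpoint
    codec_ckpt="models/spotdl/codec.pth",      # codec checkpoint
    device="cuda",
)

# load an input signal with audiotools; the masking and generation
# calls live in interface.py and mask.py
signal = at.AudioSignal("input.wav")
```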
# Usage
First, you'll want to set up your environment:
```bash
source ./env/env.sh
```
## Staging a Run
Staging a run makes a copy of all the git-tracked files in the codebase and saves them to a folder for reproducibility. You can then run the training script from the staged folder.
```bash
stage --name my_run --run_dir /path/to/staging/folder
```
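Once staged, you'd typically launch training from inside the staged copy so the run uses that frozen snapshot of the code (assuming the copy lands at `<run_dir>/<name>`):
```bash
cd /path/to/staging/folder/my_run
python scripts/exp/train.py --args.load conf/vampnet.yml --save_path /path/to/checkpoints
```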
## Training a model
```bash
python scripts/exp/train.py --args.load conf/vampnet.yml --save_path /path/to/checkpoints
```
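For multi-GPU training, launching the same script through `torchrun` is the standard PyTorch pattern (this assumes `train.py` supports distributed training; verify before relying on it):
```bash
torchrun --nproc_per_node=4 scripts/exp/train.py \
    --args.load conf/vampnet.yml \
    --save_path /path/to/checkpoints
```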
## Fine-tuning
To fine-tune a model, see the configuration files under `conf/lora/`.
You just need to provide a list of audio files and/or folders to fine-tune on, then launch the training job as usual.
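For instance, a fine-tune config might point at your data with entries along these lines (the key name is illustrative; mirror whatever the existing `conf/lora/*.yml` files actually use):
```yaml
# illustrative only -- copy the real keys from an existing conf/lora/*.yml
AudioLoader.sources:
  - /path/to/birds/recordings/
  - /path/to/single/file.wav
```
Launching then looks like: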
```bash
python scripts/exp/train.py --args.load conf/lora/birds.yml --save_path /path/to/checkpoints
```
## Launching the Gradio Interface
```bash
python demo.py --args.load conf/interface/spotdl.yml --Interface.device cuda
```