Hugo Flores committed on
Commit 5582d2e · 1 Parent(s): 50f034f

readme cleanup

Files changed (3)
  1. README.md +2 -188
  2. lyrebird-audio-codec +1 -0
  3. lyrebird-audiotools +1 -0
README.md CHANGED
@@ -1,6 +1,6 @@
1
- # Lyrebird Wav2Wav
2
 
3
- This repository contains recipes for training Wav2Wav models.
4
 
5
  ## Install hooks
6
 
@@ -24,65 +24,6 @@ If you need to run it on all files:
24
 
25
  pre-commit run --all-files
26
 
27
- ## Usage & model zoo
28
-
29
- To download the model, one must be authenticated to the `lyrebird-research` project on Google Cloud.
30
- To see all available models, run
31
-
32
- ```bash
33
- python -m wav2wav.list_models
34
- ```
35
-
36
- which outputs something like this:
37
-
38
- ```
39
- gs://research-models/wav2wav
40
- └── prod
41
- └── v3
42
- └── ckpt
43
- ├── best
44
- │ └── generator
45
- │ ├── ❌ model.onnx
46
- │ ├── ❌ nvidia_geforce_rtx_2080_ti_11_7.trt
47
- │ ├── ✅ package.pth
48
- │ ├── ❌ tesla_t4_11_7.trt
49
- │ └── ✅ weights.pth
50
- └── latest
51
- └── generator
52
- ├── ❌ package.pth
53
- └── ❌ weights.pth
54
- └── v2
55
- ...
56
- └── dev
57
- ...
58
- ```
59
-
60
- This will show all the models that are available on GCP. Models that are available locally are marked with a ✅, while those not available locally
61
- are marked with ❌. `.onnx` indicates a model that must be run with
62
- the `ONNX` runtime, while `.trt` indicate models that have been optimized
63
- with TensorRT. Note that TensorRT models are specific to GPU and CUDA
64
- runtime, and their file names indicate what to use to run them.
65
-
66
- `package.pth` is a version of the model that is saved using `torch.package`,
67
- and contains a copy of the model code within it, which allow it to work
68
- even if the model code in `wav2wav/modules/generator.py` changes. `weights.pth`
69
- contains the model weights, and the code must match the code used
70
- to create the model.
71
-
72
- To use a model from this list, simply write its path and give it to the `enhance` script,
73
- like so:
74
-
75
- ```
76
- python -m wav2wav.interface \
77
- [input_path]
78
- --model_path=prod/v3/ckpt/best/generator/weights.pth
79
- --output_path [output_path]
80
- ```
81
-
82
- Models are downloaded to the location set by the environment variable `MODEL_LOCAL_PATH`, and defaults to `~/.wav2wav/models`. Similarly,
83
- The model bucket is determined by `MODEL_GCS_PATH` and defaults to
84
- `gs://research-models/wav2wav/`.
85
-
86
  ## Development
87
  ### Setting everything up
88
 
@@ -130,16 +71,6 @@ To tear down your development environment, just do
130
  docker compose down
131
  ```
132
 
133
- ### Downloading data and pre-processing
134
- Next, from within the Docker environment (or an appropriately configured Conda environment with environment variables set as above), do the following:
135
-
136
- ```
137
- python -m wav2wav.preprocess.download
138
- ```
139
-
140
- This will download all the necessary data, which are referenced by
141
- the CSV files in `conf/audio/*`. These CSVs were generated via
142
- `python -m wav2wav.preprocess.organize`.
143
 
144
  ### Launching an experiment
145
 
@@ -159,43 +90,8 @@ torchrun --nproc_per_node gpu \
159
 
160
  The full settings are in [conf/daps/train.yml](conf/daps/train.yml).
161
 
162
- ### Evaluating an experiment
163
-
164
- There are two ways to evaluate an experiment: quantitative and qualitative.
165
- For the first, we can use the `scripts/exp/evaluate.py` script. This script evaluates the model over the `val_data` and `test_data`, defined in your
166
- `train` script, and takes as input an experiment directory. The metrics
167
- computed by this script are saved to the same folder.
168
-
169
- The other way is via a preference test. Let's say we want to compare
170
- the v3 prod model against the v2 prod model. to do this, we use the
171
- `scripts/exp/qa.py` script. This script creates a zip file containing all
172
- the samples and an HTML page for easy viewing. It also creates a Papaya
173
- preference test. Use it like this:
174
-
175
- ```bash
176
- WAV2WAV_MODELS=a,b python scripts/exp/qa.py \
177
- --a/model_path prod/v3/ckpt/best/generator/package.pth \
178
- --b/model_path prod/v2/ckpt/best/generator/package.pth \
179
- --a/name "v3" --b/name "v2" \
180
- --device cuda:0 \
181
- --n_samples 20 \
182
- --zip_path "samples/out.zip"
183
- ```
184
-
185
  ### Useful commands
186
 
187
- #### Monitoring the machine
188
-
189
- There's a useful `tmux` workspace that you can launch via:
190
-
191
- ```bash
192
- tmuxp load ./workspace.yml
193
- ```
194
-
195
- which will have a split pane with a shell to launch commands on the left,
196
- and GPU monitoring, `htop`, and a script that watches for changes in your
197
- directory on the right, in three split panes.
198
-
199
  #### Cleaning up after a run
200
 
201
  Sometimes DDP runs fail to clear themselves out of the machine. To fix this, run
@@ -203,85 +99,3 @@ Sometimes DDP runs fail to clear themselves out of the machine. To fix this, run
203
  ```bash
204
  cleanup
205
  ```
206
-
207
- ### Deploying a new model to production
208
-
209
- Okay, so you ran a model and it seems promising and you want to upload it
210
- to GCS so it can be QA'd fully, and then shipped. First, upload
211
- your experiment to the `dev` bucket on GCS via:
212
-
213
- ```bash
214
- gsutil cp -r /path/to/{exp_name} gs://research-models/wav2wav/dev/{exp_name}
215
- ```
216
-
217
- Once uploaded, QA can access the models by specifying
218
- `model_path=dev/{exp_name}/ckpt/{best,latest}/generator/package.pth` when using the
219
- `wav2wav.interface.enhance` function. If it passes QA, and is scheduled to
220
- ship to production, then next we have to generate the TensorRT model file,
221
- which requires us to have a machine that matches that of a production machine.
222
-
223
- There is a script that automates this procedure, that does not require any
224
- fiddling from our end. Navigate to the repository root and run:
225
-
226
- ```
227
- python scripts/utils/convert_on_gcp.py dev/{exp_name}/ckpt/{best,latest}//generator/weights.pth
228
- ```
229
-
230
- This will provision the machine, download the relevant model from GCS, optimize it on
231
- the production GPU with the correct CUDA runtime, and then upload the generated `.trt`
232
- and `.onnx` models back to the bucket.
233
-
234
- Finally, copy the model to the `prod` bucket, incrementing the version number by one:
235
-
236
- ```bash
237
- gsutil cp -r gs://research-models/wav2wav/dev/{exp_name} gs://research-models/wav2wav/prod/v{N}
238
- ```
239
-
240
- where `N` is the next version (e.g. if v3 is the latest, the new one is v4). Then, update
241
- the model table in [Notion](https://www.notion.so/descript/fc04de4b46e6417eba1d06bdc8de6c75?v=e56db4e6b37c4d9b9eca8d9be15c826a) with the new model.
242
-
243
- Once the above is all done, we update the code in two places:
244
-
245
- 1. In `interface.py`, we update `PROD_MODEL_PATH` to point to the `weights.pth`
246
- for whichever tag ended up shipping (either `best` or `latest`).
247
- 2. In `interface.py`, we update `PROD_TRT_PATH` to point the generated
248
- TensorRT checkpoint generated by the script above.
249
-
250
- After merging to master, a new Docker image will be created, and one can update the relevant lines
251
- in descript-workflows like in this [PR](https://github.com/descriptinc/descript-workflows/pull/477/files).
252
-
253
- We have Github action workflows in [.github/workflows/deploy.yml](.github/workflows/deploy.yml) to build and deploy new docker images. Two images are built - one for staging and another for production.
254
- To deploy a new release version, follow the instructions in [this coda doc](https://coda.io/d/Research-Engineering_dOABAWL46p-/Deploying-Services_su1am#_lu7E8).
255
-
256
- Coda doc with informations about deploying speech-enhance worker is [here](https://coda.io/d/Research-Engineering_dOABAWL46p-/Deploying-Services_su1am#_lu7E8).
257
-
258
- And that's it! Once the new staging is built, you're done.
259
-
260
- ## Testing
261
-
262
- ### Profiling and Regression testing
263
-
264
- - The [profiling script](tests/profile_inference.py) profiles the `wav2wav.interface.enhance` function.
265
- - NOTE: ALWAYS run the profiler on a T4 GPU. ALWAYS run the profiling in isolation i.e kill all other processes on the GPU. Recommended vm size on GCP is `n1-standard-32` as the stress test of six hours of audio requires ~35GB of system memory.
266
- - To run profiling use the [profiling script](tests/profile_inference.py) via command `python3 -m tests.profile_inference`. Results will be printed after `1` run.
267
- - Use the [test_regression.py](tests/test_regression.py) script to run tests that
268
- - compare performance stats of current model with known best model
269
- - test for output deviation from the last model
270
- - Run `git lfs checkout` to checkout input file and model weights required for testing the model.
271
- - To launch these tests, run `python3 -m pytest tests/test_regression.py -v`.
272
- - As a side effect, this will update the `tests/stat.csv` file if the current model performs better than last best known model as per `tests/stat.csv`.
273
- - NOTE: In case of architecture change, purge the weights files : `tests/assets/{quick|slow}.pth` and reference stat file : `tests/assets/baseline.json` file. Running the [test_regression.py](tests/test_regression.py) script in absence of reference stat file, will generate new baseline referece stats as well as append new performance stats to stats file. In the absence of saved weights, new weights are generated and saved on disk. Make sure to commit these files (stat.csv, baseline.json, *.pth) when the model architecture changes.
274
-
275
- ### Unit tests
276
- Regular unit tests that test functionality such as training resume etc. These are run on CPU. Update them when new features are added.
277
-
278
- ### Profiling tests
279
- These tests profile the model's resource consumption. They are run on T4 GPU with 32 cores and >35GB memory. Their usage is reported in the above sections.
280
-
281
- ### Functional tests
282
- These tests detect deviation from known baseline model. A category of these tests ensure that a new pytorch model doesn't deviate from the previous one. Another category ensures that the TensorRT version of the current pytorch model doens't deviate from it. These tests are marked with the marker `output_qa` and can be run via the command line `python3 -m pytest -v -m output_qa`. Some of these tests require a GPU.
283
-
284
- ### CI tests
285
- - The tests are divided into two categories depending on the platform requirement - CPU tests and GPU tests.
286
- - The CPU tests contains unit tests.
287
- - The GPU tests contain a subset of functional tests. These tests can be run by the command `python3 -m pytest -v -m gpu_ci_test`.
 
1
+ # Lyrebird VampNet
2
 
3
+ This repository contains recipes for training generative music models on top of the Lyrebird Audio Codec.
4
 
5
  ## Install hooks
6
 
 
24
 
25
  pre-commit run --all-files
26
 
27
  ## Development
28
  ### Setting everything up
29
 
 
71
  docker compose down
72
  ```
73
 
74
 
75
  ### Launching an experiment
76
 
 
90
 
91
  The full settings are in [conf/daps/train.yml](conf/daps/train.yml).
92
 
93
  ### Useful commands
94
 
95
  #### Cleaning up after a run
96
 
97
  Sometimes DDP runs fail to clear themselves out of the machine. To fix this, run
 
99
  ```bash
100
  cleanup
101
  ```
lyrebird-audio-codec ADDED
@@ -0,0 +1 @@
 
 
1
+ Subproject commit fac6b9b624ab84603a2775d7ad9a345f9ec0b1e5
lyrebird-audiotools ADDED
@@ -0,0 +1 @@
 
 
1
+ Subproject commit 018a055ff7406c7bcb3b175551356ec18ba895b7