Hugo Flores
committed on
Commit
·
5582d2e
1
Parent(s):
50f034f
readme cleanup
- README.md +2 -188
- lyrebird-audio-codec +1 -0
- lyrebird-audiotools +1 -0
README.md
CHANGED
@@ -1,6 +1,6 @@
|
|
1 |
-
# Lyrebird
|
2 |
|
3 |
-
This repository contains recipes for training
|
4 |
|
5 |
## Install hooks
|
6 |
|
@@ -24,65 +24,6 @@ If you need to run it on all files:
|
|
24 |
|
25 |
pre-commit run --all-files
|
26 |
|
27 |
-
## Usage & model zoo
|
28 |
-
|
29 |
-
To download the model, one must be authenticated to the `lyrebird-research` project on Google Cloud.
|
30 |
-
To see all available models, run
|
31 |
-
|
32 |
-
```bash
|
33 |
-
python -m wav2wav.list_models
|
34 |
-
```
|
35 |
-
|
36 |
-
which outputs something like this:
|
37 |
-
|
38 |
-
```
|
39 |
-
gs://research-models/wav2wav
|
40 |
-
└── prod
|
41 |
-
    └── v3
|
42 |
-
        └── ckpt
|
43 |
-
            ├── best
|
44 |
-
            │   └── generator
|
45 |
-
            │       ├── ❌ model.onnx
|
46 |
-
            │       ├── ❌ nvidia_geforce_rtx_2080_ti_11_7.trt
|
47 |
-
            │       ├── ✅ package.pth
|
48 |
-
            │       ├── ❌ tesla_t4_11_7.trt
|
49 |
-
            │       └── ✅ weights.pth
|
50 |
-
            └── latest
|
51 |
-
                └── generator
|
52 |
-
                    ├── ❌ package.pth
|
53 |
-
                    └── ❌ weights.pth
|
54 |
-
    └── v2
|
55 |
-
        ...
|
56 |
-
└── dev
|
57 |
-
...
|
58 |
-
```
|
59 |
-
|
60 |
-
This will show all the models that are available on GCP. Models that are available locally are marked with a ✅, while those not available locally
|
61 |
-
are marked with ❌. `.onnx` indicates a model that must be run with
|
62 |
-
the `ONNX` runtime, while `.trt` indicate models that have been optimized
|
63 |
-
with TensorRT. Note that TensorRT models are specific to GPU and CUDA
|
64 |
-
runtime, and their file names indicate what to use to run them.
|
65 |
-
|
66 |
-
`package.pth` is a version of the model that is saved using `torch.package`,
|
67 |
-
and contains a copy of the model code within it, which allow it to work
|
68 |
-
even if the model code in `wav2wav/modules/generator.py` changes. `weights.pth`
|
69 |
-
contains the model weights, and the code must match the code used
|
70 |
-
to create the model.
|
71 |
-
|
72 |
-
To use a model from this list, simply write its path and give it to the `enhance` script,
|
73 |
-
like so:
|
74 |
-
|
75 |
-
```
|
76 |
-
python -m wav2wav.interface \
|
77 |
-
[input_path]
|
78 |
-
--model_path=prod/v3/ckpt/best/generator/weights.pth
|
79 |
-
--output_path [output_path]
|
80 |
-
```
|
81 |
-
|
82 |
-
Models are downloaded to the location set by the environment variable `MODEL_LOCAL_PATH`, and defaults to `~/.wav2wav/models`. Similarly,
|
83 |
-
The model bucket is determined by `MODEL_GCS_PATH` and defaults to
|
84 |
-
`gs://research-models/wav2wav/`.
|
85 |
-
|
86 |
## Development
|
87 |
### Setting everything up
|
88 |
|
@@ -130,16 +71,6 @@ To tear down your development environment, just do
|
|
130 |
docker compose down
|
131 |
```
|
132 |
|
133 |
-
### Downloading data and pre-processing
|
134 |
-
Next, from within the Docker environment (or an appropriately configured Conda environment with environment variables set as above), do the following:
|
135 |
-
|
136 |
-
```
|
137 |
-
python -m wav2wav.preprocess.download
|
138 |
-
```
|
139 |
-
|
140 |
-
This will download all the necessary data, which are referenced by
|
141 |
-
the CSV files in `conf/audio/*`. These CSVs were generated via
|
142 |
-
`python -m wav2wav.preprocess.organize`.
|
143 |
|
144 |
### Launching an experiment
|
145 |
|
@@ -159,43 +90,8 @@ torchrun --nproc_per_node gpu \
|
|
159 |
|
160 |
The full settings are in [conf/daps/train.yml](conf/daps/train.yml).
|
161 |
|
162 |
-
### Evaluating an experiment
|
163 |
-
|
164 |
-
There are two ways to evaluate an experiment: quantitative and qualitative.
|
165 |
-
For the first, we can use the `scripts/exp/evaluate.py` script. This script evaluates the model over the `val_data` and `test_data`, defined in your
|
166 |
-
`train` script, and takes as input an experiment directory. The metrics
|
167 |
-
computed by this script are saved to the same folder.
|
168 |
-
|
169 |
-
The other way is via a preference test. Let's say we want to compare
|
170 |
-
the v3 prod model against the v2 prod model. to do this, we use the
|
171 |
-
`scripts/exp/qa.py` script. This script creates a zip file containing all
|
172 |
-
the samples and an HTML page for easy viewing. It also creates a Papaya
|
173 |
-
preference test. Use it like this:
|
174 |
-
|
175 |
-
```bash
|
176 |
-
WAV2WAV_MODELS=a,b python scripts/exp/qa.py \
|
177 |
-
--a/model_path prod/v3/ckpt/best/generator/package.pth \
|
178 |
-
--b/model_path prod/v2/ckpt/best/generator/package.pth \
|
179 |
-
--a/name "v3" --b/name "v2" \
|
180 |
-
--device cuda:0 \
|
181 |
-
--n_samples 20 \
|
182 |
-
--zip_path "samples/out.zip"
|
183 |
-
```
|
184 |
-
|
185 |
### Useful commands
|
186 |
|
187 |
-
#### Monitoring the machine
|
188 |
-
|
189 |
-
There's a useful `tmux` workspace that you can launch via:
|
190 |
-
|
191 |
-
```bash
|
192 |
-
tmuxp load ./workspace.yml
|
193 |
-
```
|
194 |
-
|
195 |
-
which will have a split pane with a shell to launch commands on the left,
|
196 |
-
and GPU monitoring, `htop`, and a script that watches for changes in your
|
197 |
-
directory on the right, in three split panes.
|
198 |
-
|
199 |
#### Cleaning up after a run
|
200 |
|
201 |
Sometimes DDP runs fail to clear themselves out of the machine. To fix this, run
|
@@ -203,85 +99,3 @@ Sometimes DDP runs fail to clear themselves out of the machine. To fix this, run
|
|
203 |
```bash
|
204 |
cleanup
|
205 |
```
|
206 |
-
|
207 |
-
### Deploying a new model to production
|
208 |
-
|
209 |
-
Okay, so you ran a model and it seems promising and you want to upload it
|
210 |
-
to GCS so it can be QA'd fully, and then shipped. First, upload
|
211 |
-
your experiment to the `dev` bucket on GCS via:
|
212 |
-
|
213 |
-
```bash
|
214 |
-
gsutil cp -r /path/to/{exp_name} gs://research-models/wav2wav/dev/{exp_name}
|
215 |
-
```
|
216 |
-
|
217 |
-
Once uploaded, QA can access the models by specifying
|
218 |
-
`model_path=dev/{exp_name}/ckpt/{best,latest}/generator/package.pth` when using the
|
219 |
-
`wav2wav.interface.enhance` function. If it passes QA, and is scheduled to
|
220 |
-
ship to production, then next we have to generate the TensorRT model file,
|
221 |
-
which requires us to have a machine that matches that of a production machine.
|
222 |
-
|
223 |
-
There is a script that automates this procedure, that does not require any
|
224 |
-
fiddling from our end. Navigate to the repository root and run:
|
225 |
-
|
226 |
-
```
|
227 |
-
python scripts/utils/convert_on_gcp.py dev/{exp_name}/ckpt/{best,latest}//generator/weights.pth
|
228 |
-
```
|
229 |
-
|
230 |
-
This will provision the machine, download the relevant model from GCS, optimize it on
|
231 |
-
the production GPU with the correct CUDA runtime, and then upload the generated `.trt`
|
232 |
-
and `.onnx` models back to the bucket.
|
233 |
-
|
234 |
-
Finally, copy the model to the `prod` bucket, incrementing the version number by one:
|
235 |
-
|
236 |
-
```bash
|
237 |
-
gsutil cp -r gs://research-models/wav2wav/dev/{exp_name} gs://research-models/wav2wav/prod/v{N}
|
238 |
-
```
|
239 |
-
|
240 |
-
where `N` is the next version (e.g. if v3 is the latest, the new one is v4). Then, update
|
241 |
-
the model table in [Notion](https://www.notion.so/descript/fc04de4b46e6417eba1d06bdc8de6c75?v=e56db4e6b37c4d9b9eca8d9be15c826a) with the new model.
|
242 |
-
|
243 |
-
Once the above is all done, we update the code in two places:
|
244 |
-
|
245 |
-
1. In `interface.py`, we update `PROD_MODEL_PATH` to point to the `weights.pth`
|
246 |
-
for whichever tag ended up shipping (either `best` or `latest`).
|
247 |
-
2. In `interface.py`, we update `PROD_TRT_PATH` to point the generated
|
248 |
-
TensorRT checkpoint generated by the script above.
|
249 |
-
|
250 |
-
After merging to master, a new Docker image will be created, and one can update the relevant lines
|
251 |
-
in descript-workflows like in this [PR](https://github.com/descriptinc/descript-workflows/pull/477/files).
|
252 |
-
|
253 |
-
We have Github action workflows in [.github/workflows/deploy.yml](.github/workflows/deploy.yml) to build and deploy new docker images. Two images are built - one for staging and another for production.
|
254 |
-
To deploy a new release version, follow the instructions in [this coda doc](https://coda.io/d/Research-Engineering_dOABAWL46p-/Deploying-Services_su1am#_lu7E8).
|
255 |
-
|
256 |
-
Coda doc with informations about deploying speech-enhance worker is [here](https://coda.io/d/Research-Engineering_dOABAWL46p-/Deploying-Services_su1am#_lu7E8).
|
257 |
-
|
258 |
-
And that's it! Once the new staging is built, you're done.
|
259 |
-
|
260 |
-
## Testing
|
261 |
-
|
262 |
-
### Profiling and Regression testing
|
263 |
-
|
264 |
-
- The [profiling script](tests/profile_inference.py) profiles the `wav2wav.interface.enhance` function.
|
265 |
-
- NOTE: ALWAYS run the profiler on a T4 GPU. ALWAYS run the profiling in isolation i.e kill all other processes on the GPU. Recommended vm size on GCP is `n1-standard-32` as the stress test of six hours of audio requires ~35GB of system memory.
|
266 |
-
- To run profiling use the [profiling script](tests/profile_inference.py) via command `python3 -m tests.profile_inference`. Results will be printed after `1` run.
|
267 |
-
- Use the [test_regression.py](tests/test_regression.py) script to run tests that
|
268 |
-
- compare performance stats of current model with known best model
|
269 |
-
- test for output deviation from the last model
|
270 |
-
- Run `git lfs checkout` to checkout input file and model weights required for testing the model.
|
271 |
-
- To launch these tests, run `python3 -m pytest tests/test_regression.py -v`.
|
272 |
-
- As a side effect, this will update the `tests/stat.csv` file if the current model performs better than last best known model as per `tests/stat.csv`.
|
273 |
-
- NOTE: In case of architecture change, purge the weights files : `tests/assets/{quick|slow}.pth` and reference stat file : `tests/assets/baseline.json` file. Running the [test_regression.py](tests/test_regression.py) script in absence of reference stat file, will generate new baseline referece stats as well as append new performance stats to stats file. In the absence of saved weights, new weights are generated and saved on disk. Make sure to commit these files (stat.csv, baseline.json, *.pth) when the model architecture changes.
|
274 |
-
|
275 |
-
### Unit tests
|
276 |
-
Regular unit tests that test functionality such as training resume etc. These are run on CPU. Update them when new features are added.
|
277 |
-
|
278 |
-
### Profiling tests
|
279 |
-
These tests profile the model's resource consumption. They are run on T4 GPU with 32 cores and >35GB memory. Their usage is reported in the above sections.
|
280 |
-
|
281 |
-
### Functional tests
|
282 |
-
These tests detect deviation from known baseline model. A category of these tests ensure that a new pytorch model doesn't deviate from the previous one. Another category ensures that the TensorRT version of the current pytorch model doens't deviate from it. These tests are marked with the marker `output_qa` and can be run via the command line `python3 -m pytest -v -m output_qa`. Some of these tests require a GPU.
|
283 |
-
|
284 |
-
### CI tests
|
285 |
-
- The tests are divided into two categories depending on the platform requirement - CPU tests and GPU tests.
|
286 |
-
- The CPU tests contains unit tests.
|
287 |
-
- The GPU tests contain a subset of functional tests. These tests can be run by the command `python3 -m pytest -v -m gpu_ci_test`.
|
|
|
1 |
+
# Lyrebird VampNet
|
2 |
|
3 |
+
This repository contains recipes for training generative music models on top of the Lyrebird Audio Codec.
|
4 |
|
5 |
## Install hooks
|
6 |
|
|
|
24 |
|
25 |
pre-commit run --all-files
|
26 |
|
27 |
## Development
|
28 |
### Setting everything up
|
29 |
|
|
|
71 |
docker compose down
|
72 |
```
|
73 |
|
74 |
|
75 |
### Launching an experiment
|
76 |
|
|
|
90 |
|
91 |
The full settings are in [conf/daps/train.yml](conf/daps/train.yml).
|
92 |
|
93 |
### Useful commands
|
94 |
|
95 |
#### Cleaning up after a run
|
96 |
|
97 |
Sometimes DDP runs fail to clear themselves out of the machine. To fix this, run
|
|
|
99 |
```bash
|
100 |
cleanup
|
101 |
```
|
lyrebird-audio-codec
ADDED
@@ -0,0 +1 @@
|
|
1 |
+
Subproject commit fac6b9b624ab84603a2775d7ad9a345f9ec0b1e5
|
lyrebird-audiotools
ADDED
@@ -0,0 +1 @@
|
|
1 |
+
Subproject commit 018a055ff7406c7bcb3b175551356ec18ba895b7
|