Your Name committed on
Commit e593cad · 1 Parent(s): 37989dd

update .gitignore file to ignore .pyc files

Files changed (2)
  1. .gitignore +2 -1
  2. app.py +20 -14
.gitignore CHANGED
@@ -1,2 +1,3 @@
 *.pth
-assets/
+assets/
+__pycache__/
app.py CHANGED
@@ -26,9 +26,9 @@ ASSET_DIR = "./assets"
 DEFUALT_SR = 16_000
 DEFUALT_HIGH_CUT = 8_000
 DEFUALT_LOW_CUT = 1_000
-DEVICE = "cpu" #"cuda" if torch.cuda.is_available() else "cpu"
+DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
 
-print(f"Device: {DEVICE}")
+print(f"Use Device: {DEVICE}")
 
 if not os.path.exists(ASSET_DIR):
     os.makedirs(ASSET_DIR)
@@ -167,12 +167,12 @@ Thomas Radinger [ [email protected] | [LinkedIn](https://www.linkedin.com
 
 Birds are key indicators of ecosystem health and play pivotal roles in maintaining biodiversity [1]. To monitor and protect bird species, automatic bird sound recognition systems are essential. These systems can help in identifying bird species, monitoring their populations, and understanding their behavior. However, building such systems is challenging due to the diversity of bird sounds, complex acoustic interference and limited labeled data.
 
-To tackle these challenges, we expored the potential of deep learning models for bird sound recognition. In our work, we developed two Audio Spectrogram Transformer (AST) based models: BirdAST and BirdAST_Seq, to predict bird species from audio recordings. We evaluated the models on a dataset of 728 bird species and achieved promising results. As the field-recordings may contain various types of audio rather than only bird songs/calls, we also employed an Audio Masked AutoEncoder (AudioMAE) model to pre-classify audio clips into bird, insects, rain, environmental noise, and other types. Details of the models and evaluation results are provided in the table below.
+To tackle these challenges, we expored the potential of deep learning models for bird sound recognition. In our work, we developed two Audio Spectrogram Transformer (AST) based models: BirdAST and BirdAST_Seq, to predict bird species from audio recordings. We evaluated the models on a dataset of 728 bird species and achieved promising results. Details of the models and evaluation results are provided in the table below. As the field-recordings may contain various types of audio rather than only bird songs/calls, we also employed an Audio Masked AutoEncoder (AudioMAE) model to pre-classify audio clips into bird, insects, rain, environmental noise, and other types [2].
 
 Our contributions have shown the potential of deep learning models for bird sound recognition. We hope that our work can contribute to the development of automatic bird sound recognition systems and help in monitoring and protecting bird species.
 
-
 <div align="center">
+<b>Model Details</b>
 
 | Model name | Architecture | ROC-AUC Score |
 | --------------- |:------------------------------:|:-------------:|
@@ -187,11 +187,11 @@ Our contributions have shown the potential of deep learning models for bird soun
 2. Upload an audio clip and specify the start and end time for prediction.
 3. Click on the "Predict" button to get the predictions.
 4. In the output, you will get the audio type classification (e.g., bird, insects, rain, etc.) in the panel "Class Prediction" and the predicted bird species in the panel "Species Prediction".
-- The audio types are predicted as multi-lable classification based on the AudioMAE model [2]. The predicted classes indicate the possible presence of different types of audio in the recording.
-- The bird species are predicted as a multi-class classification using the selected model. The predicted classes indicate the most possible bird species present in the recording.
+* The audio types are predicted as multi-lable classification based on the AudioMAE model. The predicted classes indicate the possible presence of different types of audio in the recording.
+* The bird species are predicted as a multi-class classification using the selected model. The predicted classes indicate the most possible bird species present in the recording.
 5. The waveform and spectrogram of the audio clip are displayed in the respective panels.
 
-Notes:
+**Notes:**
 - For an unknown bird species, the model may predict the most similar bird species based on the training data.
 - If an audio clip contains non-bird sounds (predicted by the AudioMAE), the bird species prediction may not be accurate.
 
@@ -307,6 +307,10 @@ def handle_model_selection(model_name, download_status):
     # Inform user that download is starting
     # gr.Info(f"Downloading model weights for {model_name}...")
     print(f"Downloading model weights for {model_name}...")
+
+    if model_name is None:
+        model_name = "BirdAST"
+
     assets = ASSET_DICT[model_name]
     model_weights_url = assets["model_weights"]
     download_flag = True
@@ -324,7 +328,7 @@ def handle_model_selection(model_name, download_status):
             break
 
     if download_flag:
-        download_status = f"Model <{model_name}> is ready for prediction!"
+        download_status = f"Model <{model_name}> is ready! 🎉🎉🎉\nUsing Device: {DEVICE.upper()}"
     else:
         download_status = f"An error occurred while downloading model weights."
 
@@ -363,12 +367,14 @@ with gr.Blocks(theme = seafoam, css = css, js = js) as demo:
     waveform_output = gr.Plot(label="Waveform")
     spectrogram_output = gr.Plot(label="Spectrogram")
 
-    # gr.Examples(
-    #     examples=[
-    #         ["1094_Pionus_fuscus_2.wav", 0, 10],
-    #     ],
-    #     inputs=[audio_input, start_time_input, end_time_input]
-    # )
+    gr.Examples(
+        examples=[
+            ["XC226833-Chestnut-belted_20Chat-Tyrant_20A_2010989.mp3", 0, 10],
+            ["XC812290-Many-striped-Canastero_Teaben_Pe_1jul2022_FSchmitt_1.mp3", 0, 10],
+            ["XC763511-Synallaxis-maronica_Bagua-grande_MixPre-1746.mp3", 0, 10]
+        ],
+        inputs=[audio_input, start_time_input, end_time_input]
+    )
 
     gr.Button("Predict").click(predict, [audio_input, start_time_input, end_time_input, model_dropdown], [raw_class_output, species_output, waveform_output, spectrogram_output])
 
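The `app.py` changes above combine three small behaviors: choosing the device at import time, falling back to "BirdAST" when no model is selected, and reporting the device in the status message. The following is a minimal standalone sketch of those behaviors, not the code from the commit. The helpers `select_device`, `resolve_model_name`, and `ready_message` are hypothetical names introduced here for illustration, and the asset table is a placeholder because the real `ASSET_DICT` entries are not shown in the diff. One difference is deliberate: in the diff, the "Downloading model weights" message is printed before the `None` check, so it may print `None`, whereas this sketch resolves the name first.

```python
from typing import Mapping, Optional

DEFAULT_MODEL = "BirdAST"  # the same fallback name the diff hard-codes


def select_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def resolve_model_name(model_name: Optional[str], assets: Mapping[str, dict]) -> str:
    """Hypothetical helper: map an unset dropdown value to the default model."""
    name = DEFAULT_MODEL if model_name is None else model_name
    if name not in assets:
        # Fail with a clear message instead of a bare dictionary KeyError.
        raise KeyError(f"Unknown model: {name!r}")
    return name


def ready_message(model_name: str, device: str) -> str:
    """Build a status string shaped like the new download_status (emoji omitted)."""
    return f"Model <{model_name}> is ready!\nUsing Device: {device.upper()}"


if __name__ == "__main__":
    # Placeholder asset table; the real URLs are assumptions left blank here.
    demo_assets = {
        "BirdAST": {"model_weights": "<url>"},
        "BirdAST_Seq": {"model_weights": "<url>"},
    }
    chosen = resolve_model_name(None, demo_assets)
    print(f"Downloading model weights for {chosen}...")
    print(ready_message(chosen, select_device()))
```

Resolving the name before any logging keeps the console output and the status panel consistent, which may or may not matter for the actual app.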