|
--- |
|
license: ecl-2.0 |
|
datasets: |
|
- mozilla-foundation/common_voice_11_0 |
|
language: |
|
- en |
|
- pt |
|
metrics: |
|
- accuracy |
|
library_name: transformers |
|
tags: |
|
- code |
|
--- |
|
|
|
# Speech Portuguese (Brazilian) Accent Classifier |
|
|
|
🎙️🤖🇧🇷 |
|
|
|
This project is a speech accent classifier that distinguishes between Portuguese (Brazilian) and other accents. |
|
|
|
## Project Overview |
|
|
|
This application uses a trained model to classify speech accents into two categories: |
|
1. Portuguese (Brazilian) |
|
2. Other |
|
|
|
The model is based on the author's work [results] and utilizes the Portuguese portion of the Common Voice dataset (version 11.0) from Mozilla Foundation. |
|
|
|
## Dataset |
|
|
|
The project uses the Portuguese subset of the Common Voice dataset: |
|
- Dataset: "mozilla-foundation/common_voice_11_0", "pt" |
|
|
|
Brazilian accents included in the dataset: |
|
- Português do Brasil, Região Sul do Brasil |
|
- Paulistano |
|
- Paulista, Brasileiro |
|
- Carioca |
|
- Mato Grosso |
|
- Mineiro |
|
- Interior Paulista |
|
- Gaúcho |
|
- Nordestino |
|
- And various regional mixes |
|
|
|
## Model and Processor |
|
|
|
The project utilizes the following model and processor: |
|
- Base Model: "facebook/wav2vec2-base-960h" |
|
- Processor: Wav2Vec2Processor.from_pretrained |
|
|
|
## Model Versions |
|
|
|
Was trained three versions of the model with different configurations: |
|
|
|
1. **(OLD) v 1.1**: |
|
- Epochs: 3 |
|
- Training samples: 1000 |
|
- Validation samples: 200 |
|
|
|
2. **(OLD) v 1.2**: |
|
- Epochs: 10 |
|
- Training samples: 1000 |
|
- Validation samples: 500 |
|
|
|
3. **(NEW) v 1.3**: |
|
- Epochs: 20 |
|
- Training samples: 5000 |
|
- Validation samples: 1000 |
|
|
|
All models were trained using high RAM GPU on Google Colab Pro. |
|
|
|
## Model Structure (files) |
|
|
|
Each version of the model includes the following files: |
|
results config.json | preprocessor_config.json | model.safetensors | special_tokens_map.json | tokenizer_config.json | vocab.json |
|
|
|
|
|
## How to Use |
|
|
|
Test with recording or uploading an audio file. To test, I recommend short sentences. |
|
|
|
## License |
|
|
|
This project is licensed under the Eclipse Public License 2.0 (ECL-2.0). |
|
|
|
## Developer Information |
|
|
|
Developed by Ramon Mayor Martins (2024) |
|
- Email: [email protected] |
|
- Homepage: https://rmayormartins.github.io/ |
|
- Twitter: @rmayormartins |
|
- GitHub: https://github.com/rmayormartins |
|
|
|
## Acknowledgements |
|
|
|
Special thanks to Instituto Federal de Santa Catarina (Federal Institute of Santa Catarina) IFSC-São José-Brazil. |
|
|
|
## Contact |
|
|
|
For any queries or suggestions, please contact the developer using the information provided above. |
|
|