---
license: ecl-2.0
datasets:
- mozilla-foundation/common_voice_11_0
language:
- en
- pt
metrics:
- accuracy
library_name: transformers
tags:
- code
---

# Speech Portuguese (Brazilian) Accent Classifier

🎙️🤖🇧🇷

This project is a speech accent classifier that distinguishes between Portuguese (Brazilian) and other accents.

## Project Overview

This application uses a trained model to classify speech accents into two categories:
1. Portuguese (Brazilian)
2. Other

The model is based on the author's work [results] and uses the Portuguese portion of the Common Voice dataset (version 11.0) from the Mozilla Foundation.

## Dataset

The project uses the Portuguese subset of the Common Voice dataset:
- Dataset: "mozilla-foundation/common_voice_11_0", "pt"

Brazilian accents included in the dataset:
- Português do Brasil, Região Sul do Brasil
- Paulistano
- Paulista, Brasileiro
- Carioca
- Mato Grosso
- Mineiro
- Interior Paulista
- Gaúcho
- Nordestino
- And various regional mixes
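
The two-class target can be derived from the accent annotations listed above. The following is a minimal sketch, assuming exact string matching against the names in the list and a hypothetical label convention (1 for Brazilian, 0 for Other). The `accent_to_label` helper and the label ids are illustrative and are not taken from the original training code. The regional mixes are not enumerated in this document, so the set below does not cover them.

```python
# Accent names copied from the list above. The "regional mixes" are not
# enumerated in this README, so this set is incomplete by construction.
BRAZILIAN_ACCENTS = {
    "Português do Brasil, Região Sul do Brasil",
    "Paulistano",
    "Paulista, Brasileiro",
    "Carioca",
    "Mato Grosso",
    "Mineiro",
    "Interior Paulista",
    "Gaúcho",
    "Nordestino",
}


def accent_to_label(accent):
    """Return 1 for a listed Brazilian accent, 0 otherwise (hypothetical ids)."""
    if accent is None:
        return 0  # a missing annotation falls into the "Other" class here
    return 1 if accent.strip() in BRAZILIAN_ACCENTS else 0
```

With this convention, samples whose accent field is empty or unlisted are treated as "Other"; a real pipeline might prefer to drop unlabeled samples instead.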

## Model and Processor

The project utilizes the following model and processor:
- Base Model: "facebook/wav2vec2-base-960h"
- Processor: Wav2Vec2Processor.from_pretrained
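
As a rough illustration of how the base model and processor might be loaded with `transformers`, here is a sketch. It assumes a two-label `Wav2Vec2ForSequenceClassification` head on top of the base checkpoint; the choice of head class, the `load_processor_and_model` helper, and the label ids are assumptions, since this document only names the base model and `Wav2Vec2Processor.from_pretrained`.

```python
BASE_CHECKPOINT = "facebook/wav2vec2-base-960h"

# Hypothetical label convention; the README does not state the ids used.
ID2LABEL = {0: "Other", 1: "Portuguese (Brazilian)"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def load_processor_and_model(checkpoint=BASE_CHECKPOINT):
    """Load the processor and a 2-class classification model (assumed head)."""
    # Imported lazily so the constants above can be used without transformers.
    from transformers import Wav2Vec2ForSequenceClassification, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(checkpoint)
    # The classification head is newly initialized when loading the base
    # checkpoint, so it must be fine-tuned before its outputs are meaningful.
    model = Wav2Vec2ForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(ID2LABEL),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
    return processor, model
```

Calling `load_processor_and_model()` downloads the checkpoint from the Hugging Face Hub on first use.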

## Model Versions

Three versions of the model were trained with different configurations:

1. **(OLD) v 1.1**:
   - Epochs: 3
   - Training samples: 1000
   - Validation samples: 200

2. **(OLD) v 1.2**:
   - Epochs: 10
   - Training samples: 1000
   - Validation samples: 500

3. **(NEW) v 1.3**:
   - Epochs: 20
   - Training samples: 5000
   - Validation samples: 1000

All models were trained on a high-RAM GPU runtime on Google Colab Pro.
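
To make the difference between the versions easier to compare, the configurations listed above can be written down as data. This is only a sketch: the `VersionConfig` class and the `samples_seen` helper are hypothetical names, and the product of epochs and training samples is a simple proxy for training effort, not a figure reported by the author.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionConfig:
    epochs: int
    train_samples: int
    val_samples: int


# Values copied from the version list above.
VERSIONS = {
    "1.1": VersionConfig(epochs=3, train_samples=1000, val_samples=200),
    "1.2": VersionConfig(epochs=10, train_samples=1000, val_samples=500),
    "1.3": VersionConfig(epochs=20, train_samples=5000, val_samples=1000),
}


def samples_seen(cfg):
    """Number of training samples processed over all epochs."""
    return cfg.epochs * cfg.train_samples


for name, cfg in VERSIONS.items():
    print(f"v{name}: {samples_seen(cfg)} samples seen during training")
```

By this measure, v 1.3 processes ten times as many training samples as v 1.2 (100,000 versus 10,000).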

## Model Structure (files)

Each version of the model includes the following files:
results config.json | preprocessor_config.json | model.safetensors | special_tokens_map.json | tokenizer_config.json | vocab.json 


## How to Use

Test the classifier by recording audio or uploading an audio file. Short sentences are recommended for testing.
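
For offline use, inference on a single file could look roughly like the sketch below. It assumes the fine-tuned files are available in a local directory (`model_dir` is a placeholder), that the model uses a `Wav2Vec2ForSequenceClassification` head, that `librosa` is used for audio loading, and that label id 1 means Brazilian Portuguese. None of these details are confirmed by this document.

```python
# Hypothetical label convention; check the model's config.json for the real one.
ID2LABEL = {0: "Other", 1: "Portuguese (Brazilian)"}
TARGET_SAMPLE_RATE = 16_000  # wav2vec2-base-960h expects 16 kHz audio


def pick_label(scores):
    """Return the label name with the highest score."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return ID2LABEL[best]


def classify_file(audio_path, model_dir):
    """Classify one audio file with a locally stored fine-tuned model."""
    # Heavy dependencies are imported lazily, only when inference is run.
    import librosa
    import torch
    from transformers import Wav2Vec2ForSequenceClassification, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_dir)
    model = Wav2Vec2ForSequenceClassification.from_pretrained(model_dir)
    model.eval()

    # Load as mono and resample to the rate the model expects.
    waveform, _ = librosa.load(audio_path, sr=TARGET_SAMPLE_RATE, mono=True)
    inputs = processor(
        waveform, sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, number of labels)
    return pick_label(logits[0].tolist())
```

A call such as `classify_file("sample.wav", "path/to/model")` would return one of the two label names; both arguments here are placeholders.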

## License

This project is licensed under the Educational Community License, Version 2.0 (ECL-2.0).

## Developer Information

Developed by Ramon Mayor Martins (2024)
- Email: [email protected]
- Homepage: https://rmayormartins.github.io/
- Twitter: @rmayormartins
- GitHub: https://github.com/rmayormartins

## Acknowledgements

Special thanks to Instituto Federal de Santa Catarina (Federal Institute of Santa Catarina) IFSC-São José-Brazil.

## Contact

For any queries or suggestions, please contact the developer using the information provided above.