Model Card for Resume Section Classifier

This model classifies sections of Hungarian resumes into categories such as Skills, Education, and Experience. It uses the facebook/fasttext-hu-vectors model as its base and was fine-tuned on the ganchengguang/resume_seven_class dataset. The dataset is in English, so I translated it into Hungarian before training. Training on translated text is not the best approach, but it still works.

Model Details

Model Description

This model uses the pre-trained facebook/fasttext-hu-vectors embeddings to classify Hungarian resume sections into predefined categories. It was fine-tuned on the ganchengguang/resume_seven_class dataset, which has seven categories, including Experience, Education, Knowledge, and Project.

Model type: Text Classification
Language(s): Hungarian
Finetuned from model: facebook/fasttext-hu-vectors

Uses

Direct Use

This model can be used directly to classify sections of Hungarian resumes into categories such as Skills, Education, Experience, and others. It is suitable for applications in recruitment and resume analysis.

Downstream Use

The model can be integrated into larger systems for automated resume screening, assisting HR professionals in efficiently processing and categorizing resume information.

Out-of-Scope Use

This model is not intended for use with resumes in languages other than Hungarian. It may not perform accurately on resumes with non-standard formats or those containing significant amounts of non-Hungarian text.

Bias, Risks, and Limitations

The model has been trained on a specific dataset and may not generalize well to resumes with formats or content significantly different from those in the training data. Users should be aware of potential biases in the training data and the model's limitations in handling diverse resume formats.

Recommendations

Users should validate the model's predictions and consider incorporating human oversight, especially when dealing with resumes that deviate from the standard formats present in the training data.
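One possible way to add human oversight is to send low-confidence predictions to a reviewer. The sketch below is only an illustration: the `split_for_review` helper, the tuple layout, and the 0.6 threshold are assumptions, not part of this model or its training setup.

```python
from typing import List, Tuple

# Each prediction is (section_text, predicted_label, probability).
Prediction = Tuple[str, str, float]


def split_for_review(
    predictions: List[Prediction], threshold: float = 0.6
) -> Tuple[List[Prediction], List[Prediction]]:
    """Split predictions into auto-accepted and needs-human-review lists.

    The threshold is a hypothetical value; tune it on your own validation data.
    """
    accepted: List[Prediction] = []
    needs_review: List[Prediction] = []
    for item in predictions:
        probability = item[2]
        if probability >= threshold:
            accepted.append(item)
        else:
            needs_review.append(item)
    return accepted, needs_review
```

A lower threshold sends fewer sections to reviewers but lets more uncertain labels through, so the value is a trade-off to settle per application.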

How to Get Started with the Model

See https://github.com/ssobii2/Wozify-CV-Parser.
For how to load and run fastText models, check the fastText website.
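As a rough starting point, the sketch below shows how a fastText classifier is typically loaded and queried with the `fasttext` Python package. The file name `model.bin` and the helper names are assumptions; check the model repository for the actual file name.

```python
from typing import Any, Tuple

LABEL_PREFIX = "__label__"  # fastText's default label prefix


def load_classifier(model_path: str) -> Any:
    """Load a fastText model from a local file (the path is an assumption)."""
    import fasttext  # imported lazily so classify_section works without it

    return fasttext.load_model(model_path)


def classify_section(model: Any, text: str) -> Tuple[str, float]:
    """Return (label, probability) for a single resume section."""
    # fastText's predict() processes one line at a time and rejects newlines,
    # so collapse all whitespace (including line breaks) into single spaces.
    single_line = " ".join(text.split())
    labels, probabilities = model.predict(single_line, k=1)
    label = labels[0]
    if label.startswith(LABEL_PREFIX):
        label = label[len(LABEL_PREFIX):]
    return label, float(probabilities[0])


# Example usage (requires the fasttext package and a downloaded model file):
# model = load_classifier("model.bin")  # hypothetical file name
# print(classify_section(model, "Eötvös Loránd Tudományegyetem\nBSc, 2015-2018"))
```

`classify_section` accepts any object with a fastText-style `predict` method, which keeps the whitespace handling and label cleanup easy to test without the model file.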

Training Details

Training Data

The model was fine-tuned on the ganchengguang/resume_seven_class dataset, which contains English resume sections labeled with seven categories, including Experience, Education, Knowledge, and Project. I translated the dataset into Hungarian before fine-tuning.

Training Procedure

The model was fine-tuned using standard text classification procedures, adjusting hyperparameters to optimize performance on the resume classification task.
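The exact training script is not included here, so the following is only a sketch of how supervised fastText training is commonly set up: each example is written as one `__label__<category> <text>` line, and `fasttext.train_supervised` is run on that file. The helper names, file paths, and hyperparameter values are illustrative assumptions, not the settings used for this model.

```python
from typing import Any, Iterable, Tuple


def to_fasttext_line(label: str, text: str) -> str:
    """Format one example as a fastText supervised-training line."""
    tag = "__label__" + label.strip().replace(" ", "_")
    # Each example must sit on a single line, so collapse all whitespace.
    return tag + " " + " ".join(text.split())


def write_training_file(examples: Iterable[Tuple[str, str]], path: str) -> int:
    """Write (label, text) pairs to path and return how many were written."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for label, text in examples:
            handle.write(to_fasttext_line(label, text) + "\n")
            count += 1
    return count


def train(train_path: str, vectors_path: str) -> Any:
    """Train a classifier initialised from pre-trained word vectors."""
    import fasttext  # imported lazily; needs the fasttext package

    return fasttext.train_supervised(
        input=train_path,
        pretrainedVectors=vectors_path,  # text-format (.vec) vectors file
        dim=300,   # must match the vectors' dimension (300 assumed here)
        epoch=25,  # illustrative value only
        lr=0.5,    # illustrative value only
    )
```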

Evaluation

Testing Data, Factors & Metrics

The model's performance was evaluated on a held-out test set from the ganchengguang/resume_seven_class dataset, using accuracy and F1-score as evaluation metrics.

Metrics

Accuracy: Measures the proportion of correctly classified sections.
F1-score: Harmonic mean of precision and recall, providing a balance between the two.
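For reference, both metrics can be computed from lists of true and predicted labels as sketched below. The card does not say how F1 was averaged across the seven classes, so the macro average (unweighted mean of per-class F1) shown here is an assumption.

```python
from typing import Dict, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Proportion of sections whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_per_class(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Per-class F1, the harmonic mean of precision and recall."""
    scores: Dict[str, float] = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        # 2*TP / (2*TP + FP + FN) is algebraically equal to
        # 2 * precision * recall / (precision + recall).
        denominator = 2 * tp + fp + fn
        scores[label] = 2 * tp / denominator if denominator else 0.0
    return scores


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 (the averaging choice is an assumption)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)
```

Macro averaging weights every category equally, so a rarely seen category counts as much as a common one. A weighted or micro average would give different numbers.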

Environmental Impact

The training of this model was conducted on standard hardware, resulting in minimal carbon emissions. Users should consider the environmental impact of training large models and explore options for model distillation or quantization to reduce energy consumption.

Technical Specifications

Model Architecture and Objective

The model is based on the facebook/fasttext-hu-vectors architecture, fine-tuned for the task of classifying Hungarian resume sections into predefined categories.

Compute Infrastructure

The model was trained on my personal gaming laptop.

Hardware

GPU: RTX 4070 Laptop GPU, 8 GB VRAM
CPU: Intel Core i7-13620H
RAM: 16 GB
