general-mar11

This is a BERTopic model. BERTopic is a flexible, modular topic modeling framework that generates easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic

# Load the model from the Hugging Face Hub
topic_model = BERTopic.load("Thang203/general-mar11")

# Returns a pandas DataFrame with one row per topic
topic_model.get_topic_info()
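Once loaded, the model can also assign topics to new documents. The following is a minimal sketch, not part of the original card: the helper name `label_documents` is hypothetical, and it assumes the loaded model exposes `transform()` and the `topic_labels_` attribute as in recent BERTopic releases. Depending on how the model was saved, `BERTopic.load` may also need an explicit `embedding_model` argument before `transform()` can embed new text.

```python
from typing import Any, List, Tuple


def label_documents(topic_model: Any, docs: List[str]) -> List[Tuple[str, int, str]]:
    """Assign each document to a topic and attach that topic's label.

    `label_documents` is a hypothetical helper written for this example.
    """
    # transform() returns (topics, probabilities); only the topic IDs are used here.
    topics, _ = topic_model.transform(docs)
    # topic_labels_ maps topic ID -> label, e.g. 1 -> "1_code_language_llms_models".
    labels = topic_model.topic_labels_
    return [(doc, int(t), labels.get(int(t), "unknown")) for doc, t in zip(docs, topics)]


# Usage (needs bertopic installed and network access, so it is left commented out):
# from bertopic import BERTopic
# topic_model = BERTopic.load("Thang203/general-mar11")
# for doc, topic_id, label in label_documents(topic_model, ["LLMs for automated program repair"]):
#     print(topic_id, label, doc)
```

Documents that do not fit any topic are assigned to the outlier topic -1.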

Topic overview

Number of topics: 20 (19 topics plus the outlier topic -1)
Number of training documents: 12146

Overview of all topics:

Topic ID	Topic Keywords	Topic Frequency	Label
-1	models - language - llms - language models - large	11	-1_models_language_llms_language models
0	models - language - language models - model - llms	3795	0_models_language_language models_model
1	code - language - llms - models - large	3252	1_code_language_llms_models
2	visual - image - models - multimodal - video	1236	2_visual_image_models_multimodal
3	detection - text - models - attacks - adversarial	724	3_detection_text_models_attacks
4	bias - models - biases - llms - gender	610	4_bias_models_biases_llms
5	medical - clinical - models - language - llms	451	5_medical_clinical_models_language
6	legal - financial - sentiment - models - language	437	6_legal_financial_sentiment_models
7	ai - chatgpt - generative - design - generative ai	377	7_ai_chatgpt_generative_design
8	privacy - data - private - models - federated	274	8_privacy_data_private_models
9	students - education - chatgpt - ai - student	201	9_students_education_chatgpt_ai
10	driving - autonomous - autonomous driving - traffic - spatial	164	10_driving_autonomous_autonomous driving_traffic
11	protein - molecular - materials - chemical - drug	113	11_protein_molecular_materials_chemical
12	reinforcement learning - reinforcement - learning - rl - policy	103	12_reinforcement learning_reinforcement_learning_rl
13	math - mathematical - problems - reasoning - theorem	100	13_math_mathematical_problems_reasoning
14	vulnerability - code - vulnerabilities - security - log	91	14_vulnerability_code_vulnerabilities_security
15	forecasting - climate - time series - data - carbon	78	15_forecasting_climate_time series_data
16	style - poetry - style transfer - transfer - poems	78	16_style_poetry_style transfer_transfer
17	regression - matrix - softmax - mathbbrn - bf	37	17_regression_matrix_softmax_mathbbrn
18	recipes - recipe - food - cooking - dietary	14	18_recipes_recipe_food_cooking
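The frequencies in the table are heavily skewed toward the first two topics. As a quick sanity check, the sketch below copies the counts from the table, confirms that they add up to the 12146 training documents, and computes each topic's share. The names `TOPIC_FREQUENCIES` and `topic_shares` are illustrative only.

```python
# Topic frequencies copied from the table above (topic ID -> document count).
TOPIC_FREQUENCIES = {
    -1: 11, 0: 3795, 1: 3252, 2: 1236, 3: 724, 4: 610, 5: 451, 6: 437,
    7: 377, 8: 274, 9: 201, 10: 164, 11: 113, 12: 103, 13: 100, 14: 91,
    15: 78, 16: 78, 17: 37, 18: 14,
}


def topic_shares(frequencies):
    """Return each topic's share of all documents as a percentage."""
    total = sum(frequencies.values())
    return {topic: 100.0 * count / total for topic, count in frequencies.items()}


total_docs = sum(TOPIC_FREQUENCIES.values())
shares = topic_shares(TOPIC_FREQUENCIES)
# Combined share of the two largest topics (0 and 1).
top_two = shares[0] + shares[1]
print(f"{total_docs} documents, topics 0 and 1 cover {top_two:.1f}%")
# prints "12146 documents, topics 0 and 1 cover 58.0%"
```

Only 11 documents fall into the outlier topic -1, so nearly every training document is assigned to one of the 19 regular topics.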

Training hyperparameters

calculate_probabilities: False
language: None
low_memory: False
min_topic_size: 10
n_gram_range: (1, 1)
nr_topics: 20
seed_topic_list: None
top_n_words: 10
verbose: True
zeroshot_min_similarity: 0.7
zeroshot_topic_list: None
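These settings correspond to keyword arguments of the `BERTopic` constructor. The sketch below shows one way to create an untrained model with the same values. It is an assumption-laden illustration rather than the author's training script: the card does not name the embedding model or any custom UMAP, HDBSCAN, or vectorizer sub-models, and `language: None` likely means an embedding model was supplied explicitly, so `build_topic_model` (a hypothetical helper) leaves that choice to the caller.

```python
# Values copied from the "Training hyperparameters" list above.
TRAINING_HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 20,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_topic_model(embedding_model=None):
    """Create an untrained BERTopic instance with the listed settings.

    The embedding model is an assumption: the card does not say which one
    was used for training, so the caller supplies it (or relies on the
    library's fallback when it is left as None).
    """
    # Imported lazily so this module can be read without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(embedding_model=embedding_model, **TRAINING_HYPERPARAMETERS)
```

Calling `build_topic_model(...).fit_transform(docs)` on a different corpus, or with a different embedding model, will not reproduce the topics listed above.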

Framework versions

Numpy: 1.25.2
HDBSCAN: 0.8.33
UMAP: 0.5.5
Pandas: 1.5.3
Scikit-Learn: 1.2.2
Sentence-transformers: 2.5.1
Transformers: 4.38.2
Numba: 0.58.1
Plotly: 5.15.0
Python: 3.10.12
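Loading a saved model can be sensitive to library versions, so it may help to compare the local environment with the list above. The sketch below uses only the standard library. The PyPI distribution names (for example `umap-learn` for UMAP) are assumptions mapped from the display names in the list, and `compare_versions` is a hypothetical helper.

```python
from importlib import metadata

# Versions copied from the "Framework versions" list above.
# The distribution names on the left are assumed PyPI names.
EXPECTED_VERSIONS = {
    "numpy": "1.25.2",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.5",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.5.1",
    "transformers": "4.38.2",
    "numba": "0.58.1",
    "plotly": "5.15.0",
}


def compare_versions(expected):
    """Return {package: (expected, installed)} for every mismatch.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


for package, (wanted, installed) in compare_versions(EXPECTED_VERSIONS).items():
    print(f"{package}: card lists {wanted}, found {installed or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load; the check only highlights where the environment differs from the one used for training.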