ShivamSrng
committed on
Fine-tuned Topic Model for aspects_to_improve column
- .gitattributes +1 -0
- README.md +129 -0
- config.json +16 -0
- ctfidf.safetensors +3 -0
- ctfidf_config.json +3 -0
- topic_embeddings.safetensors +3 -0
- topics.json +0 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+ctfidf_config.json filter=lfs diff=lfs merge=lfs -text
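Each `.gitattributes` entry is a path pattern followed by whitespace-separated attributes: `name=value` sets an attribute, and a leading `-` (as in `-text`) unsets it. The sketch below illustrates that reading of the added line. The helper `parse_gitattributes_line` is hypothetical, and it deliberately ignores quoted patterns and the `!attr` form.

```python
def parse_gitattributes_line(line: str) -> tuple[str, dict]:
    """Split one .gitattributes line into its pattern and attribute settings."""
    pattern, *attributes = line.split()
    parsed = {}
    for attribute in attributes:
        if attribute.startswith("-"):
            # '-text' means the attribute is explicitly unset.
            parsed[attribute[1:]] = False
        elif "=" in attribute:
            # 'filter=lfs' assigns a value to the attribute.
            name, value = attribute.split("=", 1)
            parsed[name] = value
        else:
            # A bare name sets the attribute to true.
            parsed[attribute] = True
    return pattern, parsed


# The line added to .gitattributes in this commit.
pattern, attrs = parse_gitattributes_line(
    "ctfidf_config.json filter=lfs diff=lfs merge=lfs -text"
)
print(pattern, attrs)
```

With `filter`, `diff`, and `merge` all set to `lfs`, the repository stores `ctfidf_config.json` as a Git LFS pointer rather than as inline JSON.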
README.md
ADDED
@@ -0,0 +1,129 @@
---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# after_covid_face_to_face_aspects_to_improve

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic
topic_model = BERTopic.load("ShivamSrng/after_covid_face_to_face_aspects_to_improve")

topic_model.get_topic_info()
```

## Topic overview

* Number of topics: 60
* Number of training documents: 76623

<details>
  <summary>Click here for an overview of all topics.</summary>

  | Topic ID | Topic Keywords | Topic Frequency | Label |
  |----------|----------------|-----------------|-------|
  | 0 | practice problems - grading - common exams - grades - improve | 53887 | 0_practice problems_grading_common exams_grades |
  | 1 | lab reports - lab assignments - labs class - lab reports lab - class labs | 3550 | 1_lab reports_lab assignments_labs class_lab reports lab |
  | 2 | ramblings cheers dallas - ramblings cheers dallas cowboys - cheers dallas cowboys year - lectures ramblings cheers dallas - cheers dallas cowboys | 522 | 2_ramblings cheers dallas_ramblings cheers dallas cowboys_cheers dallas cowboys year_lectures ramblings cheers dallas |
  | 3 | class everything - class needs structure - helpful class way - purpose class - class needs | 508 | 3_class everything_class needs structure_helpful class way_purpose class |
  | 4 | web development - softwares - technologies - implementation - data analytics | 503 | 4_web development_softwares_technologies_implementation |
  | 5 | way misunderstands deliverable approach - misunderstands deliverable approach - way misunderstands deliverable - misunderstands deliverable - having entire role unfair | 486 | 5_way misunderstands deliverable approach_misunderstands deliverable approach_way misunderstands deliverable_misunderstands deliverable |
  | 6 | research create highquality - case studies - economics ethics actual engineering - research - research create highquality scientificallyaccurate | 464 | 6_research create highquality_case studies_economics ethics actual engineering_research |
  | 7 | calculators - calculator - calculations formulas - formula sheets - math formulas | 457 | 7_calculators_calculator_calculations formulas_formula sheets |
  | 8 | teach matlab - teaching matlab - matlab taught class - students matlab - matlab taught | 445 | 8_teach matlab_teaching matlab_matlab taught class_students matlab |
  | 9 | using creo - software - mylab mandatory installing - different software - vmware | 426 | 9_using creo_software_mylab mandatory installing_different software |
  | 10 | material difficult - material challenging - materials better - approach materials - material needs | 393 | 10_material difficult_material challenging_materials better_approach materials |
  | 11 | maybe movies - films - maybe mention oddities - maybe mention oddities life - film | 385 | 11_maybe movies_films_maybe mention oddities_maybe mention oddities life |
  | 12 | class review exams - class review - review material exams - class review sessions - exam reviews | 383 | 12_class review exams_class review_review material exams_class review sessions |
  | 13 | grade conclude keeps hostage - understand lab activity - understand lab activity approached - mentioned assigned lab activities - microscopes need repaired | 382 | 13_grade conclude keeps hostage_understand lab activity_understand lab activity approached_mentioned assigned lab activities |
  | 14 | lab equipment outdated - better lab equipment - lab equipment better - equipment better lab - lab equipment old | 382 | 14_lab equipment outdated_better lab equipment_lab equipment better_equipment better lab |
  | 15 | schedule felt disrespectful availability - schedule felt disrespectful - release schedule felt disrespectful - felt disrespectful availability advisor - disrespectful availability advisor | 381 | 15_schedule felt disrespectful availability_schedule felt disrespectful_release schedule felt disrespectful_felt disrespectful availability advisor |
  | 16 | instructor studio critic - studio curriculum - studio classes - studio difficult - studio students | 379 | 16_instructor studio critic_studio curriculum_studio classes_studio difficult |
  | 17 | prototyping - drawings projects - drawings class - model making - cad drawings | 378 | 17_prototyping_drawings projects_drawings class_model making |
  | 18 | questions requires paying attention - understanding everything - understand asking based - understand asking based wording - homework problem word arbitrary | 373 | 18_questions requires paying attention_understanding everything_understand asking based_understand asking based wording |
  | 19 | students software - students stay current remote - students access - provide students - available students | 371 | 19_students software_students stay current remote_students access_provide students |
  | 20 | examples needs - examples help - examples help understand - examples helpful - need examples | 366 | 20_examples needs_examples help_examples help understand_examples helpful |
  | 21 | class hard understand professor - everything class hard understand - everything professor - enjoyed class - great professor class | 362 | 21_class hard understand professor_everything class hard understand_everything professor_enjoyed class |
  | 22 | vocareum terrible feedback wrong - vocareum terrible feedback - feedback wrong simply - having warning set instructor - simply having warning set | 358 | 22_vocareum terrible feedback wrong_vocareum terrible feedback_feedback wrong simply_having warning set instructor |
  | 23 | industry projects - students looking internships jobs - career development - students looking internships - resumes | 356 | 23_industry projects_students looking internships jobs_career development_students looking internships |
  | 24 | writing essays engineering - writing essays engineering topics - essays engineering topics - essays engineering - involved writing essays engineering | 354 | 24_writing essays engineering_writing essays engineering topics_essays engineering topics_essays engineering |
  | 25 | architecture students - architecture taught - design courses - engineering design courses - design students | 353 | 25_architecture students_architecture taught_design courses_engineering design courses |
  | 26 | office hours having time - office hours - office hours having - office hours nice independent - office hour | 353 | 26_office hours having time_office hours_office hours having_office hours nice independent |
  | 27 | suggestions - suggestions think - suggestions right - suggestions right suggestions right - think suggestions | 347 | 27_suggestions_suggestions think_suggestions right_suggestions right suggestions right |
  | 28 | physics classes - physics courses - mastering physics - students physics - physics class | 337 | 28_physics classes_physics courses_mastering physics_students physics |
  | 29 | concepts realworld applications - examples realworld applications - real world application - practical examples - applications concepts | 335 | 29_concepts realworld applications_examples realworld applications_real world application_practical examples |
  | 30 | class attendance policy - attendance policy - students attendance - way attendance taken - taking attendance | 331 | 30_class attendance policy_attendance policy_students attendance_way attendance taken |
  | 31 | information felt waste time - information content - information felt waste - information harder - dense uninteresting | 331 | 31_information felt waste time_information content_information felt waste_information harder |
  | 32 | chapter practice - learn chapters - learning chapters - learn chapter - homework chapters | 328 | 32_chapter practice_learn chapters_learning chapters_learn chapter |
  | 33 | instruction manual needs clear - instructions manual - manual needs clear clearly - instruction manual - diagrams lab manual | 323 | 33_instruction manual needs clear_instructions manual_manual needs clear clearly_instruction manual |
  | 34 | overwhelming workload - work workload - workload overwhelming - workload decrease - work load work | 319 | 34_overwhelming workload_work workload_workload overwhelming_workload decrease |
  | 35 | respond classmates postings - forcing discussion - respond classmates postings following - think discussion - discussion posts | 318 | 35_respond classmates postings_forcing discussion_respond classmates postings following_think discussion |
  | 36 | pearson online homework - pearson homework feel - pearson textbook - students pearson - homework assignments pearson | 317 | 36_pearson online homework_pearson homework feel_pearson textbook_students pearson |
  | 37 | lecture hall difficult - lecture room - students room - class room - lecture hall class | 311 | 37_lecture hall difficult_lecture room_students room_class room |
  | 38 | ebnf simply expanded years - simply expanded years pas - expanded years pas different - ebnf simply expanded - research ebnf simply expanded | 305 | 38_ebnf simply expanded years_simply expanded years pas_expanded years pas different_ebnf simply expanded |
  | 39 | reading assignments week - multiple reading assignments week - reading assignments - readings boring - having multiple reading assignments | 304 | 39_reading assignments week_multiple reading assignments week_reading assignments_readings boring |
  | 40 | opportunities active duty operations - active duty operations - opportunities active duty - duty operations person commissioning - operations person commissioning | 302 | 40_opportunities active duty operations_active duty operations_opportunities active duty_duty operations person commissioning |
  | 41 | students notes - notes exams - class better notes - class notes - semester notes | 299 | 41_students notes_notes exams_class better notes_class notes |
  | 42 | videos beneficial - video lectures - lesson videos - videos lectures - tutorial videos | 297 | 42_videos beneficial_video lectures_lesson videos_videos lectures |
  | 43 | ideas discussed agreed jury - design ideas discussed agreed - ideas discussed - ideas discussed agreed - enjoyed everything | 297 | 43_ideas discussed agreed jury_design ideas discussed agreed_ideas discussed_ideas discussed agreed |
  | 44 | time management - time spent section - minutes instead - minutes - minutes questions | 289 | 44_time management_time spent section_minutes instead_minutes |
  | 45 | difficult difficulty level - difficult difficulty - tedious time consuming - difficult time easier - difficult | 287 | 45_difficult difficulty level_difficult difficulty_tedious time consuming_difficult time easier |
  | 46 | design process - final critiques interior designers - design development - design work - design project | 287 | 46_design process_final critiques interior designers_design development_design work |
  | 47 | paper research - choose research paper - write research paper - research writing - research paper write | 276 | 47_paper research_choose research paper_write research paper_research writing |
  | 48 | nj making students feel - requirements nj making students - covering naab requirements nj - nj making students - covering naab requirements | 268 | 48_nj making students feel_requirements nj making students_covering naab requirements nj_nj making students |
  | 49 | underneath studio desk avoid - dirty dusty students - studio spaces - studio space - underneath studio desk | 258 | 49_underneath studio desk avoid_dirty dusty students_studio spaces_studio space |
  | 50 | chem taught - learning chem - chemistry students - courses chem - taking chemistry | 253 | 50_chem taught_learning chem_chemistry students_courses chem |
  | 51 | pacing felt little fast - think pacing better - pacing little bit slow - bit quick pacing - possible pacing | 244 | 51_pacing felt little fast_think pacing better_pacing little bit slow_bit quick pacing |
  | 52 | everything great - think everything great - everything everything great - everything alright - everything alright everything | 235 | 52_everything great_think everything great_everything everything great_everything alright |
  | 53 | midterm using practice exam - prepare midterm - midterm using practice - assessments midterm format learn - practice midterm | 225 | 53_midterm using practice exam_prepare midterm_midterm using practice_assessments midterm format learn |
  | 54 | recitation assignments taught - recitation assignments relate lecture - recitation class recitation - recitation teaching - recitation class | 216 | 54_recitation assignments taught_recitation assignments relate lecture_recitation class recitation_recitation teaching |
  | 55 | think way opinion think - think way great - think works way - great way think - way great way think | 206 | 55_think way opinion think_think way great_think works way_great way think |
  | 56 | critiques - nice far material review - nitpicking experience great - opinion aspects milano beautiful - opinion review nice | 182 | 56_critiques_nice far material review_nitpicking experience great_opinion aspects milano beautiful |
  | 57 | pearson homework terrible - using pearson - using pearson homework - pearson homeworks - pearson homework horrible | 178 | 57_pearson homework terrible_using pearson_using pearson homework_pearson homeworks |
  | 58 | personally change - think change best personally - change personally - change everything - personally change change | 173 | 58_personally change_think change best personally_change personally_change everything |
  | 59 | current state think great - right think opinion great - right think great - think opinion great - think coruse great | 88 | 59_current state think great_right think opinion great_right think great_think opinion great |

</details>
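The frequencies in the table are heavily skewed toward topic 0. As a rough illustration, the sketch below computes the share of training documents for a few topics, using counts copied from the table and the training-document total from the README. The names `TOPIC_FREQUENCIES` and `topic_share` are hypothetical and not part of BERTopic.

```python
# Document counts copied from the topic table (topic ID -> Topic Frequency).
TOPIC_FREQUENCIES = {0: 53887, 1: 3550, 2: 522, 59: 88}

# "Number of training documents" as reported in the README.
TRAINING_DOCUMENTS = 76623


def topic_share(topic_id: int) -> float:
    """Return the fraction of training documents assigned to one topic."""
    return TOPIC_FREQUENCIES[topic_id] / TRAINING_DOCUMENTS


for topic_id in sorted(TOPIC_FREQUENCIES):
    print(f"topic {topic_id}: {topic_share(topic_id):.1%}")
```

By this calculation topic 0 holds roughly 70% of the training documents, so the remaining 59 topics describe a comparatively small slice of the responses.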

## Training hyperparameters

* calculate_probabilities: False
* language: None
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: auto
* seed_topic_list: None
* top_n_words: 7
* verbose: True
* zeroshot_min_similarity: 0.7
* zeroshot_topic_list: None

## Framework versions

* Numpy: 1.26.4
* HDBSCAN: 0.8.39
* UMAP: 0.5.7
* Pandas: 2.2.3
* Scikit-Learn: 1.5.2
* Sentence-transformers: 3.2.1
* Transformers: 4.46.2
* Numba: 0.60.0
* Plotly: 5.24.1
* Python: 3.10.11
config.json
ADDED
@@ -0,0 +1,16 @@
{
  "calculate_probabilities": false,
  "language": null,
  "low_memory": false,
  "min_topic_size": 10,
  "n_gram_range": [
    1,
    1
  ],
  "nr_topics": "auto",
  "seed_topic_list": null,
  "top_n_words": 7,
  "verbose": true,
  "zeroshot_min_similarity": 0.7,
  "zeroshot_topic_list": null
}
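JSON has no tuple type, so `n_gram_range` is stored here as a two-element list, while the README lists the same hyperparameter as `(1, 1)`. A minimal sketch of reading the config back and restoring the tuple follows. The config text is embedded so the example is self-contained, and `load_hyperparameters` is a hypothetical helper rather than a BERTopic API.

```python
import json

# Contents of config.json from this commit, embedded for a self-contained example.
CONFIG_TEXT = """
{
  "calculate_probabilities": false,
  "language": null,
  "low_memory": false,
  "min_topic_size": 10,
  "n_gram_range": [1, 1],
  "nr_topics": "auto",
  "seed_topic_list": null,
  "top_n_words": 7,
  "verbose": true,
  "zeroshot_min_similarity": 0.7,
  "zeroshot_topic_list": null
}
"""


def load_hyperparameters(text: str) -> dict:
    """Parse the config and convert n_gram_range from a JSON list to a tuple."""
    params = json.loads(text)
    params["n_gram_range"] = tuple(params["n_gram_range"])
    return params


params = load_hyperparameters(CONFIG_TEXT)
print(params["n_gram_range"], params["nr_topics"], params["top_n_words"])
```

JSON `null`, `false`, and `true` map to Python `None`, `False`, and `True`, which matches the values listed under "Training hyperparameters" in the README.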
ctfidf.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5e5b220a78a344ddce307d187cf9b5aaea81fa4a75d1cdc5297aff5a5c1bdf86
size 24191708
ctfidf_config.json
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0e79605cb71093ea150df77e1805f36bcae52df3fe62cba91d2fbe5bb13560e0
size 49368777
topic_embeddings.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:343dc508e86b0a77a0973d8b99fe6881e062a2f38da73202f682bd1e33195c0c
size 184408
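The three files above are Git LFS pointers: short text stubs that record the spec version, a SHA-256 object ID, and the size in bytes of the real file. The sketch below parses one of them, using the `topic_embeddings.safetensors` pointer text from this commit. The helper `parse_lfs_pointer` is hypothetical and assumes the simple `key value` line layout seen here.

```python
# Pointer text for topic_embeddings.safetensors, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:343dc508e86b0a77a0973d8b99fe6881e062a2f38da73202f682bd1e33195c0c
size 184408
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into named fields."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid is written as '<hash algorithm>:<hex digest>'.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algorithm"], pointer["size_bytes"])
```

Read this way, the pointers show that `ctfidf_config.json` (49368777 bytes) is the largest tracked file in the commit, followed by `ctfidf.safetensors` (24191708 bytes) and `topic_embeddings.safetensors` (184408 bytes).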
topics.json
ADDED
The diff for this file is too large to render.