Commit c9a6a8e (verified), committed by ShivamSrng
1 Parent(s): 5fc8539

Fine-tuned Topic Model for aspects_to_improve column

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ ctfidf_config.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,129 @@
+
+ ---
+ tags:
+ - bertopic
+ library_name: bertopic
+ pipeline_tag: text-classification
+ ---
+
+ # after_covid_face_to_face_aspects_to_improve
+
+ This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
+ BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
+
+ ## Usage
+
+ To use this model, please install BERTopic:
+
+ ```
+ pip install -U bertopic
+ ```
+
+ You can use the model as follows:
+
+ ```python
+ from bertopic import BERTopic
+ topic_model = BERTopic.load("ShivamSrng/after_covid_face_to_face_aspects_to_improve")
+
+ topic_model.get_topic_info()
+ ```
+
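The README snippet only inspects the stored topics. To assign topics to new feedback text, the loaded model also needs an embedding model, and `config.json` in this commit does not name one. A minimal sketch, assuming you know which sentence-transformers model was used for training; the `assign_topics` helper is hypothetical, and a mismatched embedding model will give meaningless or failing assignments:

```python
from typing import List, Tuple

REPO_ID = "ShivamSrng/after_covid_face_to_face_aspects_to_improve"


def assign_topics(docs: List[str], embedding_model: str) -> List[Tuple[str, int]]:
    """Return (document, topic_id) pairs for new documents.

    `embedding_model` is an assumption: it should be the same
    sentence-transformers model the topic model was trained with,
    which this commit does not state.
    """
    # Imported lazily so the helper can be defined without bertopic installed.
    from bertopic import BERTopic

    # BERTopic.load accepts an embedding model to use for new documents.
    topic_model = BERTopic.load(REPO_ID, embedding_model=embedding_model)
    topics, _ = topic_model.transform(docs)
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]
```

Calling `assign_topics(["The lab equipment is outdated."], embedding_model="<training model name>")` downloads the model from the Hub, so it needs network access and the `bertopic` package.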
+ ## Topic overview
+
+ * Number of topics: 60
+ * Number of training documents: 76623
+
+ <details>
+ <summary>Click here for an overview of all topics.</summary>
+
+ | Topic ID | Topic Keywords | Topic Frequency | Label |
+ |----------|----------------|-----------------|-------|
+ | 0 | practice problems - grading - common exams - grades - improve | 53887 | 0_practice problems_grading_common exams_grades |
+ | 1 | lab reports - lab assignments - labs class - lab reports lab - class labs | 3550 | 1_lab reports_lab assignments_labs class_lab reports lab |
+ | 2 | ramblings cheers dallas - ramblings cheers dallas cowboys - cheers dallas cowboys year - lectures ramblings cheers dallas - cheers dallas cowboys | 522 | 2_ramblings cheers dallas_ramblings cheers dallas cowboys_cheers dallas cowboys year_lectures ramblings cheers dallas |
+ | 3 | class everything - class needs structure - helpful class way - purpose class - class needs | 508 | 3_class everything_class needs structure_helpful class way_purpose class |
+ | 4 | web development - softwares - technologies - implementation - data analytics | 503 | 4_web development_softwares_technologies_implementation |
+ | 5 | way misunderstands deliverable approach - misunderstands deliverable approach - way misunderstands deliverable - misunderstands deliverable - having entire role unfair | 486 | 5_way misunderstands deliverable approach_misunderstands deliverable approach_way misunderstands deliverable_misunderstands deliverable |
+ | 6 | research create highquality - case studies - economics ethics actual engineering - research - research create highquality scientificallyaccurate | 464 | 6_research create highquality_case studies_economics ethics actual engineering_research |
+ | 7 | calculators - calculator - calculations formulas - formula sheets - math formulas | 457 | 7_calculators_calculator_calculations formulas_formula sheets |
+ | 8 | teach matlab - teaching matlab - matlab taught class - students matlab - matlab taught | 445 | 8_teach matlab_teaching matlab_matlab taught class_students matlab |
+ | 9 | using creo - software - mylab mandatory installing - different software - vmware | 426 | 9_using creo_software_mylab mandatory installing_different software |
+ | 10 | material difficult - material challenging - materials better - approach materials - material needs | 393 | 10_material difficult_material challenging_materials better_approach materials |
+ | 11 | maybe movies - films - maybe mention oddities - maybe mention oddities life - film | 385 | 11_maybe movies_films_maybe mention oddities_maybe mention oddities life |
+ | 12 | class review exams - class review - review material exams - class review sessions - exam reviews | 383 | 12_class review exams_class review_review material exams_class review sessions |
+ | 13 | grade conclude keeps hostage - understand lab activity - understand lab activity approached - mentioned assigned lab activities - microscopes need repaired | 382 | 13_grade conclude keeps hostage_understand lab activity_understand lab activity approached_mentioned assigned lab activities |
+ | 14 | lab equipment outdated - better lab equipment - lab equipment better - equipment better lab - lab equipment old | 382 | 14_lab equipment outdated_better lab equipment_lab equipment better_equipment better lab |
+ | 15 | schedule felt disrespectful availability - schedule felt disrespectful - release schedule felt disrespectful - felt disrespectful availability advisor - disrespectful availability advisor | 381 | 15_schedule felt disrespectful availability_schedule felt disrespectful_release schedule felt disrespectful_felt disrespectful availability advisor |
+ | 16 | instructor studio critic - studio curriculum - studio classes - studio difficult - studio students | 379 | 16_instructor studio critic_studio curriculum_studio classes_studio difficult |
+ | 17 | prototyping - drawings projects - drawings class - model making - cad drawings | 378 | 17_prototyping_drawings projects_drawings class_model making |
+ | 18 | questions requires paying attention - understanding everything - understand asking based - understand asking based wording - homework problem word arbitrary | 373 | 18_questions requires paying attention_understanding everything_understand asking based_understand asking based wording |
+ | 19 | students software - students stay current remote - students access - provide students - available students | 371 | 19_students software_students stay current remote_students access_provide students |
+ | 20 | examples needs - examples help - examples help understand - examples helpful - need examples | 366 | 20_examples needs_examples help_examples help understand_examples helpful |
+ | 21 | class hard understand professor - everything class hard understand - everything professor - enjoyed class - great professor class | 362 | 21_class hard understand professor_everything class hard understand_everything professor_enjoyed class |
+ | 22 | vocareum terrible feedback wrong - vocareum terrible feedback - feedback wrong simply - having warning set instructor - simply having warning set | 358 | 22_vocareum terrible feedback wrong_vocareum terrible feedback_feedback wrong simply_having warning set instructor |
+ | 23 | industry projects - students looking internships jobs - career development - students looking internships - resumes | 356 | 23_industry projects_students looking internships jobs_career development_students looking internships |
+ | 24 | writing essays engineering - writing essays engineering topics - essays engineering topics - essays engineering - involved writing essays engineering | 354 | 24_writing essays engineering_writing essays engineering topics_essays engineering topics_essays engineering |
+ | 25 | architecture students - architecture taught - design courses - engineering design courses - design students | 353 | 25_architecture students_architecture taught_design courses_engineering design courses |
+ | 26 | office hours having time - office hours - office hours having - office hours nice independent - office hour | 353 | 26_office hours having time_office hours_office hours having_office hours nice independent |
+ | 27 | suggestions - suggestions think - suggestions right - suggestions right suggestions right - think suggestions | 347 | 27_suggestions_suggestions think_suggestions right_suggestions right suggestions right |
+ | 28 | physics classes - physics courses - mastering physics - students physics - physics class | 337 | 28_physics classes_physics courses_mastering physics_students physics |
+ | 29 | concepts realworld applications - examples realworld applications - real world application - practical examples - applications concepts | 335 | 29_concepts realworld applications_examples realworld applications_real world application_practical examples |
+ | 30 | class attendance policy - attendance policy - students attendance - way attendance taken - taking attendance | 331 | 30_class attendance policy_attendance policy_students attendance_way attendance taken |
+ | 31 | information felt waste time - information content - information felt waste - information harder - dense uninteresting | 331 | 31_information felt waste time_information content_information felt waste_information harder |
+ | 32 | chapter practice - learn chapters - learning chapters - learn chapter - homework chapters | 328 | 32_chapter practice_learn chapters_learning chapters_learn chapter |
+ | 33 | instruction manual needs clear - instructions manual - manual needs clear clearly - instruction manual - diagrams lab manual | 323 | 33_instruction manual needs clear_instructions manual_manual needs clear clearly_instruction manual |
+ | 34 | overwhelming workload - work workload - workload overwhelming - workload decrease - work load work | 319 | 34_overwhelming workload_work workload_workload overwhelming_workload decrease |
+ | 35 | respond classmates postings - forcing discussion - respond classmates postings following - think discussion - discussion posts | 318 | 35_respond classmates postings_forcing discussion_respond classmates postings following_think discussion |
+ | 36 | pearson online homework - pearson homework feel - pearson textbook - students pearson - homework assignments pearson | 317 | 36_pearson online homework_pearson homework feel_pearson textbook_students pearson |
+ | 37 | lecture hall difficult - lecture room - students room - class room - lecture hall class | 311 | 37_lecture hall difficult_lecture room_students room_class room |
+ | 38 | ebnf simply expanded years - simply expanded years pas - expanded years pas different - ebnf simply expanded - research ebnf simply expanded | 305 | 38_ebnf simply expanded years_simply expanded years pas_expanded years pas different_ebnf simply expanded |
+ | 39 | reading assignments week - multiple reading assignments week - reading assignments - readings boring - having multiple reading assignments | 304 | 39_reading assignments week_multiple reading assignments week_reading assignments_readings boring |
+ | 40 | opportunities active duty operations - active duty operations - opportunities active duty - duty operations person commissioning - operations person commissioning | 302 | 40_opportunities active duty operations_active duty operations_opportunities active duty_duty operations person commissioning |
+ | 41 | students notes - notes exams - class better notes - class notes - semester notes | 299 | 41_students notes_notes exams_class better notes_class notes |
+ | 42 | videos beneficial - video lectures - lesson videos - videos lectures - tutorial videos | 297 | 42_videos beneficial_video lectures_lesson videos_videos lectures |
+ | 43 | ideas discussed agreed jury - design ideas discussed agreed - ideas discussed - ideas discussed agreed - enjoyed everything | 297 | 43_ideas discussed agreed jury_design ideas discussed agreed_ideas discussed_ideas discussed agreed |
+ | 44 | time management - time spent section - minutes instead - minutes - minutes questions | 289 | 44_time management_time spent section_minutes instead_minutes |
+ | 45 | difficult difficulty level - difficult difficulty - tedious time consuming - difficult time easier - difficult | 287 | 45_difficult difficulty level_difficult difficulty_tedious time consuming_difficult time easier |
+ | 46 | design process - final critiques interior designers - design development - design work - design project | 287 | 46_design process_final critiques interior designers_design development_design work |
+ | 47 | paper research - choose research paper - write research paper - research writing - research paper write | 276 | 47_paper research_choose research paper_write research paper_research writing |
+ | 48 | nj making students feel - requirements nj making students - covering naab requirements nj - nj making students - covering naab requirements | 268 | 48_nj making students feel_requirements nj making students_covering naab requirements nj_nj making students |
+ | 49 | underneath studio desk avoid - dirty dusty students - studio spaces - studio space - underneath studio desk | 258 | 49_underneath studio desk avoid_dirty dusty students_studio spaces_studio space |
+ | 50 | chem taught - learning chem - chemistry students - courses chem - taking chemistry | 253 | 50_chem taught_learning chem_chemistry students_courses chem |
+ | 51 | pacing felt little fast - think pacing better - pacing little bit slow - bit quick pacing - possible pacing | 244 | 51_pacing felt little fast_think pacing better_pacing little bit slow_bit quick pacing |
+ | 52 | everything great - think everything great - everything everything great - everything alright - everything alright everything | 235 | 52_everything great_think everything great_everything everything great_everything alright |
+ | 53 | midterm using practice exam - prepare midterm - midterm using practice - assessments midterm format learn - practice midterm | 225 | 53_midterm using practice exam_prepare midterm_midterm using practice_assessments midterm format learn |
+ | 54 | recitation assignments taught - recitation assignments relate lecture - recitation class recitation - recitation teaching - recitation class | 216 | 54_recitation assignments taught_recitation assignments relate lecture_recitation class recitation_recitation teaching |
+ | 55 | think way opinion think - think way great - think works way - great way think - way great way think | 206 | 55_think way opinion think_think way great_think works way_great way think |
+ | 56 | critiques - nice far material review - nitpicking experience great - opinion aspects milano beautiful - opinion review nice | 182 | 56_critiques_nice far material review_nitpicking experience great_opinion aspects milano beautiful |
+ | 57 | pearson homework terrible - using pearson - using pearson homework - pearson homeworks - pearson homework horrible | 178 | 57_pearson homework terrible_using pearson_using pearson homework_pearson homeworks |
+ | 58 | personally change - think change best personally - change personally - change everything - personally change change | 173 | 58_personally change_think change best personally_change personally_change everything |
+ | 59 | current state think great - right think opinion great - right think great - think opinion great - think coruse great | 88 | 59_current state think great_right think opinion great_right think great_think opinion great |
+
+ </details>
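
The frequencies in the topic table are heavily skewed toward topic 0. A short sketch that parses two rows copied from the table and computes each topic's share of the 76623 training documents; the `parse_row` helper is illustrative and not part of BERTopic:

```python
from typing import Tuple

N_DOCS = 76623  # "Number of training documents" from the README

# Two rows copied verbatim from the topic overview table.
ROWS = [
    "| 0 | practice problems - grading - common exams - grades - improve | 53887 | 0_practice problems_grading_common exams_grades |",
    "| 1 | lab reports - lab assignments - labs class - lab reports lab - class labs | 3550 | 1_lab reports_lab assignments_labs class_lab reports lab |",
]


def parse_row(row: str) -> Tuple[int, int]:
    """Extract (topic_id, frequency) from one Markdown table row."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    return int(cells[0]), int(cells[2])


shares = {topic: freq / N_DOCS for topic, freq in map(parse_row, ROWS)}
for topic, share in shares.items():
    print(f"topic {topic}: {share:.1%}")
```

Roughly 70% of the documents fall in topic 0, so the other 59 topics together cover about 30% of the corpus. The 60 frequencies sum to exactly 76623, which suggests that no documents were left in BERTopic's outlier topic (-1).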
+
+ ## Training hyperparameters
+
+ * calculate_probabilities: False
+ * language: None
+ * low_memory: False
+ * min_topic_size: 10
+ * n_gram_range: (1, 1)
+ * nr_topics: auto
+ * seed_topic_list: None
+ * top_n_words: 7
+ * verbose: True
+ * zeroshot_min_similarity: 0.7
+ * zeroshot_topic_list: None
+
+ ## Framework versions
+
+ * Numpy: 1.26.4
+ * HDBSCAN: 0.8.39
+ * UMAP: 0.5.7
+ * Pandas: 2.2.3
+ * Scikit-Learn: 1.5.2
+ * Sentence-transformers: 3.2.1
+ * Transformers: 4.46.2
+ * Numba: 0.60.0
+ * Plotly: 5.24.1
+ * Python: 3.10.11
config.json ADDED
@@ -0,0 +1,16 @@
+ {
+   "calculate_probabilities": false,
+   "language": null,
+   "low_memory": false,
+   "min_topic_size": 10,
+   "n_gram_range": [
+     1,
+     1
+   ],
+   "nr_topics": "auto",
+   "seed_topic_list": null,
+   "top_n_words": 7,
+   "verbose": true,
+   "zeroshot_min_similarity": 0.7,
+   "zeroshot_topic_list": null
+ }
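
The keys in `config.json` correspond to keyword arguments of the `BERTopic` constructor, with one wrinkle: JSON has no tuple type, so `n_gram_range` is stored as a two-element list. A minimal sketch of turning the file's contents back into constructor-style keyword arguments; the `config_to_kwargs` helper is hypothetical, and `BERTopic.load` reads this file for you in normal use:

```python
import json
from typing import Any, Dict

# Contents of config.json as added in this commit.
CONFIG_TEXT = """
{
  "calculate_probabilities": false,
  "language": null,
  "low_memory": false,
  "min_topic_size": 10,
  "n_gram_range": [1, 1],
  "nr_topics": "auto",
  "seed_topic_list": null,
  "top_n_words": 7,
  "verbose": true,
  "zeroshot_min_similarity": 0.7,
  "zeroshot_topic_list": null
}
"""


def config_to_kwargs(text: str) -> Dict[str, Any]:
    """Parse the config and restore n_gram_range to a tuple."""
    params = json.loads(text)
    params["n_gram_range"] = tuple(params["n_gram_range"])
    return params


kwargs = config_to_kwargs(CONFIG_TEXT)
print(kwargs["n_gram_range"], kwargs["nr_topics"])
```

JSON `null`, `true`, and `false` come back as Python `None`, `True`, and `False`, matching the values listed under "Training hyperparameters" in the README.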
ctfidf.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5e5b220a78a344ddce307d187cf9b5aaea81fa4a75d1cdc5297aff5a5c1bdf86
+ size 24191708
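
What the diff shows for `ctfidf.safetensors` is a Git LFS pointer rather than the tensor data: three `key value` lines giving the spec version, the SHA-256 of the real file, and its size in bytes. A small sketch of parsing such a pointer; `parse_lfs_pointer` is an illustrative helper, not part of any Git LFS tooling:

```python
from typing import Dict, Union

POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5e5b220a78a344ddce307d187cf9b5aaea81fa4a75d1cdc5297aff5a5c1bdf86
size 24191708
"""


def parse_lfs_pointer(text: str) -> Dict[str, Union[str, int]]:
    """Split a pointer file into version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)
print(f"{info['size'] / 1e6:.1f} MB, {info['hash_algo']}")
```

The same layout applies to the other pointer files in this commit, including `ctfidf_config.json`, which the `.gitattributes` change routes through LFS.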
ctfidf_config.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e79605cb71093ea150df77e1805f36bcae52df3fe62cba91d2fbe5bb13560e0
+ size 49368777
topic_embeddings.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:343dc508e86b0a77a0973d8b99fe6881e062a2f38da73202f682bd1e33195c0c
+ size 184408
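
The size in this pointer also hints at the embedding dimension. A back-of-the-envelope sketch, assuming the file holds a single float32 tensor with one row per topic (60 rows) and that any remaining bytes are the safetensors header; both are assumptions, not something this commit states:

```python
FILE_SIZE = 184408       # bytes, from the topic_embeddings.safetensors pointer
N_TOPICS = 60            # "Number of topics" in the README
BYTES_PER_FLOAT32 = 4    # assumed dtype

# Common sentence-embedding widths to try (illustrative candidates only).
CANDIDATE_DIMS = [384, 512, 768, 1024]

# Bytes left over after the tensor payload, for each candidate width.
leftover = {
    dim: FILE_SIZE - N_TOPICS * dim * BYTES_PER_FLOAT32
    for dim in CANDIDATE_DIMS
}
for dim, extra in leftover.items():
    print(f"dim={dim}: {extra} bytes left for the header")
```

Of these candidates, only 768 leaves a small positive remainder (88 bytes), which is plausible for a header, so a 768-dimensional embedding model is the likeliest fit under these assumptions.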
topics.json ADDED
The diff for this file is too large to render. See raw diff