Commit e514f97 by ShivamSrng · verified · 1 Parent(s): 5246b87

Fine-tuned Topic Model for instructor_comments column

README.md ADDED
@@ -0,0 +1,132 @@
+
+ ---
+ tags:
+ - bertopic
+ library_name: bertopic
+ pipeline_tag: text-classification
+ ---
+
+ # after_covid_distance_learning_instructor_comments
+
+ This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
+ BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
+
+ ## Usage
+
+ To use this model, please install BERTopic:
+
+ ```
+ pip install -U bertopic
+ ```
+
+ You can use the model as follows:
+
+ ```python
+ from bertopic import BERTopic
+ topic_model = BERTopic.load("ShivamSrng/after_covid_distance_learning_instructor_comments")
+
+ topic_model.get_topic_info()
+ ```
+
+ ## Topic overview
+
+ * Number of topics: 63
+ * Number of training documents: 4673
+
+ <details>
+ <summary>Click here for an overview of all topics.</summary>
+
+ | Topic ID | Topic Keywords | Topic Frequency | Label |
+ |----------|----------------|-----------------|-------|
+ | 0 | lecture - lectures - class - classes - online class | 3612 | 0_lecture_lectures_class_classes |
+ | 1 | needs consideration students canvas - organized canvas discussion students - canvas announcement - canvas needs - needs revise canvas | 58 | 1_needs consideration students canvas_organized canvas discussion students_canvas announcement_canvas needs |
+ | 2 | negatively week general feedback - need improvement returned feedback - improvement returned feedback benefitted - inconsistent posting material feedback - posting material feedback thorough | 33 | 2_negatively week general feedback_need improvement returned feedback_improvement returned feedback benefitted_inconsistent posting material feedback |
+ | 3 | reading documentation watching youtube - learning basically - novice subject matter felt - python code learning - online related topics class | 28 | 3_reading documentation watching youtube_learning basically_novice subject matter felt_python code learning |
+ | 4 | provides responses - respond emails lin - respond day - nonresponsive questions - messages intend submit complaint | 28 | 4_provides responses_respond emails lin_respond day_nonresponsive questions |
+ | 5 | portion discussion students - pushed discussion questions students - portion discussion students copied - students effectively summarize - posts students effectively summarize | 27 | 5_portion discussion students_pushed discussion questions students_portion discussion students copied_students effectively summarize |
+ | 6 | quick respond knowledgeable - kind helpful excellent constructive - personality willing work knowledgeable - knowledgeable interested subject - provides resources answers emails | 27 | 6_quick respond knowledgeable_kind helpful excellent constructive_personality willing work knowledgeable_knowledgeable interested subject |
+ | 7 | group projects - participated group work - worst group projects educational - group project - participated group work enforced | 27 | 7_group projects_participated group work_worst group projects educational_group project |
+ | 8 | proctoring software - realizing different proctoring software - scrap paper online exam - scrap paper exams - proctoring software protectu | 25 | 8_proctoring software_realizing different proctoring software_scrap paper online exam_scrap paper exams |
+ | 9 | midterm exam disorganized - prepare midterm exam - midterm exam - midterm exam final exam - prepare midterm | 24 | 9_midterm exam disorganized_prepare midterm exam_midterm exam_midterm exam final exam |
+ | 10 | future students review - health future students review - health future students - opinion student student general - mental health future students | 23 | 10_future students review_health future students review_health future students_opinion student student general |
+ | 11 | weeks breaks courses - weeks breaks courses grab - order weeks breaks courses - great fun enjoyed challenging - fun enjoyed challenging | 23 | 11_weeks breaks courses_weeks breaks courses grab_order weeks breaks courses_great fun enjoyed challenging |
+ | 12 | mba courses groupwork required - practicality student thinking - practicality student - presentations business world career - presentations business world | 23 | 12_mba courses groupwork required_practicality student thinking_practicality student_presentations business world career |
+ | 13 | information hard read - fx difficult learn - information hard read everything - java fx difficult learn - fx difficult learn understand | 22 | 13_information hard read_fx difficult learn_information hard read everything_java fx difficult learn |
+ | 14 | words captured bad experience - investigates disrespectful attend - investigates disrespectful - holistic experience nightmare - investigates disrespectful attend guy | 22 | 14_words captured bad experience_investigates disrespectful attend_investigates disrespectful_holistic experience nightmare |
+ | 15 | mention personal attack intellect - mention personal attack - understanding explanation voice - points rude demeaning emails - losing points rude demeaning | 20 | 15_mention personal attack intellect_mention personal attack_understanding explanation voice_points rude demeaning emails |
+ | 16 | group discussions posted week - good usage weekly discussions - material discussed week - weekly announcement - month group discussions posted | 20 | 16_group discussions posted week_good usage weekly discussions_material discussed week_weekly announcement |
+ | 17 | punctual responding emailed - promptly responds emails - punctual responding emailed questions - needs respond emails faster - students timely manner available | 20 | 17_punctual responding emailed_promptly responds emails_punctual responding emailed questions_needs respond emails faster |
+ | 18 | level decomposition students future - level decomposition students - instructions needed confusion programs - algorithms - needed confusion programs | 19 | 18_level decomposition students future_level decomposition students_instructions needed confusion programs_algorithms |
+ | 19 | provided office hours - reminds students office hours - provided office hours week - office hours improvement - office hours availability | 19 | 19_provided office hours_reminds students office hours_provided office hours week_office hours improvement |
+ | 20 | organization class better - published promptly decent class - offering class little organization - somewhat class runs smoothly - interesting class class | 19 | 20_organization class better_published promptly decent class_offering class little organization_somewhat class runs smoothly |
+ | 21 | complaints negative comments minor - reviews true real complaint - notes complaints perfect - negative comments minor details - complaints negative | 19 | 21_complaints negative comments minor_reviews true real complaint_notes complaints perfect_negative comments minor details |
+ | 22 | learned withdraw class felt - major assignments returned feel - learned withdraw class - knowledge major assignments returned - possibly learned withdraw class | 19 | 22_learned withdraw class felt_major assignments returned feel_learned withdraw class_knowledge major assignments returned |
+ | 23 | feedback discussion - feedback discussion post - posts lectures provide feedback - instructors rare feedback discussion - liked direct feedback discussion | 19 | 23_feedback discussion_feedback discussion post_posts lectures provide feedback_instructors rare feedback discussion |
+ | 24 | valuable worthwhile case studies - understanding technologies gave tools - valuable reading - projects zybooks valuable reading - introduce industry experiencesexamples relate | 18 | 24_valuable worthwhile case studies_understanding technologies gave tools_valuable reading_projects zybooks valuable reading |
+ | 25 | thank great semester thank - thank great semester - thank amazing semester pleasure - thank great semester hope - thank awesome semester | 17 | 25_thank great semester thank_thank great semester_thank amazing semester pleasure_thank great semester hope |
+ | 26 | making tests exams midday - obligations exam days - obligations exam days saturday - open til midnight exam - works tests earlier day | 17 | 26_making tests exams midday_obligations exam days_obligations exam days saturday_open til midnight exam |
+ | 27 | massive dataset images - massive dataset images label - provide images tasks - provide images tasks weeks - provided descriptions bunch images | 17 | 27_massive dataset images_massive dataset images label_provide images tasks_provide images tasks weeks |
+ | 28 | past semester emailed asking - respond emails meet office - respond emails meet - online diligent responses emails - reachable email responses frequently | 17 | 28_past semester emailed asking_respond emails meet office_respond emails meet_online diligent responses emails |
+ | 29 | honest valid shows care - fail demoralizing - means fail demoralizing - xf means fail demoralizing - means fail demoralizing failed | 17 | 29_honest valid shows care_fail demoralizing_means fail demoralizing_xf means fail demoralizing |
+ | 30 | quicker response times emails - replies emails quickly helpful - response times emails appreciated - reason replies emails quickly - takes long time respond | 16 | 30_quicker response times emails_replies emails quickly helpful_response times emails appreciated_reason replies emails quickly |
+ | 31 | paul ranky needs understand - questioned critics personal - understand learn questioned critics - questioned critics - questioned critics personal opinions | 16 | 31_paul ranky needs understand_questioned critics personal_understand learn questioned critics_questioned critics |
+ | 32 | question told read lectures - lectures notes hoping - receive better grade responded - read lectures notes hoping - lectures notes | 16 | 32_question told read lectures_lectures notes hoping_receive better grade responded_read lectures notes hoping |
+ | 33 | offer future ethical analysis - future ethical analysis graded - important ethics society reason - theme thought topics discussed - important ethics | 15 | 33_offer future ethical analysis_future ethical analysis graded_important ethics society reason_theme thought topics discussed |
+ | 34 | review videos week helpful - review videos - software helpful review videos - review videos week - provide powerpoints accompany videos | 15 | 34_review videos week helpful_review videos_software helpful review videos_review videos week |
+ | 35 | lectures unnecessarily long - materials lectures unnecessarily long - lengthy lectures design - lengthy lectures design class - lengthy lectures | 15 | 35_lectures unnecessarily long_materials lectures unnecessarily long_lengthy lectures design_lengthy lectures design class |
+ | 36 | apologize late gratitude - late gratitude thank experience - late gratitude - thankfulness appreciated helpful happy - thank apologize late gratitude | 15 | 36_apologize late gratitude_late gratitude thank experience_late gratitude_thankfulness appreciated helpful happy |
+ | 37 | prevent cheating far excessive - thought process prevent cheating - prevent cheating far - prevent cheating - cheating far excessive | 15 | 37_prevent cheating far excessive_thought process prevent cheating_prevent cheating far_prevent cheating |
+ | 38 | student minute planning lack - job takes priority involvement - student minute planning - office hours work student - rid hire professors daher | 15 | 38_student minute planning lack_job takes priority involvement_student minute planning_office hours work student |
+ | 39 | waiting assignments graded october - notification delay october homeworks - midterm grades takes weeks - october homeworks everything - instead december waiting assignments | 14 | 39_waiting assignments graded october_notification delay october homeworks_midterm grades takes weeks_october homeworks everything |
+ | 40 | skills allowed great learning - learned write better essays - job learned far - learned helps current job - new skills improved writing | 14 | 40_skills allowed great learning_learned write better essays_job learned far_learned helps current job |
+ | 41 | sense logical perspective unquantifiable - unquantifiable scope knowledge presented - solid foundation complete falsehood - sense logical perspective - unemployment rate struggling clear | 14 | 41_sense logical perspective unquantifiable_unquantifiable scope knowledge presented_solid foundation complete falsehood_sense logical perspective |
+ | 42 | feels weird leave blank - leave blank add feels - weird leave blank - blank using stuff future - blank using stuff | 14 | 42_feels weird leave blank_leave blank add feels_weird leave blank_blank using stuff future |
+ | 43 | needs canvas - needs canvas instead - website listed canvas syllabus - operate canvas knowledge - listed canvas syllabus | 13 | 43_needs canvas_needs canvas instead_website listed canvas syllabus_operate canvas knowledge |
+ | 44 | issues forcing discussion - discussion posts - issues forcing discussion posts - way discussions smoothly opinion - discussion posts feel | 13 | 44_issues forcing discussion_discussion posts_issues forcing discussion posts_way discussions smoothly opinion |
+ | 45 | videos watch supposed teach - videos years ago lecture - years ago lecture videos - resources pushed burden teaching - teach helpful stuff | 13 | 45_videos watch supposed teach_videos years ago lecture_years ago lecture videos_resources pushed burden teaching |
+ | 46 | prompt chat gpt scarily - voicethread students chat gpt - prompt chat - prompt chat gpt - chat gpt | 12 | 46_prompt chat gpt scarily_voicethread students chat gpt_prompt chat_prompt chat gpt |
+ | 47 | personally interacted - interacted class online contact - semester actively messaged personally - smoothly semester liked interact - personally interacted class online | 12 | 47_personally interacted_interacted class online contact_semester actively messaged personally_smoothly semester liked interact |
+ | 48 | sporadically posted day deadline - team members posted - studio taken away - team members posted spoke - recordings shared sessions submit | 12 | 48_sporadically posted day deadline_team members posted_studio taken away_team members posted spoke |
+ | 49 | prompt questions felt misleading - questions felt misleading - questions felt misleading answer - confusing roundabout responses - surprised confusing roundabout responses | 12 | 49_prompt questions felt misleading_questions felt misleading_questions felt misleading answer_confusing roundabout responses |
+ | 50 | reaching regarding issues received - regarding issues received response - regarding issues received - sent emails asking help - tried reaching regarding | 12 | 50_reaching regarding issues received_regarding issues received response_regarding issues received_sent emails asking help |
+ | 51 | great ability communicate deliver - meetings great ability communicate - ultimately effective teaching - lessons promt effective communicating - great ability communicate | 12 | 51_great ability communicate deliver_meetings great ability communicate_ultimately effective teaching_lessons promt effective communicating |
+ | 52 | letters help emphasize point - emphasize point - helps completing homework - capitalized letters help emphasize - help emphasize point | 10 | 52_letters help emphasize point_emphasize point_helps completing homework_capitalized letters help emphasize |
+ | 53 | recorded videos professors class - videos professors class - video lectures includes recorded - provided recorded lectures video - uploaded video lectures | 10 | 53_recorded videos professors class_videos professors class_video lectures includes recorded_provided recorded lectures video |
+ | 54 | time waisted creating website - students supposed create website - supposed create website scratch - website provided build structure - spent hours receive project | 9 | 54_time waisted creating website_students supposed create website_supposed create website scratch_website provided build structure |
+ | 55 | core java covered effective - core java covered - spring version class cybersecurity - version class cybersecurity - spring version | 9 | 55_core java covered effective_core java covered_spring version class cybersecurity_version class cybersecurity |
+ | 56 | thank hard work - thank hard work good - good work sir - great work good work - great work good | 9 | 56_thank hard work_thank hard work good_good work sir_great work good work |
+ | 57 | instruction yoo interacted sharma - sharma knowledge confirmed cheating - comments related directly sharma - related directly sharma yoo - yoo interacted sharma knowledge | 9 | 57_instruction yoo interacted sharma_sharma knowledge confirmed cheating_comments related directly sharma_related directly sharma yoo |
+ | 58 | believe doctor longer instruct - doctor longer instruct think - doctor longer instruct - lipuma interactive kept students - kept students aware | 8 | 58_believe doctor longer instruct_doctor longer instruct think_doctor longer instruct_lipuma interactive kept students |
+ | 59 | employer paying degree enjoy - paying degree enjoy - jobs pay education - paying degree - school work fulltime | 8 | 59_employer paying degree enjoy_paying degree enjoy_jobs pay education_paying degree |
+ | 60 | quality admiration senior engineer - state econ phds compliment - iowa state econ phds - making tenure - tenure track applicable | 7 | 60_quality admiration senior engineer_state econ phds compliment_iowa state econ phds_making tenure |
+ | 61 | discussion post lost points - earn points discussion boards - earn points discussion - word discussion post lost - discussion post lost | 6 | 61_discussion post lost points_earn points discussion boards_earn points discussion_word discussion post lost |
+ | 62 | rebuttal process - rebuttal process mentioned - rebuttal process mentioned said - rebuttal process safeguard - perception rebuttal process safeguard | 4 | 62_rebuttal process_rebuttal process mentioned_rebuttal process mentioned said_rebuttal process safeguard |
+
+ </details>
+
+ ## Training hyperparameters
+
+ * calculate_probabilities: False
+ * language: None
+ * low_memory: False
+ * min_topic_size: 10
+ * n_gram_range: (1, 1)
+ * nr_topics: auto
+ * seed_topic_list: None
+ * top_n_words: 7
+ * verbose: True
+ * zeroshot_min_similarity: 0.7
+ * zeroshot_topic_list: None
+
+ ## Framework versions
+
+ * Numpy: 1.26.4
+ * HDBSCAN: 0.8.39
+ * UMAP: 0.5.7
+ * Pandas: 2.2.3
+ * Scikit-Learn: 1.5.2
+ * Sentence-transformers: 3.2.1
+ * Transformers: 4.46.2
+ * Numba: 0.60.0
+ * Plotly: 5.24.1
+ * Python: 3.10.11
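As a quick consistency check on the README's topic overview, the sketch below sums the Topic Frequency column and rebuilds one Label from its keywords. The frequencies are copied from the table. The helper name `build_label` and the rule it encodes (topic ID followed by the first four keywords, joined with underscores) are assumptions inferred from the rows shown, not taken from the BERTopic source.

```python
# Topic frequencies copied from the README table, in topic-ID order (0..62).
TOPIC_FREQUENCIES = [
    3612, 58, 33, 28, 28, 27, 27, 27, 25, 24,
    23, 23, 23, 22, 22, 20, 20, 20, 19, 19,
    19, 19, 19, 19, 18, 17, 17, 17, 17, 17,
    16, 16, 16, 15, 15, 15, 15, 15, 15, 14,
    14, 14, 14, 13, 13, 13, 12, 12, 12, 12,
    12, 12, 10, 10, 9, 9, 9, 9, 8, 8,
    7, 6, 4,
]

# Values stated in the README's "Topic overview" section.
EXPECTED_TOPICS = 63
EXPECTED_DOCUMENTS = 4673


def build_label(topic_id, keywords, n_words=4):
    """Join the topic ID and the first n_words keywords with underscores.

    Hypothetical helper: the rule is inferred from the table rows only.
    """
    return "_".join([str(topic_id)] + list(keywords[:n_words]))


total_documents = sum(TOPIC_FREQUENCIES)
largest_share = TOPIC_FREQUENCIES[0] / total_documents

# Keywords for topic 0 as printed in the table, separated by " - ".
topic_0_keywords = "lecture - lectures - class - classes - online class".split(" - ")
topic_0_label = build_label(0, topic_0_keywords)

print(f"topics: {len(TOPIC_FREQUENCIES)}, documents: {total_documents}")
print(f"topic 0 share: {largest_share:.1%}")
print(f"topic 0 label: {topic_0_label}")
```

The frequencies add up to the stated 4673 training documents, so every document falls into one of the 63 listed topics. Topic 0 alone holds roughly 77% of them, which is worth keeping in mind when reading the much smaller topics that follow.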
config.json ADDED
@@ -0,0 +1,16 @@
+ {
+   "calculate_probabilities": false,
+   "language": null,
+   "low_memory": false,
+   "min_topic_size": 10,
+   "n_gram_range": [
+     1,
+     1
+   ],
+   "nr_topics": "auto",
+   "seed_topic_list": null,
+   "top_n_words": 7,
+   "verbose": true,
+   "zeroshot_min_similarity": 0.7,
+   "zeroshot_topic_list": null
+ }
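The `config.json` above mirrors the "Training hyperparameters" list in the README, with JSON types in place of Python ones (`null` for `None`, a two-element array for the `(1, 1)` tuple). A minimal sketch of reading it back with the standard library follows. The function name `config_to_kwargs` is hypothetical, and whether every key is accepted unchanged by the `BERTopic` constructor in a given version is an assumption to verify; the code does not import `bertopic`.

```python
import json

# Contents of config.json as added in this commit.
CONFIG_JSON = """
{
  "calculate_probabilities": false,
  "language": null,
  "low_memory": false,
  "min_topic_size": 10,
  "n_gram_range": [1, 1],
  "nr_topics": "auto",
  "seed_topic_list": null,
  "top_n_words": 7,
  "verbose": true,
  "zeroshot_min_similarity": 0.7,
  "zeroshot_topic_list": null
}
"""


def config_to_kwargs(text):
    """Parse the JSON config and restore Python-native types.

    Hypothetical helper, not part of BERTopic.
    """
    config = json.loads(text)
    # JSON has no tuple type, so the n-gram range arrives as a list.
    config["n_gram_range"] = tuple(config["n_gram_range"])
    return config


kwargs = config_to_kwargs(CONFIG_JSON)
for name in sorted(kwargs):
    print(f"{name}: {kwargs[name]!r}")
```

Printed this way, the values line up one-to-one with the README list, which makes it easy to spot drift between the two files in later commits.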
ctfidf.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:aa3838b9c3164acf7bd2e49ddbf3166f49177a7325936c4f57222ca898b3ae0b
+ size 1753632
ctfidf_config.json ADDED
The diff for this file is too large to render. See raw diff
 
topic_embeddings.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2b30941508ed7f7811637ac363ba0099672a2a00957beb9bc1a333904cbfa485
+ size 193624
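The two `.safetensors` entries in this commit are Git LFS pointer files, not the tensors themselves: each records the spec version, a SHA-256 object ID, and the size in bytes of the real file. A minimal sketch of parsing them is shown below, assuming the three-line `key value` layout seen above. The function name `parse_lfs_pointer` and the returned dictionary keys are hypothetical.

```python
# Pointer contents copied from this commit, without the diff "+" prefix.
POINTERS = {
    "ctfidf.safetensors": (
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:aa3838b9c3164acf7bd2e49ddbf3166f49177a7325936c4f57222ca898b3ae0b\n"
        "size 1753632\n"
    ),
    "topic_embeddings.safetensors": (
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:2b30941508ed7f7811637ac363ba0099672a2a00957beb9bc1a333904cbfa485\n"
        "size 193624\n"
    ),
}


def parse_lfs_pointer(text):
    """Split a pointer file into version, hash algorithm, digest, and size.

    Hypothetical helper for illustration only.
    """
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", split on the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


parsed = {name: parse_lfs_pointer(text) for name, text in POINTERS.items()}
for name, info in parsed.items():
    print(f"{name}: {info['hash_algo']} digest, {info['size']} bytes")
```

After downloading the actual files, the recorded size and digest can be compared against the local copy (for example with `hashlib.sha256`) to confirm the download is intact.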
topics.json ADDED
The diff for this file is too large to render. See raw diff