Commit 476ef25 by Thang203 · verified · 1 parent: 46c5f14

Add BERTopic model

Files changed (6)
  1. README.md +79 -0
  2. config.json +17 -0
  3. ctfidf.bin +3 -0
  4. ctfidf_config.json +0 -0
  5. topic_embeddings.bin +3 -0
  6. topics.json +2310 -0
README.md ADDED
@@ -0,0 +1,79 @@
+
+ ---
+ tags:
+ - bertopic
+ library_name: bertopic
+ pipeline_tag: text-classification
+ ---
+
+ # industry-mar11Top10
+
+ This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
+ BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
+
+ ## Usage
+
+ To use this model, please install BERTopic:
+
+ ```
+ pip install -U bertopic
+ ```
+
+ You can use the model as follows:
+
+ ```python
+ from bertopic import BERTopic
+ topic_model = BERTopic.load("Thang203/industry-mar11Top10")
+
+ topic_model.get_topic_info()
+ ```
+
+ ## Topic overview
+
+ * Number of topics: 10
+ * Number of training documents: 516
+
+ <details>
+ <summary>Click here for an overview of all topics.</summary>
+
+ | Topic ID | Topic Keywords | Topic Frequency | Label |
+ |----------|----------------|-----------------|-------|
+ | -1 | models - language - data - large - language models | 15 | -1_models_language_data_large |
+ | 0 | models - model - language - training - language models | 169 | 0_models_model_language_training |
+ | 1 | code - language - models - llms - programming | 118 | 1_code_language_models_llms |
+ | 2 | ai - models - language - dialogue - human | 49 | 2_ai_models_language_dialogue |
+ | 3 | detection - models - text - language - model | 47 | 3_detection_models_text_language |
+ | 4 | multimodal - visual - image - models - generation | 32 | 4_multimodal_visual_image_models |
+ | 5 | agents - language - policy - learning - tasks | 24 | 5_agents_language_policy_learning |
+ | 6 | speech - asr - text - speaker - recognition | 22 | 6_speech_asr_text_speaker |
+ | 7 | reasoning - cot - models - problems - commonsense | 21 | 7_reasoning_cot_models_problems |
+ | 8 | retrieval - information - query - llms - models | 19 | 8_retrieval_information_query_llms |
+
+ </details>
+
+ ## Training hyperparameters
+
+ * calculate_probabilities: False
+ * language: english
+ * low_memory: False
+ * min_topic_size: 10
+ * n_gram_range: (1, 1)
+ * nr_topics: 10
+ * seed_topic_list: None
+ * top_n_words: 10
+ * verbose: True
+ * zeroshot_min_similarity: 0.7
+ * zeroshot_topic_list: None
+
+ ## Framework versions
+
+ * Numpy: 1.25.2
+ * HDBSCAN: 0.8.33
+ * UMAP: 0.5.5
+ * Pandas: 1.5.3
+ * Scikit-Learn: 1.2.2
+ * Sentence-transformers: 2.6.1
+ * Transformers: 4.38.2
+ * Numba: 0.58.1
+ * Plotly: 5.15.0
+ * Python: 3.10.12
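Note on the README topic table: its Topic Frequency column can be cross-checked against the `topic_sizes` object in `topics.json` from this same commit. The sketch below is a minimal check and is not part of the commit. Both dictionaries are copied by hand from this diff, and the helper name `mismatched_topics` is made up for illustration.

```python
# Topic ID -> Topic Frequency, as listed in the README topic table.
readme_freq = {-1: 15, 0: 169, 1: 118, 2: 49, 3: 47,
               4: 32, 5: 24, 6: 22, 7: 21, 8: 19}

# Topic ID -> size, as stored under "topic_sizes" in topics.json.
json_sizes = {0: 118, 3: 32, 6: 21, -1: 169, 1: 49,
              7: 19, 8: 15, 2: 47, 5: 22, 4: 24}


def mismatched_topics(a, b):
    """Return the sorted topic IDs whose counts differ between a and b."""
    return sorted(t for t in a if a[t] != b.get(t))


total_readme = sum(readme_freq.values())
total_json = sum(json_sizes.values())
same_counts = sorted(readme_freq.values()) == sorted(json_sizes.values())

print("README total:", total_readme)
print("topics.json total:", total_json)
print("same set of counts:", same_counts)
print("topic IDs that disagree:", mismatched_topics(readme_freq, json_sizes))
```

Both sources list the same ten counts, and in each case they add up to the 516 training documents, but they attach the counts to different topic IDs (for example, topic -1 has 15 documents in the README and 169 in `topics.json`). The diff alone does not show which assignment is correct; counting the entries of the `topics` array in `topics.json` would be one way to settle it.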
config.json ADDED
@@ -0,0 +1,17 @@
+ {
+ "calculate_probabilities": false,
+ "language": "english",
+ "low_memory": false,
+ "min_topic_size": 10,
+ "n_gram_range": [
+ 1,
+ 1
+ ],
+ "nr_topics": 10,
+ "seed_topic_list": null,
+ "top_n_words": 10,
+ "verbose": true,
+ "zeroshot_min_similarity": 0.7,
+ "zeroshot_topic_list": null,
+ "embedding_model": "sentence-transformers/all-MiniLM-L6-v2"
+ }
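Note on `config.json`: the README lists `n_gram_range` as a tuple, `(1, 1)`, while this file stores it as a JSON array and additionally records the embedding model. The following is a minimal sketch of reading the file with the standard library. How BERTopic itself consumes this file is not shown in the diff, and the variable names are illustrative.

```python
import json

# Contents of config.json as added in this commit.
CONFIG_TEXT = """
{
  "calculate_probabilities": false,
  "language": "english",
  "low_memory": false,
  "min_topic_size": 10,
  "n_gram_range": [1, 1],
  "nr_topics": 10,
  "seed_topic_list": null,
  "top_n_words": 10,
  "verbose": true,
  "zeroshot_min_similarity": 0.7,
  "zeroshot_topic_list": null,
  "embedding_model": "sentence-transformers/all-MiniLM-L6-v2"
}
"""

config = json.loads(CONFIG_TEXT)

# JSON has no tuple type, so the (1, 1) shown in the README arrives as a list.
n_gram_range = tuple(config["n_gram_range"])

# The embedding model appears in config.json but not in the README's
# hyperparameter list.
embedding_model = config["embedding_model"]

print(n_gram_range, embedding_model)
```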
ctfidf.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:38c1cb28578b70f76de2256777d4c7fc0fab248ed2622c44a272196d2941f2a2
+ size 318275
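Note on the `.bin` files: what the diff shows is a Git LFS pointer, not the binary itself. Each line is a key, a space, and a value. Below is a minimal parsing sketch that assumes only that layout. The function name `parse_lfs_pointer` is hypothetical, and this is not how any particular Git LFS client is implemented.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid is written as '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# Pointer for ctfidf.bin, copied from this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:38c1cb28578b70f76de2256777d4c7fc0fab248ed2622c44a272196d2941f2a2
size 318275
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["hash_algo"], pointer["size"])
```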
ctfidf_config.json ADDED
The diff for this file is too large to render. See raw diff
 
topic_embeddings.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9d7790c9c90b44e922f3a8e7d4523063e77100c457e5a33ca962e096a838c42a
+ size 16649
topics.json ADDED
@@ -0,0 +1,2310 @@
1
+ {
2
+ "topic_representations": {
3
+ "-1": [
4
+ [
5
+ "models",
6
+ 0.036874579738434304
7
+ ],
8
+ [
9
+ "language",
10
+ 0.031011734360675242
11
+ ],
12
+ [
13
+ "data",
14
+ 0.02740357251248468
15
+ ],
16
+ [
17
+ "large",
18
+ 0.024331696551107916
19
+ ],
20
+ [
21
+ "language models",
22
+ 0.02287739800299974
23
+ ],
24
+ [
25
+ "model",
26
+ 0.02123690372233833
27
+ ],
28
+ [
29
+ "tasks",
30
+ 0.02117889409597425
31
+ ],
32
+ [
33
+ "llms",
34
+ 0.020210440809796944
35
+ ],
36
+ [
37
+ "large language",
38
+ 0.019999417196753248
39
+ ],
40
+ [
41
+ "large language models",
42
+ 0.019126572684958956
43
+ ]
44
+ ],
45
+ "0": [
46
+ [
47
+ "models",
48
+ 0.03888243759552385
49
+ ],
50
+ [
51
+ "model",
52
+ 0.03647492283412293
53
+ ],
54
+ [
55
+ "language",
56
+ 0.03613590283186468
57
+ ],
58
+ [
59
+ "training",
60
+ 0.025581428828302905
61
+ ],
62
+ [
63
+ "language models",
64
+ 0.02386262298037925
65
+ ],
66
+ [
67
+ "tasks",
68
+ 0.02360941221543806
69
+ ],
70
+ [
71
+ "data",
72
+ 0.021604280018978572
73
+ ],
74
+ [
75
+ "performance",
76
+ 0.021213047327713713
77
+ ],
78
+ [
79
+ "large",
80
+ 0.020365016161611835
81
+ ],
82
+ [
83
+ "method",
84
+ 0.01788214168631935
85
+ ]
86
+ ],
87
+ "1": [
88
+ [
89
+ "code",
90
+ 0.08112439886630912
91
+ ],
92
+ [
93
+ "language",
94
+ 0.03515934823155083
95
+ ],
96
+ [
97
+ "models",
98
+ 0.034093014905089085
99
+ ],
100
+ [
101
+ "llms",
102
+ 0.03351276274167474
103
+ ],
104
+ [
105
+ "programming",
106
+ 0.03221809114638236
107
+ ],
108
+ [
109
+ "software",
110
+ 0.024215765671622126
111
+ ],
112
+ [
113
+ "language models",
114
+ 0.023501871498181743
115
+ ],
116
+ [
117
+ "tasks",
118
+ 0.021362088649701006
119
+ ],
120
+ [
121
+ "model",
122
+ 0.021028623583260922
123
+ ],
124
+ [
125
+ "large language",
126
+ 0.020242713470511334
127
+ ]
128
+ ],
129
+ "2": [
130
+ [
131
+ "ai",
132
+ 0.03748085558879784
133
+ ],
134
+ [
135
+ "models",
136
+ 0.032123956517937674
137
+ ],
138
+ [
139
+ "language",
140
+ 0.030708509906927736
141
+ ],
142
+ [
143
+ "dialogue",
144
+ 0.02863305325688509
145
+ ],
146
+ [
147
+ "human",
148
+ 0.027796744355540557
149
+ ],
150
+ [
151
+ "llms",
152
+ 0.027095383693882993
153
+ ],
154
+ [
155
+ "chatgpt",
156
+ 0.02427426857972807
157
+ ],
158
+ [
159
+ "large language",
160
+ 0.024177158942537805
161
+ ],
162
+ [
163
+ "large",
164
+ 0.023491817699557018
165
+ ],
166
+ [
167
+ "model",
168
+ 0.022240448993628016
169
+ ]
170
+ ],
171
+ "3": [
172
+ [
173
+ "detection",
174
+ 0.04600933370915614
175
+ ],
176
+ [
177
+ "models",
178
+ 0.0376182869533305
179
+ ],
180
+ [
181
+ "text",
182
+ 0.03622151327830574
183
+ ],
184
+ [
185
+ "language",
186
+ 0.03555056937300613
187
+ ],
188
+ [
189
+ "model",
190
+ 0.02910562167494557
191
+ ],
192
+ [
193
+ "large",
194
+ 0.026737322113278325
195
+ ],
196
+ [
197
+ "language models",
198
+ 0.026260255642963005
199
+ ],
200
+ [
201
+ "misinformation",
202
+ 0.022438367434259674
203
+ ],
204
+ [
205
+ "dataset",
206
+ 0.021178404179731523
207
+ ],
208
+ [
209
+ "large language",
210
+ 0.020266242724238725
211
+ ]
212
+ ],
213
+ "4": [
214
+ [
215
+ "multimodal",
216
+ 0.06377037276103617
217
+ ],
218
+ [
219
+ "visual",
220
+ 0.0609342279209814
221
+ ],
222
+ [
223
+ "image",
224
+ 0.05031813021481461
225
+ ],
226
+ [
227
+ "models",
228
+ 0.04428945209100523
229
+ ],
230
+ [
231
+ "generation",
232
+ 0.03866971167435956
233
+ ],
234
+ [
235
+ "video",
236
+ 0.03452530411071284
237
+ ],
238
+ [
239
+ "understanding",
240
+ 0.03174883479055843
241
+ ],
242
+ [
243
+ "large",
244
+ 0.02994331997174661
245
+ ],
246
+ [
247
+ "model",
248
+ 0.027842071361726516
249
+ ],
250
+ [
251
+ "instruction",
252
+ 0.02744625284444433
253
+ ]
254
+ ],
255
+ "5": [
256
+ [
257
+ "agents",
258
+ 0.032621488861863626
259
+ ],
260
+ [
261
+ "language",
262
+ 0.032046686285534975
263
+ ],
264
+ [
265
+ "policy",
266
+ 0.031585563861493055
267
+ ],
268
+ [
269
+ "learning",
270
+ 0.030550747755560888
271
+ ],
272
+ [
273
+ "tasks",
274
+ 0.029270078392980483
275
+ ],
276
+ [
277
+ "llms",
278
+ 0.028067175067745524
279
+ ],
280
+ [
281
+ "agent",
282
+ 0.026011640827111927
283
+ ],
284
+ [
285
+ "games",
286
+ 0.025255064827310037
287
+ ],
288
+ [
289
+ "knowledge",
290
+ 0.02496878818528055
291
+ ],
292
+ [
293
+ "model",
294
+ 0.024630611822384848
295
+ ]
296
+ ],
297
+ "6": [
298
+ [
299
+ "speech",
300
+ 0.12032183461065618
301
+ ],
302
+ [
303
+ "asr",
304
+ 0.0784134014691984
305
+ ],
306
+ [
307
+ "text",
308
+ 0.04816267150192302
309
+ ],
310
+ [
311
+ "speaker",
312
+ 0.04549115752552982
313
+ ],
314
+ [
315
+ "recognition",
316
+ 0.044013060675693126
317
+ ],
318
+ [
319
+ "speech recognition",
320
+ 0.03480823666083872
321
+ ],
322
+ [
323
+ "model",
324
+ 0.0329226249448169
325
+ ],
326
+ [
327
+ "language",
328
+ 0.031171151406766243
329
+ ],
330
+ [
331
+ "voice",
332
+ 0.030863819919231247
333
+ ],
334
+ [
335
+ "proposed",
336
+ 0.029531042059903895
337
+ ]
338
+ ],
339
+ "7": [
340
+ [
341
+ "reasoning",
342
+ 0.09733768593924219
343
+ ],
344
+ [
345
+ "cot",
346
+ 0.04159609177483568
347
+ ],
348
+ [
349
+ "models",
350
+ 0.04032110830244759
351
+ ],
352
+ [
353
+ "problems",
354
+ 0.038531107231743966
355
+ ],
356
+ [
357
+ "commonsense",
358
+ 0.0328390198222387
359
+ ],
360
+ [
361
+ "language",
362
+ 0.03061562593615061
363
+ ],
364
+ [
365
+ "prompting",
366
+ 0.03050017742462947
367
+ ],
368
+ [
369
+ "language models",
370
+ 0.028282815332533393
371
+ ],
372
+ [
373
+ "math",
374
+ 0.026470858073982147
375
+ ],
376
+ [
377
+ "chainofthought",
378
+ 0.026470858073982147
379
+ ]
380
+ ],
381
+ "8": [
382
+ [
383
+ "retrieval",
384
+ 0.05391749257643426
385
+ ],
386
+ [
387
+ "information",
388
+ 0.041311727463775545
389
+ ],
390
+ [
391
+ "query",
392
+ 0.03998637165786005
393
+ ],
394
+ [
395
+ "llms",
396
+ 0.0360048263616992
397
+ ],
398
+ [
399
+ "models",
400
+ 0.03235786882267994
401
+ ],
402
+ [
403
+ "language",
404
+ 0.03201012649638935
405
+ ],
406
+ [
407
+ "queries",
408
+ 0.031828706522162444
409
+ ],
410
+ [
411
+ "language models",
412
+ 0.02804152194835136
413
+ ],
414
+ [
415
+ "large",
416
+ 0.026588466396316807
417
+ ],
418
+ [
419
+ "knowledge",
420
+ 0.02430262486413176
421
+ ]
422
+ ]
423
+ },
424
+ "topics": [
425
+ 0,
426
+ 3,
427
+ 6,
428
+ -1,
429
+ 0,
430
+ -1,
431
+ 0,
432
+ 0,
433
+ -1,
434
+ 0,
435
+ 0,
436
+ -1,
437
+ 0,
438
+ 0,
439
+ 1,
440
+ -1,
441
+ 0,
442
+ -1,
443
+ -1,
444
+ 7,
445
+ -1,
446
+ 0,
447
+ 0,
448
+ -1,
449
+ 0,
450
+ 8,
451
+ 0,
452
+ -1,
453
+ -1,
454
+ 2,
455
+ 2,
456
+ 8,
457
+ 0,
458
+ 2,
459
+ 0,
460
+ 0,
461
+ 5,
462
+ 8,
463
+ 0,
464
+ 0,
465
+ 0,
466
+ 0,
467
+ 0,
468
+ 0,
469
+ 2,
470
+ -1,
471
+ 3,
472
+ 2,
473
+ 3,
474
+ 0,
475
+ 6,
476
+ -1,
477
+ 3,
478
+ -1,
479
+ 2,
480
+ 0,
481
+ 0,
482
+ -1,
483
+ 1,
484
+ 0,
485
+ 3,
486
+ 1,
487
+ 0,
488
+ 1,
489
+ 0,
490
+ 0,
491
+ 0,
492
+ 2,
493
+ 0,
494
+ 0,
495
+ 0,
496
+ -1,
497
+ -1,
498
+ 6,
499
+ -1,
500
+ -1,
501
+ 2,
502
+ 3,
503
+ 0,
504
+ 0,
505
+ 0,
506
+ 2,
507
+ 0,
508
+ 7,
509
+ 0,
510
+ -1,
511
+ 6,
512
+ 3,
513
+ 2,
514
+ -1,
515
+ -1,
516
+ 0,
517
+ 0,
518
+ -1,
519
+ 3,
520
+ 0,
521
+ 4,
522
+ 0,
523
+ 1,
524
+ 3,
525
+ 0,
526
+ 0,
527
+ 0,
528
+ 1,
529
+ 0,
530
+ 7,
531
+ 2,
532
+ -1,
533
+ 6,
534
+ 5,
535
+ -1,
536
+ -1,
537
+ 0,
538
+ 0,
539
+ -1,
540
+ 0,
541
+ 2,
542
+ -1,
543
+ 0,
544
+ 0,
545
+ 7,
546
+ 0,
547
+ -1,
548
+ 3,
549
+ 1,
550
+ -1,
551
+ -1,
552
+ 3,
553
+ 5,
554
+ 7,
555
+ 6,
556
+ 8,
557
+ 0,
558
+ 5,
559
+ 1,
560
+ 1,
561
+ 1,
562
+ 1,
563
+ 5,
564
+ 0,
565
+ -1,
566
+ -1,
567
+ 5,
568
+ 3,
569
+ 3,
570
+ -1,
571
+ -1,
572
+ 5,
573
+ 1,
574
+ 3,
575
+ 0,
576
+ -1,
577
+ -1,
578
+ 0,
579
+ 0,
580
+ 0,
581
+ 7,
582
+ -1,
583
+ 3,
584
+ 7,
585
+ 0,
586
+ -1,
587
+ 6,
588
+ 8,
589
+ -1,
590
+ 0,
591
+ -1,
592
+ 0,
593
+ 0,
594
+ -1,
595
+ 7,
596
+ 0,
597
+ 1,
598
+ 4,
599
+ 0,
600
+ 7,
601
+ 0,
602
+ -1,
603
+ 1,
604
+ -1,
605
+ 2,
606
+ -1,
607
+ 0,
608
+ -1,
609
+ -1,
610
+ 2,
611
+ 0,
612
+ -1,
613
+ 0,
614
+ -1,
615
+ 0,
616
+ -1,
617
+ 2,
618
+ -1,
619
+ -1,
620
+ 3,
621
+ 8,
622
+ 3,
623
+ 6,
624
+ -1,
625
+ -1,
626
+ 2,
627
+ 4,
628
+ 0,
629
+ 6,
630
+ -1,
631
+ 4,
632
+ -1,
633
+ 7,
634
+ 2,
635
+ 4,
636
+ -1,
637
+ 8,
638
+ -1,
639
+ 0,
640
+ 0,
641
+ 4,
642
+ 0,
643
+ 2,
644
+ 2,
645
+ -1,
646
+ 3,
647
+ -1,
648
+ -1,
649
+ -1,
650
+ -1,
651
+ 8,
652
+ -1,
653
+ 0,
654
+ 4,
655
+ -1,
656
+ -1,
657
+ 1,
658
+ 1,
659
+ 1,
660
+ 8,
661
+ 0,
662
+ 1,
663
+ 2,
664
+ -1,
665
+ 1,
666
+ -1,
667
+ 2,
668
+ 2,
669
+ -1,
670
+ 4,
671
+ 2,
672
+ -1,
673
+ 0,
674
+ 6,
675
+ -1,
676
+ 4,
677
+ -1,
678
+ -1,
679
+ -1,
680
+ 7,
681
+ -1,
682
+ -1,
683
+ 0,
684
+ -1,
685
+ 1,
686
+ -1,
687
+ 0,
688
+ -1,
689
+ 0,
690
+ -1,
691
+ 2,
692
+ 1,
693
+ 2,
694
+ 0,
695
+ -1,
696
+ -1,
697
+ -1,
698
+ 2,
699
+ 0,
700
+ 2,
701
+ -1,
702
+ 8,
703
+ 7,
704
+ 0,
705
+ 1,
706
+ 5,
707
+ -1,
708
+ -1,
709
+ -1,
710
+ 0,
711
+ 2,
712
+ 0,
713
+ -1,
714
+ 0,
715
+ -1,
716
+ -1,
717
+ -1,
718
+ 3,
719
+ 2,
720
+ -1,
721
+ 7,
722
+ -1,
723
+ 0,
724
+ 0,
725
+ -1,
726
+ -1,
727
+ 1,
728
+ -1,
729
+ -1,
730
+ 0,
731
+ 1,
732
+ 3,
733
+ 7,
734
+ 1,
735
+ -1,
736
+ 0,
737
+ -1,
738
+ 0,
739
+ -1,
740
+ -1,
741
+ 0,
742
+ -1,
743
+ -1,
744
+ 0,
745
+ 5,
746
+ -1,
747
+ 1,
748
+ 0,
749
+ 1,
750
+ 8,
751
+ 0,
752
+ 2,
753
+ 1,
754
+ -1,
755
+ 1,
756
+ 5,
757
+ 0,
758
+ -1,
759
+ 4,
760
+ 1,
761
+ 1,
762
+ 0,
763
+ -1,
764
+ -1,
765
+ 2,
766
+ 4,
767
+ -1,
768
+ 0,
769
+ 0,
770
+ -1,
771
+ 2,
772
+ 0,
773
+ -1,
774
+ 2,
775
+ 1,
776
+ 5,
777
+ 3,
778
+ 6,
779
+ 5,
780
+ 2,
781
+ 1,
782
+ 4,
783
+ 5,
784
+ -1,
785
+ -1,
786
+ 2,
787
+ -1,
788
+ 6,
789
+ 0,
790
+ 2,
791
+ -1,
792
+ -1,
793
+ -1,
794
+ 3,
795
+ 4,
796
+ 4,
797
+ -1,
798
+ 1,
799
+ -1,
800
+ 6,
801
+ -1,
802
+ -1,
803
+ 1,
804
+ -1,
805
+ 5,
806
+ -1,
807
+ 4,
808
+ 1,
809
+ 4,
810
+ -1,
811
+ 0,
812
+ 0,
813
+ -1,
814
+ -1,
815
+ 6,
816
+ 5,
817
+ 2,
818
+ -1,
819
+ -1,
820
+ -1,
821
+ -1,
822
+ -1,
823
+ -1,
824
+ 4,
825
+ -1,
826
+ -1,
827
+ -1,
828
+ -1,
829
+ 5,
830
+ -1,
831
+ -1,
832
+ 0,
833
+ -1,
834
+ 2,
835
+ 3,
836
+ 1,
837
+ 2,
838
+ -1,
839
+ 1,
840
+ 7,
841
+ -1,
842
+ 4,
843
+ -1,
844
+ 1,
845
+ 3,
846
+ -1,
847
+ 8,
848
+ 0,
849
+ 1,
850
+ -1,
851
+ 0,
852
+ 1,
853
+ 0,
854
+ 4,
855
+ 8,
856
+ -1,
857
+ 3,
858
+ -1,
859
+ 4,
860
+ 4,
861
+ 2,
862
+ 5,
863
+ 8,
864
+ 3,
865
+ 7,
866
+ 3,
867
+ 0,
868
+ 1,
869
+ 8,
870
+ -1,
871
+ 6,
872
+ 4,
873
+ 0,
874
+ 7,
875
+ -1,
876
+ 6,
877
+ 4,
878
+ -1,
879
+ 6,
880
+ -1,
881
+ 0,
882
+ -1,
883
+ -1,
884
+ 7,
885
+ 1,
886
+ 3,
887
+ -1,
888
+ 0,
889
+ 6,
890
+ -1,
891
+ 1,
892
+ 2,
893
+ 3,
894
+ 2,
895
+ 1,
896
+ 5,
897
+ 0,
898
+ -1,
899
+ 6,
900
+ -1,
901
+ 0,
902
+ 0,
903
+ 1,
904
+ 6,
905
+ 5,
906
+ -1,
907
+ 0,
908
+ 2,
909
+ -1,
910
+ -1,
911
+ 0,
912
+ 3,
913
+ 0,
914
+ 2,
915
+ 3,
916
+ 2,
917
+ 2,
918
+ 7,
919
+ 1,
920
+ -1,
921
+ 1,
922
+ 1,
923
+ -1,
924
+ 3,
925
+ -1,
926
+ 6,
927
+ 0,
928
+ 4,
929
+ 0,
930
+ 5,
931
+ -1,
932
+ -1,
933
+ -1,
934
+ -1,
935
+ 5,
936
+ 5,
937
+ -1,
938
+ 2,
939
+ -1,
940
+ -1
941
+ ],
942
+ "topic_sizes": {
943
+ "0": 118,
944
+ "3": 32,
945
+ "6": 21,
946
+ "-1": 169,
947
+ "1": 49,
948
+ "7": 19,
949
+ "8": 15,
950
+ "2": 47,
951
+ "5": 22,
952
+ "4": 24
953
+ },
954
+ "topic_mapper": [
955
+ [
956
+ -1,
957
+ -1,
958
+ -1,
959
+ -1
960
+ ],
961
+ [
962
+ 0,
963
+ 0,
964
+ 7,
965
+ 4
966
+ ],
967
+ [
968
+ 1,
969
+ 1,
970
+ 4,
971
+ 6
972
+ ],
973
+ [
974
+ 2,
975
+ 2,
976
+ 8,
977
+ 3
978
+ ],
979
+ [
980
+ 3,
981
+ 3,
982
+ 5,
983
+ 1
984
+ ],
985
+ [
986
+ 4,
987
+ 4,
988
+ 6,
989
+ 5
990
+ ],
991
+ [
992
+ 5,
993
+ 5,
994
+ 3,
995
+ 7
996
+ ],
997
+ [
998
+ 6,
999
+ 6,
1000
+ 1,
1001
+ 8
1002
+ ],
1003
+ [
1004
+ 7,
1005
+ 7,
1006
+ 2,
1007
+ 2
1008
+ ],
1009
+ [
1010
+ 8,
1011
+ 8,
1012
+ 0,
1013
+ 0
1014
+ ],
1015
+ [
1016
+ 9,
1017
+ 9,
1018
+ 0,
1019
+ 0
1020
+ ],
1021
+ [
1022
+ 10,
1023
+ 10,
1024
+ 0,
1025
+ 0
1026
+ ]
1027
+ ],
1028
+ "topic_labels": {
1029
+ "-1": "-1_models_language_data_large",
1030
+ "0": "0_models_model_language_training",
1031
+ "1": "1_code_language_models_llms",
1032
+ "2": "2_ai_models_language_dialogue",
1033
+ "3": "3_detection_models_text_language",
1034
+ "4": "4_multimodal_visual_image_models",
1035
+ "5": "5_agents_language_policy_learning",
1036
+ "6": "6_speech_asr_text_speaker",
1037
+ "7": "7_reasoning_cot_models_problems",
1038
+ "8": "8_retrieval_information_query_llms"
1039
+ },
1040
+ "custom_labels": null,
1041
+ "_outliers": 1,
1042
+ "topic_aspects": {
1043
+ "KeyBERT": {
1044
+ "-1": [
1045
+ [
1046
+ "large language models",
1047
+ 0.6703740358352661
1048
+ ],
1049
+ [
1050
+ "large language models llms",
1051
+ 0.6190640330314636
1052
+ ],
1053
+ [
1054
+ "language models",
1055
+ 0.6147422790527344
1056
+ ],
1057
+ [
1058
+ "language models llms",
1059
+ 0.567597508430481
1060
+ ],
1061
+ [
1062
+ "language model",
1063
+ 0.5490379333496094
1064
+ ],
1065
+ [
1066
+ "large language",
1067
+ 0.47846218943595886
1068
+ ],
1069
+ [
1070
+ "natural language",
1071
+ 0.47019103169441223
1072
+ ],
1073
+ [
1074
+ "language",
1075
+ 0.36398622393608093
1076
+ ],
1077
+ [
1078
+ "training data",
1079
+ 0.36353152990341187
1080
+ ],
1081
+ [
1082
+ "models",
1083
+ 0.3585664629936218
1084
+ ]
1085
+ ],
1086
+ "0": [
1087
+ [
1088
+ "large language models",
1089
+ 0.651195228099823
1090
+ ],
1091
+ [
1092
+ "pretrained language",
1093
+ 0.512614905834198
1094
+ ],
1095
+ [
1096
+ "language models",
1097
+ 0.49944019317626953
1098
+ ],
1099
+ [
1100
+ "large language",
1101
+ 0.49680691957473755
1102
+ ],
1103
+ [
1104
+ "language model",
1105
+ 0.44212523102760315
1106
+ ],
1107
+ [
1108
+ "machine translation",
1109
+ 0.3898525834083557
1110
+ ],
1111
+ [
1112
+ "sparse",
1113
+ 0.3684082329273224
1114
+ ],
1115
+ [
1116
+ "memory",
1117
+ 0.35640034079551697
1118
+ ],
1119
+ [
1120
+ "corpus",
1121
+ 0.3460950255393982
1122
+ ],
1123
+ [
1124
+ "attention",
1125
+ 0.34196916222572327
1126
+ ]
1127
+ ],
1128
+ "1": [
1129
+ [
1130
+ "code generation",
1131
+ 0.5884341597557068
1132
+ ],
1133
+ [
1134
+ "code completion",
1135
+ 0.5430147647857666
1136
+ ],
1137
+ [
1138
+ "source code",
1139
+ 0.5036313533782959
1140
+ ],
1141
+ [
1142
+ "large language models",
1143
+ 0.4955924153327942
1144
+ ],
1145
+ [
1146
+ "large language models llms",
1147
+ 0.48612886667251587
1148
+ ],
1149
+ [
1150
+ "language models",
1151
+ 0.44613733887672424
1152
+ ],
1153
+ [
1154
+ "software engineering",
1155
+ 0.44518738985061646
1156
+ ],
1157
+ [
1158
+ "language models llms",
1159
+ 0.44061607122421265
1160
+ ],
1161
+ [
1162
+ "programming",
1163
+ 0.41835474967956543
1164
+ ],
1165
+ [
1166
+ "coding",
1167
+ 0.4044494926929474
1168
+ ]
1169
+ ],
1170
+ "2": [
1171
+ [
1172
+ "large language models",
1173
+ 0.6216679215431213
1174
+ ],
1175
+ [
1176
+ "conversational ai",
1177
+ 0.6001573204994202
1178
+ ],
1179
+ [
1180
+ "large language models llms",
1181
+ 0.588668167591095
1182
+ ],
1183
+ [
1184
+ "language models",
1185
+ 0.5686337351799011
1186
+ ],
1187
+ [
1188
+ "chatbots",
1189
+ 0.5604218244552612
1190
+ ],
1191
+ [
1192
+ "language models llms",
1193
+ 0.5467207431793213
1194
+ ],
1195
+ [
1196
+ "language model",
1197
+ 0.5185490250587463
1198
+ ],
1199
+ [
1200
+ "large language",
1201
+ 0.5117849111557007
1202
+ ],
1203
+ [
1204
+ "natural language",
1205
+ 0.4800942540168762
1206
+ ],
1207
+ [
1208
+ "dialogues",
1209
+ 0.437444806098938
1210
+ ]
1211
+ ],
1212
+ "3": [
1213
+ [
1214
+ "large language models",
1215
+ 0.5753244161605835
1216
+ ],
1217
+ [
1218
+ "large language models llms",
1219
+ 0.5593785047531128
1220
+ ],
1221
+ [
1222
+ "language models",
1223
+ 0.5217305421829224
1224
+ ],
1225
+ [
1226
+ "language models llms",
1227
+ 0.5088766813278198
1228
+ ],
1229
+ [
1230
+ "machinegenerated text",
1231
+ 0.49884361028671265
1232
+ ],
1233
+ [
1234
+ "language model",
1235
+ 0.45426321029663086
1236
+ ],
1237
+ [
1238
+ "large language",
1239
+ 0.4042874574661255
1240
+ ],
1241
+ [
1242
+ "texts",
1243
+ 0.3673853576183319
1244
+ ],
1245
+ [
1246
+ "classifier",
1247
+ 0.354655921459198
1248
+ ],
1249
+ [
1250
+ "text",
1251
+ 0.3459568917751312
1252
+ ]
1253
+ ],
1254
+ "4": [
1255
+ [
1256
+ "multimodal large language",
1257
+ 0.6466671228408813
1258
+ ],
1259
+ [
1260
+ "multimodal models",
1261
+ 0.63934326171875
1262
+ ],
1263
+ [
1264
+ "multimodal",
1265
+ 0.6179039478302002
1266
+ ],
1267
+ [
1268
+ "multimodal large",
1269
+ 0.5376994609832764
1270
+ ],
1271
+ [
1272
+ "visual",
1273
+ 0.47933536767959595
1274
+ ],
1275
+ [
1276
+ "large language models",
1277
+ 0.4537416696548462
1278
+ ],
1279
+ [
1280
+ "visionlanguage",
1281
+ 0.4349161982536316
1282
+ ],
1283
+ [
1284
+ "language models",
1285
+ 0.42795825004577637
1286
+ ],
1287
+ [
1288
+ "large language model",
1289
+ 0.4277690649032593
1290
+ ],
1291
+ [
1292
+ "visual foundation models",
1293
+ 0.40677300095558167
1294
+ ]
1295
+ ],
1296
+ "5": [
1297
+ [
1298
+ "large language models llms",
1299
+ 0.4626759886741638
1300
+ ],
1301
+ [
1302
+ "ai",
1303
+ 0.4613281488418579
1304
+ ],
1305
+ [
1306
+ "language models llms",
1307
+ 0.45701661705970764
1308
+ ],
1309
+ [
1310
+ "agent",
1311
+ 0.4489193260669708
1312
+ ],
1313
+ [
1314
+ "large language models",
1315
+ 0.4476342499256134
1316
+ ],
1317
+ [
1318
+ "agents",
1319
+ 0.44667837023735046
1320
+ ],
1321
+ [
1322
+ "interactive",
1323
+ 0.439677357673645
1324
+ ],
1325
+ [
1326
+ "language models",
1327
+ 0.4368625581264496
1328
+ ],
1329
+ [
1330
+ "reinforcement",
1331
+ 0.4350704550743103
1332
+ ],
1333
+ [
1334
+ "language model",
1335
+ 0.42887791991233826
1336
+ ]
1337
+ ],
1338
+ "6": [
1339
+ [
1340
+ "automatic speech",
1341
+ 0.6606317758560181
1342
+ ],
1343
+ [
1344
+ "automatic speech recognition asr",
1345
+ 0.5792312622070312
1346
+ ],
1347
+ [
1348
+ "speech recognition",
1349
+ 0.5414796471595764
1350
+ ],
1351
+ [
1352
+ "speech recognition asr",
1353
+ 0.5414656400680542
1354
+ ],
1355
+ [
1356
+ "automatic speech recognition",
1357
+ 0.5386157035827637
1358
+ ],
1359
+ [
1360
+ "large language models",
1361
+ 0.529854416847229
1362
+ ],
1363
+ [
1364
+ "large language model",
1365
+ 0.5051016211509705
1366
+ ],
1367
+ [
1368
+ "utterances",
1369
+ 0.49932384490966797
1370
+ ],
1371
+ [
1372
+ "language models",
1373
+ 0.46869075298309326
1374
+ ],
1375
+ [
1376
+ "voice",
1377
+ 0.43832945823669434
1378
+ ]
1379
+ ],
1380
+ "7": [
1381
+ [
1382
+ "reasoning large language models",
1383
+ 0.69033282995224
1384
+ ],
1385
+ [
1386
+ "reasoning tasks",
1387
+ 0.6320525407791138
1388
+ ],
1389
+ [
1390
+ "reasoning large language",
1391
+ 0.630852460861206
1392
+ ],
1393
+ [
1394
+ "reasoning capabilities",
1395
+ 0.6158041954040527
1396
+ ],
1397
+ [
1398
+ "reasoning benchmarks",
1399
+ 0.5364079475402832
1400
+ ],
1401
+ [
1402
+ "large language models",
1403
+ 0.48382115364074707
1404
+ ],
1405
+ [
1406
+ "large language models llms",
1407
+ 0.4739667773246765
1408
+ ],
1409
+ [
1410
+ "complex reasoning",
1411
+ 0.46622762084007263
1412
+ ],
1413
+ [
1414
+ "language models",
1415
+ 0.46207302808761597
1416
+ ],
1417
+ [
1418
+ "language models llms",
1419
+ 0.453142374753952
1420
+ ]
1421
+ ],
1422
+ "8": [
1423
+ [
1424
+ "large language models llm",
1425
+ 0.6180689334869385
1426
+ ],
1427
+ [
1428
+ "large language models llms",
1429
+ 0.6018953323364258
1430
+ ],
1431
+ [
1432
+ "large language models",
1433
+ 0.5865136384963989
1434
+ ],
1435
+ [
1436
+ "language models llm",
1437
+ 0.5565090179443359
1438
+ ],
1439
+ [
1440
+ "language models llms",
1441
+ 0.5427590608596802
1442
+ ],
1443
+ [
1444
+ "language models",
1445
+ 0.5051120519638062
1446
+ ],
1447
+ [
1448
+ "information retrieval",
1449
+ 0.5001324415206909
1450
+ ],
1451
+ [
1452
+ "retrieval",
1453
+ 0.46649327874183655
1454
+ ],
1455
+ [
1456
+ "knowledge bases",
1457
+ 0.4627561569213867
1458
+ ],
1459
+ [
1460
+ "large language",
1461
+ 0.3926961421966553
1462
+ ]
1463
+ ]
1464
+ },
1465
+ "MMR": {
1466
+ "-1": [
1467
+ [
1468
+ "models",
1469
+ 0.036874579738434304
1470
+ ],
1471
+ [
1472
+ "language",
1473
+ 0.031011734360675242
1474
+ ],
1475
+ [
1476
+ "data",
1477
+ 0.02740357251248468
1478
+ ],
1479
+ [
1480
+ "large",
1481
+ 0.024331696551107916
1482
+ ],
1483
+ [
1484
+ "language models",
1485
+ 0.02287739800299974
1486
+ ],
1487
+ [
1488
+ "model",
1489
+ 0.02123690372233833
1490
+ ],
1491
+ [
1492
+ "tasks",
1493
+ 0.02117889409597425
1494
+ ],
1495
+ [
1496
+ "llms",
1497
+ 0.020210440809796944
1498
+ ],
1499
+ [
1500
+ "large language",
1501
+ 0.019999417196753248
1502
+ ],
1503
+ [
1504
+ "large language models",
1505
+ 0.019126572684958956
1506
+ ]
1507
+ ],
1508
+ "0": [
1509
+ [
1510
+ "models",
1511
+ 0.03888243759552385
1512
+ ],
1513
+ [
1514
+ "model",
1515
+ 0.03647492283412293
1516
+ ],
1517
+ [
1518
+ "language",
1519
+ 0.03613590283186468
1520
+ ],
1521
+ [
1522
+ "training",
1523
+ 0.025581428828302905
1524
+ ],
1525
+ [
1526
+ "language models",
1527
+ 0.02386262298037925
1528
+ ],
1529
+ [
1530
+ "tasks",
1531
+ 0.02360941221543806
1532
+ ],
1533
+ [
1534
+ "data",
1535
+ 0.021604280018978572
1536
+ ],
1537
+ [
1538
+ "performance",
1539
+ 0.021213047327713713
1540
+ ],
1541
+ [
1542
+ "large",
1543
+ 0.020365016161611835
1544
+ ],
1545
+ [
1546
+ "method",
1547
+ 0.01788214168631935
1548
+ ]
1549
+ ],
1550
+ "1": [
1551
+ [
1552
+ "code",
1553
+ 0.08112439886630912
1554
+ ],
1555
+ [
1556
+ "language",
1557
+ 0.03515934823155083
1558
+ ],
1559
+ [
1560
+ "models",
1561
+ 0.034093014905089085
1562
+ ],
1563
+ [
1564
+ "llms",
1565
+ 0.03351276274167474
1566
+ ],
1567
+ [
1568
+ "programming",
1569
+ 0.03221809114638236
1570
+ ],
1571
+ [
1572
+ "software",
1573
+ 0.024215765671622126
1574
+ ],
1575
+ [
1576
+ "language models",
1577
+ 0.023501871498181743
1578
+ ],
1579
+ [
1580
+ "tasks",
1581
+ 0.021362088649701006
1582
+ ],
1583
+ [
1584
+ "model",
1585
+ 0.021028623583260922
1586
+ ],
1587
+ [
1588
+ "large language",
1589
+ 0.020242713470511334
1590
+ ]
1591
+ ],
1592
+ "2": [
1593
+ [
1594
+ "ai",
1595
+ 0.03748085558879784
1596
+ ],
1597
+ [
1598
+ "models",
1599
+ 0.032123956517937674
1600
+ ],
1601
+ [
1602
+ "language",
1603
+ 0.030708509906927736
1604
+ ],
1605
+ [
1606
+ "dialogue",
1607
+ 0.02863305325688509
1608
+ ],
1609
+ [
1610
+ "human",
1611
+ 0.027796744355540557
1612
+ ],
1613
+ [
1614
+ "llms",
1615
+ 0.027095383693882993
1616
+ ],
1617
+ [
1618
+ "chatgpt",
1619
+ 0.02427426857972807
1620
+ ],
1621
+ [
1622
+ "large language",
1623
+ 0.024177158942537805
1624
+ ],
1625
+ [
1626
+ "large",
1627
+ 0.023491817699557018
1628
+ ],
1629
+ [
1630
+ "model",
1631
+ 0.022240448993628016
1632
+ ]
1633
+ ],
1634
+ "3": [
1635
+ [
1636
+ "detection",
1637
+ 0.04600933370915614
1638
+ ],
1639
+ [
1640
+ "models",
1641
+ 0.0376182869533305
1642
+ ],
1643
+ [
1644
+ "text",
1645
+ 0.03622151327830574
1646
+ ],
1647
+ [
1648
+ "language",
1649
+ 0.03555056937300613
1650
+ ],
1651
+ [
1652
+ "model",
1653
+ 0.02910562167494557
1654
+ ],
1655
+ [
1656
+ "large",
1657
+ 0.026737322113278325
1658
+ ],
1659
+ [
1660
+ "language models",
1661
+ 0.026260255642963005
1662
+ ],
1663
+ [
1664
+ "misinformation",
1665
+ 0.022438367434259674
1666
+ ],
1667
+ [
1668
+ "dataset",
1669
+ 0.021178404179731523
1670
+ ],
1671
+ [
1672
+ "large language",
1673
+ 0.020266242724238725
1674
+ ]
1675
+ ],
1676
+ "4": [
1677
+ [
1678
+ "multimodal",
1679
+ 0.06377037276103617
1680
+ ],
1681
+ [
1682
+ "visual",
1683
+ 0.0609342279209814
1684
+ ],
1685
+ [
1686
+ "image",
1687
+ 0.05031813021481461
1688
+ ],
1689
+ [
1690
+ "models",
1691
+ 0.04428945209100523
1692
+ ],
1693
+ [
1694
+ "generation",
1695
+ 0.03866971167435956
1696
+ ],
1697
+ [
1698
+ "video",
1699
+ 0.03452530411071284
1700
+ ],
1701
+ [
1702
+ "understanding",
1703
+ 0.03174883479055843
1704
+ ],
1705
+ [
1706
+ "large",
1707
+ 0.02994331997174661
1708
+ ],
1709
+ [
1710
+ "model",
1711
+ 0.027842071361726516
1712
+ ],
1713
+ [
1714
+ "instruction",
1715
+ 0.02744625284444433
1716
+ ]
1717
+ ],
1718
+ "5": [
1719
+ [
1720
+ "agents",
1721
+ 0.032621488861863626
1722
+ ],
1723
+ [
1724
+ "language",
1725
+ 0.032046686285534975
1726
+ ],
1727
+ [
1728
+ "policy",
1729
+ 0.031585563861493055
1730
+ ],
1731
+ [
1732
+ "learning",
1733
+ 0.030550747755560888
1734
+ ],
1735
+ [
1736
+ "tasks",
1737
+ 0.029270078392980483
1738
+ ],
1739
+ [
1740
+ "llms",
1741
+ 0.028067175067745524
1742
+ ],
1743
+ [
1744
+ "agent",
1745
+ 0.026011640827111927
1746
+ ],
1747
+ [
1748
+ "games",
1749
+ 0.025255064827310037
1750
+ ],
1751
+ [
1752
+ "knowledge",
1753
+ 0.02496878818528055
1754
+ ],
1755
+ [
1756
+ "model",
1757
+ 0.024630611822384848
1758
+ ]
1759
+ ],
1760
+ "6": [
1761
+ [
1762
+ "speech",
1763
+ 0.12032183461065618
1764
+ ],
1765
+ [
1766
+ "asr",
1767
+ 0.0784134014691984
1768
+ ],
1769
+ [
1770
+ "text",
1771
+ 0.04816267150192302
1772
+ ],
1773
+ [
1774
+ "speaker",
1775
+ 0.04549115752552982
1776
+ ],
1777
+ [
1778
+ "recognition",
1779
+ 0.044013060675693126
1780
+ ],
1781
+ [
1782
+ "speech recognition",
1783
+ 0.03480823666083872
1784
+ ],
1785
+ [
1786
+ "model",
1787
+ 0.0329226249448169
1788
+ ],
1789
+ [
1790
+ "language",
1791
+ 0.031171151406766243
1792
+ ],
1793
+ [
1794
+ "voice",
1795
+ 0.030863819919231247
1796
+ ],
1797
+ [
1798
+ "proposed",
1799
+ 0.029531042059903895
1800
+ ]
1801
+ ],
1802
+ "7": [
1803
+ [
1804
+ "reasoning",
1805
+ 0.09733768593924219
1806
+ ],
1807
+ [
1808
+ "cot",
1809
+ 0.04159609177483568
1810
+ ],
1811
+ [
1812
+ "models",
1813
+ 0.04032110830244759
1814
+ ],
1815
+ [
1816
+ "problems",
1817
+ 0.038531107231743966
1818
+ ],
1819
+ [
1820
+ "commonsense",
1821
+ 0.0328390198222387
1822
+ ],
1823
+ [
1824
+ "language",
1825
+ 0.03061562593615061
1826
+ ],
1827
+ [
1828
+ "prompting",
1829
+ 0.03050017742462947
1830
+ ],
1831
+ [
1832
+ "language models",
1833
+ 0.028282815332533393
1834
+ ],
1835
+ [
1836
+ "math",
1837
+ 0.026470858073982147
1838
+ ],
1839
+ [
1840
+ "chainofthought",
1841
+ 0.026470858073982147
1842
+ ]
1843
+ ],
1844
+ "8": [
1845
+ [
1846
+ "retrieval",
1847
+ 0.05391749257643426
1848
+ ],
1849
+ [
1850
+ "information",
1851
+ 0.041311727463775545
1852
+ ],
1853
+ [
1854
+ "query",
1855
+ 0.03998637165786005
1856
+ ],
1857
+ [
1858
+ "llms",
1859
+ 0.0360048263616992
1860
+ ],
1861
+ [
1862
+ "models",
1863
+ 0.03235786882267994
1864
+ ],
1865
+ [
1866
+ "language",
1867
+ 0.03201012649638935
1868
+ ],
1869
+ [
1870
+ "queries",
1871
+ 0.031828706522162444
1872
+ ],
1873
+ [
1874
+ "language models",
1875
+ 0.02804152194835136
1876
+ ],
1877
+ [
1878
+ "large",
1879
+ 0.026588466396316807
1880
+ ],
1881
+ [
1882
+ "knowledge",
1883
+ 0.02430262486413176
1884
+ ]
1885
+ ]
1886
+ },
1887
+ "POS": {
1888
+ "-1": [
1889
+ [
1890
+ "models",
1891
+ 0.036874579738434304
1892
+ ],
1893
+ [
1894
+ "language",
1895
+ 0.031011734360675242
1896
+ ],
1897
+ [
1898
+ "data",
1899
+ 0.02740357251248468
1900
+ ],
1901
+ [
1902
+ "large",
1903
+ 0.024331696551107916
1904
+ ],
1905
+ [
1906
+ "model",
1907
+ 0.02123690372233833
1908
+ ],
1909
+ [
1910
+ "tasks",
1911
+ 0.02117889409597425
1912
+ ],
1913
+ [
1914
+ "large language",
1915
+ 0.019999417196753248
1916
+ ],
1917
+ [
1918
+ "learning",
1919
+ 0.017245729294018734
1920
+ ],
1921
+ [
1922
+ "knowledge",
1923
+ 0.015578401017865536
1924
+ ],
1925
+ [
1926
+ "performance",
1927
+ 0.015293299507868716
1928
+ ]
1929
+ ],
1930
+ "0": [
1931
+ [
1932
+ "models",
1933
+ 0.03888243759552385
1934
+ ],
1935
+ [
1936
+ "model",
1937
+ 0.03647492283412293
1938
+ ],
1939
+ [
1940
+ "language",
1941
+ 0.03613590283186468
1942
+ ],
1943
+ [
1944
+ "training",
1945
+ 0.025581428828302905
1946
+ ],
1947
+ [
1948
+ "tasks",
1949
+ 0.02360941221543806
1950
+ ],
1951
+ [
1952
+ "data",
1953
+ 0.021604280018978572
1954
+ ],
1955
+ [
1956
+ "performance",
1957
+ 0.021213047327713713
1958
+ ],
1959
+ [
1960
+ "large",
1961
+ 0.020365016161611835
1962
+ ],
1963
+ [
1964
+ "method",
1965
+ 0.01788214168631935
1966
+ ],
1967
+ [
1968
+ "translation",
1969
+ 0.015317468043852814
1970
+ ]
1971
+ ],
1972
+ "1": [
1973
+ [
1974
+ "code",
1975
+ 0.08112439886630912
1976
+ ],
1977
+ [
1978
+ "language",
1979
+ 0.03515934823155083
1980
+ ],
1981
+ [
1982
+ "models",
1983
+ 0.034093014905089085
1984
+ ],
1985
+ [
1986
+ "programming",
1987
+ 0.03221809114638236
1988
+ ],
1989
+ [
1990
+ "software",
1991
+ 0.024215765671622126
1992
+ ],
1993
+ [
1994
+ "tasks",
1995
+ 0.021362088649701006
1996
+ ],
1997
+ [
1998
+ "model",
1999
+ 0.021028623583260922
2000
+ ],
2001
+ [
2002
+ "large language",
2003
+ 0.020242713470511334
2004
+ ],
2005
+ [
2006
+ "large",
2007
+ 0.01969750985041782
2008
+ ],
2009
+ [
2010
+ "program",
2011
+ 0.017892959453975895
2012
+ ]
2013
+ ],
2014
+ "2": [
2015
+ [
2016
+ "models",
2017
+ 0.032123956517937674
2018
+ ],
2019
+ [
2020
+ "language",
2021
+ 0.030708509906927736
2022
+ ],
2023
+ [
2024
+ "dialogue",
2025
+ 0.02863305325688509
2026
+ ],
2027
+ [
2028
+ "human",
2029
+ 0.027796744355540557
2030
+ ],
2031
+ [
2032
+ "large language",
2033
+ 0.024177158942537805
2034
+ ],
2035
+ [
2036
+ "large",
2037
+ 0.023491817699557018
2038
+ ],
2039
+ [
2040
+ "model",
2041
+ 0.022240448993628016
2042
+ ],
2043
+ [
2044
+ "chatbots",
2045
+ 0.021090782635767247
2046
+ ],
2047
+ [
2048
+ "responses",
2049
+ 0.020358247264396636
2050
+ ],
2051
+ [
2052
+ "agents",
2053
+ 0.019356726824660043
2054
+ ]
2055
+ ],
2056
+ "3": [
2057
+ [
2058
+ "detection",
2059
+ 0.04600933370915614
2060
+ ],
2061
+ [
2062
+ "models",
2063
+ 0.0376182869533305
2064
+ ],
2065
+ [
2066
+ "text",
2067
+ 0.03622151327830574
2068
+ ],
2069
+ [
2070
+ "language",
2071
+ 0.03555056937300613
2072
+ ],
2073
+ [
2074
+ "model",
2075
+ 0.02910562167494557
2076
+ ],
2077
+ [
2078
+ "large",
2079
+ 0.026737322113278325
2080
+ ],
2081
+ [
2082
+ "misinformation",
2083
+ 0.022438367434259674
2084
+ ],
2085
+ [
2086
+ "dataset",
2087
+ 0.021178404179731523
2088
+ ],
2089
+ [
2090
+ "large language",
2091
+ 0.020266242724238725
2092
+ ],
2093
+ [
2094
+ "bias",
2095
+ 0.019222454111824376
2096
+ ]
2097
+ ],
2098
+ "4": [
2099
+ [
2100
+ "multimodal",
2101
+ 0.06377037276103617
2102
+ ],
2103
+ [
2104
+ "visual",
2105
+ 0.0609342279209814
2106
+ ],
2107
+ [
2108
+ "image",
2109
+ 0.05031813021481461
2110
+ ],
2111
+ [
2112
+ "models",
2113
+ 0.04428945209100523
2114
+ ],
2115
+ [
2116
+ "generation",
2117
+ 0.03866971167435956
2118
+ ],
2119
+ [
2120
+ "video",
2121
+ 0.03452530411071284
2122
+ ],
2123
+ [
2124
+ "understanding",
2125
+ 0.03174883479055843
2126
+ ],
2127
+ [
2128
+ "large",
2129
+ 0.02994331997174661
2130
+ ],
2131
+ [
2132
+ "model",
2133
+ 0.027842071361726516
2134
+ ],
2135
+ [
2136
+ "instruction",
2137
+ 0.02744625284444433
2138
+ ]
2139
+ ],
2140
+ "5": [
2141
+ [
2142
+ "agents",
2143
+ 0.032621488861863626
2144
+ ],
2145
+ [
2146
+ "language",
2147
+ 0.032046686285534975
2148
+ ],
2149
+ [
2150
+ "policy",
2151
+ 0.031585563861493055
2152
+ ],
2153
+ [
2154
+ "learning",
2155
+ 0.030550747755560888
2156
+ ],
2157
+ [
2158
+ "tasks",
2159
+ 0.029270078392980483
2160
+ ],
2161
+ [
2162
+ "agent",
2163
+ 0.026011640827111927
2164
+ ],
2165
+ [
2166
+ "games",
2167
+ 0.025255064827310037
2168
+ ],
2169
+ [
2170
+ "knowledge",
2171
+ 0.02496878818528055
2172
+ ],
2173
+ [
2174
+ "model",
2175
+ 0.024630611822384848
2176
+ ],
2177
+ [
2178
+ "models",
2179
+ 0.02357361082959911
2180
+ ]
2181
+ ],
2182
+ "6": [
2183
+ [
2184
+ "speech",
2185
+ 0.12032183461065618
2186
+ ],
2187
+ [
2188
+ "text",
2189
+ 0.04816267150192302
2190
+ ],
2191
+ [
2192
+ "speaker",
2193
+ 0.04549115752552982
2194
+ ],
2195
+ [
2196
+ "recognition",
2197
+ 0.044013060675693126
2198
+ ],
2199
+ [
2200
+ "model",
2201
+ 0.0329226249448169
2202
+ ],
2203
+ [
2204
+ "language",
2205
+ 0.031171151406766243
2206
+ ],
2207
+ [
2208
+ "voice",
2209
+ 0.030863819919231247
2210
+ ],
2211
+ [
2212
+ "systems",
2213
+ 0.02868879719738342
2214
+ ],
2215
+ [
2216
+ "error",
2217
+ 0.027433755186485595
2218
+ ],
2219
+ [
2220
+ "prompt",
2221
+ 0.027359560787395366
2222
+ ]
2223
+ ],
2224
+ "7": [
2225
+ [
2226
+ "reasoning",
2227
+ 0.09733768593924219
2228
+ ],
2229
+ [
2230
+ "models",
2231
+ 0.04032110830244759
2232
+ ],
2233
+ [
2234
+ "problems",
2235
+ 0.038531107231743966
2236
+ ],
2237
+ [
2238
+ "commonsense",
2239
+ 0.0328390198222387
2240
+ ],
2241
+ [
2242
+ "language",
2243
+ 0.03061562593615061
2244
+ ],
2245
+ [
2246
+ "prompting",
2247
+ 0.03050017742462947
2248
+ ],
2249
+ [
2250
+ "math",
2251
+ 0.026470858073982147
2252
+ ],
2253
+ [
2254
+ "model",
2255
+ 0.02522199037356587
2256
+ ],
2257
+ [
2258
+ "performance",
2259
+ 0.025100359151578013
2260
+ ],
2261
+ [
2262
+ "large",
2263
+ 0.024219197113476695
2264
+ ]
2265
+ ],
2266
+ "8": [
2267
+ [
2268
+ "retrieval",
2269
+ 0.05391749257643426
2270
+ ],
2271
+ [
2272
+ "information",
2273
+ 0.041311727463775545
2274
+ ],
2275
+ [
2276
+ "query",
2277
+ 0.03998637165786005
2278
+ ],
2279
+ [
2280
+ "models",
2281
+ 0.03235786882267994
2282
+ ],
2283
+ [
2284
+ "language",
2285
+ 0.03201012649638935
2286
+ ],
2287
+ [
2288
+ "queries",
2289
+ 0.031828706522162444
2290
+ ],
2291
+ [
2292
+ "large",
2293
+ 0.026588466396316807
2294
+ ],
2295
+ [
2296
+ "knowledge",
2297
+ 0.02430262486413176
2298
+ ],
2299
+ [
2300
+ "augmentation",
2301
+ 0.022439589434192657
2302
+ ],
2303
+ [
2304
+ "results",
2305
+ 0.021446519611670142
2306
+ ]
2307
+ ]
2308
+ }
2309
+ }
2310
+ }
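The aspect blocks above map each topic ID (as a string key) to a list of `[term, weight]` pairs. Here is a minimal sketch of reading that shape, assuming only the nesting visible in this file. The sample copies three pairs from topic `"6"` of the `"POS"` block and omits the enclosing keys. The helper name `top_terms` is hypothetical and is not part of BERTopic.

```python
import json

# Sample copied from the "POS" block above (topic "6", first three pairs).
# The keys that enclose "POS" in the real topics.json are left out here.
sample = json.loads("""
{
  "POS": {
    "6": [
      ["speech", 0.12032183461065618],
      ["text", 0.04816267150192302],
      ["speaker", 0.04549115752552982]
    ]
  }
}
""")


def top_terms(aspects, aspect, topic_id, n=3):
    """Return the n highest-weighted terms for one topic in one aspect."""
    # Topic IDs are JSON object keys, so they are strings ("-1", "0", ...).
    pairs = aspects[aspect][str(topic_id)]
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return [term for term, _weight in ranked[:n]]


print(top_terms(sample, "POS", 6))
```

Sorting by weight is done explicitly so the helper does not rely on the pairs already being stored in descending order.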