metadata
tags:
  - bertopic
library_name: bertopic
pipeline_tag: text-classification

C1-topic-model-100

This is a BERTopic model. BERTopic is a flexible, modular topic modeling framework that generates easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

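from bertopic import BERTopic

# Load the model from the Hugging Face Hub
topic_model = BERTopic.load("AlexanderHolmes0/C1-topic-model-100")

# Summarize all topics (ID, size, and name) as a pandas DataFrame
topic_model.get_topic_info()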
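
A common next step is assigning topics to unseen text. The sketch below is one possible way to do that and is not taken from the model card: `transform` and `get_topic_info` are standard BERTopic methods, while `assign_topics` and `demo` are hypothetical helper names and the example sentences are made up. It also assumes the repository ships or references an embedding model; if it does not, one can be passed through the `embedding_model` argument of `BERTopic.load`.

```python
def assign_topics(topic_model, docs):
    """Return (document, topic ID, topic label) for each input document.

    `assign_topics` is a hypothetical helper, not part of BERTopic.
    """
    # transform() returns the predicted topic per document plus probabilities
    topics, _ = topic_model.transform(docs)

    # get_topic_info() has a "Topic" column (ID) and a "Name" column (label)
    info = topic_model.get_topic_info()
    labels = dict(zip(info["Topic"], info["Name"]))

    return [(doc, int(t), labels.get(int(t), "unknown")) for doc, t in zip(docs, topics)]


def demo():
    """Load the model from the Hub and label two made-up example sentences."""
    from bertopic import BERTopic  # imported lazily so the helper has no hard dependency

    topic_model = BERTopic.load("AlexanderHolmes0/C1-topic-model-100")
    docs = [
        "My card was declined at the store.",
        "How many miles do I need for a flight?",
    ]
    for doc, topic_id, label in assign_topics(topic_model, docs):
        print(topic_id, label, doc)
```

Calling `demo()` downloads the model, so it needs network access; `assign_topics` itself works with any already loaded model.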

Topic overview

  • Number of topics: 100
  • Number of training documents: 112332
Overview of all topics:
| Topic ID | Topic Keywords | Topic Frequency | Label |
|---|---|---|---|
| -1 | stop - people - need - help - banking | 84 | -1_stop_people_need_help |
| 0 | account - card - customer - service - phone | 4738 | 0_account_card_customer_service |
| 1 | game - thank - win - team - bazinga_sb | 21155 | 1_game_thank_win_team |
| 2 | app - need - stop - help - lol | 11918 | 2_app_need_stop_help |
| 3 | card - credit - account - customer - told | 11946 | 3_card_credit_account_customer |
| 4 | friend - request - friends - send - mind | 6152 | 4_friend_request_friends_send |
| 5 | community - event - lounge - thank - presented | 3403 | 5_community_event_lounge_thank |
| 6 | trading - whatsapp - forex - investment - profit | 4398 | 6_trading_whatsapp_forex_investment |
| 7 | - - - - | 2804 | 7____ |
| 8 | ho - genial - travolta - john - love | 1191 | 8_ho_genial_travolta_john |
| 9 | santa - john - travolta - love - movie | 1157 | 9_santa_john_travolta_love |
| 10 | que - en - la - para - el | 1695 | 10_que_en_la_para |
| 11 | worst - service - company - customer - bank | 1532 | 11_worst_service_company_customer |
| 12 | love - awesome - happy - best - great | 2417 | 12_love_awesome_happy_best |
| 13 | people - blessed - grands - cus - paying | 2736 | 13_people_blessed_grands_cus |
| 14 | later - connect - definitely - inbox - hello | 896 | 14_later_connect_definitely_inbox |
| 15 | shimmerr - kt - txscapitalone - capit - word | 4157 | 15_shimmerr_kt_txscapitalone_capit |
| 16 | credit - card - score - limit - pay | 535 | 16_credit_card_score_limit |
| 17 | spell - dr - caster - ex - lover | 5072 | 17_spell_dr_caster_ex |
| 18 | love - business - thank - credit - best | 672 | 18_love_business_thank_credit |
| 19 | links - mm - click - nutshell - join | 2133 | 19_links_mm_click_nutshell |
| 20 | travel - points - venture - miles - flights | 515 | 20_travel_points_venture_miles |
| 21 | slash - guitar - guitarist - riff - song | 1271 | 21_slash_guitar_guitarist_riff |
| 22 | reachout - miller - david - comes - investing | 630 | 22_reachout_miller_david_comes |
| 23 | preston - mourning - underneath - commenting - kelly | 303 | 23_preston_mourning_underneath_commenting |
| 24 | tnt - qbs - reminded - tune - headed | 211 | 24_tnt_qbs_reminded_tune |
| 25 | working - website - isn - expired - enter | 222 | 25_working_website_isn_expired |
| 26 | suck - sucks - lol - fake - sounds | 947 | 26_suck_sucks_lol_fake |
| 27 | love - cute - absolutely - ittttt - lovelove | 1603 | 27_love_cute_absolutely_ittttt |
| 28 | mobile - app - digital - options - shopping | 292 | 28_mobile_app_digital_options |
| 29 | cardigan - flannel - plaid - tsxccapitalone - cardigans | 554 | 29_cardigan_flannel_plaid_tsxccapitalone |
| 30 | presale - event - taylorswift - venue - np | 191 | 30_presale_event_taylorswift_venue |
| 31 | donna - pescow - annette - fever - looks | 403 | 31_donna_pescow_annette_fever |
| 32 | card - best - great - love - cards | 340 | 32_card_best_great_love |
| 33 | flannel - flannnel - scymbags - rsxcapitalone - pumpkinseason | 632 | 33_flannel_flannnel_scymbags_rsxcapitalone |
| 34 | car - auto - navigator - cars - buying | 172 | 34_car_auto_navigator_cars |
| 35 | 𝗍𝗁𝖺𝗍 - 𝖺𝗇𝖽 - 𝗈𝗎𝗋 - 𝗒𝗈𝗎 - 𝗍𝗈 | 566 | 35_𝗍𝗁𝖺𝗍_𝖺𝗇𝖽_𝗈𝗎𝗋_𝗒𝗈𝗎 |
| 36 | sure - true - correct - yessir - yea | 416 | 36_sure_true_correct_yessir |
| 37 | beard - chef - awards - beardfoundation - james | 329 | 37_beard_chef_awards_beardfoundation |
| 38 | worst - bank - service - customer - fucking | 223 | 38_worst_bank_service_customer |
| 39 | bank - banks - banking - best - chairman | 1098 | 39_bank_banks_banking_best |
| 40 | student - unlimited - key - quicksilver - cash | 397 | 40_student_unlimited_key_quicksilver |
| 41 | teambradgers - hardwood - winning - head - winners | 343 | 41_teambradgers_hardwood_winning_head |
| 42 | giveway - tstheerastour - tsxcapitalone - givesway - givaway | 178 | 42_giveway_tstheerastour_tsxcapitalone_givesway |
| 43 | love - awesome - cute - omg - absolutely | 155 | 43_love_awesome_cute_omg |
| 44 | postdoctoral - και - 𝐭𝐨 - 𝐚𝐧𝐝 - رسول | 386 | 44_postdoctoral_και_𝐭𝐨_𝐚𝐧𝐝 |
| 45 | platinum - excluded - cards - credit - secured | 357 | 45_platinum_excluded_cards_credit |
| 46 | solving - ai - data - methods - learning | 127 | 46_solving_ai_data_methods |
| 47 | dm - sent - haven - gotten - dms | 219 | 47_dm_sent_haven_gotten |
| 48 | nice - cool - good - cute - dimples | 289 | 48_nice_cool_good_cute |
| 49 | cafés - branches - offices - closed - holiday | 206 | 49_cafés_branches_offices_closed |
| 50 | gt - page - javier - facebook - sergio | 130 | 50_gt_page_javier_facebook |
| 51 | fault - mailbox - buyer - flood - vendors | 191 | 51_fault_mailbox_buyer_flood |
| 52 | responsibility - expect - accept - unethical - perjury | 89 | 52_responsibility_expect_accept_unethical |
| 53 | holiday - encore - noël - livestream - songs | 88 | 53_holiday_encore_noël_livestream |
| 54 | bio - webinar - mcws - capitalonecafe - wcwsselfie | 44 | 54_bio_webinar_mcws_capitalonecafe |
| 55 | agree - true - haha - ha - right | 565 | 55_agree_true_haha_ha |
| 56 | word - code - shib - enter - biiiish | 324 | 56_word_code_shib_enter |
| 57 | uber - eats - complimentary - orders - nov | 113 | 57_uber_eats_complimentary_orders |
| 58 | bbb - tracker - scams - scam - scamtracker | 46 | 58_bbb_tracker_scams_scam |
| 59 | count - days - fucking - bitch - fuckin | 85 | 59_count_days_fucking_bitch |
| 60 | tradelines - cpn - repair - removal - inquiries | 89 | 60_tradelines_cpn_repair_removal |
| 61 | gas - prices - inflation - cars - electric | 102 | 61_gas_prices_inflation_cars |
| 62 | spark - antonelli - preset - cash - cheese | 159 | 62_spark_antonelli_preset_cash |
| 63 | waiting - room - queue - open - lounge | 57 | 63_waiting_room_queue_open |
| 64 | need - want - got - officce - wish | 455 | 64_need_want_got_officce |
| 65 | taher - bapary - abu - wow - baah | 371 | 65_taher_bapary_abu_wow |
| 66 | atos - story - cloud - datacenter - upl | 46 | 66_atos_story_cloud_datacenter |
| 67 | song - itunes - radio - rainy - bts | 77 | 67_song_itunes_radio_rainy |
| 68 | dining - cardholders - reservations - restaurants - rated | 163 | 68_dining_cardholders_reservations_restaurants |
| 69 | grant - apply - upfront - federal - government | 69 | 69_grant_apply_upfront_federal |
| 70 | scarf - tsgiveaway - towel - tote - bag | 103 | 70_scarf_tsgiveaway_towel_tote |
| 71 | annual - fee - fees - loffland - tam | 53 | 71_annual_fee_fees_loffland |
| 72 | unsubcribe - mailing - stop - remove - list | 204 | 72_unsubcribe_mailing_stop_remove |
| 73 | crooks - money - scumbags - stole - thieves | 106 | 73_crooks_money_scumbags_stole |
| 74 | mcws - rebs - omaha - wps - toddy | 633 | 74_mcws_rebs_omaha_wps |
| 75 | bejeweled - ok - hi - mm - tay | 105 | 75_bejeweled_ok_hi_mm |
| 76 | pov - misclicked - clicked - clicky - pa | 719 | 76_pov_misclicked_clicked_clicky |
| 77 | financial - tips - budget - empowerment - healthy | 174 | 77_financial_tips_budget_empowerment |
| 78 | wynn - match - brady - hole - las | 379 | 78_wynn_match_brady_hole |
| 79 | upgrade - upgrades - upgraded - update - seats | 70 | 79_upgrade_upgrades_upgraded_update |
| 80 | step - tapn - wit - walking - method | 142 | 80_step_tapn_wit_walking |
| 81 | facilitate - form - cybersuppospy - communicate - following | 120 | 81_facilitate_form_cybersuppospy_communicate |
| 82 | debt - credit - need - card - dm | 46 | 82_debt_credit_need_card |
| 83 | press - interactive - seat - conference - hot | 375 | 83_press_interactive_seat_conference |
| 84 | minority - jackie - scholarship - nation - donating | 31 | 84_minority_jackie_scholarship_nation |
| 85 | chefs - chef - exclusive - dining - cardholders | 39 | 85_chefs_chef_exclusive_dining |
| 86 | flannel - cardigan - scarf - jk - code | 30 | 86_flannel_cardigan_scarf_jk |
| 87 | sell - sold - sellout - selling - tim | 91 | 87_sell_sold_sellout_selling |
| 88 | movie - saw - watch - watching - video | 90 | 88_movie_saw_watch_watching |
| 89 | denied - approved - applied - pre - tried | 411 | 89_denied_approved_applied_pre |
| 90 | balali - dizabali - manager - blame - changing | 188 | 90_balali_dizabali_manager_blame |
| 91 | office - direct - contact - don - erica | 24 | 91_office_direct_contact_don |
| 92 | stimulus - latest - mon - dates - track | 60 | 92_stimulus_latest_mon_dates |
| 93 | itbot - cash - gyvi - creditcard - best | 23 | 93_itbot_cash_gyvi_creditcard |
| 94 | pakistan - donate - relief - flood - activities | 199 | 94_pakistan_donate_relief_flood |
| 95 | lottery - winner - jackpot - fan - million | 27 | 95_lottery_winner_jackpot_fan |
| 96 | payments - sending - email - payment - suppo | 77 | 96_payments_sending_email_payment |
| 97 | cardholders - star - mlb - ultimate - allstarweek | 127 | 97_cardholders_star_mlb_ultimate |
| 98 | reds - disparage - toss - stats - count | 57 | 98_reds_disparage_toss_stats |
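
Each label follows the pattern `<topic ID>_<keyword>_<keyword>_...`, topic -1 is BERTopic's outlier topic, and the frequencies in the table add up to the 112332 training documents. As an illustration only, the sketch below splits a few labels copied from the table back into ID and keywords and computes each topic's share of the corpus. `parse_label` and `share_of_corpus` are hypothetical helper names, not BERTopic functions.

```python
TRAINING_DOCS = 112332  # "Number of training documents" from the overview

# (label, frequency) pairs copied from a few rows of the table
ROWS = [
    ("-1_stop_people_need_help", 84),
    ("1_game_thank_win_team", 21155),
    ("3_card_credit_account_customer", 11946),
    ("7____", 2804),
]


def parse_label(label):
    """Split '<id>_<kw>_<kw>...' into (topic_id, [keywords]), dropping empty keywords."""
    topic_id, *keywords = label.split("_")
    return int(topic_id), [kw for kw in keywords if kw]


def share_of_corpus(frequency, total=TRAINING_DOCS):
    """Percentage of training documents assigned to a topic, rounded to 2 decimals."""
    return round(100 * frequency / total, 2)


for label, freq in ROWS:
    topic_id, keywords = parse_label(label)
    print(topic_id, keywords, f"{share_of_corpus(freq)}%")
```

Topic 7 parses to an empty keyword list, which makes it easy to flag topics whose representation is blank.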

Training hyperparameters

  • calculate_probabilities: False
  • language: None
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 100
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
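
For readers who want to train a comparable model, the listed values map directly onto keyword arguments of the `BERTopic` constructor. The sketch below is an assumption about how they could be passed, not the author's training script: the card does not name the embedding model or the UMAP and HDBSCAN settings, so those are left to the caller, and `build_untrained_model` is a hypothetical helper name. In BERTopic, `language` is typically stored as `None` when a custom embedding model is supplied, which may be the case here.

```python
# Hyperparameters exactly as listed above
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 100,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_untrained_model(embedding_model=None):
    """Create an unfitted BERTopic instance with the listed settings.

    `embedding_model` is an assumption: the card does not say which one was
    used, so the caller has to choose (for example a sentence-transformers
    model name).
    """
    from bertopic import BERTopic  # imported lazily; requires `pip install -U bertopic`

    return BERTopic(embedding_model=embedding_model, **HYPERPARAMETERS)
```

The returned model still has to be fitted with `fit_transform` on a corpus; results will differ from this card unless the same data and embedding model are used.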

Framework versions

  • Numpy: 1.26.4
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.5
  • Pandas: 2.0.3
  • Scikit-Learn: 1.4.1.post1
  • Sentence-transformers: 2.5.1
  • Transformers: 4.40.0
  • Numba: 0.59.1
  • Plotly: 5.20.0
  • Python: 3.11.8
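
Loading a saved model tends to be most reliable when the local environment is close to the one listed above. As a rough sketch, the snippet below compares installed package versions against the list using the standard library's `importlib.metadata`. The PyPI distribution names (for example `umap-learn` for UMAP) are filled in here and do not come from the card, and `find_mismatches` is a hypothetical helper name.

```python
from importlib import metadata

# PyPI distribution names mapped to the versions listed above
EXPECTED = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.5",
    "pandas": "2.0.3",
    "scikit-learn": "1.4.1.post1",
    "sentence-transformers": "2.5.1",
    "transformers": "4.40.0",
    "numba": "0.59.1",
    "plotly": "5.20.0",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every package that differs."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


for package, (wanted, installed) in find_mismatches(EXPECTED).items():
    print(f"{package}: card lists {wanted}, found {installed or 'not installed'}")
```

A mismatch does not necessarily break loading; the check only shows where the environment differs from the one used for training.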