xsum_108_5000000_2500000_validation

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("KingKazma/xsum_108_5000000_2500000_validation")

topic_model.get_topic_info()
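
To assign topics to new text, the loaded model's transform method can be used. The following is a minimal sketch rather than the author's own workflow: the helper name assign_topics, the demo function, and the sample sentences are illustrative assumptions, and loading from the Hub assumes network access and an embedding model that can be resolved at load time.

```python
def assign_topics(topic_model, docs):
    """Pair each document with the topic ID predicted by the model.

    `topic_model` is expected to expose BERTopic's `transform(documents)`,
    which returns a (topics, probabilities) tuple.
    """
    topics, _probabilities = topic_model.transform(docs)
    return [(doc, int(topic_id)) for doc, topic_id in zip(docs, topics)]


def demo():
    # Imported here so the helper above can be used without bertopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic.load("KingKazma/xsum_108_5000000_2500000_validation")

    # Hypothetical example sentences, not taken from the training data.
    docs = [
        "The striker scored twice as the club moved up the league table.",
        "He won the final set to reach the semi-finals of the tournament.",
    ]
    for doc, topic_id in assign_topics(topic_model, docs):
        # get_topic returns (word, score) pairs for the given topic ID.
        top_words = [word for word, _score in topic_model.get_topic(topic_id)[:5]]
        print(topic_id, top_words, doc)
```

Calling demo() downloads the model and prints the predicted topic ID and its top keywords for each sentence. A topic ID of -1 means the document was treated as an outlier.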

Topic overview

  • Number of topics: 9
  • Number of training documents: 11332
| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | said - world - first - one - time | 41 | -1_said_world_first_one |
| 0 | said - mr - would - people - also | 813 | 0_said_mr_would_people |
| 1 | win - game - league - club - player | 7931 | 1_win_game_league_club |
| 2 | sport - olympic - race - gold - world | 2105 | 2_sport_olympic_race_gold |
| 3 | round - world - champion - open - golf | 219 | 3_round_world_champion_open |
| 4 | murray - match - tennis - set - number | 70 | 4_murray_match_tennis_set |
| 5 | race - hamilton - f1 - rosberg - mercedes | 60 | 5_race_hamilton_f1_rosberg |
| 6 | yn - ar - ei - yr - wedi | 50 | 6_yn_ar_ei_yr |
| 7 | fight - title - boxing - champion - im | 43 | 7_fight_title_boxing_champion |
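
The frequencies above sum to the number of training documents, and topic -1 is BERTopic's outlier bucket. As a quick sanity check, the sketch below recomputes the total and each topic's share of the corpus. The numbers are copied from the table; the variable names are illustrative.

```python
# Topic ID -> number of training documents, copied from the topic table.
topic_frequencies = {
    -1: 41,
    0: 813,
    1: 7931,
    2: 2105,
    3: 219,
    4: 70,
    5: 60,
    6: 50,
    7: 43,
}

total_documents = sum(topic_frequencies.values())

# Fraction of the training corpus assigned to each topic.
topic_shares = {
    topic_id: count / total_documents
    for topic_id, count in topic_frequencies.items()
}

# Print topics from largest to smallest.
for topic_id, share in sorted(topic_shares.items(), key=lambda item: -item[1]):
    print(f"topic {topic_id:>2}: {topic_frequencies[topic_id]:>5} docs ({share:.1%})")
```

The output makes the imbalance easy to see: topic 1 alone accounts for most of the documents, while topics 4 to 7 each hold well under one percent.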

Training hyperparameters

  • calculate_probabilities: True
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
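
For reference, the settings listed above map directly onto keyword arguments of the BERTopic constructor. The sketch below shows how a model with the same settings could be instantiated. It is an assumption about how these values would be passed, not a record of the original training script; the embedding, UMAP, and HDBSCAN components used for training are not listed on this card and are left at their defaults here.

```python
# Hyperparameters copied from the list above.
hyperparameters = {
    "calculate_probabilities": True,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}


def build_model():
    """Create an untrained BERTopic instance with the settings above."""
    # Imported here so the settings dictionary can be inspected without bertopic.
    from bertopic import BERTopic

    return BERTopic(**hyperparameters)
```

Calling build_model().fit_transform(docs) on a list of documents would then train a new model. The resulting topics are not guaranteed to match the table above, since UMAP is stochastic and the training data is not bundled with this card.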

Framework versions

  • Numpy: 1.22.4
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.3
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.2.2
  • Transformers: 4.31.0
  • Numba: 0.57.1
  • Plotly: 5.13.1
  • Python: 3.10.12
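
Loading a saved model can be sensitive to library versions, so it may help to compare a local environment against the list above. The sketch below does this with the standard library's importlib.metadata. The PyPI distribution names (for example umap-learn for UMAP) are assumptions about how each package was installed, and the Python version is left out because it is not a package.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the list above, keyed by assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.31.0",
    "numba": "0.57.1",
    "plotly": "5.13.1",
}


def compare_versions(expected):
    """Return {package: (expected, installed or None, versions_match)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions(EXPECTED_VERSIONS).items():
    status = "ok" if matches else "differs"
    print(f"{package}: expected {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; it only flags where the local environment differs from the one used for training.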